[tor-bugs] #5047 [Obfsproxy]: Implement basic usage statistics in obfsproxy
Tor Bug Tracker & Wiki
torproject-admin at torproject.org
Fri Feb 10 17:37:11 UTC 2012
#5047: Implement basic usage statistics in obfsproxy
-------------------------+--------------------------------------------------
Reporter: karsten | Owner: asn
Type: enhancement | Status: new
Priority: normal | Milestone:
Component: Obfsproxy | Version:
Keywords: | Parent:
Points: | Actualpoints:
-------------------------+--------------------------------------------------
Comment(by atagar):
> def writestats(statsstart, ips, geoipfile, fingerprint):
Personally I found this very difficult to read, and I suspect most of the
work and temporary variables are unnecessary. Here's a completely untested
function that, I think, will do the same thing...
{{{
def writestats(statsstart, ips, geoip_path, fingerprint):
if not ips: return # nothing to write
# matches against lines from the maxmind geoip db, such as...
# 18939904,19005439,JP
geoip_line = re.compile('^([\d]*),([\d]*),([A-Z0-9]{2})$')
# mapping of country codes to their ips
locale_to_ip = {}
# Ips in the geoip file are sorted, so sorting our listing so we can
just
# compare to the first element.
ips.sort()
geoip_processing: # used for targeted breaks, double check syntax (I
don't use this much...)
with open(geoip_path) as geoip_file:
for line in geoip_file:
m = geoip_line.match(line)
# skip line if it's not an entry (probably a blank line or comment)
if not m: continue
ip_start, ip_end, locale = m.groups()
# It's possible that we come before the current entry, which means
the
# file doesn't contain our ip. Pop off ips until that's no longer
the
# case.
while ip_start > ips[0]:
if not "??" in locale_to_ip:
locale_to_ip["??"] = []
locale_to_ip["??"].append(ips.pop(0))
if not ips: break geoip_processing
if ip_end < ips[0]:
continue # nope, our entry is later
elif ip_start < ips[0] and ip_end > ips[0]:
# entry matches
if not locale in locale_to_ip:
locale_to_ip[locale] = []
locale_to_ip[locale].append(ips.pop(0))
if not ips: break geoip_processing
# any remaining ips after processing the file are unknown
while ips:
if not "??" in locale_to_ip:
locale_to_ip["??"] = []
locale_to_ip["??"].append(ips.pop(0))
# now move on with printing stuff, probably something like...
for locale in locale_to_ip:
connection_count = len(locale_to_ip[locale])
unique_connection_count = len(set(locale_to_ip[locale]))
}}}
Besides that looks good, just some nitpicks...
> starting = re.compile('^([\\d\-:TZ]* )?\[info\] Starting.$')
> exiting = re.compile('^([\\d\-:TZ]* )?\[info\] Exiting.$')
> connect = re.compile('^([\\d\-:TZ]* )?\[[a-z]*\] ([\\d\\.]+):[\\d]+ ' +
> '\([a-z0-9]+\): trying to connect to [\\d\\.:]+$')
Move the starting/ending/connect regexes into header constants? When
regexes get substantial like this it's also nice to have a comment with
examples for what they match against (greatly helps readability).
> for idx, ree in enumerate(re_list):
Neat, the enumerate builtin is new to me. In writestats you use it more
than you need to (since you never use the 'idx'), and personally I'd do
away with re_list here so it just uses a tuple inline. But es no importa.
> for line in open(obfsproxylog, 'r'):
You actually probably want...
{{{
with open(obfsproxylog) as obsfproxylog_file:
for line in obsfproxylog_file:
...
}}}
This will close the file when you're done. Also, 'r' is the default mode
for the open method.
> if statsstart > 0:
If you initialize statsstart to sys.maxint then this isn't needed.
> del ips[:]
Why not just do "ips = []"? In either case you're only removing
references.
> ipn = int("%02x%02x%02x%02x" %
> (int(st[0]), int(st[1]), int(st[2]), int(st[3])), 16)
Looks like the map function could help you here.
{{{
>>> st = ["1", "2", "3", "4"]
>>> map(int, st)
[1, 2, 3, 4]
}}}
Cheers! -Damian
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/5047#comment:2>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
More information about the tor-bugs
mailing list