[tor-bugs] #6266 [Tor]: maxmind geoip db is starting to label Tor relays as country "A1"
Tor Bug Tracker & Wiki
blackhole at torproject.org
Fri Nov 30 13:52:42 UTC 2012
#6266: maxmind geoip db is starting to label Tor relays as country "A1"
------------------------+---------------------------------------------------
Reporter: arma | Owner:
Type: defect | Status: needs_review
Priority: normal | Milestone: Tor: 0.2.3.x-final
Component: Tor | Version:
Keywords: tor-client | Parent:
Points: | Actualpoints:
------------------------+---------------------------------------------------
Comment(by karsten):
Replying to [comment:14 ioerror]:
> i think ideally that blockfinder should do all of this work - please
> consider adding code to blockfinder that does exactly what deanonymind
> does and I'll merge it.
I disagree for two reasons. The first reason is that the current
`deanonymind.py` doesn't fit into blockfinder very well. `deanonymind.py`
takes two files as input (original MaxMind file and `geoip-manual`) and
produces three files as output (two modified MaxMind files and the `geoip`
file for tor). blockfinder is designed around a local IP-to-country
database cache with its usage modes being either to modify the cache or
request information from it. What we'd have to do to integrate
`deanonymind.py` is split it up into multiple modes to a) make a country
code "disappear" by automatically merging its entries with adjacent
entries, b) apply manual changes from a file, c) export to CSV in long and
short format. These changes are not impossible to make. However, I'm
currently lacking the developer time to make them.
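
To illustrate what mode a) might look like, here is a minimal Python sketch of making a country code "disappear" by merging its entries with adjacent entries. It is not the actual `deanonymind.py` code. The `(first, last, country)` tuple layout, the function names, and the rule "relabel an A1 range only if both directly adjacent ranges share the same country" are assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed layout: (first address as int, last address as int, country code)
Range = Tuple[int, int, str]


def coalesce(ranges: List[Range]) -> List[Range]:
    """Join directly adjacent ranges that carry the same country code."""
    merged: List[Range] = []
    for start, end, cc in sorted(ranges):
        if merged and merged[-1][2] == cc and merged[-1][1] + 1 == start:
            merged[-1] = (merged[-1][0], end, cc)
        else:
            merged.append((start, end, cc))
    return merged


def relabel_a1(ranges: List[Range]) -> List[Range]:
    """Relabel an A1 range when both adjacent neighbours agree on a country.

    Assumed rule for this sketch; A1 ranges that cannot be resolved this
    way are left untouched and would need a manual override.
    """
    ranges = sorted(ranges)
    out: List[Range] = []
    for i, (start, end, cc) in enumerate(ranges):
        if cc == "A1" and 0 < i < len(ranges) - 1:
            _, prev_end, prev_cc = ranges[i - 1]
            next_start, _, next_cc = ranges[i + 1]
            touches_both = prev_end + 1 == start and end + 1 == next_start
            if touches_both and prev_cc == next_cc and prev_cc != "A1":
                cc = prev_cc
        out.append((start, end, cc))
    return out


def remove_a1(ranges: List[Range]) -> List[Range]:
    """Coalesce, relabel resolvable A1 ranges, then coalesce again."""
    return coalesce(relabel_a1(coalesce(ranges)))


if __name__ == "__main__":
    # Made-up sample data, not taken from any real database.
    sample = [
        (0, 9, "DE"), (10, 19, "A1"), (20, 29, "DE"),
        (30, 39, "A1"), (40, 49, "FR"),
    ]
    for entry in remove_a1(sample):
        print(entry)
```

In this sketch the A1 range between the two DE ranges is absorbed, while the A1 range sitting between DE and FR stays as it is. Cases like that second one are where manual changes from a file would come in.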
The second and more important reason is that we should really avoid having
tor rely on blockfinder for ''creating'' the modified `geoip` file.
It's a great tool for ''verifying'' the output of those modifications, and
I can highly recommend it for that, but it should stay optional. The main
reason is that as many people as possible should be able to verify what
modifications we make to MaxMind's database. The current 194 lines of
Python in `deanonymind.py` and the 114 lines of documented manual changes
in `geoip-manual` are probably at the upper limit of what we can expect
enthusiastic community members to read and understand. And they can also
use diff or their favorite tool to do this verification, because we give
them all intermediate .csv files. Or they ''can'' use blockfinder if they
wish. But giving them the 844 lines of `blockfinder` to review, which
would probably grow far beyond 1000 lines when adding the A1-fixing
functionality, means that hardly anybody will check what's going on.
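
As a rough illustration of the diff-based verification mentioned above, the following Python sketch lists only the lines that differ between two CSV files, using the standard `difflib` module. The function name and the sample rows are made up for this example, and the real intermediate .csv files have more columns.

```python
import difflib
from typing import List


def changed_lines(before: List[str], after: List[str]) -> List[str]:
    """Return only the removed ('-') and added ('+') lines of a unified diff."""
    diff = list(difflib.unified_diff(before, after, lineterm="", n=0))
    # The first two lines are the '---' and '+++' file headers; skip them,
    # then drop the '@@' hunk markers by keeping only '+' and '-' lines.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


if __name__ == "__main__":
    # Made-up rows in a simplified "first,last,country" layout.
    original = ["0,9,DE", "10,19,A1", "20,29,DE"]
    modified = ["0,9,DE", "10,19,DE", "20,29,DE"]
    for line in changed_lines(original, modified):
        print(line)
```

Reading the lines from two files with `open(path).read().splitlines()` and passing them to `changed_lines` would show each modification as a removed line followed by its replacement.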
Note that this discussion is unrelated to using a database produced by
blockfinder as a general replacement for MaxMind's patched-up database.
Ideally, we'd take the RIR delegation files as input, maybe add LIR
information, run traceroutes and whatever else to confirm/contradict these
assignments, and basically produce our own IP-to-country database. I'm
willing to contribute more code to blockfinder to get closer to that,
e.g., a CSV export function. Please open blockfinder issues for features
you think are missing, and I might hack on them as time permits. But I
think that's a separate discussion. In my understanding, this is
something that can happen 6 to 12 months from now, assuming we put enough
energy into it, but not earlier.
Unrelated to your concerns, I'd be interested in your thoughts on the
[https://gitweb.torproject.org/karsten/tor.git/blob/task-6266:/src/config/geoip-manual geoip-manual file].
Can you review those manual changes?
If you have additional facts, it would be good to add them as comments.
And, of course, if you have contradicting facts, that would be even more
important to know.
Leaving in needs_review, because I still think this code should be
reviewed and merged into tor.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/6266#comment:16>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online