[tor-dev] Proposal: Check Maxmind GeoIP DB before distributing
Katharina Kohls
katharina.kohls at rub.de
Tue Jul 3 09:34:33 UTC 2018
Hi,
On 30.06.2018 13:53, Jaskaran Singh wrote:
> 5. Dealing with false positives
> Maxmind calculates geolocation of an IP addr using WHOIS records,
> Reverse DNS etc. It claims to have precision rate of 99.5% on country
> level. The other 0.5% is more likely to be those IP addresses for which
> neither WHOIS record nor Reverse DNS are setup.
>
> A very large percentage of Tor Nodes are run from datacenters, which
> usually have all their records set up. It's highly unlikely for an IP
> address belonging to a datacenter to be mapped to a wrong location.
>
> Hence, false positives would be very few, and can be safely ignored
> after a simple manual/scripted investigation.
We measured Tor relay locations a while ago using ICMP RTT measurements
from multiple server instances located in Europe, North America, Asia,
and Oceania. Using the minimum RTT for each connection*, we applied
multilateration for estimating the location of a relay. Even though this
approach is noisy because of varying network conditions and routes, we
still get a good estimate of the relay's actual position.
We compared our estimated ICMP relay locations with the GeoIP information:
- our test set consisted of a full consensus
- we conducted the measurements within 5 days and repeated reference
experiments a month later to test the stability of results
- we sent 500 pings per relay from 8 remote servers and repeated the
measurements multiple times
- we used the minimum RTT as input for the multilateration
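For illustration only, the multilateration step can be sketched as
below. This is a simplified stand-in, not a description of any
particular tool: the function names, the assumed propagation speed of
about 200 km per ms, and the coarse-to-fine grid search are all
assumptions made for this example. Real RTTs also contain queueing and
routing overhead, so the derived distances tend to overestimate.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius
KM_PER_MS = 200.0          # assumed one-way propagation speed (~2/3 c)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def multilaterate(vantages, min_rtts_ms, km_per_ms=KM_PER_MS, rounds=5):
    """Estimate (lat, lon) from the minimum RTT seen at each vantage point.

    vantages: list of (lat, lon) of the measurement servers.
    min_rtts_ms: minimum RTT in milliseconds from each server.
    """
    # Half the RTT is the one-way time; convert it to a distance estimate.
    distances = [rtt / 2.0 * km_per_ms for rtt in min_rtts_ms]

    def cost(lat, lon):
        # Sum of squared differences between geometric and RTT distances.
        return sum((haversine_km(lat, lon, vlat, vlon) - d) ** 2
                   for (vlat, vlon), d in zip(vantages, distances))

    lat_c, lon_c, step = 0.0, 0.0, 5.0
    n_lat, n_lon = 18, 36  # the first pass covers the whole globe
    for _ in range(rounds):
        best = None
        for i in range(-n_lat, n_lat + 1):
            lat = max(-90.0, min(90.0, lat_c + i * step))
            for j in range(-n_lon, n_lon + 1):
                lon = (lon_c + j * step + 180.0) % 360.0 - 180.0
                c = cost(lat, lon)
                if best is None or c < best[0]:
                    best = (c, lat, lon)
        _, lat_c, lon_c = best
        step /= 5.0
        n_lat = n_lon = 10  # later passes search +/- 2 old steps around the best
    return lat_c, lon_c
```

With noise-free RTTs from four hypothetical servers (one each in
Europe, North America, Asia, and Oceania), the sketch recovers a test
position to within a few kilometers. With real measurements the
residual error depends mostly on how much the minimum RTT exceeds the
pure propagation delay.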
Results can be summarized as follows:
- the median location error is about 440 km
- 287 outliers are more than 2654 km away from the position that GeoIP
suggested. This represents ~4.6% of the tested relays
- for the quarter of nodes above the 75th percentile, the estimated
position differs from the GeoIP position by more than 1000 km
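As a rough sketch of how such a summary can be computed from two lists
of coordinates (estimated position and GeoIP position per relay): the
function names and the dictionary keys are made up for this example,
and the outlier threshold is simply passed in as a parameter.

```python
import math
import statistics

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) pairs in degrees."""
    p1, p2 = math.radians(a[0]), math.radians(b[0])
    dphi = p2 - p1
    dlmb = math.radians(b[1] - a[1])
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def summarize_errors(estimated, geoip, outlier_km):
    """Median, 75th percentile, and outlier count of the location error."""
    errors = [haversine_km(e, g) for e, g in zip(estimated, geoip)]
    # quantiles(n=4) returns the three quartile cut points.
    _, median, p75 = statistics.quantiles(errors, n=4, method="inclusive")
    outliers = sum(1 for e in errors if e > outlier_km)
    return {
        "median_km": median,
        "p75_km": p75,
        "outliers": outliers,
        "outlier_share": outliers / len(errors),
    }


# Toy data: five relays estimated at (0, 0), GeoIP positions on the equator.
estimated = [(0.0, 0.0)] * 5
geoip = [(0.0, 1.0), (0.0, 2.0), (0.0, 3.0), (0.0, 4.0), (0.0, 30.0)]
summary = summarize_errors(estimated, geoip, outlier_km=2654.0)
```

In this toy data set only the last relay exceeds the threshold, so the
outlier share is 1 of 5.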
We are currently repeating the experiments with 16 instead of 8 servers
and are working on the evaluation to improve the location estimate.
We cannot treat these results as ground truth, and for the majority of
relays the GeoIP location already documents the actual country and
continent a relay is in. Nevertheless, this is a good way to add an
independent verification step. The location errors of the outliers show
that some nodes actually run on a different continent, and this is an
important security issue if users want to avoid a certain country.
The same applies to the relays above the 75th percentile, where the
error also leads to updated country information for a significant set
of relays.
We can conclude that yes, a large percentage of Tor nodes have OK
records. But the number of false positives is not that low and, in my
opinion, cannot be ignored. Besides an independent verification step,
for which I suggest timing measurements and multilateration, location
errors that lead to a different country code should be treated as
updates (or the respective nodes should be flagged).
* The motivation is that no transmission can ever be faster than a
certain physical threshold, so the minimum RTT is the closest we can
get to this threshold.
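To make the threshold argument concrete, here is a small sketch that
turns the fastest ping into an upper bound on distance and checks a
claimed location against it. The propagation factor of 2/3 of the
vacuum speed of light is an assumption, and the function names are
only for illustration.

```python
SPEED_OF_LIGHT_KM_PER_MS = 299.792458  # vacuum speed of light
PROPAGATION_FACTOR = 2.0 / 3.0         # assumed slowdown in the network


def max_distance_km(rtt_samples_ms, factor=PROPAGATION_FACTOR):
    """Upper bound on the one-way distance implied by the fastest ping."""
    min_rtt_ms = min(rtt_samples_ms)
    # Half the RTT is the one-way time; nothing can travel further in it.
    return (min_rtt_ms / 2.0) * SPEED_OF_LIGHT_KM_PER_MS * factor


def location_is_plausible(rtt_samples_ms, claimed_distance_km):
    """True if the claimed distance does not exceed the physical bound."""
    return claimed_distance_km <= max_distance_km(rtt_samples_ms)


samples = [31.2, 10.0, 12.7, 48.9]  # made-up RTTs in ms
print(round(max_distance_km(samples), 1))       # 999.3
print(location_is_plausible(samples, 2654.0))   # False
```

A relay whose claimed location is 2654 km from a server that saw a
10 ms ping cannot be where the database says it is, whatever the
route was.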
Cheers,
Katharina