Proposal: MapAddress wildcards [*]
grarpamp
grarpamp at gmail.com
Fri Jun 12 19:36:27 UTC 2009
Probably should have gone to or-dev, not or-talk...
Many sites these days have multiple hosts in their domains. These
sites may have various administrative, logging or restrictive
policies. The same goes for the path to them if the user is unfortunate
enough to reside in strange lands.
As a pure example, consider myspace. They have myspace.com, full of
subdomains and hosts. There is also myspacecdn.com and a couple
others. They all work together to deliver the full content, images,
etc. This is common for load balancing, service segmentation and
so on.
Problem:
(A) Tor makes the use of MapAddress with sites that use multiple
hosts like these difficult and insufficient because:
1 - Each host requires another MapAddress statement.
2 - It is impossible to know all the hosts the site uses beforehand.
3 - The sites commonly change hosts on a whim.
Missing a mapping for any of these reasons could affect either the user
or the site in unintended ways. Mapping should be a bit smarter and
able to do the right thing. Users commonly desire to 'send all my
traffic for site x via exit y and make it just work'.
Solution:
(B) So the following feature is proposed. Allow wildcards in the
MapAddress function such that:
1 - MapAddress google.com=google.com.<exit>.exit
Is now, and should remain, single host specific as usual.
2 - MapAddress *.google.com=*.google.com.<exit>.exit
Matches any third level domain such as www.google.com, but obviously
not google.com itself, as that is handled by (1) above. The name
must have exactly three levels to match.
3 - MapAddress **.google.com=**.google.com.<exit>.exit
Matches any third or deeper level domain such as a.b.c.d.google.com.
This is a sensible hack. It is meant to allow future expansion of
MapAddress to use some form of regex. Since '**' isn't really used
in regexes, it is a useful glob for this purpose of allowing
everything to match... which the user would _really_ want to have
happen easily, without resorting to the obvious further nonsense
in (4), which would be subject to the same problems in (A) above.
4 - MapAddress *.*.google.com=*.*.google.com.<exit>.exit
Matches any third and fourth level domain. Only four level names would
match. This is a NON-proposal.
5 - *google.com
This is also a NON-proposal. It is too far down the path of some
form of regex for the quick fix this proposal is meant to be. And
it would obviously match all sorts of undesirables. The dots are
important in DNS.
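To make the matching rules in (1) through (3) concrete, here is a rough
sketch in POSIX sh. The helper name map_matches is made up for
illustration and is not anything Tor provides. It also assumes names are
already lowercased, since it compares case-sensitively, and it treats
the non-proposals (4) and (5) as never matching.

```shell
#!/bin/sh
# Illustrative sketch only: map_matches is a hypothetical helper, not a Tor
# function. It returns 0 when HOST matches PATTERN under rules (1)-(3).
map_matches() {
    mm_pattern=$1
    mm_host=$2
    case $mm_pattern in
        '**.'*)
            # Rule (3): one or more extra labels in front of the suffix.
            mm_suffix=${mm_pattern#'**.'}
            case $mm_host in
                *".$mm_suffix") mm_prefix=${mm_host%".$mm_suffix"} ;;
                *) return 1 ;;
            esac
            case $mm_prefix in
                ''|.*|*.|*..*) return 1 ;;   # reject empty labels
                *) return 0 ;;
            esac
            ;;
        '*.'*)
            # Rule (2): exactly one extra label in front of the suffix.
            mm_suffix=${mm_pattern#'*.'}
            case $mm_host in
                *".$mm_suffix") mm_prefix=${mm_host%".$mm_suffix"} ;;
                *) return 1 ;;
            esac
            case $mm_prefix in
                ''|*.*) return 1 ;;          # empty, or more than one label
                *) return 0 ;;
            esac
            ;;
        *)
            # Rule (1): a plain name matches only itself.
            [ "$mm_host" = "$mm_pattern" ]
            ;;
    esac
}

# Example: check a few names against the '**' form.
for h in google.com www.google.com a.b.google.com; do
    if map_matches '**.google.com' "$h"; then
        echo "$h: match"
    else
        echo "$h: no match"
    fi
done
```

Because the suffix is always compared with its leading dot, a name like
notgoogle.com never matches, which is the behaviour (5) is worried about.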
Note that having globs on the right side of the '=' doesn't make
sense from a routing point of view, but it's not supposed to. It
is done so that scripts can continue to keep track and do things
like:
#!/bin/sh
# add map
dst=$src.$exit.exit
printf "authenticate \"foo\"\r\nmapaddress $src=$dst\r\nquit\r\n" |
# remove map
dst=$src
printf "authenticate \"foo\"\r\nmapaddress $src=$dst\r\nquit\r\n" |
And of course the below command should list the mappings as usual.
Both the static mapping that was entered, and the dynamic ones that
result from it...
getinfo address-mappings/all
google.com google.com.<exit>.exit NEVER
**.google.com **.google.com.<exit>.exit NEVER
google.com.<exit>.exit <ip.ad.dr.ess>.<exit>.exit "2009-06-05 18:01:20"
mail.google.com.<exit>.exit <ip.ad.dr.ess>.<exit>.exit "2009-06-05 18:01:22"
a.b.google.com.<exit>.exit <ip.ad.dr.ess>.<exit>.exit "2009-06-05 18:01:24"
At this time, it is unimportant which rule the dynamic entry resulted
from as that is not denoted in the current versions of Tor. A simple
numeric tag in the first column of every rule would suffice for
that in the future.