Fwd: [Wikitech-l] Planning to tighten TorBlock settings

Gregory Maxwell gmaxwell at gmail.com
Fri Apr 3 19:29:37 UTC 2009


On Fri, Apr 3, 2009 at 12:34 PM, Paul Syverson
<syverson at itd.nrl.navy.mil> wrote:
> On Fri, Apr 03, 2009 at 12:03:53PM -0400, Gregory Maxwell wrote:
>
>> To solve this issue I believe that TOR needs a strong pseudo-anonymous
>> system built in and available to users.  Something where Wikipedia can
>> block a single misbehaving user just as easily as they can without Tor
>> and that user can't simply mint 100 more accounts in a few minutes.
>> There have been various proposals for systems to accomplish this in
>> the past, but unless one is integrated and easily supported by
>> websites it will do no good.
>
> Building this into Tor would be pointless because, as you observed, it
> would not block misbehavior via the many other ways to be anonymous
> widely used by abusive people. Systems that require protection against
> abuse need some sort of authentication (possibly anonymous or
> pseudonymous) required for any access. Otherwise people can just
> manufacture identities at will and use them over any other path than
> Tor.  More accurately I should have said that it would be pointless
> to reduce overall abuse; it would just give one warm fuzzies that
> abuse via Tor was moved to the other mechanisms not designed to be
> as visible.

So I need to ask you to take a leap of faith here: I need you to simply accept
that:

* There are people involved in the Wikimedia project who care deeply
about Tor-relevant issues like freedom from censorship.
* These people generally understand the issues and the technical
factors of how Tor works, and have spent a considerable amount of time
thinking about these issues, conducting measurements, etc.
* These people understand the issues facing Wikipedia better than you do.

If you can't accept these things, at least for the purpose of
discussion, then there is no point in discussing anything further: you
will simply be ignored by the Wikipedia crowd, and the Tor community
will continue to have virtually zero pull to influence their
behaviour.

With that out of the way…

I'm quite confident that it would not be pointless.

At any given moment the English Wikipedia is blocking write access
from a considerable portion of the total IPv4 address space.
Persistently write-blocked addresses include almost every widely known
anonymization service (including the commercial offerings), the entire
address blocks of many of the lower priced colocation/shell providers,
many tens of thousands of open or partially open HTTP proxies, and a
large number of ISP transparent proxies which have not been registered
as trusted XFF header sources. Ranges that are frequently blocked for
short terms include popular broadband providers that allocate from
large non-geographically specific pools, as well as entire
universities and branches of government.  Last I looked there were a
good number of /16s blocked.

The site administrators can selectively grant block bypasses to users,
so if a regular good contributor is impacted when her entire
university gets blocked she can be exempted by simply requesting an
exemption.

Because of the aggressive use of both long- and short-term write
access blocking, made possible by a general community indifference to
collateral damage against users who are not regular contributors
(regular contributors can be exempted), the existing tools are
*highly effective* at halting repeated aggressively disruptive
behaviour.

We must distinguish casually disruptive behaviour from aggressively
disruptive behaviour. For the former, someone scribbling nonsense into
an article, Tor is simply not a factor… these people are not going to
use Tor. If they had enough brains to get Tor installed they'd
probably come up with a more productive use of their time.  Few people
involved with Wikipedia think they are fighting that problem by
limiting Tor.  The latter type includes aggressive, fairly intelligent
people who are willing to put in considerable time and effort. These
sorts of people will write their own attack software, and yes, they'll
use Tor.  These people produce damage in great disproportion to their
numbers.

Without Tor these people will have access to some number of as yet
unblocked anonymization services, internet cafes, broadband provider
subnets, dialup accounts, etc.  They crop up, cause some damage, and
the avenue is closed with a (frequently permanent) write access block.
Once they have exhausted their available means, finding new proxies is
time consuming and costly, especially since English Wikipedia has so
much exposure that all the low hanging fruit is blocked. The attacker
will usually give up, or at least be greatly slowed.

With Tor permitted they can simply keep making trouble without
abatement for as long as they like … and can do so at high rates of
speed relative to the overall editing activity on the site through
massively parallel automated operation.

Prior to the existence of Tor awareness in MediaWiki, Tor exits were
blocked randomly as troublemakers cropped up on them. At any given
time only about 300 Wikipedia-reaching exits were blocked, which was
actually sufficient because most attackers were not quite smart enough
to figure out how to visit Wikipedia with named exits (which is a
serious usability problem with Tor… because it's non-trivial to get
exit syntax working with vhosted sites).  ... but this also meant that
legit Tor users couldn't edit either, as they were even less
likely to find and use working exits than the attackers.

> It is also not necessary to have a technical solution. Simply escrowing
> all edits coming via Tor until some editor is willing to check them
> would prevent abuse from ever getting to Wikipedia and would allow
> edits only at whatever rate Wikimedia chooses to devote resources.

This presumes a fairly narrow view of abuse and, I think, greatly
underestimates the total cost of escrowing.

Common forms of abuse include things like flooding history and logs
with abusive messages and the repeated posting of private information
which Wikipedia chooses not to disseminate for legal or ethical
reasons.  In fact these kinds of attacks probably now constitute a
majority of the activity from the aggressive attackers, as their goal
is usually to piss off the Wikipedia community, and these methods are
actually more effective to that end than screwing with the content. (A
particular perversion of the Wikipedia community, perhaps, but it is
what it is…)

Escrowing can't fix flooding, nor does escrowing make sense for
non-content materials.

Even for content, escrowing isn't simple. When the most recent edit is
escrowed, hitting edit does what? Does it fork the non-escrowed text?
Does it subject the editor to the (possibly offensive or browser
crashing) escrowed material? If it was forked, an approver must now
not just approve but also manually merge the text.  It becomes
messy quickly, and the Wikipedia community has already clearly sent
the message that however much it cares about Tor it does not care so
much as to do a lot of additional work to support it.

> Because of the incentives from failing ever to succeed via this
> path would mean that abusive submissions should be minimal; although
> that is moot from the perspective of Wikimedia resources devoted to
> it. I hashed this out with Jimbo several years ago, and he entirely
> agreed with me in the end.

That was then, this is now.  In any case, Jimmy touches on a lot of
areas but isn't deeply involved with dealing with these problems on a
daily basis; the people who are have significantly different
positions.


> All that aside, my understanding is that precisely what you suggest
> has been designed, built, and proposed for integration to wikimedia
> more than once and was basically ignored: first Jason Holt's nym and
> then later the more fully featureful and developed nymble from some
> folks at Dartmouth.

s/wikimedia/mediawiki/

I'm aware of the nymble system. But it is completely unintegrated on
the Tor side. To make use of it you have to conduct some complicated
dance of copying around cryptographic blobs between websites, running
JavaScript cryptographic engines, etc.

Frankly, it was a dance that I'd never bother going through. It's
something which we could only reasonably expect troublemakers to use.
(And… perhaps a few Chinese dissidents— except their needs are
frequently met already by friends in other countries running closed
access HTTPS proxies for them.)

(I believe it, or at least one of them, also required the whole site
to be served via HTTPS, which is pretty much a non-starter for
economic reasons; but I could be misremembering.)

Neither of these systems is or can ever be a panacea: in the best
case they are at least a 2x force multiplier for the attacker, though
in practice even more, since they also inhibit Wikipedia from
utilizing range blocks.  So allowing these systems would be a
compromise. Once you factor in a decent amount of additional code to
maintain on the WP side, and that these systems were too complicated
for almost anyone except an attacker to use due to a lack of Tor
integration…  of course there hasn't been much interest in deploying
them.

If, instead, there were some magic torbutton nymthing that interfaced
to websites via a standard remote authentication mechanism like
OpenID, and which someone was actually going to maintain, then perhaps
you'd see people actually using it (perhaps still not Wikimedia sites,
but at least there would be a greater chance).  It's certainly bound
to be more effective than some protest of blocking read access.


