wikipedia vandalism

Roger Dingledine arma at mit.edu
Mon Jan 24 11:04:52 UTC 2005


On Sat, Jan 22, 2005 at 11:29:22PM +0100, Frank v Waveren wrote:
> Link fixed: http://mail.wikipedia.org/pipermail/wikien-l/2004-February/010605.html
> 
> > Could you summarize for us the reasons why
> > Wikipedia doesn't want to require users _from Tor IPs_ to create accounts
> > in order to edit pages?
> Because vandals will just create an account, vandalise with it, get
> blocked, create a new account, vandalise with it, etc.

Hi Frank,

I agree, Wikipedia's current "make an account" system does
not do what you need, which is to make it hard for vandals to do their
thing. Yahoo and other sites address this problem by using captchas
(http://en.wikipedia.org/wiki/Captcha) to add cost to creating an account.

You're in a pretty tricky situation, given that you want to keep allowing
account-less edits, yet there are literally millions of open proxies
and compromised machines out there that vandals can use. By blocking Tor
exit nodes, the people who want privacy via Tor can't edit wikipedia now,
but the vandals still have plenty of open proxies they can use.

Notice that there are actual legitimate people doing real edits over
Tor. You've heard from some of them in private mail lately, and others
posted on your "Talk" wiki. Even in the original thread about this on
the wikipedia list, there were people who didn't want an overall ban, e.g.,
http://mail.wikipedia.org/pipermail/wikien-l/2004-February/010666.html
http://mail.wikipedia.org/pipermail/wikien-l/2004-February/010613.html

One approach that's in the middle ground would be to require logins,
including captchas, and track edits by accounts like you do now. If
you notice abuse then roll back everything the account has done and
cancel it. If there's repeated unmanageable abuse from Tor, block
it for an hour or something until the guy gets bored. Alternatively,
you could preemptively put edits from certain accounts into a "has
to be manually approved" queue. Optionally, if the first N edits are
approved then further edits can be automatic as now. If you worry about
the barrier-to-entry for normal wikipedia users, all of this can be
selectively applied only to IPs that you flag as suspicious.
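
The approval-queue idea above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not how Wikipedia's
software works: the names Account, Edit, Moderator and TRUST_THRESHOLD are
all hypothetical, the captcha step at account creation is left out, and
"suspicious" IPs are just a set handed in by the caller.

```python
from dataclasses import dataclass, field

TRUST_THRESHOLD = 3  # hypothetical N: approved edits needed before auto-publish


@dataclass(eq=False)
class Account:
    name: str
    approved_edits: int = 0
    cancelled: bool = False
    edits: list = field(default_factory=list)  # every edit, kept for rollback


@dataclass(eq=False)
class Edit:
    account: Account
    page: str
    text: str
    status: str = "pending"  # "pending", "live", or "reverted"


class Moderator:
    def __init__(self, suspicious_ips):
        # IPs flagged as suspicious (assumed input, e.g. known exit nodes)
        self.suspicious_ips = set(suspicious_ips)
        self.queue = []  # edits waiting for manual approval

    def submit(self, account, ip, page, text):
        if account.cancelled:
            raise PermissionError("account has been cancelled")
        edit = Edit(account, page, text)
        account.edits.append(edit)
        # Only untrusted accounts on flagged IPs go through manual review.
        needs_review = (ip in self.suspicious_ips
                        and account.approved_edits < TRUST_THRESHOLD)
        if needs_review:
            self.queue.append(edit)
        else:
            edit.status = "live"
        return edit

    def approve(self, edit):
        self.queue.remove(edit)
        edit.status = "live"
        edit.account.approved_edits += 1

    def ban(self, account):
        # Roll back everything the account has done and cancel it.
        account.cancelled = True
        for edit in account.edits:
            edit.status = "reverted"
        self.queue = [e for e in self.queue if e.account is not account]
```

With this sketch, a new account editing from a flagged IP has its first
three edits held for review; once those are approved, later edits go live
immediately, and ban() reverts the whole history if abuse shows up.
Accounts on unflagged IPs never see the queue.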

So, there are some solutions that could provide much better protection
without too much more work, if you decide that being able to get edits
from Tor users is worthwhile.

(You mention in your Talk wiki that you used to run a Tor node? I think
you might be confused about how Tor works, since you have never run a
Tor node to my knowledge. Most Tor users are just clients; I'm guessing
that's what you ran. We have probably upwards of 10k-20k people using Tor
currently, and I imagine some of them do, or would like to do, Wikipedia
edits. As a trivial example, I noticed a grammar problem while browsing
the entry on Svalbard today, and I can't fix it. Oh well.)

It seems the culmination of the thread from the February 2004 wikipedia
list is the statement
"In general, I like living in a world with anonymous proxies.  I wish
them well.  There are many valid uses for them.  But, writing on
Wikipedia is not one of the valid uses."

If this is truly the consensus of the wikipedia community -- that
wikipedia values equal access for all, except when it comes to people
who value their privacy -- then I guess the discussion is over. I think
in that case it ends in Wikipedia's loss, since you block a few IPs,
yet there are still many IPs left for vandals to use that you do not
block. As the Tor network grows, you will be blocking more and more
potentially useful users, yet not really impacting the number of IPs
available to the vandals.

Now, I think you are right that it's reasonable to block the whole Tor network
for the moment, while you take stock of your security assumptions and
see if you want to adapt into a system that can take into account that
humans do not map uniquely to IP addresses. Blocking Tor is not going
to solve your problem long-term -- your problem is deeper than that.

But it shouldn't be up to Tor to understand and manage application-level
authentication and authorization for every service on the Internet. While
this particular issue is being resolved, Tor server operators
should do what they need to do; but I recommend that you don't
put any Wikipedia-specific entries in your exit policy, in order
to allow as many people as possible to safely read wikipedia. And
if you want to write to wikipedia from your Tor server's IP (which
you might want to do typically for plausible deniability, a form of
privacy), I recommend you use some other anonymizing service, such
as JAP (http://anon.inf.tu-dresden.de/index_en.html) or Anonymizer
(www.anonymizer.com). If wikipedia decides to block those, there are
plenty more where they came from. Hopefully at some point they'll realize
their energies can be better spent working on content for Wikipedia
rather than working on building and maintaining blacklists of many
thousands of IP addresses around the Internet.

I'm sorry to not have more useful answers. Wikipedia's security
assumptions don't leave us much wiggle room. :(

--Roger


