[tor-commits] [bridgedb/master] First cobbled together social distributor proposal
isis at torproject.org
Tue Feb 4 00:28:48 UTC 2014
commit 4e26736660fb285f8ffb8a6bb7b1c2ad58d72eed
Author: Isis Lovecruft <isis at torproject.org>
Date: Sat Oct 19 12:56:38 2013 +0000
First cobbled together social distributor proposal
---
doc/proposals/XXX-bridgedb-social-distribution.txt | 292 ++++++++++++++++++++
1 file changed, 292 insertions(+)
diff --git a/doc/proposals/XXX-bridgedb-social-distribution.txt b/doc/proposals/XXX-bridgedb-social-distribution.txt
new file mode 100644
index 0000000..199302c
--- /dev/null
+++ b/doc/proposals/XXX-bridgedb-social-distribution.txt
@@ -0,0 +1,292 @@
+# -*- coding: utf-8 ; mode: org -*-
+
+Filename: XXX-social-bridge-distribution.txt
+Title: Social Bridge Distribution
+Author: Isis Agora Lovecruft
+Created: 18 July 2013
+Status: Open
+
+* I. Overview
+
+ This proposal specifies a system for social distribution of the
+ centrally-stored bridges within BridgeDB. It is primarily based upon Part
+ IV of the rBridge paper, [0] utilising a coin-based incentivisation scheme
+ to ensure that malicious users and/or censoring entities are deterred from
+ blocking bridges, as well as socially-distributed invite tickets to prevent
+ such malicious users and/or censoring entities from joining the pool of
+ Tor clients who are able to receive distributed bridges.
+
+* II. Motivation and Problem Scope
+
+ As it currently stands, Tor bridges which are stored within BridgeDB may be
+ freely requested by any entity at nearly any time. While the openness, that
+ is to say public accessibility, of any anonymity system certainly
+ provides its users with the protection of a more diverse anonymity set,
+ both the usability of such a system and its efficacy for censorship
+ circumvention are severely damaged when a censoring or malicious entity
+ is just as able as an honest user to request new Tor bridges.
+
+ Thus far, very little has been done to protect the volunteered bridges from
+ eventually being blocked in various regions. This severely restricts the
+ overall usability of Tor for clients within these regions, who, arguably,
+ can be even more in need of the identity protections and free speech
+ enablement which Tor can provide, given their political contexts.
+
+** II.A. Current Tor bridge distribution mechanisms and known pitfalls:
+
+*** 1. HTTP(S) Distributor
+
+ At https://bridges.torproject.org, users may request new bridges, provided
+ that they are able to pass a CAPTCHA test. Requests through the HTTP(S)
+ Distributor are not allowed to be made from any current Tor exit relay,
+ and a hash of the user's actual IP address is used to place them within a
+ hash ring so that only a subset of the bridges allotted to the HTTP(S)
+ Distributor's pool may become known to an adversary or user at that IP
+ address.
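+
+ The hash-ring idea described above can be sketched roughly as follows. This
+ is a simplified illustration and not BridgeDB's actual code: the HashRing
+ class, the use of SHA-256, and the default of three bridges per request
+ are all assumptions made for the example.
+
```python
import bisect
import hashlib


def _position(data):
    """Map bytes to a point on the ring (first 8 bytes of SHA-256)."""
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


class HashRing:
    """Hypothetical, simplified hash ring; not BridgeDB's implementation."""

    def __init__(self, bridges):
        # Place every bridge on the ring at the hash of its bridge line.
        self._ring = sorted((_position(b.encode()), b) for b in bridges)
        self._keys = [pos for pos, _ in self._ring]

    def bridges_for(self, ip, count=3):
        """Return the `count` bridges which follow hash(ip) on the ring."""
        if not self._ring:
            return []
        start = bisect.bisect_left(self._keys, _position(ip.encode()))
        n = min(count, len(self._ring))
        # Walk clockwise from the client's position, wrapping around.
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(n)]
```
+
+ Because the client's position depends only on the hash of its IP address,
+ repeated requests from the same address keep returning the same small
+ subset of the pool.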
+
+**** 1.a. Known attacks/pitfalls:
+
+ 1) An adversary with a diverse and large IP address space can easily
+ retrieve some significant portion of the bridges in the HTTPS
+ Distributor pool.
+
+ 2) The relatively low cost of employing humans to solve CAPTCHAs is not
+ sufficient to deter adversaries with requisite economic resources from
+ doing so to obtain bridges. [XXX cost of employment]
+
+*** 2. Email Distributor
+
+ Clients may send email to bridges at bridges.torproject.org with the line
+ "get bridges" in the body of the email to obtain new bridges. Such emails
+ must be sent from a Gmail or Yahoo! account, which is required under the
+ assumption that such accounts are non-trivial to obtain.
+
+**** 2.a. Known attacks/pitfalls:
+
+ 1) Mechanisms for purchasing pre-registered Gmail accounts en masse
+ exist, charging between USD$0.25 and USD$0.70 per account. With
+ roughly 1000 bridges in the Email Distributor's pool, distributing 3
+ bridges per email response,
+
+* III. Design
+
+** III.A. Overview
+
+ As mentioned, most of this proposal is based upon §IV of the rBridge
+ paper, which is the non-privacy preserving portion of the paper. [0] The
+ reasons for deferring implementation of §V include:
+
+ - Adding a simpler out-of-band distribution of bridges. Requiring users to
+ copy+paste Bridge lines into their torrc is ridiculous.
+
+ - XXX
+
+ Modifications:
+
+ - Remove OT, keep blind signatures and Pedersen's Commitments.
+
+ XXX finishme
+
+** III.B. Threat Model
+
+ Modification: allow BridgeDB to be a malicious actor (protecting against it
+ at this point is too costly, instead we want to eliminate BridgeDB's
+ ability to obtain a social graph for Tor bridge users.)
+
+ XXX finishme
+
+** III.C. Data Formats
+
+*** 1. User Credential
+
+ A Credential is a signed document obtained from BridgeDB. It contains all
+ of the state required to verify honest client behavior, and is a JSON
+ object with the following format:
+
+ { "Bridges" : [
+ { "BridgeLine" : BridgeLine,
+ "LearnedTS" : TimeStamp,
+ "CreditsEarned" : INT
+ },
+ ...
+ ],
+ "CredentialTS" : TimeStamp,
+ "TotalUnspentCredits" : INT
+ } NL
+
+ BridgeLine := <Bridge line from BridgeDB>
+ TimeStamp := INT
+ NumCredits := INT
+
+ The TimeStamp in LearnedTS is the time at which a user first learned of the
+ existence of that bridge.
+
+ Example:
+
+ {"Bridges": [
+ {"BridgeLine": "1.2.3.4:6666 obfs3 adc83b19e793491b1c6ea0fd8b46cd9f32e592fc",
+ "CreditsEarned": 5,
+ "LearnedTS": 1382078292.864117},
+ {"BridgeLine": "6.6.6.6:1234 d929c82d2ee727ccbea9c50c669a71075249899f",
+ "CreditsEarned": 5,
+ "LearnedTS": 1382078292.864117}],
+ "CredentialTS": 982398423,
+ "TotalUnspentCredits": 10}
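+
+ A minimal sketch of serialising such a credential, using the field names
+ as they appear in the example above. The make_credential helper and its
+ arguments are hypothetical, timestamps are truncated to integers on the
+ assumption that TimeStamp is an INT, and signing is left out entirely
+ because the signature scheme is not yet specified here.
+
```python
import json
import time


def make_credential(bridges, unspent_credits, now=None):
    """Serialise an unsigned credential as one JSON object followed by NL.

    `bridges` is an iterable of (bridge_line, learned_ts, credits_earned)
    tuples. This is a hypothetical helper, not part of BridgeDB.
    """
    credential_ts = int(time.time()) if now is None else int(now)
    doc = {
        "Bridges": [
            {
                "BridgeLine": line,
                "LearnedTS": int(learned_ts),  # assumes TimeStamp := INT
                "CreditsEarned": int(credits),
            }
            for (line, learned_ts, credits) in bridges
        ],
        "CredentialTS": credential_ts,
        "TotalUnspentCredits": int(unspent_credits),
    }
    # sort_keys gives a stable byte representation, which is convenient
    # if the serialised document is later signed.
    return json.dumps(doc, sort_keys=True) + "\n"
```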
+
+*** XXX other formats
+
+* IV. Databases
+
+** IV.A. Scalability Requirements
+
+ Databases SHOULD be implemented in a manner which is amenable to using a
+ distributed storage system; this is necessary because certain types of data
+ MUST be stored permanently, such as the list of hashes of spent tokens, or
+ the list of hashes of used invite tickets.
+
+ Additionally, doing so promotes modularisation of the components of BridgeDB,
+ such that the BridgeDistributor XXX can be separated from the backend
+ storage system, BridgeDB.
+
+*** 1. Distributed Database System
+
+ A distributed database system SHOULD be used for BridgeDB, in order to
+ scale resources as the number of Tor bridge users grows. This database
+ system is hereafter referred to as the DDBS.
+
+ The DDBS MUST be capable of working within Twisted's asynchronous
+ framework. If possible, an Object-Relational Mapper (ORM) SHOULD be used to
+ abstract the database backend's structure and query syntax from the
+ Twisted Python classes which interact with it, so that the type of
+ database may be swapped out for another with less code refactoring.
+
+ The DDBS SHALL be used for persistent storage of complex data structures
+ such as the bridges, which MAY include additional information from both
+ the XXX @type-bridge-relay descriptors and the @type-bridge-extra-info
+ descriptors.
+
+ [#]: https://github.com/couchbase/couchbase-python-client#twisted-api
+
+**** 1.a. Data Structures which should be stored in a DDBS:
+
+ - RedactedDB - The Database of Blocked Bridges
+
+ The RedactedDB will hold entries of bridges which have been discovered
+ to be unreachable from BridgeDB's network vantage point, or have been
+ reported unreachable by clients.
+
+ -
+
+*** 2. Relational Database Mapping Server
+
+ For simpler data structures which must be persistently stored, such as the
+ list of hashes of previously seen Invite Tickets, or the list of
+ previously spent Tokens, a Relational Database Mapping Server (RDBMS)
+ SHALL be used for optimisation of queries.
+
+ Redis and Memcached are two candidates for this role (strictly speaking,
+ both are in-memory key-value stores rather than relational databases)
+ which are well tested and are known to work well with Twisted. The major
+ difference between the two is that Memcached is volatile, while Redis
+ supports commands for transferring objects into persistent on-disk
+ storage. There are several Twisted client libraries for each (see
+ Twisted's MemCacheProtocol class [1] [2] or txyam [3] for Memcached,
+ and txredis [4] or txredisapi [5] for Redis). For non-Twisted Python Redis
+ APIs, there is redis-py, which provides a connection pool that could
+ likely be interfaced with from Twisted Python without too much
+ difficulty. [6]
+
+ In order to further decrease the need for lookups in the backend
+ databases, Bloom Filters can be used to eliminate extraneous
+ queries. However, this optimization would only be beneficial for string
+ lookups, i.e. querying for a user's credential, and SHOULD NOT be used for
+ queries within any of the hash lists, i.e. the list of hashes of
+ previously seen invite tickets. [7] It might be possible to use Redis'
+ GETBIT and SETBIT commands to store a Bloom Filter within a Redis cache
+ system; [8] doing so would avoid the severe memory requirements of
+ loading the Bloom Filter into memory in Python when inserting new entries.
+ Each insertion or lookup would then cost a fixed number of O(1) bit
+ operations (one per hash function), independent of the number of bridge
+ users.
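+
+ The GETBIT/SETBIT idea can be sketched as follows. BitStore is a
+ hypothetical in-memory stand-in which mimics the semantics of Redis'
+ SETBIT and GETBIT (bit 0 is the most significant bit of the first byte,
+ and SETBIT returns the previous value), so that the sketch runs without a
+ Redis server. The double-hashing scheme over SHA-256 and the nbits and
+ nhashes parameters are assumptions for illustration, not part of this
+ proposal.
+
```python
import hashlib


class BitStore:
    """In-memory stand-in for a Redis string used via SETBIT/GETBIT."""

    def __init__(self, nbits):
        self.nbits = nbits
        self.bits = bytearray((nbits + 7) // 8)

    def setbit(self, offset, value):
        """Set one bit and return its previous value, like Redis SETBIT."""
        byte, shift = divmod(offset, 8)
        mask = 0x80 >> shift
        old = 1 if self.bits[byte] & mask else 0
        if value:
            self.bits[byte] |= mask
        else:
            self.bits[byte] &= ~mask & 0xFF
        return old

    def getbit(self, offset):
        """Return one bit, like Redis GETBIT."""
        byte, shift = divmod(offset, 8)
        return 1 if self.bits[byte] & (0x80 >> shift) else 0


class BloomFilter:
    """Bloom filter whose only storage operations are setbit/getbit."""

    def __init__(self, store, nhashes=4):
        self.store = store
        self.nhashes = nhashes

    def _offsets(self, item):
        # Derive nhashes bit offsets from two 64-bit halves of one digest.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.store.nbits for i in range(self.nhashes)]

    def add(self, item):
        for offset in self._offsets(item):
            self.store.setbit(offset, 1)

    def __contains__(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(self.store.getbit(o) for o in self._offsets(item))
```
+
+ A negative answer from the filter means the backend lookup can be skipped;
+ a positive answer still has to be confirmed against the backend, since
+ Bloom Filters produce false positives.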
+
+ XXX expire credentials [#] redis key datatype
+ [#]: http://redis.io/commands/pexpireat
+
+ XXX evaluation on data by calling the sha1 for a serverside Lua script [#]
+ [#]: http://redis.io/commands/evalsha
+
+**** 2.a. Data Structures which should be stored in a RDBMS
+
+ - User Credentials
+
+ - Invite Tickets
+
+ - Spent Credits
+
+* V. Open Questions
+
+** V.A. In which component of the Tor ecosystem should the client application code go?
+
+*** 1. Should this be done as a Pluggable Transport?
+
+ Considerations:
+
+**** a. It doesn't need to modify the user's application-level traffic
+
+ The clientside will eventually need to be able to build a circuit to the
+ BridgeDB backend, but it is not necessary that the clientside handle
+ any of the user's application level traffic. However, the clientside
+ system of rBridge must start when TBB (or tor) is started.
+
+**** b. It needs to be able to start tor.
+
+ This is necessary because the lines:
+ {{{
+ UseBridges 1
+ Bridge [...]
+ }}}
+ must be present before tor is started; tor will not reload these
+ settings via SIGHUP.
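+
+ A minimal sketch of generating those lines before tor is launched.
+ render_bridge_config is a hypothetical helper name, and the sketch assumes
+ that each bridge line is already in the argument order which tor's Bridge
+ option expects.
+
```python
def render_bridge_config(bridge_lines):
    """Return the torrc fragment enabling the given bridges.

    Hypothetical helper: blank entries are skipped, and every remaining
    bridge line is emitted verbatim after the `Bridge` keyword.
    """
    lines = ["UseBridges 1"]
    for bridge_line in bridge_lines:
        bridge_line = bridge_line.strip()
        if bridge_line:
            lines.append("Bridge " + bridge_line)
    return "\n".join(lines) + "\n"
```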
+
+**** c. TorLauncher is not the correct place for this functionality.
+
+ I am *not* adding this to TorLauncher. The clientside of rBridge will
+ eventually need to handle a lot of complicated new cryptographic
+ primitives, including commitments and zero-knowledge proofs. This is
+ dangerous enough, period, because there aren't really any libraries
+ for Pairing-Based Cryptography yet (though Tanja Lange has mentioned
+ to me that a student of theirs should have a good one finished some
+ time this year -- but I'm still going to count that as existing like
+ a unicorn). If I am to write this, I am doing it in
+ C/Python/Python-extensions. Not JS.
+
+***** c.i It could possibly launch TorLauncher
+
+ In other words, this thing edits the torrc according to its state,
+ and then either launches tor (if the user wants to use an installed
+ tor binary) or launches TorLauncher if we're running TBB.
+
+**** d. Little-t tor is not the correct place for this either.
+
+ It might be possible, instead of (b) or (c), to add this to little-t
+ tor. However, I feel like the bridge distribution problem is a
+ separate problem from tor, which should be (more or less) strictly an
+ implementation of the onion-routing design. Additionally, I do not
+ wish to pile more code or maintenance upon either Nick or Andrea, nor
+ do I wish to make little-t tor more monolithic.
+
+ I talked with Nick briefly about this at the Summer 2013 Tor Dev
+ meeting in München, and he agreed that little-t tor isn't where this
+ code should go.
+
+
+* References
+
+[0]: http://www-users.cs.umn.edu/~hopper/rbridge_ndss13.pdf
+[1]: https://twistedmatrix.com/documents/current/api/twisted.protocols.memcache.MemCacheProtocol.html
+[2]: http://stackoverflow.com/a/5162203
+[3]: http://findingscience.com/twisted/python/memcache/2012/06/09/txyam:-yet-another-memcached-twisted-client.html
+[4]: https://pypi.python.org/pypi/txredis
+[5]: https://github.com/fiorix/txredisapi
+[6]: https://github.com/andymccurdy/redis-py/
+[7]: http://www.dr-josiah.com/2012/03/why-we-didnt-use-bloom-filter.html
+[8]: http://redis.io/topics/data-types §"Strings"