[tor-bugs] #4440 [Analysis]: Attempt an implementation of the relay-search database using MongoDB or CouchDB

Tue Nov 8 19:16:15 UTC 2011

#4440: Attempt an implementation of the relay-search database using MongoDB or
CouchDB
----------------------+-----------------------------------------------------
 Reporter:  karsten   |          Owner:  karsten
     Type:  task      |         Status:  new    
 Priority:  normal    |      Milestone:         
Component:  Analysis  |        Version:         
 Keywords:            |         Parent:         
   Points:            |   Actualpoints:         
----------------------+-----------------------------------------------------
 Our current relay-search function is very slow to return results.
 There's #2922 for improving the database schema to better support
 searching for single relays.  That ticket assumes that we'll continue to
 use PostgreSQL.

 Today I looked into MongoDB for the web server log analysis, and I wonder
 if MongoDB or CouchDB might be an alternative for implementing the relay-
 search database.

 Every consensus or descriptor could be represented as a document with
 references to other documents.  Indexes could make typical search queries
 fast.  We don't need complicated Map/Reduce functions, because we're only
 searching and looking up data, not aggregating anything.  (That's also the
 reason why I think this is worth trying out---replacing the metrics
 database that aggregates statistics with MongoDB/CouchDB may not make as
 much sense.)  Maybe we should run a simple comparison between the new
 PostgreSQL database that ExoneraTor uses, which is highly optimized for
 searches, and an implementation using MongoDB or CouchDB.
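
 A minimal sketch of that document model, in plain Python rather than
 against a real MongoDB or CouchDB server: descriptors and consensuses are
 dicts, a consensus refers to descriptors by ID, and a dict-based index
 stands in for a database index.  All field names, IDs, and helper
 functions here are hypothetical and only illustrate the idea.

```python
from collections import defaultdict

# Hypothetical descriptor documents, keyed by a made-up document ID.
descriptors = {
    "desc-1": {"_id": "desc-1", "nickname": "relayA", "address": "192.0.2.1"},
    "desc-2": {"_id": "desc-2", "nickname": "relayB", "address": "192.0.2.2"},
}

# Hypothetical consensus documents; each entry references a descriptor by ID
# instead of embedding it.
consensuses = {
    "cons-1": {
        "_id": "cons-1",
        "valid_after": "2011-11-08 18:00:00",
        "entries": [{"descriptor_id": "desc-1"}],
    },
    "cons-2": {
        "_id": "cons-2",
        "valid_after": "2011-11-08 19:00:00",
        "entries": [{"descriptor_id": "desc-1"}, {"descriptor_id": "desc-2"}],
    },
}


def build_index(docs, field):
    """Map each value of `field` to the IDs of the documents containing it."""
    index = defaultdict(list)
    for doc_id, doc in docs.items():
        index[doc[field]].append(doc_id)
    return index


def build_reference_index(cons_docs):
    """Map each descriptor ID to the IDs of the consensuses that list it."""
    index = defaultdict(list)
    for cons_id, cons in cons_docs.items():
        for entry in cons["entries"]:
            index[entry["descriptor_id"]].append(cons_id)
    return index


nickname_index = build_index(descriptors, "nickname")
reference_index = build_reference_index(consensuses)


def find_by_nickname(nickname):
    """Search: look up descriptors via the index, without scanning them all."""
    return [descriptors[doc_id] for doc_id in nickname_index.get(nickname, [])]


def consensuses_listing(descriptor_id):
    """Follow references backwards: valid-after times listing a descriptor."""
    cons_ids = reference_index.get(descriptor_id, [])
    return sorted(consensuses[cons_id]["valid_after"] for cons_id in cons_ids)
```

 Both lookups are plain searches through an index, with no aggregation
 step.  A real document store would maintain such indexes itself; whether
 that beats PostgreSQL is exactly what the comparison would have to show.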

 I don't know if such a solution will perform better than a PostgreSQL-
 based solution.  I think we should try to find out.

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/4440>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online