[tor-commits] [gettor/master] Initial version for GetTor new spec

ilv at torproject.org ilv at torproject.org
Tue Sep 22 23:39:10 UTC 2015


commit bd2dd1ea00e79dcf60417f160d65ef062b0eef86
Author: ilv <ilv at users.noreply.github.com>
Date:   Fri May 16 02:09:36 2014 -0400

    Initial version for GetTor new spec
---
 spec/design/blacklist.txt |   74 ++++++++++++++++++++++++++++
 spec/design/core.txt      |  107 ++++++++++++++++++++++++++++++++++++++++
 spec/design/smtp.txt      |  118 +++++++++++++++++++++++++++++++++++++++++++++
 spec/design/twitter.txt   |   99 +++++++++++++++++++++++++++++++++++++
 spec/overview.txt         |   94 ++++++++++++++++++++++++++++++++++++
 5 files changed, 492 insertions(+)

diff --git a/spec/design/blacklist.txt b/spec/design/blacklist.txt
new file mode 100644
index 0000000..0cabaa2
--- /dev/null
+++ b/spec/design/blacklist.txt
@@ -0,0 +1,74 @@
+   Google Summer of Code 2014 GetTor Revamp - Blacklist module
+   Author: Israel Leiva - <israel.leiva at usach.cl>
+   Last update: 2014-05-16
+   Version: 0.01
+   Changes: First version
+ 
+1. Preface
+ 
+   Since GetTor was created it has been a collection of functions and
+   classes separated in various modules. As its main purpose was
+   to serve files over SMTP, almost all current files have SMTP-related
+   procedures, including black and white lists. The proposed design for 
+   the blacklist module intends to separate GetTor services from the 
+   blacklist procedures.
+   
+2. Blacklist module
+
+   The main functionalities the blacklist module should provide are:
+   
+      * Check if a given entry is blacklisted for a given service
+      * Add/update/remove entries.
+      * Provide standard criteria to prevent flooding.
+        
+3. Design
+
+   The new design should consist of the following files, directories and 
+   functions:
+   
+   * lib/gettor/blacklist/: Directory for storing blacklisted users of 
+     different services.
+     
+        ----- service1.blacklist: users blocked for service1
+        ----- service2.blacklist: users blocked for service2
+   
+   * lib/gettor/blacklist.py: Blacklist module of GetTor.
+         
+      isBlacklisted(address, provider)
+         Check whether the given address is in the provider's blacklist
+         and whether it meets certain restrictions. For example, it could
+         check when the entry was last updated in the blacklist.
+            
+   
+4. Roadmap
+
+   A possible example of how the blacklist module could work:
+   
+   a. A service receives a request and calls the blacklist module.
+   b. The blacklist module checks the blacklist for that service.
+   c. If the user (hashed) is present in the blacklist, it checks when
+      the entry was last updated. If that date is more than X days ago,
+      it updates the entry with the current date and returns false. If
+      not, it returns true. If the user is not present in the file, it
+      adds the user with the current date and returns false.
+   
+   
+5. Discussion
+
+5.1 Mistakes
+
+   Mistakes concerning the requested package should be considered. If I
+   send a message asking for the Linux 'es' package but write 'en' instead
+   of 'es', should I be able to ask again, or must I wait until I'm no
+   longer blacklisted?
+
+5.2 Users
+
+   The method presented above (Roadmap) should consider weekly or monthly
+   clean-up of the list.
+
+5.3 SorryMessage
+
+   Maybe when a user sends too many requests we could send them a
+   message saying that they won't be able to ask for packages for the
+   next X days.
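
The roadmap in section 4 of blacklist.txt can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the in-memory dict standing in for a serviceN.blacklist file, the SHA-256 hashing, the 7-day default for "X days", and the names hash_user and is_blacklisted are all hypothetical and not taken from the spec.

```python
import hashlib
from datetime import datetime, timedelta


def hash_user(address):
    """Hash a normalized address so the raw value is never stored (assumed scheme)."""
    normalized = address.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def is_blacklisted(entries, address, now, max_age_days=7):
    """Apply roadmap step c to one request.

    entries maps hashed users to the datetime of their last update; it is a
    stand-in for the per-service blacklist file described in the design.
    """
    key = hash_user(address)
    last_update = entries.get(key)
    if last_update is not None and now - last_update <= timedelta(days=max_age_days):
        # Present and updated within the last X days: still blacklisted.
        return True
    # Either a new user or a stale entry: record the current date.
    entries[key] = now
    return False


if __name__ == "__main__":
    service_entries = {}
    first = datetime(2014, 5, 16)
    print(is_blacklisted(service_entries, "user@example.com", first))
    print(is_blacklisted(service_entries, "user@example.com", first + timedelta(days=1)))
```

Passing the current time in as an argument keeps the check easy to test; a real module would presumably read the clock itself and persist the entries to disk.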
diff --git a/spec/design/core.txt b/spec/design/core.txt
new file mode 100644
index 0000000..447c93b
--- /dev/null
+++ b/spec/design/core.txt
@@ -0,0 +1,107 @@
+   Google Summer of Code 2014 GetTor Revamp - Core module
+   Author: Israel Leiva - <israel.leiva at usach.cl>
+   Last update: 2014-05-16
+   Version: 0.01
+   Changes: First version
+ 
+1. Preface
+ 
+   Since GetTor was created it has been a collection of functions and
+   classes separated in various modules. As its main purpose was
+   to serve files over SMTP, almost all current files have SMTP-related
+   procedures, including address normalization, message composition, etc.
+   The proposed design for the core module intends to separate GetTor 
+   main functionalities which are independent of the service that 
+   transports the bundles.
+   
+2. Core module
+
+   The main functionalities the core module should provide are:
+   
+      * Receive a request with OS version, architecture, bundle 
+        language, and respond with the respective links.
+      * Generate links, per request or on demand, depending on whether
+        the former is accepted as part of the new design.
+      * Log anonymous transactions.
+        
+3. Design
+
+   The new design should consist of the following files, directories and 
+   functions:
+   
+      * core.conf: Configuration values, e.g. base directory.
+      
+      * providers/: Directory for providers configuration.
+   
+         ----- providersList.txt: list of valid providers. 
+         ----- provider1.conf: configuration for provider1.
+         ----- provider2.conf: configuration for provider2.
+      
+         All this data is added manually.
+    
+      * mirrors.txt: Contains official mirrors. One per line. Added manually.
+   
+      * logs/: Directory for logs. Added automatically.
+   
+         ----- yyyy-mm-dd.log: daily log of requests.
+   
+      * lib/gettor/core.py: Core module of GetTor.
+   
+         getLinks(os_version, arch, locale)
+            Returns links for os_version (in both archs) in locale language.
+            This will read the providers list and call __generateLinks() 
+            for each one of them, and then call __getMirrors().
+         
+            Example: getLinks('OSX', 'en')
+         
+         __generateLinks(options, provider)
+            Generate links for a specific provider according to the options
+            received (os_version, locale). This will try to import the 
+            provider module and call the uploadBundle function.
+         
+            Example (within the module): __generateLinks(options, 'dropbox')
+     
+         __getMirrors() 
+             Obtains mirrors from mirrors.txt. 
+          
+         __logRequest(options)
+             Log information about the request for future stats (e.g. which
+             OS from which service is the most required).
+ 
+      * lib/gettor/providers/provider.py: There should be one module per
+        provider with the uploadBundle public function. There should be 
+        at least three modules at the end of GSoC: dropbox.py, drive.py, 
+        github.py
+     
+          uploadBundle(options)
+             Calls the provider internal functions to upload the required 
+             bundle according to the options received. These internal 
+             functions will depend solely on the API requirements from 
+             the provider.
+   
+4. Roadmap
+
+   An example of how the core module works:
+   
+   a. The SMTP service receives a request. 
+   b. The SMTP calls getLinks() with the options sent by the user.
+   c. getLinks() calls __generateLinks() and then __getMirrors().
+   d. getLinks() constructs a message with the information obtained.
+   e. getLinks() calls __logRequest().
+   f. getLinks() returns the message previously constructed.
+   g. The SMTP service creates a message with the links obtained and
+      sends it to the user.
+   
+5. Discussion
+   
+5.1 Cache 
+
+   The above design was conceived for per-request link generation. Another
+   way of doing this would be to maintain a cache of generated links and
+   call __generateLinks() depending on the cache's last modified time.
+   Reading links from this cache should include checking whether the
+   given links still exist.
+
+5.2 Logs
+
+   Should we maintain separate logs for successful and failed requests?
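
The getLinks() flow from section 4 of core.txt (steps c to f) could look roughly like the sketch below. It is an illustration under assumptions, not the proposed implementation: providers are passed in as plain callables instead of imported modules, mirrors and the log are plain lists instead of mirrors.txt and the logs/ directory, and the two-argument form follows the getLinks('OSX', 'en') example. The names get_links and fake_upload_bundle and the example.invalid URLs are hypothetical.

```python
def fake_upload_bundle(options):
    """Stand-in for a provider's uploadBundle(); returns a made-up link."""
    return "https://example.invalid/%(os_version)s/%(locale)s" % options


def get_links(os_version, locale, providers, mirrors, log):
    """Collect one link per provider, append the mirrors, log, and return a message.

    providers maps a provider name to a callable that plays the role of
    uploadBundle(options); mirrors is a list of mirror URLs; log is a list
    that receives one anonymous record per request.
    """
    options = {"os_version": os_version, "locale": locale}
    lines = []
    # Step c: ask every configured provider for a link, then add the mirrors.
    for name in sorted(providers):
        lines.append("%s: %s" % (name, providers[name](options)))
    lines.append("Mirrors: " + ", ".join(mirrors))
    # Step e: log only the request options, never who asked.
    log.append(dict(options))
    # Steps d and f: the message is the joined lines.
    return "\n".join(lines)


if __name__ == "__main__":
    request_log = []
    print(get_links("OSX", "en", {"dropbox": fake_upload_bundle},
                    ["https://mirror.example.invalid"], request_log))
```

Injecting the providers keeps the core independent of any one provider API, which matches the stated aim of separating the core from the services that transport the bundles.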
diff --git a/spec/design/smtp.txt b/spec/design/smtp.txt
new file mode 100644
index 0000000..e86e864
--- /dev/null
+++ b/spec/design/smtp.txt
@@ -0,0 +1,118 @@
+   Google Summer of Code 2014 GetTor Revamp - SMTP module
+   Author: Israel Leiva - <israel.leiva at usach.cl>
+   Last update: 2014-05-16
+   Version: 0.01
+   Changes: First version
+ 
+1. Preface
+ 
+   Since GetTor was created it has been a collection of functions and
+   classes separated in various modules. As its main purpose was
+   to serve files over SMTP, almost all current files have SMTP-related
+   procedures, including address normalization, message composition, etc.
+   The proposed design for the SMTP module intends to separate GetTor 
+   main functionalities from the services, in this case, SMTP.
+   
+2. SMTP module
+
+   The main functionalities the SMTP module should provide are:
+   
+      * Receive requests via mail. 
+      * Identify user instructions, such as asking for help or for a 
+        specific bundle (OS version, language).
+      * Get the required links from the core module.
+      * Send different types of answers to the user.
+      * Manage blacklists to avoid flooding.
+      * Log requests for stats (anonymous).
+        
+3. Design
+
+   The new design should consist of the following files, directories and 
+   functions:
+   
+   * lib/gettor/services/smtp.py: SMTP module of GetTor.
+
+      __init__(configuration path)
+         Creates an object according to the configuration values.
+         
+      processEmail(email object)
+         Process emails received (by forwarding). 
+    
+      __parseEmail(email object)
+         Parse the raw email sent by processEmail(). Check for multi-part
+         emails and then parse the text part. It also tries to get the
+         locale information from the user's address.
+         
+      __parseText(email object)
+         Parse the text part of an email looking for the package requested.
+         
+      __getFrom(email object)
+         Returns the from address of an email object.
+      
+      __getLocale(address)
+         Tries to get the locale information from an email address.
+         
+      __checkBlacklist(address)
+         Check if the given address is blacklisted by comparing the
+         hashed address. If the address is not present, it's added. If
+         present, check the date when it was added. It is yet to be
+         defined how many days will be considered for blacklisting or
+         whether another method will be used. This uses the blacklist module.
+         
+      __sendReply(address)
+         Sends a reply to the user with the links required. It asks the
+         core module for the links.
+      
+      __sendDelayAlert(address)
+         If enabled (in the configuration), sends a delay message to the
+         user letting them know that the links are on their way.
+         
+      __sendHelp(address)
+         Sends a message to the user with help instructions.
+         
+      __createEmail(from, to, subject, body)
+         Creates an email object.
+         
+      __logRequest(options)
+         Log information about the request for future stats (e.g. which
+         OS and language are the most required). If this is called
+         after a failure, a copy of the email should be stored.
+         
+   * BASE_DIR/logs/: Directory for logs. The BASE_DIR should be in the
+     configuration file.
+   
+         ----- yyyy-mm-dd.log: daily log of requests.
+            
+   
+4. Roadmap
+
+   An example of how the SMTP module should work:
+   
+   a. The SMTP service receives a request (via forwarding). 
+   b. The email sender is checked for blacklisting (comparing hashes).
+   c. The email is parsed, obtaining the package requested and the 
+      locale information.
+   d. If the email was asking for help, a help reply is sent.
+   e. If the email was invalid, the process stops. The failure is logged,
+      along with the email that triggered it.
+   f. If the email was valid and the delay alert is set, then a reply
+      informing the user that the links are on their way is sent.
+   g. If the email was valid, the SMTP service asks the core module for
+      the links.
+   h. After that, a reply is sent to the user.
+   i. In all cases the request is logged (with no user information).
+   
+   
+5. Discussion
+   
+5.1 Email forwarding
+
+   Are we going to support forwarding emails as ForwardPackage() did in 
+   the old GetTor?
+
+5.2 Blacklist sublists
+
+   With fewer types of request now (two if no forwarding is added), is
+   creating sublists for different types of requests necessary to blacklist
+   an email address? Or should we blacklist it if it floods anything?
+
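
The parsing helpers described in section 3 of smtp.txt (__getLocale and __parseText) might be sketched like this. Everything specific here is an assumption for illustration: the "+locale" tag convention in the address, the SUPPORTED_LOCALES and KNOWN_PACKAGES values, the "en" default, and the function names get_locale and parse_text. Only email.utils.parseaddr is a real standard-library call.

```python
from email.utils import parseaddr

# Hypothetical values; the spec does not list locales or package keywords.
SUPPORTED_LOCALES = ("en", "es")
KNOWN_PACKAGES = ("windows", "linux", "osx")


def get_locale(address, default="en"):
    """Try to read a locale from a '+tag' in the address, e.g. gettor+es@..."""
    _, addr = parseaddr(address)
    local_part = addr.split("@", 1)[0]
    if "+" in local_part:
        tag = local_part.split("+", 1)[1].lower()
        if tag in SUPPORTED_LOCALES:
            return tag
    return default


def parse_text(text):
    """Return 'help', a known package keyword, or None for an invalid request."""
    for line in text.splitlines():
        for word in line.lower().split():
            if word == "help":
                return "help"
            if word in KNOWN_PACKAGES:
                return word
    return None


if __name__ == "__main__":
    print(get_locale("GetTor <gettor+es@torproject.org>"))
    print(parse_text("Please send me the linux bundle"))
```

Returning None for unrecognized text gives the caller a single place to decide between logging a failure (roadmap step e) and continuing with the request.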
diff --git a/spec/design/twitter.txt b/spec/design/twitter.txt
new file mode 100644
index 0000000..46a178a
--- /dev/null
+++ b/spec/design/twitter.txt
@@ -0,0 +1,99 @@
+   Google Summer of Code 2014 GetTor Revamp - Twitter module
+   Author: Israel Leiva - <israel.leiva at usach.cl>
+   Last update: 2014-05-16
+   Version: 0.01
+   Changes: First version
+ 
+1. Preface
+ 
+   Since GetTor was created it has been a collection of functions and
+   classes separated in various modules. As its main purpose was
+   to serve files over SMTP, almost all current files have SMTP-related
+   procedures, including address normalization, message composition, etc.
+   The proposed design for the Twitter module intends to separate GetTor 
+   main functionalities from the services, in this case, Twitter.
+   
+2. Twitter module
+
+   The main functionalities the Twitter module should provide are:
+   
+      * Receive requests via direct messages. 
+      * Identify user instructions, such as asking for help or for a 
+        specific bundle (OS version, language).
+      * Get the required links from the core module.
+      * Send different types of answers to the user.
+      * Split answers to fit Twitter's format.
+      * Manage blacklists to avoid flooding.
+      * Log requests for stats (anonymous).
+        
+3. Design
+
+   The new design should consist of the following files, directories and 
+   functions:
+   
+   * lib/gettor/services/Twitter.py: Twitter module of GetTor.
+
+      __init__(configuration path)
+         Creates an object according to the configuration values.
+         
+      processDM(message)
+         Process direct messages received. 
+    
+      __parseDM(message)
+         Parse the direct message received. Check for the package requested
+         and the locale information.
+         
+      __getUser(message)
+         Gets the user from the message sent. 
+         
+      __checkBlacklist(user)
+         Check if the given user is blacklisted by comparing the
+         hashed user. Yet to define how many days will be considered for 
+         blacklisting or if another method will be used. For this it uses 
+         the blacklist module.
+         
+      __sendReply(user)
+         Sends a reply to the user with the links required. It asks the
+         core module for the links and then splits them.
+         
+      __sendHelp(user)
+         Sends a message to the user with help instructions.
+      
+      __splitMessage(message)
+         Receives the links message and splits it according to Twitter's
+         format.
+         
+      __CheckNewFollowers()
+         In order to ask for links the user has to follow the GetTor 
+         account. The Twitter module will be constantly checking for
+         new followers and follow them back.
+         
+      __FollowUser(user)
+         Follow the given user.
+         
+      __logRequest(options)
+         Log information about the request for future stats (e.g. which
+         OS and language are the most required). If this is called
+         after a failure, a copy of the DM should be stored.
+         
+   * BASE_DIR/logs/: Directory for logs. BASE_DIR should be in the 
+     configuration file.
+   
+         ----- yyyy-mm-dd.log: daily log of requests.
+            
+   
+4. Roadmap
+
+   An example of how the Twitter module should work:
+   
+   a. The Twitter account receives a DM.
+   b. The Twitter service checks whether it is a valid message and whether
+      the user is in the blacklist, and then tries to obtain the package
+      requested and the locale information.
+   c. The Twitter service asks the core module for the links, then it
+      splits the message received to fit Twitter's format.
+   d. One or more DMs are sent back to the user.
+   e. For all this, the user must follow the GetTor account. The Twitter
+      service will be constantly checking for new followers and following
+      them back.
+   
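
The __splitMessage() step in twitter.txt could be approached as in the sketch below, which packs whole words into chunks so that links are never cut in half. The 140-character default, the choice to split on whitespace, and the name split_message are assumptions for illustration; the spec only says the message is split "according to Twitter's format".

```python
def split_message(message, limit=140):
    """Split a message into chunks of at most `limit` characters on word boundaries."""
    chunks = []
    current = ""
    for word in message.split():
        if len(word) > limit:
            # A single token (for example a long URL) cannot be split safely.
            raise ValueError("word longer than limit: %r" % word)
        candidate = word if not current else current + " " + word
        if len(candidate) <= limit:
            current = candidate
        else:
            # The current chunk is full; start a new one with this word.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for part in split_message("aaa bbb ccc", limit=7):
        print(part)
```

Note that this sketch collapses runs of whitespace, including newlines, into single spaces; a version that preserves line breaks between links would need to split per line first.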
diff --git a/spec/overview.txt b/spec/overview.txt
new file mode 100644
index 0000000..435d256
--- /dev/null
+++ b/spec/overview.txt
@@ -0,0 +1,94 @@
+   Google Summer of Code 2014 GetTor Revamp - Overview
+   Author: Israel Leiva - <israel.leiva at usach.cl>
+   Last update: 2014-05-16
+   Version: 0.01
+   Changes: First version
+
+1. Background
+
+   GetTor was created as a program for serving Tor and related files over 
+   SMTP, thus avoiding direct and indirect censorship of Tor's software, 
+   in particular, the Tor Browser Bundle (TBB). Users interact with GetTor 
+   by sending emails to a specific email address. In the past, after the 
+   user specified his OS and language, GetTor would send him an attachment 
+   with the required package. This worked until the bundles were too large 
+   to be sent as attachments in most email providers. In order to fix this 
+   GetTor started to send links instead.
+
+2. Current status
+
+   The GetTor status can be summarized in the following points:
+
+      * Emails are sent to gettor at torproject.org
+      * The GetTor reply contains: TBB links, signatures (with text guides
+        for verification), mirrors, support instructions in six languages.
+      * Dropbox links are sent to download the TBB and signatures.
+      * Users cannot ask for packages in their language.
+      * English-only replies are sent.
+      * Any email sent to GetTor receives the same information in reply;
+        there is no recognition of instructions.
+      * Link generation is not fully automated.
+      * All code is written in Python. Various parts are not currently used.
+      * Current repositories are [0] and [1].
+   
+3. Proposal
+
+   The accepted proposal [2] for Google Summer of Code (GSoC) 2014 proposes 
+   rewriting the current GetTor using a modular design, with a core module 
+   that handles the main GetTor functionalities, and several other modules, 
+   one for each service (e.g. SMTP), which can interact with the core and 
+   send replies to the users. Three modules will be developed for the 
+   purposes of GSoC: SMTP, Twitter, Skype|XMPP. 
+   
+3.1. Goals
+
+   The main goals of this proposal are the following:
+
+      * Provide old GetTor functionalities, such as replies in several
+        languages and recognition of user instructions.
+      * Send less information in each reply.
+      * Support more providers for uploading the TBB packages.
+      * Automate link generation.
+      * Clearer, modular and well-documented code.
+      * Possibility to create new modules for other common services.
+   
+3.2. Design
+
+   Preliminary designs for the core module and the services can be found
+   in the design/ folder. Each service is expected to have a Python script
+   that adds the logic for using it. For example, there should be a script
+   that receives the emails and uses the SMTP module. For simplicity,
+   I've tried to specify mostly the main functions of every module; there
+   are some functions, like opening and writing files, that were not
+   considered in this preliminary phase.
+     
+4. Discussion
+
+4.1. Skype
+
+   My co-mentor for GSoC, Nima, has publicly rejected the idea of creating
+   a module for Skype and proposed to implement one for XMPP instead.
+   I chose Skype for its popularity, but I have no other strong reason
+   to keep this option, so it's probable that an XMPP transport will be
+   implemented instead.
+
+4.2. Storing links
+
+   My original proposal considered the fact that links could be stored 
+   somewhere with restricted access, ideally a git repository. Nima 
+   mentioned that ideally the links shouldn't be stored. Maybe this idea
+   could be applied only to the mirrors and providers configuration (see
+   core module design).
+
+4.3. Generating unique URLs
+
+   Nima mentioned that unique URLs could be generated for each request, 
+   and in case the user doesn't have access to SSL, these links could be 
+   served and later deleted or recycled. I like this idea.
+
+5. References
+
+   [0] https://gitweb.torproject.org/gettor.git
+   [1] https://gitweb.torproject.org/user/sukhbir/gettor.git
+   [2] https://ileiva.github.io/gettor_proposal.html
+   




