[tor-bugs] #13720 [Ooni]: Investigate possible performance improvements to the ooni-pipeline
    Tor Bug Tracker & Wiki 
    blackhole at torproject.org
       
    Fri Jun 26 23:57:34 UTC 2015
    
    
  
#13720: Investigate possible performance improvements to the ooni-pipeline
-----------------------------+---------------------
     Reporter:  hellais      |      Owner:  hellais
         Type:  enhancement  |     Status:  new
     Priority:  normal       |  Milestone:
    Component:  Ooni         |    Version:
   Resolution:               |   Keywords:
Actual Points:               |  Parent ID:
       Points:               |
-----------------------------+---------------------
Comment (by dcf):
 For what it's worth, I was also struggling with the slowness of the Python
 yaml module (in the context of
 [https://lists.torproject.org/pipermail/ooni-dev/2015-June/000288.html this project]
 to find server-side blocking of Tor in OONI reports). For me,
 yaml.CSafeLoader is ''way'' faster, roughly 30× in wall-clock time.
 These are the times to parse 1.5 GB of gzip files, consisting of
 http_requests reports between 2015-06-16 and 2015-06-24:
 {{{
 yaml.safe_load_all(f)
 real    138m29.467s
 user    138m27.808s
 sys     0m6.356s

 yaml.load_all(f, Loader=yaml.CSafeLoader)
 real    4m40.021s
 user    5m21.960s
 sys     0m7.428s
 }}}
 I had tried optimizing the HTML parsing and gzip decompression; the YAML
 decoding was the bottleneck by far.
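
 A minimal sketch of the faster variant, assuming each report is a
 gzip-compressed stream of YAML documents. yaml.CSafeLoader is only present
 when PyYAML was built against libyaml, so the sketch falls back to the
 pure-Python SafeLoader if it is missing. The names FastSafeLoader and
 iter_report_documents are made up for illustration and are not part of the
 ooni-pipeline.

```python
import gzip

import yaml

# CSafeLoader only exists when PyYAML was compiled with libyaml support;
# otherwise fall back to the (much slower) pure-Python SafeLoader.
try:
    from yaml import CSafeLoader as FastSafeLoader
except ImportError:
    from yaml import SafeLoader as FastSafeLoader


def iter_report_documents(path):
    """Yield each YAML document from a gzip-compressed report file.

    Hypothetical helper: `path` is assumed to point at a .yaml.gz report.
    """
    with gzip.open(path, "rb") as f:
        # load_all is lazy, so documents are parsed one at a time
        # instead of holding the whole report in memory.
        for doc in yaml.load_all(f, Loader=FastSafeLoader):
            yield doc
```

 Because both loaders are "safe" loaders, switching between them should not
 change which YAML tags are accepted, only the parsing speed.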
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/13720#comment:2>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
    
    