[tor-dev] Get Stem and zoossh to talk to each other
Damian Johnson
atagar at torproject.org
Sun Aug 16 21:44:40 UTC 2015
>> > Ideally, zoossh should do the heavy lifting as it's implemented in a
>> > compiled language.
>>
>> This is assuming zoossh is dramatically faster than Stem by virtue of being
>> compiled. I know we've discussed this before but I forget the results - with
>> the latest tip of Stem (ie, with lazy loading) how do they compare? I'd expect
>> time to be mostly bound by disk IO, so little to no difference.
>
> zoossh's test framework says that it takes 36364357 nanoseconds to
> lazily parse a consensus that is cached in memory (to eliminate the I/O
> bottleneck). That amounts to approximately 27 consensuses a second.
>
> I used the following simple Python script to get a similar number for
> Stem:
>
>   with open(file_name) as consensus_file:
>     for router in stem.descriptor.parse_file(consensus_file,
>         'network-status-consensus-3 1.0',
>         document_handler = stem.descriptor.DocumentHandler.ENTRIES):
>       pass
>
> This script manages to parse 24 consensus files in ~13 seconds, which
> amounts to 1.8 consensuses a second. Let me know if there's a more
> efficient way to do this in Stem.
Interesting! My first thought is to wonder whether zoossh is even
reading the file content. A couple of quick things to try are...
  with open(file_name) as consensus_file:
    consensus_file.read()
... to see how much time is disk IO versus parsing. The second is to
try doing something practical (say, counting the number of relays with
the Exit flag). Stem does some bytes => unicode normalization, which
might account for some of the difference, but other than that I'm at a
loss for what would be taking the time.
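
A rough sketch of both checks together, in case it helps. The helper
names (time_call, read_only, count_exits, compare) are made up for
illustration and are not part of Stem. The parse timing also includes
its own file read, and a second read usually comes from the OS cache,
so treat the numbers as ballpark figures only.

```python
import time

CONSENSUS_TYPE = 'network-status-consensus-3 1.0'


def time_call(func, *args):
  """Return (result, elapsed seconds) for a single call to func."""
  start = time.perf_counter()
  result = func(*args)
  return result, time.perf_counter() - start


def read_only(path):
  """Disk IO baseline: read the whole file without parsing anything."""
  with open(path, 'rb') as consensus_file:
    return len(consensus_file.read())


def count_exits(path):
  """Parse with Stem and count router status entries with the Exit flag."""
  # Imported lazily so the IO baseline above runs without Stem installed.
  import stem
  import stem.descriptor

  with open(path, 'rb') as consensus_file:
    entries = stem.descriptor.parse_file(
      consensus_file,
      CONSENSUS_TYPE,
      document_handler = stem.descriptor.DocumentHandler.ENTRIES,
    )
    return sum(1 for router in entries if stem.Flag.EXIT in router.flags)


def compare(path):
  """Print the IO-only time next to the time for read plus parse."""
  size, io_time = time_call(read_only, path)
  exits, parse_time = time_call(count_exits, path)
  print('read %i bytes in %0.3fs' % (size, io_time))
  print('counted %i exits in %0.3fs' % (exits, parse_time))
```

Calling compare(file_name) on a single consensus should show how much
of the ~0.5 seconds per file is spent before Stem does any parsing.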
Cheers! -Damian