[tor-dev] gettimeofday() Syscall Issues
teor
teor2345 at gmail.com
Fri Jan 2 12:18:16 UTC 2015
> From: Yawning Angel <yawning at schwanenlied.me>
> Subject: Re: [tor-dev] gettimeofday() Syscall Issues
>
> On Thu, 01 Jan 2015 23:42:42 -0500
> Libertas <libertas at mykolab.com> wrote:
>
>> The first two account for the bulk of the calls, as they are in the
>> core data relaying logic.
>>
>> Ultimately, the problem seems to be that the caching is very weak. At
>> most, only half of the calls to tor_gettimeofday_cached_monotonic()
>> use the cache. It appears in the vomiting print statements that
>> loading a single simple HTML page
>> (http://www.openbsd.org/faq/ports/guide.html to be exact) will cause >30
>> gettimeofday() syscalls. You can imagine how that would accumulate for
>> an exit carrying 800 KB/s if the caching doesn't improve much with
>> additional circuits.
>
> So while optimization is cool and all, I'm not seeing why this
> specifically is the underlying issue.
>
> Each cell can contain 498 bytes of user payload. Looking at things
> simplistically this is 800 KiB/s -> 1644 cells/sec, leaving you with
> approximately 608 microseconds of processing time per cell.
>
> On my i5-4250U box, gettimeofday() takes 22 ns on Linux, and 2441 ns on
> FreeBSD. I'm not sure how accurate the FreeBSD results are as it was
> in a VirtualBox VM (getpid() on the same VM takes 124 ns). If someone
> has an OpenBSD box they should benchmark gettimeofday() and see how long
> the call takes.
>
> Taking the FreeBSD case (since we know that tor works fine on Linux), a
> single gettimeofday() call takes approximately 0.39% of the per-cell
> processing budget.

IPredator has complained that tor on Linux spends too much time calling time() when pushing 500 Mbit/s, which is an issue for them under 3.x series kernels, but not under kernel 2.6:
https://ipredator.se/guide/torserver#performance
>
> For reference (assuming gettimeofday() in *BSD really is this shit
> performance wise), 7000 calls to gettimeofday() is 17.09 ms worth of
> calls.
>
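The arithmetic above can be reproduced with a few lines of C. This is only a sketch of the back-of-the-envelope calculation. The helper names are invented for illustration, and the inputs (800 KiB/s, 498 bytes of payload per cell, 2441 ns per call, 7000 calls) are the figures quoted above.

```c
/* Hypothetical helpers, not tor code. */

/* Microseconds of processing time available per cell at a given
 * throughput, if every cell carries `payload_bytes` of user data. */
double
per_cell_budget_us(double bytes_per_sec, double payload_bytes)
{
  double cells_per_sec = bytes_per_sec / payload_bytes;
  return 1e6 / cells_per_sec;
}

/* Percentage of a per-cell budget (in microseconds) consumed by one
 * call that costs `call_ns` nanoseconds. */
double
call_budget_percent(double call_ns, double budget_us)
{
  return 100.0 * call_ns / (budget_us * 1000.0);
}

/* Total time in milliseconds spent in `calls` calls of `call_ns` each. */
double
total_call_ms(double calls, double call_ns)
{
  return calls * call_ns / 1e6;
}
```

With per_cell_budget_us(800 * 1024, 498) the budget comes out at about 608 microseconds, one 2441 ns call is roughly 0.4% of that, and total_call_ms(7000, 2441) is about 17.09 ms, in line with the numbers quoted.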
> The clock code in tor does need love, so I wouldn't object to cleanup,
> but I'm not sure it's in the state where it's causing the massive
> performance degradation that you are seeing.
>
Yawning/Libertas,

I just reviewed my profiling of an exit relay running chutney verify with 200MB of random data. This is on OS X 10.9.5 with tor 0.2.6.2-alpha-dev running the chutney basic-min network.

The three leaf functions that take the most time in the call graph are:
* channel_timestamp_recv
* channel_timestamp_active
* time
Each of these functions takes around 16% of the execution time; the next nearest function is sha1_block_data_order_avx at 4%.

While I understand that OS X, BSD, and Linux syscalls aren't necessarily identical, we now have results for the following platforms suggesting that calling time() too often has a performance impact:
* Linux kernel 3.x
* OpenBSD
* OS X 10.9
My results suggest a maximum performance improvement of 15% on OS X if we reduced the calls to time() to a reasonable number per second.
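
One way to get the number of time() calls down would be to refresh a cached value once per event loop iteration and have the hot paths read the cache instead. The following is a rough sketch of that idea only. The names are hypothetical, it is not how tor's clock code is actually written, and it assumes one-second granularity is good enough for the callers involved.

```c
#include <time.h>

/* Hypothetical coarse time cache, not tor code. */
static time_t cached_now = 0;
static unsigned long refresh_count = 0;

/* Call once per event loop iteration: this is the only place that
 * actually calls time(). */
void
refresh_cached_time(void)
{
  cached_now = time(NULL);
  ++refresh_count;
}

/* Cheap replacement for time(NULL) on hot paths. Falls back to a real
 * refresh if the cache has never been filled. */
time_t
cached_time(void)
{
  if (cached_now == 0)
    refresh_cached_time();
  return cached_now;
}

/* Number of real time() calls made so far, for checking the effect. */
unsigned long
cached_time_refresh_count(void)
{
  return refresh_count;
}
```

With this layout, per-cell code that calls cached_time() costs a load and a compare, and the number of real time() calls is bounded by the number of event loop iterations rather than the number of cells.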
teor
pgp 0xABFED1AC
hkp://pgp.mit.edu/
https://gist.github.com/teor2345/d033b8ce0a99adbc89c5
http://0bin.net/paste/Mu92kPyphK0bqmbA#Zvt3gzMrSCAwDN6GKsUk7Q8G-eG+Y+BLpe7wtmU66Mx