[tor-bugs] #5232 [BridgeDB]: Import bridges into BridgeDB in a separate thread and database transaction
Tor Bug Tracker & Wiki
blackhole at torproject.org
Fri Mar 21 12:47:22 UTC 2014
#5232: Import bridges into BridgeDB in a separate thread and database transaction
-------------------------+-------------------------------------------------
Reporter: karsten | Owner: sysrqb
Type: defect | Status: needs_revision
Priority: major | Milestone:
Component: BridgeDB | Version:
Resolution: | Keywords: bridgedb-email, bridgedb-db, bridgedb-https, bridgedb-0.1.x
Actual Points: | Parent ID:
Points: |
-------------------------+-------------------------------------------------
Comment (by sysrqb):
Replying to [comment:15 isis]:
> Sweet. I had to deal with a bit of merge conflicts to get it into
> master... do you mind if I separate the additions to the unittest in
> 2a21dfcb55e659775fcde9dd4f668b98f41d0fd6 into another unittest? If I do
> it, then you won't have to deal with the merge conflicts too.
>
Nope, that's fine by me. I have some more, but I've made them shorter this
time.
> So, this seems to work great, the parsing is done in a separate thread!
> However, the call which takes longer, especially at start up time, is the
> call to `bridgedb.Stability.addOrUpdateBridgeHistory()`. However, after
> start up, the HTTPS distributor continues to function and hand out bridges
> while the new descriptors are being parsed.
>
Indeed. This was a tradeoff between complexity and availability. In theory
this branch should significantly increase the latter at the cost of a small
increase in the former.
> For 10,000 bridge descriptors, with `addOrUpdateBridges()`:
> {{{
> * Starting the servers took: 1h 6m 58s
> * Restarting (SIGHUP) took: 2m 13s
> * Dumping buckets (SIGUSR1) took: 11s
> }}}
>
For 10,000 bridges, time taken until normal operation resumed
(times in parentheses are the additional time taken for stability
calculations):
{{{
* Starting the server took: 32s (3s + 11s (for cleanup))
* Restarting (SIGHUP) took: 43s (4s + 10s (for cleanup))
}}}
I also checked the availability of the email and HTTPS distributors during
reload and found that they do become unavailable for a few seconds (which
is much better than the current situation, but still not good). The only
blocking operation during reload is in the main thread, when we overwrite
the current data structures with the new ones we just created in a
background thread, so this seems like the obvious place for the
bottleneck.
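To illustrate where that blocking step sits, here is a minimal sketch. It
is not the code from the branch: `Distributor`, `parse_descriptors`,
`start_reload` and `apply_pending` are hypothetical names, the descriptor
format is made up, and a plain `queue.Queue` stands in for whatever
mechanism hands results back to the main thread.

```python
import queue
import threading


class Distributor:
    """Hypothetical stand-in for a distributor holding the live bridge set."""

    def __init__(self):
        self.bridges = {}

    def get_bridges(self, n):
        # Take one reference to the current mapping and read only from it,
        # so a swap that happens mid-request cannot mix old and new data.
        current = self.bridges
        return sorted(current)[:n]


def parse_descriptors(lines):
    """Hypothetical parser: builds a brand-new mapping, touching no live state."""
    new_bridges = {}
    for line in lines:
        fingerprint, address = line.split()
        new_bridges[fingerprint] = address
    return new_bridges


def start_reload(lines, results):
    """Parse in a background thread and queue the finished mapping."""
    worker = threading.Thread(
        target=lambda: results.put(parse_descriptors(lines)))
    worker.start()
    return worker


def apply_pending(distributor, results):
    """Run in the main thread: swap in a finished mapping, if one is ready."""
    try:
        new_bridges = results.get_nowait()
    except queue.Empty:
        return False
    # The only step that touches live state: a single reference rebind.
    distributor.bridges = new_bridges
    return True
```

In this sketch the main thread only rebinds one reference, which should be
quick. If the real overwrite copies or rebuilds structures in place, that
work would block the distributors for as long as it takes.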
More testing, and a comparison of startup timing against master, will
follow.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/5232#comment:16>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online