[tor-bugs] #18809 [Core Tor/Tor]: Handle linked connections better during bootstrap
Tor Bug Tracker & Wiki
blackhole at torproject.org
Tue May 17 00:01:14 UTC 2016
#18809: Handle linked connections better during bootstrap
-------------------------------------------------+-------------------------
Reporter: teor | Owner: arma
Type: defect | Status:
Priority: Medium | needs_revision
Component: Core Tor/Tor | Milestone: Tor:
Severity: Normal | 0.2.8.x-final
Keywords: must-fix-before-028-rc, | Version: Tor:
TorCoreTeam201605, review-group-1 | 0.2.8.1-alpha
Parent ID: | Resolution:
Reviewer: andrea | Actual Points: medium
| Points: medium
| Sponsor:
| SponsorS-can
-------------------------------------------------+-------------------------
Comment (by teor):
Replying to [comment:24 arma]:
> Replying to [comment:23 teor]:
> > Code review:
>
> Thanks for the review!
>
> > 91c5801 avoid following through on a consensus fetch if we have one
already arriving
> > Suffers from a race condition where two connections could attach at
the same time. But this is unlikely and relatively benign: we'll just
fetch the consensus twice.
>
> Can you elaborate on this one? This part of Tor is single-threaded, so
they can't attach at the same time. One of them attaches before the other,
and the first one to win gets immediately moved to state
AP_CONN_STATE_CONNECT_WAIT as soon as it's chosen, so the second one will
know that there's already a winner.
Oops, I always imagine more threads in Tor than what we actually have.
> > a7665df close other consensus fetches when we get a consensus
> > There were two race conditions avoided by the previous code:
> > * the old function closed the connection immediately to avoid wasting
data. But this meant that if a connection was receiving data or undergoing
other operations, there could be errors. I think calling
connection_mark_for_close is better, even if it's less efficient.
>
> The trouble is that connection_mark_for_close() just closes the dir conn
-- not the underlying tls connection, and so it doesn't undo any begin
cell already sent onto the network. So we can close the consensus request,
but if it already sent its begin cell, we will still get a consensus
coming back at us -- we'll just drop all the data cells because we won't
be expecting it anymore. This behavior is similar to the problem in bug
#16844.
>
> But to be clear, this new function
connection_dir_close_consensus_fetches() is aiming to find and kill
directory requests that haven't attached to a circuit yet. There should
not be any directory requests that are attached to a circuit (i.e.
fetching a consensus) at this point -- they would have been noticed inside
91c5801 and stopped there.
Ok, sounds sensible.
> > * connections can be initiated after
connection_dir_list_by_purpose_and_resource() is called. These connections
are never cleaned up. I think this is fixed by checking when we attach a
stream in 91c5801.
>
> You mean the connection_dir_list_by_purpose_and_resource() call inside
connection_dir_close_consensus_fetches()? That function is only called
after we've called networkstatus_set_current_consensus(), so we should
have a consensus in place. But! What about the case where we got a
consensus, but we don't have the certs yet for verifying it? Then
networkstatus_consensus_is_bootstrapping() will return 1, yes? Crappo.
Does this mean we want to modify
networkstatus_consensus_is_bootstrapping() so it says no if there is a
consensus but it's in waiting-for-certs limbo? After all, we don't want to
start launching more consensus fetches in that situation.
Yes, I think we'll have to modify this part.
> > d5a9628 simplify more -- we only call these funcs when bootstrapping
> > makes sense, but you'd better document that those functions should
only ever be called when bootstrapping
>
> The functions are named update_consensus_bootstrap_multiple_downloads
and update_consensus_bootstrap_attempt_downloads(), and also their
function comments start with "When we're bootstrapping, ". Should we do
more?
Nope.
> > 1f72653 fix a bug where relays would use the aggressive client
bootstrapping retry number
> > I think the correct fix for this is to check `server_mode(options)` in
`consensus_max_download_tries()`, and `return
options->TestingConsensusMaxDownloadTries` if it's true.
> >
> > Otherwise, this bug is just waiting to happen again.
>
> Hm? I was actually thinking of just removing
consensus_max_download_tries() entirely, since it is only called from two
places, and one of the places wants the first clause of the function, and
the other place wants the last clause of the function. I guess this is my
instinct to simplify rather than complexify.
Then let's split it up.
> > 507883e rip out the unit tests for the functions i removed
> > Please leave the parts of the unit tests that test existing functions,
or remove those functions if they are unused:
> > * connection_dir_list_by_purpose_and_resource
> > * connection_dir_count_by_purpose_and_resource
>
> They are still used. Would you find it easy to add back whichever parts
of the unit tests you want to keep? Or throw out this last commit and do
whatever you think is smart? :) Thanks!
I'd keep the bits of the unit tests that test these functions.
I'll let you make your changes first.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/18809#comment:25>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
More information about the tor-bugs
mailing list