[tor-bugs] #13664 [Tor]: Potential issue with rend cache object when intro points falls to 0.
Tor Bug Tracker & Wiki
blackhole at torproject.org
Tue Nov 4 21:26:39 UTC 2014
#13664: Potential issue with rend cache object when intro points falls to 0.
---------------------+---------------------
 Reporter:  dgoulet  |          Owner:
     Type:  defect   |         Status:  new
 Priority:  normal   |      Milestone:
Component:  Tor      |        Version:
 Keywords:  tor-hs   |  Actual Points:
Parent ID:           |         Points:
---------------------+---------------------
(Reproduced on Tor v0.2.6.1-alpha-dev (git-a142fc29aff4b476))
Here is the use case I was testing. I set up an HS on a remote server for
perf analysis. On my client, I made a small script that uses torsocks to
open 10 connections to that HS, each on a different circuit (assuming that
SOCKS5 user/pass == unique circuit works).
With the above, one time out of 10, all 10 connections successfully
connect and work. The rest of the time, an arbitrary number of connections
fail with "Host unreachable". I feel this is a combination of sometimes
luck and sometimes the real issue.
I analyzed this and my understanding is that the rend cache contains the v2
descriptor with stored intro points ("intro_nodes" variable). However,
during the cycle of trying to connect, some intro points may be
unreachable and are thus removed from that list. It also appears that we
can remove nodes from that list when closing circuits that were built in
"parallel":
{{{
Nov 04 15:36:08.000 [info] rend_client_close_other_intros(): Closing
introduction circuit 25 that we built in parallel (Purpose 7).
Nov 04 15:36:08.000 [debug] circuit_get_by_circid_channel_impl():
circuit_get_by_circid_channel_impl() returning circuit 0x7f6f1a171190 for
circ_id 2434373038, channel ID 0 (0x7f6f1a0425e0)
Nov 04 15:36:08.000 [info] circuit_mark_for_close_(): Failed intro circ
rejxmpqgho5vqdl4 to $EBE718E1A49EE229071702964F8DB1F318075FF8 (awaiting
ack). Removing from descriptor.
}}}
circuit_mark_for_close_() triggers an INTRO_POINT_FAILURE_GENERIC failure
that removes the intro point from the list. I might be misinterpreting the
"we built in parallel" feature, but what I can observe is that the intro
node list becomes empty at some point, which triggers a
"let's refetch that v2 descriptor!" behaviour.
{{{
Nov 04 15:36:08.000 [info] rend_client_report_intro_point_failure():
Unknown service "rejxmpqgho5vqdl4". Re-fetching descriptor.
}}}
However, the rend cache is not cleared of the old entry before fetching
that new descriptor. So once the v2 descriptor is received, we store it in
the cache using "rend_cache_store_v2_desc_as_client()" that prints this:
{{{
Nov 04 15:36:09.000 [info] rend_cache_store_v2_desc_as_client(): We
already have this service descriptor rejxmpqgho5vqdl4. [rendezvous-
service-descriptor i7hkcux5dghqv6ahstewyccltr6aud2x
}}}
So since we "have it" in the cache, we call "rend_client_desc_trynow()",
which fails completely because all intro points in the cache object are
gone, and this closes all pending connections.
Now, I think this happens because the heuristic for deciding that "We
already have the cache object" is just a comparison of the "desc" string,
here in rendcommon.c +1156:
{{{
  /* Do we already have this descriptor? */
  if (e && !strcmp(desc, e->desc)) {
    log_info(LD_REND,"We already have this service descriptor %s. [%s]",
             safe_str_client(service_id), desc);
    e->received = time(NULL);
    goto okay;
  }
}}}
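To make the effect of that check concrete, here is a minimal standalone sketch. It is not Tor code: toy_cache_entry_t, toy_store_desc() and the integer intro point counter are hypothetical simplifications made up for illustration. It only models the behaviour described above, where a re-fetched descriptor with identical text is treated as already cached and the pruned intro point list is left as it is.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a client-side rend cache entry.
 * None of these names exist in Tor. */
typedef struct {
  const char *desc;   /* raw descriptor text, compared with strcmp() */
  int n_intro_nodes;  /* intro points still considered usable */
} toy_cache_entry_t;

typedef enum { TOY_STORED_NEW, TOY_ALREADY_HAVE } toy_store_result_t;

/* Models the check quoted above: identical descriptor text means
 * "already have it", so the cached entry, including its pruned intro
 * point count, is left untouched. */
static toy_store_result_t
toy_store_desc(toy_cache_entry_t *e, const char *desc, int n_intro_in_desc)
{
  assert(e != NULL && desc != NULL);
  if (e->desc && !strcmp(desc, e->desc))
    return TOY_ALREADY_HAVE;
  /* Different text: replace the entry and its intro point count. */
  e->desc = desc;
  e->n_intro_nodes = n_intro_in_desc;
  return TOY_STORED_NEW;
}
```

In this model, once n_intro_nodes has been driven to 0 by failures, storing the same descriptor text again returns TOY_ALREADY_HAVE and the entry still has no usable intro points, which matches the symptom seen in the logs.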
I think when the intro point list drops to 0 nodes, we should remove the
entry from the cache and trigger the "fetch it again".
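
A rough sketch of that idea, under the same caveat: this is a toy model, not a patch against Tor. toy_cache_slot_t, toy_report_intro_failure() and toy_slot_store_desc() are hypothetical names, and the real cache is not a single slot with a counter. The point is only that evicting the entry when the last intro point is removed lets the re-fetched descriptor be stored as new even if its text is unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical single-slot cache; none of these names exist in Tor. */
typedef struct {
  bool present;       /* is there a cached entry at all? */
  const char *desc;   /* raw descriptor text */
  int n_intro_nodes;  /* intro points still considered usable */
} toy_cache_slot_t;

/* Remove one failed intro point. When none remain, evict the entry.
 * Returns true if the caller should fetch the descriptor again. */
static bool
toy_report_intro_failure(toy_cache_slot_t *slot)
{
  assert(slot != NULL);
  if (!slot->present)
    return true;
  if (slot->n_intro_nodes > 0)
    slot->n_intro_nodes--;
  if (slot->n_intro_nodes == 0) {
    slot->present = false;  /* evict the stale entry */
    slot->desc = NULL;
    return true;
  }
  return false;
}

/* Store a fetched descriptor. Returns true if it was stored as new,
 * false if an entry with identical text was already cached. */
static bool
toy_slot_store_desc(toy_cache_slot_t *slot, const char *desc, int n_intro)
{
  assert(slot != NULL && desc != NULL);
  if (slot->present && !strcmp(desc, slot->desc))
    return false;
  slot->present = true;
  slot->desc = desc;
  slot->n_intro_nodes = n_intro;
  return true;
}
```

With the eviction in place, a descriptor whose text is byte-for-byte identical to the old one is still stored after the list empties, so the intro points it carries become usable again.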
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/13664>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online