[tor-bugs] #25125 [Core Tor/Tor]: KIST: Bug: scheduler_release_channel: Non-fatal assertion !(smartlist_pos(channels_pending, chan) == -1)
Tor Bug Tracker & Wiki
blackhole at torproject.org
Fri Feb 2 13:25:24 UTC 2018
#25125: KIST: Bug: scheduler_release_channel: Non-fatal assertion
!(smartlist_pos(channels_pending, chan) == -1)
------------------------------+--------------------------------
Reporter: dgoulet | Owner: dgoulet
Type: defect | Status: assigned
Priority: High | Milestone: Tor: 0.3.3.x-final
Component: Core Tor/Tor | Version:
Severity: Normal | Keywords: 032-backport
Actual Points: | Parent ID:
Points: | Reviewer:
Sponsor: |
------------------------------+--------------------------------
Because of commit `cb5654f300312a8f` and `adaf3e9b89f62d68` merged last
night, we now see a new bug (benign as far as I can tell).
Here is the `SCHED_BUG()` on one of my test relay:
{{{
Feb 02 12:56:21.930 [warn] tor_bug_occurred_(): Bug:
src/or/scheduler.c:648: scheduler_release_channel: Non-fatal assertion
!(smartlist_pos(channels_pending, chan) == -1) failed. (on Tor 0.3.3.1
-alpha-dev 0fd91772b1ece9e0)
Feb 02 12:56:21.930 [warn] Bug: Non-fatal assertion
!(smartlist_pos(channels_pending, chan) == -1) failed in
scheduler_release_channel at src/or/scheduler.c:648. Stack trace: (on Tor
0.3.3.1-alpha-dev 0fd91772b1ece9e0)
Feb 02 12:56:21.930 [warn] Bug:
/root/git/tor/src/or/tor(log_backtrace+0x42) [0x5579b56bb582] (on Tor
0.3.3.1-alpha-dev 0fd91772b1ece9e0)
Feb 02 12:56:21.930 [warn] Bug:
/root/git/tor/src/or/tor(tor_bug_occurred_+0xb9) [0x5579b56d6919] (on Tor
0.3.3.1-alpha-dev 0fd91772b1ece9e0)
Feb 02 12:56:21.931 [warn] Bug:
/root/git/tor/src/or/tor(scheduler_release_channel+0x1aa) [0x5579b55edb0a]
(on Tor 0.3.3.1-alpha-dev 0fd91772b1ece9e0)
Feb 02 12:56:21.931 [warn] Bug: /root/git/tor/src/or/tor(+0x10bd94)
[0x5579b563fd94] (on Tor 0.3.3.1-alpha-dev 0fd91772b1ece9e0)
Feb 02 12:56:21.931 [warn] Bug:
/root/git/tor/src/or/tor(connection_handle_write+0x2e) [0x5579b564067e]
(on Tor 0.3.3.1-alpha-dev 0fd91772b1ece9e0)
Feb 02 12:56:21.931 [warn] Bug: /root/git/tor/src/or/tor(+0xbb229)
[0x5579b55ef229] (on Tor 0.3.3.1-alpha-dev 0fd91772b1ece9e0)
Feb 02 12:56:21.931 [warn] Bug: /root/git/tor/src/or/tor(+0xb8c8f)
[0x5579b55ecc8f] (on Tor 0.3.3.1-alpha-dev 0fd91772b1ece9e0)
Feb 02 12:56:21.931 [warn] Bug: /usr/lib/x86_64-linux-
gnu/libevent-2.0.so.5(event_base_loop+0x819) [0x7fcbd0fe94c9] (on Tor
0.3.3.1-alpha-dev 0fd91772b1ece9e0)
Feb 02 12:56:21.931 [warn] Bug:
/root/git/tor/src/or/tor(do_main_loop+0x25c) [0x5579b5587f0c] (on Tor
0.3.3.1-alpha-dev 0fd91772b1ece9e0)
Feb 02 12:56:21.931 [warn] Bug:
/root/git/tor/src/or/tor(tor_run_main+0x275) [0x5579b5589475] (on Tor
0.3.3.1-alpha-dev 0fd91772b1ece9e0)
Feb 02 12:56:21.931 [warn] Bug:
/root/git/tor/src/or/tor(tor_main+0x3a) [0x5579b5582b3a] (on Tor 0.3.3.1
-alpha-dev 0fd91772b1ece9e0)
Feb 02 12:56:21.931 [warn] Bug: /root/git/tor/src/or/tor(main+0x19)
[0x5579b55828a9] (on Tor 0.3.3.1-alpha-dev 0fd91772b1ece9e0)
Feb 02 12:56:21.931 [warn] Bug: /lib/x86_64-linux-
gnu/libc.so.6(__libc_start_main+0xf0) [0x7fcbcff01830] (on Tor 0.3.3.1
-alpha-dev 0fd91772b1ece9e0)
Feb 02 12:56:21.931 [warn] Bug: /root/git/tor/src/or/tor(_start+0x29)
[0x5579b55828f9] (on Tor 0.3.3.1-alpha-dev 0fd91772b1ece9e0)
}}}
It appears that we can release a channel from within
`connection_handle_write()`. At that point, the channel was popped from
the pending list and within the scheduler loop, we are writing the channel
connection's outbuf onto the wire so when it is release, this bug is
triggered in `scheduler_release_channel()`:
{{{
if (chan->scheduler_state == SCHED_CHAN_PENDING) {
if (SCHED_BUG(smartlist_pos(channels_pending, chan) == -1, chan)) {
log_warn(LD_SCHED, "Scheduler asked to release channel %" PRIu64 " "
"but it wasn't in channels_pending",
chan->global_identifier);
}}}
This doesn't affect tor in any ways and the current codeflow of channels
suggest that the release should accept that the channel is not in the
pending list even if the scheduler state is in PENDING.
The reason is that while in the scheduler loop, once the channel is popped
from the pending list, we *have* to keep it in PENDING state (see #24700)
until the scheduler is done with it and decides to put it back in the list
or change its state. This is because a channel can be rescheduled in while
it is in the scheduler loop which is *not* ideal for KIST but this is the
way it is for now.
We have to do a quick fix for 032 and 033 but this should be resolved in
034 with #24554.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/25125>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
More information about the tor-bugs
mailing list