[tor-bugs] #33211 [Circumvention/Snowflake]: proxy-go sometimes gets into a 100+% CPU state
Tor Bug Tracker & Wiki
blackhole at torproject.org
Wed Feb 26 04:00:37 UTC 2020
#33211: proxy-go sometimes gets into a 100+% CPU state
-------------------------------------+-----------------------------------
 Reporter:  dcf                      |          Owner:  cohosh
     Type:  defect                   |         Status:  needs_information
 Priority:  Medium                   |      Milestone:
Component:  Circumvention/Snowflake  |        Version:
 Severity:  Normal                   |     Resolution:
 Keywords:                           |  Actual Points:
Parent ID:                           |         Points:
 Reviewer:                           |        Sponsor:
-------------------------------------+-----------------------------------
Comment (by dcf):
Replying to [comment:20 cohosh]:
> I haven't been able to reproduce this yet, but I'm going to try
> running it outside of a Docker container to see if that helps. I'm
> using attachment:0001-Add-logging-for-pion-library-calls.patch to log
> debug messages from the pion library.
I got it to happen again, with the logging patch. Something went wrong
with the profiling and the profile output was 0 bytes. I had built the
executable with `-race`, but I didn't see any race-related messages in the
log.
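As a side note on the log format: the `sctp DEBUG:` lines in the log (linked below) are what pion's default logger emits once a debug-level LoggerFactory is plugged in. A minimal sketch of doing that programmatically, based on my assumptions about the pion/webrtc v2 and pion/logging APIs (the attached patch goes further and also logs the individual library calls):
{{{
// Hedged sketch, not the attached patch: hand pion a debug-level
// LoggerFactory through the SettingEngine so subsystems like sctp,
// dtls, and ice log at debug level.
package main

import (
	"github.com/pion/logging"
	"github.com/pion/webrtc/v2"
)

// newDebugAPI returns a webrtc.API whose peer connections inherit a
// debug-level logger.
func newDebugAPI() *webrtc.API {
	loggerFactory := logging.NewDefaultLoggerFactory()
	loggerFactory.DefaultLogLevel = logging.LogLevelDebug

	settingEngine := webrtc.SettingEngine{LoggerFactory: loggerFactory}
	return webrtc.NewAPI(webrtc.WithSettingEngine(settingEngine))
}

func main() {
	api := newDebugAPI()
	// Create peer connections through this API instead of the package-level
	// constructor, e.g.: pc, err := api.NewPeerConnection(webrtc.Configuration{})
	_ = api
}
}}}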
 * [https://people.torproject.org/~dcf/bug33211/proxy-go.2e9e8071.profiled+logging+race proxy-go.2e9e8071.profiled+logging+race] executable
 * [https://people.torproject.org/~dcf/bug33211/proxy-go.2e9e8071.profiled+logging+race.log.xz proxy-go.2e9e8071.profiled+logging+race.log.xz] log file, 6 MB, decompresses to 158 MB because of massive logging
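On the 0-byte profile: `runtime/pprof` doesn't write anything to the output file until `StopCPUProfile` is called, so a process that gets killed mid-run leaves an empty profile. A minimal sketch of guarding against that, assuming profiling is wired up with `runtime/pprof` (the file name and signal handling are illustrative, not the actual proxy-go code):
{{{
// Hypothetical sketch, not the actual proxy-go code: catch
// SIGINT/SIGTERM and flush the CPU profile before exiting, since
// runtime/pprof only writes the profile when StopCPUProfile is called.
package main

import (
	"log"
	"os"
	"os/signal"
	"runtime/pprof"
	"syscall"
)

func main() {
	f, err := os.Create("proxy-go.prof") // illustrative output path
	if err != nil {
		log.Fatal(err)
	}
	if err := pprof.StartCPUProfile(f); err != nil {
		log.Fatal(err)
	}

	sig := make(chan os.Signal, 1)
	signal.Notify(sig, syscall.SIGINT, syscall.SIGTERM)
	go func() {
		<-sig
		pprof.StopCPUProfile() // flush the accumulated samples
		f.Close()
		os.Exit(0)
	}()

	// The proxy main loop would run here; block forever for the sketch.
	select {}
}
}}}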
I started everything according to the procedure in comment:17 at about
2020-02-26 00:50. It jumped to 250% CPU about one hour later at 2020-02-26
01:58:07. (Note that the pion log lines use local time (UTC–7) instead of
UTC, so the timestamp to look for is 18:58:07.) It's obvious in the log
file where it happens, starting at line 48424:
{{{
sctp DEBUG: 18:58:07.240692 association.go:648: [0xc000094ea0] sending SACK: SACK cumTsnAck=4213437655 arwnd=1048576 dupTsn=[]
sctp DEBUG: 18:58:07.295662 stream.go:118: [0:0xc000094ea0] reassemblyQueue readable=true
sctp DEBUG: 18:58:07.295750 stream.go:120: [0:0xc000094ea0] readNotifier.signal()
sctp DEBUG: 18:58:07.295798 stream.go:122: [0:0xc000094ea0] readNotifier.signal() done
sctp DEBUG: 18:58:07.414562 stream.go:118: [0:0xc000094ea0] reassemblyQueue readable=true
sctp DEBUG: 18:58:07.414629 stream.go:120: [0:0xc000094ea0] readNotifier.signal()
sctp DEBUG: 18:58:07.414671 stream.go:122: [0:0xc000094ea0] readNotifier.signal() done
sctp DEBUG: 18:58:07.414811 association.go:648: [0xc000094ea0] sending SACK: SACK cumTsnAck=4213437657 arwnd=1048040 dupTsn=[]
sctp DEBUG: 18:58:07.561561 stream.go:118: [0:0xc000094ea0] reassemblyQueue readable=true
sctp DEBUG: 18:58:07.561630 stream.go:120: [0:0xc000094ea0] readNotifier.signal()
sctp DEBUG: 18:58:07.561671 stream.go:122: [0:0xc000094ea0] readNotifier.signal() done
sctp DEBUG: 18:58:07.636247 stream.go:118: [0:0xc000094ea0] reassemblyQueue readable=true
sctp DEBUG: 18:58:07.636309 stream.go:120: [0:0xc000094ea0] readNotifier.signal()
sctp DEBUG: 18:58:07.636358 stream.go:122: [0:0xc000094ea0] readNotifier.signal() done
sctp DEBUG: 18:58:07.636511 association.go:648: [0xc000094ea0] sending SACK: SACK cumTsnAck=4213437659 arwnd=1048576 dupTsn=[]
sctp DEBUG: 18:58:07.744138 stream.go:118: [0:0xc000094ea0] reassemblyQueue readable=true
sctp DEBUG: 18:58:07.744215 stream.go:120: [0:0xc000094ea0] readNotifier.signal()
sctp DEBUG: 18:58:07.744249 stream.go:122: [0:0xc000094ea0] readNotifier.signal() done
sctp DEBUG: 18:58:07.762724 association.go:2066: [0xc0000941a0] T3-rtx timed out: nRtos=63 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:58:07.763026 association.go:2066: [0xc0000941a0] T3-rtx timed out: nRtos=64 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:58:07.763203 association.go:2066: [0xc0000941a0] T3-rtx timed out: nRtos=65 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:58:07.763346 association.go:2066: [0xc0000941a0] T3-rtx timed out: nRtos=66 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:58:07.763518 association.go:2066: [0xc0000941a0] T3-rtx timed out: nRtos=67 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:58:07.763722 association.go:2066: [0xc0000941a0] T3-rtx timed out: nRtos=68 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:58:07.763895 association.go:2066: [0xc0000941a0] T3-rtx timed out: nRtos=69 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:58:07.764038 association.go:2066: [0xc0000941a0] T3-rtx timed out: nRtos=70 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:58:07.764202 association.go:2066: [0xc0000941a0] T3-rtx timed out: nRtos=71 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:58:07.764426 association.go:2066: [0xc0000941a0] T3-rtx timed out: nRtos=72 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:58:07.764601 association.go:2066: [0xc0000941a0] T3-rtx timed out: nRtos=73 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:58:07.764741 association.go:2066: [0xc0000941a0] T3-rtx timed out: nRtos=74 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:58:07.764950 association.go:2066: [0xc0000941a0] T3-rtx timed out: nRtos=75 cwnd=1228 ssthresh=4912
}}}
It goes on with those `T3-rtx timed out` messages for 3.5 minutes, until line
1398659, where I stopped everything. That's about 6500 log lines per second.
This was snowflake commit 2e9e8071787398d9713395d94d93b5427ac8d452,
pion/sctp commit 32f160c5ee1fe015c168a0fc8272633b73c5e4bf.
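For what it's worth, the spacing of those T3-rtx lines looks wrong for exponential backoff: nRtos goes from 63 to 75 in about 2 ms, i.e. the timer is being rearmed with an effectively zero interval, which would explain both the CPU spin and the log volume. A back-of-the-envelope sketch with RFC 4960's suggested constants (not pion's actual values) of what the interval should be after that many consecutive timeouts:
{{{
// Back-of-the-envelope check, not pion code: with textbook RFC 4960
// backoff the RTO doubles on every T3-rtx timeout and is clamped at
// RTO.max, so by nRtos=63 the timer should be waiting the maximum
// interval, not firing every ~200 microseconds as in the log above.
package main

import (
	"fmt"
	"time"
)

func main() {
	const (
		rtoInitial = 3 * time.Second  // RFC 4960 suggested RTO.initial
		rtoMax     = 60 * time.Second // RFC 4960 suggested RTO.max
	)
	rto := rtoInitial
	for nRtos := 1; nRtos <= 75; nRtos++ {
		rto *= 2 // double on each consecutive timeout
		if rto > rtoMax {
			rto = rtoMax
		}
	}
	fmt.Println("expected T3-rtx interval after 75 timeouts:", rto)
	// Observed in the log: nRtos 63..75 within roughly 2 milliseconds.
}
}}}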
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/33211#comment:24>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online