[metrics-bugs] #29787 [Metrics/Onionperf]: Enumerate possible failure cases and include failure information in .tpf output
Tor Bug Tracker & Wiki
blackhole at torproject.org
Wed Apr 24 07:12:26 UTC 2019
#29787: Enumerate possible failure cases and include failure information in .tpf
output
-------------------------------+------------------------------
Reporter: karsten | Owner: metrics-team
Type: enhancement | Status: new
Priority: Medium | Milestone:
Component: Metrics/Onionperf | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-------------------------------+------------------------------
Comment (by karsten):
Alright, I finally made some progress here!
Last things first, I made the following plot:
[[Image(op_errors-2019-04-24.png, 500px)]]
This plot uses your script with a minor extension:
{{{
diff --git a/op_errors.py b/op_errors.py
index 1c8b278..7169e4d 100644
--- a/op_errors.py
+++ b/op_errors.py
@@ -131,6 +131,7 @@ def main():
#if there are no failures at all in the circuit data then the csv column will simply be left empty
pass
header = [
+ 'unix_ts_end', 'hostname_local',
'transfer_id', 'is_error', 'error_code', 'state_failed',
'total_seconds', 'endpoint_remote', 'total_bytes_read',
'circuit_id', 'stream_id','buildtime_seconds',
'failure_reason_local',
}}}
I ran it over all the OnionPerf .json files that we have.
Then I merged the three fields `error_code`, `failure_reason_local` (if
present), and `failure_reason_remote` (if present, and only if
`failure_reason_local` is present, too) into a single combined error code.
This yields 11 distinct combined error codes, all of which appear in
the graph.
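The merging rule described above can be sketched roughly as follows. This is only an illustration, not the code used to produce the plot: the function name `combined_error_code`, the `/` separator, and the sample values other than `READ` are assumptions made up for the example. Only the three field names come from the CSV header.

```python
from collections import Counter


def combined_error_code(row, sep="/"):
    """Merge the three error fields of one CSV row into one string.

    The separator is an assumption; the rule itself follows the ticket:
    failure_reason_remote is only used if failure_reason_local is present.
    """
    parts = [row["error_code"]]
    local = row.get("failure_reason_local")
    remote = row.get("failure_reason_remote")
    if local:
        parts.append(local)
        # The remote reason only counts when a local reason exists, too.
        if remote:
            parts.append(remote)
    return sep.join(parts)


# Hypothetical rows; apart from READ, the values are placeholders.
rows = [
    {"error_code": "READ", "failure_reason_local": "", "failure_reason_remote": ""},
    {"error_code": "CODE_A", "failure_reason_local": "LOCAL_X", "failure_reason_remote": ""},
    {"error_code": "CODE_A", "failure_reason_local": "LOCAL_X", "failure_reason_remote": "REMOTE_Y"},
    {"error_code": "CODE_A", "failure_reason_local": "", "failure_reason_remote": "REMOTE_Y"},
]

# Tally how often each combined error code occurs.
counts = Counter(combined_error_code(row) for row in rows)
```

Note that the last sample row collapses to plain `CODE_A`, because a remote reason without a local one is ignored.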
The next step will be to understand in more detail what causes these
errors. For example:
- `READ` is a fun one. The cases I looked at (all from op-ab) were all
onion service cases. The server had finished sending the response, and
all data was "in flight". Yet, some time later, the client had its
connection closed shortly before receiving the last remaining bytes. This
could be a bug. Still, it needs closer investigation.
acute, if you'd like to take a look, too, maybe write down which combined
error codes you're going to look at, so that we can avoid duplicating
effort. (Thanks for all your efforts so far!)
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/29787#comment:20>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online