[tbb-bugs] #17568 [Tor Browser]: Clean up tor-control-port.js in Torbutton
Tor Bug Tracker & Wiki
blackhole at torproject.org
Fri Nov 13 22:19:06 UTC 2015
#17568: Clean up tor-control-port.js in Torbutton
-------------------------------------------------+-------------------------
Reporter: gk | Owner: tbb-
Type: task | team
Priority: Medium | Status:
Component: Tor Browser | needs_review
Severity: Normal | Milestone:
Keywords: tbb-torbutton, | Version:
TorBrowserTeam201511R | Resolution:
Parent ID: | Actual Points:
Sponsor: | Points:
-------------------------------------------------+-------------------------
Comment (by cypherpunks):
Replying to [comment:6 arthuredelstein]:
> >
> > I was trying to track the input back to Tor's output and stumbled
across the 6500-lines control.c... So what I was wondering was:
> > - In general, what is the threat expectation here? What has to be
considered adversary-controlled input?
> > - Is it worth re-implementing the full control protocol parser in JS
so that you can verify each reply?
> > - Hopefully control.c takes a good defensive parsing approach. Does
control.c offer some guarantees about its output so that JS can just rely
on it?
>
> This is a good point worth considering. What do you consider to be a
full control protocol parser.
So, like I said, I'm not familiar with the codebase and so far only
skimmed the spec, please bear with me.
After reading gk's message where he described the flaw that caused the bug
in the ticked that spawned this one, my thinking went something like:
Aha, so the are 2 kinds of data being handled here: data that comes from
the network (from a remote Tor node; e.g. the name thingy gk mentioned),
and data generated by the local Tor node.
The first kind is evil, this is data you must not trust and do your best
to validate before attempting to use.
One may get away with trusting the second kind, since the program
producing it is already running on your system. If you trust this data,
this implies you trust the Tor node you are talking to. Since you trust
the local Tor node, just make sure you only connect to *it*.
Now, you get both kinds of data munged into a single protocol: the Tor
control protocol. But in fact, the evil data was already parsed by the
Tor node (in control.c or even earlier, IIUC).
Hopefully, that parsing is correct, and the result, safe. If it is, then
clients of the control protocol, "controllers", actually only get the
second kind of data. This is what I meant with "guarantees offered by
control.c".
In such case, controllers can get away with being a bit sloppy and parsing
with things like, umm, I don't know, regex. <insert trite quote about
regex here>
So when I said "full control protocol" I was thinking of not just the
"protocol" part of the control protocol, but the data as well (some of
which might be evil): a complete, comprehensive parser.
Maybe this is what the current code tries to do already?
The right solution, everyone knows, is to (a) define a complete grammar
and then (b) a strict parser for that grammar.
Do we have (a) yet? In "control-spec.txt" I saw several non-terminal
undefined (look for "XXX", for example).
Once we have (b), that same parser should be used in all JS code (button,
launcher, etc.)
Notice that this must have been done already: there are a few standalone
Tor controllers out there (I remember "stem" but I think there are
others).
> Another possibility is to consider whether the Tor control port should
switch to JSON or something similar to simplify code at both ends.
I guess the usefulness of this would be to simply rely on existing,
correct, secure parsers of JSON (or whatever else). A library like that
would indeed simplify code in both Tor and the controller.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/17568#comment:9>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
More information about the tbb-bugs
mailing list