[tor-bugs] #26227 [Core Tor/Stem]: Review existing stem.client code
Tor Bug Tracker & Wiki
blackhole at torproject.org
Wed Jun 13 05:49:34 UTC 2018
#26227: Review existing stem.client code
---------------------------+------------------------------
Reporter: dmr | Owner: dmr
Type: task | Status: needs_review
Priority: Medium | Milestone:
Component: Core Tor/Stem | Version:
Severity: Normal | Resolution:
Keywords: client | Actual Points:
Parent ID: | Points:
Reviewer: atagar | Sponsor:
---------------------------+------------------------------
Comment (by dmr):
Replying to [comment:3 dmr]:
> [...] within the next day. Sorry for the delay!
Sorry //again// for the delay!
Here's comments I've collected from my code review.
Some of them are pretty straightforward and probably need no discussion. I
ran out of time with the commits I pushed, but these could probably be
addressed without much review. I'll note that as **Straightforward**; you
can probably skip over reading those ;).
For others, where I note **Suggestion** or **Question**, I'd like to get
some feedback on whether you think such a change is appropriate. Feel free
to respond with a quick "Go ahead" or "Let's not do that" for various
suggestions!
I can then make tickets or commits to the code-review branch - whichever
is most appropriate!
== Maybe //slightly// out of `stem.client` scope, but mentioning
anyways... (lest I forget)
==== stem.descriptor.remote `_download_from_*port()` signatures
**Straightforward**
Instead of `url`, this method should take an `endpoint` and `resource` -
having the same signature as `_download_from_orport()`
url creation should be thus done within the `_download_from_dirport()`
method
==== Assume DirPort for `_pick_endpoint()`, or...?
**Question**: Should we assume DirPort for the directory authorities by
default, or should it switch to ORPort, or have it be specifiable?
Right now it assumes DirPort.
(Note that the last retry for `_download_descriptors()` will use directory
authorities (and thus DirPort) if `Query.fall_back_to_authority` is
`True`.)
Suggestion, if have it be specifiable:
We could allow `fall_back_to_authority` in the constructor to take in an
`Endpoint` //class// to make it clear which type to use, and default to
`DirPort` if specified as `True` (backwards-compatible).
== stem.client and consumers
==== New `Relay` connection per `_download_from_orport()` call
**Summary**: each `_download_from_orport()` call creates a new `Relay`
connection; ideally these should be multiplexed
**Suggestion (for future)**
At a future point, we may want to refactor this to share connections and
possibly circuits with a stem.client singleton.
This is discussed more in the architecture ticket (#26227)
At that point, we'd want to avoid hardcoding the `stream_id`.
==== Protocol constants: `stem.client.Relay`
**Suggestion**
Somewhere I think we want to define a `SUPPORTED_LINK_PROTOCOLS` constant
We likely won't have full support for any protocol listed, so that could
be a bit disingenuous for a while. But in order to move forward with
development, it's a chicken-and-the-egg problem ;).
**Suggestion**
In similar fashion to defining the `SUPPORTED_LINK_PROTOCOLS` constant, we
should define constants for all the subprotocols. See
[[https://gitweb.torproject.org/torspec.git/tree/tor-
spec.txt?id=419ba0c307106f7d719b947faf7e4235f95b37ae#n1817|tor-spec
section 9]].
==== Parameter naming: `stem.client.Relay.connect()`, `link_protocols`
**Summary**: The method takes a `link_protocols` parameter. This should be
named a bit clearer for its purpose.
**Suggestion**: It could be named something like
`link_protocols_to_advertise` and default to `SUPPORTED_LINK_PROTOCOLs`.
Rationale:
I can see that it appears to already generate confusion with its current
name, because `_download_from_orport()` attempts to use link_protocol
values associated with the endpoint, i.e. the //remote// relay's
advertised link protocols.
==== Constants/magics for `link_protocol` values
**Straightforward**
Based on the above, refactor these
There's a few constants and a few magic numbers, e.g. `[3]`.
==== Naming: `stem.client.datatype.Size` subclasses/attributes
**Suggestion**:
It might be good to switch `CHAR`/`SHORT`/etc. to `UCHAR`/`USHORT`/etc.
I don't know what the convention is here, but it may help for readability.
I'm personally used to `U<size>` to signify unsigned and `<size>` to
signify signed. I think switching to `U<size>` would make the code
//potentially// easier to read for newcomers from various backgrounds.
**Suggestion**:
Similarly, it may help to put the bits length in it, too - for the most
immediate readability.
So, e.g.:
* `UCHAR8`
* `USHORT16`
* etc.
For reference, on a quick glance...
* [[https://gitweb.torproject.org/trunnel.git/tree/README|trunnel]]
[[https://gitweb.torproject.org/trunnel.git/tree/lib/trunnel/Grammar.py?id=c6e8a499f5a5f00113ea268cfcef9e7676c6ed96#n86|appears
to use]] `u8`, `u16`, etc.
* `tor`
[[https://gitweb.torproject.org/tor.git/tree/src/common/torint.h?id=1b04dab60c549d9f0d621e1a115cab8a49c839f9|appears
to use]] `uint8_t`, `uint16_t`, etc.
==== RelaySocket `recv()` may give us partial cells
**Summary**: our socket recv(), by nature of network communication, may
unblock at "incomplete" cell lengths.
(I believe so, at least.)
We don't account for mid-Cell reads/`recv()`s. We try to pop a Cell at a
time, and I feel it's probably possible to get a buffer read that is
between cell boundaries from our lower-level I/O, and thus error out.
Variable-width cells could potentially exacerbate this.
stem.client should be designed to handle that at a higher layer
**Suggestion**: new exception type named* `IncompleteCell`, raised from
`pop()`/`unpack()` instead of `ValueError`
That would let us catch it specifically and re-block on the recv
continuously, appending buffers until we have received the complete cell.
`ValueError` then would be used for malformed cells in which the code
believe it's seen all the content.
//* exact naming TBD//
==== RelaySocket `recv()` and `send()` use/handling
**Summary**: `recv()` is directly used by consumers at different layers.
`send()` too
We furthermore don't have an architecture with a centralized `recv()`. For
instance, it looks like `Relay.__init__()`, `Relay.create_circuit()`, and
`Circuit.send()` all directly use orport `recv()`, instead of e.g. having
a thread delegated to do the recv and sorting of cells into buckets or
whatnot.
The problem with these disparate consumers doing `recv()` so directly is
that they are dropping cells that don't pertain to what they're trying to
do. (And per above, maybe dropping portions of a cell, corrupting the
connection "framing".)
There may also be something similar with `send()`, but since underneath it
uses a socket //file// and flushes, I think it might be ok. (Though I've
read a lot of different documentation on receive/send methods - I could be
mixing things up.)
**Suggestion**: Manage these buffers in a centralized fashion, in a thread
More discussion thereon in the architecture ticket (#26227).
==== Decryption/encryption of `RelayCell` - wrong abstraction layer?
**Summary**: it looks like our abstraction is breaking down a bit
Much of what's happening in
[[https://gitweb.torproject.org/stem.git/tree/stem/client/__init__.py?id=4306013ab6e868e99553fbb78ed14ef51a896b7d#n223|`stem.client.Circuit.send()`]]
is more directly handling cell content than it should be.
**Suggestion**: move much of this into `RelayCell` methods. The forward-
key and backward-key data will need to reside in the Circuit, but much of
the rest of this can be encapsulated in the Cell.
==== Nonuniform integrity/content/etc. checking within `stem.client.cell`
**Summary**: we do cell content checks in a variety of methods, but not
uniformly
Often this is in `__init__` //or// `pack` //and/or// `unpack`.
Mostly it looks like the validation only happens in a subset of these, for
a given cell type. Sometimes the checks may be inconsistent.
//(For example: `AuthChallengeCell` seems to allow for 0 methods to be
specified, which would be bad (the unpacking expects at least 1 method,
iiuc))//
I think ideally we'd want it to happen for all of these.
That would allow a cell's contents to be changed e.g.:
* after it's unpacked, before it's packed
* after it's constructed, before it's packed
* etc.
... without bypassing the checks
**Suggestion**: Cells define a more generic `validate()` method for each
cell type; call this in each of `__init__`, `pack`, and `unpack`
Side suggestion:
To allow delayed validation and improper cell creation, we might want to
provide some kwargs to the constructor and pack methods; maybe:
* `__init__(..., delay_validation=False)`
* `pack(..., allow_invalid=False)`
And similarly, if the parsing allows for it, we might have something
similar (`allow_invalid=False`) for the unpack method.
Using these in non-default situations would be rare and probably mostly
useful for integration tests of edge cases with tor, seeing how it
responds to bad input.
==== Incongruency with Cell.unpack() and Field.unpack()
**Summary**: `Cell.unpack()` creates a generator whereas `Field.unpack()`
ensures that there is //no// remainder content.
**Suggestion**: we might want to standardize these
Perhaps to...:
* `unpack()` creates a generator; `unpack_one()` ensures there is no
remainder content
* `unpack_many()` creates a generator; `unpack()` ensures there is no
remainder content
==== `CircuitCell` subclass; what about `ConnectionCell` subclass?
**Summary**: We have a `CircuitCell` subclass, for cells that must pertain
to a circuit. Can we have a `ConnectionCell` subclass for cells that
//must not// pertain to a circuit?
I believe* for Cells, they would be either CircuitCell (circuit id must be
> 0) or ConnectionCell (circuit id must be 0).
Currently we enforce in `_pack()` that CircuitCells have a circuit id, but
we don't enforce anything for the other cells.
**Suggestion**: subclass all other Cells into a `ConnectionCell` subclass;
refactor a bit for circuit id checks
(Looking for your feedback here on //whether// you think we should do
this. The //how// is very straightforward.)
*This division is actually not completely clear to me in the spec, but I'm
pretty confident about it. I can hunt down spec wording while
implementing, and if it's not really not clear, track an edit to make this
division clearer.
==== Define a max-payload-length security parameter?
**Summary**: variable-length cell payloads have a length stored in a SHORT
but no specified max size
**(Open) Question**: does little-t `tor` define a security parameter here?
Potentially this should have a security parameter.
A SHORT allows a payload up to 65535 bytes, which isn't that bad (just
under 64KBytes), but I'd wonder if little-t `tor` actually defines
something smaller. There certainly isn't anything defined in the spec.
==== Not all the ValueError exceptions have tests
**Suggestion**: employ a code coverage tool.
**Question**: Possibly `coverage`?
==== No timeout specifiable for `_download_from_orport()`
**Summary**: `_download_from_dirport()` takes a `timeout` parameter, but
`_download_from_orport()` doesn't
**Suggestion**: should have a timeout argument, ideally
This in practice may be a bit harder to implement without rearchitecting a
bit. But probably can be done with a different architecture (see #26227).
Low priority in the interim?
== ...
//splitting here because the spam filter is complaining I have too many
links//
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/26227#comment:4>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
More information about the tor-bugs
mailing list