[tor-bugs] #14011 [Stem]: Implement lazy parsing for zoossh (and maybe Stem)
Tor Bug Tracker & Wiki
blackhole at torproject.org
Sun Dec 21 21:32:21 UTC 2014
#14011: Implement lazy parsing for zoossh (and maybe Stem)
-----------------------------+---------------------
Reporter: phw | Owner: phw
Type: enhancement | Status: new
Priority: normal | Milestone:
Component: Stem | Version:
Keywords: zoossh, parsing | Actual Points:
Parent ID: | Points:
-----------------------------+---------------------
Damian and I had a short discussion about lazy parsing (see below) and
how it could speed up processing of descriptor data. It might not be
much work for zoossh, so it could be worth implementing.
{{{
18:28 <atagar> phw: Side note concerning zoossh, another option could be
lazy parsing for descriptors. If I was to do stem's parsers again that's
what I'd opt for to make them more performant. That would be a fair bit of
work, but would both benefit all stem users and have performance just as
fast as any Go solution (time would all be IO).
18:29 <atagar> That said though, Zoossh seems like a great way of
learning the language so if that's the goal have fun. :)
18:36 <phw> atagar: that's actually a good idea, thanks
18:39 <atagar> phw: Oh! If you're interested then please open a ticket
under the Stem component. This is something I've idly given some thought
to for over a year but never bothered to actually jot down the idea. ;P
18:39 <atagar> Didn't expect you to actually think about opting for this
route.
18:41 <atagar> Thought was that reading a descriptor dumps to a simple
object that's a {keyword: [lines...]} dictionary. The getter methods then
parse the actual content and cache the results. Upside: far, far faster
since you only parse the fields you care about, downside: no upfront
validation is done so malformed content would be acceptable.
18:42 <atagar> That said, validation is a far, far smaller concern for
our users than performance in practice so this is a tradeoff I'd be fine
with.
18:42 <atagar> We could then have a validate() method that simply calls
all the getters to achieve the same thing we do now.
18:45 <atagar> Previously I thought that doing this would break backward
compatibility which made me a little less keen on it (since we'd then need
'descriptor v2' objects) but on reflection it doesn't. We could slip this
in transparently. The only difference users would see would be a
tremendous speedup if they opt to not have validation.
}}}
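A rough Python sketch of the idea from the log, for illustration only. It assumes a simplified descriptor where every line is `keyword value...`. The class name `LazyDescriptor`, its helpers, and the three fields shown are hypothetical and not part of Stem's or zoossh's API. Reading only fills the {keyword: [lines...]} dictionary; each getter parses on first access and caches the result, and validate() simply calls all the getters.

```python
class LazyDescriptor:
    """Hypothetical lazily parsed descriptor (not Stem's actual API)."""

    def __init__(self, raw):
        # Cheap first pass: split each line into keyword and remainder,
        # giving a {keyword: [lines...]} dictionary. Nothing is validated.
        self._entries = {}
        for line in raw.splitlines():
            keyword, _, value = line.partition(" ")
            if keyword:
                self._entries.setdefault(keyword, []).append(value)
        self._cache = {}

    def _get(self, name, parser):
        # Parse on first access, then serve the cached result.
        if name not in self._cache:
            self._cache[name] = parser()
        return self._cache[name]

    def _fields(self, keyword, minimum):
        # Whitespace-separated fields of the first line with this keyword.
        if keyword not in self._entries:
            raise ValueError("missing '%s' line" % keyword)
        fields = self._entries[keyword][0].split()
        if len(fields) < minimum:
            raise ValueError("malformed '%s' line" % keyword)
        return fields

    @property
    def nickname(self):
        return self._get("nickname", lambda: self._fields("router", 1)[0])

    @property
    def uptime(self):
        return self._get("uptime", lambda: int(self._fields("uptime", 1)[0]))

    @property
    def bandwidth(self):
        # Assumed layout: "bandwidth <average> <burst> <observed>"
        return self._get(
            "bandwidth",
            lambda: tuple(int(f) for f in self._fields("bandwidth", 3)[:3]),
        )

    def validate(self):
        # Touch every getter so malformed fields raise ValueError up front.
        for name in ("nickname", "uptime", "bandwidth"):
            getattr(self, name)


raw = "router Unnamed 10.0.0.1 9001 0 0\nbandwidth 1000 2000 1500\nuptime oops\n"
desc = LazyDescriptor(raw)   # succeeds: the bad uptime line is not parsed yet
print(desc.nickname)         # only the 'router' line is parsed here
```

In this sketch the malformed `uptime` line goes unnoticed until either `desc.uptime` or `desc.validate()` is called, which is the tradeoff described in the log: faster reads, no upfront validation unless the caller asks for it.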
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/14011>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online