[tor-bugs] #2687 [Torperf]: Update filter.R to parse Torperf's new .mergedata format
Tor Bug Tracker & Wiki
torproject-admin at torproject.org
Tue Apr 26 15:34:33 UTC 2011
#2687: Update filter.R to parse Torperf's new .mergedata format
-------------------------+--------------------------------------------------
Reporter: karsten | Owner: karsten
Type: enhancement | Status: needs_review
Priority: major | Milestone:
Component: Torperf | Version:
Keywords: | Parent:
Points: 4 | Actualpoints:
-------------------------+--------------------------------------------------
Comment(by tomb):
Hey all!
If we can hold on for just a little while, I have a new version almost
complete, and I really want to finish it before I post it. My new version
incorporates code from Karsten's version as well as mine. I'd love to get
it totally finished, but I still have to test the comparative efficiency
of the program against both good data and data with some errors mixed in.
I've got this test code written, but I want to get it all wrapped together
in a branch before you all evaluate it.
BTW: The reason I didn't post my complete email to Karsten is that it
has some ideas that were already out of date, since I had done further
thinking.
W/r/t adding all the data into a big vector, this may be the problem, and
it is what I meant when I said above that "I have some ideas" about where
to look for the problem.
If R makes any kind of sense at all, this could not possibly be O(n^2),
since it is tail recursion. If insertion at the tail of a vector is any
more than adding a pointer to the already allocated memory for the line,
then R is not the right choice. This is not impossible. My experience with
R-like languages makes me intuit that insertion at the end of a vector is
a pointer operation, but I have not yet had the chance to learn the
details of R that are not covered in the language specification, which
doesn't say. I know that it is constant time in other list-oriented
languages such as Lisp, Haskell, and SML.
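One way to check the append cost empirically is a small sketch like the
one below. It is not taken from filter.R: the helper names
grow_by_concat and fill_preallocated are made up for illustration, and
the size n is arbitrary. The first helper rebuilds the vector with c() on
every step, which allocates a new vector and copies the old contents. The
second allocates the full length once and fills it by index. Timings
depend on the R version and the machine, so none are quoted here.

```r
# Hypothetical helpers for illustration only; not part of filter.R.

# Grows the result one element at a time. Each c() call allocates a new
# vector and copies the existing elements into it.
grow_by_concat <- function(n) {
  out <- numeric(0)
  for (i in seq_len(n)) {
    out <- c(out, i * 2)
  }
  out
}

# Allocates the full vector once, then fills it by index.
fill_preallocated <- function(n) {
  out <- numeric(n)
  for (i in seq_len(n)) {
    out[i] <- i * 2
  }
  out
}

n <- 20000  # arbitrary size, chosen only to make a difference visible
t_concat <- system.time(a <- grow_by_concat(n))[["elapsed"]]
t_prealloc <- system.time(b <- fill_preallocated(n))[["elapsed"]]

# Both approaches must produce the same data.
stopifnot(identical(a, b))
cat("concat:", t_concat, "s  preallocated:", t_prealloc, "s\n")
```

Running this with n doubled a few times should show whether the
concatenating version scales roughly linearly or worse on a given R
installation.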
I plan to try I/O code in more of a C style, with data written out in
small buffer flushes rather than held in memory in a single large data
structure. _However_, I used the data structure approach because of my
general impression that it is more in the spirit of R. If I understand
correctly, R was designed to handle really large matrices, which is the
structure I chose. I think this is almost a defining feature of R.
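The buffered alternative could look roughly like the sketch below. It
assumes nothing about the real .mergedata layout: the helper name
filter_in_chunks, the keep predicate, the chunk size, and the sample
lines are all invented for illustration. It reads a fixed number of lines
at a time from an open connection, writes the lines that pass the
predicate, and never holds more than one chunk in memory.

```r
# Hypothetical helper for illustration only; not part of filter.R.
# Copies the lines for which keep() returns TRUE from in_path to
# out_path, holding at most chunk_size lines in memory at a time.
filter_in_chunks <- function(in_path, out_path, keep, chunk_size = 1000) {
  input <- file(in_path, open = "r")
  on.exit(close(input), add = TRUE)
  output <- file(out_path, open = "w")
  on.exit(close(output), add = TRUE)

  kept <- 0
  repeat {
    # On an open connection, readLines continues where the last call ended
    # and returns character(0) at end of file.
    lines <- readLines(input, n = chunk_size)
    if (length(lines) == 0) break
    good <- lines[keep(lines)]
    writeLines(good, output)
    kept <- kept + length(good)
  }
  kept
}

# Made-up sample data: three well-formed lines and two bad ones.
in_path <- tempfile()
out_path <- tempfile()
writeLines(c("10 20", "garbage", "30 40", "", "50 60"), in_path)

# Made-up validity check: exactly two space-separated integers per line.
is_two_numbers <- function(lines) grepl("^[0-9]+ [0-9]+$", lines)

kept <- filter_in_chunks(in_path, out_path, is_two_numbers, chunk_size = 2)
```

The trade-off is the one described above: this version keeps memory use
flat, while the single-structure version keeps all rows available for
vectorized operations afterwards.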
Bottom line: I am comparing several approaches to find the best one. I
will post a working, if not perfect, version soon.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/2687#comment:14>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online