[tor-bugs] #2687 [Torperf]: Update filter.R to parse Torperf's new .mergedata format

Tue Apr 26 15:34:33 UTC 2011

#2687: Update filter.R to parse Torperf's new .mergedata format
-------------------------+--------------------------------------------------
 Reporter:  karsten      |          Owner:  karsten     
     Type:  enhancement  |         Status:  needs_review
 Priority:  major        |      Milestone:              
Component:  Torperf      |        Version:              
 Keywords:               |         Parent:              
   Points:  4            |   Actualpoints:              
-------------------------+--------------------------------------------------

Comment(by tomb):

 Hey all!

 If we can hold on for just a little while, I have a new version almost
 complete, but I really want to finish it before I post it.  My new version
 incorporates code from Karsten's version as well as mine.  I'd love to get
 it totally finished, but I do have to test the comparative efficiency of
 the program against both good data and data with some errors mixed in.
 I've got this test code written, but I want to get it all wrapped together
 in a branch before you all evaluate it.

 BTW: The reason I didn't post my complete email to Karsten was because it
 has some ideas that were already out of date since I had done further
 thinking.

 W/r/t adding all the data into a big vector, this may be the problem, and
 it is what I meant when I said above that "I have some ideas" about where
 to look for the problem.

 If R makes any kind of sense at all, this could not possibly be O(n^2)
 since it is tail recursion.  If list insertion at the tail of a vector is
 any more than adding a pointer to the already allocated memory for the
 line, then R is not the right choice.  This is not impossible.  My
 experience with R-like languages makes me intuit that insertion at the end
 of a vector is a pointer operation, but I have not yet gotten the chance
 to learn the details of R that are not listed in the language
 specification, which doesn't say.  I know that it is constant time in other
 list-oriented languages such as LISP, Haskell, and SML.
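
 One way to settle the question empirically is to time both strategies.
 The following is only a minimal sketch: the function names and the size
 n are made up for illustration and are not taken from filter.R.  c()
 returns a new vector on every call, so the first loop may end up copying
 the data over and over, while the second one writes into memory that was
 allocated once up front.

```r
# Hypothetical helpers for a timing comparison; not part of filter.R.

# Grow the result one element at a time. c() returns a new vector on
# each call, so the existing elements may be copied every iteration.
grow_by_append <- function(n) {
  out <- character(0)
  for (i in seq_len(n)) {
    out <- c(out, paste("line", i))
  }
  out
}

# Allocate the full vector once and fill it by index.
fill_preallocated <- function(n) {
  out <- character(n)
  for (i in seq_len(n)) {
    out[i] <- paste("line", i)
  }
  out
}

n <- 10000  # arbitrary size, chosen only for illustration
print(system.time(a <- grow_by_append(n)))
print(system.time(b <- fill_preallocated(n)))

# Both strategies must produce the same result.
stopifnot(identical(a, b))
```

 If the first timing grows much faster than linearly as n is increased,
 that would point at the appends as the culprit.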

 I plan to try i/o code more in a C style with data output in small buffer
 flushes rather than being in memory in a single large data structure;
 _however_, I used the data structure approach because of my general
 impression that this is more in the spirit of R.  If I understand
 correctly, R was designed to handle really large matrices, which is the
 structure I chose.  I think this is almost a defining feature of R.
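
 For comparison, the buffered approach could look roughly like the sketch
 below.  Everything in it is an assumption made for illustration: the
 function name, the chunk size, and the keep() predicate, which stands in
 for whatever filtering rule is actually applied to the .mergedata lines.
 It relies only on base R connections (file, readLines, writeLines).

```r
# Hypothetical sketch: stream lines from one file to another in
# fixed-size chunks, so at most `chunk_size` lines are held in memory.
# `keep` is a placeholder predicate that takes a character vector and
# returns a logical vector of the same length.
filter_in_chunks <- function(in_path, out_path, keep, chunk_size = 1000) {
  infile <- file(in_path, open = "r")
  on.exit(close(infile), add = TRUE)
  outfile <- file(out_path, open = "w")
  on.exit(close(outfile), add = TRUE)

  total <- 0
  repeat {
    # On an open connection, readLines() continues where it left off.
    lines <- readLines(infile, n = chunk_size)
    if (length(lines) == 0) break
    kept <- lines[keep(lines)]
    writeLines(kept, outfile)  # flush this chunk's output
    total <- total + length(kept)
  }
  total  # number of lines written
}

# Usage with made-up data and a made-up rule (keep lines starting "ok").
src <- tempfile()
dst <- tempfile()
writeLines(c("ok 1", "bad", "ok 2"), src)
n_kept <- filter_in_chunks(src, dst, function(x) grepl("^ok", x),
                           chunk_size = 2)
```

 The trade-off is that this gives up the single big in-memory structure in
 exchange for memory use that no longer depends on the size of the input.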

 Bottom line:  I am comparing several approaches to find the best one.  I
 will post a working, if not perfect, version soon.

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/2687#comment:14>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online