[tor-bugs] #4463 [Website]: Set up web log analysis tool
Tor Bug Tracker & Wiki
torproject-admin at torproject.org
Thu Dec 1 11:45:43 UTC 2011
#4463: Set up web log analysis tool
---------------------+------------------------------------------------------
Reporter: runa | Owner: runa
Type: project | Status: assigned
Priority: normal | Milestone: Sponsor Z: December 31, 2011
Component: Website | Version:
Keywords: | Parent:
Points: | Actualpoints:
---------------------+------------------------------------------------------
Comment(by runa):
I looked at four different web log analysis tools. Here's what I found:
[http://piwik.org/ Piwik] looks great, but is not available in Ubuntu or
Debian. Setting it up manually is pretty straightforward, but you will
not be able to import Apache logs without using some third-party script.
Last time I checked, that third-party script had some issues with our
sanitized log format.
[http://awstats.sourceforge.net/ AWStats] is easy to set up and easy to
use, but incredibly slow when importing logs. I set up AWStats on an
Ubuntu EC2 instance and pulled the sanitized logs for January and February
2010 (you only get 8 GB of storage). The import of
wiki.torproject.org-access.log was pretty quick, and we have some
[http://107.22.86.235/statistics/awstats.pl?month=01&year=2010&output=main&config=wiki&framename=index
preliminary results]. However, the import of www.torproject.org-access.log
did not complete at all. Maybe it's because I tried to do all this in the
cloud, or maybe it's just AWStats.
[http://www.webalizer.org/ Webalizer] is just as easy to set up and use as
AWStats. It doesn't look as pretty, but it's a lot faster when it comes to
importing existing logs. I managed to set it up and import the Jan+Feb
www.torproject.org-access.log without any problems.
[http://www.splunk.com/ Splunk] was recommended to me by someone on
Twitter, so I figured I'd look into it. The free version of Splunk allows
you to index only 500 megabytes of data per day, and we probably want more
than that.
Another option is to write our own parser and use R to create graphs
similar to what we have on metrics.tpo. Writing our own parser will take
some time, so maybe we should just go with Webalizer for now.
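To give a rough idea of the "own parser" option, here is a minimal sketch in Python. It assumes the sanitized logs still follow Apache's Combined Log Format, which may not be the case. The names `LOG_LINE`, `parse_line` and `count_paths` and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Assumption: Apache Combined Log Format, i.e.
#   host ident user [time] "method path protocol" status bytes "referer" "agent"
# The sanitized logs may use a different layout.
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def count_paths(lines):
    """Count successful (2xx) requests per requested path."""
    counts = Counter()
    for line in lines:
        fields = parse_line(line)
        if fields and fields["status"].startswith("2"):
            counts[fields["path"]] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from the real logs.
    sample = [
        '0.0.0.0 - - [01/Jan/2010:00:00:00 +0000] "GET /index.html HTTP/1.1" 200 1234 "-" "-"',
        '0.0.0.0 - - [01/Jan/2010:00:00:00 +0000] "GET /missing HTTP/1.1" 404 210 "-" "-"',
    ]
    # Tab-separated output, one path per line.
    for path, hits in count_paths(sample).most_common():
        print(f"{path}\t{hits}")
```

The tab-separated output is only one possible choice; it is something R could read in for graphing, but the real work would be in handling whatever the sanitized format actually looks like.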
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/4463#comment:4>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online