[tor-dev] GSoC - Search Engine for Hidden services

grarpamp grarpamp at gmail.com
Mon Mar 17 08:10:18 UTC 2014


> The rating idea is trivially gameable. Do we assume that
> all users are good citizens?

Given experience with onionland, unless you are
building your own review team, I too would be careful
about allowing random user input or assuming that any
given percentage of it is good. There are already large
networks out there dedicated to self-reinforcing their
own bad intentions.

> I'm quite excited about the changes to your crawler
> (that will give us a bigger list of HSes),
> Probably also include the YaCy/crawler configs?

I'd like to compare them to what I might deploy on my YaCy,
and more importantly, to test YaCy's capability to reach
the top live onion counts I'm finding with custom crawling.
I'd like to chat with ahmia and more crawlers in future
as my backend comes together better.

From seeing a prototype already, I'd second looking at
nutch and/or some nosql (now maybe nutch v2) as well,
long term.

I also suggest tor2web et al review the wisdom of building
more services on top of the basic gatewaying that exists.
At some point you need to be moving people, and yourself,
to simply running the Tor client, not building a comfy
all-in-one clearnet home so they can just be lazy and suck
your gateway nipple forever.


> https://antani.tor2web.org/antanistaticmap/stats/yesterday

Publishing these types of lists also does the good work of
helping seed crawlers with new onions. Note that
you should include the '.onion' suffixes so that all crawler
parsers can recognize and extract them without having
to write a parser for your page specifically.
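
To illustrate why the suffix matters, here is a minimal sketch of the
kind of generic extractor a crawler might run over any fetched page.
It is not taken from any particular crawler; the function name
extract_onions and the pattern are assumptions for illustration, and
it assumes the 16-character base32 onion names in use at the time of
writing. A bare 16-character string without '.onion' is not matched,
because it cannot be told apart from any other token.

```python
import re

# Assumption: onion names are 16 base32 characters (a-z, 2-7)
# followed by the literal ".onion" suffix.
ONION_RE = re.compile(r"\b[a-z2-7]{16}\.onion\b", re.IGNORECASE)


def extract_onions(text):
    """Return unique onion hostnames found in text, lowercased, in first-seen order."""
    found = []
    for match in ONION_RE.finditer(text):
        host = match.group(0).lower()
        if host not in found:
            found.append(host)
    return found


if __name__ == "__main__":
    # Made-up placeholder names, not real services.
    page = (
        "with suffix: http://abcdefghij234567.onion/index.html\n"
        "without suffix: zyxwvutsrq765432\n"
    )
    for host in extract_onions(page):
        print(host)
```

A list published with the suffix gets picked up by a pattern like this
as-is; a list of bare names needs a site-specific parser.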

