[tor-dev] GSoC Ideas

George Kadianakis desnacked at riseup.net
Tue Feb 25 16:11:38 UTC 2014


Vighnesh Birodkar <vighneshbirodkar at gmail.com> writes:

> Hello
>
> I found a couple of ideas from the Ideas Page interesting. I was a GSoC
> student for SimpleCV last year. In the past I've programmed in C, C++, Java
> and Python.
>
> Following are my queries.
>
> 1. Search for Hidden Services.
>
> I apologize in advance if there is something obviously wrong with my idea.
> Dark Web consists of information that cannot be crawled because it doesn't
> appear as hyperlinks in other pages. But someone somewhere will always
> have access to this information, either by entering search queries,
> through subscriptions or logging in. What if we can index all the pages a
> browser visits? Users can voluntarily install and enable or disable a
> plugin in their browsers. This plugin will process (and maybe index)
> pages locally and upload its data to servers which will hold the global
> index.
>

I'm not sure what you mean by 'Dark Web', but if you mean 'Tor Hidden
Services' it _is_ possible to crawl and index onion addresses. For
example, if you google for ".onion" and check through the first few
result pages you can find dozens of onion addresses. If you then crawl
those pages you will get even more onion addresses.
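
As a rough illustration of that crawling step, here is a minimal Python
sketch that pulls onion addresses out of already-fetched page text. The
function name and the sample address are made up for the example, and the
pattern only assumes the address formats (16 or 56 base32 characters
followed by ".onion"); fetching pages over Tor is left out entirely.

```python
import re

# Assumption: an onion address is 16 (v2) or 56 (v3) base32 characters
# (a-z, 2-7) followed by ".onion". Anything else is ignored.
ONION_RE = re.compile(r"\b(?:[a-z2-7]{56}|[a-z2-7]{16})\.onion\b")


def extract_onion_addresses(text):
    """Return the sorted, de-duplicated onion addresses found in text."""
    # Lower-case first so that differently cased copies collapse into one.
    return sorted(set(ONION_RE.findall(text.lower())))


if __name__ == "__main__":
    # "abcdefghij234567.onion" is a made-up address, not a real service.
    sample = (
        "See http://abcdefghij234567.onion/ and "
        "HTTP://ABCDEFGHIJ234567.onion/about, but not example.com."
    )
    print(extract_onion_addresses(sample))
```

Each address found this way can be fetched and fed back into the same
function, which is how a crawl would grow its list of known addresses.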

Then the question is how you present those onion addresses to the user
of the search engine. Users should be able to search for terms and get
accurate results (popularity tracking, backlinks, etc. should be used
to reduce phishing). The search engine should also be able to give a
short description of each hidden service (e.g. by scraping its
contents, or by the community editing the description, or by using
official descriptions [0], or...).
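
To make the backlink idea a bit more concrete, here is a small Python
sketch that orders onion addresses by how many distinct pages link to
them. It is only one possible heuristic, not how any existing search
engine does it; the function name and the (source, target) input format
are assumptions made for the example.

```python
from collections import defaultdict


def rank_by_backlinks(links):
    """Order targets by the number of distinct pages linking to them.

    links is an iterable of (source_page, target_address) pairs, which is
    a hypothetical format a crawler could emit.
    """
    sources = defaultdict(set)
    for source, target in links:
        # Ignore self-links so a site cannot trivially boost itself.
        if source != target:
            sources[target].add(source)
    # Most distinct backlinks first; ties are broken alphabetically.
    return sorted(sources, key=lambda t: (-len(sources[t]), t))


if __name__ == "__main__":
    crawl = [
        ("page-a", "real.onion"),
        ("page-b", "real.onion"),
        ("page-a", "real.onion"),   # duplicate link, counted once
        ("page-c", "clone.onion"),
    ]
    print(rank_by_backlinks(crawl))
```

A phishing clone of a popular service would usually have far fewer
independent backlinks than the original, which is why such a count could
help push it down the results, although it can obviously be gamed.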

Assuming that all the above are solved, we might get to the point where
we have indexed all the potentially visible onion addresses, and that's
where your browser extension idea might be useful. However, we are
currently quite far away from that situation. I also doubt that many
users of hidden services would install a browser extension to index
Hidden Services that have been intentionally kept secret (and hence
not found by conventional crawling).

[0]: https://ahmia.fi/documentation/descriptionProposal/

