[tor-dev] GSoC Ideas

Vighnesh Birodkar vighneshbirodkar at gmail.com
Tue Feb 25 16:22:53 UTC 2014


I am sorry. Where you mentioned "clear web" I mistook hidden services to be
the Deep Web[1]. That's what I meant by Dark Web.


[1] : http://en.wikipedia.org/wiki/Deep_Web



On Tue, Feb 25, 2014 at 9:41 PM, George Kadianakis <desnacked at riseup.net> wrote:

> Vighnesh Birodkar <vighneshbirodkar at gmail.com> writes:
>
> > Hello
> >
> > I found a couple of ideas from the Ideas Page interesting. I was a GSoC
> > student for SimpleCV last year. In the past I've programmed in C, C++, Java
> > and Python.
> >
> > Following are my queries.
> >
> > 1. Search for Hidden Services.
> >
> > I apologize in advance if there is something obviously wrong with my idea.
> > Dark Web consists of information that cannot be crawled because it doesn't
> > appear as hyperlinks in other pages. But someone somewhere will always
> > have access to this information, either by entering search queries,
> > through subscriptions or by logging in. What if we can index all the pages a
> > browser visits? Users can voluntarily install and enable or disable a
> > plugin in their browsers. This plugin will process (and maybe index)
> > pages locally and upload its data to servers which will hold the global
> > index.
> >
>
> I'm not sure what you mean by 'Dark Web', but if you mean 'Tor Hidden
> Services' it _is_ possible to crawl and index onion addresses. For
> example, if you google for ".onion" and check through the first few
> result pages you can find dozens of onion addresses. If you then crawl
> those pages you will get even more onion addresses.
>
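
A minimal sketch of the address-harvesting step described above, assuming onion hostnames are base32 strings of 16 characters (v2) or 56 characters (v3). The name `extract_onions` is hypothetical, and fetching the pages themselves is left out:

```python
import re

# Assumption: an onion hostname is 16 (v2) or 56 (v3) base32 characters
# followed by ".onion". The longer alternative is tried first.
ONION_RE = re.compile(r"\b(?:[a-z2-7]{56}|[a-z2-7]{16})\.onion\b", re.IGNORECASE)


def extract_onions(text):
    """Return the distinct onion hostnames found in text, lowercased and sorted."""
    return sorted({match.group(0).lower() for match in ONION_RE.finditer(text)})
```

Running this over every fetched page and queueing the addresses it returns gives the "crawl those pages to get even more addresses" loop.
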
> Then the question is how you present those onion addresses to the user
> of the search engine. Users should be able to search for terms and get
> accurate results (popularity tracking, backlinks, etc. should be used
> to reduce phishing). The search engine should also be able to give a
> short description of each hidden service (e.g. by scraping its
> contents, or by the community editing the description, or by using
> official descriptions [0], or...).
>
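
As a rough illustration of the backlink signal mentioned above, the sketch below ranks onion addresses by how many distinct pages link to them. The function name and the `(source_page, onion_host)` input format are assumptions made for this example, and backlink counting alone would not be a complete anti-phishing measure:

```python
from collections import defaultdict


def rank_by_backlinks(links):
    """Rank onion hosts by the number of distinct pages linking to them.

    links is an iterable of (source_page, onion_host) pairs (hypothetical format).
    Ties are broken alphabetically so the ordering is deterministic.
    """
    sources = defaultdict(set)
    for source_page, onion_host in links:
        sources[onion_host].add(source_page)  # duplicate links from one page count once
    return sorted(sources, key=lambda host: (-len(sources[host]), host))
```
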
> Assuming that all the above are solved we might get to the point where
> we have indexed all the potentially visible onion addresses and that's
> where your browser extension idea might be useful. However we are
> currently quite far away from that situation. I also doubt that many
> users of hidden services would install a browser extension to index
> Hidden Services that have been intentionally kept secret (and hence
> not found by conventional crawling).
>
> [0]: https://ahmia.fi/documentation/descriptionProposal/