[tor-commits] [tech-reports/master] Add new technical report discussing the Tor censorship analysis tool.
karsten at torproject.org
Wed Feb 6 07:21:35 UTC 2013
commit 532e0ba5ba8acf98e034e546e3e619e749ecd545
Author: Philipp Winter <identity.function at gmail.com>
Date: Wed Feb 6 01:30:43 2013 +0100
Add new technical report discussing the Tor censorship analysis tool.
---
.../censorship-analysis-tool.bib | 59 +++++
.../censorship-analysis-tool.tex | 244 ++++++++++++++++++++
2013/censorship-analysis-tool/tortechrep.cls | 1 +
3 files changed, 304 insertions(+), 0 deletions(-)
diff --git a/2013/censorship-analysis-tool/censorship-analysis-tool.bib b/2013/censorship-analysis-tool/censorship-analysis-tool.bib
new file mode 100644
index 0000000..4a30e60
--- /dev/null
+++ b/2013/censorship-analysis-tool/censorship-analysis-tool.bib
@@ -0,0 +1,59 @@
+@misc{censorwiki,
+ title = {{Censorship Wiki}},
+ note = {URL: \url{https://censorshipwiki.torproject.org}},
+}
+
+@misc{iran,
+ author = {phobos},
+ title = {{Update on Internet censorship in Iran}},
+ note = {URL: \url{https://blog.torproject.org/blog/update-internet-censorship-iran}},
+}
+
+@inproceedings{Wright2011,
+ address = {San Francisco, CA, USA},
+ author = {Wright, Joss and de Souza, Tulio and Brown, Ian},
+ booktitle = {Free and Open Communications on the Internet},
+ publisher = {USENIX Association},
+ title = {{Fine-Grained Censorship Mapping: Information Sources, Legality and Ethics}},
+ year = {2011},
+ note = {URL: \url{http://static.usenix.org/event/foci11/tech/final_files/Wright.pdf}}
+}
+
+@inproceedings{Filasto2012,
+ address = {Bellevue, WA, USA},
+ author = {Filast\`{o}, Arturo and Appelbaum, Jacob},
+ booktitle = {Free and Open Communications on the Internet},
+ publisher = {USENIX Association},
+ title = {{OONI: Open Observatory of Network Interference}},
+ year = {2012},
+ note = {URL: \url{https://www.usenix.org/system/files/conference/foci12/foci12-final12.pdf}}
+}
+
+@misc{tlshistory,
+ author = {Nick Mathewson},
+ title = {{TLSHistory}},
+ note = {URL: \url{https://trac.torproject.org/projects/tor/wiki/org/projects/Tor/TLSHistory}},
+}
+
+@misc{daphne,
+ title = {daphne},
+ note = {URL: \url{https://trac.torproject.org/projects/tor/wiki/doc/OONI/Tests/daphne}},
+}
+
+@misc{mirrors,
+ author = {{The Tor Project}},
+ title = {{Tor: Mirrors}},
+ note = {URL: \url{https://www.torproject.org/getinvolved/mirrors.html.en}},
+}
+
+@misc{obfsproxy,
+ author = {{The Tor Project}},
+ title = {obfsproxy},
+ note = {URL: \url{https://www.torproject.org/projects/obfsproxy.html.en}}
+}
+
+@misc{gettor,
+ author = {{The Tor Project}},
+ title = {{GetTor e-mail autoresponder}},
+ note = {URL: \url{https://www.torproject.org/projects/gettor.html.en}}
+}
diff --git a/2013/censorship-analysis-tool/censorship-analysis-tool.tex b/2013/censorship-analysis-tool/censorship-analysis-tool.tex
new file mode 100644
index 0000000..73f479f
--- /dev/null
+++ b/2013/censorship-analysis-tool/censorship-analysis-tool.tex
@@ -0,0 +1,244 @@
+\documentclass{tortechrep}
+\usepackage{url}
+\usepackage{graphicx}
+
+\begin{document}
+
+\title{
+ Design Requirements for a \\
+ Tor Censorship Analysis Tool
+}
+
+\author{Philipp Winter}
+
+\contact{\href{mailto:phw@torproject.org}{phw@torproject.org}}
+\reportid{2013-02-001}
+\date{\today}
+
+\maketitle
+
+\section{Introduction}
+% motivation
+The Tor network is documented to be blocked in several countries
+\cite{censorwiki}. Analyzing and circumventing these blocks typically requires
+detailed \emph{packet traces} or access to \emph{machines inside censoring
+countries}. Neither, however, is always easy to obtain:
+
+\begin{enumerate}
+ % why network traces are not so good
+ \item Network traces are problematic for two reasons. First, they are
+ difficult to obtain since they require the cooperation of users within
+ censoring countries. Second, they are hard to anonymize and must not
+ fall into the wrong hands. Derived information, such as flow
+ diagrams\footnote{See, for example, the blog post discussing a Tor
+ block in Iran \cite{iran}.}, is typically safe to publish but
+ frequently lacks important details.
+ % why access inside countries is hard
+ \item The alternative to network traces is to gain access to machines inside the
+ censoring regime. This approach turns out to be difficult as well, mostly
+ because volunteers who could provide machines are scarce, as are VPS
+ providers and open SOCKS proxies.
+\end{enumerate}
+
+% where we are heading
+These problems show that there is a strong need for a lightweight tool that can
+assist in analyzing censorship events. This tool should be run by censored users
+and perform several tests to gain a rough understanding of whether and how Tor
+might be blocked in the respective network. The results of these tests should
+make it back to the Tor Project and be used to improve circumvention technology
+such as obfsproxy \cite{obfsproxy} and to document censorship \cite{censorwiki}.
+
+% what this techreport does
+This technical report discusses the design requirements for such a censorship
+analysis tool. We list the desired features, discuss how they can be implemented,
+and give a rough overview of the software design. Ultimately, this
+technical report should serve as a basis for the development and deployment of the
+censorship analysis tool.
+
+\section{Feature Requirements}
+\label{sec:features}
+The following list enumerates the features which are desirable in a censorship
+analyzer. Naturally, certain features are harder to implement than others, so
+the list is organized in ascending order based on the difficulty of the
+respective feature.
+
+\begin{enumerate}
+ \item \textbf{Capture debugging process}: The tool should be able to create
+ a pcap file of the network debugging process to allow further
+ inspection. While very handy, pcaps contain sensitive data, and capturing
+ them would require Administrator/root permissions.
+
+ \item \textbf{User-friendly output}: While the censorship analyzer is meant
+ to assist Tor developers in debugging censorship incidents,
+ user-friendly log messages are easy to add and can give users an idea of
+ why their Tor fails to connect. Based on the gathered data, the analyzer
+ could give the user suggestions on what to try next. This might even
+ slightly reduce the help desk's load.
+
+ \item \textbf{Obfuscate tests}: Censors might be interested in identifying
+ the Tor censorship analyzer and try to actively falsify the test results.
+ Therefore, the analyzer should make an effort to stay under the radar.
+ In particular, the analyzer should:
+ \begin{enumerate}
+ \item Sleep for random periods between (and perhaps during) tests.
+ \item Randomize the order of executed tests.
+ \item Use random IP addresses for tests such as the relay
+ reachability discussed below.
+ \end{enumerate}
+ Note that it is not possible to completely hide the analyzer's
+ existence. Rather, this feature should be understood as avoiding
+ all-too-obvious network activity.
+
+ \item \textbf{Leave no traces behind}: The analyzer should not leave any
+ traces on the user's hard disk. Ideally, the analyzer should generate a
+ single report file which is placed in the same directory as the analyzer
+ itself. That would make it possible for users to conveniently delete all
+ traces. Temporary analysis files should be deleted after the report has been
+ generated.
+
+ \item \textbf{DirAuth reachability}: Try to connect to the directory
+ authorities and download the consensus. If this fails:
+ \begin{enumerate}
+ \item Check whether the authorities respond to ICMP echo requests to
+ see if the IP addresses are blocked.
+ \item Run traceroutes to the directory authorities as well as to
+ other---hopefully unblocked---hosts in the same subnet as the
+ directory authorities. This could reveal the location of
+ censoring boxes and serve as evidence that the IP addresses are, in
+ fact, blocked.
+ \end{enumerate}
+
+ \item \textbf{Web site reachability}: Try to connect to
+ \url{https://www.torproject.org} and fetch the index page. If the web
+ site fails to load, check if:
+ \begin{enumerate}
+ \item One of the official Tor mirrors \cite{mirrors}
+ works\footnote{In particular, mirrors without the strings ``tor''
+ or ``torproject'' in the domain should be given a try.}.
+ \item The domain \texttt{www.torproject.org} resolves to the correct,
+ non-poisoned IP addresses.
+ \item A simple TCP connection to
+ \texttt{www.torproject.org} succeeds. If so, in the subsequent
+ step, a TLS session could be established. That way, it is possible
+ to find out whether DPI boxes are inspecting the SNI in the TLS
+ client hello.
+ \item The hosts behind \texttt{www.torproject.org} respond to ICMP
+ echo requests.
+ \end{enumerate}
+
+ \item \textbf{Bridge distribution}: Try to connect to
+ \url{https://bridges.torproject.org} and fetch the index page. If the
+ web site loads, it is safe to assume that obfsproxy bridges can be
+ fetched as well. If the web site fails to load, check if:
+ \begin{enumerate}
+ \item The domain \texttt{bridges.torproject.org} resolves to the
+ correct, non-poisoned IP address.
+ \item A simple TCP connection to
+ \texttt{bridges.torproject.org} succeeds. If so, in the subsequent
+ step, a TLS session could be established. That way, it is possible
+ to find out whether DPI boxes are inspecting the SNI in the TLS
+ client hello.
+ \item The host behind \texttt{bridges.torproject.org} responds to
+ ICMP echo requests.
+ \end{enumerate}
+
+ \item \textbf{Relay reachability}: Try to connect to a number of Tor relays
+ listed in the consensus. Typically, clients connect to entry guards.
+ However, it would also be interesting to learn whether connections to
+ pure middle or exit relays succeed\footnote{This could be an indicator
+ that a censor is blindly blacklisting all IP addresses found in the
+ consensus.}. If this fails, check if:
+ \begin{enumerate}
+ \item A Tor-specific TLS client hello can be sent to
+ \texttt{mail.google.com:443} -- assuming that this host is
+ reachable. If the connection is closed in a non-clean fashion,
+ this could be an indicator that fields in the TLS client hello
+ are subject to filtering.
+ \item (Private) bridges and their censorship-resistant
+ variants (brdgrd, obfs2, obfs3, flashproxies) are reachable.
+ % TODO - all bridges we put in the censorship analyzer bundle
+ % will eventually get blocked.
+ \end{enumerate}
+
+ \item \textbf{Gather debug information}: Censorship is typically not
+ homogeneous across a country and varies depending on provinces,
+ autonomous systems or ISPs \cite{Wright2011}. As a result, we are
+ interested in information which can help shed light on the respective
+ censorship infrastructure. This would also help rule out
+ interference and avoid jumping to wrong conclusions. Of interest
+ would be:
+ \begin{enumerate}
+ \item What ISP does the user have?
+ \item What is the autonomous system number?
+ \item Is the user behind a captive portal?
+ \item Is all traffic forced to go through an HTTP proxy?
+ \end{enumerate}
+
+ \item \textbf{Debug the TLS handshake}: Tor is frequently blocked based on
+ identifying information in its TLS handshake
+ \cite{tlshistory,censorwiki}. Debugging the exact fingerprint used by
+ DPI boxes to identify Tor can be of great value. This is, however, a
+ very hard problem which requires a client-server architecture to infer
+ fingerprints. The tool daphne was started with this goal in mind
+ \cite{daphne}.
+\end{enumerate}
+
+\section{Software Architecture}
+The following list enumerates software-specific aspects of the censorship
+analyzer.
+
+\begin{enumerate}
+ \item \textbf{Ease of use}: It is crucial that the analyzer is as easy to
+ use as possible. Ideally, it should be a self-contained click-and-go
+ executable, just like the Tor Browser Bundle. After all, the target
+ group consists mostly of ordinary users rather than developers.
+
+ Ease of use also involves the analyzer's \emph{bundle size}. Ideally,
+ the analyzer would only be a few megabytes in size, which would also make
+ it suitable for distribution via GetTor \cite{gettor}.
+
+ \item \textbf{Configurable during build}: It should be possible to pass
+ configuration parameters to the analyzer during the build process. That
+ is necessary because certain information such as IP addresses of relays
+ to test or of \url{www.torproject.org} change over time or might be
+ white-listed by censors. As a result, it is not a good idea to hard-code
+ these things in the source code.
+
+ \item \textbf{Least privilege}: Ideally, the analyzer should not require
+ Administrator/root access.
+
+ \item \textbf{Existing framework}: There is no need to reinvent the wheel.
+ The analyzer should be implemented as tests for the Open Observatory of
+ Network Interference (OONI) \cite{Filasto2012}. OONI provides a Python
+ API which can be used to develop all of the above-mentioned features.
+
+ \item \textbf{Data delivery}: Eventually, the Tor Project has to learn about
+ test results. There are two possible ways:
+ \begin{enumerate}
+ \item The analyzer could automatically transmit gathered data to
+ the Tor Project. Automated uploads must require the user's
+ informed consent and the user must be given the choice to
+ review the report prior to submission.
+ \item The analyzer could create a report and then ask the user to
+ send the report to censorship-analyzer@torproject.org (which
+ does not exist yet). This could also serve as a fallback if an
+ automated upload fails.
+ \end{enumerate}
+ Further, the report should contain a \emph{message digest} which is
+ computed over the report. This is particularly important when the report is
+ being sent over e-mail since it allows us to detect if the report is
+ incomplete or the user accidentally changed parts of it.
+
+ \item \textbf{Testability}: All the features discussed in Section
+ \ref{sec:features} should be testable in an automated way. Otherwise, we
+ might end up shipping code which does not work in real environments or
+ we might not notice if improvements break existing code.
+\end{enumerate}
+
+\section*{Acknowledgments}
+Arturo Filast\`{o}, George Kadianakis, Karsten Loesing and Runa A. Sandvik
+provided valuable feedback for this technical report.
+
+\bibliography{censorship-analysis-tool}
+
+\end{document}
diff --git a/2013/censorship-analysis-tool/tortechrep.cls b/2013/censorship-analysis-tool/tortechrep.cls
new file mode 120000
index 0000000..4c24db2
--- /dev/null
+++ b/2013/censorship-analysis-tool/tortechrep.cls
@@ -0,0 +1 @@
+../../tortechrep.cls
\ No newline at end of file
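
Note on the "Obfuscate tests" requirement in the report above: the first two
sub-items (random sleep periods between tests, randomized test order) could be
sketched roughly as follows. This is only an illustration under stated
assumptions. The function name run_obfuscated, its parameters, and the pause
bounds are hypothetical and are not part of the report or of OONI's API.

```python
import random
import time


def run_obfuscated(tests, min_pause=1.0, max_pause=30.0, rng=None, sleep=time.sleep):
    """Run test callables in random order with a random pause between them.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    if rng is None:
        rng = random.Random()
    order = list(tests)      # copy so the caller's list is left untouched
    rng.shuffle(order)       # randomize the order of executed tests
    results = {}
    for index, test in enumerate(order):
        if index > 0:
            # random sleep period between two consecutive tests
            sleep(rng.uniform(min_pause, max_pause))
        results[test.__name__] = test()
    return results
```

The sleep function and the random number generator are injectable so that the
scheduling logic can be exercised in automated tests without waiting, which
fits the "Testability" item of the report.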
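
Note on the "Data delivery" item in the report above: the message digest that
is computed over the report could look roughly like the sketch below. The
report only asks for "a message digest". The choice of SHA-256, the trailer
label, and the function names seal_report and verify_report are assumptions
made for illustration.

```python
import hashlib

DIGEST_PREFIX = "SHA256: "  # hypothetical trailer label, not from the report


def seal_report(body: str) -> str:
    """Append a digest line computed over the report body."""
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return f"{body}\n{DIGEST_PREFIX}{digest}\n"


def verify_report(sealed: str) -> bool:
    """Return True if the trailing digest matches the report body."""
    # Split at the last digest line; everything before it is the body.
    body, sep, trailer = sealed.rstrip("\n").rpartition("\n" + DIGEST_PREFIX)
    if not sep:
        return False  # digest line missing, e.g. the report was truncated
    expected = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return trailer.strip() == expected
```

A plain digest like this only detects truncation and accidental edits, which
is the purpose the report names. It does not protect against deliberate
tampering, and a mail client that rewrites line endings would also change the
digest, so a real implementation would probably need to normalize the text
first.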