[metrics-bugs] #33010 [Metrics/Ideas]: Monitor cloudflare captcha rate: do a periodic onionperf-like query to a cloudflare-hosted static site
Tor Bug Tracker & Wiki
blackhole at torproject.org
Tue Mar 10 03:15:32 UTC 2020
#33010: Monitor cloudflare captcha rate: do a periodic onionperf-like query to a
cloudflare-hosted static site
---------------------------------------+------------------------------
Reporter: arma | Owner: metrics-team
Type: task | Status: new
Priority: Medium | Milestone:
Component: Metrics/Ideas | Version:
Severity: Normal | Resolution:
Keywords: network-health gsoc-ideas | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
---------------------------------------+------------------------------
Comment (by woswos):
I wanted to conduct a few simple experiments on this issue. I will start
by explaining my setup and continue with the experiments themselves.
'''Domain Setup'''
I registered two domains ([https://captcha.wtf/ captcha.wtf] and
[https://exit11.online/ exit11.online]) with IPv4 records on Cloudflare.
After playing with Cloudflare settings, I realized that domain owners
play an important role in how Cloudflare blocks Tor users.
A new free Cloudflare account comes with a default security level (like
the security levels in the Tor browser and as comment:5 mentioned), and
the default security level doesn't explicitly block Tor users. I am not
saying Cloudflare is innocent, but they don't mention any blocking of Tor
users at this security level. However, Tor shows up as a country in the
Cloudflare firewall settings, and it is possible to block Tor users with a
firewall rule based on it. I think they have a list of Tor exit node IPs, and
they use this list to perform the filtering. They "offer" JS and Captcha
challenges in addition to simple blocking, as shown in the image below:
[[Image(https://bottomless-pit.barkin.io/tor-firewall-rules.png,
width=100%)]]
I think that's why some Tor users face more captcha challenges at higher
Tor browser security levels. JavaScript is blocked at higher security
levels, so those users can't pass the Cloudflare JS challenges.
\\
Also, if a firewall rule related to Tor is set, Cloudflare applies that
rule (for example, the never-ending captcha challenge) every time, even
if the user passed the challenge 5 seconds ago. I think that is the part
all of us hate: it just creates an endless loop. The sample Cloudflare
firewall record below shows the same IP address being challenged over and
over again, even after successfully passing the captcha challenge.
[[Image(https://bottomless-pit.barkin.io/tor-firewall-1.png, width=100%)]]
\\
exit11.online has the default Cloudflare configuration without any
additional firewall or protection. I am guessing that this is the case
for most average Cloudflare users. I also registered the
[https://bypass.exit11.online/ bypass.exit11.online] subdomain, which
bypasses the Cloudflare proxy and only utilizes Cloudflare as a DNS
hosting service and CDN.
[[Image(https://bottomless-pit.barkin.io/tor-cloudflare-exit11.png,
width=100%)]]
\\
captcha.wtf has the default Cloudflare configuration ''with the additional
firewall configuration'' for blocking Tor users, as described above. I
registered this second domain to see the difference between
using the default Cloudflare settings and adding additional firewall
rules. I also registered the [https://bypass.captcha.wtf/
bypass.captcha.wtf] subdomain, which bypasses the Cloudflare proxy and
only utilizes Cloudflare as a DNS hosting service and CDN.
[[Image(https://bottomless-pit.barkin.io/tor-cloudflare-wtf.png,
width=100%)]]
[[Image(https://bottomless-pit.barkin.io/tor-cloudflare-wtf-firewall.png,
width=100%)]]
\\
Both of these domains have a very simple static "Hello world!" page at
`/index.html`, and there is a more complicated page at `/complex.html`
that loads resources from different locations. Additionally, captcha.wtf &
exit11.online have SSL certificates issued by Cloudflare and
bypass.captcha.wtf & bypass.exit11.online have SSL certificates issued by
Let's Encrypt. I thought that these might have an effect on the way
Cloudflare behaves.
'''Experimenting'''
Later, I used the Python script mentioned in comment:7 (it uses httplib)
and the tor-browser-selenium mentioned in comment:13 to conduct a few
simple experiments. I wrote another script to fetch different domain
combinations via tor-browser-selenium and Python's httplib. For example,
it fetches bypass.exit11.online, exit11.online, exit11.online/complex.html,
and bypass.exit11.online/complex.html via both tor-browser-selenium and
Python's httplib.
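The combinations described above could be enumerated along the following lines. This is only a minimal sketch, not the actual script: `build_fetch_plan` and the client labels are hypothetical names, and the fetching itself (through tor-browser-selenium or an HTTP library routed over Tor) is deliberately left out.

```python
from itertools import product

# Hosts and paths taken from the domain setup described above.
HOSTS = [
    "exit11.online",
    "bypass.exit11.online",
    "captcha.wtf",
    "bypass.captcha.wtf",
]
PATHS = ["/index.html", "/complex.html"]

# Hypothetical labels for the two fetch methods; they are only strings here.
CLIENTS = ["tor-browser-selenium", "httplib"]


def build_fetch_plan(hosts=HOSTS, paths=PATHS, clients=CLIENTS):
    """Return every (client, url) pair to fetch in one measurement round."""
    return [
        (client, f"https://{host}{path}")
        for host, path, client in product(hosts, paths, clients)
    ]


if __name__ == "__main__":
    for client, url in build_fetch_plan():
        print(client, url)
```

Keeping the plan separate from the fetch code would make it easy to repeat the same round at fixed intervals and to add further hosts or paths later.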
'''Results'''
After fetching each combination about 100 times at one-minute intervals,
the domain with the default configuration (exit11.online) was not blocked
a single time via either Tor or httplib. However, the domain with the
additional firewall configuration (captcha.wtf) was blocked every single
time when fetched via Tor. Of course, both of the `bypass` subdomains were
fine since the Cloudflare proxy was disabled, but I wanted to test them anyway.
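A check for whether a response counts as "blocked" could look roughly like the sketch below. The markers it relies on (HTTP 403/503 together with a `Server: cloudflare` header, a `cf-mitigated: challenge` header, or challenge-page phrases such as "Attention Required" and "Just a moment") are assumptions about what Cloudflare challenge pages look like and may change over time. `looks_blocked` and `block_rate` are hypothetical helper names, not part of the scripts used here.

```python
def looks_blocked(status, headers, body):
    """Heuristically decide whether a response is a Cloudflare challenge or block page.

    All markers below are assumptions, not a documented contract.
    """
    # Header names are case-insensitive, so normalize before looking them up.
    hdrs = {k.lower(): v.lower() for k, v in headers.items()}

    # Assumed marker: a header announcing that a challenge was served.
    if hdrs.get("cf-mitigated") == "challenge":
        return True

    # Assumed marker: an error status served by Cloudflare itself,
    # combined with typical challenge-page wording in the body.
    served_by_cloudflare = "cloudflare" in hdrs.get("server", "")
    if served_by_cloudflare and status in (403, 503):
        text = body.lower()
        return any(
            phrase in text
            for phrase in ("attention required", "just a moment", "captcha")
        )
    return False


def block_rate(outcomes):
    """Return the fraction of blocked fetches; `outcomes` is a list of booleans."""
    if not outcomes:
        return 0.0
    return sum(outcomes) / len(outcomes)
```

With one boolean recorded per fetch, `block_rate` gives the per-target rate directly, which for the runs above would be 0.0 for exit11.online and 1.0 for captcha.wtf when fetched via Tor.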
'''Possible Conclusions'''
I'm sure my simple tests are not nearly enough to draw a meaningful
conclusion, but these results make me question the role of domain owners
in this endless captcha problem. The domain with the default Cloudflare
configuration didn't block Tor users, but the domain with the extra firewall
configuration set by the domain owner blocked Tor users every time.
However, again, this is an observation based on my very limited
experiments.
I want to conduct more advanced experiments based on your feedback to
address the metrics mentioned in the original ticket and find possible
patterns in the recorded data.
I will organize my code a little bit more and put a link to the repository
here. Meanwhile, please feel free to use both of these domains for further
testing.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/33010#comment:14>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online