[metrics-bugs] #24384 [Metrics/Onionoo]: Decode percent-encoded characters in qualified search terms
Tor Bug Tracker & Wiki
blackhole at torproject.org
Fri Nov 24 16:42:26 UTC 2017
#24384: Decode percent-encoded characters in qualified search terms
-----------------------------+------------------------------
Reporter: karsten | Owner: metrics-team
Type: defect | Status: merge_ready
Priority: High | Milestone:
Component: Metrics/Onionoo | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-----------------------------+------------------------------
Changes (by iwakeh):
* status: new => merge_ready
Comment:
URI syntax defines different types of delimiters. Generic delimiters,
e.g. `/` and `?`, are percent-encoded only when they appear inside data
transferred in the URI.
There is another subset of delimiters (referred to as sub-delims in
[https://tools.ietf.org/html/rfc3986#section-2.2 rfc3986]), which an
application may define as separators within sub-components. For example,
if the plus `+` were used as a delimiter between several search terms, it
would have to be percent-encoded whenever it is part of the data. Where
the plus has no special meaning (as in Onionoo), it doesn't need to be
encoded, and whether only the non-encoded form or both forms are accepted
is at the discretion of the processing application.
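To illustrate the hypothetical case above, here is a minimal sketch of an
application that does treat `+` as a delimiter between search terms. It
splits the raw value on `+` first and percent-decodes each term afterwards,
so only an encoded `%2B` can survive as data. The class and method names
are made up for this example, and this is not how Onionoo processes
searches.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical example: '+' separates search terms, '%2B' is a literal plus. */
final class PlusDelimiterSketch {

  /** Splits the raw (still encoded) value on '+', then decodes each term. */
  static List<String> splitTerms(String rawValue) {
    List<String> terms = new ArrayList<>();
    for (String rawTerm : rawValue.split("\\+")) {
      try {
        // No raw '+' is left in rawTerm after the split, so URLDecoder
        // only resolves %XX escapes here.
        terms.add(URLDecoder.decode(rawTerm, "UTF-8"));
      } catch (UnsupportedEncodingException e) {
        // UTF-8 is always available on a Java platform.
        throw new IllegalStateException(e);
      }
    }
    return terms;
  }

  public static void main(String[] args) {
    // The encoded plus stays inside the first term; the raw plus separates.
    System.out.println(splitTerms("tor%2Bsteven+murdoch"));
  }
}
```

Decoding before splitting would turn `%2B` into `+` too early and split
the first term in two, which is why the data has to be encoded in such a
scheme.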
With the current patch Onionoo allows plus to be submitted as `+` and
`%2B` in searches. The following tests (in ResourceServletTest) pass, as
do their already existing counterparts with just `+`:
{{{
@Test(timeout = 100)
public void testSearchBase64FingerprintPlusEncoded() {
  this.assertSummaryDocument(
      "/summary?search=ACXBNsHzqe7%2BKuP5GPA7+iG1Bws", 1,
      new String[] { "TimMayTribute" }, 0, null);
}

@Test(timeout = 100)
public void testSearchEmailAddressEncodedPlus() {
  this.assertSummaryDocument(
      "/summary?search=contact:<tor%2Bsteven.murdoch@cl.cam.ac.uk>", 1,
      new String[] { "TimMayTribute" }, 0, null);
}
}}}
Regarding the space character:
Replacing it with `+` is part of form encoding (cf.
[https://secure.php.net/manual/en/function.urlencode.php PHP manual],
[https://www.w3.org/TR/html4/interact/forms.html#h-17.13.3.4 W3C HTML
spec]), whereas `encodeURIComponent()` percent-encodes it as `%20`, in
line with the URI spec.
Onionoo does not receive form data, so accepting plus as `+` or `%2B`
and space as `%20` is perfectly fine.
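The difference between the two conventions can be sketched with
`java.net.URLDecoder`, which implements form decoding and therefore turns
a raw `+` into a space. The `uriDecode` helper below is a hypothetical
workaround for illustration, not the code from the patch: it protects raw
pluses before decoding so that only `%XX` escapes are resolved.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

/** Hypothetical example contrasting form decoding with plain percent-decoding. */
final class PlusVersusSpace {

  /** Form decoding (application/x-www-form-urlencoded): '+' means space. */
  static String formDecode(String s) {
    try {
      return URLDecoder.decode(s, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is always available on a Java platform.
      throw new IllegalStateException(e);
    }
  }

  /** Percent-decoding only: a raw '+' stays a plus, '%20' becomes a space. */
  static String uriDecode(String s) {
    // Escape raw pluses first so URLDecoder cannot turn them into spaces.
    return formDecode(s.replace("+", "%2B"));
  }

  public static void main(String[] args) {
    String raw = "ACXBNsHzqe7%2BKuP5GPA7+iG1Bws";
    System.out.println(formDecode(raw)); // second plus becomes a space
    System.out.println(uriDecode(raw));  // both pluses are kept
  }
}
```

With form decoding the mixed test input above would no longer match the
base64 fingerprint, while percent-decoding alone yields the same string
for `+` and `%2B`.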
The patch passes all checks and tests and is ready to be merged, maybe
with some tests added to make clear what data is accepted.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/24384#comment:4>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online