[metrics-bugs] #24384 [Metrics/Onionoo]: Decode percent-encoded characters in qualified search terms

Tor Bug Tracker & Wiki blackhole at torproject.org
Fri Nov 24 16:42:26 UTC 2017


#24384: Decode percent-encoded characters in qualified search terms
-----------------------------+------------------------------
 Reporter:  karsten          |          Owner:  metrics-team
     Type:  defect           |         Status:  merge_ready
 Priority:  High             |      Milestone:
Component:  Metrics/Onionoo  |        Version:
 Severity:  Normal           |     Resolution:
 Keywords:                   |  Actual Points:
Parent ID:                   |         Points:
 Reviewer:                   |        Sponsor:
-----------------------------+------------------------------
Changes (by iwakeh):

 * status:  new => merge_ready


Comment:

 URI syntax allows for different types of delimiters.  Generic delimiters,
 e.g. `/` and `?`, only need to be percent-encoded when they appear inside
 data transferred in the URI.
 There is another subset of delimiters (referred to as sub-delims in
 [https://tools.ietf.org/html/rfc3986#section-2.2 rfc3986]), which an
 application may use as separators within sub-components as it sees fit:
 for example, if the plus `+` were used as a delimiter between several
 search terms, it would have to be percent-encoded whenever it is part of
 the data.  Where the plus carries no special meaning (as is the case in
 Onionoo), it doesn't need to be encoded, and whether only the non-encoded
 form or both forms are accepted is up to the processing application.

 With the current patch, Onionoo accepts a plus submitted as either `+` or
 `%2B` in searches.  The following tests (in ResourceServletTest) pass, as
 do their already existing counterparts with just `+`:

 {{{
   @Test(timeout = 100)
   public void testSearchBase64FingerprintPlusEncoded() {
     this.assertSummaryDocument(
         "/summary?search=ACXBNsHzqe7%2BKuP5GPA7+iG1Bws", 1,
         new String[] { "TimMayTribute" }, 0, null);
   }

   @Test(timeout = 100)
   public void testSearchEmailAddressEncodedPlus() {
     this.assertSummaryDocument(
         "/summary?search=contact:<tor%2Bsteven.murdoch@cl.cam.ac.uk>", 1,
         new String[] { "TimMayTribute" }, 0, null);
   }
 }}}

 Regarding the space character:

 Replacing it with `+` is part of form encoding (cf.
 [https://secure.php.net/manual/en/function.urlencode.php PHP manual],
 [https://www.w3.org/TR/html4/interact/forms.html#h-17.13.3.4 W3C HTML
 spec]), whereas `encodeURIComponent()` encodes it as `%20`, in line with
 the URI spec.

 Onionoo does not receive form data, therefore accepting plus encoded as
 `+` or `%2B` and space as `%20` is perfectly fine.
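
 The difference between the two decoding styles can be sketched as follows.
 This is only an illustration, not the code from the patch: the class
 `PercentDecodeSketch` and its methods are hypothetical names.  `decode`
 resolves `%XX` escapes and leaves a literal `+` untouched, while
 `formDecode` wraps the JDK's `java.net.URLDecoder`, which applies form
 decoding and therefore turns `+` into a space.

```java
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration only; not part of Onionoo. */
final class PercentDecodeSketch {

  /** Decodes %XX escapes as UTF-8 bytes and keeps '+' as a literal plus. */
  static String decode(String encoded) {
    byte[] in = encoded.getBytes(StandardCharsets.UTF_8);
    ByteArrayOutputStream out = new ByteArrayOutputStream(in.length);
    int i = 0;
    while (i < in.length) {
      if (in[i] == '%' && i + 2 < in.length) {
        int hi = hexValue(in[i + 1]);
        int lo = hexValue(in[i + 2]);
        if (hi >= 0 && lo >= 0) {
          out.write((hi << 4) | lo);
          i += 3;
          continue;
        }
      }
      // Not a well-formed escape: copy the byte unchanged.
      out.write(in[i]);
      i++;
    }
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }

  /** Form decoding as done by the JDK: '+' becomes a space. */
  static String formDecode(String encoded) {
    try {
      return URLDecoder.decode(encoded, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is always available on a conforming JVM.
      throw new IllegalStateException(e);
    }
  }

  /** Returns the value of an ASCII hex digit, or -1 if it is not one. */
  private static int hexValue(byte b) {
    if (b >= '0' && b <= '9') {
      return b - '0';
    }
    if (b >= 'a' && b <= 'f') {
      return b - 'a' + 10;
    }
    if (b >= 'A' && b <= 'F') {
      return b - 'A' + 10;
    }
    return -1;
  }
}
```

 With this sketch, `decode("ACXBNsHzqe7%2BKuP5GPA7+iG1Bws")` yields
 `ACXBNsHzqe7+KuP5GPA7+iG1Bws`, whereas `formDecode` of the same input would
 replace the unencoded `+` with a space and break the base64 fingerprint.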

 The patch passes all checks and tests and is ready to be merged, perhaps
 with some tests added to make clear what data is accepted.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/24384#comment:4>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

