[tor-bugs] #33835 [Circumvention/BridgeDB]: Gmail's quoted response confuses BridgeDB's email autoresponder
Tor Bug Tracker & Wiki
blackhole at torproject.org
Thu Apr 16 15:10:15 UTC 2020
#33835: Gmail's quoted response confuses BridgeDB's email autoresponder
------------------------------------+-------------------------------
Reporter: phw | Owner: agix
Type: defect | Status: assigned
Priority: Medium | Milestone:
Component: Circumvention/BridgeDB | Version:
Severity: Normal | Resolution:
Keywords: s30-o22a2 | Actual Points:
Parent ID: #31279 | Points: 1
Reviewer: | Sponsor: Sponsor30-can
------------------------------------+-------------------------------
Comment (by agix):
To effectively parse out the requested options via email, I used the
get_payload() function for EmailMessage objects.
As pointed out in both answers to
[https://stackoverflow.com/questions/45124127/unable-to-extract-the-body-of-the-email-file-in-python/45124153],
a specific policy has to be set (policy.compat32 instead of the default
policy) to be able to use get_payload instead of get_body.
The advantage of get_payload is that it ignores the Content-Type and
Content-Transfer-Encoding headers and therefore focuses solely on the
actual payload, which makes parsing much easier.
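A small sketch of what this means in practice, assuming a hypothetical base64-encoded request (the address and the command text are made up for illustration and are not taken from the patch): without decode=True, get_payload() returns the body exactly as it was transmitted, so an encoded body stays encoded.

```python
import base64
import email
from email import policy

# Hypothetical request whose body is base64-encoded; not taken from BridgeDB.
encoded_body = base64.b64encode(b"get bridges").decode("ascii")
raw_message = (
    "From: user@example.com\n"
    "MIME-Version: 1.0\n"
    'Content-Type: text/plain; charset="utf-8"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    + encoded_body + "\n"
)

msg = email.message_from_string(raw_message, policy=policy.compat32)

# Without decode=True the transfer encoding is not undone.
as_sent = msg.get_payload()
# With decode=True the Content-Transfer-Encoding header is honoured
# and the result is a bytes object.
decoded = msg.get_payload(decode=True)

print(repr(as_sent))
print(repr(decoded))
```

If clients send encoded bodies, the parser would probably need decode=True (or an equivalent step) before looking at the text.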
get_payload(0).get_payload() extracts the plain text, but can't
differentiate between the actual message and quoted previous responses.
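To illustrate, here is a minimal sketch with a hypothetical Gmail-style reply (addresses, boundary, and commands are invented for the example; first_part_text is not a BridgeDB function). The first MIME part comes back as one string that contains both the new request and the quoted earlier message.

```python
import email
from email import policy

# Hypothetical multipart reply; the quoted block mimics what Gmail appends.
RAW_REPLY = (
    "From: user@example.com\n"
    "To: bridges@example.org\n"
    "Subject: Re: bridges\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="b1"\n'
    "\n"
    "--b1\n"
    'Content-Type: text/plain; charset="UTF-8"\n'
    "\n"
    "get transport obfs4\n"
    "\n"
    "On Thu, Apr 16, 2020 at 3:10 PM <bridges@example.org> wrote:\n"
    "> get help\n"
    "--b1\n"
    'Content-Type: text/html; charset="UTF-8"\n'
    "\n"
    "<div>get transport obfs4</div>\n"
    "--b1--\n"
)


def first_part_text(raw):
    """Return the payload of the first MIME part (or the whole body)."""
    msg = email.message_from_string(raw, policy=policy.compat32)
    if msg.is_multipart():
        return msg.get_payload(0).get_payload()
    return msg.get_payload()


text = first_part_text(RAW_REPLY)
print(text)
```

The printed text starts with the new request but also includes the "> get help" line from the quoted response, so anything that scans the whole string sees both commands.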
That's why this approach only looks at the first three words of an
incoming email to determine the response action.
This seems to be one of only a few approaches available if we don't want
to rely on regex anymore, but I am open to suggestions.
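The first-three-words idea could look roughly like the sketch below. This is an illustration under stated assumptions, not the code from the patch: leading_words and choose_action are hypothetical helpers, and the command names and returned action strings are chosen only for the example.

```python
def leading_words(body, count=3):
    """Return the first `count` whitespace-separated words, lowercased."""
    return [word.lower() for word in body.split()[:count]]


def choose_action(body):
    """Map the leading words of a message body to a hypothetical action."""
    words = leading_words(body)
    if len(words) >= 3 and words[:2] == ["get", "transport"]:
        return "transport:" + words[2]
    if words[:2] == ["get", "ipv6"]:
        return "ipv6"
    if words[:2] == ["get", "bridges"]:
        return "bridges"
    # Anything unrecognised falls back to the help text.
    return "help"


reply = (
    "get transport obfs4\n"
    "\n"
    "On Thu, Apr 16, 2020 at 3:10 PM <bridges@example.org> wrote:\n"
    "> get help\n"
)
print(choose_action(reply))
```

Because only the leading words are inspected, the quoted "> get help" further down is never seen. The trade-off is that a request preceded by a greeting, or a reply where the client puts the quoted text first, would fall through to the help response.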
Additionally, some unit tests need to be adjusted for this patch.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/33835#comment:2>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online