[tor-bugs] #26420 [Core Tor/Stem]: Testing - specify literal patterns instead of regex patterns (was: Discuss: Testing - specify literal patterns instead of regex patterns)
Tor Bug Tracker & Wiki
blackhole at torproject.org
Wed Jun 20 20:29:12 UTC 2018
#26420: Testing - specify literal patterns instead of regex patterns
------------------------------------------+---------------------
Reporter: dmr | Owner: dmr
Type: enhancement | Status: new
Priority: Very Low | Milestone:
Component: Core Tor/Stem | Version:
Severity: Minor | Resolution:
Keywords: dev testing code-improvement | Actual Points:
Parent ID: | Points:
Reviewer: atagar | Sponsor:
------------------------------------------+---------------------
Changes (by dmr):
* status: needs_review => new
* type: task => enhancement
Old description:
> w.r.t.
> [[https://gitweb.torproject.org/stem.git/commit/?id=7711050619af1a2f8ecf4aa87f774baa5f367b3b|7711050619af1a2f8ecf4aa87f774baa5f367b3b]],
> I was planning to file this ticket anyways, so might as well now for the
> discussion.
>
> atagar linked [[https://stackoverflow.com/questions/8672754/how-to-show-
> the-error-messages-caught-by-assertraises-in-unittest-in-
> python2-7/8673096#8673096|this StackOverflow answer]] in the commit
> message.
>
> (I was a bit behind on filing this ticket, but already started doing the
> literal `re.escape()` bit in my test cases. Hence atagar's comment in the
> commit.)
>
> Anyway, here's the ticket text I had started to prep - now //slightly//
> tweaked:
> ===
> The testing codebase makes pretty extensive use of
> `unittest.TestCase.assertRaisesRegexp()`.
> An example is
> [[https://gitweb.torproject.org/stem.git/tree/test/integ/client/connection.py?id=0192b29a4784465e5f69f11ced584a54644e4a90#n36|here]]:
> {{{
> def test_no_common_link_protocol(self):
> """
> Connection without a commonly accepted link protocol version.
> """
>
> for link_protocol in (1, 2, 6, 20):
> self.assertRaisesRegexp(stem.SocketError, 'Unable to establish a
> common link protocol with 127.0.0.1:1113', Relay.connect, '127.0.0.1',
> test.runner.ORPORT, [link_protocol])
> }}}
> The second argument is treated as a regex pattern, so it will actually
> match more than intended - possibly leading to some false negatives
> (although unlikely in this example).
>
> The use of `unittest.TestCase.assertRaisesRegexp()` without `re.escape()`
> for a literal expression is decently common - the use of it intended for
> a regex is fairly rare (I haven't come across a test yet that I can
> recall).
>
> Having a "literal" check is possible in (at least) two ways:
> {{{
> with self.assertRaises(SomeException) as cm:
> do_something(some_param)
> self.assertEqual(str(cm.exception), expected_literal)
> }}}
> {{{
> self.assertRaisesRegexp(SomeException, '^%s$' %
> re.escape(expected_literal), do_something, some_param)
> }}}
>
> Importantly, both of these check for //exact// string.
> Much of the codebase doesn't use `re.escape()`, and where it does, I
> didn't see any `^` or `$` (although I didn't search exhaustively), so
> that could match on substrings, also allowing for subtle bugs.
>
> So we may consider a helper class `StemTestCase(unittest.TestCase)` which
> adds an `assertRaisesLiteral()` method, to make it easier to do this. (Or
> some other means of adding that in.)
>
> We could of course take the second option with `re.escape()`, but since a
> lot of the codebase already doesn't seem to do that, it's probably quite
> easy to forget, especially the `'^%s$' % ` part.
>
> atagar: thoughts on these options? or leave things as-is / `wontfix`?
>
> (Filing this as a //task// ticket, as it's probably a discussion point
> more than anything else. I'd expect from the edge cases, there //could//
> be some defects, some enhancements.)
New description:
~~w.r.t.
[[https://gitweb.torproject.org/stem.git/commit/?id=7711050619af1a2f8ecf4aa87f774baa5f367b3b|7711050619af1a2f8ecf4aa87f774baa5f367b3b]],
I was planning to file this ticket anyways, so might as well now for the
discussion.~~
atagar linked [[https://stackoverflow.com/questions/8672754/how-to-show-
the-error-messages-caught-by-assertraises-in-unittest-in-
python2-7/8673096#8673096|this StackOverflow answer]] in the commit
message.
(I was a bit behind on filing this ticket, but already started doing the
literal `re.escape()` bit in my test cases. Hence atagar's comment in the
commit.)
Anyway, here's the ticket text I had started to prep - now //slightly//
tweaked:
===
The testing codebase makes pretty extensive use of
`unittest.TestCase.assertRaisesRegexp()`.
An example is
[[https://gitweb.torproject.org/stem.git/tree/test/integ/client/connection.py?id=0192b29a4784465e5f69f11ced584a54644e4a90#n36|here]]:
{{{
def test_no_common_link_protocol(self):
"""
Connection without a commonly accepted link protocol version.
"""
for link_protocol in (1, 2, 6, 20):
self.assertRaisesRegexp(stem.SocketError, 'Unable to establish a
common link protocol with 127.0.0.1:1113', Relay.connect, '127.0.0.1',
test.runner.ORPORT, [link_protocol])
}}}
The second argument is treated as a regex pattern, so it will actually
match more than intended - possibly leading to some false negatives
(although unlikely in this example).
The use of `unittest.TestCase.assertRaisesRegexp()` without `re.escape()`
for a literal expression is decently common - the use of it intended for a
regex is fairly rare (I haven't come across a test yet that I can recall).
Having a "literal" check is possible in (at least) two ways:
{{{
with self.assertRaises(SomeException) as cm:
do_something(some_param)
self.assertEqual(str(cm.exception), expected_literal)
}}}
{{{
self.assertRaisesRegexp(SomeException, '^%s$' %
re.escape(expected_literal), do_something, some_param)
}}}
Importantly, both of these check for //exact// string.
Much of the codebase doesn't use `re.escape()`, and where it does, I
didn't see any `^` or `$` (although I didn't search exhaustively), so that
could match on substrings, also allowing for subtle bugs.
So we may consider a helper class `StemTestCase(unittest.TestCase)` which
adds an `assertRaisesLiteral()` method, to make it easier to do this. (Or
some other means of adding that in.)
We could of course take the second option with `re.escape()`, but since a
lot of the codebase already doesn't seem to do that, it's probably quite
easy to forget, especially the `'^%s$' % ` part.
~~atagar: thoughts on these options? or leave things as-is / `wontfix`?~~
~~(Filing this as a //task// ticket, as it's probably a discussion point
more than anything else. I'd expect from the edge cases, there //could//
be some defects, some enhancements.)~~
See especially [[comment:2|atagar's comment about implementation
thoughts]].
--
Comment:
Since this is no longer really a discussion, swapping some aspects of the
ticket.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/26420#comment:5>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
More information about the tor-bugs
mailing list