[tor-bugs] #21304 [Obfuscation/Snowflake]: Sanitize snowflake.log
Tor Bug Tracker & Wiki
blackhole at torproject.org
Mon Mar 25 19:24:35 UTC 2019
#21304: Sanitize snowflake.log
-----------------------------------+------------------------------
Reporter: arlolra | Owner: cohosh
Type: defect | Status: needs_review
Priority: Medium | Milestone:
Component: Obfuscation/Snowflake | Version:
Severity: Normal | Resolution:
Keywords: starter | Actual Points:
Parent ID: | Points: 1
Reviewer: | Sponsor:
-----------------------------------+------------------------------
Changes (by cohosh):
* status: needs_revision => needs_review
Comment:
Here's a new candidate:
https://github.com/cohosh/snowflake/compare/safe_log
Replying to [comment:12 dcf]:
Thanks for the helpful feedback! I've addressed each of the comments
below:
> Let me suggest a further simplification: ''only'' worry about the server
log for now, and then we can perhaps factor the safe-logging code into a
separate package for use by the other programs.
This is what I've done so far, the log scrubber is only being used on the
server log at the moment.
>
> * I think it's best to default to scrubbed in all cases.
Agreed. Used the suggested logic to avoid missing default loggers in the
future:
[https://github.com/cohosh/snowflake/blob/3eb9064438ca6242f935173aed88ec29a0c16c7d/server/server.go#L329
server.go#L329]
> * the default log output in the absence of a `--log` option should be
> `os.Stderr`, not `os.Stdout`.
> [https://golang.org/pkg/log/#pkg-overview Ref]: "That logger writes to
standard error..."
> There's now a [https://golang.org/pkg/log/#Logger.Writer
Logger.Writer] method that would allow
> us to ask for the default output, but as it was only
[https://github.com/golang/go/pull/28399 added in 1.12],
> it's probably still too new to use.
Oops, fixed. I agree that the Logger.Writer method is the best thing to
use and it's probably a good idea to wait until we want to update the go
version for other reasons as well.
> * The `logScrubber.Write` method only looks at one buffer's worth of
data at a time,
> See below re `RedactErrorWriter` for an idea on how to deal with this.
Cool, I really like how it works. I implemented that for the log scrubber
here:
[https://github.com/cohosh/snowflake/blob/3eb9064438ca6242f935173aed88ec29a0c16c7d/server/server.go#L91
server.go#L91] and added your tests:
[https://github.com/cohosh/snowflake/blob/3eb9064438ca6242f935173aed88ec29a0c16c7d/server/server_test.go#L54
server_test.go#L54]
> * You can use `!bytes.Equal(a, b)` instead of `bytes.Compare(a, b) !=
0`.
Done.
> * Should we scrub port numbers, too?
Yeah, that's a good idea. I've updated the regular expressions to catch
the port. See my comments below for what I settled on.
> * Consider structuring repetitive tests like this:
Done.
> * Instead of `X.X.X.X`,
[https://gitweb.torproject.org/tor.git/tree/src/app/config/config.c?h=tor-0.3.5.8#n1102
tor uses] `[scrubbed]`, as do
> [https://gitweb.torproject.org/pluggable-transports/meek.git/tree
/meek-server/meek-server.go?h=0.31#n181 meek-server]
> and
[https://gitlab.com/yawning/obfs4/blob/obfs4proxy-0.0.9/common/log/log.go#L42
obfs4proxy].
Nice, I like this better. It also leaks less information this way. Done.
>
> ----
> Is it worth doing both, structure-based object manipulation and string-
based patterns? Perhaps not, since the correct functioning of one can mask
errors in the other. You could think of it as defense in depth, but it
will be hard to consistently use a `redactError` function everywhere, when
forgetting it doesn't have any observable consequences. Surely having both
is safer, but the marginal safety increase of `redactError` may not be
worth the (albeit small) code complexity increase.
I think for now it's okay to just have the log scrubber. Before I started
it, I went through and looked at all of the returned errors and I don't
recall finding any net.OpErrors anyway.
> One other difference is this massive, yet conservative,
[https://gitweb.torproject.org/user/dcf/snowflake.git/tree/server/error.go?h
=redact-error#n58 IP address–detecting regexp]. I admit its appearance is
comical, but it will detect IPv6 addresses not surrounded by brackets, and
avoid false-positive matches like what you mentioned in comment:11. Also,
it scrubs port numbers. I would say, test
[https://github.com/cohosh/snowflake/blob/4b0acda984a5ae4335d206fc534f51efb9303d5d/server/server_test.go#L53
your] and
[https://gitweb.torproject.org/user/dcf/snowflake.git/tree/server/error_test.go?h
=redact-error#n148 my] test cases against the pattern you already have;
and if there are any misdetections, a more precise regexp may be in order.
I stumbled across that as well when I wrote the original regexp >.< I did
end up taking some elements from it and expanding the regexp that I have
to:
{{{
const ipv4Address = `\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}`
const ipv6Address = `(([0-9a-fA-F]{0,4}:){2,7}([0-9a-fA-F]{0,4})?(` +
ipv4Address + `))` +
`|(([0-9a-fA-F]{0,4}:){2,7}([0-9a-fA-F]{0,4})?)`
const optionalPort = `(:\d{1,5})?`
const addressPattern = `((` + ipv4Address + `)|(\[(` + ipv6Address +
`)\])|(` + ipv6Address + `))` + optionalPort
}}}
My reason for not going with the full one was that I found it unnecessary
after expanding my tests with the ones you used. I also found that this
massive regexp was still scrubbing the fingerprint, meaning additional
changes still had to be made. In particular, this test case failed:
{{{
//Make sure it doesn't scrub fingerprint
"a=fingerprint:sha-256
33:B6:FA:F6:94:CA:74:61:45:4A:D2:1F:2C:2F:75:8A:D9:EB:23:34:B2:30:E9:1B:2A:A6:A9:E0:44:72:CC:74",
"a=fingerprint:sha-256
33:B6:FA:F6:94:CA:74:61:45:4A:D2:1F:2C:2F:75:8A:D9:EB:23:34:B2:30:E9:1B:2A:A6:A9:E0:44:72:CC:74\n",
}}}
Go does not yet support look-ahead or look-behind, which is what we'd want
to pass this test case. Instead I implemented a work-around that will find
regular expressions that check for characters surrounding the address, but
only replaces the address portion of it:
{{{
const fullAddrPattern = `(^|\s|[^\w:])` + addressPattern +
`(\s|(:\s)|[^\w:]|$)`
var scrubberPatterns = []*regexp.Regexp{
regexp.MustCompile(fullAddrPattern),
}
func scrub(b []byte) []byte {
scrubbedBytes := b
for _, pattern := range scrubberPatterns {
// this is a workaround since go does not yet support look
ahead or look
// behind for regular expressions.
scrubbedBytes = pattern.ReplaceAllFunc(scrubbedBytes,
func(b []byte) []byte {
return addressRegexp.ReplaceAll(b,
[]byte("[scrubbed]"))
})
}
return scrubbedBytes
}
}}}
> In the case of these full SDP stanzas, I think we should just not be
logging them at all, not by default anyway.
> ...
> That's what the "SEND" button at
​https://snowflake.torproject.org/snowflake.html, and the
{{{NewCopyPasteDialer}}} in client, are for. I'm okay with that mode of
operation going away.
I removed this from the client logs, but if there's now unused
functionality in the client or proxy code perhaps we should remove that
too?
----
I made some other additional changes:
- The regular expressions are now compiled only once, outside of the write
function. This will make it easier to add more patterns the scrubber later
and probably makes it more efficient
- Modified the broker to not print out the raw bytes of ICE answers
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/21304#comment:15>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
More information about the tor-bugs
mailing list