[tor-dev] TBB Memory Allocator choice fingerprint implications

Mon Aug 19 16:09:36 UTC 2019

Okay, I'm going to try to clear up a lot of misconceptions here. I
don't own Firefox's memory allocator, but I have worked in it recently
and am one of the people working on hardening it.

Firefox's memory allocator is not jemalloc. It's probably better
referred to as mozjemalloc. We forked jemalloc and have been improving
it (at least from our perspective.) Any analysis of or comparison to
jemalloc is - at this point - outdated and should be redone from
scratch against mozjemalloc on mozilla-central.

LD_PRELOAD='/path/to/libhardened_malloc.so' /path/to/program will do
nothing or approximately nothing. mozjemalloc uses mmap and other
low-level allocation primitives to create the chunks of memory that
its internal allocator manages. To successfully replace Firefox's
memory allocator you should either use LD_PRELOAD _with_ a
--disable-jemalloc build OR Firefox's replace_malloc functionality:
https://searchfox.org/mozilla-central/source/memory/build/replace_malloc.h
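
As a rough illustration of the paragraph above, here is a toy Python
model (not Firefox code). ToyOS, PreloadedMalloc and BundledAllocator
are hypothetical names invented for this sketch. It only assumes what
is stated above: the built-in allocator maps its own chunks and carves
objects out of them, so an injected malloc sees no traffic.

```python
class ToyOS:
    """Stands in for the kernel: hands out raw chunks, like mmap would."""

    def __init__(self):
        self.mmap_calls = 0

    def mmap(self, size):
        self.mmap_calls += 1
        return bytearray(size)


class PreloadedMalloc:
    """Stands in for a malloc replacement injected with LD_PRELOAD."""

    def __init__(self):
        self.calls = 0

    def malloc(self, size):
        self.calls += 1
        return bytearray(size)


class BundledAllocator:
    """Stands in for an allocator built into the program itself.

    It maps its own chunks and carves objects out of them, so it never
    asks anyone else's malloc for memory.
    """

    CHUNK_SIZE = 4096

    def __init__(self, toy_os):
        self.toy_os = toy_os
        self.chunk = None
        self.offset = 0

    def malloc(self, size):
        # Map a fresh chunk only when the current one cannot fit the request.
        if self.chunk is None or self.offset + size > len(self.chunk):
            self.chunk = self.toy_os.mmap(max(self.CHUNK_SIZE, size))
            self.offset = 0
        view = memoryview(self.chunk)[self.offset:self.offset + size]
        self.offset += size
        return view


def run_program(allocator, count=100):
    """The 'program': allocate a batch of small objects."""
    return [allocator.malloc(32) for _ in range(count)]


toy_os = ToyOS()
preloaded = PreloadedMalloc()

# Default build: the program uses its bundled allocator.
run_program(BundledAllocator(toy_os))
print("preloaded malloc calls:", preloaded.calls)
print("mmap calls:", toy_os.mmap_calls)
```

In this model the preloaded allocator only gets used if the program is
built to call it directly (run_program(preloaded)), which is loosely
the role a --disable-jemalloc build plays.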

Fingerprinting: It is most likely possible, with enough creativity, to
fingerprint which memory allocator is in use. If we were to choose
between different allocators at runtime, I don't think fingerprinting
is the worst thing we would be exposed to - it seems likely that any
attacker who can mount such an attack could also fingerprint your CPU
speed, RAM, and ASLR base addresses, which depending on the OS might
not change until reboot.

The only reason I can think of to choose between allocators at runtime
is to introduce randomness into the allocation strategy. An attacker
relying on a blind overwrite may then not be able to position their
overwrite reliably. For that to matter, a failed attempt also has to
cause the process to crash; otherwise they can just try again.

Allocators can introduce randomness themselves; you don't need to
choose between allocators to do that.
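
To make the blind-overwrite argument concrete, here is a small
simulation sketch. It is a toy model, not any real allocator: the
attacker always guesses slot 0, the "allocator" either always places
the victim there or picks one of num_slots at random, and
attempts_allowed models whether a wrong guess crashes the process (1)
or can be retried. All names and numbers are made up for illustration.

```python
import random


def blind_overwrite_succeeds(num_slots, randomized, rng):
    """One attempt: the attacker blindly targets slot 0."""
    victim_slot = rng.randrange(num_slots) if randomized else 0
    return victim_slot == 0


def success_rate(num_slots, randomized, attempts_allowed,
                 trials=20000, seed=1):
    """Fraction of trials in which any allowed attempt hits the victim."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        # attempts_allowed == 1 models "a wrong guess crashes the process".
        if any(blind_overwrite_succeeds(num_slots, randomized, rng)
               for _ in range(attempts_allowed)):
            wins += 1
    return wins / trials


print("deterministic, one shot:", success_rate(16, False, 1))
print("randomized, one shot:   ", success_rate(16, True, 1))
print("randomized, 50 retries: ", success_rate(16, True, 50))
```

With 16 slots the one-shot success rate under randomization should sit
near 1/16, while 50 free retries push it back up close to 1, which is
why the crash-on-failure condition matters.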

In virtually all browser exploits we have seen recently, the attacker
creates exploitation primitives that allow partial memory read/write
and then full memory read/write. Any randomness introduced is bypassed
and ends up ineffective. I've seen a general trend away from
randomness for this purpose. The exception is when the attacker is
heavily constrained, like exploiting over IPC or in a network
protocol, not when the attacker has a full JavaScript execution
environment available to them.

When exploiting a memory corruption vulnerability, you can target the
application's memory (meaning, target a DOM object or an ArrayBuffer)
or you can target the memory allocator's metadata. While allocator
metadata corruption was popular in the past, I haven't seen it used
recently.

Okay, with all that out of the way, let's talk about allocators.

I skimmed https://github.com/GrapheneOS/hardened_malloc and it looks
like it has:
 - out of line metadata
 - double free protection
 - guard regions of some type
 - zero-filling
 - MPK support
 - randomization
 - support for arenas

mozjemalloc:
 - arenas (we call them partitions)
 - randomization (supported, but not enabled by default due to limited
   utility; improvements are coming)
 - double free protection
 - zero-filling
In Progress:
 - we're actively working on guard regions
Future Work:
 - out of line metadata
 - MPK

hardened_malloc definitely has more bells and whistles than
mozjemalloc. But the benefit gained by slapping in an LD_PRELOAD and
calling it a day is small to zero. It is probably negative, because
you won't utilize partitions by default. You'd need a particularly
constrained vulnerability for it to actually prevent exploitation -
it's more likely you'll just cost the attacker another 2-8 hours of
work.
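
Why losing partitions would be a step backwards can be sketched with a
toy heap. This is an illustration under assumptions, not mozjemalloc's
design: the kind names "dom_object" and "array_buffer" are made up, and
freed slots are reused last-in-first-out within an arena. With a single
shared arena, an attacker-controlled allocation can land exactly where
a freed object used to be; with one arena per kind, it cannot.

```python
class ToyHeap:
    """Hands out (arena, slot) pairs, reusing freed slots LIFO."""

    def __init__(self, partitioned):
        self.partitioned = partitioned
        self.free_lists = {}   # arena name -> freed slot numbers
        self.next_slot = {}    # arena name -> next never-used slot number

    def _arena(self, kind):
        # Without partitions every kind shares one arena.
        return kind if self.partitioned else "shared"

    def malloc(self, kind):
        arena = self._arena(kind)
        free_list = self.free_lists.setdefault(arena, [])
        if free_list:
            slot = free_list.pop()
        else:
            slot = self.next_slot.get(arena, 0)
            self.next_slot[arena] = slot + 1
        return (arena, slot)

    def free(self, address):
        arena, slot = address
        self.free_lists.setdefault(arena, []).append(slot)


def attacker_reclaims_freed_object(heap):
    """Use-after-free setup: does a buffer land on the freed object?"""
    victim = heap.malloc("dom_object")
    heap.free(victim)                       # a dangling pointer remains
    attacker = heap.malloc("array_buffer")  # attacker-controlled contents
    return attacker == victim


print("shared arena:", attacker_reclaims_freed_object(ToyHeap(False)))
print("partitioned: ", attacker_reclaims_freed_object(ToyHeap(True)))
```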

Out of line metadata is attractive on the surface, but it tends to
only help when you have an off-by-one/four write and you corrupt
metadata state because it's the only thing you *can* do. With out of
line metadata, you can just corrupt a real object and effect a
different type of corruption. I'm pretty skeptical of the benefit at
this point, although I could be convinced. We don't see metadata
corruption attacks anymore - but I'm not sure if it's because we find
better exploit primitives or better vulnerabilities.
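
The trade-off described above can be shown with two toy heap layouts.
Both layouts are invented for this sketch (a 1-byte size header per
object versus a separate size table) and do not reflect the real
formats of mozjemalloc or hardened_malloc. The same one-byte overflow
out of object A hits B's header in the inline layout and B's data in
the out of line layout.

```python
SLOT = 8  # bytes of user data per object


def overflow_with_inline_metadata(extra=1):
    # Layout: [1-byte header][8 data bytes] for A, then the same for B.
    heap = bytearray(2 * (1 + SLOT))
    a_data, b_header, b_data = 1, 1 + SLOT, 2 + SLOT
    heap[0] = SLOT
    heap[b_header] = SLOT
    heap[b_data:b_data + SLOT] = b"B" * SLOT

    # Write `extra` bytes past the end of A (keep extra small: 1 to 4).
    payload = b"A" * (SLOT + extra)
    heap[a_data:a_data + len(payload)] = payload
    return {
        "metadata_corrupted": heap[b_header] != SLOT,
        "b_data_corrupted": bytes(heap[b_data:b_data + SLOT]) != b"B" * SLOT,
    }


def overflow_with_out_of_line_metadata(extra=1):
    # Layout: A and B data are adjacent; sizes live in a separate table.
    heap = bytearray(2 * SLOT)
    sizes = {0: SLOT, SLOT: SLOT}
    b_data = SLOT
    heap[b_data:b_data + SLOT] = b"B" * SLOT

    payload = b"A" * (SLOT + extra)
    heap[0:len(payload)] = payload
    return {
        "metadata_corrupted": sizes != {0: SLOT, SLOT: SLOT},
        "b_data_corrupted": bytes(heap[b_data:b_data + SLOT]) != b"B" * SLOT,
    }


print("inline:     ", overflow_with_inline_metadata())
print("out of line:", overflow_with_out_of_line_metadata())
```

In this model, moving the metadata away protects the metadata but
hands the overflow a real object to corrupt instead.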

In particular, if you wanted to pursue hardened_malloc you would need
to use replace_malloc and wire up the partitions correctly.
Randomization will almost certainly not help (and will hurt
performance)*. MPK sounds nice but you have to use it correctly (which
requires application code changes), you have to ensure there are no
MPK gadgets, and oh wait no one can use it because it's only available
in Linux on server CPUs. =(

* One place randomization will help is on the other side of an IPC
boundary. e.g. in the parent process. I'm trying to get that enabled
for mozjemalloc in H2 2019.

In conclusion, while it's possible hardened_malloc could provide some
small security increase over mozjemalloc, the gap is much smaller than
it was when I advocated for allocator improvements 5 years ago, the
effort is definitely non-trivial, and the gap is closing.

If people had the cycles to invest in something like this, I would
actually advocate for helping us test and benchmark Fuzzyfox, and see
if we can get the browser into a usable state with Fuzzyfox so we
could enable it in Tor Browser.

-tom