mmap Considered Harmful

Richard, as quoted by Simon Willison:

I’m of the opinion that you should never use mmap, because if you get an I/O error of some kind, the OS raises a signal, which SQLite is unable to catch, and so the process dies. When you are not using mmap, SQLite gets back an error code from an I/O error and is able to take remedial action, or at least compose an error message.
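To make the contrast in the quote concrete, here is a minimal Python sketch of the non-mmap path (Python's `os.pread` wraps the `pread` syscall on POSIX systems). The helper name `read_at` is hypothetical, and a closed file descriptor stands in for a real disk error, which is hard to produce on demand. It is an illustration of the idea, not how SQLite itself is written.

```python
import errno
import os
import tempfile


def read_at(fd, length, offset):
    """Read up to `length` bytes at `offset`; return (data, error_message)."""
    try:
        return os.pread(fd, length, offset), None
    except OSError as exc:
        # The failure arrives as an ordinary error value, so the caller can
        # retry, fall back, or at least compose a message.
        name = errno.errorcode.get(exc.errno, "unknown")
        return None, f"pread failed: {name}: {exc.strerror}"


with tempfile.TemporaryFile() as f:
    f.write(b"hello, world")
    f.flush()
    fd = os.dup(f.fileno())
    ok_data, ok_err = read_at(fd, 5, 7)
    os.close(fd)
    # Assumption: a closed descriptor (EBADF) stands in for a real I/O error.
    bad_data, bad_err = read_at(fd, 5, 7)
```

With a memory map there is no equivalent place to put the `except` clause: the failing "call" is a plain memory access, and the failure is delivered as a signal.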

I have ranted about mmap¹ on this site before, and avoidable crashes ought to be a compelling reason to avoid it, but people still use it! Perhaps it would help to ask why:

Performance

This is the usual reason people give. The claim that “mmap is faster” usually rests on two arguments: syscall overhead and buffer copying.

Syscalls were an uncommon bottleneck twenty years ago, but almost nobody today can honestly attribute their software’s poor performance to syscall overhead, in part because overhead for common syscalls (like pread) has decreased significantly over the years.

Likewise, most people in userland whose performance still depends on zero copy techniques are writing filesystems, not using them². Copy-on-write is good enough for the remaining people who can express their work units as some multiple of the VM’s page size.

Shared Memory

SQLite uses mmap³ for manifesting garbage-collected interprocess shared memory regions by path, which is the only reason I’ve found for which there is no other option. A POSIX shared memory object can’t be reliably associated with a database when the process is chrooted, and it will leak if the processes using it all die (or are killed), which is super common on platforms with tight memory constraints.
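As a rough illustration of "shared memory by path", the sketch below maps the same file twice with a shared mapping and shows that a write through one mapping is visible through the other. It assumes a POSIX system; `open_shared_region`, `REGION_SIZE`, and the file name are hypothetical, and the two mappings in one process stand in for two cooperating processes. It does not reproduce SQLite's actual wal-index layout or locking.

```python
import mmap
import os
import tempfile

REGION_SIZE = mmap.PAGESIZE  # hypothetical size for the shared region


def open_shared_region(path, size=REGION_SIZE):
    """Map `size` bytes of the file at `path` as shared, creating it if needed."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        if os.fstat(fd).st_size < size:
            os.ftruncate(fd, size)
        # On Unix, mmap.mmap defaults to a shared, read-write mapping.
        return mmap.mmap(fd, size)
    finally:
        os.close(fd)  # the mapping stays valid after the descriptor is closed


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example-shm")  # hypothetical file name
    writer = open_shared_region(path)
    reader = open_shared_region(path)
    writer[0:4] = b"ping"
    seen = bytes(reader[0:4])
    writer.close()
    reader.close()
```

Because the region is found by path, it can sit next to the database it belongs to, and the mappings themselves disappear when the processes holding them exit.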

This is where I step back from the doomerism a bit. The combination of file monitoring via DISPATCH_SOURCE_TYPE_VNODE and the strategic use of mach_vm_read_overwrite when significant time has elapsed between accesses has, in practice, almost entirely eliminated mmap-related crashes in SQLite (the API will return SQLITE_IOERR_VNODE (6922) instead) with no measurable performance cost. But these mitigations are so much more complicated than “an API that returns an error if it fails”, which brings us finally to the real reason people use mmap:

Convenience

mmap is from a time before APIs expressed their own opinions⁴ and its ease of use makes it an extremely attractive nuisance. Nobody likes to admit to being lazy, but for files of arbitrary size (like databases), jumping around with pointer arithmetic is hard to beat: one bounds check and no memory management. Humans are creatures of incentive, and for the most part C provides no incentives for software to be any better than “good enough”.
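The convenience gap is easy to see side by side. The sketch below reads one fixed-size record from a file both ways; the record format (a single little-endian 32-bit integer) and the function names are assumptions made up for the example, and it assumes a POSIX system for `os.pread`.

```python
import mmap
import os
import struct
import tempfile

RECORD = struct.Struct("<I")  # hypothetical record: one little-endian uint32


def read_record_mmap(mapping, index):
    """One bounds check, then index straight into the mapping."""
    offset = index * RECORD.size
    if not 0 <= offset <= len(mapping) - RECORD.size:
        raise IndexError(index)
    return RECORD.unpack_from(mapping, offset)[0]


def read_record_pread(fd, index):
    """Same lookup with an explicit read, a buffer, and a short-read check."""
    if index < 0:
        raise IndexError(index)
    data = os.pread(fd, RECORD.size, index * RECORD.size)
    if len(data) != RECORD.size:
        raise IndexError(index)
    return RECORD.unpack(data)[0]


with tempfile.TemporaryFile() as f:
    f.write(b"".join(RECORD.pack(n * n) for n in range(1024)))
    f.flush()
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        via_mmap = read_record_mmap(m, 1000)
    via_pread = read_record_pread(f.fileno(), 1000)
```

Both functions return the same value here. The difference is that `read_record_pread` has a place where an I/O error surfaces as an exception, while `read_record_mmap` has none: a failed page-in during `unpack_from` would arrive as a signal.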

The solution here is one of progress. CISA and the FBI have asked vendors to provide memory-safety roadmaps by 2026; while C code can be made memory-safe, languages built with memory safety by default tend to be higher level and include standard libraries that provide more ergonomic (and opinionated) abstractions for things like file I/O.

¹ I am going to use mmap to refer specifically to the creation of file-backed memory maps. This is not strictly correct (there are map types other than MAP_FILE), but it is colloquial.
² Also virtual machines, at least circa 2010 when I helped get our hardware-accelerated 3D graphics stack down to zero copies within the scope of the vmx. It has been very funny to watch Apple Silicon’s Unified Memory Architecture do the same thing but in actual hardware.
³ For the wal index.
⁴ These days, languages and their libraries can conspire so “bad” code is more difficult to write than “good” code. This was never the case with POSIX interfaces and remains rare among syscalls in general.