Richard, as quoted by Simon Willison:

I’m of the opinion that you should never use mmap, because if you get an I/O error of some kind, the OS raises a signal, which SQLite is unable to catch, and so the process dies. When you are not using mmap, SQLite gets back an error code from an I/O error and is able to take remedial action, or at least compose an error message.

I have ranted about mmap1 on this site before, and avoidable crashes ought to be a compelling reason to avoid it, but people still use it! Perhaps it would help to ask why:

Performance

This is the usual reason people give. “mmap is faster” tends to rest on two arguments: syscall overhead and buffer copying.

Syscalls were an uncommon bottleneck twenty years ago, and overhead for common syscalls (like pread) has decreased significantly since then. These days their overhead is a rounding error compared to the work they perform.

Likewise, the people in userland whose performance depends on zero-copy techniques are writing filesystems, not using them2. VM CoW is good enough for the remaining people who can express their work units as some multiple of the VM’s page size.

Shared Memory

SQLite uses mmap3 for manifesting path-addressable, garbage-collected interprocess shared memory regions, which is the only use I’ve found for which there is no other option. A POSIX shmem object can’t be reliably associated with a database when the process is chrooted, and it will leak if the last process using it dies without releasing its claim first, which is super common on platforms with aggressive OOM behaviour (like iOS).
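As a rough illustration of what "path-addressable" buys you, here is a minimal sketch of a file-backed shared region. The helper `map_shared_file` is hypothetical, and it assumes every participant agrees on the region size up front; real code would also need locking, which is omitted here.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map `len` bytes of the file at `path` as shared,
 * writable memory, creating and sizing the file if needed.
 * Returns NULL on failure. Assumes all callers pass the same `len`. */
void *map_shared_file(const char *path, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return NULL;

    /* Make sure the file is large enough to back the whole mapping. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid after close */
    return p == MAP_FAILED ? NULL : p;
}
```

Any process that can resolve the same path gets the same bytes, and the kernel drops each mapping when its process exits, however abruptly. That is the part a named POSIX shmem object doesn’t give you.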

So this is where I step back from the doomerism a bit. The combination of file monitoring via DISPATCH_SOURCE_TYPE_VNODE and the strategic use of mach_vm_read_overwrite when significant time has elapsed since the last access has, in practice, almost entirely eliminated mmap-related crashes in SQLite (the API will return SQLITE_IOERR_VNODE (6922) instead) with no measurable performance cost. But these mitigations are so much more complicated than “an API that returns an error if it fails”, which brings us finally to the real reason people use mmap:

Convenience

mmap is from a time before APIs expressed their own opinions intentionally4 and its ease of use makes it an extremely attractive nuisance. Nobody likes to admit to being lazy, but for files of arbitrary size (like databases), jumping around with pointer arithmetic is hard to beat: one bounds check5 and no memory management. Humans follow incentives, and C provides an environment where it’s way easier to ignore errors than to handle them. Consider what SQLite does instead: roughly 8,000 lines of code implementing a pager (https://www.sqlite.org/src/file?udc=1&ln=on&ci=trunk&name=src%2Fpager.h) and another 2,000 implementing a page cache. It’s hard to blame people for making one call to mmap when the alternative is maintaining an additional 10kLoC to do it properly!
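For a sense of how little code the convenient path takes, here is a minimal sketch of fixed-size record lookup over a mapped file. Everything in it is hypothetical: the `mapped_file` struct, the function names, and the 16-byte `RECORD_SIZE` are made up for illustration.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define RECORD_SIZE 16u         /* hypothetical fixed record size */

struct mapped_file {
    const unsigned char *base;  /* start of the read-only mapping */
    size_t len;                 /* file size at the time of mapping */
};

/* Map a whole file read-only. Returns 0 on success, -1 on failure. */
int mapped_file_open(struct mapped_file *mf, const char *path)
{
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return -1;
    mf->base = p;
    mf->len = (size_t)st.st_size;
    return 0;
}

/* The one bounds check. There is no error path for I/O here: if a page
 * can't be read when the caller dereferences the result, that shows up
 * as a signal, not as a return value. */
const unsigned char *record_at(const struct mapped_file *mf, size_t i)
{
    if (i >= mf->len / RECORD_SIZE)
        return NULL;
    return mf->base + i * RECORD_SIZE;
}
```

No buffers to size, no cache to evict, no read loop. The bounds check also only knows the file’s length at the moment it was mapped, so a file that shrinks afterwards passes the check and still faults.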

The solution here is basically one of progress. CISA and the FBI have asked vendors to provide memory-safety roadmaps by 2026; while it’s possible to improve the memory safety of C code, these solutions are not complete6. Languages built with memory safety by default tend to include syntax that makes errors harder to ignore than to handle, and standard libraries with more ergonomic (and opinionated) abstractions for things like file I/O.