Parallel & Distributed Operating Systems Group

FSCQ: File-system verification

Project overview

FSCQ is the first file system with a machine-checkable proof (in the Coq proof assistant) that its implementation meets its specification and whose specification includes behavior under crashes.

The original version of FSCQ (as presented in the SOSP 2015 paper) guaranteed transactional behavior for every system call: if the system crashes and recovers, each system call appears to have executed either in its entirety or not at all.

Research directions

We are actively working on FSCQ from several angles, with the goal of demonstrating that performance comparable to ext4 is achievable in a verified file system.

Deferred durability

One line of work is to improve FSCQ’s performance by providing weaker, deferred durability guarantees under crashes, as modern file systems do. An application might issue several system calls and, if the system crashes, find that none of their effects survived. The file system buffers changes longer and exposes a new sync operation that gives users control over when to pay the cost of writing to disk. New APIs allow applications to modify file data without transactional guarantees, while file metadata is still updated transactionally. These optimizations make the specification more complicated, and formalizing the weaker guarantees is itself an active part of our research. So far, this work has produced a logging design with I/O efficiency on par with ext4.
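The deferred-durability contract described above can be illustrated with a toy model. This is only a sketch, not FSCQ's code or its specification: the class `ToyFS` and its methods `write`, `read`, `sync`, and `crash` are hypothetical names, and the model assumes the simplest crash outcome, in which every change made since the last sync is lost.

```python
class ToyFS:
    """Toy model of deferred durability (hypothetical; not FSCQ's API)."""

    def __init__(self):
        self.durable = {}   # state that survives a crash
        self.buffered = {}  # state currently visible to applications

    def write(self, name, data):
        # The change is visible immediately but only buffered in memory.
        self.buffered[name] = data

    def read(self, name):
        return self.buffered.get(name)

    def sync(self):
        # Pay the cost of writing to disk: all buffered changes become durable.
        self.durable = dict(self.buffered)

    def crash(self):
        # Assumed crash outcome in this model: everything since the
        # last sync disappears and the durable state is recovered.
        self.buffered = dict(self.durable)


fs = ToyFS()
fs.write("a", b"1")
fs.sync()              # "a" is now durable
fs.write("b", b"2")    # buffered only
fs.write("c", b"3")    # buffered only
fs.crash()
after_crash = (fs.read("a"), fs.read("b"), fs.read("c"))
```

In this model, the two writes issued after the sync both vanish on a crash, while the synced write survives. A real specification has to say more, for example which intermediate states a crash may expose.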

CPU performance

Another line of work focuses on making FSCQ more CPU-efficient. Currently, the code is translated mechanically to Haskell, with CPU overhead that makes FSCQ CPU-bottlenecked rather than I/O-bottlenecked. The goal of this work is to extract to Go, an imperative language, while proving that the semantics of the code are preserved.

Not only does verified extraction give us control over the runtime code and scope for optimization, it also removes Coq’s built-in extraction to Haskell as well as the GHC compiler from our trusted computing base.

Concurrency

Another line of work focuses on making FSCQ concurrent. Our current approach is to first make the file system I/O-concurrent: only one system call runs at a time, but it can run while another system call is blocked waiting for disk I/O. We are able to implement this approach on top of arbitrary sequential code, isolating the concurrency to a shared buffer cache. Thus the concurrent FSCQ reuses both the code and the proofs of the sequential version.

Applications

Finally, we are interested in writing applications on top of FSCQ, such as key-value stores, databases, and mail servers. Part of the motivation is to gain confidence that the specification of FSCQ is usable. These applications will make use of common patterns for using POSIX to achieve correctness, such as issuing fsync and fdatasync only when necessary and using rename to atomically effect changes.
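One widely used form of the rename pattern mentioned above is sketched below in Python on a POSIX system. The helper name `atomic_write` is hypothetical and is not taken from FSCQ or its applications. Whether the pattern is actually crash-safe depends on the guarantees the underlying file system gives for fsync and rename.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the contents of `path` with `data` via write-temp, fsync, rename.

    Hypothetical helper for illustration; assumes a POSIX system.
    """
    dirpath = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so that the
    # rename stays within a single file system.
    fd, tmp = tempfile.mkstemp(dir=dirpath)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # flush the new contents before the rename
        os.replace(tmp, path)     # atomically switch the name to the new file
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
    # fsync the directory so the rename itself reaches the disk.
    dirfd = os.open(dirpath, os.O_RDONLY)
    try:
        os.fsync(dirfd)
    finally:
        os.close(dirfd)
```

A call such as `atomic_write("config", b"v2")` is intended to leave either the old or the new contents under the name `config`, never a partially written file. Note that `tempfile.mkstemp` creates the file with owner-only permissions, so an application that needs other permissions would have to set them before the rename.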

Source code

Our source code is available on GitHub at mit-pdos/fscq.

Funding support

This research was supported in part by NSF award 1563763.