FSCQ is the first file system with a machine-checkable proof (in the Coq proof assistant) that its implementation meets its specification and whose specification includes behavior under crashes.
The original version of FSCQ (as presented in the SOSP 2015 paper) guaranteed transactional behavior for every system call: if the system crashes and recovers, either the entire system call or none of it appears to execute.
We are actively working on FSCQ from several angles, with the goal of demonstrating that performance comparable to ext4 is achievable in a verified file system.
One line of work is to improve FSCQ’s performance by providing weaker, deferred durability guarantees under crashes, as modern file systems do. Applications might issue several system calls and see all of them disappear if the system crashes. The file system buffers changes longer, exposing a new
sync operation to give users control over when to pay the cost of writing to disk. New APIs allow applications to modify file data without transactional guarantees, in addition to transactions for file metadata. These optimizations make the specification more complicated, and formalizing these weaker guarantees is also an active part of our research. Progress on FSCQ has resulted in a logging design with I/O efficiency on par with ext4.
Another line of work focuses on making FSCQ more CPU-efficient. Currently the code is translated mechanically to Haskell, with CPU overhead that makes FSCQ CPU-bottlenecked rather than I/O bottlenecked. The goal of this work is to extract to Go, an imperative language, while proving that the semantics of the code are preserved.
Not only does verified extraction give us control over the runtime code and scope for optimization, it also removes Coq’s built-in extraction to Haskell as well as the GHC compiler from our trusted computing base.
Another line of work focuses on making FSCQ concurrent. Our current approach is to first make the file-system I/O-concurrent, running a single system call at a time while another system call blocks waiting for disk I/O. We are able to implement this approach on top of arbitrary sequential code, isolating the concurrency to a shared buffer cache. Thus the concurrent FSCQ re-uses both the code and proofs of the sequential version.
Finally, we are interested in writing applications on top of FSCQ, such as key-value stores, databases, and mail servers. Part of this motivation is to gain confidence that the specification of FSCQ is usable. These applications will make use of common patterns for using POSIX to achieve correctness, such as issuing
fdatasync only when necessary and using
rename to atomically effect changes.
- Tej Chajed (PhD)
- Atalay İleri (PhD)
- Alex Konradi (MEng)
- Daniel Ziegler (MEng)
- Haogang Chen (PhD, alum)
- Prof. M. Frans Kaashoek
- Prof. Nickolai Zeldovich
- Prof. Adam Chlipala
Using Crash Hoare Logic for Certifying the FSCQ File System
Haogang Chen, Daniel Ziegler, Tej Chajed, Adam Chlipala, M. Frans Kaashoek, and Nickolai Zeldovich
SOSP 2015 (Best paper award)
Specifying crash safety for storage systems
Haogang Chen, Daniel Ziegler, Adam Chlipala, M. Frans Kaashoek, Eddie Kohler, and Nickolai Zeldovich.
- MIT researchers create file system guaranteed not to lose data even if a PC crashes, Computer World.
- MIT Researchers Create Crash-Tolerant File System Guaranteed Not To Lose Data, Forbes.
Our source code is available on GitHub at mit-pdos/fscq.