6.824 Lab 4: Semantic File System

Due: Tuesday, October 16


The Semantic File System (proposed by Gifford in [1]) provides access to a file system by fulfilling queries for file attributes rather than organizing files in a hierarchy of directories. A typical use of the Semantic File System as originally proposed might look as follows:
% ls /sfs/owner:/smith
bio.txt paper.tex prop.tex
In this example the user has requested all documents owned by user smith. This is done by listing the virtual directory owner:/smith. The system creates this directory "on the fly" and populates it with the files that match the search criterion (i.e. are owned by smith). In this lab you will implement a scaled-down version of the semantic file system using an SFS user-level server similar to the toolkit described in the Mazières paper you read for class [2]. In this document "SFS" will refer to the self-certifying file system; we will refer to the semantic file system by its full name. The final product of this lab will be a user-level file server that enables semantic file system-like access to a local directory. A working daemon might function as follows:
blood% ./sfsusrv -s -p 0 -f ./cfg
sfsusrv: version 0.5, pid 82575
sfsusrv: No sfssd detected, running in standalone mode.
sfsusrv: Now exporting directory:/tmp/export-fdabek
sfsusrv: serving /sfs/2101@blood.lcs.mit.edu:ueu2nc7xix8uh63he852ymv8tmzaujix
blood% ls /tmp/export-fdabek
test.c test.o a.out
sweat% cd /sfs/2101@blood.lcs.mit.edu:ueu2nc7xix8uh63he852ymv8tmzaujix
sweat% ls
test.c test.o a.out
sweat% cd :extension=.c
sweat% ls
sweat% cat test.c
In this example, I have listed the files stored in /tmp/export-fdabek (a directory on blood's local disk) with a .c extension by accessing a specially named semantic directory (:extension=.c) under the SFS mount point for that file system on a remote machine. Note that the ls is done from sweat: because of a risk of deadlock, it is not possible to access files on the local machine via SFS.

The SFSUSRV file server

You will implement your semantic file system by modifying an existing userspace SFS server (sfsusrv). sfsusrv reads SFS RPCs from a network connection, and executes them using ordinary UNIX system calls such as open() and read(). The following shows how sfsusrv fits into the rest of the NFS/SFS system:
Your job will be to extend sfsusrv to answer semantic file system queries as well as ordinary file accesses.

Building the sfsusrv software

To get started with the software, unpack the existing sfsusrv from /home/to2/labs/sfsusrv.tar.gz. This project uses a static makefile, so no configuration step is necessary. However, the makefile will not work correctly on machines other than the class machines.
% cd
% tar xzvf /home/to2/labs/sfsusrv.tar.gz
% cd sfsusrv
% gmake

Running sfsusrv

Before you run sfsusrv, you need to generate a public/private key pair for use by the server. Run the following command:
% sfskey gen -KP sfs_host_key
 Creating new key for sfs_host_key.
      Key Name: fdabek@blood.lcs.mit.edu Press return
To export a directory from blood using sfsusrv, do the following:
blood% mkdir /tmp/export-$USER
blood% echo hello > /tmp/export-$USER/test.file
blood% echo export /tmp/export-$USER > cfg
blood% echo keyfile sfs_host_key >> cfg
blood% ./sfsusrv -s -p 0 -f ./cfg 
sfsusrv: version 0.5, pid 82575
sfsusrv: No sfssd detected, running in standalone mode.
sfsusrv: Now exporting directory:/tmp/export-fdabek
sfsusrv: serving /sfs/2101@blood.lcs.mit.edu:ueu2nc7xix8uh63he852ymv8tmzaujix
The last line that sfsusrv prints is the path name under which your exported directory (/tmp/export-$USER on blood) will appear on SFS client machines. Your path will be different from the one printed above. The 2101 is the port number that sfsusrv is listening on. You can log into sweat and look at your exported files (but change the /sfs/... pathname to the one that your sfsusrv printed):
sweat%  ls /sfs/2101@blood.lcs.mit.edu:ueu2nc7xix8uh63he852ymv8tmzaujix/
You can learn more about how SFS works, and about why /sfs/... pathnames look the way they do, by reading the SFS paper.

Tracing RPCs

Once you've gotten sfsusrv running, you can use it to trace NFS RPCs. Kill your existing sfsusrv with control-C, and start a new one with the ASRV_TRACE environment variable set to 10:
blood% env ASRV_TRACE=10 ./sfsusrv -s -p 0 -f ./cfg 

Now, on sweat, browse the exported file system. You'll have to use the /sfs/... pathname printed out by the current instance of sfsusrv, since the port number will change each time. As you browse, the server on blood will print out a complete trace of all NFS requests it receives. (Large structures may be truncated; if this is ever a problem, try higher values than 10.)

Now capture the output in a file:

blood% env ASRV_TRACE=10 ./sfsusrv -s -p 0 -f ./cfg |& tee nfs.trace
If you're using a Bourne-like shell, try this instead:
blood% env ASRV_TRACE=10 ./sfsusrv -s -p 0 -f ./cfg 2>&1 | tee nfs.trace

After setting up sfsusrv to trace NFS traffic, run the following commands (substituting the correct self-certifying pathname):

sweat% cd /sfs/...
sweat% rm junk
rm: junk: No such file or directory
sweat% echo hello > junk
sweat% cat junk
sweat% cat junk
Now stop sfsusrv, and look at the RPCs in the nfs.trace file. Answering the following questions will help you understand how NFS and sfsusrv interact.

A Semantic File System

Now that you understand how the sfsusrv software works you can modify it to process semantic file system queries. Your sfsusrv should make it look as if there are directories corresponding to queries, but should calculate their contents on the fly, since those directories don't actually exist on the server machine.

You are not required to implement the semantic file system as described in Gifford's paper. You only have to serve queries of the following form:

:extension=ext

When you see a reference to a file whose name looks like this, you should produce a virtual directory containing all files in the original directory whose names end with ext. Note that because your program is an NFS server, unlike the implementation described in Gifford's paper, you do not need to create symbolic links to the "real" files.

All of the files that are eligible to satisfy queries will be located in the current working directory; you are not responsible for indexing an entire file system. For example,

ls /sfs/.../foo-dir/:extension=.c/
should list all of the files that end in .c in directory foo-dir.
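
Because NFS resolves a path one component at a time, the server sees a query such as :extension=.c as an ordinary name in a lookup request. The following is a minimal sketch of how such a name might be recognized and matched against file names. The helper names parse_extension_query and has_extension are hypothetical and are not part of sfsusrv, and treating an empty suffix as an ordinary name is an assumption, not something the handout specifies.

```cpp
#include <cassert>
#include <string>

// Prefix that marks a name as a semantic query (spelling taken from the
// examples in this handout).
static const std::string kQueryPrefix = ":extension=";

// Hypothetical helper: if `name` looks like ":extension=.c", store the
// suffix (".c") in *ext_out and return true. Otherwise return false.
bool parse_extension_query(const std::string &name, std::string *ext_out) {
    assert(ext_out != nullptr);
    if (name.compare(0, kQueryPrefix.size(), kQueryPrefix) != 0)
        return false;  // not a query; handle as an ordinary name
    std::string ext = name.substr(kQueryPrefix.size());
    if (ext.empty())
        return false;  // assumption: an empty suffix is not a valid query
    *ext_out = ext;
    return true;
}

// Hypothetical helper: true if `filename` ends with `ext`.
bool has_extension(const std::string &filename, const std::string &ext) {
    return filename.size() >= ext.size() &&
           filename.compare(filename.size() - ext.size(), ext.size(), ext) == 0;
}
```

With these helpers, a lookup handler could first call parse_extension_query on the requested name and fall back to the normal code path whenever it returns false.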

Getting Started

If you aren't extremely comfortable with the NFS specification, refer to the RFC (RFC 1813 describes NFS version 3). Another great resource is NFS Illustrated by Brent Callaghan (published by Addison-Wesley).

You should use sfsusrv as a starting point for implementing your daemon. You'll "override" the way sfsusrv handles a subset of the NFS3 RPC calls to provide for the semantic queries. You'll need to modify the handling of at least the following RPCs:

Each RPC is implemented in a method named "nfs3_RPCNAME" in the file client.C. client.C contains code which handles NFS calls. filesrv.C contains some utility functions and a cache of file descriptors associated with recently accessed files.

You may also find that you need to keep some additional state about the status of a query. The fh_entry structure defined in filesrv.h is a good place to add this additional state.

Your server is free to access the local file system via standard POSIX system calls (read, write, readdir, etc.). Because the server itself performs disk I/O, you must not access its exported files through SFS on the same machine: doing so risks deadlock.
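
As one illustration of using the POSIX directory calls mentioned above, the sketch below collects the names in a real directory that end with a given suffix. The function name list_matching is hypothetical, and sorting the result is a choice made here only so that repeated listings come back in a stable order; sfsusrv's own directory-reading code may be organized quite differently.

```cpp
#include <algorithm>
#include <cassert>
#include <dirent.h>
#include <string>
#include <vector>

// Hypothetical helper: fill *out with the names in `dir_path` that end
// with `ext`, in sorted order. Returns false if the directory cannot be
// opened (in which case *out is left empty).
bool list_matching(const std::string &dir_path, const std::string &ext,
                   std::vector<std::string> *out) {
    assert(out != nullptr);
    out->clear();
    DIR *dir = opendir(dir_path.c_str());
    if (dir == nullptr)
        return false;
    while (struct dirent *ent = readdir(dir)) {
        std::string name = ent->d_name;
        if (name == "." || name == "..")
            continue;  // never report the directory's own links as matches
        if (name.size() >= ext.size() &&
            name.compare(name.size() - ext.size(), ext.size(), ext) == 0)
            out->push_back(name);
    }
    closedir(dir);
    std::sort(out->begin(), out->end());
    return true;
}
```

Note that this scans the directory on every call, which matches the "on the fly" behavior described earlier; whether to cache results is a design decision left to you.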

It is important to remember when designing this experiment that NFS filehandles are opaque data structures. You are free to form the file handles you return in any way you see fit (as long as they are 64 bytes or shorter).
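
To make the point about opaque handles concrete, here is a minimal sketch of one way a server could pack the identity of a virtual query directory into at most 64 bytes. Everything in it is an assumption made for illustration: the QueryHandle struct, the tag byte, the parent_id field, and the byte layout are hypothetical and do not describe how sfsusrv builds its handles. In particular, the sketch assumes your server can guarantee that ordinary handles never begin with the tag byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

const std::size_t kMaxHandleBytes = 64;  // NFSv3 limit on file handle size
const unsigned char kQueryTag = 0xFF;    // hypothetical marker for query handles

// Hypothetical description of a virtual query directory.
struct QueryHandle {
    uint64_t parent_id;  // assumption: server-chosen id of the real parent directory
    std::string ext;     // suffix being queried, e.g. ".c"
};

// Illustrative layout: 1 tag byte, 8 bytes of parent_id (little-endian),
// 1 length byte, then the suffix bytes.
bool encode_query_handle(const QueryHandle &q, std::string *out) {
    assert(out != nullptr);
    if (1 + 8 + 1 + q.ext.size() > kMaxHandleBytes)
        return false;  // would not fit in an NFSv3 file handle
    out->clear();
    out->push_back(static_cast<char>(kQueryTag));
    for (int i = 0; i < 8; i++)
        out->push_back(static_cast<char>((q.parent_id >> (8 * i)) & 0xFF));
    out->push_back(static_cast<char>(q.ext.size()));
    out->append(q.ext);
    return true;
}

// Returns false if `bytes` is not a well-formed query handle.
bool decode_query_handle(const std::string &bytes, QueryHandle *q) {
    assert(q != nullptr);
    if (bytes.size() < 10 || bytes.size() > kMaxHandleBytes)
        return false;
    if (static_cast<unsigned char>(bytes[0]) != kQueryTag)
        return false;
    std::size_t len = static_cast<unsigned char>(bytes[9]);
    if (bytes.size() != 10 + len)
        return false;
    uint64_t id = 0;
    for (int i = 0; i < 8; i++)
        id |= static_cast<uint64_t>(static_cast<unsigned char>(bytes[1 + i])) << (8 * i);
    q->parent_id = id;
    q->ext = bytes.substr(10);
    return true;
}
```

Since the client only stores, compares, and returns these bytes, the server can decode a returned handle later to recover which query it refers to.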


Your daemon should


A test script has been provided: it is located in /home/to2/labs/. The script takes a single argument: the self-certifying pathname to your server. Since the tester accesses files via SFS, it must be run on a machine other than the one your server is running on. For example, if you ran your server on blood, you might run the following on sweat:
sweat% /home/to2/labs/sfs-test.pl /sfs/4007@blood.lcs.mit.edu:gej8jaf3ky53jipevweaham84iw4xpr2/
Setting up test files in/sfs/4007@blood.lcs.mit.edu:gej8jaf3ky53jipevweaham84iw4xpr2/
Testing basic sfsusrv behavior...
    Testing ls..passed
    Testing write/read...passed
Testing semantic behavior...
    Testing ls...passed
    Testing write/read...passed
    Testing create...passed
    Testing remove...passed
    Testing rename...passed
Testing semantic behavior in a subdirectory...
    Testing ls...passed
    Testing write/read...passed
    Testing create...passed
    Testing remove...passed
    Testing rename...passed
The tests run in three phases: basic sfsusrv behavior, semantic behavior, and semantic behavior in a subdirectory. Feel free to browse the source of the tester to gain a fuller understanding of what it is testing.

Submitting your lab

To submit your lab, place a tarball in ~/handin/lab4/sfs.tar.gz that contains source files and a makefile. Please do not include object files or binaries. To make the tarball, run the following commands:
% cd
% cd sfsusrv
% gmake clean
% cd
% mkdir ~/handin/lab4
% tar czvf ~/handin/lab4/sfs.tar.gz sfsusrv
The lab is due before class on Tuesday, October 16.

Filehandle FAQ

Q: What is an NFS filehandle?

A: An NFSv3 filehandle is an opaque data structure of up to 64 bytes used to identify a 'file'. The fact that the filehandle is "opaque" means that the server generates the filehandle with whatever internal structure it wishes and the client never interprets the structure of the filehandle. The client may compare the filehandle to others or return it to the server, but it has no need to understand how it was constructed.

Q: What can/should I put in an NFS filehandle?

A: Anything you like, as long as you follow a few guidelines:

Q: Ok, but how do I actually get my filehandle into the NFS reply?

A: The nfs_fh3 structure defines the representation of a filehandle. It supports two important operations (for this description, assume we have declared nfs_fh3 *fh):


[1] Semantic File Systems, David K. Gifford, Pierre Jouvelot, Mark A. Sheldon, and James W. O'Toole, Jr., 13th ACM Symposium on Operating Systems Principles, October 1991.

[2] A Toolkit for User-Level File Systems, David Mazières, USENIX 2001.