NFS is a distributed file system that provides transparent access
to files residing on remote disks. It was developed at Sun Microsystems
in the early 1980s, and the protocol has been revised a number of
times since then. It is available on all Unix systems, including
Berkeley 4.4 and Linux, as well as on VMS, Mac, IBM, and Novell.
Other alternatives to NFS include:
Problems with NFS.
Why is NFS used then?
NFS Design and Theory.
Virtual File System
VFS was developed as a generic interface to the Unix file
system. It defines a set of operations to perform on the file system
independent of the underlying file system. VFS provides a consistent
interface to the file system whether files are accessed locally (ufs)
or remotely (NFS). For example, the stat call is handled on a local
system via the standard kernel call to stat. On an NFS-mounted system
it is done via an RPC to the server machine. From the programmer's or
user's perspective nothing has changed.
Fundamental to VFS is the concept of a file handle. A file
handle is nothing more than a reference to a file. For a local file
system this is the inode. On a remote system this is a name supplied
by the server. Given the handle, the file system can determine the
correct file to use.
RPC and NFS
NFS uses the RPC mechanism found in Unix. This enforces a
client-server relationship on the hosts that use it. RPC allows a host
to make a procedure call from a process that appears as if it is local
but in fact is executed on a remote machine. Typically, the host on
which the call is executed has resources (such as files) needed by the
calling host.
XDR (eXternal Data Representation) is a mechanism used by the
RPC mechanism (and NFS) to ensure reliable data exchange between
hosts. XDR defines a machine independent format for the exchange of
binary data.
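As a rough illustration of what a machine-independent format means, the sketch below encodes an integer and a string the way XDR lays them out: integers as 4 big-endian bytes, strings as a 4-byte length followed by the bytes, zero-padded to a multiple of 4. The function names are made up for this example and it covers only these two types.

```python
import struct

def xdr_uint(value: int) -> bytes:
    """Encode an unsigned 32-bit integer as 4 big-endian bytes."""
    return struct.pack(">I", value)

def xdr_string(text: str) -> bytes:
    """Encode a string as a 4-byte length, the bytes, then zero padding
    so that the data portion is a multiple of 4 bytes long."""
    data = text.encode("ascii")
    padding = (4 - len(data) % 4) % 4
    return xdr_uint(len(data)) + data + b"\x00" * padding

# The same bytes are produced on any host, regardless of its native byte order.
encoded = xdr_uint(2049) + xdr_string("suess")
print(encoded.hex())
```

Because both sides agree on this layout, a big-endian server and a little-endian client can exchange the same message without either knowing the other's architecture.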
At one point many believed RPC would become the de facto
way of programming. As such, it was made quite flexible in the way it
handles IP port mapping. A program named the portmapper runs on each
server and handles the task of determining the proper IP port to use
for an application. As servers are created they are registered with
the portmapper. When a client then requests a service, the portmapper
determines whether the requested application is present and returns
the proper port to use. The file /etc/rpc lists the RPC services to
support.
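The register-then-lookup behavior of the portmapper can be sketched with a small in-memory table. This is a toy model, not the real daemon: the class and method names are invented for the example, the real portmapper also keys on the transport protocol, and the mountd port shown is only illustrative. The program numbers 100003 (NFS) and 100005 (mountd) are the standard ones.

```python
class Portmapper:
    """Toy registry mapping (program number, version) to a port."""

    def __init__(self):
        self._ports = {}

    def register(self, program: int, version: int, port: int) -> None:
        # Called by a server when it starts up.
        self._ports[(program, version)] = port

    def lookup(self, program: int, version: int) -> int:
        # Called on behalf of a client; 0 means "not registered".
        return self._ports.get((program, version), 0)

pm = Portmapper()
pm.register(100003, 2, 2049)  # NFS
pm.register(100005, 1, 635)   # mountd (port chosen for illustration)

print(pm.lookup(100003, 2))   # port a client should use for NFS
print(pm.lookup(100021, 1))   # 0: this program was never registered
```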
Below is a list of the RPC procedures found in NFS.
- NFSPROC_NULL -- This procedure is termed null because it does nothing.
Applications use this to test if a server is responding.
- NFSPROC_GETATTR -- This procedure is used to get the attributes of a
file found in the inode. It returns protection, owner, size, and access times.
- NFSPROC_SETATTR -- This procedure sets the file attributes.
- NFSPROC_ROOT -- This procedure is obsolete; originally it was used
for mounting file systems.
- NFSPROC_LOOKUP -- This procedure does a directory lookup for the
client. On success, it returns a file handle which the client can use to
access the file as well as the current attributes of the file.
- NFSPROC_READLINK -- This procedure reads the value stored in a
symbolic link.
- NFSPROC_READ -- This procedure allows a client to read data from
a file.
- NFSPROC_WRITECACHE -- This procedure is not implemented yet.
- NFSPROC_WRITE -- This procedure is used to write data to a file.
- NFSPROC_CREATE -- This procedure allows a client to create a file in
a directory.
- NFSPROC_REMOVE -- This procedure is used to delete a file.
- NFSPROC_RENAME -- This procedure is used to rename a file.
- NFSPROC_LINK -- Clients use this to form a hard link to an existing
file.
- NFSPROC_SYMLINK -- Clients use this to create a symlink.
- NFSPROC_MKDIR -- Clients use this to create a directory.
- NFSPROC_RMDIR -- Clients use this to remove a directory, as with local file systems,
- NFSPROC_READDIR -- Clients use this to get the contents of a
directory. This is a little more complicated than the readdir call on
a local file system since the server must provide a handle for each
file in the directory. In addition, directories can be long and may
take more space than the buffer allows for. This means that the
readdir call must have some way of maintaining state between the client
and server. This is done through a so-called magic cookie. The
magic cookie is passed back and forth between the client and server
and maintains the state information needed.
- NFSPROC_STATFS -- Clients may use this call to get information
from the server on the status of the file system. This call returns
information such as total blocks, block size, unused blocks, etc.
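The magic cookie used by NFSPROC_READDIR can be illustrated with a small sketch. Here the cookie is simply an index into the directory listing, which is an assumption made for the example; a real server's cookie is opaque to the client. The function names are also invented. The point is that the server keeps nothing between calls: the client hands back the cookie it was given.

```python
def readdir(entries, cookie, max_entries):
    """One READDIR-style call: return (names, next_cookie, eof).

    The server holds no per-client state; everything it needs to
    resume the listing is in the cookie supplied by the client.
    """
    chunk = entries[cookie:cookie + max_entries]
    next_cookie = cookie + len(chunk)
    eof = next_cookie >= len(entries)
    return chunk, next_cookie, eof

def list_directory(entries, max_entries=2):
    """Client side: keep calling readdir until the server reports eof."""
    names, cookie, eof = [], 0, False
    while not eof:
        chunk, cookie, eof = readdir(entries, cookie, max_entries)
        names.extend(chunk)
    return names

print(list_directory(["a", "b", "c", "d", "e"]))
```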
Separate from the NFS RPCs is something referred to as the
mount protocol. The mount protocol also uses RPCs and provides the following:
Mount Protocol
The mount protocol authenticates a
client's request for the handle to a root directory. The client passes
the uid and gid of the process requesting the mount and the server
validates them. One of many security holes is that NFS only authenticates
clients on the mount request. Once a file system is mounted the server
just accepts RPC requests for the file handle. If a file handle can be
determined, any client can forge RPCs to manipulate the file.
Stateless protocol Design
NFS was designed so that the server needs no knowledge of how
a sequence of NFS operations relate to one another. The client keeps
track of all information. This has the following implication: RPC
requests must fully describe the operation to be performed and must be
idempotent, meaning that repeating a request has the same effect as
performing it once. For example, the write operation must specify the
file to use (via the file handle), the starting location, and the
number of bytes to be written. This is much different than a write
statement on a local file system. In addition, since NFS uses UDP as a
transport mechanism it must be able to handle duplicate requests.
Also, as a result of UDP, NFS must acknowledge that each operation has
been completed.
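The difference between a write that carries its own offset and one that relies on an implicit file position can be sketched as follows. This is a toy illustration that uses an in-memory bytearray in place of a file; the function names are made up for the example.

```python
def nfs_style_write(storage: bytearray, offset: int, data: bytes) -> None:
    """Write data at an explicit offset. Repeating the call changes nothing."""
    end = offset + len(data)
    if end > len(storage):
        storage.extend(b"\x00" * (end - len(storage)))
    storage[offset:end] = data

def local_style_append(storage: bytearray, data: bytes) -> None:
    """Write at an implicit position (the end). Repeating it duplicates data."""
    storage.extend(data)

f1 = bytearray(b"hello ")
nfs_style_write(f1, 6, b"world")
nfs_style_write(f1, 6, b"world")   # duplicate request, e.g. a retransmission
print(bytes(f1))

f2 = bytearray(b"hello ")
local_style_append(f2, b"world")
local_style_append(f2, b"world")   # the same duplicate now corrupts the file
print(bytes(f2))
```

With the explicit offset the server can safely execute a retransmitted request a second time, which is why it does not need to remember what it has already done.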
The stateless protocol design was chosen because it greatly
simplified the design of NFS. Servers do not have to worry about
transactions or journaling. One advantage of this is that it allows a
server to reboot and have little effect on a client. The client will
just sit there waiting for the server to come back up. When it does, the
client reissues the request and continues on. Of course, users or
applications are stuck waiting on the I/O to complete.
Recently, stateless designs using the TCP protocol have been
developed. Using TCP you can take advantage of the fact that NFS
runs on networks with extremely low error rates and thus you can use
the sliding window acknowledgement scheme built into TCP. A
disadvantage of this method is that the server must maintain a limited
amount of state information. Recent tests have shown that NFS over TCP
provides a performance increase of more than 200 percent.
NFS Setup and Configuration.
A. Files used by NFS Clients
- /usr/etc/biod -- Block I/O daemon. Performs read-ahead and
write-behind caching for the client.
- /usr/etc/portmap -- RPC portmapper daemon.
- /usr/etc/rpc.lockd -- Lock daemon. Provides file lock
services for clients. Often not used since locking is poor.
- /usr/etc/rpc.statd -- Status daemon. Monitors locks reserved and
handles lock regeneration after a server reboot.
- /etc/fstab -- File system table. One line for each file system to
be mounted. Used by clients to automatically mount files on reboot.
- /usr/etc/mount -- mount program. Used to mount a file system
(local or NFS).
- /usr/etc/umount -- umount program. Used to dismount a file
system (local or NFS).
- /etc/rc.local -- local configuration file where daemons are started.
- /etc/mtab -- flat file that shows mounted file systems.
Files used by NFS Servers.
- /usr/etc/nfsd -- NFS daemon.
- /usr/etc/rpc.mountd -- handles the mount protocol for the server.
- /etc/exports -- list of file systems to export.
- /usr/etc/exportfs -- program that reads the /etc/exports file and
makes the listed file systems available to clients. Interacts with rpc.mountd.
- /usr/etc/showmount -- lists systems and file systems being served.
- /etc/rmtab -- ASCII file that stores information on remote systems
that have mounted file systems.
Exporting file systems.
Below are the rules for exporting a file system.
- Any file system or part thereof can be exported. Thus either /home
or /home/suess could be exported.
- If a subdirectory of an exported file system is itself exported then
it must reside on a separate device.
- If a subdirectory is exported then the parent file system can be
exported only if it resides on a separate device.
- Only local file systems can be exported.
These rules force system administrators to think how they want
to give access before exporting file systems. Basically, the rule is
you can only export a physical device in one manner.
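The rules above can be checked mechanically. The sketch below takes a mapping from exported path to a device identifier and reports every parent/child pair that lives on the same device. The function names and the use of small integers as device identifiers are assumptions made for the example; on a real system the device could be taken from the st_dev field returned by stat.

```python
import posixpath

def is_ancestor(parent: str, child: str) -> bool:
    """True if child lies strictly below parent in the directory tree."""
    parent = posixpath.normpath(parent)
    child = posixpath.normpath(child)
    if parent == child:
        return False
    prefix = parent if parent.endswith("/") else parent + "/"
    return child.startswith(prefix)

def export_conflicts(exports: dict) -> list:
    """exports maps an exported path to its device id.

    Returns (parent, child) pairs that break the rule that an exported
    subdirectory and its exported parent must be on separate devices.
    """
    conflicts = []
    paths = sorted(exports)
    for parent in paths:
        for child in paths:
            if is_ancestor(parent, child) and exports[parent] == exports[child]:
                conflicts.append((parent, child))
    return conflicts

print(export_conflicts({"/home": 1, "/home/suess": 1}))  # same device: conflict
print(export_conflicts({"/home": 1, "/home/suess": 2}))  # separate devices: fine
```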
The format of the exports file is the following:
filesystem -options
where options can be:
-ro                   allow read-only access only
-rw=host1:host2       which hosts can write
-access=host1:host2   which hosts can access the file system
-anon=uid             map unknown UIDs to this user
-root=host1:host2     hosts on which root has complete access
Example:
/home/suess -rw=icarus.ifsm.umbc.edu,anon=suess
Note that a client can mount multiple parts of an exported
file system. The restrictions listed above apply only to servers and the
makeup of the /etc/exports file.
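To make the exports line format concrete, here is a minimal sketch that splits a line of the form shown above into the path and its options. It handles only that simple form (one path, one comma-separated option field, host lists separated by colons); the function name is made up and no claim is made that it matches any particular exportfs implementation.

```python
def parse_export(line: str):
    """Split 'filesystem -opt1=a:b,opt2' into (path, options dict)."""
    fields = line.split()
    path = fields[0]
    options = {}
    if len(fields) > 1:
        for item in fields[1].lstrip("-").split(","):
            name, sep, value = item.partition("=")
            # Options with a value hold a list (hosts or a uid);
            # bare options such as ro are stored as True.
            options[name] = value.split(":") if sep else True
    return path, options

print(parse_export("/home/suess -rw=icarus.ifsm.umbc.edu,anon=suess"))
```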
Setting up an NFS server on the SGI
Here are the steps to use.
- Make sure the NFS config file is turned on for the machine,
use the command chkconfig.
If not, turn it on with the command chkconfig nfs on.
- Reboot the machine if you turned on the flag.
- Verify the NFS daemons are running using the ps command.
ps -ef | grep nfs should show the daemons running.
If not, you must reconfigure the kernel to support NFS. To verify this
enter the command strings /unix | grep nfs
If the nfs string doesn't show up, build a new kernel with the command
/etc/init.d/autoconfigure
- Verify the mount daemon is registered with the portmapper with the command
/usr/etc/rpcinfo -p | grep mountd
- Edit the /etc/exports file given the rules listed above.
- If you added lines to the exports file you must rebuild the export list
with the command /usr/etc/exportfs -av
Verify it is exported with the command exportfs
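The portmapper check in the steps above can also be scripted. The sketch below scans text in the usual rpcinfo -p layout (program, version, protocol, port, service name) and reports which services are registered. The sample output is invented for the example, including its port numbers, and the function name is hypothetical; on a real machine the text would come from running /usr/etc/rpcinfo -p.

```python
SAMPLE = """\
   program vers proto   port
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100005    1   udp   1023  mountd
    100003    2   udp   2049  nfs
"""

def registered_services(rpcinfo_output: str) -> set:
    """Collect the service names from rpcinfo -p style output."""
    services = set()
    for line in rpcinfo_output.splitlines():
        fields = line.split()
        # Skip the header and any line without a service name column.
        if len(fields) >= 5 and fields[0].isdigit():
            services.add(fields[4])
    return services

services = registered_services(SAMPLE)
print("mountd registered:", "mountd" in services)
```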
Mounting file systems
The file /etc/fstab contains filesystems to mount.
The format for NFS mounts is the following:
filesystem@host:mount-point:options OR
host:filesystem mount-point options
Options include:
nfs -- signifies an NFS-mounted file system
rw -- mount the file system read/write
ro -- mount the file system read-only
bg -- mount the file system in the background
hard/soft -- hard mounts are retried until acknowledged
by the server. Soft mounts can be interrupted by ctrl/c or will time out.
Hard mounts should generally be used for file systems mounted writable.
retrans/timeo -- retrans is the number of times to retry an NFS operation;
timeo is the timeout to wait before attempting a retransmission.
Example:
sunspots:/home/suess /users/suess nfs rw,bg,hard 0 0
Directly via the mount command
mount host:filesystem mount-point -o options
Example:
mount sunspots:/home/suess /users/suess -o rw,bg,hard
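The host:filesystem form of an /etc/fstab entry can be taken apart with a few lines of code, which makes the role of each field explicit. This is a sketch for the sample line used in this section only; the function name and the dictionary keys are invented for the example.

```python
def parse_fstab_nfs(line: str) -> dict:
    """Split 'host:filesystem mount-point type options ...' into its parts."""
    fields = line.split()
    source, mount_point, fs_type, options = fields[:4]
    host, _, remote_path = source.partition(":")
    return {
        "host": host,                    # server exporting the file system
        "remote_path": remote_path,      # path on the server
        "mount_point": mount_point,      # local directory to mount on
        "type": fs_type,                 # nfs for NFS mounts
        "options": options.split(","),   # e.g. rw, bg, hard
    }

entry = parse_fstab_nfs("sunspots:/home/suess /users/suess nfs rw,bg,hard 0 0")
print(entry)
```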
Setting up an NFS client under IRIX
- Make sure the NFS config file is turned on for the machine,
use the command chkconfig.
If not, turn it on with the command chkconfig nfs on.
- Reboot the machine if you turned on the flag.
- Edit the file /etc/fstab
Add mounts using the host:filesystem format described above. Here
is a sample:
f-umbc8:/umbc/src /umbc/src nfs rw,hard,nosuid,intr,bg 0 0
umbc4:/usr/local/install /usr/local/install nfs ro,nosuid,soft,bg 0 0
- Create the mount point directory using mkdir.
- Mount the file systems using the mount command.
Examples of using NFS
Creating a /usr/local environment in a multi-architecture environment.
Some directories such as man and doc are architecture neutral; others
such as bin and lib are architecture specific.
Creating a rational user file system. When users have files located on
their specific machines, NFS can be used to make the layout look
coherent on the server.
Creating a central mail spool for all users. This allows users to sign
on to any machine and access their mail.
Common Problems
- Symbolic links can cause strange problems. In the case of
mounting a file system on a symbolic link you must remember that the file
system is actually mounted on the file system pointed to by the
link. When exporting a symbolic link you must be certain the file
system pointed to by the link is exported.
- NFS server not responding.
This can happen when a server is
heavily loaded and cannot process the request within the retrans
period. The indication that this is the case is a message, shortly after
the first, stating that the NFS server is OK.
- Permission denied.
This can result when working as root on the client and root is mapped to
nobody. Alternatively, a user on a client may not have a corresponding ID
on a server.
- No Space.
The server is out of space on the file system.
- Stale File handle.
This is caused when two users are accessing files in a common directory
and one user removes the file or directory.
Client 1            Client 2
cd /src/mod1
                    cd /src
                    rm -rf package1
ls
Performance issues.
All versions of Unix provide a tool
named nfsstat.
nfsstat provides a lot of information on the mixture of operations taking place. The output shows how often each NFS procedure was executed. In addition, netstat can also be useful.
- How many nfsd processes to run?
NFS receives server requests through UDP. When a packet arrives, all free nfsd processes are scheduled for running. We want to balance the number of nfsd processes so that we don't need to wait for one to free up when we receive a packet, but at the same time we don't want to burden the system with managing unneeded nfsd processes. Use netstat -s to look at socket overflows. Regular overflows signify that more nfsd processes could be useful. Likewise, nfsstat shows a field named nullrecv. This count is updated each time an nfsd process is scheduled but there was no work to be done. A high or increasing count here may signify too many nfsd processes.
- Setting the retrans/timeout parameters.
Setting timeout too low will cause clients to unnecessarily retransmit
packets. Setting it too high causes long delays when packets are lost
on the network. Sun systems have an option nfsstat -m that shows the
average response time for requests. In general, retrans is not changed and
only timeout is used for making changes. The output of nfsstat -m is
useful for determining the time to use for timeout.
- nfsstat -rc gives a short summary of client stats.
If you see a low value for badxid and a high value for timeout then
you probably have a network device dropping packets. This often
happens with bridges.
- NFS is very much tied to the quality of the network.
A poor network configuration will result in poor NFS performance.
Using netstat to check your network is a good idea. Monitor the
network for utilization and where appropriate make adjustments.
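The rule of thumb given above for nfsstat -rc (a high timeout count together with a low badxid count points at the network) can be written down as a small check. The thresholds used here, 5 percent of calls for "high" and one tenth of the timeouts for "low", are assumptions chosen for the example and are not values taken from nfsstat or its documentation. The function name is also made up.

```python
def diagnose(calls: int, timeouts: int, badxid: int,
             high_ratio: float = 0.05) -> str:
    """Interpret client RPC counters such as those printed by nfsstat -rc."""
    if calls == 0 or timeouts / calls < high_ratio:
        return "ok"
    if badxid < timeouts * 0.1:
        # Requests time out but few duplicate replies come back, so the
        # requests or replies are most likely being lost in transit.
        return "network is probably dropping packets"
    return "inconclusive from these counters alone"

print(diagnose(calls=10000, timeouts=900, badxid=3))
print(diagnose(calls=10000, timeouts=10, badxid=0))
```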
Automounter
Concepts
The motivation behind the automounter is that in a dynamic NFS
environment you need more flexibility than /etc/fstab provides.
The automounter uses the concept of a map file that controls how
mounts are handled. The automounter works by providing a daemon. That
daemon listens for NFS RPC lookup requests on particular directories.
When it sees one it issues a mount request for the particular file
system and mounts the file system in a scratch area. It then creates a
symbolic link that points to the directory path we want in the
scratch area. The advantage of this method is that we only mount file
systems when we actually need them. This keeps the number of nfs
mounts as small as possible. The automounter can be configured to
release a file system after some time span of inactivity.
Indirect Automounter Maps
There are two types of maps, direct and indirect. Indirect
maps are the most common. Here the map identifies the name of each
directory and the associated system and directory to mount. For
example, with three users, larry, moe, and curly, under the directory
/home we might have a map like this:
larry host1:/home/larry
curly host2:/home/curly
moe host3:/home/moe
We could call this map auto.users.
To associate the map auto.users with the directory /home
through the automounter we would enter:
automount /home auto.users
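How an indirect map turns a path under /home into a server location can be sketched in a few lines. This only models the lookup, not the mounting itself; the function names are invented for the example, and the map text is the auto.users example from this section.

```python
AUTO_USERS = """\
larry host1:/home/larry
curly host2:/home/curly
moe host3:/home/moe
"""

def parse_indirect_map(text: str) -> dict:
    """Read 'key location' lines into a dictionary."""
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            key, location = fields
            entries[key] = location
    return entries

def resolve(mount_dir: str, entries: dict, path: str):
    """Return the server location to mount for path, or None if no entry."""
    prefix = mount_dir.rstrip("/") + "/"
    if not path.startswith(prefix):
        return None
    # The first component below the mount directory is the map key.
    key = path[len(prefix):].split("/", 1)[0]
    return entries.get(key)

users = parse_indirect_map(AUTO_USERS)
print(resolve("/home", users, "/home/moe/notes.txt"))
```

Only the first path component below /home is looked up in the map, which is why an indirect map needs a directory to be associated with.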
Direct Automounter Maps
Direct maps specify the full directory as opposed to a single name,
thus the following might be a direct map
/home/larry host1:/home/larry
/home/curly host2:/home/curly
/home/moe host3:/home/moe
The map would be named auto.direct and would be invoked with
automount /- auto.direct
The /- signifies there is no directory associated with the map and
forces the automounter to treat it as a direct map.
Direct maps put more burden on the system in that they can result in a
flurry of mount activity. This is because indirect maps do not mount
file systems until you enter the directory, whereas direct maps mount
the file system when you access it via a directory listing.
Side effects of using the automounter.
- long search paths.
If you include many directories handled
by the automounter you could experience significant delay when your
.cshrc file is executed. This is caused by the shell going out and
reading each directory to get the contents for the executable hash
table. One solution is to create wrapper scripts in another directory
not handled by the automounter that modify your path and invoke the
utility.
- ls does not list files.
Performing an ls at the mount point
will not list the files found in each directory, instead just the
symbolic link names will be listed. This can confuse users and
applications.
- Pathnames look strange.
Automount works by creating a directory and mounting all files in
that directory, then creating symbolic links to make the path work.
However, the pwd command expands symbolic links and returns a pathname
starting with /tmp_mnt. This can be avoided by aliasing pwd to echo
$cwd
Order of startup operations under BSD
- Portmapper, needed to handle rpc based services.
- NIS, if it is used (more later).
- Mount NFS-based file systems listed in fstab.
- Start biod (client operations can be done without biod; however,
performance is awful).
- Start rpc.lockd and rpc.statd. Usually started together since they
are both needed for locking.
- If /etc/exports exists then set up the NFS server:
- execute the exportfs -a command
- start up some nfsd daemons
- start up the rpc.mountd daemon