Unix Processes

A process under Unix consists of an address space and a set of kernel data structures that keep track of that process. The address space is a section of memory that contains the code to execute as well as the process stack.

The kernel must keep track of the following data for each process on the system:

A process has certain attributes that directly affect its execution. These include:

To view processes, use the ps command.
umbc9[8]# ps -l
 F S  UID   PID  PPID  C PRI NI P   SZ:RSS     WCHAN TTY       TIME COMD
30 S    0 11660   145  1  26 20 *   66:20   88249f10 ttyq6     0:00 rlogind
30 S14066 11662 11661  0  26 36 *  129:43   88249f10 ttyq6     0:00 zwgc
30 S    0 11681 11663  0  39 36 *   85:27   88246890 ttyq6     0:00 csh
30 S14066 11661 11660  0  29 36 *   86:33   8815012c ttyq6     0:00 login.kr
30 S14066 11663 11661  0  39 36 *   86:27   88246890 ttyq6     0:00 csh
30 R    0 12539 11681 46  98 36 0  207:171           ttyq6     0:01 ps

The man page for ps describes all the fields displayed with the ps command as well as all the command options. Some important fields you must know are the following:

The F field.
This is the flag field. It uses hexadecimal values which are added to show the value of the flag bits for the process. For a normal user process this will be 30, meaning it is loaded into memory.
The S field.
The S field shows the state of the process. The two most common values are S for Sleeping and R for Running. An important value to look for is X, which means the process is waiting for memory to become available. If you see this frequently, your system is short of memory.
UID field.
The UID field shows the User ID (UID) of the process owner. For many processes this is 0, because they were started by root or run setuid to root.
PID field.
The PID field shows the Process ID of each process. This value is unique among running processes. PIDs are generally allocated from lowest to highest, but wrap around at some point. You need this value to send a signal, such as the KILL signal, to a process.
PPID field.
This is the Parent Process ID, which identifies the parent process that started the process. It lets you trace the sequence of process creation that took place.
PRI field.
This is the priority field. The lower the value, the higher the priority. The scheduler uses this value to choose the next process to get the CPU. It is tied to the process NICE value (the NI column), which ranges from 0 to 39 and defaults to 20. As a process uses the CPU, the system raises its priority value, which lowers its priority.
The P flag.
This is the processor flag. On the SGI this refers to the processor the process is running on.
SZ field.
This is the SIZE field: the total number of pages in the process. Each page is 4096 bytes. The sort command is your friend when looking at the system. Pipe the output of ps into sort to order processes by size or PID. For example, to sort by the SZ field use the command ps -el | sort +9 (remember that sort starts numbering fields with zero).
RSS field.
This is the Resident Set Size: the number of pages of the process that are in memory. Note the RSS should ALWAYS be less than the SZ.
TTY field.
This is the terminal assigned to your process. On SGI-based systems, ttys with the letter "q" in them are pseudo, or network, ttys.
Time field.
The cumulative execution time of the process in minutes and seconds.
COMD field.
The command that was executed.
As a system administrator you often want to look at all processes. This is done under System V with the command ps -el or under BSD with the command ps -al. There are a number of variations that control what information is printed out.
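
The sort +9 form is the historical syntax. Current POSIX versions of sort use the -k option instead, and -k counts fields from one rather than zero. As a small sketch, the following sorts a three-column sample (PID, SZ, command) by size, largest first. The rows are copied from the ps listing above; the variable names are only for illustration.

```shell
# Sample rows: PID, SZ (pages), command, taken from the ps -l listing above.
sample='11660 66 rlogind
11662 129 zwgc
11681 85 csh
11661 86 login.kr
12539 207 ps'

# -k 2,2 selects the second field (fields count from 1);
# n = compare numerically, r = reverse (largest first).
sorted=$(printf '%s\n' "$sample" | sort -k 2,2nr)
largest=$(printf '%s\n' "$sorted" | head -n 1)

echo "$sorted"
echo "largest: $largest"
```

On a live system the same sort can be fed from ps -e -o pid= -o vsz= -o comm=, assuming a ps that supports the POSIX -o option. Note that vsz is reported in kilobytes rather than pages.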

Sending a Signal

Unix supports the idea of sending software signals to a process. These signals give other processes a way to interact with a running process in software, independent of the hardware. The kill command is used to send a signal to a process. In addition, it is possible to write a signal handler in either C or the shell that responds to a signal being sent. For example, many system administration utilities, such as the name server, respond to the SIGHUP signal by re-reading their configuration file. This can be used to update the process while it is running, without having to terminate and restart it.
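
A shell signal handler is installed with the trap command. The sketch below is only an illustration of the idea, not how any particular daemon is written: a child shell stands in for a daemon, installs a handler for HUP, and then sends itself the signal with kill. The message text is made up for the example.

```shell
# Run a tiny stand-in "daemon" in a child shell so the signal never
# reaches the shell you are typing in.
out=$(sh -c '
    # Handler: a real daemon would re-read its configuration file here.
    trap "echo re-reading configuration" HUP

    # Send SIGHUP to this child shell itself ($$ is its own PID).
    kill -HUP $$

    # The handler has run and the process is still alive.
    echo "still running"
')
echo "$out"
```

Without the trap line, the default action for HUP would terminate the child shell before it reached the final echo.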

For many signals there is really nothing that can be done other than printing an appropriate error message and terminating the process. The signals that system administrators will use the most are the HUP, KILL, and STOP signals. The HUP signal, as mentioned previously, is used by some utilities as a way to notify the process to do something. The KILL signal is used to abort a process. The STOP signal is used to pause a process.

A common problem system administrators will see is a user who has made a mistake and is continually forking new processes. Every user has some limit on the number of processes they can fork, and once they reach that limit further forks will wait. If you then kill a process, the system resumes creating new processes on behalf of the user. The best way to handle this is to send the STOP signal to all of the processes first. Once all of them are suspended, you can send them the KILL signal. Because the processes were suspended first, they cannot create new processes as you kill them off.
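
The two-pass approach can be sketched as follows. Three background sleep commands stand in for the runaway processes, and the variable names are hypothetical. The point is the order: every process is stopped before any of them is killed.

```shell
# Three harmless stand-ins for a user's runaway processes.
sleep 300 & p1=$!
sleep 300 & p2=$!
sleep 300 & p3=$!

# Pass 1: suspend everything first, so nothing can fork a replacement.
kill -STOP "$p1" "$p2" "$p3"

# Pass 2: kill the suspended processes.
kill -KILL "$p1" "$p2" "$p3"

# Reap them (wait reports a nonzero status for signal deaths).
wait "$p1" "$p2" "$p3" 2>/dev/null || true

# Count survivors: kill -0 only checks whether a PID still exists.
alive=0
for p in "$p1" "$p2" "$p3"; do
    if kill -0 "$p" 2>/dev/null; then alive=$((alive + 1)); fi
done
echo "processes still alive: $alive"
```

On a real system the list of PIDs would come from ps for the user in question, for example ps -u with the user's name, rather than from variables set in the same shell.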

Signals available under Unix

     SIGHUP     01       hangup
     SIGINT     02       interrupt
     SIGQUIT    03[1]    quit
     SIGILL     04[1]    illegal instruction (not reset when caught)
     SIGTRAP    05[1][5] trace trap (not reset when caught)
     SIGABRT    06[1]    abort
     SIGEMT     07[1][4] EMT instruction
     SIGFPE     08[1]    floating point exception
     SIGKILL    09       kill (cannot be caught or ignored)
     SIGBUS     10[1]    bus error
     SIGSEGV    11[1]    segmentation violation
     SIGSYS     12[1]    bad argument to system call
     SIGPIPE    13       write on a pipe with no one to read it
     SIGALRM    14       alarm clock
     SIGTERM    15       software termination signal
     SIGUSR1    16       user-defined signal 1
     SIGUSR2    17       user-defined signal 2
     SIGCLD     18[2]    death of a child
     SIGPWR     19[2]    power fail (not reset when caught)
     SIGSTOP    20[6]    stop (cannot be caught or ignored)
     SIGTSTP    21[6]    stop signal generated from keyboard
     SIGPOLL    22[3]    selectable event pending
     SIGIO      23[2]    input/output possible
     SIGURG     24[2]    urgent condition on IO channel
     SIGWINCH   25[2]    window size changes
     SIGVTALRM  26       virtual time alarm
     SIGPROF    27       profiling alarm
     SIGCONT    28[6]    continue after stop (cannot be ignored)
     SIGTTIN    29[6]    background read from control terminal
     SIGTTOU    30[6]    background write to control terminal
     SIGXCPU    31       cpu time limit exceeded [see setrlimit(2)]
     SIGXFSZ    32       file size limit exceeded [see setrlimit(2)]
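
The numbers in this table are the ones used on this system. The low numbers such as 1, 9 and 15 are the same almost everywhere, but the higher ones vary: SIGUSR1 is 16 here, while on Linux on x86 it is 10. For that reason it is usually safer to give kill a signal name rather than a number. A short sketch that asks the local system for the names:

```shell
# kill -l NUMBER prints the name this system gives to that signal number.
for n in 1 9 15; do
    echo "signal $n is SIG$(kill -l "$n")"
done

# Prefer names over numbers when sending signals, e.g. kill -TERM pid.
term_name=$(kill -l 15)
```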

Setting a Process's Priority

Unix attempts to manage priority by giving access first to those who have used the least CPU. In addition, processes that are sleeping on an event (such as a key press) get higher priority than jobs that are purely CPU driven. On any large system with a number of competing user groups, the task of managing resources falls on the system administrator. This task is both technical and political. As a system administrator you MUST understand your company's goals in order to manage this task. Often, the most prolific users of a machine are in fact the most important!

Once you understand the political implications of who should get priority, you are ready to manage the technical details. As root, you can change the priority of any process on the system. Before doing this it is critical to understand how priority works and what makes sense. First, while CPU is the most watched resource on a system, it is not the only one. Memory usage, disk usage, IO activity, and the number of processes all tie together in determining the throughput of the machine. For example, take two groups, A and B. Both groups require large amounts of memory, more than is available when both are running simultaneously. Raising the priority of group A over group B may not help if group B does not fully relinquish the memory it is using. While the paging system will reclaim that memory over time, swapping a process out to disk can be intensive and greatly reduce performance, especially if it becomes a recurring problem because B's process keeps getting swapped back in. A possibly better alternative is to stop B's process completely with a signal and then continue it later when A has finished.
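
Stopping and later continuing a process uses the STOP and CONT signals. A minimal sketch, in which a background sleep stands in for group B's job and the variable name job_b is hypothetical:

```shell
# Stand-in for the memory-hungry job from group B.
sleep 300 &
job_b=$!

# Pause B so that A can run without competing with it.
kill -STOP "$job_b"

# ... group A's work would run here ...

# Resume B where it left off.
kill -CONT "$job_b"

# Clean up the stand-in so the example leaves nothing behind.
kill -TERM "$job_b"
status=0
wait "$job_b" 2>/dev/null || status=$?
echo "job B ended with status $status"
```

The final TERM is only there to tidy up the example. In practice B's job would simply carry on after the CONT.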

Unix provides the nice [increment] command to lower the priority of a process. This is a command that should be run by users who are running large jobs. As system administrator you may need to explain this to them. There are two versions of this command with opposite syntax. The csh built-in takes the increment with a plus sign (nice +10 command), while the nice program used from the Bourne shell takes it with a leading minus sign (nice -10 command). In both cases, the larger the NICE value, the lower the priority. Remember to use the appropriate form depending on your shell.
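
The standalone nice program also accepts the increment through the -n option, which is the form POSIX describes. As a small sketch, the following runs a placeholder job with its NICE value raised by 10. The sh -c command merely stands in for a real large job.

```shell
# Run a stand-in for a large job with its nice value raised by 10.
# (sh -c '...' is only a placeholder for the real program.)
out=$(nice -n 10 sh -c 'echo "big job finished"')
echo "$out"
```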

As system administrator you can use the renice command to change the priority of a process, all processes of a user, or all processes belonging to a group of users. The renice command has the form

/etc/renice priority [ [ -p ] pid ... ] [ [ -g ] pgrp ... ] [ [ -u ] user ... ]
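
As a hedged illustration, the sketch below lowers the priority of a single running process. It uses the -n option that POSIX describes for renice rather than the bare priority argument shown above, and it assumes renice is on the search path. A background sleep stands in for the job being reniced.

```shell
# Stand-in for a running job whose priority should be lowered.
sleep 300 &
pid=$!

# Raise the nice value of that one process by 5 (lower priority).
rc=0
renice -n 5 -p "$pid" >/dev/null || rc=$?
echo "renice exit status: $rc"

# Clean up the stand-in.
kill -TERM "$pid"
wait "$pid" 2>/dev/null || true
```

Any user may lower the priority of their own processes this way. Raising priority, or changing another user's processes, requires root.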

User education is always the key. One common problem is users running three jobs simultaneously when it would be much better to run those jobs chained back-to-back.

Other Process Related Commands

The command top is useful for viewing what is going on with a system. It is a full-screen application that shows you the processes using the most CPU and keeps the display updated.

Another useful command is lsof, which shows the files that processes have open. This command is NOT part of standard Unix but is available over the net. The command fuser is also useful for showing which processes are using certain files.