You can see the initial sizes of the various components of a process
by examining its program file. The Unix size program takes a filename
as a command-line argument and prints out the size of the text
segment, the initialized user data segment (labelled "data"), and the
uninitialized user data segment (labelled "bss"), along with their sum
in decimal and hex. Because the text segment is shared, the total
amount of space taken up by N running copies of a process is
1*text + N*data + N*bss.
    $ file /bin/date; ls -l /bin/date; size /bin/date
    /bin/date: sparc pure dynamically linked executable
    -rwxr-xr-x 1 root staff 7472 Jul 23 1992 /bin/date*
    text    data    bss     dec     hex
    5944    1496    80      7520    1d60

    $ file /local/bin/xmosaic ; ls -l /local/bin/xmosaic ; size /local/bin/xmosaic
    /local/bin/xmosaic: sparc demand paged dynamically linked executable
    -rwxr-xr-x 2 chas sourcers 2514944 Jul 11 20:04 /local/bin/xmosaic*
    text    data    bss     dec     hex
    2269184 245760  377408  2892352 2c2240
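As a rough check on the 1*text + N*data + N*bss formula, here is a minimal sketch in C. The function name total_initial_size is made up for illustration; the numbers plugged in afterwards are the /bin/date figures printed by size.

```c
/* Initial space needed by n running copies of one program: a single
 * shared text segment plus a private data and bss segment per copy.
 * (Illustrative helper, not part of any system library.) */
unsigned long total_initial_size(unsigned long text, unsigned long data,
                                 unsigned long bss, unsigned long n)
{
    return text + n * data + n * bss;
}
```

For one copy of /bin/date, total_initial_size(5944, 1496, 80, 1) gives 7520, the same as the dec column. Three copies come to 10672 bytes, well short of 3 * 7520, because the text segment is only counted once.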
Since a process can change the size of its user data segment as it
runs, the numbers from the size program only specify the initial
memory use for a process. You can examine the real memory use of any
process with the sps program. Here's the output of the Unix sps
command for a copy of xmosaic which the OS has swapped out (since I'm
not running it at the moment):
    Ty User Status Fl Nice Prv Shr Res %M Time Child %C Proc# Command
    keith select 2344+2704 0 0 19.4 0 1742 xmosaic

The Prv (for "private") field represents the size of the user data
segment; the Shr field (for "shared") represents the size of the text
segment. These values don't match the output of size exactly because
size describes the sizes as stored in the program file. The Res field
(for "resident") represents how much memory the process is actually
taking up in real memory (as opposed to virtual memory), and the %M
field represents the percent of real memory allocated to the process;
these values are zero because the process is swapped out. The memory
sizes (Prv, Shr, and Res) are in kilobytes.
I can change the resident memory values (and maybe the Prv value as
well) by making xmosaic fetch a page of hypertext:

    Ty User Status Fl Nice Prv Shr Res %M Time Child %C Proc# Command
    keith SELECT 2940+2704 2284 5 23.7 15 1742 xmosaic

Note that this action swapped in the process, and that it took up
2,284K of real memory, or 5% of the real memory available on my
machine.
fork and exec System Calls

The exec system call is the only way a process begins execution; the
fork system call is the only way to create a new process. These two
system calls are somewhat puzzling when first encountered: it seems
that they ought to be combined into one, since each, taken on its own,
may seem almost pointless. But the separation of the two is a key idea
in Unix and simplifies programming with processes tremendously. Note:
the exec system call comes in several flavors, all fundamentally the
same; the most commonly used flavor is called execl.
When a process executes the exec system call, the kernel replaces the
code and user data segments of the running process with the code and
data segments from a program stored in a file. The process remains the
same: no new kernel data structures are allocated, and the process has
the same process-ID as before. The process's system data segment is
also almost the same as it was before the exec; the only things that
change are:
But if exec only reinitializes an existing process, then it provides
no way to create a new process. For this, we need fork.
The fork system call creates a new process, but it does not initialize
it from a new program. When a process executes fork, the kernel clones
the running process to make the new one. The text segment, user data
segment and system data segment are all copied almost exactly:
execution continues in both the old and new process from exactly the
same point!

The crucial difference is that the fork system call itself returns
different values in parent and child. This allows the cloned code to
execute differently in the two processes.
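To make the cloning concrete, here is a small sketch in C, assuming a POSIX system. The helper name parent_value_after_fork is invented for this example, and it leans on wait and exit, which are covered further on. Both processes run the same code after fork; only the return value tells them apart, and a change the child makes to its copy of the data never shows up in the parent.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork once, let the child modify its copy of a variable, and return
 * the value the parent still sees afterwards (-1 on any failure).
 * Hypothetical helper written for illustration only. */
int parent_value_after_fork(void)
{
    int value = 1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;                 /* fork failed: no child was created */

    if (pid == 0) {
        /* Child: fork returned 0. This assignment touches only the
         * child's private copy of the user data. */
        value = 42;
        _exit(value);
    }

    /* Parent: fork returned the child's process-ID. */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 42)
        return -1;                 /* child did not see its own change */

    return value;                  /* unchanged by the child */
}
```

The parent gets back 1 even though the child set its own copy to 42 before exiting.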
With exec, the new program uses the original, mostly-unchanged, system
data segment of the original program. With fork, the child process
gets a nearly identical copy of the parent's system data segment. The
only things that are different are:
fork and exec are usually used in tandem to allow a process to create
new processes. The classic model is as follows: the parent calls fork
to create a child, and the child then calls exec to run a new program.
The different return values from fork are what allow the code to be
written to allow for different actions in parent and child; in sketchy
C:
    /* ignoring errors ... */
    if (fork() == 0) {
        /* i am the child */
        exec("new program");
    } else {
        /* i am the parent */
        wait();
    }

Since the parent and child are nearly identical, they can communicate with one another via a pipe (remember, the child gets a copy of all the parent's open file descriptors). So instead of just waiting for the child to terminate, the parent and child may act together.
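Here is one way the pieces might fit together in real C, as a sketch under a few assumptions: a POSIX system, a shell at /bin/sh, and a made-up helper name exec_keeps_pid. The child makes a pipe its standard output (using dup2, a relative of the dup call described later) and then execs a shell that prints its own process-ID. The parent reads that number from the pipe and compares it with what fork returned, which checks the claim that exec keeps the process-ID.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the program started by exec reports the same pid that
 * fork handed to the parent, 0 if it differs, -1 on error.
 * Hypothetical helper; /bin/sh is assumed to exist. */
int exec_keeps_pid(void)
{
    int fds[2];                    /* fds[0]: read end, fds[1]: write end */
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: route standard output into the pipe, then become sh. */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execl("/bin/sh", "sh", "-c", "echo $$", (char *)NULL);
        _exit(127);                /* only reached if execl failed */
    }

    /* Parent: collect whatever the new program printed. */
    close(fds[1]);
    char buf[32];
    size_t used = 0;
    ssize_t n;
    while (used < sizeof buf - 1 &&
           (n = read(fds[0], buf + used, sizeof buf - 1 - used)) > 0)
        used += (size_t)n;
    buf[used] = '\0';
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    return strtol(buf, NULL, 10) == (long)pid;
}
```

The open file descriptors survive both the fork and the exec, which is why the shell's output lands in the parent's pipe.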
exit System Call

The exit system call terminates the execution of a process. All open
file descriptors are closed. If the exiting process had any children,
they no longer have a parent and so the kernel sets their
parent-process-IDs to 1, which is the process-ID of the init process.
These child processes are called orphans and init is said to have
adopted them.
The exit system call returns an integer value to the kernel, which
makes this value available to the parent process via the wait system
call. This value includes the one-byte exit status of the child, and
so can be used to pass a very small message back to the parent.
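The "one-byte" part is easy to trip over, so here is a short sketch in C (POSIX assumed; status_seen_by_parent is a name made up for this example). The child exits with whatever code it is given, and the parent reports what wait actually hands back.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code` and return the exit status the
 * parent receives, or -1 if something went wrong.
 * Hypothetical helper written for illustration only. */
int status_seen_by_parent(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0)
        _exit(code);               /* child: hand `code` to the kernel */

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;

    /* Only the low eight bits of the child's code come through. */
    return WEXITSTATUS(status);
}
```

A child that exits with 3 is seen as 3, but one that exits with 300 is seen as 44 (300 modulo 256), so the message really is limited to one byte.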
wait System Call

The wait system call causes a process to sleep until a child
terminates. If there are no children, wait returns immediately. wait
returns a small amount of info to the parent, including the child's
process-ID (so the parent knows which child terminated), exit status,
reason for termination, and some resource use statistics.
A process can terminate for either of two reasons:

    it executes the exit system call;
    it is killed by a signal.

Either way, the reason is reported to the parent by wait.
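In C the reason for termination is decoded from the status word with the standard WIFEXITED and WIFSIGNALED macros. The sketch below assumes POSIX; struct outcome and run_child are names invented for this example, and the child's exit status of 3 is arbitrary.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* How a child ended. signaled == 0: it called exit and value is the
 * exit status. signaled == 1: a signal killed it and value is the
 * signal number. value == -1 means this helper itself failed. */
struct outcome {
    int signaled;
    int value;
};

/* Fork a child that exits with status 3 (when sig == 0) or sends
 * itself signal `sig`, then decode what wait reports about it. */
struct outcome run_child(int sig)
{
    struct outcome result = { 0, -1 };

    pid_t pid = fork();
    if (pid < 0)
        return result;

    if (pid == 0) {
        if (sig != 0) {
            signal(sig, SIG_DFL);  /* make sure the default action applies */
            kill(getpid(), sig);
        }
        _exit(3);                  /* reached when no signal ended the child */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return result;

    if (WIFEXITED(status)) {
        result.value = WEXITSTATUS(status);
    } else if (WIFSIGNALED(status)) {
        result.signaled = 1;
        result.value = WTERMSIG(status);
    }
    return result;
}
```

run_child(0) describes a normal exit with status 3, while run_child(SIGTERM) describes a child killed by SIGTERM.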
When a process terminates without its parent having waited for it, the process is called a zombie. It's good practice to wait for your children, but not required.
    pid
    id process
    id process parent

The pid command returns, as a decimal number, the process id of the
current process. This is useful for creating unique temp file names:

    set fp [open "/tmp/foo[pid]" w]

The Extended Tcl id command has several subcommands for accessing
process, parent process, user and group ids.
nice System Call

    nice ?increment?

Every process has a priority that determines how much of the CPU it is
allowed to hog. This priority is called the nice value, since if you
lower your own priority you are being nice to other processes. The
nice value is an integer ranging (traditionally) from -19 to 19.
Higher nice values are nicer (to other processes). With no argument,
the nice command returns the current nice value; a numeric argument is
added to the current nice value. Only root processes can decrement
their nice value.
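The underlying C calls behave the same way: nice adds its argument to the current value, and getpriority reads the value back. The sketch below is only an illustration under POSIX/XSI assumptions; nice_change_in_child is a made-up name, and it does its work in a child so the caller's own priority is left alone. It only handles non-negative increments, since lowering the nice value needs root.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* In a child process, add `increment` (0 or more) to the nice value
 * and report how far the value actually moved. Returns -1 on failure.
 * Hypothetical helper written for illustration only. */
int nice_change_in_child(int increment)
{
    if (increment < 0)
        return -1;                 /* decrementing needs root: not handled */

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        int before = getpriority(PRIO_PROCESS, 0);
        errno = 0;
        if (nice(increment) == -1 && errno != 0)
            _exit(100);            /* nice itself failed */
        int after = getpriority(PRIO_PROCESS, 0);
        _exit(after - before);     /* small non-negative number */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) == 100)
        return -1;
    return WEXITSTATUS(status);
}
```

An increment of 1 normally moves the value by 1; the result is smaller only when the process is already at the top of the range.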
execl Command

    execl ?-argv0 argv0? prog ?arglist?

Extended Tcl's execl command implements the Unix execl system call
(actually, it implements execlp): it reinitializes the running Tcl
process with a new program. execl only returns in the event of an
error, so the usual use is like this:

    execl /bin/date
    puts stderr "execl failed!"
    exit 1

If prog is given as a bare command name (with no slash in it), execl
will search your PATH for it.
Arguments to the new program can also be specified, as a single Tcl
list:

    execl cal {7 1959}

and the -argv0 option allows you to change the name under which the
program is called.
fork Command

    fork

The Extended Tcl fork command implements the Unix fork system call. It
takes no arguments, and forks a new process as described above. fork
returns 0 in the child, and non-zero in the parent. The non-zero value
is in fact the process-ID of the child. Any open file descriptors that
have been written to in the parent should probably be flushed (with
flush) before forking. fork can return a Tcl error if there are not
enough resources to fork a new process.

    if {[fork] == 0} {
        # i am the child
        execl /bin/date
        puts stderr "execl failed!"
        exit 1
    }
    # i am the parent
    wait

Remember, the child can never execute the parent code because execl
won't return (unless there's an error).
Note carefully the output in this annotated version of the above:

    if {[fork] == 0} {
        # i am the child
        puts "Child: my pid is [pid]"
        execl /bin/date
        puts stderr "execl failed!"
        exit 1
    }
    # i am the parent
    puts "Parent: [wait]"
    -------------------------------
    Child: my pid is 2804
    Tue Aug 23 12:23:41 CDT 1994
    Parent: 2804 EXIT 0
    -------------------------------

Note that the output of date appears because after the execl, the date
program's standard output is the same as the Tcl child process's and
the Tcl parent process's standard output.
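The advice about flushing before a fork applies to any buffered I/O library, not just Tcl's. Here is a sketch of the hazard using C stdio, assuming POSIX; bytes_written_across_fork is a name invented for this example. Unflushed data sits in the user data segment, so fork copies it, and both processes eventually write it out.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write "hi" to a temporary file through stdio, fork, and let both
 * processes flush. Returns the final file size in bytes, or -1 on
 * error. Hypothetical helper written for illustration only. */
long bytes_written_across_fork(int flush_first)
{
    FILE *f = tmpfile();
    if (f == NULL)
        return -1;

    fputs("hi", f);                /* sits in the stdio buffer for now */
    if (flush_first)
        fflush(f);                 /* buffer is empty before the clone */

    pid_t pid = fork();
    if (pid < 0) {
        fclose(f);
        return -1;
    }

    if (pid == 0) {
        fclose(f);                 /* child flushes its copy of the buffer */
        _exit(0);
    }

    waitpid(pid, NULL, 0);
    fflush(f);                     /* parent flushes its own copy */

    long size = -1;
    if (fseek(f, 0, SEEK_END) == 0)
        size = ftell(f);
    fclose(f);
    return size;
}
```

Without the flush the two-byte message ends up in the file twice (4 bytes); with the flush it appears once (2 bytes).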
wait Command

    wait ?-nohang? ?-untraced? ?-pgroup? ?pid?

The Extended Tcl wait command, when invoked with no arguments, sleeps
until a child terminates (if there are no children, a Tcl error
results). wait then returns a list of three elements. The first
element is always the process-ID of the child that terminated. The
second and third elements differ depending on the reason for
termination.

If the child exited, the second element is the string EXIT and the
third element is the exit status of the child. If the child was killed
by a signal, the second element is the string SIG and the third
element is the signal name.
pipe Command

    pipe ?read write?

The Extended Tcl pipe command creates a pipe. A pipe is a pair of file
descriptors, one for reading and one for writing, connected through a
buffer in the kernel. Remember that a pipe automatically synchronizes
two processes, making for easy interprocess communication.

To create a pipe, you simply invoke the pipe command with two variable
names to hold the read and write file descriptors. Any data written to
the write file descriptor is available for reading on the read file
descriptor. Don't forget to flush!

    pipe r w
    puts $w Hey!; flush $w
    gets $r
    => Hey!
Remember that the size of the pipe buffers in the kernel is small (on
the order of 4K), and a write that's bigger than the buffer size will
fill the buffer and then cause the writing process to block until the
reading process reads some of the contents. In a single process, this
is a problem! Try writing a hundred K or so of data in place of Hey!
in the above and you'll see the problem. This problem goes away when
the pipe is connecting two processes.
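The same single-process round trip can be sketched with the raw system calls in C, assuming POSIX. The name pipe_roundtrip_ok is invented for this example, and the 512-byte cap is a deliberately cautious assumption about how much a pipe will hold without a reader; it is there so the write can never block.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Push msg through a pipe inside one process and read it back.
 * Returns 1 if the bytes came back intact, 0 if not, -1 on error or
 * if msg is too long to be safe without a second process.
 * Hypothetical helper written for illustration only. */
int pipe_roundtrip_ok(const char *msg)
{
    char buf[512];
    size_t len = strlen(msg);
    if (len == 0 || len > sizeof buf)
        return -1;                 /* refuse anything that might block */

    int fds[2];                    /* fds[0]: read end, fds[1]: write end */
    if (pipe(fds) < 0)
        return -1;

    /* No stdio buffering here, so there is nothing to flush: the
     * bytes go straight into the kernel's pipe buffer. */
    ssize_t written = write(fds[1], msg, len);
    close(fds[1]);
    if (written != (ssize_t)len) {
        close(fds[0]);
        return -1;
    }

    size_t got = 0;
    ssize_t n;
    while (got < len && (n = read(fds[0], buf + got, len - got)) > 0)
        got += (size_t)n;
    close(fds[0]);

    return got == len && memcmp(buf, msg, len) == 0;
}
```

With a message much larger than the kernel buffer, the write call here would simply never return, because the only process that could drain the pipe is the one stuck writing.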
Although it works, a pipe is an IPC mechanism and so is not very useful in a single process. But it's just what you need to communicate between a parent and child. The parent creates the pipe, and since file descriptors are shared with the child, they can send messages back and forth across the pipe. Note that for one-way communication, each process only needs one of the two pipe file descriptors, so the usual practice is for each to close the appropriate one (just because file descriptors are a finite resource (per process)). Here's an example of one-way communication where the parent sends the child a message:
    pipe read write
    if {[fork] == 0} {
        # child
        close $write
        gets $read message
        puts "Child: got $message"
    } else {
        # parent
        close $read
        puts $write "Hey, child!"
        flush $write
    }

Remember, you should always flush after each write to a pipe (or turn
off buffering). In the above, the parent can write any size message to
the child.
A single pipe has two ends, and so can be used for two way communication, but a tricky problem exists, so we normally use two separate pipes; careful naming of the file descriptors minimizes confusion:
    pipe childread parentwrite
    pipe parentread childwrite
    if {[fork] == 0} {
        # child
        close $parentwrite
        close $parentread
        gets $childread message
        puts $childwrite "Child: got $message"
        flush $childwrite
    } else {
        # parent
        close $childread
        close $childwrite
        puts $parentwrite "Hey, child!"
        flush $parentwrite
        gets $parentread message
        puts "Parent: Child sez: \"$message\""
    }

The problem with a single pipe arises because a single buffer is being used for both directions. Suppose the parent writes to the pipe; the child is supposed to read the data in the buffer, then send a message back to the parent in the same buffer. The problem is, after the parent writes, it does a read on the pipe; if the parent beats the child to the read, the parent will read its own data back!
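For comparison, here is the two-pipe arrangement sketched in C, assuming POSIX. The helper name child_counts_bytes and the choice of reply (a byte count sent back as a raw long) are made up for this example. Each direction gets its own pipe, and each process closes the ends it doesn't use, so neither can read its own data back.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Send msg to a child over one pipe; the child counts the bytes and
 * sends the count back over a second pipe. Returns the count reported
 * by the child, or -1 on error. Hypothetical helper for illustration. */
long child_counts_bytes(const char *msg)
{
    int down[2], up[2];            /* down: parent -> child, up: child -> parent */
    if (pipe(down) < 0)
        return -1;
    if (pipe(up) < 0) {
        close(down[0]); close(down[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(down[0]); close(down[1]);
        close(up[0]); close(up[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: keep only the read end of down and the write end of up. */
        close(down[1]);
        close(up[0]);
        char buf[256];
        long total = 0;
        ssize_t n;
        while ((n = read(down[0], buf, sizeof buf)) > 0)
            total += n;
        if (write(up[1], &total, sizeof total) != (ssize_t)sizeof total)
            _exit(1);
        _exit(0);
    }

    /* Parent: keep only the write end of down and the read end of up. */
    close(down[0]);
    close(up[1]);
    size_t len = strlen(msg), sent = 0;
    while (sent < len) {
        ssize_t n = write(down[1], msg + sent, len - sent);
        if (n <= 0)
            break;
        sent += (size_t)n;
    }
    close(down[1]);                /* EOF tells the child the message is done */

    long total = -1;
    if (read(up[0], &total, sizeof total) != (ssize_t)sizeof total)
        total = -1;
    close(up[0]);
    waitpid(pid, NULL, 0);
    return total;
}
```

Because the child drains the first pipe while the parent is still writing, the message can be any size here, unlike the single-process case.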
dup Command

    dup fileId ?targetFileId?

The dup command implements the dup system call, which duplicates one
desired open file descriptor into another. This can be used to connect
standard input or standard output to a pipe. This sample code shows
how a parent process can fork the standard Unix sort command and then
feed it data to be sorted. A simple extension would allow the child to
write the results back to the parent.
    pipe read write
    if {[fork] == 0} {
        # child
        dup $read stdin
        close $write
        execl sort
        puts stderr "Can't execl!"
        exit 1
    }
    close $read
    foreach word [list zoo ylem quark flake dog aarhus] {
        puts $write $word
        flush $write
    }
    close $write
    wait
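The extension that sends the results back to the parent might look like this in C. It is only a sketch: it assumes POSIX, a sort program on the PATH, and input small enough to fit in a pipe buffer, since the parent writes everything before it starts reading. The name sort_lines and the 512-byte static buffer are choices made for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Feed `input` to sort and return its output in a static buffer, or
 * NULL on error. Not reentrant. Hypothetical helper for illustration. */
const char *sort_lines(const char *input)
{
    static char out[512];
    size_t len = strlen(input);
    if (len >= sizeof out)
        return NULL;               /* too big for this simple scheme */

    int to_child[2], from_child[2];
    if (pipe(to_child) < 0)
        return NULL;
    if (pipe(from_child) < 0) {
        close(to_child[0]); close(to_child[1]);
        return NULL;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(to_child[0]); close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        return NULL;
    }

    if (pid == 0) {
        /* Child: stdin comes from one pipe, stdout goes to the other. */
        dup2(to_child[0], STDIN_FILENO);
        dup2(from_child[1], STDOUT_FILENO);
        close(to_child[0]); close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        execlp("sort", "sort", (char *)NULL);
        _exit(127);                /* only reached if execlp failed */
    }

    close(to_child[0]);
    close(from_child[1]);
    ssize_t written = write(to_child[1], input, len);
    close(to_child[1]);            /* EOF: sort can now produce its output */

    size_t used = 0;
    ssize_t n;
    while (used < sizeof out - 1 &&
           (n = read(from_child[0], out + used, sizeof out - 1 - used)) > 0)
        used += (size_t)n;
    out[used] = '\0';
    close(from_child[0]);

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return NULL;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return NULL;
    if (written != (ssize_t)len)
        return NULL;
    return out;
}
```

Given the six words from the Tcl example, one per line, it hands them back in alphabetical order starting with aarhus. For larger inputs the parent would have to read and write at the same time, or the two processes could block each other.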
exec Command

The exec command encapsulates several complex yet common combinations
of fork, execl, and pipe into one handy command.
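C has a loosely comparable convenience in the POSIX popen function, which bundles the pipe, the fork and the exec of a shell into one call. This is offered as an analogy rather than a description of how the Tcl command is built; first_line_of is a name made up for the sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

/* Run `command` through the shell and return the first line of its
 * output (newline removed) in a static buffer, or NULL on failure.
 * Not reentrant. Hypothetical helper written for illustration only. */
const char *first_line_of(const char *command)
{
    static char line[256];

    FILE *p = popen(command, "r"); /* pipe + fork + exec of sh -c command */
    if (p == NULL)
        return NULL;

    char *got = fgets(line, sizeof line, p);
    int status = pclose(p);        /* waits for the child, like wait */
    if (got == NULL || status == -1)
        return NULL;

    line[strcspn(line, "\n")] = '\0';
    return line;
}
```

Calling first_line_of("echo hello") returns the string hello, with all the descriptor shuffling hidden from the caller.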