PCLSR
ing:Abstract:This paper describes the "
PCLSR
ing" feature of the Incompatible Time Sharing (ITS) operating system.PCLSR
ing permits a process to access the state of another process in a modular fashion. The benefits ofPCLSR
ing are explained, its implementation is described in detail, and the results of some simple performance metering are presented.
The Incompatible Time Sharing (ITS) operating system was originally written for the PDP-6 at the MIT Artificial Intelligence Laboratory during the late 1960's. At MIT during the 1970's ITS ran on three KA-10's and one KL-10, providing service to researchers in the AI Laboratory and in MIT's Laboratory for Computer Science. Many well-known PDP-10 programs, such as the Emacs editor, the MacLisp dialect of Lisp, the Macsyma symbolic algebra system, and the Midas assembler, were originally developed under ITS. More than twenty years later ITS still has a small, but devoted, user community. It now runs on a handful of KS-10 systems around the world.
The designers of ITS were not primarily interested in doing research in operating systems. They held a pragmatic view of ITS as a tool for enabling the AI Lab's small user community to get the most out of their PDP-6/10 and its peripherals (robot arms, TV cameras, etc.). This is not to say that the designers were ignorant of the state of the art in operating systems, but they were unlikely to publish papers about their experiences. As a result, many of their pioneering ideas remain largely unknown outside the ITS user community.
Among other features, ITS boasts a convenient symbolic system call
mechanism that allows calls to be invoked by name and to pass values
in arbitrary locations, a flexible system for delivering software
interrupts to processes, device-independent display terminal support,
a process organization that allows a single command processor to
control multiple sub-processes, and a powerful mechanism for
implementing devices in software that enabled ITS to have what was
probably the first network file system. Many of these features have
since been adopted or rediscovered by other operating systems, but one
of ITS's major design principles, called "PCLSR
ing", has
never found its way into any other operating system.
It is my belief that PCLSR
ing is an important idea
that does not deserve to be neglected merely because it has never been
adequately explained in print. Therefore, in this paper I have
endeavored to explain what PCLSR
ing is, what its benefits
are for both the users and the implementors of ITS, and how it is
implemented.
The reader should not expect to completely understand the
description of the implementation of PCLSR
ing on first
reading. Many of the points are subtle, and can probably only be
fully appreciated by someone who tries to implement
PCLSR
ing for himself. (If any reader makes this attempt,
I would like to hear from him!)
PCLSR
ing Modularity PrincipleUnder any timesharing operating system there will be occasions when a process must access the state of another process. A process may need to start, stop, debug, load, dump, create, or destroy another process. There is also one occasion when a process must access its own state: an interrupt handler needs access to the state of the running process prior to the arrival of the interrupt, so that the process may continue after the interrupt has been dealt with.
"PCLSR
ing" is a mechanism the ITS operating system
uses to enforce a kind of modularity when a process must access the
state of another process. The modularity principle is very simple: no
process ever catches another process (including itself) in the act of
executing a system call. System calls thus behave as if they were
directly implemented in hardware. A process can no more catch another
process in the middle of deleting a file than it can catch another
process in the middle of a multiply instruction.
Another way to state the PCLSR
ing modularity principle
is to say that when a process looks at the program counter of another
process, it must always see a User mode PC, never an Exec mode PC.
When one process wants to access the state of another process, and the
target process happens to be executing in Exec mode, then something
must be done in order to put the target process into User mode.
Either the target process must be backed out of the system call,
leaving the User mode PC pointing at the call, or it must be allowed
to finish, leaving the User mode PC pointing just after the call.
This act of getting a process into User mode as quickly as possible is
called "PCLSR
ing" (pronounced "pee-see-loo-zer-ing") a
process. We say that one process "PCLSR
ed"
("pee-see-loo-zerd") another process, or we speak of the need to
"PCLSR
" ("pee-see-loo-zer") a process before giving it an
interrupt.
Just as certain PDP-10 instructions cannot be guaranteed to run to
completion, either because they might be interrupted or because they
run the risk of taking a page fault (e.g. BLT
), certain
ITS system calls cannot always run to completion before the executing
process must be PCLSR
ed. A prime example is the
SIOT
system call, which is used to transfer a string of
bytes to or from an I/O device.
Suppose a process is using SIOT
to send a long string
of characters to a slow device (such as a terminal) when an interrupt
arrives. It is impossible to take back the characters that have
already reached the device, and there may be an indefinite delay
before the rest of the characters can be output. The
PCLSR
ing modularity principle requires that the process
be maneuvered out of the system call and into User mode as quickly as
possible, else the interrupt cannot be promptly delivered. How can
the state of the process be captured in a User mode PC given that the
process really is halfway through outputting the string of
characters?
The solution is to employ the same trick that the PDP-10 hardware
uses when it is necessary to interrupt an instruction, such as
BLT
, that has only partly performed its task. The
arguments originally passed to the system call are updated in place to
reflect any work already accomplished, and the PC is left pointing at
the system call, so that when the system call is restarted it will
proceed from where it left off. In the case of the SIOT
system call, the locations in memory from which the byte count and the
byte pointer arguments were originally fetched will be overwritten
with a new count and pointer that address the bytes that remain to be
output.
All ITS system calls that might be interrupted with their tasks partially accomplished are designed to update their arguments in this fashion. This requirement might be expected to introduce some system calls with baroque calling conventions, but in fact the vast majority of system calls have no need of ever updating any of their arguments, and those few that do are generally able to do so in a natural way. (One rarely used system call does require an additional argument that should initially be supplied as zero and is used as a "state word" to keep track of the progress of the call. In this obscure case, the documentation explains what the non-zero values of this argument mean, although no program is known to make use of this information.)
PCLSR
ingThe PCLSR
ing modularity principle is easy to state and
easy for the user to understand, but it is tricky to implement. The
majority of this paper is devoted to explaining the techniques used in
the ITS implementation of PCLSR
ing. Before going into
that, however, let us consider why PCLSR
ing is worth all
the trouble it entails. I will present examples of the benefits that
PCLSR
ing has for both the users and the implementors of
ITS.
Users are chiefly conscious of the effects of PCLSR
ing
when they write interrupt handlers. The main program level PC that is
passed to an interrupt handler is always a User mode PC. By examining
this PC and the contents of memory, the interrupt handler can
determine precisely what the state of the interrupted task was. If
the interrupt handler needs to know, for example, whether or not the
main program level had successfully deleted a file before the user
typed an interrupt character, it can simply look to see if the PC
points before or after the DELETE
system call.
Any question the interrupt handler may have about the state of the computation when the interrupt occurred can be answered in terms of a simple machine model in which all system calls behave as if they were hardware instructions. The effects of eccentric interrupt handlers that dismiss interrupts by returning to a PC other than the one interrupted from, or that alter the main program level accumulators, are straightforward to predict. It is, for example, easy to construct a simple interrupt driven scheduler that multiplexes one process into multiple tasks.
For the implementors of ITS, PCLSR
ing is a powerful
aid in managing the interactions between processes. This can be
illustrated by considering the example of the DTTY
system
call.
ITS supports a notion of "ownership" of the user's terminal. As
various processes run on the user's behalf (editor, compiler,
debugger, etc.), they pass ownership of the terminal back and forth.
A critical moment occurs when one process uses DTTY
to
take ownership of the terminal back from a process that it had
previously given it to. When the acquiring process wrests control of
the terminal away from its current owner, how can it know it won't
catch the owner in the middle of actually using the terminal,
potentially leaving some per-terminal data structures in an
inconsistent state?
The answer is for the acquiring process to start by
PCLSR
ing the terminal's current owner. This will insure
that the current owner is not in the middle of any system call.
Therefore all terminal-specific data structures will be in a known
state, and it will be safe to transfer ownership of the terminal.
This example shows most clearly why PCLSR
ing is a tool
for promoting modularity. PCLSR
ing can be used
to hide the implementation details of a system call that modifies the
state of some data structure. Only documented state transitions are
ever exposed to other processes.
In order to understand the techniques used to implement
PCLSR
ing, it is necessary to have some understanding of
the basic structure of ITS. This section introduces the players.
The basic job of any timesharing operating system is to share a single processor among set of independent processes. An ITS process has a name, some I/O channels, a program counter, sixteen accumulators, a page table, an interrupt system, and assorted other properties. These properties are all held in a collection of per-process variables that ITS keeps in its own address space.
At any particular moment a process can be running in either User mode or Exec mode (as determined by a bit in the left half of the program counter). In User mode, each process has its own page table, and can configure its User mode address space to suit itself. In Exec mode, all processes share the same page table, and thus all execute in the same Exec mode address space. Hardware interrupt service routines also execute in this single Exec mode address space.
An important property of the Exec mode address space is that its pages are never swapped out to secondary storage. All system code and data structures, including the per-process variables, are stored in Exec address space, and are always available without risk of causing a page fault. This convenient property is possible only because the implementors have taken great pains to minimize the size of ITS and its data structures.
ITS processes are variously called "processes", "jobs", "users", and "procedures". To avoid confusion, this paper consistently uses the word "process", with one exception: the set of per-process variables that ITS maintains are referred to as "user variables", rather than the more consistent "process variables", because to do otherwise would violate tradition. (Knowing that processes are sometimes called "users" will also help the reader understand why there are so many U's in the names of various routines and variables!)
Typical user variables are the UNAME
and
JNAME
; these contain two SIXBIT
words that
are unique to this process, and together serve as the process's name.
Additional user variables will be introduced as they are needed. For
easy reference, a dictionary of all user variables mentioned anywhere
in the paper, each with a brief description of its function, is given
in an appendix. This appendix also reveals
some simplifications made by the main text.
As in any other PDP-10 operating system, ITS system calls are
performed by executing instructions that cause a trap to Exec mode.
The Exec mode trap handler saves the User mode PC in the
SUUOH
user variable, saves the contents of the User mode
accumulators in the UUOACS
user variable (an array of
sixteen words), examines the instruction that caused the trap, and
dispatches to the appropriate system call.
Each system call is responsible for obtaining its own arguments from the User mode address space, performing its particular task, and writing any return values back into User mode memory. (Many system calls make use of a flexible and convenient facility ITS provides for processing arguments and returning values.)
All system calls return to a common point in order to return to
User mode. This common return code checks for a number of possible
error conditions (such as returning to User mode while still holding a
lock or with hardware interrupts deferred), performs some general
cleanup, restores the accumulators from UUOACS
, and
returns to the User mode PC saved in SUUOH
.
The ITS Scheduler runs at the lowest priority of the PDP-10's seven hardware interrupt levels. This interrupt level is commonly called "clock level" because a hardware clock interrupt is signaled there periodically. (Actually, several minor interrupts are also signaled at clock level, and a variety of periodic overhead tasks are performed there as well, but in this paper the only clock level activity that will concern us is the Scheduler.)
The Scheduler's job is to switch the processor back and forth
between those processes that are currently runnable. When a process
is not actually running on the processor the Scheduler keeps its PC in
the UPC
user variable and keeps its accumulators in the
AC0S
user variable (an array of sixteen words).
When the Scheduler examines a process it may discover that the process is either "stopped", "blocked", or "running". If a process is stopped, then some external agent has decided that this process must not run, so the Scheduler can do nothing with it. If a process is blocked, it is waiting for something, so the Scheduler must check for the awaited condition to see if the process can be unblocked and allowed to run. If a process is running, then it is capable of making immediate progress, so the Scheduler can allow it to use the processor.
The Scheduler uses the user variables USTP
and
FLSINS
to keep track of the runnability of a process. If
a process's USTP
is non-zero, then the process is
stopped. The left half contains various bits corresponding to
specific reasons why the process should not run (for example, one bit
is set when the process receives a fatal interrupt). The right half
contains a count of the number of other processes that are currently
keeping this process stopped. As each such process finishes, it
decrements this variable so that the Scheduler will know when
everybody is finished and it is safe to run the process once
again.
If the FLSINS
user variable is non-zero, then the
process is blocked waiting for some condition, such as the typing of a
character on the user's terminal, the passage of a given amount of
time, or the arrival of a page to satisfy a page fault. In such a
case FLSINS
contains an instruction that the Scheduler
can execute in order to determine if the awaited condition has come to
pass. Such an instruction should skip if the process is no longer
blocked.
At any given moment a process can be in one of five possible states:
USTP
is non-zero and
UPC
contains a User mode PC. (FLSINS
is
zero, which isn't immediately relevant, but does insure that the
process will run as soon as its USTP
is
cleared.)USTP
is zero,
FLSINS
contains an instruction that will skip when the
process is no longer blocked, and UPC
contains an Exec
mode PC.USTP
is zero,
FLSINS
contains an instruction that will skip when the
process is no longer blocked, and UPC
contains a User
mode PC. The only possible reason for a process to be blocked in User
mode is that it is waiting for a page to satisfy a page fault; all
other cases of blocking occur after a process has executed a system
call and entered Exec mode.USTP
is zero,
FLSINS
is zero, and UPC
contains an Exec
mode PC.USTP
is zero,
FLSINS
is zero, and UPC
contains a User mode
PC.Once they get a chance to run, processes have ways of influencing the behavior of the Scheduler. For example, a process can temporarily mask clock-level interrupts, which insures that it has exclusive use of the processor while it performs some critical action that the Scheduler must not interrupt. Processes can also intentionally yield the processor and allow the Scheduler to run other processes; a process will do this when it first becomes blocked.
ITS has a sophisticated software interrupt system that it uses to
inform processes of a variety of conditions that might require
attention. When a hardware device driver or another process wishes to
interrupt a given process, it merely sets a bit in one of two user
variables, PIRQC
or IFPIR
. It is then up to
the Scheduler to actually deliver the interrupt.
Whenever the Scheduler is considering running a process it looks at
its PIRQC
and IFPIR
variables as well as its
USTP
and FLSINS
. If it discovers that an
interrupt is pending, it tries to manipulate the process into its
interrupt handler. This is a somewhat tricky procedure, because the
required manipulations can cause page faults! Fortunately, only the
first step of the procedure concerns us in this paper: the Scheduler
first must PCLSR
the process so that the process's
interrupt handler sees a User mode PC.
PCLSR
ingNow that we have introduced the players, we can proceed to examine
the implementation of PCLSR
ing in detail.
When it becomes necessary to PCLSR
some target
process, we will find it in one of the five possible states described
in the introduction to the Scheduler above. Considering each state in
turn, what needs to be done in order to get the target process into
User mode as quickly as possible?
LSWPR
user
variable, described below, contains a linked list of cleanup actions
that we perform. Then we copy the saved User mode accumulators from
UUOACS
into AC0S
, and we pick up the User
mode return PC from SUUOH
, decrement it so that the
process will start the system call from the beginning, and put it in
UPC
. Finally we clear the process's
FLSINS
.FLSINS
.SUEXIT
user variable) that tells the process that it
should PCLSR
itself either the next time it tries to
block or when it tries to return to User mode.The decision made in case 4 has some important corollaries. First,
it means that when a PCLSR
is requested, the requester
cannot be assured of immediate success. Second, it means that a
process can run in Exec mode without risk of being
PCLSR
ed, provided it doesn't do anything that might cause
it to become blocked.
In the sections that follow we explore the consequences of the
algorithm given above. The algorithm itself is available as a routine
named "PCLSR
".
PCLSR
ing cannot work unless all system calls are
written with attention to the fact that they might be
PCLSR
ed. This isn't all that great of a burden once the
rules of the game are explained.
The most obvious rule is that a system call is free to do anything it likes if it disables clock level interrupts before it starts, and cleans up after itself before reenabling them. This may seem a rather heavy-handed technique, especially given the subtlety of some of the other techniques discussed here, but for critical operations no more than a few instructions in duration it's hard to imagine anything quicker or more effective.
(The basic PCLSR
ing routines are themselves built on
the ability to temporarily turn off clock level interrupts; if ITS
were ever converted to run with multiple processors, it would be
necessary to rework this code. PCLSR
ing would be just as
useful on a multiprocessor ITS, but the substrate upon which it is
built would have to change.)
System calls may also rely on the fact that they cannot be
PCLSR
ed unless they attempt an operation that might cause
them to become blocked. A process can mess around with its own state
as much as it likes and no other process will be able to see what it
is doing, as long as it is careful not to block. If another process
wants to examine this one, it must first PCLSR
it, but as
long as this process remains running in Exec mode, the
PCLSR
ing algorithm given above guarantees that it will
continue to run, and the other process will have to wait.
This guarantee is only useful if a system call can easily tell when
it runs the risk of going blocked. Fortunately there are only two
things that a process can do to become blocked in Exec mode. First,
it can explicitly ask to go blocked, directly supplying the Scheduler
with the instruction for its FLSINS
user variable.
Second, it can take a page fault referencing a page in User address
space, an action that requires the use of a special instruction. A
third possibility, taking a page fault referencing a page in Exec
address space -- perhaps doing something as innocuous as fetching the
next instruction -- cannot occur. This is because, as was pointed out
above, pages in Exec address space are never swapped out.
One additional guarantee is made concerning references to pages in
a process's own User address space. System calls can be certain that
any page successfully referenced will remain swapped in for the
duration of the execution of the system call. (More on how this is
accomplished later.) A system call can thus be certain that a
reference to a given User address will not cause the process to block,
and therefore risk PCLSR
ing, if it has previously
referenced that address.
Some system calls take advantage of this guarantee by touching some
selected User addresses at the beginning of their execution, thus
insuring that they can modify the contents of those addresses later
without fear of being PCLSR
ed at an awkward moment. For
example, the SIOT
system call starts by touching the
locations that contain its byte pointer and byte count arguments. It
can then proceed, secure in the knowledge that those locations are
swapped in (as well as writable!).
Most system calls cannot completely avoid the possibility of going
blocked. Given the PCLSR
ing algorithm, every point where
the execution of a system call might block is also a point where that
execution might be terminated. In order to allow system calls to
prepare themselves for this possibility, each process's
LSWPR
user variable contains a list of actions that must
be performed in order to clean up if it is PCLSR
ed.
LSWPR
is usually maintained as a stack, with various
system subroutines pushing and popping entries as needed. The most
common cleanup handler calls for the release of a lock that the
process holds, but it is possible for a system call to supply an
arbitrary routine. (There are some restrictions on what such a
routine can do, given the delicacy of the situations in which it will
be invoked.)
System calls also prepare themselves for possible
PCLSR
ing by updating the locations in User memory that
contained their original arguments to reflect whatever work they have
accomplished thus far. This insures that if the call is
PCLSR
ed and restarted, it will take up where it left off.
We have already seen how users are made aware of this aspect of
PCLSR
ing.
The need to update locations in User memory in order to properly
prepare for possible PCLSR
ing is the reason why ITS does
not swap out User pages during the execution of a system call. If it
did, then the very attempt to update arguments before possibly
blocking might itself cause the process to block due to a page fault.
If the process were to be PCLSR
ed at that point, there
would be no record of what the system call had already accomplished,
since the arguments would remain unmodified.
PCLSR
ing has consequences that go beyond simply
supplying a few cleanup handlers to protect system calls where they
might block. Consider the case of a system call that must read some
data from the disk. After it makes its request of the disk system, it
will want to block waiting for the disk interrupt level to complete
the transfer. If the executing process is PCLSR
ed at
that point, then when the disk interrupt level finishes, there won't
be any process waiting for the answer.
The process may come back later looking for the same information, so the interrupt level does not want to discard the data that it just read. On the other hand, the process may never come back, so something must make certain to eventually reclaim the space occupied by the data. The interface between the system call and interrupt level must be designed with these possibilities in mind. In general, certain operations that in other operating systems could be performed directly by system calls must be performed by someone else, since any code in a system call beyond a point where the call might block may never get executed.
One final rule that system calls must observe in order for
PCLSR
ing to work: system calls may not run indefinitely
without presenting opportunities to be PCLSR
ed. For some
system calls it is possible to supply arguments that require the call
to loop. Those system calls periodically check the
SUEXIT
user variable themselves to see if someone has
requested a PCLSR
, and if so, they submit to the
request.
It is the Scheduler's job to deliver software interrupts to a
process when its PIRQC
or IFPIR
user
variable is non-zero. The first step of delivering such interrupts is
to PCLSR
the process so that its interrupt handler will
see a User mode PC.
The Scheduler discovers pending software interrupts when it is
examining a process to determine if it should actually be run. If a
process is not stopped (i.e., if its USTP
is zero) and if
it has pending interrupts, then the Scheduler will immediately try to
PCLSR
the process.
If the PCLSR
attempt fails, then the process must be
running in Exec mode. In this case, the Scheduler just treats the
process as it would any other runnable process. Since the process's
SUEXIT
now indicates that the process needs to be
PCLSR
ed, the next time the process tries to go blocked,
or return to User mode, it will PCLSR
itself and return
to the Scheduler. The Scheduler will then be able to deliver the
interrupt.
If the PCLSR
attempt succeeds, then the process has
been forced back to User mode. The Scheduler then proceeds to
manipulate the process to put it into its interrupt handler. The
subsequent steps of this procedure (while quite interesting!) are
beyond the scope of this paper.
PCLSR
AnotherThe most interesting case of PCLSR
ing occurs when one
process wants to PCLSR
another process, as in the example
of the DTTY
system call given above. In this case the
PCLSR
ing process will want to PCLSR
its
target and immediately stop it by incrementing the target's
USTP
. This common operation is done by calling a routine
named "RPCLSR
", so we often speak of one process
RPCLSR
ing ("ar-pee-see-loo-zer-ing") another.
The difficulties with RPCLSR
ing all stem from the
possibility that the target process might be running in Exec mode, and
thus will be unable to be PCLSR
ed immediately. In this
case the perpetrating process will have to go blocked (!) waiting for
the target to notice the PCLSR
attempt.
As we are about to discover, this situation is more delicate than
blocking ordinarily is. Later, when the target PCLSR
s
itself, it will need to be able to find the perpetrator. Thus we
introduce the RPCL
user variable to keep track of this
situation when it occurs. When a process fails to immediately
RPCLSR
its target, it will store the address of the
target process in its own RPCL
, and its own address in
the target's RPCL
. The left half of RPCL
is
used as a flag to record which process is the target and which process
the perpetrator.
The algorithm for RPCLSR
ing is as follows:
First, try to PCLSR
the target process. If this
succeeds, then immediately increment the target's USTP
and return. Clock level interrupts are turned off during this
procedure to prevent the target from getting a chance to run after it
is PCLSR
ed, but before it is stopped.
If the PCLSR
attempt fails, the target's
SUEXIT
has been set to indicate a PCLSR
attempt in progress. The perpetrator should now set up the
appropriate pointers in both its own and the target's
RPCL
user variables, but two rare cases must be
considered first:
RPCL
indicates that it itself is the target of an RPCLSR
by
some third process. In this case the perpetrator immediately submits
to the third party. Since the target's RPCL
is
unmodified, but its SUEXIT
has been set, it will
uselessly PCLSR
itself at some point in the future and
then continue to run. These useless PCLSR
s are not a
problem because this case rarely happens, and a few extra
PCLSR
s are harmless.RPCL
indicates that some third process is already trying to
RPCLSR
the target. In this case the perpetrator simply
blocks waiting for the target's RPCL
to be cleared, and
then starts the RPCLSR
attempt again from the
beginning.RPCL
is clear and unblocks the perpetrator, but before
the perpetrator gets a chance to run, the target will start up and get
back into Exec mode and become the target of yet another third-party
RPCLSR
attempt. A process might therefore starve forever
waiting for its chance to RPCLSR
someone, even though all
other processes are making progress. This is not a problem in
practice because it is extraordinarily unlikely. It may be more
likely to snow in the machine room than it is for a process to starve
in this way for a noticeable amount of time.)If neither of these cases occurs, the perpetrator sets up the two
RPCL
user variables and blocks. Since the act of going
blocked runs the risk of being PCLSR
ed, something must be
done to clear out the RPCL
pointers if the perpetrator is
itself PCLSR
ed while it waits. This could be done by
using an appropriate cleanup handler pushed on the perpetrator's
LSWPR
, but instead, the PCLSR
routine itself
always checks the RPCL
of any process it is applied to
for the possibility that that process is RPCLSR
ing
someone, and cleans up appropriately.
The final step in RPCLSR
ing occurs when the target
tries to block or to return to User mode, and notices that its
SUEXIT
indicates that it should PCLSR
itself, and that in addition its RPCL
indicates that it
is the target of an RPCLSR
request. The target must then
PCLSR
itself, clear the RPCL
s of both itself
and the perpetrator, increment its own USTP
in order to
stop itself, and clear the FLSINS
of the perpetrator in
order to unblock it. (And it must do all this with clock level
interrupts turned off, to prevent a variety of timing errors.)
It is important to unblock the PCLSR
ing process
instantly; otherwise it might be PCLSR
ed and the target's
USTP
would never get decremented. This risk is
eliminated by starting the perpetrator running in Exec mode where it
can't be PCLSR
ed until after taking appropriate
precautions. In fact, the next thing many RPCLSR
ing
processes do is add a cleanup handler to their LSWPR
to
decrement the target's USTP
. (Of course, if there is no
danger of going blocked, the perpetrator can simply perform its task
and decrement the target's USTP
when it is done without
bothering about cleanup handlers at all.)
Recall that ITS guarantees that no User mode pages will be swapped
out during the execution of a system call. This cannot be
accomplished by simply granting blanket immunity to processes in Exec
mode, because processes spend most of their time blocked in Exec mode
(waiting for the next character from the terminal or whatever). It is
therefore necessary to resort to PCLSR
ing blocked
processes in order to find pages to swap out to secondary storage.
Unfortunately, simply PCLSR
ing a process blocked in
Exec mode backs it out of the system call and starts it running in
User mode again, which causes it to immediately reexecute the call,
and usually to return to the same blocked condition as before. Thus
if ITS were to immediately PCLSR
a process every time it
wanted to swap out some of its pages, that would at least cause a lot
of unnecessary activity, and might even cause the process to swap
pages back in that were just swapped out.
Instead, when ITS wants to swap out a page belonging to a process
that is blocked in Exec mode, it simply sets a bit (named
"%SWPCL
") in the process's USWST
user
variable. %SWPCL
is checked by the Scheduler whenever it
is about to unblock a process, and if it is set, the Scheduler will
PCLSR
the process instead. The process is allowed to
finish waiting for whatever it was waiting for, and only then is it
PCLSR
ed. This is just as effective at preventing the
process from ever seeing its pages swapped out as immediately
PCLSR
ing it would be.
(Actually, the swapping code will PCLSR
a process
immediately if it has any entries on its LSWPR
.
Presumably the idea is to release any locks that the process is
holding as soon as possible, so other processes can seize them and
make progress. Perhaps because of this, the swapping code tries to
avoid swapping out pages that belong to processes with
LSWPR
entries. The author is not prepared to judge these
aspects of ITS's page swapping algorithm.)
Since processes that are running in Exec mode cannot be
PCLSR
ed, and since they might depend on keeping any of
their pages swapped in, those pages must be kept immune from being
swapped out. Fortunately, processes spend very little time actually
running in Exec mode, so the number of pages locked down by this
restriction is generally small.
PCLSR
Test ModeTo debug ITS system calls, it is convenient to have a way to test
their behavior when PCLSR
ed from various points in their
execution. PCLSR
test mode allows the system call writer
to choose a set of such test points by inserting calls to an assembler
macro, named "PCLT
", in the system call. Each call to
PCLT
is positioned immediately before a place where the
call would block. Then PCLSR
test mode lets the system
call writer run the system call until it reaches one of the test
points. There are three modes that control what happens each time the
process being tested reaches one of the test points:
PCLSR
s itself and
stops. If the system call writer starts the process running again, it
will run to the exact same test point and then PCLSR
and
stop again. Thus the system call writer can test the same point over
and over again.PCLSR
and stop at the
next test point, where and whenever that happens to occur.
Thus the system call writer can test all of the test points in
succession in a particular path through a system call.PCLSR
ed,
it just keeps running and thus immediately starts the system call
over. This lets the system call writer exercise a system call over
and over, presumably waiting for some bug to happen. The exact way in which PCLSR
test mode decides that
the call has reached "the same test point" is not simply a test of the
PC. This would not work very well because a particular test point
might be in a subroutine that was called from several places. In
fact, PCLSR
test mode crawls down the stack sniffing
around for return addresses, in an attempt to discover the call chain
that got to the current routine. These return addresses are then
hashed together. It is the resulting hash that is used to compare
test points.
Many of the design decisions made in the ITS implementation of
PCLSR
ing can only be justified by measuring the behavior
of a running ITS. Of primary interest are the frequencies at which
the various events considered above occur.
The statistics below were collected on AI.AI.MIT.EDU, a KS-10, during December 1988 and January 1989. AI.AI.MIT.EDU supports a moderately small user community; typically there are 3 or 4 users logged in, each with 3 or 4 processes, plus a handful of system demons and network servers, for a total of between 15 and 20 processes. There are rarely more than 40 processes, and the system never runs up against its configured limit of 62 processes.
How often do the various cases in the PCLSR
ing
algorithm occur? During a 10 day period PCLSR
was called
once every 0.32 seconds. 35.9% of the time the target was in User
mode already (stopped, blocked, or running). 60.3% of the time the
target was blocked in Exec mode. The remaining 3.8% of the time the
target was running in Exec mode and could not be immediately
PCLSR
ed. This last case therefore happened once every
8.5 seconds.
The problem case of PCLSR
ing, where the target cannot
be immediately forced into User mode, is observed to be a relatively
rare occurrence. A process that calls RPCLSR
will only
rarely have to block, and will even more rarely find a third party
already blocked attempting to RPCLSR
the same target.
Also, at any given moment, only a few processes will be holding their
pages locked in memory because they are running in Exec mode.
We might also wonder for what reasons PCLSR
is called.
During a 32 hour period PCLSR
was called once every 0.40
seconds. The Scheduler called PCLSR
in order to deliver
a software interrupt once every 0.58 seconds, accounting for 69.0% of
all calls to PCLSR
. A process called RPCLSR
once every 1.44 seconds, accounting for 27.7% of all calls to
PCLSR
. The page swapping code caused a process to be
PCLSR
ed once every 11.77 seconds, accounting for the
remaining 3.4% of all calls to PCLSR
.
Given that PCLSR
fails 3.8% of the time, we can
estimate that the Scheduler failed to immediately deliver an interrupt
(and ran the target process instead) once every 15 seconds. We can
also estimate that a process called RPCLSR
and had to
block once every 38 seconds. Of course these events would have
happened more frequently on an ITS supporting more processes; still,
it is nice to have a feel for just how rare these events are. It is
also nice that the page swapping code was doing so little
PCLSR
ing. Of course an ITS with less physical memory, or
running more memory intensive applications, would exhibit a higher
percentage of PCLSR
s for swapping.
Finally, we might wonder just how quickly a process that is
PCLSR
ed while running in Exec mode normally reaches User
mode. During a 53 hour period PCLSR
failed to
immediately force a process into User mode 13381 times. On 7376 of
these occasions the target process was already the subject of a
PCLSR
attempt. (These occasions mostly represent
duplicate attempts by the Scheduler to deliver an interrupt to a
process that had been continuously running in Exec mode since the last
time the Scheduler ran.) Therefore on the remaining 6005 occasions
the target was not the subject of a prior PCLSR
attempt. On these 6005 occasions a timer was started to measure the
amount of time it took for the process to eventually reach User
mode.
The resulting distribution of times has some very interesting
features. The average time taken was 0.116 seconds, which is
misleading because a minority of cases that took a very long time
inflate the average. Perhaps a more reasonable measure is the median:
50% of the time the process reached User mode in under 0.015 seconds.
80% of the time the process took less than 0.107 seconds (still less
than the average). 90% of the time the process took less than 0.267
seconds. The Scheduler allows a process 0.133 seconds of runtime
before the next schedule, so 82% of the time the target of a
PCLSR
attempt reached User mode before the next
schedule.
The time measured was elapsed time, not per-process run time. This
makes little difference to the results because typically the target
process was the only runnable process in the system, so no delay was
introduced while some other process ran. Occasionally, however, there
was another runnable process, and as a result the distribution shows a
small spike just after 0.133 seconds, caused by PCLSR
ed
processes that had to wait one turn, while another process ran, before
they got their chance to quickly reach User mode. An even smaller
spike appears after 0.266 seconds.
The PCLSR
ing modularity principle is simple: a process
should never catch another process with its pants down.
PCLSR
ing gives a process a clean way to access the state
of another process.
In this paper I have explained the benefits of this principle for both users and system programmers, and I have explained how the ITS operating system goes about supporting it. I described the implementation in some detail, and also provided some performance metering, in order to demonstrate that the costs of applying the principle are acceptable.
I would like to believe that some reader of this paper will be
inspired to try bringing the benefits of PCLSR
ing to some
younger operating system.
I am grateful to the early ITS developers who originally designed, implemented, and debugged the ideas presented here: Donald Eastlake, Jerry Freiberg, Richard Greenblatt, Jack Holloway, Tom Knight, George Mitchel, Stewart Nelson, Jeff Rubin, and Frederick Wright.
Dave Moon and Tom Knight provided many useful comments on early
drafts of this paper and helped me debug my understanding of
PCLSR
ing. Joe Dempster, Ian Horswill, Richard
Greenblatt, Leigh Klotz, Chris Maeda, and Guy Steele also provided
valuable feedback. It was Joe Dempster who convinced me that
something ITS should be recorded for posterity. Pandora
Berman sometimes helped me find the right words. Erasmus helped
too.
AC0S
UPC
.FLSINS
USTP
.LSWPR
PCLSR
ed. Common
cleanup actions are releasing locks held by this process and
decrementing counters incremented by this process. An important
instance of such a cleanup action is ensuring that the
USTP
variable of another process is properly
decremented.PIRQC
IFPIR
RPCL
RPCLSR
which other processes. If process A is trying
to RPCLSR
process B, then process A's RPCL
will contain the address of process B in its right half and zero in
its left half. Process B's RPCL
will contain the address
of process A in its right half and minus one in its left half.SUEXIT
SUUOH
. If there is an attempt in progress to
PCLSR this process, this variable will instead contain a jump to a
special routine. This variable could actually be replaced with a flag
that just indicates a pending PCLSR
attempt.SUUOH
UUOACS
.UNAME
JNAME
SIXBIT
names of
the process. These variables actually have nothing to do with
PCLSR
ing; they are here to serve as examples of "typical"
variables.
UPC
AC0S
.USTP
FLSINS
.USWST
%SWPCL
, concerns us in this
paper. %SWPCL
is set if any of this process's pages were
swapped out while it was blocked in Exec mode. When the process
becomes runnable again, the Scheduler will see this bit and
PCLSR
the process instead of letting it continue in Exec
mode.UUOACS
SUUOH
.