
%template for producing IEEE-format articles using LaTeX.
%written by Matthew Ward, CS Department, Worcester Polytechnic Institute.
%use at your own risk.  Complaints to /dev/null.
%make two column with no page numbering, default is 10 point
%\documentstyle{article}
\documentstyle[twocolumn]{article}
%\pagestyle{empty}

%set dimensions of columns, gap between columns, and space between paragraphs
%\setlength{\textheight}{8.75in}
\setlength{\textheight}{9.0in}
\setlength{\columnsep}{0.25in}
\setlength{\textwidth}{6.45in}
\setlength{\footheight}{0.0in}
\setlength{\topmargin}{0.0in}
\setlength{\headheight}{0.0in}
\setlength{\headsep}{0.0in}
\setlength{\oddsidemargin}{0in}
%\setlength{\oddsidemargin}{-.065in}
%\setlength{\oddsidemargin}{-.17in}
%\setlength{\parindent}{0pc}

%I copied stuff out of art10.sty and modified them to conform to IEEE format

\makeatletter
%as Latex considers descenders in its calculation of interline spacing,
%to get 12 point spacing for normalsize text, must set it to 10 points
\def\@normalsize{\@setsize\normalsize{12pt}\xpt\@xpt
\abovedisplayskip 10pt plus2pt minus5pt\belowdisplayskip \abovedisplayskip
\abovedisplayshortskip \z@ plus3pt\belowdisplayshortskip 6pt plus3pt
minus3pt\let\@listi\@listI}

%need an 11 pt font size for subsection and abstract headings
\def\subsize{\@setsize\subsize{12pt}\xipt\@xipt}

%make section titles bold and 12 point, 2 blank lines before, 1 after
\def\section{\@startsection {section}{1}{\z@}{24pt plus 2pt minus 2pt}
{12pt plus 2pt minus 2pt}{\large\bf}}

%make subsection titles bold and 11 point, 1 blank line before, 1 after
\def\subsection{\@startsection {subsection}{2}{\z@}{12pt plus 2pt minus 2pt}
{12pt plus 2pt minus 2pt}{\subsize\bf}}
\makeatother

\newcommand{\ignore}[1]{}
%\renewcommand{\thesubsection}{\arabic{subsection}.}

\begin{document}

%don't want date printed
\date{}

%make title bold and 14 pt font (Latex default is non-bold, 16 pt)
\title{\Large \bf   An Embedded Error Recovery and Debugging Mechanism for Scripting Language Extensions}

%for single author (just remove % characters)
\author{{David M.\ Beazley} \\
{\em Department of Computer Science} \\
{\em University of Chicago }\\
{\em Chicago, Illinois 60637 }\\
{\em beazley@cs.uchicago.edu }}

%  My Department \\
%  My Institute \\
%  My City, ST, zip}

%for two authors (this is what is printed)
%\author{\begin{tabular}[t]{c@{\extracolsep{8em}}c}
%  Roscoe Giles	                        & Pablo Tamayo \\
% \\
%  Department of Electrical, Computer,   & Thinking Machines Corp. \\
%  and Systems Engineering               & Cambridge, MA~~02142.  \\
%  and                                   & \\
%  Center for Computational Science      & \\
%  Boston University, Boston, MA~~02215. &
%\end{tabular}}

\maketitle

%I don't know why I have to reset thispagesyle, but otherwise get page numbers
\thispagestyle{empty}


\subsection*{Abstract}
{\em
In recent years, scripting languages such as Perl, Python, and Tcl
have become popular development tools for the creation of
sophisticated application software.  One of the most useful features
of these languages is their ability to easily interact with compiled
languages such as C and C++.  Although this mixed language approach
has many benefits, one of the greatest drawbacks is the complexity of
debugging that results from using interpreted and compiled code in
the same application.  In part, this is due to the fact that scripting
language interpreters are unable to recover from catastrophic errors in
compiled extension code. Furthermore, traditional C/C++ debuggers do
not provide a satisfactory degree of integration with interpreted
languages.  This paper describes an experimental system in which fatal
extension errors such as segmentation faults, bus errors, and failed
assertions are handled as scripting language exceptions.  This system,
which has been implemented as a general purpose shared library,
requires no modifications to the target scripting language, introduces
no performance overhead, and simplifies the debugging of mixed
interpreted-compiled application software.
}

\section{Introduction}

Slightly more than ten years have passed since John Ousterhout
introduced the Tcl scripting language at the 1990 USENIX technical
conference \cite{ousterhout}.  Since then, scripting languages have
been gaining in popularity as evidenced by the widespread use of
systems such as Tcl, Perl, Python, Guile, PHP, and Ruby
\cite{ousterhout,perl,python,guile,php,ruby}.

In part, the success of modern scripting languages is due to their
ability to be easily integrated with software written in compiled
languages such as C, C++, and Fortran.  In addition, a wide variety of wrapper
generation tools can be used
to automatically produce bindings between existing code and a
variety of scripting language environments
\cite{swig,sip,pyfort,f2py,advperl,heidrich,vtk,gwrap,wrappy}.  As a result, a large number of
programmers are using scripting languages to control
complex C/C++ programs or as a tool for re-engineering legacy
software.  This approach is attractive because it allows programmers
to benefit from the flexibility and rapid development of
scripting while retaining the best features of compiled code such as high
performance \cite{ouster1}.

A critical aspect of scripting-compiled code integration is the way in
which it departs from traditional C/C++ development.  Rather than
building large monolithic stand-alone applications, scripting
languages strongly encourage the creation of modular software
components.  As a result, scripted software tends to be constructed as
a mix of dynamically loadable libraries, scripts, and third-party
extension modules. In this sense, one might argue that the benefits of
scripting are achieved at the expense of creating a somewhat more
complicated development environment.

A consequence of this complexity is an increased degree of difficulty
associated with debugging programs that utilize multiple languages,
dynamically loadable modules, and a sophisticated runtime environment.
To address this problem, this paper describes an experimental system
known as WAD (Wrapped Application Debugger) in which an embedded error
recovery and debugging mechanism is added to common scripting
languages.  This system converts catastrophic signals such as
segmentation faults and failed assertions to exceptions that can be
handled by the scripting language interpreter.  In doing so, it
provides more seamless integration between error handling in
scripting language interpreters and compiled extensions.

\section{The Debugging Problem}

Normally, a programming error in a scripted application
results in an exception that describes the problem and the context in
which it occurred.  For example, an error in a Python script might
produce a traceback similar to the following:

\begin{verbatim}
% python foo.py
Traceback (innermost last):
  File "foo.py", line 11, in ?
    foo()
  File "foo.py", line 8, in foo
    bar()
  File "foo.py", line 5, in bar
    spam()
  File "foo.py", line 2, in spam
    doh()
NameError: doh
\end{verbatim}

In this case, a programmer might be able to apply a fix simply based
on information in the traceback.  Alternatively, if the problem is
more complicated, a script-level debugger can be used to provide more information.  In contrast,
a failure in compiled extension code might produce the following result:

\begin{verbatim}
% python foo.py
Segmentation Fault (core dumped)
\end{verbatim}

In this case, the user has no idea what has happened other
than that it appears to be ``very bad.'' Furthermore, script-level
debuggers are unable to identify the problem since they also crash
when the error occurs (they usually run in the same process as
the interpreter).  A user might be able to narrow the source of the
problem through trial-and-error techniques such as inserting print
statements or commenting out sections of script code.  Unfortunately,
neither of these techniques is very attractive for obvious reasons.

Alternatively, a user could run the application under the control of a
traditional debugger such as gdb \cite{gdb}. Unfortunately, this also has
drawbacks.  First, although the debugger reports the error,
most of what it reports concerns the internal
implementation of the scripting language interpreter.  This is
of little use, and it provides little insight into
where the error might have occurred within a script.  Second,
the structure of a scripted application tends to be much more complex
than a traditional stand-alone program.  As a result, a user may not
have a good sense of how to actually attach a C/C++ debugger to their
script.  In addition, execution may occur within a
complex run-time environment involving events, threads, and network
connections.  Because of this, it can be difficult to reproduce
and identify certain types of catastrophic errors (especially if they
depend on timing or peculiar sequences of events). Finally, this approach
assumes that a programmer has a C/C++ development environment installed on
their machine and that they know how to use a low-level
debugger.  Unfortunately, neither of these assumptions may hold in practice.
This is because scripting languages are often used to provide programmability to
applications in which end-users might write scripts, yet would not be expected
to write low-level C code.

Even if a traditional debugger such as gdb were modified to
provide better integration with scripting languages, it is not clear
that this would be the most natural solution to the problem.
For one, the whole notion of having to run a separate debugging process to debug
extension code is unnatural when no such requirement exists for
a script.  Furthermore, even if such a debugger existed, an inexperienced user may not
have the expertise or inclination to use it.  Finally,
obscure fatal errors may occur long after an application has been deployed.
Unless the debugger is distributed along with the application in some manner, it will be
extraordinarily difficult to obtain useful diagnostics when such errors occur.

\begin{figure*}[t]
{\small
\begin{verbatim}
% python foo.py
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "foo.py", line 16, in ?
    foo()
  File "foo.py", line 13, in foo
    bar()
  File "foo.py", line 10, in bar
    spam()
  File "foo.py", line 7, in spam
    doh.doh(a,b,c)

SegFault: [ C stack trace ]

#2   0x00027774 in call_builtin(func=0x1c74f0,arg=0x1a1ccc,kw=0x0) in 'ceval.c', line 2650
#1   0xff083544 in _wrap_doh(self=0x0,args=0x1a1ccc) in 'foo_wrap.c', line 745
#0   0xfe7e0568 in doh(a=0x3,b=0x4,c=0x0) in 'foo.c', line 28

/u0/beazley/Projects/WAD/Python/foo.c, line 28

    int doh(int a, int b, int *c) {
 =>   *c = a + b;
      return *c;
    }
\end{verbatim}
}
\caption{Cross language traceback generated for a segmentation fault in a Python extension}
\end{figure*}

The easiest solution to the debugging problem is
to simply add as much error checking as possible.  Although this is never
a bad thing to do, it's usually not enough to completely eliminate the problem.
For one, scripting languages are sometimes used to control hundreds
of thousands to millions of lines of compiled code.  In this case, it is improbable
that a programmer will be able to foresee every conceivable error.
Second, scripting languages are often used to put new user interfaces on legacy software. In this
case, scripting may introduce new modes of execution that cause a formerly ``bug-free''
application to fail in an unexpected manner.  Finally, certain types
of errors such as floating-point exceptions can be particularly
difficult to eliminate because they might be generated algorithmically (e.g.,
as the result of a numerical method). Therefore, even when a programmer has worked hard to eliminate
crashes, there is always a small probability that a complex application
will fail.

\section{Embedded Error Recovery}

Rather than modifying an existing debugger to support scripting
languages, an alternative approach is to add a more powerful error
handling and recovery mechanism to the scripting language interpreter.
This approach has been implemented in the form of an
experimental system known as WAD.  WAD
is packaged as a dynamically loadable shared library that can either be
loaded as a scripting language extension or linked to existing
extension modules as a library.  The core of the system is generic and
requires no modifications to the scripting interpreter or existing
extension modules.  Furthermore, the system does not introduce a performance penalty as it
does not rely upon program instrumentation or tracing.

WAD works by converting fatal signals such as SIGSEGV,
SIGBUS, SIGFPE, and SIGABRT into scripting language exceptions that contain
debugging information collected from the call-stack of compiled
extension code.  By handling errors in this manner, the scripting
language interpreter is able to produce a cross-language stack trace that
contains information from both the script code and extension code as
shown for Python and Tcl/Tk in Figures 1 and 2.  In this case, the user
is given a very clear idea of what has happened without having
to launch a separate debugger.

The advantage to this approach is that it provides
more seamless integration between error handling
in scripts and error handling in extensions.  In addition, it eliminates
the most common debugging step that a developer is likely to perform
in the event of a fatal error: running a separate debugger on a core
file and typing {\tt where} to get a stack trace.  Finally, this allows
end-users to provide extension writers with useful debugging
information since they can supply a stack trace as opposed to a vague
complaint that the program ``crashed.''

\begin{figure*}[t]
\begin{picture}(400,250)(0,0)
\put(50,-110){\special{psfile = tcl.ps hscale = 60 vscale = 60}}
\end{picture}
\caption{Dialogue box with traceback information for a failed assertion in a Tcl/Tk extension}
\end{figure*}

\section{Scripting Language Internals}

In order to provide embedded error recovery, it is critical to understand how
scripting language interpreters interface with extension code.  Despite the wide variety
of scripting languages, essentially every implementation uses a similar
technique for accessing foreign code.

The most widely used extension mechanism is a foreign function
interface in which compiled procedures can be called from the scripting language
interpreter. This is accomplished by writing a collection of wrapper functions that conform
to a specified calling convention. The primary purpose of the wrappers is to
marshal arguments and return values between the two languages and to handle errors.
For example, in Tcl, every wrapper
function must conform to the following prototype:

\begin{verbatim}
int
wrap_foo(ClientData clientData,
         Tcl_Interp *interp,
         int objc,
         Tcl_Obj *CONST objv[])
{
    /* Convert arguments */
    ...
    /* Call a function */

    result = foo(args);

    /* Set result */
    ...
    if (success) {
        return TCL_OK;
    } else {
        return TCL_ERROR;
    }
}
\end{verbatim}

The other extension mechanism is an object/type interface that allows programmers to create new
kinds of fundamental types or attach special properties to objects in
the interpreter.  This usually involves setting up tables of function
pointers that define various properties of an object.  For example, if
you wanted to add complex numbers to an interpreter, you might fill in a special
data structure with pointers to various methods like this:

\begin{verbatim}
NumberMethods ComplexMethods = {
    complex_add,
    complex_sub,
    complex_mul,
    complex_div,
    ...
};
\end{verbatim}

\noindent
Once registered with the interpreter, the methods in this structure
would be invoked by various interpreter operators such as $+$,
$-$, $*$, and $/$.

Most interpreters handle errors as a two-step process in which
detailed error information is first registered with the interpreter
and then a special error code is returned. For example, in Tcl, errors
are handled by setting error information in the interpreter and
returning a value of TCL\_ERROR.  Similarly in Python, errors are
handled by raising an exception and returning NULL.  In both cases,
this triggers the interpreter's error handler---possibly resulting in
a stack trace of the running script.  In some cases, an interpreter
might handle errors using a form of the C {\tt longjmp} function.
For example, Perl provides a special function {\tt die} that jumps back
to the interpreter with a fatal error \cite{advperl}.

The precise implementation details of these mechanisms aren't so
important for our discussion.  The critical point is that scripting
languages always access extension code through a well-defined interface
that precisely defines how arguments are to be passed, values are to be
returned, and errors are to be handled.

\section{Scripting Languages and Signals}

Under normal circumstances, errors in extension code are handled
through the error-handling API provided by the scripting language
interpreter.  For example, if an invalid function parameter is passed,
a program can simply set an error message and return to the
interpreter.  Similarly, automatic wrapper generators such as SWIG can produce
code to convert C++ exceptions and other C-related error handling
schemes to scripting language errors \cite{swigexcept}. On the other
hand, segmentation faults, failed assertions, and similar problems
produce signals that cause the interpreter to crash.

Most scripting languages provide limited support for Unix signal
handling \cite{stevens}.  However, this support is not sufficiently advanced to
recover from fatal signals produced by extension code.
First, unlike signals generated for asynchronous events such as I/O,
execution can {\em not} be resumed at the point of a fatal signal.
Therefore, even if such a signal could be caught and handled by a script,
there isn't much that it can do except to print a diagnostic
message and abort before the signal handler returns.  Second,
some interpreters block signal delivery while executing
extension code, opting to handle signals at a time when it is more convenient.
In this case, a signal such as SIGSEGV would simply cause the whole application
to freeze since there is no way for execution to continue to a point where
the signal could be delivered.  Because of these issues, scripting languages
either ignore the problem or label it as a ``limitation.''

\section{Overview of WAD}

WAD installs a reliable signal handler for
SIGSEGV, SIGBUS, SIGABRT, SIGILL, and SIGFPE using {\tt sigaction}
\cite{stevens}.  Since none of these signals are normally used in the implementation
of the scripting interpreter or by any user scripts, this typically does not override any previous
signal handling.  Afterwards, when one of these signals occurs, a two-phase
recovery process executes. First,
information is collected about the execution context including a
full stack-trace, symbol table entries, and debugging information.
Second, the current stream of execution is aborted and an error is
returned to the interpreter.  This process is illustrated in Figure~3.
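
The installation step can be sketched as follows.  This is a minimal
illustration rather than the WAD source: the names {\tt wad\_handler}
and {\tt wad\_install} are hypothetical, and the handler body is only a
stub that reports the signal and exits, where the real handler would
carry out the two-phase recovery.  The {\tt SA\_SIGINFO} flag asks the
system to pass the signal context to the handler.

\begin{verbatim}
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stub.  A real handler would
   gather a stack trace from ctx and then
   return to the interpreter. */
static void
wad_handler(int sig, siginfo_t *si, void *ctx)
{
    static const char msg[] = "fatal signal\n";
    ssize_t n;

    (void) sig; (void) si; (void) ctx;
    n = write(2, msg, sizeof(msg) - 1);
    (void) n;
    _exit(1);
}

/* Install the handler for the five fatal
   signals.  Returns 0 on success and -1 if
   any sigaction() call fails. */
int
wad_install(void)
{
    static const int sigs[] = {
        SIGSEGV, SIGBUS, SIGABRT,
        SIGILL, SIGFPE
    };
    struct sigaction sa;
    size_t i;
    size_t n = sizeof(sigs) / sizeof(sigs[0]);

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = wad_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    for (i = 0; i < n; i++) {
        if (sigaction(sigs[i], &sa, NULL) < 0)
            return -1;
    }
    return 0;
}
\end{verbatim}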

The collection of context and debugging information is a relatively
straightforward process involving the following steps:

\begin{itemize}
\item The program counter and stack pointer are obtained from
context information passed to the WAD signal handler.

\item The virtual memory map of the process is obtained from /proc
and used to associate virtual memory addresses with executable files,
shared libraries, and dynamically loaded extension modules \cite{proc}.

\item The call stack is unwound to collect traceback information.
At each step of the stack traceback, symbol table and debugging
information is gathered and stored in a generic data structure for later use
in the recovery process.  This data is obtained by memory-mapping
the ELF format object files associated with the process and extracting
symbol table and stabs debugging information \cite{elf,stabs}.
\end{itemize}
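
As an illustration of the second step, the sketch below maps an address
to the name of the mapping that contains it.  It assumes the Linux text
format of {\tt /proc/self/maps}, which differs from the Solaris
{\tt /proc} interface used by the prototype, and the function name
{\tt wad\_find\_object} is hypothetical.  For brevity it uses
{\tt stdio}, which a real signal handler would avoid.

\begin{verbatim}
#include <stdio.h>
#include <string.h>

/* Look up the mapping that contains addr.
   On success, copy the name of the mapping
   (a file path, or a tag such as [stack])
   into out and return 0.  Return -1 if addr
   is not inside a named mapping. */
int
wad_find_object(const void *addr,
                char *out, size_t outlen)
{
    unsigned long a = (unsigned long) addr;
    unsigned long lo, hi;
    char line[1024], path[512];
    int found = -1;
    FILE *f;

    if (outlen == 0)
        return -1;
    f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        /* Fields: start-end perms offset
           dev inode [path] */
        path[0] = '\0';
        if (sscanf(line,
                   "%lx-%lx %*s %*s %*s %*s"
                   " %511[^\n]",
                   &lo, &hi, path) < 2)
            continue;
        if (a >= lo && a < hi && path[0]) {
            strncpy(out, path, outlen - 1);
            out[outlen - 1] = '\0';
            found = 0;
            break;
        }
    }
    fclose(f);
    return found;
}
\end{verbatim}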

Once debugging information has been collected, the signal handler
enters an error-recovery phase that
attempts to raise an exception and return to a suitable location in the
interpreter.  To do this, the following steps are performed:

\begin{itemize}

\item The stack trace is examined to see if there are any locations to which
control can be returned.

\item If a suitable return location is found, the CPU context is modified in
a manner that makes the signal handler return to the interpreter
with an error.  This return process is assisted by a small
trampoline function (partially written in assembly language) that arranges a proper
return to the interpreter after the signal handler returns.
\end{itemize}

\noindent
Of the two phases, the return to the interpreter is of greater interest. Therefore, it
is now described in greater detail.

\begin{figure*}[t]
\begin{picture}(480,340)(5,60)

\put(50,330){\framebox(200,70){}}
\put(60,388){\tt >>> {\bf foo()}}
\put(60,376){\tt Traceback (most recent call last):}
\put(70,364){\tt   File "<stdin>", line 1, in ?}
\put(60,352){\tt SegFault: [ C stack trace ]}
\put(60,340){\tt ...}

\put(55,392){\line(-1,0){25}}
\put(30,392){\line(0,-1){80}}
\put(30,312){\line(1,0){95}}
\put(125,312){\vector(0,-1){10}}
\put(175,302){\line(0,1){10}}
\put(175,312){\line(1,0){95}}
\put(270,312){\line(0,1){65}}
\put(270,377){\vector(-1,0){30}}

\put(50,285){\framebox(200,15)[c]{[Python internals]}}
\put(125,285){\vector(0,-1){10}}
\put(175,275){\vector(0,1){10}}
\put(50,260){\framebox(200,15)[c]{call\_builtin()}}
\put(125,260){\vector(0,-1){10}}
%\put(175,250){\vector(0,1){10}}
\put(50,235){\framebox(200,15)[c]{wrap\_foo()}}
\put(125,235){\vector(0,-1){10}}
\put(50,210){\framebox(200,15)[c]{foo()}}
\put(125,210){\vector(0,-1){10}}
\put(50,185){\framebox(200,15)[c]{doh()}}
\put(125,185){\vector(0,-1){20}}
\put(110,148){SIGSEGV}
\put(160,152){\vector(1,0){100}}
\put(260,70){\framebox(200,100){}}
\put(310,155){WAD signal handler}
\put(265,140){1. Unwind C stack}
\put(265,125){2. Gather symbols and debugging info}
\put(265,110){3. Find safe return location}
\put(265,95){4. Raise Python exception}
\put(265,80){5. Modify CPU context and return}

\put(260,185){\framebox(200,15)[c]{return assist}}
\put(365,174){Return from signal}
\put(360,170){\vector(0,1){15}}
\put(360,200){\line(0,1){65}}

%\put(360,70){\line(0,-1){10}}
%\put(360,60){\line(1,0){110}}
%\put(470,60){\line(0,1){130}}
%\put(470,190){\vector(-1,0){10}}

\put(360,265){\vector(-1,0){105}}
\put(255,250){NULL}
\put(255,270){Return to interpreter}

\end{picture}

\caption{Control Flow of the Error Recovery Mechanism for Python}
\label{wad}
\end{figure*}

\section{Returning to the Interpreter}

To return to the interpreter, WAD maintains a table of symbolic names
and return values that correspond to locations within the interpreter responsible for invoking
wrapper functions and object/type methods.  For example, Table 1 shows a partial list of
return locations used in the Python implementation.  When an error
occurs, the call stack is scanned for the first occurrence of any
symbol in this table.  If a match is found, control is returned to that location
by emulating the return of a wrapper function with the error code from the table. If
no match is found, the error handler simply prints a stack trace to
standard output and aborts.
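
The table lookup can be sketched as follows.  This is an illustrative
model, not the WAD source: the type {\tt ReturnPoint} and the function
{\tt find\_return\_point} are hypothetical, the stack is assumed to be
already reduced to an array of symbol names with the innermost frame
first, and only a few of the entries from Table 1 are shown.

\begin{verbatim}
#include <stddef.h>
#include <string.h>

/* A symbol in the interpreter and the value
   that reports an error to its caller. */
typedef struct {
    const char *symbol;
    long        retval;  /* NULL stored as 0 */
} ReturnPoint;

static const ReturnPoint return_points[] = {
    { "call_builtin",           0 },
    { "PyObject_Print",        -1 },
    { "PyObject_CallObject",    0 },
    { "PyObject_Cmp",          -1 },
    { "PyObject_GetAttrString", 0 }
};

/* Scan frames[0..nframes-1], innermost
   first, and return the table entry for the
   first frame whose symbol is listed.
   Return NULL if no frame matches, in which
   case the caller prints a trace and aborts.
   Unknown symbols may be passed as NULL. */
const ReturnPoint *
find_return_point(const char *const *frames,
                  int nframes)
{
    size_t n = sizeof(return_points)
             / sizeof(return_points[0]);
    size_t j;
    int i;

    for (i = 0; i < nframes; i++) {
        if (frames[i] == NULL)
            continue;
        for (j = 0; j < n; j++) {
            if (strcmp(frames[i],
                  return_points[j].symbol) == 0)
                return &return_points[j];
        }
    }
    return NULL;
}
\end{verbatim}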

When a symbolic match is found, WAD invokes a special user-defined
handler function that is written for a specific scripting language.
The primary role of this handler is to take debugging information
gathered from the call stack and generate an appropriate scripting language error.
One peculiar problem of this step is that the generation
of an error may require the use of parameters passed to a
wrapper function.  For example, in the Tcl wrapper shown earlier, one
of the arguments was an object of type ``{\tt Tcl\_Interp *}''.
This object contains information specific to the state of the
interpreter (and multiple interpreter objects may exist in a single
application).   Unfortunately, no reference to the interpreter object is
available in the signal handler.  Furthermore, the interpreter
object may not be available in the context of a function that generated the error.


\begin{table}[t]
\begin{center}
\begin{tabular}{ll}
Python symbol                 &   Return value \\ \hline
call\_builtin                 &   NULL \\
PyObject\_Print               & -1 \\
PyObject\_CallFunction        & NULL \\
PyObject\_CallMethod          & NULL \\
PyObject\_CallObject          & NULL \\
PyObject\_Cmp                 & -1 \\
PyObject\_DelAttrString       & -1 \\
PyObject\_DelItem             & -1 \\
PyObject\_GetAttrString       & NULL \\
\end{tabular}
\end{center}
\caption{A partial list of symbolic return locations in the Python interpreter}
\label{returnpoints}
\end{table}

To work around this problem, WAD implements a feature
known as argument stealing.  When examining the call-stack, the signal
handler has full access to all function arguments and local variables.
Therefore, if the handler knows that an error was generated while
calling a wrapper function (as determined by looking at the symbol names),
it can grab the interpreter object from the stack frame of the wrapper and
use it to set an appropriate error code before returning to the interpreter.
Currently, this is managed by allowing the signal handler to steal
arguments from the caller using positional information.
For example, to grab the {\tt Tcl\_Interp *} object from a Tcl wrapper function,
code similar to the following is written:

\begin{verbatim}
Tcl_Interp *interp;
int         err;

interp = (Tcl_Interp *) wad_steal_outarg(
           stack,
           "TclExecuteByteCode",
           1,
           &err);
if (!err) {
   Tcl_SetResult(interp,errtype,TCL_STATIC);
   Tcl_AddErrorInfo(interp,errdetails);
}
\end{verbatim}

In this case, the 2nd argument passed to a wrapper function
(index 1, counting from zero) is stolen and used to generate an error.  Also, the name {\tt TclExecuteByteCode}
refers to the calling function, not the wrapper function itself.
At this time, argument stealing is only applicable to simple types
such as integers and pointers.  However, this is adequate for generating
scripting language errors.
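
The idea behind positional argument stealing can be modeled as follows.
This sketch is not the implementation of {\tt wad\_steal\_outarg}; it
assumes a hypothetical {\tt Frame} record in which the outgoing argument
words of each frame have already been copied out of the stack, and the
name {\tt model\_steal\_outarg} is invented for illustration.  The
arguments that a caller passes out are the arguments that the wrapper
receives, which is why the lookup is keyed on the name of the caller.

\begin{verbatim}
#include <string.h>

#define MAX_OUTARGS 6

/* Hypothetical record for one stack frame:
   its symbol and the pointer-sized argument
   words it passed to the function it
   called. */
typedef struct {
    const char *symbol;
    int         nargs;
    long        outargs[MAX_OUTARGS];
} Frame;

/* Return outgoing argument n (counting from
   zero) of the first frame named caller.
   Set *err to 0 on success and to 1 if the
   frame or the argument does not exist. */
long
model_steal_outarg(const Frame *stack,
                   int nframes,
                   const char *caller,
                   int n, int *err)
{
    int i;

    *err = 1;
    for (i = 0; i < nframes; i++) {
        if (strcmp(stack[i].symbol, caller))
            continue;
        if (n < 0 || n >= stack[i].nargs)
            return 0;
        *err = 0;
        return stack[i].outargs[n];
    }
    return 0;
}
\end{verbatim}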

\section{Register Management}

A final issue concerning the return mechanism has to do with the
precise behavior of the non-local return to the interpreter.  Roughly
speaking, this emulates the behavior of the C {\tt longjmp}
library call.  However, this is done without the use of a matching
{\tt setjmp} in the interpreter.

The primary problem with aborting execution and returning to the
interpreter in this manner is that most compilers use a register management technique
known as callee-save \cite{prag}.  In this case, it is the responsibility of
the called function to save the state of the registers and to restore
them before returning to the caller. By making a non-local jump,
registers may be left in an inconsistent state due to the fact that
they are not restored to their original values.  The {\tt longjmp} function
in the C library avoids this problem by relying upon {\tt setjmp} to save
the registers.  Unfortunately, WAD does not have this
luxury.  As a result, a return from the signal handler may produce a
corrupted set of registers at the point of return in the interpreter.

The severity of this problem depends greatly on the architecture and
compiler.  For example, on the SPARC, register windows effectively
solve the callee-save problem \cite{sparc}.  In this case, each stack frame has its own
register window and the windows are flushed to the stack whenever a
signal occurs.  Therefore, the recovery mechanism can examine the stack and
arrange to restore the registers to their proper values when control
is returned.  Furthermore, certain conventions of the SPARC ABI resolve several related
issues. For example, floating point registers are caller-saved
and the contents of the SPARC global registers are not guaranteed to be preserved
across procedure calls (in fact, they are not even saved by {\tt setjmp}).

On other platforms, the problem of register management becomes much
more interesting.  One approach is to simply ignore the problem
altogether and return to the interpreter with the registers in an
essentially random state.  Surprisingly, this approach actually seems to work (although a considerable degree of
caution might be in order).
This is because the return of an error code tends to trigger
a cascade of procedure returns within the implementation of the interpreter.
As a result, the values of the registers are simply discarded and
overwritten with restored values as the interpreter unwinds itself and prepares to handle an
exception.  A better solution to this problem is to modify the recovery mechanism to discover and
restore saved registers from the stack.  Unfortunately, there is
no standardized way to know exactly where the registers might have been saved.
Therefore, a heuristic scheme that examines the machine code for each procedure would
have to be used to try and identify stack locations. This approach is used by gdb
and other debuggers when they allow users to inspect register values
within arbitrary stack frames \cite{gdb}.  However, this technique has
not yet been implemented in WAD due to its obvious implementation difficulty and the
fact that the WAD prototype has primarily been developed for the SPARC.

As a fall-back, WAD can be configured to return control to a location
previously specified with {\tt setjmp}.  Unfortunately, this
requires modifications to either the interpreter or its extension modules.
Although this kind of instrumentation can be facilitated by automatic
wrapper code generators, it is not a preferred solution and is
not discussed further.
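
For reference, the general shape of such a guard is sketched below
using the POSIX {\tt sigsetjmp} and {\tt siglongjmp} calls.  This is an
assumption about how the fallback could be arranged, not a description
of the WAD code; the names {\tt call\_guarded}, {\tt demo\_ok}, and
{\tt demo\_crash} are hypothetical, and the demonstration raises
{\tt SIGSEGV} explicitly rather than through an invalid memory access.

\begin{verbatim}
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf recover_point;
static volatile sig_atomic_t guard_active;
static volatile sig_atomic_t caught;

static void
on_fatal(int sig)
{
    if (guard_active) {
        caught = sig;
        siglongjmp(recover_point, 1);
    }
    /* No recovery point: die as usual. */
    signal(sig, SIG_DFL);
    raise(sig);
}

/* Call fn() under a guard.  Return 0 if it
   completes, the signal number if it is
   interrupted by SIGSEGV, and -1 if the
   handler cannot be installed. */
int
call_guarded(void (*fn)(void))
{
    struct sigaction sa, old;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_fatal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) < 0)
        return -1;

    caught = 0;
    if (sigsetjmp(recover_point, 1) == 0) {
        guard_active = 1;
        fn();
    }
    guard_active = 0;
    sigaction(SIGSEGV, &old, NULL);
    return caught;
}

/* Demonstration functions. */
void demo_ok(void)    { }
void demo_crash(void) { raise(SIGSEGV); }
\end{verbatim}

\noindent
A wrapper would call {\tt call\_guarded} around the extension function
and convert a nonzero result into a scripting language error.  The jump
does not unwind intermediate frames, so resources held by the
interrupted function are not released.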

\section{Implementation Details}

Currently, WAD is implemented in ANSI C and a small amount of assembly
code to assist in the return to the interpreter.  The current
implementation supports Python, Tcl, and Perl extensions on SPARC Solaris.  An
i386-Linux port has also been developed.  The entire implementation contains
approximately 1500 semicolons and most of this code is related to the gathering of debugging
information.  Furthermore, due to the hostile environment in which the
recovery process must run, the implementation takes great care not to utilize the
process heap.  This allows the signal handler to collect information in situations
where the heap allocator has been corrupted or destroyed in some manner.
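
One way to avoid the heap, shown here only as a sketch, is to carve
storage out of a fixed static buffer.  This is an assumed scheme for
illustration; the names {\tt wad\_alloc} and {\tt wad\_release} and the
64~KB size are hypothetical.

\begin{verbatim}
#include <stddef.h>

#define WAD_ARENA_SIZE 65536

/* Static storage, so it remains usable when
   malloc() is broken.  The union forces
   scalar alignment of the buffer. */
static union {
    char        bytes[WAD_ARENA_SIZE];
    long double ld;
    void       *p;
} arena;
static size_t arena_used;

/* Hand out n bytes, rounded up to a
   multiple of 16.  Return NULL when the
   arena cannot satisfy the request. */
void *
wad_alloc(size_t n)
{
    void *p;

    if (n > WAD_ARENA_SIZE)
        return NULL;
    n = (n + 15) & ~(size_t) 15;
    if (n > WAD_ARENA_SIZE - arena_used)
        return NULL;
    p = arena.bytes + arena_used;
    arena_used += n;
    return p;
}

/* Discard everything at once, e.g. after
   the error has been reported. */
void
wad_release(void)
{
    arena_used = 0;
}
\end{verbatim}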

Although there are libraries such as the GNU Binary File Descriptor
(BFD) library that can assist with the manipulation of object files,
these are not used in the implementation \cite{bfd}.  First, these
libraries tend to be quite large and are oriented more towards
stand-alone tools such as debuggers, linkers, and loaders.  Second,
the behavior of these libraries with respect to memory management
would need to be carefully studied before they could be safely used in
an embedded environment. Finally, given the small size of the
implementation, it didn't seem necessary to rely upon such a
heavyweight solution.

\section{Discussion}

The primary goal of embedded error recovery is to provide an
alternative approach for debugging scripting language extensions.
Although this approach has many benefits, there are a number of
drawbacks and issues that must be discussed.

First, like the C {\tt longjmp} function, the error recovery mechanism
does not cleanly unwind the call stack.  For C++, this means that
objects allocated on the stack will not be finalized (destructors will not
be invoked) and that memory allocated on the heap may be
leaked. Similarly, open files, sockets, and other
system resources may never be released. Furthermore, in a multi-threaded environment,
deadlock may occur if a procedure holds a lock when an error occurs.
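
The effect can be reproduced in a few lines of plain C.  The sketch below
is only an analogy for the recovery mechanism: it uses {\tt setjmp} and
{\tt longjmp} directly, and the names {\tt recover\_point}, {\tt held},
{\tt faulty}, {\tt extension\_call}, and {\tt run\_and\_recover} are
hypothetical.  The counter {\tt held} stands in for an open file, a
heap allocation, or a lock.

\begin{verbatim}
#include <setjmp.h>

/* Hypothetical names for illustration. */
static jmp_buf recover_point;
/* Stands in for open files or locks. */
static int held = 0;

static void faulty(void) {
  /* Simulates the jump taken on a fault. */
  longjmp(recover_point, 1);
}

static void extension_call(void) {
  held++;     /* acquire a resource */
  faulty();   /* control never returns */
  held--;     /* release is skipped */
}

/* Returns how many resources stay held. */
int run_and_recover(void) {
  held = 0;
  if (setjmp(recover_point) == 0)
    extension_call();
  return held;
}
\end{verbatim}

Here {\tt run\_and\_recover} returns 1 rather than 0, because the release
step in {\tt extension\_call} is never reached once control jumps back to
the recovery point.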

Second, the use of signals may interact adversely with both scripting
language signal handling and signal handling in thread libraries.
Since scripting languages ordinarily do not catch signals such as
SIGSEGV, SIGBUS, and SIGABRT, the use of WAD is unlikely to conflict
with any existing signal handling. However, this does not prevent a
module from overriding the error recovery mechanism with its own
signal handler. Threads present a different sort of signal handling problem
because thread libraries tend to override default signal handling \cite{thread}.
In this case, the thread library directs fatal signals to the thread in which the problem occurred.
However, first-hand experience has shown that certain implementations
of user threads do not reliably pass signal context information nor do
they universally support advanced signal operations such as {\tt
sigaltstack}.  Because of this, the WAD recovery mechanism may not be
compatible with a crippled implementation of user threads on certain
platforms.  To further complicate matters, the recovery process itself is
not thread-safe (i.e., it is not possible to concurrently handle fatal errors
occurring in different threads).
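
Whether a given platform or thread library provides the two features
mentioned above can be checked with a small probe.  The following is a
minimal sketch, assuming a POSIX system; the names {\tt alt\_stack},
{\tt probe}, and {\tt probe\_signal\_support} and the 64 KB stack size
are assumptions made for illustration and are not part of WAD.  The
probe installs an alternate stack, sends itself a harmless signal, and
records whether the handler received context information and actually
ran on the alternate stack.

\begin{verbatim}
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names; size is a guess. */
static char alt_stack[65536];
static volatile sig_atomic_t on_alt;
static volatile sig_atomic_t got_ctx;

static void probe(int sig, siginfo_t *si,
                  void *ctx) {
  char local = 0;
  uintptr_t p = (uintptr_t)&local;
  uintptr_t lo = (uintptr_t)alt_stack;
  (void)sig;
  /* Is this frame on the alternate stack? */
  on_alt = p >= lo &&
           p < lo + sizeof alt_stack;
  got_ctx = si != NULL && ctx != NULL;
}

/* 1: both features work, 0: one is
   missing, -1: setup failed. */
int probe_signal_support(void) {
  stack_t ss;
  struct sigaction sa, old;
  memset(&ss, 0, sizeof ss);
  ss.ss_sp = alt_stack;
  ss.ss_size = sizeof alt_stack;
  if (sigaltstack(&ss, NULL) != 0)
    return -1;
  memset(&sa, 0, sizeof sa);
  sa.sa_sigaction = probe;
  sa.sa_flags = SA_SIGINFO | SA_ONSTACK;
  sigemptyset(&sa.sa_mask);
  if (sigaction(SIGUSR1, &sa, &old) != 0)
    return -1;
  raise(SIGUSR1);
  sigaction(SIGUSR1, &old, NULL);
  return on_alt && got_ctx;
}
\end{verbatim}

A result other than 1 would suggest that the signal implementation in use
is one of those on which a recovery scheme of this kind cannot rely.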

Third, certain types of errors may result in an unrecoverable crash.
For example, if an application overwrites the heap, it may destroy
critical data structures within the interpreter.
Similarly,
destruction of the call stack (via buffer overflow) makes it
impossible for the recovery mechanism to create a stack-trace and
return to the interpreter.  Although it might be possible to add a heuristic scheme, such as
backward stack tracing, for recovering a partial stack trace, no such feature has been implemented
\cite{debug}.  In addition, memory management problems such as
double-freeing of heap allocated memory can cause a system to fail in
a way that bears little resemblance to the actual source of the
problem.

Finally, there are a number of issues that pertain
to the interaction of the recovery mechanism with the interpreter.
First, the recovery scheme is unable to return to procedures
that might invoke wrapper functions with conflicting return codes.
This problem manifests itself when the interpreter's virtual
machine is built around a large {\tt switch} statement from which different
types of wrapper functions are called.  For example, in Python, certain
internal procedures call a mix of functions where both NULL and -1 are
returned to indicate errors (depending on the function).  In this case, there
is no way for WAD to easily determine which return value to use.  Second,
the recovery process is extremely inefficient.  This is because the
data collection process relies heavily upon {\tt mmap}, file I/O, and linear search
algorithms for finding symbols and debugging information.  Therefore, it would
probably not be suitable as a general purpose exception handling mechanism.
Finally, even when an error is successfully returned to the interpreter
and presented to the user, it may not be possible to resume execution of
the application (e.g., even though the interpreter is operational, the extension
module may be corrupted in some manner).
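
The first of these issues, conflicting return codes, can be pictured as a
lookup problem.  The sketch below assumes, purely for illustration, that
the recovery mechanism keeps a table mapping each procedure it may return
to onto a single error value; the structure {\tt ReturnPoint}, the table
{\tt pts}, the function {\tt find\_error\_value}, and the procedure names
in the table are all hypothetical and do not correspond to real
interpreter symbols.

\begin{verbatim}
#include <string.h>

/* Hypothetical table of return points. */
struct ReturnPoint {
  const char *name;  /* caller symbol */
  int has_value;     /* single code? */
  long error_value;  /* value to return */
};

static const struct ReturnPoint pts[] = {
  /* calls only NULL-returning wrappers */
  { "call_ptr_wrapper", 1, 0 },
  /* calls only -1-returning wrappers */
  { "call_int_wrapper", 1, -1 },
  /* mixes both: no single safe value */
  { "mixed_dispatch", 0, 0 },
};

/* Returns 1 and sets *out if recovery can
   return to the named procedure. */
int find_error_value(const char *name,
                     long *out) {
  size_t i, n = sizeof pts / sizeof pts[0];
  for (i = 0; i < n; i++) {
    if (strcmp(pts[i].name, name) == 0) {
      if (!pts[i].has_value) return 0;
      *out = pts[i].error_value;
      return 1;
    }
  }
  return 0;  /* unknown procedure */
}
\end{verbatim}

A procedure such as the hypothetical {\tt mixed\_dispatch} cannot be given
a usable entry, since the correct value depends on which wrapper was
executing when the fault occurred, and that information is not available
from the procedure name alone.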

Despite these limitations, embedded error recovery is applicable to a
wide range of extension-related errors.  This is because errors such as
failed assertions, bus errors, and floating point exceptions rarely
result in a situation where the recovery process would be unable to run or the
interpreter would crash. Furthermore, more serious errors such as segmentation faults are more
likely to be caused by an uninitialized pointer than by a blatant
destruction of the heap or stack.

\section{Related Work}

A huge body of literature is devoted to the topic of exception
handling in various languages and systems.  Furthermore, the topic
remains one of active interest in the software community. For
instance, IEEE Transactions on Software Engineering recently devoted
two entire issues to current trends in exception handling
\cite{except1,except2}.  Unfortunately, very little of this work seems
to be directly related to mixed compiled-interpreted exception
handling, recovery from fatal signals, and problems pertaining to
mixed-language debugging.

Perhaps the most directly relevant work is that of advanced programming
environments for Common Lisp \cite{lisp}.  Not only does Common Lisp have a foreign function interface,
but debuggers such as gdb have also been modified to walk the Lisp stack
\cite{ffi,wcl}.  Furthermore, certain Lisp development environments have
provided a high degree of integration between compiled code and
the Lisp interpreter.\footnote{Note to program committee: I
have been unable to find a suitable reference describing this capability.  However,
discussions with Richard Gabriel and other people in the Lisp community seem to indicate that
such work has been done. Please advise.}

In certain cases, a scripting language module has been used to provide
partial information for fatal signals. For example, the Perl {\tt
sigtrap} module can be used to produce a Perl stack trace when a
problem occurs \cite{perl}.  Unfortunately, this module does not
provide any information from the C stack.   Similarly, advanced software development
environments such as Microsoft's Visual Studio can automatically launch a C/C++
debugger when an error occurs.  Unfortunately, this does not provide any information
about the script that was running.

In the area of programming languages, a number of efforts have been made to
map signals to exceptions in the form of asynchronous exception handling
\cite{buhr,ml,haskell}.  Unfortunately, this work tends to
concentrate on the problem of handling asynchronous signals related to I/O as opposed
to synchronously generated signals caused by software faults.

With respect to debugging, little work appears to have been done in the area of
mixed compiled-interpreted debugging.   Although modern debuggers
certainly try to provide advanced capabilities for debugging within a
single language, they tend to ignore the boundary between languages.
As previously mentioned, debuggers have occasionally been modified to
support other languages such as Common Lisp \cite{wcl}.  However, no such work appears
to have been done in the context of modern scripting languages.  One system of possible interest
in the context of mixed compiled-interpreted debugging is the R$^{n}$
system developed at Rice University in the mid-1980s \cite{carle}.  This
system, primarily developed for scientific computing, allowed control
to transparently pass between compiled code and an interpreter.
Furthermore, the system allowed dynamic patching of an executable in
which compiled procedures could be replaced by interpreted
versions.  Although this system does not directly pertain to the problem of
debugging scripting language extensions, it is one of the few
examples of a system in which compiled and interpreted code have been
tightly integrated within a debugger.

\section{Future Directions}

As of this writing, WAD is only an experimental prototype.  Because of
this, there are certainly a wide variety of incremental improvements
that could be made to support additional platforms and scripting
languages. In addition, there are a variety of improvements that could be made
to provide better integration with threads and C++.

A more interesting extension of this work would be to expose a broader
range of debugging capabilities to the scripting interpreter.  For example,
rather than simply raising an exception with limited diagnostic
information, the recovery mechanism might be able to provide the
interpreter with a detailed snapshot of the entire call stack
including symbolic debugging information.  Using this information, it
might be possible to implement an interactive post-mortem debugger
that allows a programmer to inspect the values of local
variables and other aspects of the application without leaving the
interpreter.  Alternatively, it may be possible to integrate this information
into an existing script-level debugger.

\section{Conclusions and Availability}

This paper has presented a mechanism by which fatal errors such as
segmentation faults and failed assertions can be handled as scripting
language exceptions.  This approach, which relies upon advanced
features of Unix signal handling, allows fatal signals to be caught
and transformed into errors from which interpreters can produce an
informative cross-language stack trace. In doing so, it provides more
seamless integration between scripting languages and compiled
extensions.  Furthermore, this has the potential to greatly simplify the
frustrating task of debugging complicated mixed scripted-compiled
software.

The prototype implementation of this system is available at:

\begin{center}
{\tt http://systems.cs.uchicago.edu/wad}.
\end{center}

\noindent
Currently, WAD supports Python,
Tcl, and Perl on SPARC Solaris and i386-Linux systems.  Work to
support additional scripting languages and platforms is ongoing.

\section{Acknowledgments}

Richard Gabriel and Harlan Sexton provided interesting insights concerning similar capabilities
in Common Lisp.

\begin{thebibliography}{99}


\bibitem{ousterhout} J. K. Ousterhout, {\em Tcl: An Embeddable Command Language},
Proceedings of the USENIX Association Winter Conference, 1990.

\bibitem{ouster1} J. K. Ousterhout, {\em Scripting: Higher-Level Programming for the 21st Century},
IEEE Computer, Vol 31, No. 3, p. 23-30, 1998.

\bibitem{perl} L. Wall, T. Christiansen, and R. Schwartz, {\em Programming Perl}, 2nd. Ed.
O'Reilly \& Associates, 1996.

\bibitem{python} M. Lutz, {\em Programming Python}, O'Reilly \& Associates, 1996.

\bibitem{guile} Thomas Lord, {\em An Anatomy of Guile, The Interface to
Tcl/Tk}, USENIX 3rd Annual Tcl/Tk Workshop 1995.

\bibitem{php} T. Ratschiller and T. Gerken, {\em Web Application Development with PHP 4.0},
New Riders, 2000.

\bibitem{ruby} D. Thomas, A. Hunt, {\em Programming Ruby}, Addison-Wesley, 2001.

\bibitem{swig} D.M. Beazley, {\em SWIG: An Easy to Use Tool for Integrating Scripting Languages with C and C++}, Proceedings of the 4th USENIX Tcl/Tk Workshop, p. 129-139, July 1996.

\bibitem{sip} P. Thompson, {\em SIP},\\
{\tt http://www.thekompany.com/projects/pykde}.

\bibitem{pyfort} P.~F.~Dubois, {\em Climate Data Analysis Software}, 8th International Python Conference,
Arlington, VA., 2000.

\bibitem{f2py} P. Peterson, J. Martins, and J. Alonso,
{\em Fortran to Python Interface Generator with an application to Aerospace
Engineering}, 9th International Python Conference, submitted, 2000.

\bibitem{advperl} S. Srinivasan, {\em Advanced Perl Programming}, O'Reilly \& Associates, 1997.

\bibitem{heidrich} Wolfgang Heidrich and Philipp Slusallek, {\em Automatic Generation of Tcl Bindings for C and C++ Libraries.},
USENIX 3rd Tcl/Tk Workshop, 1995.

\bibitem{vtk} K. Martin, {\em Automated Wrapping of a C++ Class Library into Tcl},
USENIX 4th Tcl/Tk Workshop, p. 141-148, 1996.

\bibitem{gwrap} C. Lee, {\em G-Wrap: A tool for exporting C libraries into Scheme Interpreters},\\
{\tt http://www.cs.cmu.edu/\~{ }chrislee/
Software/g-wrap}.

\bibitem{wrappy} G. Couch, C. Huang, and T. Ferrin, {\em Wrappy: A Python Wrapper
Generator for C++ Classes}, O'Reilly Open Source Software Convention, 1999.

\bibitem{gdb} R. Stallman and R. Pesch, {\em Using GDB: A Guide to the GNU Source-Level Debugger}.
Free Software Foundation and Cygnus Support, Cambridge, MA, 1991.

\bibitem{swigexcept} D.M. Beazley and P.S. Lomdahl, {\em Feeding a
Large-scale Physics Application to Python}, 6th International Python
Conference, co-sponsored by USENIX, p. 21-28, 1997.

\bibitem{stevens} W. Richard Stevens, {\em UNIX Network Programming: Interprocess Communication, Volume 2}. PTR
Prentice-Hall, 1998.

\bibitem{proc} R. Faulkner and R. Gomes, {\em The Process File System and Process Model in UNIX System V}, USENIX Conference Proceedings,
January 1991.

\bibitem{elf} J.~R.~Levine, {\em Linkers \& Loaders.} Morgan Kaufmann Publishers, 2000.

\bibitem{stabs} Free Software Foundation, {\em The ``stabs'' debugging format}. GNU info document.

\bibitem{prag} M.L. Scott. {\em Programming Language Pragmatics}, Morgan Kaufmann Publishers, 2000.

\bibitem{sparc} D. Weaver and T. Germond, {\em SPARC Architecture Manual Version 9},
Prentice-Hall, 1993.

\bibitem{bfd} S. Chamberlain. {\em libbfd: The Binary File Descriptor Library}. Cygnus Support, bfd version 3.0 edition, April 1991.

\bibitem{thread} F. Mueller, {\em A Library Implementation of POSIX Threads Under Unix},
USENIX Winter Technical Conference, San Diego, CA., p. 29-42, 1993.

\bibitem{debug} J. B. Rosenberg, {\em How Debuggers Work: Algorithms, Data Structures, and
Architecture}, John Wiley \& Sons, 1996.

\bibitem{except1} D.E. Perry, A. Romanovsky, and A. Tripathi, {\em
Current Trends in Exception Handling-Part I},
IEEE Transactions on Software Engineering, Vol 26, No. 9, p. 817-819, 2000.

\bibitem{except2} D.E. Perry, A. Romanovsky, and A. Tripathi, {\em
Current Trends in Exception Handling-Part II},
IEEE Transactions on Software Engineering, Vol 26, No. 10, p. 921-922, 2000.


\bibitem{lisp} G.L. Steele Jr., {\em Common Lisp: The Language, Second Edition}, Digital Press,
Bedford, MA. 1990.

\bibitem{ffi} H. Sexton, {\em Foreign Functions and Common Lisp}, in Lisp Pointers, Vol 1, No. 5, 1988.

\bibitem{wcl} W. Hennessey, {\em WCL: Delivering Efficient Common Lisp Applications Under Unix},
ACM Conference on Lisp and Functional Programming, p. 260-269, 1992.

\bibitem{buhr} P.A. Buhr and W.Y.R. Mok, {\em Advanced Exception Handling Mechanisms}, IEEE Transactions on Software Engineering,
Vol. 26, No. 9, p. 820-836, 2000.

\bibitem{haskell} S. Marlow, S. P. Jones, and A. Moran. {\em
Asynchronous Exceptions in Haskell.} In 4th International Workshop on
High-Level Concurrent Languages, September 2000.

\bibitem{ml} J. H. Reppy, {\em Asynchronous Signals in Standard ML}. Technical Report TR90-1144,
Cornell University, Computer Science Department, 1990.

\bibitem{carle} A. Carle, D. Cooper, R. Hood, K. Kennedy, L. Torczon, S. Warren,
{\em A Practical Environment for Scientific Programming.}
IEEE Computer, Vol 20, No. 11, p. 75-89, 1987.


\end{thebibliography}

\end{document}