Final changes.
git-svn-id: https://swig.svn.sourceforge.net/svnroot/swig/trunk/SWIG@1082 626c5289-ae23-0410-ae9c-e8d60b6d4f22
This commit is contained in:
parent
ed3dd0ea76
commit
d7953e1eb9
1 changed files with 90 additions and 96 deletions
|
|
@ -255,7 +255,7 @@ SegFault: [ C stack trace ]
|
|||
}
|
||||
\end{verbatim}
|
||||
}
|
||||
\caption{Cross language traceback generated for a segmentation fault in a Python extension}
|
||||
\caption{Cross language traceback generated by WAD for a segmentation fault in a Python extension}
|
||||
\end{figure*}
|
||||
|
||||
The current state of the art in extension debugging is to simply add
|
||||
|
|
@ -313,7 +313,7 @@ supply a stack trace as opposed to a vague complaint that the program
|
|||
\begin{picture}(400,250)(0,0)
|
||||
\put(50,-110){\special{psfile = tcl.ps hscale = 60 vscale = 60}}
|
||||
\end{picture}
|
||||
\caption{Dialogue box with traceback information for a failed assertion in a Tcl/Tk extension}
|
||||
\caption{Dialog box with traceback information for a failed assertion in a Tcl/Tk extension}
|
||||
\end{figure*}
|
||||
|
||||
\section{Scripting Language Internals}
|
||||
|
|
@ -323,7 +323,7 @@ scripting language interpreters interface with extension code. Despite the wide
|
|||
of scripting languages, essentially every implementation uses a similar
|
||||
technique for accessing foreign code.
|
||||
|
||||
The most widely used extension mechanism is a foreign function
|
||||
Virtually all scripting languages provide an extension mechanism in the form of a foreign function
|
||||
interface in which compiled procedures can be called from the scripting language
|
||||
interpreter. This is accomplished by writing a collection of wrapper functions that conform
|
||||
to a specified calling convention. The primary purpose of the wrappers is to
|
||||
|
|
@ -354,12 +354,14 @@ wrap_foo(ClientData clientData,
|
|||
}
|
||||
\end{verbatim}
|
||||
|
||||
The other extension mechanism is an object/type interface that allows programmers to create new
|
||||
Another common extension mechanism is an object/type interface that allows programmers to create new
|
||||
kinds of fundamental types or attach special properties to objects in
|
||||
the interpreter. This usually involves setting up tables of function
|
||||
the interpreter. For example, both Tcl and Python provide an API for creating new
|
||||
``built-in'' objects that behave like numbers, strings, lists, etc.
|
||||
In most cases, this involves setting up tables of function
|
||||
pointers that define various properties of an object. For example, if
|
||||
you wanted to add complex numbers to an interpreter, you might fill in a special
|
||||
data structure with pointers to various methods like this:
|
||||
data structure with pointers to methods that implement various numerical operations like this:
|
||||
|
||||
\begin{verbatim}
|
||||
NumberMethods ComplexMethods {
|
||||
|
|
@ -544,7 +546,7 @@ it is now described in greater detail.
|
|||
\section{Returning to the Interpreter}
|
||||
|
||||
To return to the interpreter, WAD maintains a table of symbolic names
|
||||
and return values that correspond to locations within the interpreter
|
||||
that correspond to locations within the interpreter
|
||||
responsible for invoking wrapper functions and object/type methods.
|
||||
For example, Table 1 shows a partial list of return locations used in
|
||||
the Python implementation. When an error occurs, the call stack is
|
||||
|
|
@ -571,7 +573,7 @@ the context of a function that generated the error.
|
|||
\begin{table}[t]
|
||||
\begin{center}
|
||||
\begin{tabular}{ll}
|
||||
Python symbol & Return value \\ \hline
|
||||
Python symbol & Error return value \\ \hline
|
||||
call\_builtin & NULL \\
|
||||
PyObject\_Print & -1 \\
|
||||
PyObject\_CallFunction & NULL \\
|
||||
|
|
@ -755,13 +757,16 @@ be added to an extension module to make it work. In addition, due to
|
|||
the way in which the loader resolves and initializes libraries, the
|
||||
initialization of WAD is guaranteed to execute before any of the code
|
||||
in the extension module to which it has been linked. The primary
|
||||
downside to this approach is that WAD shared object file can not be
|
||||
linked directly to an interpreter (since its initialization would
|
||||
occur before any code in the interpreter started and the
|
||||
initialization of WAD may require the interpreter to be active).
|
||||
However, such limitations would be easy to fix by simply relinking
|
||||
WAD without the C++ initializer and placing an initialization call
|
||||
within the interpreter startup code.
|
||||
downside to this approach is that the WAD shared object file cannot be
|
||||
linked directly to an interpreter. This is because WAD sometimes needs to call the
|
||||
interpreter to properly initialize its exception handling mechanism (for instance, in Python,
|
||||
four new types of exceptions are added to the interpreter). Clearly this type of initialization
|
||||
is impossible if WAD is linked directly to an interpreter as
|
||||
its initialization process would execute before the main program of the
|
||||
interpreter started. However,
|
||||
if you wanted to permanently add WAD to an interpreter, the problem is easily
|
||||
corrected by first removing the C++ initializer from WAD and then replacing it with an
|
||||
initialization call someplace within the interpreter's startup function.
|
||||
|
||||
\section{Exception Objects}
|
||||
|
||||
|
|
@ -859,52 +864,6 @@ difficulties of accurately recovering register values).
|
|||
\caption{Cross-language debugging session in Python where user is walking up the call stack.}
|
||||
\end{figure*}
|
||||
|
||||
\section{Failure Modes and Debugging}
|
||||
\label{failure}
|
||||
|
||||
Since WAD lives in the same process as the faulting application, it
|
||||
must operate in a potentially hostile environment where significant
|
||||
parts of the application may be broken or corrupted. Moreover, WAD
|
||||
itself may fail while collecting information or trying to recover from
|
||||
a catastrophic error. WAD is also much more limited than a standard
|
||||
debugger in that it does not support common features such as
|
||||
breakpointing, single step execution, or a full range of data
|
||||
inspection. Thus, a common question to ask is to what extent does WAD
|
||||
complicate debugging when it doesn't work.
|
||||
|
||||
To handle potential problems in the implementation of WAD itself,
|
||||
great care is taken to avoid the use of library functions and
|
||||
functions that rely on heap allocation (malloc, free, etc.). For
|
||||
instance, to provide dynamic memory allocation, WAD implements its own
|
||||
memory allocator using mmap. In addition, signals are disabled
|
||||
immediately upon entry to the WAD signal handler. Should a fatal
|
||||
error occur inside WAD, the application will dump core and exit. Since
|
||||
the resulting core file contains the stack trace of both WAD and the
|
||||
faulting application, a traditional C debugger can be used to identify
|
||||
the problem as before. The only difference is that a few additional
|
||||
stack frames will be added to the traceback.
|
||||
|
||||
In some situations, an application might fail after the WAD signal
|
||||
handler has completed execution. For instance, memory or stack frames
|
||||
within the interpreter might be corrupted in a way that prevents
|
||||
exception handling from operating correctly. In this case, the
|
||||
application will fail in a manner that does not represent the original
|
||||
programming error. This might also cause the WAD signal handler to be
|
||||
reinvoked with a different process state--causing it to report
|
||||
information about a different type of failure. To address these kinds
|
||||
of problems, WAD attempts to create a tracefile {\tt wadtrace} in the
|
||||
current working directory that contains information about each error
|
||||
that it has handled. If no recovery was possible, a programmer can
|
||||
look at this file to obtain all of the stack traces that were generated
|
||||
by WAD.
|
||||
|
||||
Finally, if an application is experiencing a very serious problem, WAD
|
||||
does not prevent a standard debugger from being attached to the
|
||||
process. This is because the debugger overrides the current signal
|
||||
handling so that it can catch fatal errors. As a result, even if
|
||||
WAD is loaded, fatal signals are simply redirected to the
|
||||
attached debugger.
|
||||
|
||||
\section{Implementation Details}
|
||||
|
||||
Currently, WAD is implemented in ANSI C and a small amount of assembly
|
||||
|
|
@ -922,7 +881,7 @@ specific to a particular scripting language (170 semicolons for Python
|
|||
and 50 semicolons for Tcl).
|
||||
|
||||
Although there are libraries such as the GNU Binary File Descriptor
|
||||
(BFD) library that can assist with the manipulation of object files
|
||||
(BFD) library that can assist with the manipulation of object files,
|
||||
these are not used in the implementation \cite{bfd}. These
|
||||
libraries tend to be quite large and are oriented more towards
|
||||
stand-alone tools such as debuggers, linkers, and loaders. In addition,
|
||||
|
|
@ -1006,13 +965,14 @@ leaked. Similarly, this could result in open files, sockets, and other
|
|||
system resources. In a multi-threaded environment,
|
||||
deadlock may occur if a procedure holds a lock when an error occurs.
|
||||
|
||||
The of signals may also interact adversely with both scripting
|
||||
language signal handling.
|
||||
Since scripting languages ordinarily do not catch signals such as
|
||||
In certain cases, the use of signals in WAD may interact adversely with scripting
|
||||
language signal handling. Since scripting languages ordinarily do not catch signals such as
|
||||
SIGSEGV, SIGBUS, and SIGABRT, the use of WAD is unlikely to conflict
|
||||
with any existing signal handling. However, this does not prevent a
|
||||
module from overriding the error recovery mechanism with its own
|
||||
signal handler.
|
||||
with any existing signal handling. However, most scripting languages would not
|
||||
prevent a user from disabling the WAD error recovery mechanism by
|
||||
simply specifying a new handler for one or more of these signals. In addition, the use of
|
||||
certain extensions such as the Perl sigtrap module would completely
|
||||
disable WAD \cite{perl}.
|
||||
|
||||
A more difficult signal handling problem arises when thread libraries
|
||||
are used. These libraries tend to override default signal handling
|
||||
|
|
@ -1021,12 +981,13 @@ thread \cite{thread}. In general, asynchronous signals can be
|
|||
delivered to any thread within a process. However, this does not
|
||||
appear to be a problem for WAD since hardware exceptions are delivered
|
||||
to a signal handler that runs within the same thread in which the
|
||||
error occurred. Unfortunately, even in this case, it appears that
|
||||
certain implementations of user thread libraries do not reliably pass
|
||||
error occurred. Unfortunately, even in this case, personal experience has
|
||||
shown that certain implementations of user thread libraries (particularly on older versions
|
||||
of Linux) do not reliably pass
|
||||
signal context information nor do they universally support advanced
|
||||
signal operations such as {\tt sigaltstack}. Because of this, WAD may
|
||||
be incompatible with a crippled implementation of user threads on
|
||||
certain platforms.
|
||||
these platforms.
|
||||
|
||||
An even more subtle problem with threads is that the recovery process
|
||||
itself is not thread-safe (i.e., it is not possible to concurrently
|
||||
|
|
@ -1040,24 +1001,57 @@ within the interpreter at once. A consequence of this restriction is
|
|||
that extension functions are not interruptible by thread-switching
|
||||
unless they explicitly release the interpreter lock. Currently, the
|
||||
behavior of WAD is undefined if extension code releases the lock and
|
||||
proceeds to generates a fault. In this case, the recovery process may
|
||||
proceeds to generate a fault. In this case, the recovery process may
|
||||
either cause an exception to be raised in an entirely different
|
||||
thread or cause execution to violate the interpreter's mutual exclusion
|
||||
constraint.
|
||||
constraint.
|
||||
|
||||
In certain cases, errors may result in an unrecoverable crash. For
|
||||
example, if an application overwrites the heap, it may destroy
|
||||
critical data structures within the interpreter. Similarly,
|
||||
destruction of the call stack (via buffer overflow) makes it
|
||||
impossible for the recovery mechanism to create a stack-trace and
|
||||
return to the interpreter. In the future, it might be possible to add
|
||||
a heuristic scheme for recovering a partial stack trace such as
|
||||
backward stack tracing, no such feature has yet been implemented
|
||||
\cite{debug}. Finally, memory management problems such as
|
||||
double-freeing of heap allocated memory can cause a system to fail in
|
||||
a way that bears little resemblance to the actual source of the
|
||||
problem. Section \ref{failure} describes some of the ways in which WAD responds
|
||||
to these kinds of errors.
|
||||
return to the interpreter. More subtle memory management problems
|
||||
such as double-freeing of heap allocated memory can also cause a system
|
||||
to fail in a manner that bears little resemblance to the actual source
|
||||
of the problem. Given that WAD lives in the same process as the
|
||||
faulting application and that such errors may occur, a common
|
||||
question to ask is to what extent does WAD complicate debugging when it
|
||||
doesn't work.
|
||||
|
||||
To handle potential problems in the implementation of WAD itself,
|
||||
great care is taken to avoid the use of library functions and
|
||||
functions that rely on heap allocation (malloc, free, etc.). For
|
||||
instance, to provide dynamic memory allocation, WAD implements its own
|
||||
memory allocator using mmap. In addition, signals are disabled
|
||||
immediately upon entry to the WAD signal handler. Should a fatal
|
||||
error occur inside WAD, the application will dump core and exit. Since
|
||||
the resulting core file contains the stack trace of both WAD and the
|
||||
faulting application, a traditional C debugger can be used to identify
|
||||
the problem as before. The only difference is that a few additional
|
||||
stack frames will appear on the traceback.
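
As a rough illustration of allocating memory without touching the heap, the following sketch shows a bump allocator built on an anonymous {\tt mmap} region. It is not the allocator used in WAD, whose details are not given here; the names {\tt arena\_alloc} and {\tt ARENA\_SIZE}, the fixed 64 KB arena, and the 16-byte rounding are all assumptions made for the example.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON   /* older BSD-style spelling */
#endif

#define ARENA_SIZE ((size_t)64 * 1024)   /* assumed arena size */

static char  *arena_base = NULL;  /* start of the mapped region  */
static size_t arena_used = 0;     /* bytes handed out so far     */

/* Return nbytes of zero-filled memory taken from a private mmap region.
   malloc() is never called, so a corrupted heap does not affect it.
   Memory is never returned to the system in this sketch. */
static void *arena_alloc(size_t nbytes)
{
    void *result;

    if (arena_base == NULL) {
        void *p = mmap(NULL, ARENA_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return NULL;
        arena_base = p;
    }

    if (nbytes > ARENA_SIZE)
        return NULL;
    nbytes = (nbytes + 15u) & ~(size_t)15u;   /* round up to 16 bytes */
    if (nbytes > ARENA_SIZE - arena_used)
        return NULL;                          /* arena exhausted */

    result = arena_base + arena_used;
    arena_used += nbytes;
    return result;
}
```

A bump allocator is chosen here only because it is short and has no free list that could itself be corrupted; a real implementation would likely need to grow the mapping or release memory.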
|
||||
|
||||
An application may also fail after the WAD signal handler has completed
|
||||
execution if memory or stack frames within the interpreter have been
|
||||
corrupted in a way that prevents proper exception handling. In this case, the
|
||||
application may fail in a manner that does not represent the original
|
||||
programming error. It might also cause the WAD signal handler to be
|
||||
immediately reinvoked with a different process state--causing it to
|
||||
report information about a different type of failure. To address
|
||||
these kinds of problems, WAD tries to create a tracefile {\tt
|
||||
wadtrace} in the current working directory that contains information
|
||||
about each error that it has handled. If no recovery was possible, a
|
||||
programmer can look at this file to obtain all of the stack traces
|
||||
that were generated by WAD.
|
||||
|
||||
Finally, if an application is experiencing a very serious problem, WAD
|
||||
does not prevent a standard debugger from being attached to the
|
||||
process. This is because the debugger overrides the current signal
|
||||
handling so that it can catch fatal errors. As a result, even if WAD
|
||||
is loaded, fatal signals are simply redirected to the attached
|
||||
debugger. Such an approach also allows for more complex debugging
|
||||
tasks such as single-step execution, breakpoints, and
|
||||
watchpoints--none of which are easily added to WAD itself.
|
||||
|
||||
%
|
||||
% Add comments about what WAD does in this case?
|
||||
|
|
@ -1065,27 +1059,25 @@ to these kinds of errors.
|
|||
|
||||
Finally, there are a number of issues that pertain
|
||||
to the interaction of the recovery mechanism with the interpreter.
|
||||
First, the recovery scheme is unable to return to procedures
|
||||
For instance, the recovery scheme is unable to return to procedures
|
||||
that might invoke wrapper functions with conflicting return codes.
|
||||
This problem manifests itself when the interpreter's virtual
|
||||
machine is built around a large {\tt switch} statement from which different
|
||||
types of wrapper functions are called. For example, in Python, certain
|
||||
internal procedures call a mix of functions where both NULL and -1 are
|
||||
returned to indicate errors (depending on the function). In this case, there
|
||||
is no way for WAD to easily determine which return value to use. Second,
|
||||
the recovery process is extremely inefficient. This is because the
|
||||
data collection process relies heavily upon {\tt mmap}, file I/O, and linear search
|
||||
algorithms for finding symbols and debugging information. Therefore, WAD would
|
||||
probably not be suitable as a general purpose exception handling mechanism.
|
||||
Finally, even when an error is successfully returned to the interpreter
|
||||
and presented to the user, it may not be possible to resume execution of
|
||||
the application (e.g., even though the interpreter is operational, the extension
|
||||
module may be corrupted in some manner).
|
||||
is no way to specify a proper error return value because there will be
|
||||
conflicting entries in the WAD return table (although you could compromise and
|
||||
return the error value for the most common case). The recovery
|
||||
process is also extremely inefficient due to its heavy reliance on
|
||||
{\tt mmap}, file I/O, and linear search algorithms for finding symbols
|
||||
and debugging information. Therefore, WAD would be
|
||||
unsuitable as a more general purpose extension related exception handler.
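
The return table and its linear search can be sketched in C as follows. The three entries are taken from Table 1; the structure layout, the use of 0 to stand for NULL, and the names {\tt ReturnPoint} and {\tt find\_return\_point} are assumptions made for illustration and do not describe the actual WAD data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry per interpreter procedure that WAD may return to. */
typedef struct {
    const char *symbol;       /* name of the interpreter procedure          */
    long        error_value;  /* value to return on error; 0 stands for NULL */
} ReturnPoint;

/* Entries taken from Table 1 (partial list for Python). */
static const ReturnPoint return_table[] = {
    { "call_builtin",           0 },
    { "PyObject_Print",        -1 },
    { "PyObject_CallFunction",  0 },
};

/* Linear search by symbol name; returns NULL if the symbol is not listed. */
static const ReturnPoint *find_return_point(const char *symbol)
{
    size_t n = sizeof(return_table) / sizeof(return_table[0]);
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(return_table[i].symbol, symbol) == 0)
            return &return_table[i];
    }
    return NULL;
}
```

Because each symbol maps to exactly one error value in this layout, a procedure that calls some wrappers returning NULL and others returning -1 cannot be described by a single entry, which is the conflict noted above.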
|
||||
|
||||
Despite these limitations, embedded error recovery is still a useful
|
||||
capability that can be applied to a wide of extension related errors.
|
||||
This is because errors such as failed assertions, bus errors, and
|
||||
floating point exceptions rarely result in a situation where the
|
||||
capability that can be applied to a wide variety of extension related
|
||||
errors. This is because errors such as failed assertions, bus errors,
|
||||
and floating point exceptions rarely result in a situation where the
|
||||
recovery process would be unable to run or the interpreter would
|
||||
crash. Furthermore, more serious errors such as segmentation faults
|
||||
are more likely to be caused by an uninitialized pointer than a blatant
|
||||
|
|
@ -1161,7 +1153,9 @@ As of this writing, WAD is only an experimental prototype. Because of
|
|||
this, there are certainly a wide variety of incremental improvements
|
||||
that could be made to support additional platforms and scripting
|
||||
languages. In addition, there are a variety of improvements that could be made
|
||||
to provide better integration with threads and C++.
|
||||
to provide better integration with threads and C++. One could also
|
||||
investigate heuristic schemes such as backward stack tracing that might be able
|
||||
to recover partial debugging information from corrupted call stacks \cite{debug}.
|
||||
|
||||
A more interesting extension of this work would be to see how the
|
||||
exception handling approach of WAD could be incorporated with
|
||||
|
|
@ -1274,7 +1268,7 @@ January 1991.
|
|||
|
||||
\bibitem{elf} J.~R.~Levine, {\em Linkers \& Loaders.} Morgan Kaufmann Publishers, 2000.
|
||||
|
||||
\bibitem{stabs} Free Software Foundation, {\em The "stabs" debugging format}. GNU info document.
|
||||
\bibitem{stabs} Free Software Foundation, {\em The ``stabs'' debugging format}. GNU info document.
|
||||
|
||||
\bibitem{prag} M.L. Scott. {\em Programming Language Pragmatics}, Morgan Kaufmann Publishers, 2000.
|
||||
|
||||
|
|
|
|||