From d7953e1eb918c25584b9b16d349f797f1ed7d525 Mon Sep 17 00:00:00 2001 From: Dave Beazley Date: Sun, 15 Apr 2001 04:55:57 +0000 Subject: [PATCH] Final changes. git-svn-id: https://swig.svn.sourceforge.net/svnroot/swig/trunk/SWIG@1082 626c5289-ae23-0410-ae9c-e8d60b6d4f22 --- Tools/WAD/Papers/usenix2001.tex | 186 ++++++++++++++++---------------- 1 file changed, 90 insertions(+), 96 deletions(-) diff --git a/Tools/WAD/Papers/usenix2001.tex b/Tools/WAD/Papers/usenix2001.tex index 90f93328e..92856ef15 100644 --- a/Tools/WAD/Papers/usenix2001.tex +++ b/Tools/WAD/Papers/usenix2001.tex @@ -255,7 +255,7 @@ SegFault: [ C stack trace ] } \end{verbatim} } -\caption{Cross language traceback generated for a segmentation fault in a Python extension} +\caption{Cross language traceback generated by WAD for a segmentation fault in a Python extension} \end{figure*} The current state of the art in extension debugging is to simply add @@ -313,7 +313,7 @@ supply a stack trace as opposed to a vague complaint that the program \begin{picture}(400,250)(0,0) \put(50,-110){\special{psfile = tcl.ps hscale = 60 vscale = 60}} \end{picture} -\caption{Dialogue box with traceback information for a failed assertion in a Tcl/Tk extension} +\caption{Dialog box with traceback information for a failed assertion in a Tcl/Tk extension} \end{figure*} \section{Scripting Language Internals} @@ -323,7 +323,7 @@ scripting language interpreters interface with extension code. Despite the wide of scripting languages, essentially every implementation uses a similar technique for accessing foreign code. -The most widely used extension mechanism is a foreign function +Virtually all scripting languages provide an extension mechanism in the form of a foreign function interface in which compiled procedures can be called from the scripting language interpreter. This is accomplished by writing a collection of wrapper functions that conform to a specified calling convention. 
The primary purpose of the wrappers are to @@ -354,12 +354,14 @@ wrap_foo(ClientData clientData, } \end{verbatim} -The other extension mechanism is an object/type interface that allows programmers to create new +Another common extension mechanism is an object/type interface that allows programmers to create new kinds of fundamental types or attach special properties to objects in -the interpreter. This usually involves setting up tables of function +the interpreter. For example, both Tcl and Python provide an API for creating new +``built-in'' objects that behave like numbers, strings, lists, etc. +In most cases, this involves setting up tables of function pointers that define various properties of an object. For example, if you wanted to add complex numbers to an interpreter, you might fill in a special -data structure with pointers to various methods like this: +data structure with pointers to methods that implement various numerical operations like this: \begin{verbatim} NumberMethods ComplexMethods { @@ -544,7 +546,7 @@ it is now described in greater detail. \section{Returning to the Interpreter} To return to the interpreter, WAD maintains a table of symbolic names -and return values that correspond to locations within the interpreter +that correspond to locations within the interpreter responsible for invoking wrapper functions and object/type methods. For example, Table 1 shows a partial list of return locations used in the Python implementation. When an error occurs, the call stack is @@ -571,7 +573,7 @@ the context of a function that generated the error. \begin{table}[t] \begin{center} \begin{tabular}{ll} -Python symbol & Return value \\ \hline +Python symbol & Error return value \\ \hline call\_builtin & NULL \\ PyObject\_Print & -1 \\ PyObject\_CallFunction & NULL \\ @@ -755,13 +757,16 @@ be added to an extension module to make it work. 
In addition, due to the way in which the loader resolves and initializes libraries, the initialization of WAD is guaranteed to execute before any of the code in the extension module to which it has been linked. The primary -downside to this approach is that WAD shared object file can not be -linked directly to an interpreter (since its initialization would -occur before any code in the interpreter started and the -initialization of WAD may require the interpreter to be active). -However, such limitations would be easy to fix by simply relinking -WAD without the C++ initializer and placing an initialization call -within the interpreter startup code. +downside to this approach is that the WAD shared object file can not be +linked directly to an interpreter. This is because WAD sometimes needs to call the +interpreter to properly initialize its exception handling mechanism (for instance, in Python, +four new types of exceptions are added to the interpreter). Clearly this type of initialization +is impossible if WAD is linked directly to an interpreter as +its initialization process would execute before the main program of the +interpreter started. However, +if you wanted to permanently add WAD to an interpreter, the problem is easily +corrected by first removing the C++ initializer from WAD and then replacing it with an +initialization call someplace within the interpreter's startup function. \section{Exception Objects} @@ -859,52 +864,6 @@ difficulties of accurately recovering register values). \caption{Cross-language debugging session in Python where user is walking up the call stack.} \end{figure*} -\section{Failure Modes and Debugging} -\label{failure} - -Since WAD lives in the same process as the faulting application, it -must operate in a potentially hostile environment where significant -parts of the application may be broken or corrupted. Moreover, WAD -itself may fail while collecting information or trying to recover from -a catastrophic error. 
WAD is also much more limited than a standard -debugger in that it does not support common features such as -breakpointing, single step execution, or a full range of data -inspection. Thus, a common question to ask is to what extent does WAD -complicate debugging when it doesn't work. - -To handle potential problems in the implementation of WAD itself, -great care is taken to avoid the use of library functions and -functions that rely on heap allocation (malloc, free, etc.). For -instance, to provide dynamic memory allocation, WAD implements its own -memory allocator using mmap. In addition, signals are disabled -immediately upon entry to the WAD signal handler. Should a fatal -error occur inside WAD, the application will dump core and exit. Since -the resulting core file contains the stack trace of both WAD and the -faulting application, a traditional C debugger can be used to identify -the problem as before. The only difference is that a few additional -stack frames will be added to the traceback. - -In some situations, an application might fail after the WAD signal -handler has completed execution. For instance, memory or stack frames -within the interpreter might be corrupted in a way that prevents -exception handling from operating correctly. In this case, the -application will fail in a manner that does not represent the original -programming error. This might also cause the WAD signal handler to be -reinvoked with a different process state--causing it to report -information about a different type of failure. To address these kinds -of problems, WAD attempts to create a tracefile {\tt wadtrace} in the -current working directory that contains information about each error -that it has handled. If no recovery was possible, a programmer can -look at this file to obtain all of the stack traces that were generated -by WAD. 
- -Finally, if an application is experiencing a very serious problem, WAD -does not prevent a standard debugger from being attached to the -process. This is because the debugger overrides the current signal -handling so that it can catch fatal errors. As a result, even if -WAD is loaded, fatal signals are simply redirected to the -attached debugger. - \section{Implementation Details} Currently, WAD is implemented in ANSI C and small amount of assembly @@ -922,7 +881,7 @@ specific to a particular scripting language (170 semicolons for Python and 50 semicolons for Tcl). Although there are libraries such as the GNU Binary File Descriptor -(BFD) library that can assist with the manipulation of object files +(BFD) library that can assist with the manipulation of object files, these are not used in the implementation \cite{bfd}. These libraries tend to be quite large and are oriented more towards stand-alone tools such as debuggers, linkers, and loaders. In addition, @@ -1006,13 +965,14 @@ leaked. Similarly, this could result in open files, sockets, and other system resources. In a multi-threaded environment, deadlock may occur if a procedure holds a lock when an error occurs. -The of signals may also interact adversely with both scripting -language signal handling. -Since scripting languages ordinarily do not catch signals such as +In certain cases, the use of signals in WAD may interact adversely with scripting +language signal handling. Since scripting languages ordinarily do not catch signals such as SIGSEGV, SIGBUS, and SIGABRT, the use of WAD is unlikely to conflict -with any existing signal handling. However, this does not prevent a -module from overriding the error recovery mechanism with its own -signal handler. +with any existing signal handling. However, most scripting languages would not +prevent a user from disabling the WAD error recovery mechanism by +simply specifying a new handler for one or more of these signals. 
In addition, the use of +certain extensions such as the Perl sigtrap module would completely +disable WAD \cite{perl}. A more difficult signal handling problem arises when thread libraries are used. These libraries tend to override default signal handling @@ -1021,12 +981,13 @@ thread \cite{thread}. In general, asynchronous signals can be delivered to any thread within a process. However, this does not appear to be a problem for WAD since hardware exceptions are delivered to a signal handler that runs within the same thread in which the -error occurred. Unfortunately, even in this case, it appears that -certain implementations of user thread libraries do not reliably pass +error occurred. Unfortunately, even in this case, personal experience has +shown that certain implementations of user thread libraries (particularly on older versions +of Linux) do not reliably pass signal context information nor do they universally support advanced signal operations such as {\tt sigaltstack}. Because of this, WAD may be incompatible with a crippled implementation of user threads on -certain platforms. +these platforms. An even more subtle problem with threads is that the recovery process itself is not thread-safe (i.e., it is not possible to concurrently @@ -1040,24 +1001,57 @@ within the interpreter at once. A consequence of this restriction is that extension functions are not interruptible by thread-switching unless they explicitly release the interpreter lock. Currently, the behavior of WAD is undefined if extension code releases the lock and -proceeds to generates a fault. In this case, the recovery process may +proceeds to generate a fault. In this case, the recovery process may either cause an exception to be raised in an entirely different thread or cause execution to violate the interpreter's mutual exclusion -constraint. +constraint on the interpreter. In certain cases, errors may result in an unrecoverable crash. 
For example, if an application overwrites the heap, it may destroy critical data structures within the interpreter. Similarly, destruction of the call stack (via buffer overflow) makes it impossible for the recovery mechanism to create a stack-trace and -return to the interpreter. In the future, it might be possible to add -a heuristic scheme for recovering a partial stack trace such as -backward stack tracing, no such feature has yet been implemented -\cite{debug}. Finally, memory management problems such as -double-freeing of heap allocated memory can cause a system to fail in -a way that bears little resemblance to the actual source of the -problem. Section \ref{failure} describes some of the ways in which WAD responds -to these kinds of errors. +return to the interpreter. More subtle memory management problems +such as double-freeing of heap allocated memory can also cause a system +to fail in a manner that bears little resemblance to the actual source +of the problem. Given that WAD lives in the same process as the +faulting application and that such errors may occur, a common +question to ask is to what extent does WAD complicate debugging when it +doesn't work. + +To handle potential problems in the implementation of WAD itself, +great care is taken to avoid the use of library functions and +functions that rely on heap allocation (malloc, free, etc.). For +instance, to provide dynamic memory allocation, WAD implements its own +memory allocator using mmap. In addition, signals are disabled +immediately upon entry to the WAD signal handler. Should a fatal +error occur inside WAD, the application will dump core and exit. Since +the resulting core file contains the stack trace of both WAD and the +faulting application, a traditional C debugger can be used to identify +the problem as before. The only difference is that a few additional +stack frames will appear on the traceback. 
+ +An application may also fail after the WAD signal handler has completed +execution if memory or stack frames within the interpreter have been +corrupted in a way that prevents proper exception handling. In this case, the +application may fail in a manner that does not represent the original +programming error. It might also cause the WAD signal handler to be +immediately reinvoked with a different process state--causing it to +report information about a different type of failure. To address +these kinds of problems, WAD tries to create a tracefile {\tt +wadtrace} in the current working directory that contains information +about each error that it has handled. If no recovery was possible, a +programmer can look at this file to obtain all of the stack traces +that were generated by WAD. + +Finally, if an application is experiencing a very serious problem, WAD +does not prevent a standard debugger from being attached to the +process. This is because the debugger overrides the current signal +handling so that it can catch fatal errors. As a result, even if WAD +is loaded, fatal signals are simply redirected to the attached +debugger. Such an approach also allows for more complex debugging +tasks such as single-step execution, breakpoints, and +watchpoints--none of which are easily added to WAD itself. % % Add comments about what WAD does in this case? @@ -1065,27 +1059,25 @@ to these kinds of errors. Finally, there are a number of issues that pertain to the interaction of the recovery mechanism with the interpreter. -First, the recovery scheme is unable to return to procedures +For instance, the recovery scheme is unable to return to procedures that might invoke wrapper functions with conflicting return codes. This problem manifests itself when the interpreter's virtual machine is built around a large {\tt switch} statement from which different types of wrapper functions are called. 
For example, in Python, certain internal procedures call a mix of functions where both NULL and -1 are returned to indicate errors (depending on the function). In this case, there -is no way for WAD to easily determine which return value to use. Second, -the recovery process is extremely inefficient. This is because the -data collection process relies heavily upon {\tt mmap}, file I/O, and linear search -algorithms for finding symbols and debugging information. Therefore, WAD would -probably not be suitable as a general purpose exception handling mechanism. -Finally, even when an error is successfully returned to the interpreter -and presented to the user, it may not be possible to resume execution of -the application (e.g., even though the interpreter is operational, the extension -module may be corrupted in some manner). +is no way to specify a proper error return value because there will be +conflicting entries in the WAD return table (although you could compromise and +return the error value for the most common case). The recovery +process is also extremely inefficient due to its heavy reliance on +{\tt mmap}, file I/O, and linear search algorithms for finding symbols +and debugging information. Therefore, WAD would be +unsuitable as a more general purpose extension related exception handler. Despite these limitations, embedded error recovery is still a useful -capability that can be applied to a wide of extension related errors. -This is because errors such as failed assertions, bus errors, and -floating point exceptions rarely result in a situation where the +capability that can be applied to a wide variety of extension related +errors. This is because errors such as failed assertions, bus errors, +and floating point exceptions rarely result in a situation where the recovery process would be unable to run or the interpreter would crash. 
Furthermore, more serious errors such as segmentation faults are more likely to be caused by an uninitialized pointer than a blatant @@ -1161,7 +1153,9 @@ As of this writing, WAD is only an experimental prototype. Because of this, there are certainly a wide variety of incremental improvements that could be made to support additional platforms and scripting languages. In addition, there are a variety of improvements that could be made -to provide better integration with threads and C++. +to provide better integration with threads and C++. One could also +investigate heuristic schemes such as backward stack tracing that might be able +to recover partial debugging information from corrupted call stacks \cite{debug}. A more interesting extension of this work would be to see how the exception handling approach of WAD could be incorporated with @@ -1274,7 +1268,7 @@ January 1991. \bibitem{elf} J.~R.~Levine, {\em Linkers \& Loaders.} Morgan Kaufmann Publishers, 2000. -\bibitem{stabs} Free Software Foundation, {\em The "stabs" debugging format}. GNU info document. +\bibitem{stabs} Free Software Foundation, {\em The ``stabs'' debugging format}. GNU info document. \bibitem{prag} M.L. Scott. {\em Programming Language Pragmatics}, Morgan Kaufmann Publishers, 2000.
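
The return-table mechanism touched by the "Returning to the Interpreter" hunk can be illustrated with a small model. The sketch below is not WAD's code. It is a Python toy that assumes the recovery step scans the call stack from the faulting frame outward and returns to the first frame whose symbol appears in the table, handing back that symbol's error return value. The three table entries are the ones listed in the patch. The frame names in the sample stack, the NULL stand-in, and the names add_entry and find_return_point are hypothetical.

```python
from typing import Dict, Optional, Sequence, Tuple

# Stand-in for a C NULL pointer in this toy model (assumption).
NULL = None

# Entries taken from the partial Python table in the patch:
# symbol inside the interpreter -> error return value.
RETURN_TABLE: Dict[str, Optional[int]] = {
    "call_builtin": NULL,
    "PyObject_Print": -1,
    "PyObject_CallFunction": NULL,
}


def add_entry(table: Dict[str, Optional[int]], symbol: str,
              error_value: Optional[int]) -> None:
    """Register a return location; refuse a second, conflicting value."""
    if symbol in table and table[symbol] != error_value:
        raise ValueError(f"conflicting error return values for {symbol}")
    table[symbol] = error_value


def find_return_point(
    call_stack: Sequence[str],
    table: Dict[str, Optional[int]] = RETURN_TABLE,
) -> Optional[Tuple[int, str, Optional[int]]]:
    """Scan frames innermost-first and return (depth, symbol, error_value)
    for the first frame listed in the table, or None if no frame matches."""
    for depth, symbol in enumerate(call_stack):
        if symbol in table:
            return depth, symbol, table[symbol]
    return None


if __name__ == "__main__":
    # Hypothetical stack at the time of a fault, innermost frame first.
    stack = ["strlen", "foo", "wrap_foo", "call_builtin", "main"]
    print(find_return_point(stack))  # (3, 'call_builtin', None)
```

In this model, a procedure that would need both NULL and -1 cannot be described by a single entry, which mirrors the conflicting-entries limitation the patch describes. Here add_entry rejects the second value, whereas the patch text suggests one could instead settle on the error value for the most common case.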