<html>
<head>
<title>SWIG Project Overview</title>
</head>
<body bgcolor="#ffffff">
<center>
<h1>The SWIG Redevelopment Effort</h1>

<b>David Beazley <br>
Department of Computer Science <br>
University of Chicago <br>
Chicago, IL 60637 <br>
beazley@cs.uchicago.edu <br>
</b>
</center>

<p>
<b>$Header$</b>

<p>

<h2>1. An Introduction</h2>

One of the biggest problems faced by people writing software is how
to make software easier to use, more interactive, and more
modular. Typically, the computer science community has approached
these problems by focusing on formal design methodology and highly
specified frameworks built around notions of software components,
object-oriented programming, and anything labeled as "best practice"
(whatever that means). Although this type of approach is perhaps
appropriate for very large software projects involving hundreds of
programmers, software engineers, and managers, I've never met a sane
programmer who really enjoys writing software in such an environment.
Furthermore, a large number of software projects are undertaken by
small groups of people who would not classify themselves as
professional software developers or software engineers. Typical
examples might include scientific computing software, specialized
systems for engineering applications, or just about any kind of
experimental research and development project. These are the types of
programming projects "in the small" that are my primary interest.

<p>
First, programming projects in the small should not be confused with
the toy programs one might write as part of a class project or when
solving exceedingly trivial problems. More often than not, a software
package written by only a few people may have been developed over a
period of several years and may contain hundreds of thousands of
lines of source code. Furthermore, due to limited manpower, these
projects are likely to rely on a variety of third-party packages and
programming libraries to accomplish certain tasks. Finally, it is not
uncommon for such software to have been developed in a relatively
piecemeal fashion with little if any formal design. The developers
may also be burdened with the task of supporting a large base of
legacy code that is critical to the application, but which is too
complicated to simply rewrite from scratch. As a result, the software
developed in such an environment may be a tangled web of code that
gets the job done, but which is less than ideal in terms of its
usability and overall design.

<p>
Of course, one does not need to look very far to see examples of this
kind of development. For instance, I would claim that just about
every successful project within the Open Source community has been
developed in this way. As a more specific example, SWIG itself was
developed in a relatively ad hoc manner over a period of two years.
Although it was my intent to have a relatively clean design at the
start, the system has since evolved into a very tangled mess of
monolithic C++ code. It's not that I wanted to end up in this
situation; rather, the experience gained by SWIG's early users pushed
the system in an unanticipated direction that the original design
failed to address. In many ways, it is ironic that SWIG should end up
in this particular state given that this is <em>exactly</em> the type
of situation that SWIG was built to address!

<p>
Naturally, this brings us to the overall motivation behind SWIG itself.
In a nutshell, SWIG is a software development tool that aims to make it
easier to do the following:

<ul>
<li>Build user interfaces to existing software. For example, the
primary reason for SWIG's emphasis on scripting languages is not that
scripting languages are cool (which they are). It is that interpreters
make great user interfaces for a wide variety of applications.
Furthermore, interpreters can be used to build more advanced user
interfaces using toolkits such as Tk.

<p>
<li>Repackage an existing system as a collection of modules. The primary
motivation for this is that working with software organized as a collection
of loosely coupled modules generally results in greater flexibility and
reduced maintenance cost in comparison to a huge monolithic package. Since
scripting languages naturally promote the creation of modules and SWIG makes it
easy to integrate scripting languages with existing software, SWIG also serves
as a module-building tool.

<p>
<li>Work with software in a rapidly changing, experimental, and
underspecified environment. One of the reasons why people don't like
formal component frameworks and over-specification is that they may
not know how a system is actually going to look or evolve when they
start a project. As a result, excessive formality is viewed as more
of a burden than a benefit. SWIG, in a sense, turns this whole
scenario around by being highly adaptable and allowing programmers
to write the software however they want, as opposed to forcing programs to
be written within a rigidly defined set of rules.

<p>
<li>Serve as a rapid prototyping and testing tool. Given the
non-invasive way in which SWIG works with existing software, it allows
developers to experiment with different modules, languages, and
methods of organizing a system. As a result, SWIG can be used in the
prototyping and development stages of a project even if the final
package makes no use of SWIG, scripting, or any of its related
modules.

</ul>

I also want to emphasize that <b>the target users of SWIG are not professional
software engineers.</b> Rather, the system is designed to be very easy to use for
more ordinary people who just happen to be working on programming projects as
part of their work or for fun (physicists, engineers, hackers, etc.). It is also
designed to provide a certain element of "instant gratification," if you will. I believe that
the following quotes from a SWIG user survey put things in the right perspective:

<ul>
<li>"Easy to use, no need to worry about language internals. It is a boon for application
developers, like me."

<p>
<li>"I really love the fact that the learning curve is short and flat."

<p>
<li>"Since SWIG has proven to be rather easy to use, I find I can carry out
the types of wrapping activities which would otherwise have been the responsibility
of a computer scientist."

<p>
<li>"I came, I saw, I wrapped. And it ran. Woo hoo!"
</ul>

<h2>2. Problems with SWIG</h2>

Despite the early success of SWIG, the system suffers from a number of serious
limitations. Furthermore, these problems are not easily fixed within the current
design.

<ul>
<li><b>The C/C++ parser is incomplete</b>. SWIG only understands a
limited subset of C and is based on an incorrect representation of C
datatypes that prevents the proper handling of "const", references,
pointers to functions, and other more complex types. In addition,
fundamental things like C++ function overloading still don't
work. Although 99% of the common cases work and there are workarounds
for certain situations, these limitations are still annoying.

<p>
<li><b>The SWIG module system is all wrong</b>. In the current
implementation, SWIG modules are created using C++ inheritance. This
has a number of unintended consequences. First, it restricts the
functionality of a module to a fixed
set of virtual function calls made deep inside the parsing engine. As
a result, it is not possible to write highly specialized modules that
don't quite fit into the normal module scheme. Second, it makes the
module system unnecessarily complicated and too tightly coupled. For
instance, there is no way to write a module that operates outside of
the SWIG framework or which might be useful on its own. Finally, I
believe that the C++ module system alienates the user community
because it is too complicated and there aren't that many C++ programmers. With a simpler
module interface, I believe that the system would be much more accessible
to the user community and to people who want to write modules.

<p>
<li><b>Why stop at C and scripting?</b> Although SWIG does a great
job of building scripting interfaces, there is no practical reason to
restrict its functionality in this way. For one, it is probably
worthwhile to consider alternative input languages including Fortran
and CORBA IDL. Second, there are a variety of secondary tasks that
one might be able to do with such a system, such as analyzing the
structure of application interfaces, generating documentation, providing
interfaces to databases, and providing tools to help modularize existing
software. Although these sound like lofty goals, I believe that the system
should be flexible enough to allow such applications.
</ul>

Of course, the real trick is how one goes about solving these issues
without making SWIG excessively complicated, from the standpoint of
both development and use.

<h2>3. SWIG Redevelopment: Modules</h2>

Simply stated, the primary goal of SWIG redevelopment is to redesign
the SWIG compiler as an extensible set of loosely coupled modules
(<b>Note: it is not my intent to radically change the way in which an
end-user uses SWIG</b>).
In this context, my intent is to allow a module to be virtually anything
that might be part of a compiler or which would interact with a
compiler in some manner. For example:

<ul>
<li> Preprocessors.
<li> Parsers.
<li> Code generators.
<li> Code browsers.
<li> Documentation generators.
<li> Optimizers.
<li> Testing tools.
<li> Other development environments.
</ul>

Unfortunately, as programs go, compilers tend to be extremely
complicated. Therefore, to make any sort of module system work, the
mechanism by which modules interact and exchange data needs to be
both extremely powerful and extremely simple.

<p>
To address these problems, SWIG redevelopment is based on a few fundamental ideas:

<ol>
<li>All data will be internally represented using an XML-like scheme
in which every piece of data is identified by a unique element "tag"
and a set of associated attributes. Manipulation of the data in turn
will involve nothing more than making an appropriate association of
the "tags" with some sort of "action" to be performed. Unlike an
approach in which objects are placed into a rigid C++ class hierarchy,
the XML-like approach allows a virtually unlimited number of
different object types and attributes to be created and manipulated without ever
having to recompile anything. As a result, modules would be able to
extend the system easily in novel ways. It should also be added that this
data representation greatly simplifies the underlying core of
the system because an XML-like representation can be
built using nothing more than a hash-table object and a
few fundamental datatypes such as strings and lists.

<p>
<li>All underlying data structures will be built using a dynamic type
handling mechanism and a small collection of fundamental datatypes
including strings, lists, and hash tables. There are several
advantages to this approach. First, dynamic typing generally results
in substantially less code if done correctly. For instance, in my
own experience using Objective-C vs. C++, I found that my dynamically
typed Objective-C programs were up to 5 times smaller than their C++
counterparts. Furthermore, dynamic typing is also one of the reasons
why scripting languages are so powerful.

<p>
<li>
Modules will interact with each other and exchange data using the XML-like scheme
previously described. The flexibility of this approach allows
modules to be written in a relatively stand-alone manner. Furthermore, the
use of XML may simplify the development of external tools that do not share
any commonality with the SWIG executable or its internal data structures.

<p>
<li>Dynamic loading. Closely associated with loose coupling, the SWIG module
system should optionally support dynamic loading of compiler modules. This might
be accomplished in two ways. First, I believe that SWIG itself should
provide a scripting interface that allows its modules to be dynamically
loaded into a variety of scripting languages. Second, SWIG
should probably implement some sort of module loading system that allows modules
to be used without the optional scripting interface.

</ol>
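
<p>
As a rough illustration of the first two ideas above, the following ANSI C
sketch shows one way that tagged data with attributes might be associated with
actions at run time. It is only a sketch under stated assumptions: every name
in it (<tt>Node</tt>, <tt>Registry</tt>, <tt>node_set</tt>, <tt>node_get</tt>,
<tt>registry_bind</tt>, <tt>registry_dispatch</tt>, <tt>count_attrs</tt>) is
hypothetical and is not part of SWIG or DOH, and small fixed-size arrays stand
in for the hash table to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical illustrations, not SWIG or DOH APIs. */
#define MAX_ATTRS   8
#define MAX_ACTIONS 16

/* A piece of data: an element tag plus a set of name/value attributes. */
typedef struct {
    const char *tag;
    const char *names[MAX_ATTRS];
    const char *values[MAX_ATTRS];
    int         nattrs;
} Node;

/* An action is any function that can be associated with a tag. */
typedef int (*Action)(const Node *node);

/* Tag-to-action table; a fuller version would use a hash table here. */
typedef struct {
    const char *tags[MAX_ACTIONS];
    Action      actions[MAX_ACTIONS];
    int         count;
} Registry;

/* Set (or overwrite) an attribute. Returns 0 on success, -1 if full. */
int node_set(Node *n, const char *name, const char *value)
{
    int i;
    assert(n != NULL && name != NULL);
    for (i = 0; i < n->nattrs; i++) {
        if (strcmp(n->names[i], name) == 0) {
            n->values[i] = value;
            return 0;
        }
    }
    if (n->nattrs == MAX_ATTRS) return -1;
    n->names[n->nattrs] = name;
    n->values[n->nattrs] = value;
    n->nattrs++;
    return 0;
}

/* Look up an attribute by name; returns NULL if it is not present. */
const char *node_get(const Node *n, const char *name)
{
    int i;
    assert(n != NULL && name != NULL);
    for (i = 0; i < n->nattrs; i++)
        if (strcmp(n->names[i], name) == 0) return n->values[i];
    return NULL;
}

/* Associate a tag with an action. Returns 0 on success, -1 if full. */
int registry_bind(Registry *r, const char *tag, Action action)
{
    assert(r != NULL && tag != NULL && action != NULL);
    if (r->count == MAX_ACTIONS) return -1;
    r->tags[r->count] = tag;
    r->actions[r->count] = action;
    r->count++;
    return 0;
}

/* Run the action bound to the node's tag; -1 means no action is bound. */
int registry_dispatch(const Registry *r, const Node *n)
{
    int i;
    assert(r != NULL && n != NULL && n->tag != NULL);
    for (i = 0; i < r->count; i++)
        if (strcmp(r->tags[i], n->tag) == 0) return r->actions[i](n);
    return -1;
}

/* Example action: report how many attributes the node carries. */
int count_attrs(const Node *node)
{
    return node->nattrs;
}
```

<p>
In this sketch, a module that wants to handle a new kind of element would only
need to call <tt>registry_bind</tt> with a new tag string; neither the
<tt>Node</tt> structure nor the dispatch code has to change, which is the
property that the XML-like scheme is meant to provide.

<p>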

Finally, it should be noted that the implementation language of choice for
the SWIG redevelopment effort is ANSI C. There are several reasons for this:

<ul>
<li> ANSI C is highly portable and available everywhere.
<li> C provides the performance necessary to implement a few critical aspects of a compiler.
<li> C is the ultimate glue language in the sense that it can be interfaced
with just about anything if you know what you are doing. This will be especially important
if we want to interface with third-party compiler construction tools.
<li> It is perhaps the most widely spoken programming language, making it a good choice
to encourage community involvement and the creation of additional SWIG modules.
<li> Dave likes it.
</ul>

<h2>4. The Initial Module Set</h2>

The following list describes the proposed modules that will be part of the new
system:

<ul>
<li><b>Swig</b>. The Swig module contains a small core of functionality that is used
by the rest of the system. Features include access to the SWIG library, command line
parsing, error handling, and a few common datatypes including a somewhat generic representation of
types.

<p>
<li><b>DOH</b>. DOH is the dynamic type library that provides the fundamental
data structures used by the system as well as run-time support for dynamic typing.

<p>
<li><b>Preprocessor</b>. A full C/C++ preprocessor with some extended macro handling
capabilities.

<p>
<li><b>LParse</b>. A SWIG1.1-compatible parser generator that can read the older SWIG interface
files and produce an appropriate parse tree compatible with the new system. This parser
will primarily be used for backwards compatibility as well as for issuing appropriate warnings
to the user about deprecated features. This parsing module will also be used until the
CParse module is completed.

<p>
<li><b>CParse</b>. A completely redesigned C/C++ parser that attempts to fix all of the parsing
problems in SWIG1.1. In particular, it will treat C/C++ datatypes correctly and support a
number of new C++ constructs. However, it is somewhat unlikely that this parser will
fully support all of C++ (at least not initially).

<p>
<li><b>SWIM</b>. The SWIG Monitor. This is a utility module that allows users to browse
through internal compiler data structures using a web browser. This is primarily intended
for development purposes, but may evolve into a general-purpose interface browsing
tool.

<p>
<li><b>SWILL</b>. The SWIG Web Interface Link Library. This is a generic library that
can be used to add a web server to an application. Although developed independently of
SWIG, it is used by the SWIM module above.

<p>
<li><b>XMLParse</b>. A parsing module that can read XML files and turn them into a SWIG parse
tree structure. The initial plan is to simply put a thin wrapper around the expat library for this.

<p>
<li><b>XMLWriter</b>. A code generation module that can simply dump all of the internal
data structures out as a huge XML document.

<p>
<li><b>Tcl</b>. A code generator for Tcl.

<p>
<li><b>Perl</b>. A code generator for Perl.

<p>
<li><b>Python</b>. A code generator for Python.

<p>
<li><b>Guile</b>. A code generator for Guile.

<p>
<li><b>Java</b>. A code generator for Java.

<p>
<li><b>Testing</b>. A testing module that is designed to aid in the construction
of testing scripts. More details to be provided later.

<p>
<li><b>Documentation</b>. A replacement for the SWIG1.1 documentation generation
system. The precise details need to be determined, but it is likely that this
system will produce either plain ASCII files or XML files.

</ul>
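
<p>
To make the XMLWriter idea above a little more concrete, here is a minimal C
sketch of dumping a small tree of tagged nodes as XML text. It is an
illustration under stated assumptions, not the planned module: the
<tt>XNode</tt> structure, the <tt>xml_dump</tt> function, and its helpers are
hypothetical names, each node carries at most one attribute for brevity, and
output goes to a caller-supplied buffer that must start out as an empty string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node type for illustration; not a SWIG data structure. */
#define MAX_KIDS 4

typedef struct XNode {
    const char         *tag;
    const char         *attr_name;   /* NULL if the node has no attribute */
    const char         *attr_value;
    const struct XNode *kids[MAX_KIDS];
    int                 nkids;
} XNode;

/* Append src to the string in buf. Returns 0 on success, -1 if no room. */
static int append_raw(char *buf, size_t cap, const char *src)
{
    size_t len = strlen(buf);
    if (len + strlen(src) + 1 > cap) return -1;
    strcpy(buf + len, src);
    return 0;
}

/* Append src to buf, escaping the characters that are special in XML. */
static int append_escaped(char *buf, size_t cap, const char *src)
{
    for (; *src; src++) {
        char one[2];
        const char *rep;
        switch (*src) {
        case '&': rep = "&amp;";  break;
        case '<': rep = "&lt;";   break;
        case '>': rep = "&gt;";   break;
        case '"': rep = "&quot;"; break;
        default:
            one[0] = *src;
            one[1] = '\0';
            rep = one;
            break;
        }
        if (append_raw(buf, cap, rep)) return -1;
    }
    return 0;
}

/* Recursively write node n (and its children) as XML into buf.
   Returns 0 on success, -1 if the buffer is too small. */
int xml_dump(const XNode *n, char *buf, size_t cap)
{
    int i;
    assert(n != NULL && n->tag != NULL && buf != NULL);
    if (append_raw(buf, cap, "<") || append_raw(buf, cap, n->tag)) return -1;
    if (n->attr_name) {
        if (append_raw(buf, cap, " ") ||
            append_raw(buf, cap, n->attr_name) ||
            append_raw(buf, cap, "=\"") ||
            append_escaped(buf, cap, n->attr_value) ||
            append_raw(buf, cap, "\""))
            return -1;
    }
    if (n->nkids == 0) return append_raw(buf, cap, "/>");
    if (append_raw(buf, cap, ">")) return -1;
    for (i = 0; i < n->nkids; i++)
        if (xml_dump(n->kids[i], buf, cap)) return -1;
    if (append_raw(buf, cap, "</") ||
        append_raw(buf, cap, n->tag) ||
        append_raw(buf, cap, ">"))
        return -1;
    return 0;
}
```

<p>
A caller would build a few <tt>XNode</tt> values, link children through the
<tt>kids</tt> array, and pass the root to <tt>xml_dump</tt> together with an
empty buffer. A module along these lines would presumably write to a file and
handle any number of attributes, but the recursive structure would be similar.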

</body>
</html>