*** empty log message ***
git-svn-id: https://swig.svn.sourceforge.net/svnroot/swig/trunk@144 626c5289-ae23-0410-ae9c-e8d60b6d4f22
This commit is contained in:
parent
8d45fab2af
commit
cd8257c4dd
1 changed files with 362 additions and 0 deletions
362
SWIG/Doc/whitepaper.html
Normal file
|
|
@ -0,0 +1,362 @@
|
|||
<html>
|
||||
<head>
|
||||
<title>SWIG Project Overview</title>
|
||||
</head>
|
||||
<body bgcolor="#ffffff">
|
||||
<center>
|
||||
<h1>The SWIG Redevelopment Effort</h1>
|
||||
|
||||
<b>David Beazley <br>
|
||||
Department of Computer Science <br>
|
||||
University of Chicago <br>
|
||||
Chicago, IL 60637 <br>
|
||||
beazley@cs.uchicago.edu <br>
|
||||
</b>
|
||||
</center>
|
||||
|
||||
<p>
|
||||
<b>$Header$</b>
|
||||
|
||||
<p>
|
||||
|
||||
<h2>1. An Introduction</h2>
|
||||
|
||||
One of the biggest problems faced by people writing software is the
|
||||
problem of how to make software easier to use, more interactive, and more
|
||||
modular. Typically, the computer science community has approached
|
||||
these problems by focusing on formal design methodology and highly
|
||||
specified frameworks built around notions of software components,
|
||||
object-oriented programming, and anything labeled as "best practice"
|
||||
(whatever that means). Although this type of approach is perhaps
|
||||
appropriate for very large software projects involving hundreds of
|
||||
programmers, software engineers, and managers, I've never met a sane
|
||||
programmer who really enjoys writing software in such an environment.
|
||||
Furthermore, a large number of software projects are undertaken by
|
||||
small groups of people who would not classify themselves as
|
||||
professional software developers or software engineers. Typical
|
||||
examples might include scientific computing software, specialized
|
||||
systems for engineering applications, or just about any kind of
|
||||
experimental research and development project. These are the types of
|
||||
programming projects "in the small" that are my primary interest.
|
||||
|
||||
<p>
|
||||
First, programming projects in the small should not be confused with
|
||||
the toy programs one might write as part of a class project or when
|
||||
solving exceedingly trivial problems. More often than not, a software
|
||||
package written by only a few people may have been developed over a
|
||||
period of several years and may consist of hundreds of thousands of
|
||||
lines of source code. Furthermore, due to limited manpower, these
|
||||
projects are likely to rely on a variety of third-party packages and
|
||||
programming libraries to accomplish certain tasks. Finally, it is not
|
||||
uncommon for such software to have been developed in a relatively
|
||||
piecemeal fashion with little if any formal design. The developers
|
||||
may also be burdened with the task of supporting a large base of
|
||||
legacy code that is critical to the application, but which is too
|
||||
complicated to simply rewrite from scratch. As a result, the software
|
||||
developed in such an environment may be a tangled web of code that
|
||||
gets the job done, but which is less than ideal in terms of its
|
||||
usability and overall design.
|
||||
|
||||
<p>
|
||||
Of course, one does not need to look very far to see examples of this
|
||||
kind of development. For instance, I would claim that just about
|
||||
every successful project within the Open Source community has been
|
||||
developed in this way. As a more specific example, Swig itself was
|
||||
developed in a relatively ad hoc manner over a period of two years.
|
||||
Although it was my intent to have a relatively clean design at the
|
||||
start, the system has since evolved into a very tangled mess of
|
||||
monolithic C++ code. It's not that I wanted to end up in this
|
||||
situation--rather the experience gained by Swig's early users pushed
|
||||
the system in an unanticipated direction that the original design
|
||||
failed to address. In many ways, it is ironic that SWIG should end up
|
||||
in this particular state given that this is <em>exactly</em> the type
|
||||
of situation that Swig was built to address!
|
||||
|
||||
<p>
|
||||
Naturally, this brings us to the overall motivation behind SWIG itself.
|
||||
In a nutshell, SWIG is a software development tool that aims to make it
|
||||
easier to do the following:
|
||||
|
||||
<ul>
|
||||
<li>Build user interfaces to existing software. For example, the
|
||||
primary reason for Swig's emphasis on scripting languages is not that
|
||||
scripting languages are cool (which they are), it is that interpreters
|
||||
make great user interfaces for a wide variety of applications.
|
||||
Furthermore, interpreters can be used to build more advanced user
|
||||
interfaces using toolkits such as Tk.
|
||||
|
||||
<p>
|
||||
<li>Repackage an existing system as a collection of modules. The primary
|
||||
motivation for this is that working with software organized as a collection
|
||||
of loosely coupled modules generally results in greater flexibility and
|
||||
reduced maintenance cost in comparison to a huge monolithic package. Since
|
||||
scripting languages naturally promote the creation of modules and Swig makes it
|
||||
easy to integrate scripting languages with existing software, Swig also serves
|
||||
as a module building tool.
|
||||
|
||||
<p>
|
||||
<li>Work with software in a rapidly changing, experimental, and
|
||||
underspecified environment. One of the reasons why people don't like
|
||||
formal component frameworks and over-specification is that they may
|
||||
not know how a system is actually going to look or evolve when they
|
||||
start a project. As a result, excessive formality is viewed as more
|
||||
of a burden than a benefit. Swig, in a sense, turns this whole
|
||||
scenario around by being highly adaptable and allowing the programmer
|
||||
to write the software however they want as opposed to forcing programs to
|
||||
be written within a rigidly defined set of rules.
|
||||
|
||||
<p>
|
||||
<li>Serve as a rapid prototyping and testing tool. Given the
|
||||
non-invasive way in which Swig works with existing software, it allows
|
||||
developers to experiment with different modules, languages, and
|
||||
methods of organizing a system. As a result, Swig can be used in the
|
||||
prototyping and development stages of a project even if the final
|
||||
package makes no use of Swig, scripting, or any of its related
|
||||
modules.
|
||||
|
||||
</ul>
|
||||
|
||||
I also want to emphasize that <b>the target users of Swig are not professional
|
||||
software engineers.</b> Rather, the system is designed to be very easy to use for
|
||||
more ordinary people who just happen to be working on programming projects as
|
||||
part of their work or for fun (physicists, engineers, hackers, etc.). It is also
|
||||
designed to provide a certain element of "instant gratification" if you will. I believe that
|
||||
the following quotes from a SWIG user survey put things in the right perspective:
|
||||
|
||||
<ul>
|
||||
<li> "Easy to use, no need to worry about language internals. It is a boon for application
|
||||
developers, like me."
|
||||
|
||||
<p>
|
||||
<li>"I really love the fact that the learning curve is short and flat."
|
||||
|
||||
<p>
|
||||
<li>"Since SWIG has proven to be rather easy to use, I find I can carry out
|
||||
the types of wrapping activities which would otherwise have been the responsibility
|
||||
of a computer scientist."
|
||||
|
||||
<p>
|
||||
<li>"I came, I saw, I wrapped. And it ran. Woo hoo!"
|
||||
</ul>
|
||||
|
||||
<h2>2. Problems with SWIG</h2>
|
||||
|
||||
Despite the early success of SWIG, the system suffers from a number of serious
|
||||
limitations. Furthermore, these problems are not easily fixed within the current
|
||||
design.
|
||||
|
||||
<ul>
|
||||
<li><b>The C/C++ parser is incomplete</b>. SWIG only understands a
|
||||
limited subset of C and is based on an incorrect representation of C
|
||||
datatypes that prevents the proper handling of "const", references,
|
||||
pointers to functions, and other more complex types. In addition,
|
||||
fundamental things like C++ function overloading still don't
|
||||
work. Although 99% of the common cases work and there are workarounds
|
||||
for certain situations, these limitations are still annoying.
|
||||
|
||||
<p>
|
||||
<li><b>The SWIG module system is all wrong</b>. In the current
|
||||
implementation, SWIG modules are created using C++ inheritance. This
|
||||
has a number of unintended consequences. First, it restricts the
|
||||
functionality of a module to a fixed
|
||||
set of virtual function calls made deep inside the parsing engine. As
|
||||
a result, it is not possible to write highly specialized modules that
|
||||
don't quite fit into the normal module scheme. Second, it makes the
|
||||
module system unnecessarily complicated and too tightly coupled. For
|
||||
instance, there is no way to write a module that operates outside of
|
||||
the SWIG framework or which might be useful on its own. Finally, I
|
||||
believe that the C++ module system alienates the user community
|
||||
because it is too complicated and there aren't that many C++ programmers. With a simpler
|
||||
module interface, I believe that the system would be much more accessible
|
||||
to the user community and people who want to write modules.
|
||||
|
||||
<p>
|
||||
<li><b>Why stop at C and scripting?</b> Although SWIG does a great
|
||||
job of building scripting interfaces, there is no practical reason to
|
||||
restrict its functionality in this way. For one, it is probably
|
||||
worthwhile to consider alternative input languages including Fortran
|
||||
and CORBA IDL. Second, there are a variety of secondary tasks that
|
||||
one might be able to do with such a system such as analyze the
|
||||
structure of application interfaces, generate documentation, provide
|
||||
interfaces to databases, and provide tools to help modularize existing
|
||||
software. Although these sound like lofty goals, I believe that the system
|
||||
should be flexible enough to allow such applications.
|
||||
</ul>
|
||||
|
||||
Of course, the real trick is how one goes about solving these issues without
|
||||
making Swig excessively complicated--both from the standpoint of development and use.
|
||||
|
||||
<h2>3. SWIG Redevelopment: Modules</h2>
|
||||
|
||||
Simply stated, the primary goal of SWIG redevelopment is to redesign
|
||||
the SWIG compiler as an extensible set of loosely coupled modules
|
||||
(<b>Note: it is not my intent to radically change the way in which an
|
||||
end-user uses SWIG</b>).
|
||||
In this context, my intent is to allow a module to be virtually anything
|
||||
that might be part of a compiler or which would interact with a
|
||||
compiler in some manner. For example:
|
||||
|
||||
<ul>
|
||||
<li> Preprocessors.
|
||||
<li> Parsers.
|
||||
<li> Code generators.
|
||||
<li> Code browsers.
|
||||
<li> Documentation generators.
|
||||
<li> Optimizers.
|
||||
<li> Testing tools.
|
||||
<li> Other development environments.
|
||||
</ul>
|
||||
|
||||
Unfortunately, as programs go, compilers tend to be extremely
|
||||
complicated. Therefore, to make any sort of module system work, the
|
||||
mechanism by which modules interact and exchange data needs to be
|
||||
extremely powerful and extremely simple.
|
||||
|
||||
<p>
|
||||
To address these problems, SWIG redevelopment is based on a few fundamental ideas:
|
||||
|
||||
<ol>
|
||||
<li>All data will be internally represented using an XML-like scheme
|
||||
in which every piece of data is identified by a unique element "tag"
|
||||
and a set of associated attributes. Manipulation of the data in turn
|
||||
will involve nothing more than making an appropriate association of
|
||||
the "tags" with some sort of "action" to be performed. Unlike an
|
||||
approach in which objects are placed into a rigid C++ class hierarchy,
|
||||
the XML-based approach allows a virtually unlimited number of
|
||||
different object types and attributes to be created and manipulated without ever
|
||||
having to recompile anything. As a result, this would allow modules to easily
|
||||
extend the system in novel ways. It should also be added that this
|
||||
data representation greatly simplifies the underlying core of
|
||||
the system because an XML-like representation can be
|
||||
built entirely using nothing more than a hash-table object and a
|
||||
few fundamental datatypes such as strings and lists.
|
||||
|
||||
<p>
|
||||
<li>All underlying data structures will be built using a dynamic type
|
||||
handling mechanism and a small collection of fundamental datatypes
|
||||
including strings, lists, and hash tables. There are several
|
||||
advantages to this approach. First, dynamic typing generally results
|
||||
in substantially less code if done correctly. For instance, in my
|
||||
own experience using Objective-C vs. C++, I found that my dynamically
|
||||
typed Objective-C programs were up to 5 times smaller than their C++
|
||||
counterparts. Furthermore, dynamic typing is also one of the reasons
|
||||
why scripting languages are so powerful.
|
||||
|
||||
<p>
|
||||
<li>
|
||||
Modules will interact with each other and exchange data using the XML-like scheme
|
||||
previously described. Due to the flexibility of this approach, this allows
|
||||
modules to be written in a relatively stand-alone manner. Furthermore, the
|
||||
use of XML may simplify the development of external tools that do not share
|
||||
any commonality with the SWIG executable or its internal data structures.
|
||||
|
||||
<p>
|
||||
<li>Dynamic loading. Closely associated with loose coupling, the SWIG module
|
||||
system should optionally support dynamic loading of compiler modules. This might
|
||||
be accomplished in two ways. First, I believe that SWIG itself should
|
||||
provide a scripting interface that allows its modules to be dynamically
|
||||
loaded into a variety of scripting languages. Second, SWIG
|
||||
should probably implement some sort of module loading system that allows modules
|
||||
to be used without the optional scripting interface.
|
||||
|
||||
</ol>
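<p>
As a rough illustration of the first and third ideas above, the following C sketch shows one way a tagged node with attributes could be paired with per-tag actions. All of the names here (<tt>Node</tt>, <tt>Attr</tt>, <tt>Binding</tt>, <tt>node_set</tt>, <tt>node_get</tt>, <tt>dispatch</tt>, <tt>emit_function</tt>) are hypothetical and are not taken from the SWIG sources. A small fixed-size array stands in for the hash table to keep the example short, and <tt>snprintf</tt> assumes a C99 library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ATTRS 8   /* illustrative limit; a hash table would not need one */

/* Hypothetical attribute: a name/value pair of strings. */
typedef struct {
    const char *name;
    const char *value;
} Attr;

/* Hypothetical node: an element "tag" plus its attributes. */
typedef struct {
    const char *tag;
    Attr attrs[MAX_ATTRS];
    int nattrs;
} Node;

/* Set or overwrite an attribute. Returns 0 on success, -1 if the node is full. */
int node_set(Node *n, const char *name, const char *value)
{
    int i;
    assert(n && name && value);
    for (i = 0; i < n->nattrs; i++) {
        if (strcmp(n->attrs[i].name, name) == 0) {
            n->attrs[i].value = value;
            return 0;
        }
    }
    if (n->nattrs >= MAX_ATTRS)
        return -1;
    n->attrs[n->nattrs].name = name;
    n->attrs[n->nattrs].value = value;
    n->nattrs++;
    return 0;
}

/* Look up an attribute by name. Returns NULL if it is not present. */
const char *node_get(const Node *n, const char *name)
{
    int i;
    assert(n && name);
    for (i = 0; i < n->nattrs; i++) {
        if (strcmp(n->attrs[i].name, name) == 0)
            return n->attrs[i].value;
    }
    return NULL;
}

/* An action receives a node and writes its result into out. */
typedef int (*Action)(const Node *n, char *out, size_t outlen);

/* Association of a tag with the action to perform on it. */
typedef struct {
    const char *tag;
    Action action;
} Binding;

/* Run the first action bound to the node's tag. Returns -1 if none matches. */
int dispatch(const Binding *table, int nbindings,
             const Node *n, char *out, size_t outlen)
{
    int i;
    assert(table && n && out);
    for (i = 0; i < nbindings; i++) {
        if (strcmp(table[i].tag, n->tag) == 0)
            return table[i].action(n, out, outlen);
    }
    return -1;
}

/* Example action: describe a "function" node using its attributes. */
int emit_function(const Node *n, char *out, size_t outlen)
{
    const char *name = node_get(n, "name");
    const char *type = node_get(n, "type");
    if (!name || !type)
        return -1;
    snprintf(out, outlen, "wrap %s -> %s", name, type);
    return 0;
}
```

In a scheme like this, supporting a new kind of object means choosing a new tag string and adding a row to the binding table; neither <tt>Node</tt> nor <tt>dispatch</tt> has to change.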
|
||||
|
||||
Finally, it should be noted that the implementation language of choice for
|
||||
the SWIG redevelopment effort is ANSI C. There are several reasons for this:
|
||||
|
||||
<ul>
|
||||
<li> ANSI C is highly portable and available everywhere.
|
||||
<li> C provides the performance necessary to implement a few critical aspects of a compiler.
|
||||
<li> C is the ultimate glue-language in the sense that it can be interfaced
|
||||
with just about anything if you know what you are doing. This will be especially important
|
||||
if we want to interface with third-party compiler construction tools.
|
||||
<li> It is perhaps the most widely spoken programming language--making it a good choice
|
||||
to encourage community involvement and the creation of additional SWIG modules.
|
||||
<li> Dave likes it.
|
||||
</ul>
|
||||
|
||||
<h2>4. The Initial Module Set</h2>
|
||||
|
||||
The following list describes the proposed modules that will be part of the new
|
||||
system:
|
||||
|
||||
<ul>
|
||||
<li><b>Swig</b>. The Swig module contains a small core of functionality that is used
|
||||
by the rest of the system. Features include access to the Swig library, command line
|
||||
parsing, error handling, and a few common datatypes including a somewhat generic representation of
|
||||
types.
|
||||
|
||||
<p>
|
||||
<li><b>DOH</b>. DOH is the dynamic type library that provides the fundamental
|
||||
data structures used by the system as well as run-time support for dynamic typing.
|
||||
|
||||
<p>
|
||||
<li><b>Preprocessor</b>. A full C/C++ preprocessor with some extended macro handling
|
||||
capabilities.
|
||||
|
||||
<p>
|
||||
<li><b>LParse</b>. A SWIG1.1 compatible parser generator that can read the older SWIG interface
|
||||
files and produce an appropriate parse-tree compatible with the new system. This parser
|
||||
will primarily be used for backwards compatibility as well as issuing appropriate warnings
|
||||
to the user about deprecated features. This parsing module will also be used until the
|
||||
CParse module is completed.
|
||||
|
||||
<p>
|
||||
<li><b>CParse</b>. A completely redesigned C/C++ parser that attempts to fix all of the parsing
|
||||
problems in SWIG1.1. In particular, it will treat C/C++ datatypes correctly and support a
|
||||
number of new C++ constructs. However, it is somewhat unlikely that this parser will
|
||||
fully support all of C++ (at least not initially).
|
||||
|
||||
<p>
|
||||
<li><b>SWIM</b>. The SWIG Monitor. This is a utility module that allows users to browse
|
||||
through internal compiler data structures using a web browser. This is primarily intended
|
||||
for development purposes, but may evolve into a general-purpose interface browsing
|
||||
tool.
|
||||
|
||||
<p>
|
||||
<li><b>SWILL</b>. The SWIG Web Interface Link Library. This is a generic library that
|
||||
can be used to add a web server to an application. Although developed independently of
|
||||
SWIG, it is used by the SWIM module above.
|
||||
|
||||
<p>
|
||||
<li><b>XMLParse</b>. A parsing module that can read XML files and turn them into a SWIG parse
|
||||
tree structure. The initial plan is to simply put a thin wrapper around the expat library for this.
|
||||
|
||||
<p>
|
||||
<li><b>XMLWriter</b>. A code generation module that can simply dump all of the internal
|
||||
data structures out as a huge XML document.
|
||||
|
||||
<p>
|
||||
<li><b>Tcl</b>. A code generator for Tcl.
|
||||
|
||||
<p>
|
||||
<li><b>Perl</b>. A code generator for Perl.
|
||||
|
||||
<p>
|
||||
<li><b>Python</b>. A code generator for Python.
|
||||
|
||||
<p>
|
||||
<li><b>Guile</b>. A code generator for Guile.
|
||||
|
||||
<p>
|
||||
<li><b>Java</b>. A code generator for Java.
|
||||
|
||||
<p>
|
||||
<li><b>Testing</b>. A testing module that is designed to aid in the construction
|
||||
of testing scripts. More details to be provided later.
|
||||
|
||||
<p>
|
||||
<li><b>Documentation</b>. A replacement for the SWIG1.1 documentation generation
|
||||
system. The precise details need to be determined, but it is likely that this
|
||||
system will produce plain ASCII files or XML files.
|
||||
|
||||
</ul>
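<p>
To give a feel for what the XMLWriter module's output step might involve, here is a minimal C sketch that serializes one tagged element and its attributes as XML text. The names <tt>XAttr</tt>, <tt>xml_append</tt>, and <tt>xml_write_element</tt> are invented for this example and do not describe the actual module. The sketch writes a single empty element into a caller-supplied buffer and escapes attribute values only; tag and attribute names are assumed to be valid XML names already.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical attribute: a name/value pair of strings. */
typedef struct {
    const char *name;
    const char *value;
} XAttr;

/* Append src to the string in dst (total capacity cap). When escape is
   nonzero, the characters & < > " are replaced by XML entities.
   Returns 0 on success, -1 if the result would not fit. */
int xml_append(char *dst, size_t cap, const char *src, int escape)
{
    size_t len = strlen(dst);
    for (; *src; src++) {
        char one[2];
        const char *rep = NULL;
        size_t rlen;
        if (escape) {
            switch (*src) {
            case '&': rep = "&amp;";  break;
            case '<': rep = "&lt;";   break;
            case '>': rep = "&gt;";   break;
            case '"': rep = "&quot;"; break;
            default: break;
            }
        }
        if (!rep) {            /* ordinary character: copy it unchanged */
            one[0] = *src;
            one[1] = '\0';
            rep = one;
        }
        rlen = strlen(rep);
        if (len + rlen + 1 > cap)
            return -1;
        memcpy(dst + len, rep, rlen + 1);   /* copies the terminator too */
        len += rlen;
    }
    return 0;
}

/* Write <tag name="value" .../> into out. Returns 0 on success, -1 if
   the buffer is too small. */
int xml_write_element(char *out, size_t cap, const char *tag,
                      const XAttr *attrs, int nattrs)
{
    int i;
    assert(out && cap > 0 && tag);
    out[0] = '\0';
    if (xml_append(out, cap, "<", 0) || xml_append(out, cap, tag, 0))
        return -1;
    for (i = 0; i < nattrs; i++) {
        if (xml_append(out, cap, " ", 0) ||
            xml_append(out, cap, attrs[i].name, 0) ||
            xml_append(out, cap, "=\"", 0) ||
            xml_append(out, cap, attrs[i].value, 1) ||
            xml_append(out, cap, "\"", 0))
            return -1;
    }
    return xml_append(out, cap, "/>", 0);
}
```

A complete writer would also have to recurse over child elements and stream to a file rather than a fixed buffer; those parts are left out here.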
|
||||
|
||||
|
||||
|
||||
</body>
|
||||
</html>
|
||||
|
||||
|
||||
|
||||
|
||||