
<html>
<head>
<title>SWIM Project Overview</title>
</head>
<body bgcolor="#ffffff">
<center>
<h1>SWIM (The SWIG Monitor)</h1>

<b>David Beazley <br>
Department of Computer Science <br>
University of Chicago <br>
Chicago, IL  60637 <br>
beazley@cs.uchicago.edu <br>
</b>
</center>

<p>
<b>$Header$</b>

<p>

<h2>1. The Problem</h2>

One of the biggest problems in compiler construction is the process of
debugging and monitoring the behavior of various compiler components.
Typically, internal data is represented as a large tree structure
(e.g., an abstract syntax tree) where the nodes of the tree represent
different structural elements of a program (functions, classes,
scopes, etc...).  In addition, a compiler may maintain symbol tables
and other auxiliary data structures.  The main problem with this is
that even a moderately complex parse tree may contain hundreds to tens
of thousands of nodes that are extremely difficult to examine in any
detail. For instance, simply sticking a few "printf" statements in the
compiler is likely to produce a lot of very confusing output.
Likewise, efforts to create a code "browser" may be as difficult as
writing the compiler itself (especially if it includes a fancy
graphical user interface).

<h2>2. SWIM </h2>

To address the problem of compiler debugging, the goal of this project
is to produce a component known as the SWIG Monitor (SWIM).  Simply
stated, the purpose of SWIM is to allow easy navigation through the
internal parse trees and other data structures generated by the SWIG
compiler.  To do this, SWIM will operate as an embedded HTTP server
that allows a user to navigate through internal data structures using
nothing more than an Internet browser.

<p>
There are several reasons why using a browser is a good match for this particular
purpose:

<ul>
<li> The browser provides a graphical user interface.  Therefore, we don't need to write one.
<li> Writing an embedded web-server is easy.  Well, much easier than the alternative.
<li> The web consists of millions of pages interconnected by a collection of links.  A parse
tree consists of thousands of elements interconnected by a collection of links.  Am I the
only one who sees the commonality here?
<li> A web server interface is cross-platform and can be made to work easily on Unix,
Windows, and the Macintosh.
<li> Everyone knows how to use a browser.
</ul>

<h2>3. The SWIM API </h2>

The invocation of SWIM is going to be very easy. You only need to implement one
public function:

<ul>
<li><tt>void Swim_view(DOH *obj);</tt><br>
Given an arbitrary object, this function opens up a web server on an arbitrary user
port and waits for incoming HTTP requests.  When the user connects, the object passed
in <tt>obj</tt> should be displayed.  The function should only return when a
user clicks on some kind of exit link.
</ul>

This will probably be used as follows:

<blockquote>
<pre>
% swig -c++ -swim -I/usr/lib foo.i
SWIM: HTTP server running on http://localhost:7253
Waiting for connections...
</pre>
</blockquote>

At this point, a user could paste the above URL into their browser to start viewing
internal data structures.

<p>
<b>Note:</b> On Unix, you can force Netscape to automatically load a new page by
issuing the following system call from C:

<blockquote>
<tt>system("netscape -remote 'openURL(http://localhost:7253)'");</tt>
</blockquote>

<p>
This would eliminate the need to paste the URL into the browser (provided the
user is running Netscape).
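<p>
Since the server may pick its port at run time, the command string would have to be
built rather than hard-coded.  A minimal sketch, assuming a hypothetical helper named
<tt>swim_browser_command()</tt> (this name is an illustration, not part of SWIG or SWILL):

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats the Netscape remote-control command for the
   given port into buf.  Returns 0 on success, -1 if buf is too small. */
int swim_browser_command(char *buf, size_t size, int port) {
    int n = snprintf(buf, size,
                     "netscape -remote 'openURL(http://localhost:%d)'", port);
    return (n >= 0 && (size_t) n < size) ? 0 : -1;
}
```

The resulting string could then be passed to <tt>system()</tt> once the server knows
which port it is listening on.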

<h2>4. Connection Protocol</h2>

At a minimum, the SWIM HTTP server should accept two different kinds of connection
requests:

<ul>
<li>The initial request of "/".   When the user first connects, they should be presented
with a display that shows the contents of the object passed to the <tt>Swim_view()</tt>
function.

<p>
<li>Requests to view other objects.  These requests should be of the form "<tt>/obj.html?id=<em>addr</em></tt>"
where <tt><em>addr</em></tt> is either an address or an identifier that specifies a
particular object in memory.

</ul>

You are welcome to add additional types of HTTP requests should it enhance the use of
the monitoring software.
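<p>
To make the two request forms concrete, here is a stand-alone sketch that classifies a
request target.  SWILL's handler registration would normally take care of dispatching;
the names <tt>swim_classify()</tt> and <tt>enum swim_request</tt> are hypothetical, and the
sketch assumes that <tt><em>addr</em></tt> is written in hexadecimal.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical request kinds for the two required request forms. */
enum swim_request { SWIM_REQ_INDEX, SWIM_REQ_OBJECT, SWIM_REQ_UNKNOWN };

/* Classify a request target such as "/" or "/obj.html?id=80f3a20".
   For object requests the hexadecimal id is stored in *id. */
enum swim_request swim_classify(const char *target, unsigned long *id) {
    static const char prefix[] = "/obj.html?id=";
    const size_t plen = sizeof prefix - 1;
    const char *digits;
    char *end;
    unsigned long value;

    if (strcmp(target, "/") == 0)
        return SWIM_REQ_INDEX;
    if (strncmp(target, prefix, plen) != 0)
        return SWIM_REQ_UNKNOWN;

    digits = target + plen;
    /* Require a hex digit first so that signs and whitespace are rejected. */
    if (!isxdigit((unsigned char) *digits))
        return SWIM_REQ_UNKNOWN;
    value = strtoul(digits, &end, 16);
    /* Reject anything that is followed by trailing characters. */
    if (*end != '\0')
        return SWIM_REQ_UNKNOWN;

    *id = value;
    return SWIM_REQ_OBJECT;
}
```

Anything that is neither form is reported as unknown, which leaves room for additional
request types later.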

<h2>5. Object Model </h2>

All objects within SWIG are represented as a single type "<tt>DOH *</tt>".   What is a DOH
you ask?  Well, a DOH is an opaque pointer that refers to any of the following basic
objects (and possibly a few others not yet created):

<ul>
<li> A string
<li> A list of other objects.
<li> A mapping object (a hash table of objects).
<li> A void * container (used to hold arbitrary C/C++ objects).
<li> Files.
<li> Callable functions.
</ul>

Fortunately, it is easy to see what an object is--you can just print it out. For example:

<blockquote>
<pre>
DOH *obj;
...
Printf(stdout,"%s\n", obj);
</pre>
</blockquote>

(note: it is "Printf" not "printf").

<p>
Now, this is where things start to get more complicated.  First off, all elements of SWIG parse
trees are represented as DOH mapping objects.  In a nutshell, these are hash tables containing
other DOH objects.  The interface to these hash tables is as follows:

<ul>
<li><tt>DOH *Getattr(DOH *obj, DOH *key);</tt> <br>
This retrieves an object from <tt>obj</tt> indexed by <tt>key</tt>. Returns NULL if no such
object exists.  The <tt>key</tt> value may be either a DOH string object or a simple "char *".

<p>
<li><tt>int Setattr(DOH *obj, DOH *key, DOH *value);</tt><br>
This adds an object to <tt>obj</tt> replacing any existing object with the same key value.
The <tt>key</tt> value may be a DOH string object or a "char *".  The <tt>value</tt> may be
any DOH object or a "char *" (which is automatically converted to a DOH string). Returns -1
on error.

<p>
<li><tt>int Delattr(DOH *obj, DOH *key);</tt><br>
Deletes an object from the hash table.

<p>
<li><tt>DOH *Firstkey(DOH *obj);</tt><br>
Returns the first hash table key which is usually a DOH string object.

<p>
<li><tt>DOH *Nextkey(DOH *obj);</tt><br>
Returns the next hash table key.  This should only be used after making a call to Firstkey().

<p>
<li><tt>int Len(DOH *obj);</tt><br>
Returns the number of items stored in the hash table.

</ul>

Here is an example of using these functions:

<blockquote>
<pre>
void example(DOH *obj) {
   DOH *tag;
   DOH *key;

   /* Get the 'tag' attribute of an object */
   tag = Getattr(obj,"tag");
   if (!tag) {
       printf("Hey, we're missing a tag here\n");
   }

   /* Change the 'tag' attribute of an object */
   Setattr(obj,"tag","Function");

   /* Iterate over all of the keys */
   Printf(stdout,"The object has the following attributes:\n");

   for (key = Firstkey(obj); key; key = Nextkey(obj)) {
       Printf(stdout,"    %s\n", key);
   }
}
</pre>
</blockquote>

In addition to hash tables, you may encounter an occasional list object.  List objects
support the following functions (note: strings also support many of these functions):

<ul>
<li><tt>DOH *Getitem(DOH *obj, int n);</tt> <br>
Get the nth item from a list.   Returns NULL if no such object exists.

<p>
<li><tt>int Setitem(DOH *obj, int n, DOH *value);</tt><br> Sets the
nth item of the list, replacing any previously existing value.
<tt>value</tt> may be any DOH object or a "char *" (which is
automatically converted to a DOH string). Returns -1 on error.

<p>
<li><tt>int Delitem(DOH *obj, int n);</tt><br>
Deletes the nth object from the list.  All items with index greater than n are shifted
down by one position.

<p>
<li><tt>int Insertitem(DOH *obj, int n, DOH *value);</tt><br>
Inserts an item into the list at position n.  Items with index greater than n are shifted to
accommodate the new item.  <tt>value</tt> can be a DOH object or a "char *".

<p>
<li><tt>int Append(DOH *obj, DOH *value);</tt><br>
Appends a new item to the end of a list.

<p>
<li><tt>int Len(DOH *obj);</tt><br>
Returns the number of items in the list.

</ul>

For example, the following code shows how to print out all of the elements in a list:

<blockquote>
<pre>
void print_list(DOH *lobj) {
    int i, len;
    DOH *item;

    Printf(stdout,"The list contains the following items:\n");
    len = Len(lobj);
    for (i = 0; i &lt; len; i++) {
        item = Getitem(lobj,i);
        Printf(stdout,"   %s\n", item);
    }
}
</pre>
</blockquote>

Finally, you may want to know what type of object you are dealing with.
The following functions can be used:

<ul>
<p>
<li><tt>int DohIsString(DOH *obj);</tt><br>
Returns 1 if the object is a string.

<p>
<li><tt>int DohIsSequence(DOH *obj); </tt><br>
Returns 1 if an object is a sequence (list).  Note: this also returns 1 for string objects.

<p>
<li><tt>int DohIsMapping(DOH *obj); </tt><br>
Returns 1 if an object is a mapping.

<p>
<li><tt>int DohIsFile(DOH *obj); </tt><br>
Returns 1 if an object is a file.

<p>
<li><tt>int DohIsCallable(DOH *obj); </tt><br>
Returns 1 if an object is a callable object.
</ul>

<h2>6. SWILL</h2>

SWILL is a simple web server library designed to make it easy to
create application servers.  Unlike using a normal web server (e.g.,
Apache) and CGI scripts to talk to a program, an application server is
a web server that runs directly inside of an application.  The
advantage to this approach is that it is easy for the web server to
access internal data structures and other elements of a program that
would otherwise be inaccessible to a stand-alone server (for instance,
we can use this to navigate through the internal data structures of
SWIG).

<p>
SWILL is pretty easy to use--especially since it makes use of the DOH
object model. Here are a few critical functions:

<ul>
<li><tt>Swill *Swill_create_server(char *hostname, int port, char *title, char *logfile);</tt><br>
Creates a new web server object listening on the specified hostname and port.  <tt>hostname</tt> is usually
set to NULL (to allow connections on any hostname).  If <tt>port</tt> is set to 0, the server will
pick a free port number automatically.  <tt>title</tt> is the name of the web-server.  <tt>logfile</tt>
is an optional name of a log file in which to record requests.  If you don't want any logging, this
should be set to NULL.   The function returns an object "Swill *" that refers to the server object.

<p>
<li><tt>int Swill_serve(Swill *s);</tt><br>
Waits for and serves a single HTTP request.

<p>
<li><tt>void Swill_handle(Swill *s, char *document, int (*handler)(DOH *in, DOH *out, void *clientdata), void *clientdata);</tt><br>
Registers a handler function with the web server.  <tt>document</tt> is the name of the document that will
trigger this handler (e.g., "foo.html").  <tt>handler</tt> is a pointer to the handler function that
will be called.  <tt>clientdata</tt> is an optional pointer to an object that will be passed to the handler
function (this is normally NULL).

<p>
<li><tt>void Swill_file(Swill *s, char *filename);</tt><br>
Attaches an ordinary file to the web server.

<p>
<li><tt>void Swill_directory(Swill *s, char *directory);</tt><br>
Attaches a directory of files to the web server.

</ul>

Here is a really simple example of using SWILL to serve a directory
of files (try this out on your own):

<blockquote>
<pre>
#include "swill.h"

int main(int argc, char **argv) {
    Swill *web;
    web = Swill_create_server(0,4000,"My web server",0);
    Swill_directory(web,"/home/beazley/public_html");
    while (1) {
          Swill_serve(web);
    }
}
</pre>
</blockquote>

Now, in order to make things really work right, you will want to write handler functions.
Here is how you might start writing the Swim_view() function.

<blockquote>
<pre>
#include "swill.h"

/* Print the main page with the original object (kind of bogus)*/
int index_handler(DOH *in, DOH *out, void *clientdata) {
   DOH *obj = (DOH *) clientdata;
   Printf(out,"&lt;html&gt;&lt;h1&gt;Here's your object&lt;/h1&gt;\n");
   Printf(out,"&lt;pre&gt;\n");
   Printf(out,"%s\n", obj);
   Printf(out,"&lt;/pre&gt;\n");
   Printf(out,"&lt;/html&gt;\n");
   return 0;
}

void Swim_view(DOH *obj) {
   Swill *web;
   web = Swill_create_server(0,0,"Swim server",0);
   Swill_handle(web,"index.html", index_handler, obj);
   Printf(stdout,"SWIM: HTTP server running on http://localhost:%d\n", web-&gt;port);
   Printf(stdout,"Waiting for connections...\n");
   while (1) {
          Swill_serve(web);
   }
}

</pre>
</blockquote>

Note the use of clientdata in the above example.  Specifically, the
initial object is given to the Swill_handle() function.  When the user later
requests the "index.html" document, this object is given to the
handler function as the clientdata argument.
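<p>
The clientdata idea can be illustrated without SWILL.  The following stand-alone sketch keeps
a small table of handlers and hands each one back the pointer it was registered with.  All of
the names here (<tt>register_handler()</tt>, <tt>dispatch()</tt>, <tt>count_handler()</tt>) are
hypothetical and only mimic the pattern; this is not how SWILL is necessarily implemented.

```c
#include <stddef.h>
#include <string.h>

#define MAX_HANDLERS 16

typedef int (*handler_fn)(const char *query, void *clientdata);

struct handler_entry {
    const char *document;    /* document name that triggers the handler */
    handler_fn  handler;
    void       *clientdata;  /* opaque pointer passed back to the handler */
};

static struct handler_entry handlers[MAX_HANDLERS];
static size_t nhandlers = 0;

/* Remember a handler together with the clientdata pointer to give back. */
int register_handler(const char *document, handler_fn fn, void *clientdata) {
    if (nhandlers == MAX_HANDLERS)
        return -1;
    handlers[nhandlers].document = document;
    handlers[nhandlers].handler = fn;
    handlers[nhandlers].clientdata = clientdata;
    nhandlers++;
    return 0;
}

/* Find the handler for a document and call it with its stored clientdata.
   Returns the handler's result, or -1 if no handler matches. */
int dispatch(const char *document, const char *query) {
    size_t i;
    for (i = 0; i < nhandlers; i++) {
        if (strcmp(handlers[i].document, document) == 0)
            return handlers[i].handler(query, handlers[i].clientdata);
    }
    return -1;
}

/* Example handler: increments the int that clientdata points to. */
int count_handler(const char *query, void *clientdata) {
    int *hits = (int *) clientdata;
    (void) query;            /* unused in this example */
    (*hits)++;
    return 0;
}
```

The handler never needs a global variable to find its data, which is the same reason the
initial object is passed through <tt>clientdata</tt> in the example above.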

<p>
The final piece of the puzzle is how to handle form data.   In HTML, form data is typically
encoded as a query string that is attached to a URL.   For example:

<blockquote>
<pre>
http://foo.com/bar.html?x=34&amp;name=spam
</pre>
</blockquote>

SWILL makes it easy to extract these query parameters in the handler functions.  In the above case,
you would just do this:

<blockquote>
<pre>
int foo_handler(DOH *in, DOH *out, void *clientdata) {
   int   x;
   char *name;

   x = GetInt(in,"x");
   name = GetChar(in,"name");

   ...
}
</pre>
</blockquote>
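<p>
For readers curious about what such an extraction involves, here is a stand-alone sketch that
pulls one parameter out of a raw query string.  The helper <tt>query_get()</tt> is hypothetical
and is not SWILL's implementation; it also leaves out URL decoding of "%xx" and "+".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the value of parameter `name` from a query
   string such as "x=34&name=spam" into buf.  Returns 0 on success, -1 if
   the parameter is missing or buf is too small. */
int query_get(const char *query, const char *name, char *buf, size_t size) {
    size_t nlen = strlen(name);
    const char *p = query;

    while (*p) {
        const char *amp = strchr(p, '&');
        size_t len = amp ? (size_t) (amp - p) : strlen(p);

        /* A match is "name=" at the start of a pair. */
        if (len > nlen && strncmp(p, name, nlen) == 0 && p[nlen] == '=') {
            size_t vlen = len - nlen - 1;
            if (vlen >= size)
                return -1;
            memcpy(buf, p + nlen + 1, vlen);
            buf[vlen] = '\0';
            return 0;
        }
        p += len;            /* skip this pair ...           */
        if (*p == '&')
            p++;             /* ... and the separator, if any */
    }
    return -1;
}
```

A numeric parameter such as <tt>x</tt> could then be converted with <tt>strtol()</tt>.
<p>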

Where you might use this in SWIM is the extraction of object ids.  For example,
here is a really simple handler that extracts an object pointer and displays some information
(along with links to other objects).

<blockquote>
<pre>
static int
obj_handler(DOH *in, DOH *out, void *clientdata)
{
  DOH *obj;
  char *id;

  id = GetChar(in,"id");           /* Get object identifier */
  if (!id) {
      Printf(out,"No object specified.\n");
      return 0;
  }
  obj = (DOH *) strtoul(id,0,16);

  Printf(out,"&lt;html&gt;&lt;body bgcolor=\"#ffffff\"&gt;\n");
  Printf(out,"&lt;h1&gt;Object %x (%s)&lt;/h1&gt;\n", obj, Objname(obj));
  Printf(out,"&lt;pre&gt;\n");

  if (DohIsString(obj)) {
    Printf(out,"%s\n", obj);
  } else if (DohIsSequence(obj)) {
    int i;
    int l = Len(obj);
    for (i = 0; i &lt; l; i++) {
      DOH *item = Getitem(obj,i);
      Printf(out,"[%d] - &lt;a href=\"obj.html?id=%x\"&gt;%s&lt;/a&gt;\n", i, item, Objname(item));
    }
  } else if (DohIsMapping(obj)) {
    DOH *key;
    for (key = Firstkey(obj); key; key = Nextkey(obj)) {
      DOH *item = Getattr(obj,key);
      Printf(out,"[%-20s] - &lt;a href=\"obj.html?id=%x\"&gt;%s&lt;/a&gt;\n", key, item, Objname(item));
    }
  } else {
    Printf(out,"%s\n", obj);
  }
  Printf(out,"&lt;/pre&gt;&lt;/body&gt;&lt;/html&gt;\n");
  return 0;
}
...
void Swim_view(DOH *obj) {
   Swill *web;
   ...
   Swill_handle(web,"obj.html", obj_handler, 0);
   ...
}
</pre>
</blockquote>

Of course, you'll want to do something better than this (as the information displayed isn't all that
user-friendly).
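<p>
One small improvement worth considering: object contents may contain characters such as
"&lt;" or "&amp;" that a browser would interpret as markup.  A minimal sketch of an escaping
routine follows, assuming a hypothetical helper named <tt>html_escape()</tt> that works on
plain C strings (it is not part of DOH or SWILL):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst, replacing the characters that are
   special in HTML.  Returns 0 on success, -1 if dst is too small. */
int html_escape(const char *src, char *dst, size_t size) {
    size_t used = 0;

    if (size == 0)
        return -1;

    for (; *src; src++) {
        char single[2];
        const char *rep;
        size_t n;

        switch (*src) {
        case '<': rep = "&lt;";   break;
        case '>': rep = "&gt;";   break;
        case '&': rep = "&amp;";  break;
        case '"': rep = "&quot;"; break;
        default:
            single[0] = *src;
            single[1] = '\0';
            rep = single;
            break;
        }
        n = strlen(rep);
        if (used + n >= size)    /* keep room for the terminator */
            return -1;
        memcpy(dst + used, rep, n);
        used += n;
    }
    dst[used] = '\0';
    return 0;
}
```

A handler could run each string through such a routine before printing it to the output stream.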

<h2>7. How to get started </h2>

The SWILL library and all of the necessary components are included in the SWIG CVS repository.
However, as this is experimental, these components are not built by default. <p>

Here's how to build everything:

<ol>
<li> Check out the SWIG repository.
<p>
<li> Build everything by doing the following:
<pre>
% autoconf
% ./configure --prefix=/home/yourdirectory/packages
% make
</pre>

<p>
<li> Go to the Source/SWILL directory and do this:
<pre>
% autoconf
% ./configure
% make
</pre>

<p>
<li> Go to the Source/Swim directory and copy the 'Makefile.in' file to 'Makefile'.

<p>
<li> Build the experimental system by typing 'make experiment' in the top level SWIG
directory.  This rebuilds everything and produces an experimental system that includes
all of the new components.

<p>
<li> Run the resulting system by typing 'swig'.   The SWIM extension can be invoked by typing
'swig -swim filename.i' where filename is a SWIG interface file.

</ol>

Now, a super-lame version of a browser is already included in the Source/Swim directory.
This version does *NOT* necessarily reflect what I want to have in the final version--rather it's
only a starting point to get an idea of how the system is actually put together and how some
of the pieces work.   You should start with this and either rewrite it completely or
hack it to pieces in order to produce something better.

<p>
I don't expect that you will need to modify other parts of SWIG at this time. However,
if need be, you should feel  welcome to modify SWILL should it be missing functionality that
would be helpful in your implementation.

<h2>8. Resources</h2>

Since this project is going to involve a lot of web stuff, you may want to get a good HTML
book to help with some of the details.

<h2>9. Where this is going</h2>

I see the SWIG browser as an integral part of the system--especially for debugging and development.
As a result, I don't think that we should set our goals too low.  In particular, I want you to
consider a variety of add-ons and features such as:

<ul>
<li>Being able to select different views (files, classes, etc...).
<li>Filters to select or omit certain classes of tags.
<li>Perhaps the ability to edit the parse tree through the browser interface.
<li>Various types of data collection related to internal data (size of the parse tree,
occurrences of each kind of tag, etc...).
</ul>

Here is a <a href="swimmock.html">mock-up</a> page of what this might look like (although this
is only an idea).

</body>
</html>