Revert "Cleaned up some of the highlightin in getting_started.rst and llvmpy-package.rst"

This reverts commit 83f24e90d3.
This commit is contained in:
Maggie Mari 2012-08-10 11:34:32 -05:00
commit 68e6dc693e
172 changed files with 4 additions and 28972 deletions

Binary file not shown.

@@ -1,4 +0,0 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 37862f52ab1a5aee0638e82db6615f29
tags: fbb0d17656682115ca4d033fb2f83ba1

@@ -1,147 +0,0 @@
*********************
Comparison Operations
*********************

Integer Comparison
==================

Predicates for use with the ``icmp`` instruction are listed below. All of
these are integer constants defined in the ``llvm.core`` module.
``ICMP_EQ``
-----------
Equality
``ICMP_NE``
-----------
Inequality
``ICMP_UGT``
------------
Unsigned greater than
``ICMP_UGE``
------------
Unsigned greater than or equal
``ICMP_ULT``
------------
Unsigned less than
``ICMP_ULE``
------------
Unsigned less than or equal
``ICMP_SGT``
------------
Signed greater than
``ICMP_SGE``
------------
Signed greater than or equal
``ICMP_SLT``
------------
Signed less than
``ICMP_SLE``
------------
Signed less than or equal
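The signed and unsigned predicates differ only when an operand's high bit is set. A minimal pure-Python sketch of the idea (this is not llvmpy API, just the two's-complement interpretation that ``ICMP_ULT`` and ``ICMP_SLT`` apply to the same 32-bit pattern):

```python
def as_unsigned(x):
    # Interpret a 32-bit pattern as an unsigned integer.
    return x & 0xFFFFFFFF

def as_signed(x):
    # Interpret a 32-bit pattern as a two's-complement signed integer.
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

a, b = 0xFFFFFFFF, 1  # the bit pattern of -1, and 1

icmp_ult = as_unsigned(a) < as_unsigned(b)   # False: 4294967295 < 1
icmp_slt = as_signed(a) < as_signed(b)       # True: -1 < 1
```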
Float Comparison
================

Predicates for use with the ``fcmp`` instruction are listed below. All of
these are integer constants defined in the ``llvm.core`` module.
``FCMP_FALSE``
--------------
Always false
``FCMP_OEQ``
------------
True if ordered and equal
``FCMP_OGT``
------------
True if ordered and greater than
``FCMP_OGE``
------------
True if ordered and greater than or equal
``FCMP_OLT``
------------
True if ordered and less than
``FCMP_OLE``
------------
True if ordered and less than or equal
``FCMP_ONE``
------------
True if ordered and operands are unequal
``FCMP_ORD``
------------
True if ordered (no NaNs)
``FCMP_UNO``
------------
True if unordered: ``isnan(X) | isnan(Y)``
``FCMP_UEQ``
------------
True if unordered or equal
``FCMP_UGT``
------------
True if unordered or greater than
``FCMP_UGE``
------------
True if unordered, greater than or equal
``FCMP_ULT``
------------
True if unordered, or less than
``FCMP_ULE``
------------
True if unordered, less than or equal
``FCMP_UNE``
------------
True if unordered or not equal
``FCMP_TRUE``
-------------
Always true
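The ordered/unordered split exists because of NaN: an *ordered* predicate is false whenever either operand is NaN, while its *unordered* counterpart is true in that case. A small pure-Python sketch of these semantics (not llvmpy API):

```python
import math

def fcmp_oeq(x, y):
    # Like FCMP_OEQ: true only if neither operand is NaN and they compare equal.
    return not (math.isnan(x) or math.isnan(y)) and x == y

def fcmp_ueq(x, y):
    # Like FCMP_UEQ: true if either operand is NaN, or they compare equal.
    return math.isnan(x) or math.isnan(y) or x == y

nan = float('nan')
r1 = fcmp_oeq(nan, 1.0)   # False: NaN makes ordered predicates false
r2 = fcmp_ueq(nan, 1.0)   # True: NaN makes unordered predicates true
r3 = fcmp_oeq(2.0, 2.0)   # True: ordinary equal operands
```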

@@ -1,12 +0,0 @@
********************************
Examples and LLVM Tutorials
********************************

.. toctree::
   :maxdepth: 1

   firstexample.rst
   examples/index.rst
   kaleidoscope/index.rst

@@ -1,36 +0,0 @@
A First Function
==================

.. code-block:: python

   #!/usr/bin/env python

   from llvm.core import *

   # create a module
   module = Module.new("tut1")

   # create a function type taking 3 32-bit integers, returning a 32-bit integer
   ty_int = Type.int(32)
   func_type = Type.function(ty_int, (ty_int,) * 3)

   # create a function of that type
   mul_add = Function.new(module, func_type, "mul_add")
   mul_add.calling_convention = CC_C
   x = mul_add.args[0]; x.name = "x"
   y = mul_add.args[1]; y.name = "y"
   z = mul_add.args[2]; z.name = "z"

   # implement the function: a new basic block...
   blk = mul_add.append_basic_block("entry")

   # ...and an IR builder
   bldr = Builder.new(blk)
   tmp_1 = bldr.mul(x, y, "tmp_1")
   tmp_2 = bldr.add(tmp_1, z, "tmp_2")
   bldr.ret(tmp_2)

   print module
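Running the script should print the module's LLVM assembly. With LLVM 3.1 the output looks roughly like this (a sketch; exact formatting and annotation comments vary between LLVM versions):

```llvm
; ModuleID = 'tut1'

define i32 @mul_add(i32 %x, i32 %y, i32 %z) {
entry:
  %tmp_1 = mul i32 %x, %y
  %tmp_2 = add i32 %tmp_1, %z
  ret i32 %tmp_2
}
```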

@@ -1,55 +0,0 @@
A More Complicated Function
===========================

.. code-block:: python

   #!/usr/bin/env python

   from llvm.core import *

   # create a module
   module = Module.new("tut2")

   # create a function type taking 2 32-bit integers, returning a 32-bit integer
   ty_int = Type.int(32)
   func_type = Type.function(ty_int, (ty_int, ty_int))

   # create a function of that type
   gcd = Function.new(module, func_type, "gcd")

   # name the function arguments
   x = gcd.args[0]; x.name = "x"
   y = gcd.args[1]; y.name = "y"

   # implement the function: the basic blocks...
   entry = gcd.append_basic_block("entry")
   ret = gcd.append_basic_block("return")
   cond_false = gcd.append_basic_block("cond_false")
   cond_true = gcd.append_basic_block("cond_true")
   cond_false_2 = gcd.append_basic_block("cond_false_2")

   # ...and a llvm::IRBuilder
   bldr = Builder.new(entry)
   x_eq_y = bldr.icmp(ICMP_EQ, x, y, "tmp")
   bldr.cbranch(x_eq_y, ret, cond_false)

   bldr.position_at_end(ret)
   bldr.ret(x)

   bldr.position_at_end(cond_false)
   x_lt_y = bldr.icmp(ICMP_ULT, x, y, "tmp")
   bldr.cbranch(x_lt_y, cond_true, cond_false_2)

   bldr.position_at_end(cond_true)
   y_sub_x = bldr.sub(y, x, "tmp")
   recur_1 = bldr.call(gcd, (x, y_sub_x), "tmp")
   bldr.ret(recur_1)

   bldr.position_at_end(cond_false_2)
   x_sub_y = bldr.sub(x, y, "x_sub_y")
   recur_2 = bldr.call(gcd, (x_sub_y, y), "tmp")
   bldr.ret(recur_2)

   print module
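The branching above implements the classic subtraction-based Euclidean algorithm. For reference, the same control flow in ordinary Python (not llvmpy code):

```python
def gcd(x, y):
    # Mirrors the basic blocks of the IR: equal operands return,
    # x < y recurses on (x, y - x), otherwise on (x - y, y).
    if x == y:
        return x
    if x < y:
        return gcd(x, y - x)
    return gcd(x - y, y)

result = gcd(12, 42)  # 6
```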

@@ -1,10 +0,0 @@
LLVM Tutorials
==============
The following JIT tutorials were contributed by Sebastien Binet.

.. toctree::
   :titlesonly:

   JITTutorial1.rst
   JITTutorial2.rst

@@ -1,125 +0,0 @@
Examples
========
A Simple Function
-----------------
Let's create an LLVM module containing a single function, corresponding
to this ``C`` function:

.. code-block:: c

   int sum(int a, int b)
   {
       return a + b;
   }

Here's how it looks in llvmpy:

.. code-block:: python

   #!/usr/bin/env python

   # Import the llvmpy modules.
   from llvm import *
   from llvm.core import *

   # Create an (empty) module.
   my_module = Module.new('my_module')

   # All the types involved here are "int"s. This type is represented
   # by an object of the llvm.core.Type class:
   ty_int = Type.int()   # by default 32 bits

   # We need to represent the class of functions that accept two integers
   # and return an integer. This is represented by an object of the
   # function type (llvm.core.FunctionType):
   ty_func = Type.function(ty_int, [ty_int, ty_int])

   # Now we need a function named 'sum' of this type. Functions are not
   # free-standing (in llvmpy); they need to be contained in a module.
   f_sum = my_module.add_function(ty_func, "sum")

   # Let's name the function arguments 'a' and 'b'.
   f_sum.args[0].name = "a"
   f_sum.args[1].name = "b"

   # Our function needs a "basic block" -- a set of instructions that
   # ends with a terminator (like return, branch etc.). By convention
   # the first block is called "entry".
   bb = f_sum.append_basic_block("entry")

   # Let's add instructions into the block. For this, we need an
   # instruction builder:
   builder = Builder.new(bb)

   # OK, now for the instructions themselves. We'll create an add
   # instruction that returns the sum as a value, which we'll use
   # a ret instruction to return.
   tmp = builder.add(f_sum.args[0], f_sum.args[1], "tmp")
   builder.ret(tmp)

   # We've completed the definition now! Let's see the LLVM assembly
   # language representation of what we've created:
   print my_module

Here is the output:

.. code-block:: llvm

   ; ModuleID = 'my_module'

   define i32 @sum(i32 %a, i32 %b) {
   entry:
       %tmp = add i32 %a, %b       ; <i32> [#uses=1]
       ret i32 %tmp
   }

Adding JIT Compilation
----------------------
Let's compile this function in-memory and run it.

.. code-block:: python

   #!/usr/bin/env python

   # Import the llvmpy modules.
   from llvm import *
   from llvm.core import *
   from llvm.ee import *   # new import: ee = Execution Engine

   # Create a module, as in the previous example.
   my_module = Module.new('my_module')
   ty_int = Type.int()   # by default 32 bits
   ty_func = Type.function(ty_int, [ty_int, ty_int])
   f_sum = my_module.add_function(ty_func, "sum")
   f_sum.args[0].name = "a"
   f_sum.args[1].name = "b"
   bb = f_sum.append_basic_block("entry")
   builder = Builder.new(bb)
   tmp = builder.add(f_sum.args[0], f_sum.args[1], "tmp")
   builder.ret(tmp)

   # Create an execution engine object. This will create a JIT compiler
   # on platforms that support it, or an interpreter otherwise.
   ee = ExecutionEngine.new(my_module)

   # The arguments need to be passed as "GenericValue" objects.
   arg1 = GenericValue.int(ty_int, 100)
   arg2 = GenericValue.int(ty_int, 42)

   # Now let's compile and run!
   retval = ee.run_function(f_sum, [arg1, arg2])

   # The return value is also a GenericValue. Let's print it.
   print "returned", retval.as_int()

And here's the output:

::

   returned 142

@@ -1,149 +0,0 @@
*********
Functions
*********

Functions are represented by
`llvm.core.Function <llvm.core.Function.html>`_ objects. They are
contained within modules, and can be created either with the method
``module_obj.add_function`` or the static constructor ``Function.new``.
References to functions already present in a module can be retrieved via
``module.get_function_named`` or by the static constructor method
``Function.get``. All functions in a module can be enumerated by
iterating over ``module_obj.functions``.

.. code-block:: python

   # create a type representing functions that take an integer
   # and return a floating point value
   ft = Type.function(Type.float(), [Type.int()])

   # create a function of this type
   f1 = module_obj.add_function(ft, "func1")

   # or equivalently, like this:
   f2 = Function.new(module_obj, ft, "func2")

   # get a reference to an existing function
   f3 = module_obj.get_function_named("func3")

   # or like this:
   f4 = Function.get(module_obj, "func4")

   # list all function names in a module
   for f in module_obj.functions:
       print f.name

Intrinsics
==========

References to intrinsic functions can be obtained via the static
constructor ``intrinsic``. This returns a ``Function`` object; calling
it is equivalent to invoking the intrinsic. The ``intrinsic`` method has
to be called with a module object, an intrinsic ID (which is a numeric
constant) and a list of the types of arguments (which LLVM uses to
resolve overloaded intrinsic functions).

.. code-block:: python

   # get a reference to the llvm.bswap intrinsic
   bswap = Function.intrinsic(mod, INTR_BSWAP, [Type.int()])

   # call it
   builder.call(bswap, [value])

Here, the constant ``INTR_BSWAP``, available from ``llvm.core``,
represents the LLVM intrinsic
`llvm.bswap <http://www.llvm.org/docs/LangRef.html#int_bswap>`_. The
``[Type.int()]`` selects the version of ``llvm.bswap`` that has a single
32-bit integer argument. The intrinsic IDs are defined as integer
constants in ``llvm.core``. These are:

{% include intrinsics.csv %}

There are also target-specific intrinsics (which correspond to that
target's CPU instructions) available, but they are omitted here for
brevity. The full list can be seen in
`_intrinsic_ids.py <https://github.com/numba/llvmpy/blob/master/llvm/_intrinsic_ids.py>`_.
See the `LLVM Language
Reference <http://www.llvm.org/docs/LangRef.html>`_ for more information
on the intrinsics, and the
`test <https://github.com/numba/llvmpy/blob/master/test/intrinsic.py>`_
directory in the source distribution for more examples. The intrinsic ID
can be retrieved from a function object with the read-only property
``intrinsic_id``.
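For reference, the effect of ``llvm.bswap`` on a 32-bit value can be sketched in plain Python (this is only the semantics, not llvmpy code; the helper name is invented):

```python
def bswap32(x):
    # Reverse the byte order of a 32-bit integer, like llvm.bswap.i32.
    return int.from_bytes((x & 0xFFFFFFFF).to_bytes(4, 'little'), 'big')

swapped = bswap32(0x12345678)  # 0x78563412
```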
**Auto-generation of Intrinsic IDs**
A script (``tool/intrgen.py`` in the source tree) generates the
intrinsic IDs automatically. This is necessary when compiling llvmpy
with a different version of LLVM.

Calling Convention
==================

The function's calling convention can be set using the
``calling_convention`` property. The following (integer) constants
defined in ``llvm.core`` can be used as values:

===================  ================================
Value                Equivalent LLVM Assembly Keyword
===================  ================================
``CC_C``             ``ccc``
``CC_FASTCALL``      ``fastcc``
``CC_COLDCALL``      ``coldcc``
``CC_X86_STDCALL``   ``x86_stdcallcc``
``CC_X86_FASTCALL``  ``x86_fastcallcc``
===================  ================================

See the `LLVM docs <http://www.llvm.org/docs/LangRef.html#callingconv>`_
for more information on each. Backend-specific numbered conventions can
be directly passed as integers.
An arbitrary string identifying which garbage collector to use can be
set or retrieved with the property ``collector``.

The value objects corresponding to the arguments of a function can be
obtained using the read-only property ``args``. These can be iterated
over, and also indexed via integers. An example:

.. code-block:: python

   # list all argument names and types
   for arg in fn.args:
       print arg.name, "of type", arg.type

   # change the name of the first argument
   fn.args[0].name = "objptr"

Basic blocks (see later) are contained within functions. When newly
created, a function has no basic blocks. They have to be added
explicitly, using the ``append_basic_block`` method, which adds a new,
empty basic block as the last one in the function. The first basic block
of the function can be retrieved using the ``get_entry_basic_block``
method. The existing basic blocks can be enumerated by iterating over
the read-only property ``basic_blocks``. The number of basic blocks is
available via the ``basic_block_count`` property. Note that
``get_entry_basic_block`` is slightly faster than ``basic_blocks[0]``,
and likewise ``basic_block_count`` is faster than
``len(fn.basic_blocks)``.

.. code-block:: python

   # add a basic block
   b1 = fn.append_basic_block("entry")

   # get the first one
   b2 = fn.get_entry_basic_block()
   b2 = fn.basic_blocks[0]     # slower than previous method

   # print names of all basic blocks
   for b in fn.basic_blocks:
       print b.name

   # get number of basic blocks
   n = fn.basic_block_count
   n = len(fn.basic_blocks)    # slower than previous method

@@ -1,112 +0,0 @@
Introduction
============
`LLVM <http://www.llvm.org/>`_ (Low-Level Virtual Machine) provides
enough infrastructure to use it as the backend for your compiled, or
JIT-compiled language. It provides extensive optimization support, and
static and dynamic (JIT) backends for many platforms. See the website at
http://www.llvm.org/ to discover more.
Python bindings for LLVM provide a gentler learning curve for working
with the LLVM APIs. They should also make it easier to create working
prototypes and experimental languages.

Together with `clang <http://clang.llvm.org/>`_ or
`llvm-gcc <http://llvm.org/cmds/llvmgcc.html>`_, llvmpy also provides a
means to quickly instrument C and C++ sources. For example, llvm-gcc can
be used to generate the LLVM assembly for a given C source file, which
can then be loaded and manipulated (adding profiling code to every
function, say) using an llvmpy-based Python script.
License
-------
Both LLVM and llvmpy are distributed under (different) permissive open
source licenses. llvmpy uses the `new BSD
license <http://opensource.org/licenses/bsd-license.php>`_. More
information is available
`here <https://github.com/numba/llvmpy/blob/master/LICENSE>`_.
Platforms
---------
llvmpy has been built/tested/reported to work on various GNU/Linux
flavours, BSD, Mac OS X; on i386 and amd64 architectures. Windows is not
supported, for a variety of reasons.
Versions
--------
llvmpy 0.8.2 requires version 3.1 of LLVM. It may not work with
previous versions.
llvmpy has been built and tested with Python 2.7. It should work with
earlier versions. It has not been tried with Python 3.x (patches
welcome).
Installation
============
The Git repo of llvmpy is at https://github.com/numba/llvmpy.git.
You'll need to build and install it before it can be used. At least the
following will be required for this:
- C and C++ compilers (gcc/g++)
- Python itself
- Python development files (headers and libraries)
- LLVM, either installed or built
On debian-based systems, the first three can be installed with the
command ``sudo apt-get install gcc g++ python python-dev``. Ensure that
your distro's repository has the appropriate version of LLVM!
It does not matter which compiler LLVM itself was built with (``g++``,
``llvm-g++`` or any other); llvmpy can be built with any compiler. It
has been tried only with gcc/g++ though.
LLVM and ``--enable-pic``
-------------------------
The result of an LLVM build is a set of static libraries and object
files. The llvmpy contains an extension package that is built into a
shared object (\_core.so) which links to these static libraries and
object files. It is therefore required that the LLVM libraries and
object files be built with the ``-fPIC`` option (generate position
independent code). Be sure to use the ``--enable-pic`` option while
configuring LLVM (default is no PIC), like this:

.. code-block:: bash

   $ cd ~/llvm
   $ ./configure --enable-pic --enable-optimized

llvm-config
-----------
In order to build llvmpy, its build script needs to know where it
can invoke the llvm helper program, ``llvm-config``. If you've installed
LLVM, then this will be available in your ``PATH``, and nothing further
needs to be done. If you've built LLVM yourself, or for any reason
``llvm-config`` is not in your ``PATH``, you'll need to pass the full
path of ``llvm-config`` to the build script.
You'll need to be 'root' to install llvmpy. Remember that your ``PATH``
is different from that of 'root', so even if ``llvm-config`` is in your
``PATH``, it may not be available when you do ``sudo``.
Steps
-----
Get version 3.1 of LLVM and build it. Make sure ``--enable-pic`` is
passed to LLVM's ``configure``.
Get llvmpy and install it:

.. code-block:: bash

   $ git clone git@github.com:numba/llvmpy.git
   $ cd llvmpy
   $ python setup.py install

@@ -1,323 +0,0 @@
*************************************************
Chapter 1: Tutorial Introduction and the Lexer
*************************************************
Written by `Chris Lattner <mailto:sabre@nondot.org>`_ and `Max
Shawabkeh <http://max99x.com>`_
Introduction
============

Welcome to the "Implementing a language with LLVM" tutorial. This
tutorial runs through the implementation of a simple language, showing
how fun and easy it can be. This tutorial will get you up and started as
well as help to build a framework you can extend to other languages. The
code in this tutorial can also be used as a playground to hack on other
LLVM specific things.
It is useful to point out ahead of time that this tutorial is really
about teaching compiler techniques and LLVM specifically, *not* about
teaching modern and sane software engineering principles. In practice,
this means that we'll take a number of shortcuts to simplify the
exposition. If you dig in and use the code as a basis for future
projects, fixing its deficiencies shouldn't be hard.
We've tried to put this tutorial together in a way that makes chapters
easy to skip over if you are already familiar with or are uninterested
in the various pieces. The structure of the tutorial is:
- **`Chapter 1 <#language>`_: Introduction to the Kaleidoscope
  language, and the definition of its Lexer** -- This shows where we
  are going and the basic functionality that we want the language to
  support. In order to make this tutorial maximally understandable and
  hackable, we choose to implement everything in Python instead of
  using lexer and parser generators. LLVM obviously works just fine
  with such tools; feel free to use one if you prefer.
- **`Chapter 2 <PythonLangImpl2.html>`_: Implementing a Parser and
  AST** -- With the lexer in place, we can talk about parsing
  techniques and basic AST construction. This tutorial describes
  recursive descent parsing and operator precedence parsing. Nothing in
  Chapters 1 or 2 is LLVM-specific; the code doesn't even import the
  LLVM modules at this point. :)
- **`Chapter 3 <PythonLangImpl3.html>`_: Code generation to LLVM IR**
  -- With the AST ready, we can show off how easy generation of LLVM IR
  really is.
- **`Chapter 4 <PythonLangImpl4.html>`_: Adding JIT and Optimizer
  support** -- Because a lot of people are interested in using LLVM as
  a JIT, we'll dive right into it and show you the 3 lines it takes to
  add JIT support. LLVM is also useful in many other ways, but this is
  one simple and "sexy" way to show off its power. :)
- **`Chapter 5 <PythonLangImpl5.html>`_: Extending the Language:
  Control Flow** -- With the language up and running, we show how to
  extend it with control flow operations (if/then/else and a 'for'
  loop). This gives us a chance to talk about simple SSA construction
  and control flow.
- **`Chapter 6 <PythonLangImpl6.html>`_: Extending the Language:
  User-defined Operators** -- This is a silly but fun chapter that
  talks about extending the language to let the user define their own
  arbitrary unary and binary operators (with assignable precedence!).
  This lets us build a significant piece of the "language" as library
  routines.
- **`Chapter 7 <PythonLangImpl7.html>`_: Extending the Language:
  Mutable Variables** -- This chapter talks about adding user-defined
  local variables along with an assignment operator. The interesting
  part about this is how easy and trivial it is to construct SSA form
  in LLVM: no, LLVM does *not* require your front-end to construct SSA
  form!
- **`Chapter 8 <PythonLangImpl8.html>`_: Conclusion and other useful
  LLVM tidbits** -- This chapter wraps up the series by talking about
  potential ways to extend the language, but also includes a bunch of
  pointers to info about "special topics" like adding garbage
  collection support, exceptions, debugging, support for "spaghetti
  stacks", and a bunch of other tips and tricks.
By the end of the tutorial, we'll have written a bit less than 540
non-comment, non-blank lines of code. With this small amount of code,
we'll have built up a very reasonable compiler for a non-trivial
language, including a hand-written lexer, parser, AST, as well as code
generation support with a JIT compiler. While other systems may have
interesting "hello world" tutorials, I think the breadth of this
tutorial is a great testament to the strengths of LLVM and why you
should consider it if you're interested in language or compiler design.
A note about this tutorial: we expect you to extend the language and
play with it on your own. Take the code and go crazy hacking away at it,
compilers don't need to be scary creatures - it can be a lot of fun to
play with languages!
--------------
The Basic Language
==================

This tutorial will be illustrated with a toy language that we'll call
"`Kaleidoscope <http://en.wikipedia.org/wiki/Kaleidoscope>`_\ " (derived
from "meaning beautiful, form, and view"). Kaleidoscope is a procedural
language that allows you to define functions, use conditionals, math,
etc. Over the course of the tutorial, we'll extend Kaleidoscope to
support the if/then/else construct, a for loop, user defined operators,
JIT compilation with a simple command line interface, etc.
Because we want to keep things simple, the only datatype in Kaleidoscope
is a 64-bit floating point type. As such, all values are implicitly
double precision and the language doesn't require type declarations.
This gives the language a very nice and simple syntax. For example, the
following simple example computes `Fibonacci
numbers <http://en.wikipedia.org/wiki/Fibonacci_number>`_:

.. code-block:: none

   # Compute the x'th fibonacci number.
   def fib(x):
       if x < 3:
           return 1
       else:
           return fib(x-1)+fib(x-2)

   # This expression will compute the 40th number.
   fib(40)

We also allow Kaleidoscope to call into standard library functions (the
LLVM JIT makes this completely trivial). This means that you can use the
'extern' keyword to define a function before you use it (this is also
useful for mutually recursive functions). For example:

.. code-block:: none

   extern sin(arg);
   extern cos(arg);
   extern atan2(arg1 arg2);

   atan2(sin(0.4), cos(42))

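Since these externs resolve to the C math library under the JIT, the value of the final expression can be sanity-checked with Python's ``math`` module (a sketch; the exact result is an ordinary float):

```python
import math

# Mirror the Kaleidoscope expression atan2(sin(.4), cos(42)) in Python.
result = math.atan2(math.sin(0.4), math.cos(42))
```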
A more interesting example is included in Chapter 6 where we write a
little Kaleidoscope application that
`displays <PythonLangImpl6.html#example>`_ a Mandelbrot Set at various
levels of magnification.
Let's dive into the implementation of this language!
--------------
The Lexer
=========

When it comes to implementing a language, the first thing needed is the
ability to process a text file and recognize what it says. The
traditional way to do this is to use a
`lexer <http://en.wikipedia.org/wiki/Lexical_analysis>`_ (aka
'scanner') to break the input up into "tokens". Each token returned by
the lexer includes a token type and potentially some metadata (e.g. the
numeric value of a number). First, we define the possibilities:

.. code-block:: python

   # The lexer yields one of these types for each token.
   class EOFToken(object): pass
   class DefToken(object): pass
   class ExternToken(object): pass

   class IdentifierToken(object):
       def __init__(self, name):
           self.name = name

   class NumberToken(object):
       def __init__(self, value):
           self.value = value

   class CharacterToken(object):
       def __init__(self, char):
           self.char = char
       def __eq__(self, other):
           return isinstance(other, CharacterToken) and self.char == other.char
       def __ne__(self, other):
           return not self == other

Each token yielded by our lexer will be of one of the above types. For
simple tokens that are always the same, like the "def" keyword, the
lexer will yield ``DefToken()``. Identifiers, numbers and characters,
on the other hand, have extra data, so when the lexer encounters the
number 123.45, it will emit it as ``NumberToken(123.45)``. An identifier
``foo`` will be emitted as ``IdentifierToken('foo')``. And finally, an
unknown character like '+' will be returned as ``CharacterToken('+')``.
You may notice that we overload the equality and inequality operators
for the characters; this will later simplify character comparisons in
the parser code.
The actual implementation of the lexer is a single function called
``Tokenize``, which takes a string and
`yields <http://docs.python.org/reference/simple_stmts.html#the-yield-statement>`_
tokens. For simplicity, we will use `regular
expressions <http://docs.python.org/library/re.html>`_ to parse out the
tokens. This is terribly inefficient, but perfectly sufficient for our
needs.
First, we define the regular expressions for our tokens. Numbers are
strings of digits, optionally followed by a period and another string of
digits. Identifiers (and keywords) are alphanumeric strings starting
with a letter, and comments are anything between a hash (``#``) and the
end of the line.

.. code-block:: python

   import re

   ...

   # Regular expressions matching the tokens and comments of our language.
   REGEX_NUMBER = re.compile('[0-9]+(?:\.[0-9]+)?')
   REGEX_IDENTIFIER = re.compile('[a-zA-Z][a-zA-Z0-9]*')
   REGEX_COMMENT = re.compile('#.*')

Next, let's start defining the ``Tokenize`` function itself. The first
thing we need to do is set up a loop that scans the string, while
ignoring whitespace between tokens:

.. code-block:: python

   def Tokenize(string):
       while string:
           # Skip whitespace.
           if string[0].isspace():
               string = string[1:]
               continue

           ...

Next we want to find out what the next token is. For this we run the
regexes we defined above on the remainder of the string. To simplify the
rest of the code, we run all three regexes each time. As mentioned
above, inefficiencies are ignored for the purpose of this tutorial:

.. code-block:: python

   # Run regexes.
   comment_match = REGEX_COMMENT.match(string)
   number_match = REGEX_NUMBER.match(string)
   identifier_match = REGEX_IDENTIFIER.match(string)

Now we check if any of the regexes matched. For comments, we simply
ignore the captured match:

.. code-block:: python

   # Check if any of the regexes matched and yield
   # the appropriate result.
   if comment_match:
       comment = comment_match.group(0)
       string = string[len(comment):]

For numbers, we yield the captured match, converted to a float and
tagged with the appropriate token type:

.. code-block:: python

   elif number_match:
       number = number_match.group(0)
       yield NumberToken(float(number))
       string = string[len(number):]

The identifier case is a little more complex. We have to check for
keywords to decide whether we have captured an identifier or a keyword:

.. code-block:: python

   elif identifier_match:
       identifier = identifier_match.group(0)
       # Check if we matched a keyword.
       if identifier == 'def':
           yield DefToken()
       elif identifier == 'extern':
           yield ExternToken()
       else:
           yield IdentifierToken(identifier)
       string = string[len(identifier):]

Finally, if we haven't recognized a comment, a number, or an identifier,
we yield the current character as an "unknown character" token. This is
used, for example, for operators like ``+`` or ``*``:

.. code-block:: python

   else:
       # Yield the unknown character.
       yield CharacterToken(string[0])
       string = string[1:]

Once we're done with the loop, we return a final end-of-file token:

.. code-block:: python

   yield EOFToken()
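Assembled from the fragments above, the complete lexer can be exercised like this (the test input string is an invented example):

```python
import re

# Token types, as defined earlier in the chapter.
class EOFToken(object): pass
class DefToken(object): pass
class ExternToken(object): pass

class IdentifierToken(object):
    def __init__(self, name):
        self.name = name

class NumberToken(object):
    def __init__(self, value):
        self.value = value

class CharacterToken(object):
    def __init__(self, char):
        self.char = char
    def __eq__(self, other):
        return isinstance(other, CharacterToken) and self.char == other.char
    def __ne__(self, other):
        return not self == other

REGEX_NUMBER = re.compile(r'[0-9]+(?:\.[0-9]+)?')
REGEX_IDENTIFIER = re.compile(r'[a-zA-Z][a-zA-Z0-9]*')
REGEX_COMMENT = re.compile(r'#.*')

def Tokenize(string):
    while string:
        # Skip whitespace.
        if string[0].isspace():
            string = string[1:]
            continue
        # Run regexes and yield the appropriate token.
        comment_match = REGEX_COMMENT.match(string)
        number_match = REGEX_NUMBER.match(string)
        identifier_match = REGEX_IDENTIFIER.match(string)
        if comment_match:
            string = string[len(comment_match.group(0)):]
        elif number_match:
            number = number_match.group(0)
            yield NumberToken(float(number))
            string = string[len(number):]
        elif identifier_match:
            identifier = identifier_match.group(0)
            if identifier == 'def':
                yield DefToken()
            elif identifier == 'extern':
                yield ExternToken()
            else:
                yield IdentifierToken(identifier)
            string = string[len(identifier):]
        else:
            yield CharacterToken(string[0])
            string = string[1:]
    yield EOFToken()

tokens = list(Tokenize('def add(x y) x + y  # a comment'))
```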

@@ -1,963 +0,0 @@
***************************************************
Chapter 2: Implementing a Parser and AST
***************************************************
Written by `Chris Lattner <mailto:sabre@nondot.org>`_ and `Max
Shawabkeh <http://max99x.com>`_
Introduction
============

Welcome to Chapter 2 of the `Implementing a language with
LLVM <http://www.llvm.org/docs/tutorial/index.html>`_ tutorial. This
chapter shows you how to use the lexer, built in `Chapter
1 <PythonLangImpl1.html>`_, to build a full
`parser <http://en.wikipedia.org/wiki/Parsing>`_ for our Kaleidoscope
language. Once we have a parser, we'll define and build an `Abstract
Syntax Tree <http://en.wikipedia.org/wiki/Abstract_syntax_tree>`_ (AST).
The parser we will build uses a combination of `Recursive Descent
Parsing <http://en.wikipedia.org/wiki/Recursive_descent_parser>`_ and
`Operator-Precedence
Parsing <http://en.wikipedia.org/wiki/Operator-precedence_parser>`_ to
parse the Kaleidoscope language (the latter for binary expressions and
the former for everything else). Before we get to parsing though, let's
talk about the output of the parser: the Abstract Syntax Tree.
--------------
The Abstract Syntax Tree (AST)
==============================

The AST for a program captures its behavior in such a way that it is
easy for later stages of the compiler (e.g. code generation) to
interpret. We basically want one object for each construct in the
language, and the AST should closely model the language. In
Kaleidoscope, we have expressions, a prototype, and a function object.
We'll start with expressions first:

.. code-block:: python

   # Base class for all expression nodes.
   class ExpressionNode(object):
       pass

   # Expression class for numeric literals like "1.0".
   class NumberExpressionNode(ExpressionNode):
       def __init__(self, value):
           self.value = value

The code above shows the definition of the base ExpressionNode class and
one subclass which we use for numeric literals. The important thing to
note about this code is that the NumberExpressionNode class captures the
numeric value of the literal as an instance variable. This allows later
phases of the compiler to know what the stored numeric value is.
Right now we only create the AST, so there are no useful methods on
them. It would be very easy to add a virtual method to pretty print the
code, for example. Here are the other expression AST node definitions
that we'll use in the basic form of the Kaleidoscope language:

.. code-block:: python

   # Expression class for referencing a variable, like "a".
   class VariableExpressionNode(ExpressionNode):
       def __init__(self, name):
           self.name = name

   # Expression class for a binary operator.
   class BinaryOperatorExpressionNode(ExpressionNode):
       def __init__(self, operator, left, right):
           self.operator = operator
           self.left = left
           self.right = right

   # Expression class for function calls.
   class CallExpressionNode(ExpressionNode):
       def __init__(self, callee, args):
           self.callee = callee
           self.args = args

This is all (intentionally) rather straight-forward: variables capture
the variable name, binary operators capture their opcode (e.g. '+'), and
calls capture a function name as well as a list of any argument
expressions. One thing that is nice about our AST is that it captures
the language features without talking about the syntax of the language.
Note that there is no discussion about precedence of binary operators,
lexical structure, etc.
For our basic language, these are all of the expression nodes we'll
define. Because it doesn't have conditional control flow, it isn't
Turing-complete; we'll fix that in a later installment. The two things
we need next are a way to talk about the interface to a function, and a
way to talk about functions themselves:
.. code-block:: python

    # This class represents the "prototype" for a function, which captures its name,
    # and its argument names (thus implicitly the number of arguments the function
    # takes).
    class PrototypeNode(object):
        def __init__(self, name, args):
            self.name = name
            self.args = args

    # This class represents a function definition itself.
    class FunctionNode(object):
        def __init__(self, prototype, body):
            self.prototype = prototype
            self.body = body
In Kaleidoscope, functions are typed with just a count of their
arguments. Since all values are double precision floating point, the
type of each argument doesn't need to be stored anywhere. In a more
aggressive and realistic language, the ``ExpressionNode`` class would
probably have a type field.
With this scaffolding, we can now talk about parsing expressions and
function bodies in Kaleidoscope.
--------------
Parser Basics
=============
Now that we have an AST to build, we need to define the parser code to
build it. The idea here is that we want to parse something like
``x + y`` (which is returned as three tokens by the lexer) into an AST
that could be generated with calls like this:
.. code-block:: python

    x = VariableExpressionNode('x')
    y = VariableExpressionNode('y')
    result = BinaryOperatorExpressionNode('+', x, y)
In order to do this, we'll start by defining a lightweight ``Parser``
class with some basic helper routines:
.. code-block:: python

    class Parser(object):

        def __init__(self, tokens, binop_precedence):
            self.tokens = tokens
            self.binop_precedence = binop_precedence
            self.Next()

        # Provide a simple token buffer. Parser.current is the current token the
        # parser is looking at. Parser.Next() reads another token from the lexer
        # and updates Parser.current with its results.
        def Next(self):
            self.current = self.tokens.next()
This implements a simple token buffer around the lexer. This allows us
to look one token ahead at what the lexer is returning. Every function
in our parser will assume that ``self.current`` is the current token
that needs to be parsed. Note that the first token is read as soon as
the parser is instantiated. Let us ignore the ``binop_precedence``
parameter for now. It will be explained when we start `parsing binary
operators <#parserbinops>`_.
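The buffering idea can be demonstrated on its own. Below is a minimal sketch with a stand-in character lexer in place of the tutorial's ``Tokenize``; all names here are hypothetical:

```python
def tokenize(text):
    # Stand-in lexer: yields non-space characters, then None as an EOF marker.
    for ch in text:
        if not ch.isspace():
            yield ch
    yield None

class TokenBuffer(object):
    def __init__(self, tokens):
        self.tokens = tokens
        self.advance()            # the first token is read immediately,
                                  # just as in Parser.__init__
    def advance(self):
        self.current = next(self.tokens)

buf = TokenBuffer(tokenize('x + y'))
seen = []
while buf.current is not None:
    seen.append(buf.current)      # any routine may inspect buf.current freely
    buf.advance()
print(seen)
```

Every consumer sees one token of lookahead in ``buf.current`` without ever touching the underlying generator directly.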
With these basic helper functions, we can implement the first piece of
our grammar: numeric literals.
--------------
Basic Expression Parsing
========================
We start with numeric literals, because they are the simplest to
process. For each production in our grammar, we'll define a function
which parses that production. For numeric literals, we have:
.. code-block:: python

    # numberexpr ::= number
    def ParseNumberExpr(self):
        result = NumberExpressionNode(self.current.value)
        self.Next()  # consume the number.
        return result
This method is very simple: it expects to be called when the current
token is a ``NumberToken``. It takes the current number value, creates a
``NumberExpressionNode``, advances to the next token, and finally
returns.
There are some interesting aspects to this. The most important one is
that this routine eats all of the tokens that correspond to the
production and returns the lexer buffer with the next token (which is
not part of the grammar production) ready to go. This is a fairly
standard way to go for recursive descent parsers. For a better example,
the parenthesis operator is defined like this:
.. code-block:: python

    # parenexpr ::= '(' expression ')'
    def ParseParenExpr(self):
        self.Next()  # eat '('.

        contents = self.ParseExpression()
        if self.current != CharacterToken(')'):
            raise RuntimeError('Expected ")".')
        self.Next()  # eat ')'.
        return contents
This function illustrates an interesting aspect of the parser. The
function uses recursion by calling ``ParseExpression`` (we will soon see
that ``ParseExpression`` can call ``ParseParenExpr``). This is powerful
because it allows us to handle recursive grammars, and keeps each
production very simple. Note that parentheses do not cause construction
of AST nodes themselves. While we could do it this way, the most
important role of parentheses is to guide the parser and provide
grouping. Once the parser constructs the AST, parentheses are not
needed.
The next simple production is for handling variable references and
function calls:
.. code-block:: python

    # identifierexpr ::= identifier | identifier '(' expression* ')'
    def ParseIdentifierExpr(self):
        identifier_name = self.current.name
        self.Next()  # eat identifier.

        if self.current != CharacterToken('('):  # Simple variable reference.
            return VariableExpressionNode(identifier_name)

        # Call.
        self.Next()  # eat '('.
        args = []
        if self.current != CharacterToken(')'):
            while True:
                args.append(self.ParseExpression())
                if self.current == CharacterToken(')'):
                    break
                elif self.current != CharacterToken(','):
                    raise RuntimeError('Expected ")" or "," in argument list.')
                self.Next()
        self.Next()  # eat ')'.
        return CallExpressionNode(identifier_name, args)
This routine follows the same style as the other routines. It expects to
be called if the current token is an ``IdentifierToken``. It also has
recursion and error handling. One interesting aspect of this is that it
uses *look-ahead* to determine if the current identifier is a stand
alone variable reference or if it is a function call expression. It
handles this by checking to see if the token after the identifier is a
'(' token, constructing either a ``VariableExpressionNode`` or
``CallExpressionNode`` as appropriate.
Now that we have all of our simple expression-parsing logic in place, we
can define a helper function to wrap it together into one entry point.
We call this class of expressions "primary" expressions, for reasons
that will become more clear `later in the
tutorial <PythonLangImpl6.html#unary>`_. In order to parse an arbitrary
primary expression, we need to determine what sort of expression it is:
.. code-block:: python

    # primary ::= identifierexpr | numberexpr | parenexpr
    def ParsePrimary(self):
        if isinstance(self.current, IdentifierToken):
            return self.ParseIdentifierExpr()
        elif isinstance(self.current, NumberToken):
            return self.ParseNumberExpr()
        elif self.current == CharacterToken('('):
            return self.ParseParenExpr()
        else:
            raise RuntimeError('Unknown token when expecting an expression.')
Now that you see the definition of this function, it is more obvious why
we can assume the state of ``Parser.current`` in the various functions.
This uses look-ahead to determine which sort of expression is being
inspected, and then parses it with a function call.
Now that basic expressions are handled, we need to handle binary
expressions. They are a bit more complex.
--------------
Binary Expression Parsing
=========================
Binary expressions are significantly harder to parse because they are
often ambiguous. For example, when given the string ``x+y*z``, the
parser can choose to parse it as either ``(x+y)*z`` or ``x+(y*z)``. With
common definitions from mathematics, we expect the latter parse, because
``*`` (multiplication) has higher *precedence* than ``+`` (addition).
There are many ways to handle this, but an elegant and efficient way is
to use `Operator-Precedence
Parsing <http://en.wikipedia.org/wiki/Operator-precedence_parser>`_.
This parsing technique uses the precedence of binary operators to guide
recursion. To start with, we need a table of precedences. Remember the
``binop_precedence`` parameter we passed to the ``Parser`` constructor?
Now is the time to use it:
.. code-block:: python

    def main():
        # Install standard binary operators.
        # 1 is lowest possible precedence. 40 is the highest.
        operator_precedence = {
            '<': 10,
            '+': 20,
            '-': 20,
            '*': 40
        }

        # Run the main "interpreter loop".
        while True:
            ...
            parser = Parser(Tokenize(raw), operator_precedence)
For the basic form of Kaleidoscope, we will only support 4 binary
operators (this can obviously be extended by you, our brave and intrepid
reader). Having a dictionary makes it easy to add new operators and
makes it clear that the algorithm doesn't depend on the specific
operators involved, but it would be easy enough to eliminate the map and
hardcode the comparisons.
We also define a helper function to get the precedence of the current
token, or -1 if the token is not a binary operator:
.. code-block:: python

    # Gets the precedence of the current token, or -1 if the token is not a
    # binary operator.
    def GetCurrentTokenPrecedence(self):
        if isinstance(self.current, CharacterToken):
            return self.binop_precedence.get(self.current.char, -1)
        else:
            return -1
With the helper above defined, we can now start parsing binary
expressions. The basic idea of operator precedence parsing is to break
down an expression with potentially ambiguous binary operators into
pieces. Consider, for example, the expression ``a+b+(c+d)*e*f+g``.
Operator precedence parsing considers this as a stream of primary
expressions separated by binary operators. As such, it will first parse
the leading primary expression ``a``, then it will see the pairs
``[+, b] [+, (c+d)] [*, e] [*, f] and [+, g]``. Note that because
parentheses are primary expressions, the binary expression parser
doesn't need to worry about nested subexpressions like (c+d) at all.
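The pair-stream view can be sketched directly. Assuming the input has already been reduced to primaries and operators (with parenthesized subexpressions collapsed into single primary tokens), a hypothetical helper produces exactly the pairs listed above:

```python
# Hypothetical helper, not in the tutorial code: split a token list into a
# leading primary followed by [operator, primary] pairs.
def to_pairs(tokens):
    first, rest = tokens[0], tokens[1:]
    pairs = [(rest[i], rest[i + 1]) for i in range(0, len(rest), 2)]
    return first, pairs

# '(c+d)' stands for a parenthesized subexpression, which counts as one primary.
first, pairs = to_pairs(['a', '+', 'b', '+', '(c+d)', '*', 'e', '*', 'f', '+', 'g'])
print(first, pairs)
```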
To start, an expression is a primary expression potentially followed by
a sequence of ``[binop,primaryexpr]`` pairs:
.. code-block:: python

    # expression ::= primary binoprhs
    def ParseExpression(self):
        left = self.ParsePrimary()
        return self.ParseBinOpRHS(left, 0)
``ParseBinOpRHS`` is the function that parses the sequence of pairs for
us. It takes a precedence and a pointer to an expression for the part
that has been parsed so far. Note that ``x`` is a perfectly valid
expression: As such, ``binoprhs`` is allowed to be empty, in which case
it returns the expression that is passed into it. In our example above,
the code passes the expression for ``a`` into ``ParseBinOpRHS`` and the
current token is ``+``.
The precedence value passed into ``ParseBinOpRHS`` indicates the *minimal
operator precedence* that the function is allowed to eat. For
example, if the current pair stream is ``[+, x]`` and ``ParseBinOpRHS``
is passed in a precedence of 40, it will not consume any tokens (because
the precedence of '+' is only 20). With this in mind, ``ParseBinOpRHS``
starts with:
.. code-block:: python

    # binoprhs ::= (operator primary)*
    def ParseBinOpRHS(self, left, left_precedence):
        # If this is a binary operator, find its precedence.
        while True:
            precedence = self.GetCurrentTokenPrecedence()

            # If this is a binary operator that binds at least as tightly as the
            # current one, consume it; otherwise we are done.
            if precedence < left_precedence:
                return left
This code gets the precedence of the current token and checks to see if
it is too low. Because we defined invalid tokens to have a precedence of
-1, this check implicitly knows that the pair-stream ends when the token
stream runs out of binary operators. If this check succeeds, we know
that the token is a binary operator and that it will be included in this
expression:
.. code-block:: python

    binary_operator = self.current.char
    self.Next()  # eat the operator.

    # Parse the primary expression after the binary operator.
    right = self.ParsePrimary()
As such, this code eats (and remembers) the binary operator and then
parses the primary expression that follows. This builds up the whole
pair, the first of which is ``[+, b]`` for the running example.
Now that we parsed the left-hand side of an expression and one pair of
the RHS sequence, we have to decide which way the expression associates.
In particular, we could have ``(a+b) binop unparsed`` or
``a + (b binop unparsed)``. To determine this, we look ahead at
``binop`` to determine its precedence and compare it to BinOp's
precedence (which is '+' in this case):
.. code-block:: python

    # If binary_operator binds less tightly with right than the operator after
    # right, let the pending operator take right as its left.
    next_precedence = self.GetCurrentTokenPrecedence()
    if precedence < next_precedence:
If the precedence of the binop to the right of ``RHS`` is lower or equal
to the precedence of our current operator, then we know that the
parentheses associate as ``(a+b) binop ...``. In our example, the
current operator is ``+`` and the next operator is ``+``, we know that
they have the same precedence. In this case we'll create the AST node
for ``a+b``, and then continue parsing:
.. code-block:: python

    if precedence < next_precedence:
        ...  # if body omitted ...

    # Merge left/right.
    left = BinaryOperatorExpressionNode(binary_operator, left, right)
In our example above, this will turn ``a+b+`` into ``(a+b)`` and execute
the next iteration of the loop, with ``+`` as the current token. The
code above will eat, remember, and parse ``(c+d)`` as the primary
expression, which makes the current pair equal to ``[+, (c+d)]``. It
will then evaluate the 'if' conditional above with ``*`` as the binop to
the right of the primary. In this case, the precedence of ``*`` is
higher than the precedence of ``+`` so the if condition will be entered.
The critical question left here is: how can the if condition parse the
right-hand side in full? In
particular, to build the AST correctly for our example, it needs to get
all of ``( c + d ) * e * f`` as the RHS expression variable. The code to
do this is surprisingly simple (code from the above two blocks
duplicated for context):
.. code-block:: python

    # If binary_operator binds less tightly with right than the operator after
    # right, let the pending operator take right as its left.
    next_precedence = self.GetCurrentTokenPrecedence()
    if precedence < next_precedence:
        right = self.ParseBinOpRHS(right, precedence + 1)

    # Merge left/right.
    left = BinaryOperatorExpressionNode(binary_operator, left, right)
At this point, we know that the binary operator to the RHS of our
primary has higher precedence than the binop we are currently parsing.
As such, we know that any sequence of pairs whose operators are all
higher precedence than ``+`` should be parsed together and returned as
``RHS``. To do this, we recursively invoke the ``ParseBinOpRHS``
function specifying ``precedence + 1`` as the minimum precedence
required for it to continue. In our example above, this will cause it to
return the AST node for ``(c+d)*e*f`` as RHS, which is then set as the
RHS of the '+' expression.
Finally, on the next iteration of the while loop, the ``+g`` piece is
parsed and added to the AST. With this little bit of code (11
non-trivial lines), we correctly handle fully general binary expression
parsing in a very elegant way. This was a whirlwind tour of this code,
and it is somewhat subtle. I recommend running through it with a few
tough examples to see how it works.
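One way to run through it is with a stripped-down model of the same loop. The sketch below is not the tutorial's parser: it works on a pre-split token list (parenthesized subexpressions collapsed into single primaries) and builds fully parenthesized strings instead of AST nodes, but the precedence-climbing logic mirrors ``ParseBinOpRHS``:

```python
# Self-contained precedence-climbing sketch; helper names are ours.
PRECEDENCE = {'<': 10, '+': 20, '-': 20, '*': 40}

def parse(tokens):
    pos = [0]
    def peek():
        return tokens[pos[0]] if pos[0] < len(tokens) else None
    def eat():
        tok = tokens[pos[0]]
        pos[0] += 1
        return tok
    def parse_binop_rhs(left, min_prec):
        while True:
            prec = PRECEDENCE.get(peek(), -1)
            # Stop when the next operator binds less tightly than allowed
            # (also handles end of input, where prec is -1).
            if prec < min_prec:
                return left
            op = eat()
            right = eat()          # primaries are single tokens in this model
            next_prec = PRECEDENCE.get(peek(), -1)
            if prec < next_prec:
                # The operator after right binds tighter: let it take right.
                right = parse_binop_rhs(right, prec + 1)
            left = '(%s%s%s)' % (left, op, right)
    return parse_binop_rhs(eat(), 0)

print(parse(['a', '+', 'b', '+', '(c+d)', '*', 'e', '*', 'f', '+', 'g']))
```

Running it on the chapter's example groups the ``*`` chain first and folds the ``+`` chain left-to-right, exactly as the prose walkthrough describes.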
This wraps up handling of expressions. At this point, we can point the
parser at an arbitrary token stream and build an expression from it,
stopping at the first token that is not part of the expression. Next up
we need to handle function definitions, etc.
--------------
Parsing the Rest
================
The next thing missing is handling of function prototypes. In
Kaleidoscope, these are used both for 'extern' function declarations as
well as function body definitions. The code to do this is
straight-forward and not very interesting (once you've survived
expressions):
.. code-block:: python

    # prototype ::= id '(' id* ')'
    def ParsePrototype(self):
        if not isinstance(self.current, IdentifierToken):
            raise RuntimeError('Expected function name in prototype.')

        function_name = self.current.name
        self.Next()  # eat function name.

        if self.current != CharacterToken('('):
            raise RuntimeError('Expected "(" in prototype.')
        self.Next()  # eat '('.

        arg_names = []
        while isinstance(self.current, IdentifierToken):
            arg_names.append(self.current.name)
            self.Next()
        if self.current != CharacterToken(')'):
            raise RuntimeError('Expected ")" in prototype.')

        # Success.
        self.Next()  # eat ')'.
        return PrototypeNode(function_name, arg_names)
Given this, a function definition is very simple, just a prototype plus
an expression to implement the body:
.. code-block:: python

    # definition ::= 'def' prototype expression
    def ParseDefinition(self):
        self.Next()  # eat def.
        proto = self.ParsePrototype()
        body = self.ParseExpression()
        return FunctionNode(proto, body)
In addition, we support 'extern' to declare functions like 'sin' and
'cos' as well as to support forward declaration of user functions. These
'extern's are just prototypes with no body:
.. code-block:: python

    # external ::= 'extern' prototype
    def ParseExtern(self):
        self.Next()  # eat extern.
        return self.ParsePrototype()
Finally, we'll also let the user type in arbitrary top-level expressions
and evaluate them on the fly. We will handle this by defining anonymous
nullary (zero argument) functions for them:
.. code-block:: python

    # toplevelexpr ::= expression
    def ParseTopLevelExpr(self):
        proto = PrototypeNode('', [])
        return FunctionNode(proto, self.ParseExpression())
Now that we have all the pieces, let's build a little driver that will
let us actually *execute* this code we've built!
--------------
The Driver
==========
The driver for this simply invokes all of the parsing pieces with a
top-level dispatch loop. There isn't much interesting here, so I'll just
include the top-level loop. See `below <#code>`_ for full code.
.. code-block:: python

    # Run the main "interpreter loop".
    while True:
        print 'ready>',
        try:
            raw = raw_input()
        except KeyboardInterrupt:
            return

        parser = Parser(Tokenize(raw), operator_precedence)
        while True:
            # top ::= definition | external | expression | EOF
            if isinstance(parser.current, EOFToken):
                break
            if isinstance(parser.current, DefToken):
                parser.HandleDefinition()
            elif isinstance(parser.current, ExternToken):
                parser.HandleExtern()
            else:
                parser.HandleTopLevelExpression()
Here we create a new ``Parser`` for each line read, and try to parse out
all the expressions, declarations and definitions in the line. We also
allow the user to quit using Ctrl+C.
--------------
Conclusions
===========
With just under 330 lines of commented code (200 lines of non-comment,
non-blank code), we fully defined our minimal language, including a
lexer, parser, and AST builder. With this done, the executable will
validate Kaleidoscope code and tell us if it is grammatically invalid.
For example, here is a sample interaction:
::

    $ python kaleidoscope.py
    ready> def foo(x y) x+foo(y, 4.0)
    Parsed a function definition.
    ready> def foo(x y) x+y y
    Parsed a function definition.
    Parsed a top-level expression.
    ready> def foo(x y) x+y )
    Parsed a function definition.
    Error: Unknown token when expecting an expression.
    ready> extern sin(a);
    Parsed an extern.
    ready> ^C
    $
There is a lot of room for extension here. You can define new AST nodes,
extend the language in many ways, etc. In the `next
installment <PythonLangImpl3.html>`_, we will describe how to generate
LLVM Intermediate Representation (IR) from the AST.
--------------
Full Code Listing
=================
Here is the complete code listing for this and the previous chapter.
Note that it is fully self-contained: you don't need LLVM or any
external libraries at all for this.
.. code-block:: python

    #!/usr/bin/env python

    import re
Lexer
-----
.. code-block:: python

    # The lexer yields one of these types for each token.
    class EOFToken(object):
        pass

    class DefToken(object):
        pass

    class ExternToken(object):
        pass

    class IdentifierToken(object):
        def __init__(self, name):
            self.name = name

    class NumberToken(object):
        def __init__(self, value):
            self.value = value

    class CharacterToken(object):
        def __init__(self, char):
            self.char = char
        def __eq__(self, other):
            return isinstance(other, CharacterToken) and self.char == other.char
        def __ne__(self, other):
            return not self == other

    # Regular expressions that match tokens and comments of our language.
    REGEX_NUMBER = re.compile('[0-9]+(?:\.[0-9]+)?')
    REGEX_IDENTIFIER = re.compile('[a-zA-Z][a-zA-Z0-9]*')
    REGEX_COMMENT = re.compile('#.*')

    def Tokenize(string):
        while string:
            # Skip whitespace.
            if string[0].isspace():
                string = string[1:]
                continue

            # Run regexes.
            comment_match = REGEX_COMMENT.match(string)
            number_match = REGEX_NUMBER.match(string)
            identifier_match = REGEX_IDENTIFIER.match(string)

            # Check if any of the regexes matched and yield the appropriate result.
            if comment_match:
                comment = comment_match.group(0)
                string = string[len(comment):]
            elif number_match:
                number = number_match.group(0)
                yield NumberToken(float(number))
                string = string[len(number):]
            elif identifier_match:
                identifier = identifier_match.group(0)
                # Check if we matched a keyword.
                if identifier == 'def':
                    yield DefToken()
                elif identifier == 'extern':
                    yield ExternToken()
                else:
                    yield IdentifierToken(identifier)
                string = string[len(identifier):]
            else:
                # Yield the ASCII value of the unknown character.
                yield CharacterToken(string[0])
                string = string[1:]

        yield EOFToken()
Abstract Syntax Tree (aka Parse Tree)
-------------------------------------
.. code-block:: python

    # Base class for all expression nodes.
    class ExpressionNode(object):
        pass

    # Expression class for numeric literals like "1.0".
    class NumberExpressionNode(ExpressionNode):
        def __init__(self, value):
            self.value = value

    # Expression class for referencing a variable, like "a".
    class VariableExpressionNode(ExpressionNode):
        def __init__(self, name):
            self.name = name

    # Expression class for a binary operator.
    class BinaryOperatorExpressionNode(ExpressionNode):
        def __init__(self, operator, left, right):
            self.operator = operator
            self.left = left
            self.right = right

    # Expression class for function calls.
    class CallExpressionNode(ExpressionNode):
        def __init__(self, callee, args):
            self.callee = callee
            self.args = args

    # This class represents the "prototype" for a function, which captures its name,
    # and its argument names (thus implicitly the number of arguments the function
    # takes).
    class PrototypeNode(object):
        def __init__(self, name, args):
            self.name = name
            self.args = args

    # This class represents a function definition itself.
    class FunctionNode(object):
        def __init__(self, prototype, body):
            self.prototype = prototype
            self.body = body
Parser
------
.. code-block:: python

    class Parser(object):

        def __init__(self, tokens, binop_precedence):
            self.tokens = tokens
            self.binop_precedence = binop_precedence
            self.Next()

        # Provide a simple token buffer. Parser.current is the current token the
        # parser is looking at. Parser.Next() reads another token from the lexer
        # and updates Parser.current with its results.
        def Next(self):
            self.current = self.tokens.next()

        # Gets the precedence of the current token, or -1 if the token is not a
        # binary operator.
        def GetCurrentTokenPrecedence(self):
            if isinstance(self.current, CharacterToken):
                return self.binop_precedence.get(self.current.char, -1)
            else:
                return -1

        # identifierexpr ::= identifier | identifier '(' expression* ')'
        def ParseIdentifierExpr(self):
            identifier_name = self.current.name
            self.Next()  # eat identifier.

            if self.current != CharacterToken('('):  # Simple variable reference.
                return VariableExpressionNode(identifier_name)

            # Call.
            self.Next()  # eat '('.
            args = []
            if self.current != CharacterToken(')'):
                while True:
                    args.append(self.ParseExpression())
                    if self.current == CharacterToken(')'):
                        break
                    elif self.current != CharacterToken(','):
                        raise RuntimeError('Expected ")" or "," in argument list.')
                    self.Next()
            self.Next()  # eat ')'.
            return CallExpressionNode(identifier_name, args)

        # numberexpr ::= number
        def ParseNumberExpr(self):
            result = NumberExpressionNode(self.current.value)
            self.Next()  # consume the number.
            return result

        # parenexpr ::= '(' expression ')'
        def ParseParenExpr(self):
            self.Next()  # eat '('.

            contents = self.ParseExpression()
            if self.current != CharacterToken(')'):
                raise RuntimeError('Expected ")".')
            self.Next()  # eat ')'.
            return contents

        # primary ::= identifierexpr | numberexpr | parenexpr
        def ParsePrimary(self):
            if isinstance(self.current, IdentifierToken):
                return self.ParseIdentifierExpr()
            elif isinstance(self.current, NumberToken):
                return self.ParseNumberExpr()
            elif self.current == CharacterToken('('):
                return self.ParseParenExpr()
            else:
                raise RuntimeError('Unknown token when expecting an expression.')

        # binoprhs ::= (operator primary)*
        def ParseBinOpRHS(self, left, left_precedence):
            # If this is a binary operator, find its precedence.
            while True:
                precedence = self.GetCurrentTokenPrecedence()

                # If this is a binary operator that binds at least as tightly as
                # the current one, consume it; otherwise we are done.
                if precedence < left_precedence:
                    return left

                binary_operator = self.current.char
                self.Next()  # eat the operator.

                # Parse the primary expression after the binary operator.
                right = self.ParsePrimary()

                # If binary_operator binds less tightly with right than the
                # operator after right, let the pending operator take right as
                # its left.
                next_precedence = self.GetCurrentTokenPrecedence()
                if precedence < next_precedence:
                    right = self.ParseBinOpRHS(right, precedence + 1)

                # Merge left/right.
                left = BinaryOperatorExpressionNode(binary_operator, left, right)

        # expression ::= primary binoprhs
        def ParseExpression(self):
            left = self.ParsePrimary()
            return self.ParseBinOpRHS(left, 0)

        # prototype ::= id '(' id* ')'
        def ParsePrototype(self):
            if not isinstance(self.current, IdentifierToken):
                raise RuntimeError('Expected function name in prototype.')

            function_name = self.current.name
            self.Next()  # eat function name.

            if self.current != CharacterToken('('):
                raise RuntimeError('Expected "(" in prototype.')
            self.Next()  # eat '('.

            arg_names = []
            while isinstance(self.current, IdentifierToken):
                arg_names.append(self.current.name)
                self.Next()
            if self.current != CharacterToken(')'):
                raise RuntimeError('Expected ")" in prototype.')

            # Success.
            self.Next()  # eat ')'.
            return PrototypeNode(function_name, arg_names)

        # definition ::= 'def' prototype expression
        def ParseDefinition(self):
            self.Next()  # eat def.
            proto = self.ParsePrototype()
            body = self.ParseExpression()
            return FunctionNode(proto, body)

        # toplevelexpr ::= expression
        def ParseTopLevelExpr(self):
            proto = PrototypeNode('', [])
            return FunctionNode(proto, self.ParseExpression())

        # external ::= 'extern' prototype
        def ParseExtern(self):
            self.Next()  # eat extern.
            return self.ParsePrototype()

        # Top-Level parsing
        def HandleDefinition(self):
            self.Handle(self.ParseDefinition, 'Parsed a function definition.')

        def HandleExtern(self):
            self.Handle(self.ParseExtern, 'Parsed an extern.')

        def HandleTopLevelExpression(self):
            self.Handle(self.ParseTopLevelExpr, 'Parsed a top-level expression.')

        def Handle(self, function, message):
            try:
                function()
                print message
            except Exception, e:
                print 'Error:', e
                try:
                    self.Next()  # Skip for error recovery.
                except:
                    pass
Main driver code.
-----------------
.. code-block:: python

    def main():
        # Install standard binary operators.
        # 1 is lowest possible precedence. 40 is the highest.
        operator_precedence = {
            '<': 10,
            '+': 20,
            '-': 20,
            '*': 40
        }

        # Run the main "interpreter loop".
        while True:
            print 'ready>',
            try:
                raw = raw_input()
            except KeyboardInterrupt:
                return

            parser = Parser(Tokenize(raw), operator_precedence)
            while True:
                # top ::= definition | external | expression | EOF
                if isinstance(parser.current, EOFToken):
                    break
                if isinstance(parser.current, DefToken):
                    parser.HandleDefinition()
                elif isinstance(parser.current, ExternToken):
                    parser.HandleExtern()
                else:
                    parser.HandleTopLevelExpression()

    if __name__ == '__main__':
        main()

*******************************************
Chapter 3: Code generation to LLVM IR
*******************************************
Written by `Chris Lattner <mailto:sabre@nondot.org>`_ and `Max
Shawabkeh <http://max99x.com>`_
Introduction
============
Welcome to Chapter 3 of the `Implementing a language with
LLVM <http://www.llvm.org/docs/tutorial/index.html>`_ tutorial. This
chapter shows you how to transform the `Abstract Syntax
Tree <PythonLangImpl2.html>`_, built in Chapter 2, into LLVM IR. This
will teach you a little bit about how LLVM does things, as well as
demonstrate how easy it is to use. It's much more work to build a lexer
and parser than it is to generate LLVM IR code. :)
**Please note**: the code in this chapter and later requires llvmpy 0.6
and LLVM 2.7. Earlier versions will most likely not work with it. Also
note that you need to use a version of this tutorial that matches your
llvmpy release: If you are using an official llvmpy release, use the
version of the documentation on the `llvmpy examples
page <http://www.mdevan.org/llvmpy/examples.html>`_.
--------------
Code Generation Setup
=====================
In order to generate LLVM IR, we want some simple setup to get started.
First we define code generation methods in each AST node class:
.. code-block:: python

    # Expression class for numeric literals like "1.0".
    class NumberExpressionNode(ExpressionNode):

        def __init__(self, value):
            self.value = value

        def CodeGen(self):
            ...

    # Expression class for referencing a variable, like "a".
    class VariableExpressionNode(ExpressionNode):

        def __init__(self, name):
            self.name = name

        def CodeGen(self):
            ...

    ...
The ``CodeGen`` method says to emit IR for that AST node along with all
the things it depends on, and they all return an LLVM Value object.
"Value" is the class used to represent a "`Static Single Assignment
(SSA) <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_
register" or "SSA value" in LLVM. The most distinct aspect of SSA values
is that their value is computed as the related instruction executes, and
it does not get a new value until (and if) the instruction re-executes.
In other words, there is no way to "change" an SSA value. For more
information, please read up on `Static Single
Assignment <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_
- the concepts are really quite natural once you grok them.
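The single-assignment property can be modeled in a few lines of ordinary Python. This little renamer is entirely our own illustration (it has nothing to do with LLVM's API): every assignment gets a fresh numbered name, so no name is ever written twice.

```python
def to_ssa(statements):
    # statements: list of (target, expr_tokens) pairs in source order.
    counter = {}   # how many versions of each variable exist so far
    current = {}   # variable -> its latest SSA name
    out = []
    for target, expr in statements:
        # Uses refer to the latest version of each variable.
        expr = [current.get(tok, tok) for tok in expr]
        # Each definition mints a brand-new, write-once name.
        counter[target] = counter.get(target, 0) + 1
        new_name = '%s%d' % (target, counter[target])
        current[target] = new_name
        out.append('%s = %s' % (new_name, ' '.join(expr)))
    return out

# x = 1; x = x + 2; y = x * 3  -->  x1, x2, y1
print(to_ssa([('x', ['1']), ('x', ['x', '+', '2']), ('y', ['x', '*', '3'])]))
```

After renaming, "changing" ``x`` is impossible: the second assignment simply defines a new value ``x2`` computed from ``x1``.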
We will also need to define some global variables which will be used
during code generation:
.. code-block:: python

    # The LLVM module, which holds all the IR code.
    g_llvm_module = Module.new('my cool jit')

    # The LLVM instruction builder. Created whenever a new function is entered.
    g_llvm_builder = None

    # A dictionary that keeps track of which values are defined in the current
    # scope and what their LLVM representation is.
    g_named_values = {}
``g_llvm_module`` is the LLVM construct that contains all of the
functions and global variables in a chunk of code. In many ways, it is
the top-level structure that the LLVM IR uses to contain code.
``g_llvm_builder`` is a helper object that makes it easy to generate
LLVM instructions. Instances of the
`llvm.core.Builder <llvm.core.Builder.html>`_ class keep track of the
current place to insert instructions and have methods to create new
instructions. Note that we do not initialize this variable; instead, it
will be initialized whenever we start generating code for a function.
Finally, ``g_named_values`` is a dictionary that keeps track of which
values are defined in the current scope and what their LLVM
representation is. In other words, it is a symbol table for the code. In
this form of Kaleidoscope, the only things that can be referenced are
function parameters. As such, function parameters will be in this map
when generating code for their function body.
With these basics in place, we can start talking about how to generate
code for each expression. Note that this assumes that ``g_llvm_builder``
has been set up to generate code *into* something. For now, we'll assume
that this has already been done, and we'll just use it to emit code.
--------------
Expression Code Generation
==========================
Generating LLVM code for expression nodes is very straightforward: less
than 35 lines of commented code for all four of our expression nodes.
First we'll do numeric literals:
.. code-block:: python

    def CodeGen(self):
        return Constant.real(Type.double(), self.value)
In llvmpy, floating point numeric constants are represented with the
``llvm.core.ConstantFP`` class. To create one, we can use the static
``real()`` method in the ``llvm.core.Constant`` class. This code
basically just creates and returns a ``ConstantFP``. Note that in the
LLVM IR constants are all uniqued together and shared. For this reason,
we create the constant through a factory method instead of instantiating
one directly.
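The uniquing behavior can be sketched with an interning factory. This is an analogy under stated assumptions, not llvmpy's actual implementation: the ``get_real_constant`` helper and the ``ConstantFP`` stand-in class below are hypothetical names used only to show why a factory that caches structurally equal constants always hands back one shared object:

```python
# A sketch of constant uniquing (an analogy, not llvmpy's implementation):
# the factory interns structurally equal constants so they are shared.
_constant_pool = {}

class ConstantFP(object):  # hypothetical stand-in for an LLVM constant node
    def __init__(self, value):
        self.value = value

def get_real_constant(value):
    key = ('double', value)
    if key not in _constant_pool:
        _constant_pool[key] = ConstantFP(value)
    return _constant_pool[key]

# Two requests for 1.0 yield the very same object, as LLVM guarantees.
assert get_real_constant(1.0) is get_real_constant(1.0)
```

Because the factory owns the cache, callers can compare constants by identity, which is one reason LLVM exposes constants only through factory methods.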
.. code-block:: python

    def CodeGen(self):
        if self.name in g_named_values:
            return g_named_values[self.name]
        else:
            raise RuntimeError('Unknown variable name: ' + self.name)
References to variables are also quite simple using LLVM. In the simple
version of Kaleidoscope, we assume that the variable has already been
emitted somewhere and its value is available. In practice, the only
values that can be in the ``g_named_values`` dictionary are function
arguments. This code simply checks to see that the specified name is in
the map (if not, an unknown variable is being referenced) and returns
the value for it. In future chapters, we'll add support for `loop
induction variables <PythonLangImpl5.html#for>`_ in the symbol table,
and for `local variables <PythonLangImpl7.html#localvars>`_.
.. code-block:: python

    def CodeGen(self):
        left = self.left.CodeGen()
        right = self.right.CodeGen()

        if self.operator == '+':
            return g_llvm_builder.fadd(left, right, 'addtmp')
        elif self.operator == '-':
            return g_llvm_builder.fsub(left, right, 'subtmp')
        elif self.operator == '*':
            return g_llvm_builder.fmul(left, right, 'multmp')
        elif self.operator == '<':
            result = g_llvm_builder.fcmp(FCMP_ULT, left, right, 'cmptmp')
            # Convert bool 0 or 1 to double 0.0 or 1.0.
            return g_llvm_builder.uitofp(result, Type.double(), 'booltmp')
        else:
            raise RuntimeError('Unknown binary operator.')
Binary operators start to get more interesting. The basic idea here is
that we recursively emit code for the left-hand side of the expression,
then the right-hand side, then we compute the result of the binary
expression depending on which operator is being used.
In the example above, the LLVM builder class is starting to show its
value. ``g_llvm_builder`` knows where to insert the newly created
instruction; all you have to do is specify which instruction to create
(e.g. with ``fadd``), which operands to use (``left`` and ``right`` here),
and optionally provide a name for the generated instruction.
One nice thing about LLVM is that the name is just a hint. For instance,
if the code above emits multiple "addtmp" variables, LLVM will
automatically provide each one with an increasing, unique numeric
suffix. Local value names for instructions are purely optional, but they
make the IR dumps much easier to read.
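As a rough illustration of that renaming scheme (a sketch with a hypothetical ``make_uniquifier`` helper, not LLVM's actual code), a uniquifier can hand back the requested name the first time and append increasing suffixes afterwards:

```python
# Sketch of LLVM's local-name disambiguation: a requested name that is
# already taken receives an increasing numeric suffix.
def make_uniquifier():
    counts = {}  # base name -> number of times it has been requested
    def unique(name):
        n = counts.get(name, 0)
        counts[name] = n + 1
        return name if n == 0 else '%s%d' % (name, n)
    return unique

unique = make_uniquifier()
assert [unique('addtmp') for _ in range(3)] == ['addtmp', 'addtmp1', 'addtmp2']
```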
`LLVM instructions <http://www.llvm.org/docs/LangRef.html#instref>`_ are
constrained by strict rules: for example, the left and right operands
of an `add instruction <http://www.llvm.org/docs/LangRef.html#i_add>`_
must have the same type, and the result type of the add must match the
operand types. Because all values in Kaleidoscope are doubles, this
makes for very simple code for add, sub and mul.
On the other hand, LLVM specifies that the `fcmp
instruction <http://www.llvm.org/docs/LangRef.html#i_fcmp>`_ always
returns an 'i1' value (a one bit integer). The problem with this is that
Kaleidoscope wants the value to be a 0.0 or 1.0 value. In order to get
these semantics, we combine the fcmp instruction with a `uitofp
instruction <http://www.llvm.org/docs/LangRef.html#i_uitofp>`_. This
instruction converts its input integer into a floating point value by
treating the input as an unsigned value. In contrast, if we used the
`sitofp instruction <http://www.llvm.org/docs/LangRef.html#i_sitofp>`_,
the Kaleidoscope ``<`` operator would return 0.0 and -1.0, depending on
the input value.
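The difference between the two conversions can be checked with plain Python arithmetic. These helpers are hypothetical stand-ins (not LLVM APIs) that mimic how a 1-bit value would be widened as unsigned (``uitofp``) versus as signed two's complement (``sitofp``):

```python
# Hypothetical helpers mimicking how a 1-bit integer widens to double.
def i1_to_double_unsigned(bit):   # uitofp-style: bit pattern read as unsigned
    return float(bit & 1)         # 0 -> 0.0, 1 -> 1.0

def i1_to_double_signed(bit):     # sitofp-style: 1-bit two's complement
    b = bit & 1
    return -1.0 if b else 0.0     # the single bit is the sign bit: 1 means -1

assert i1_to_double_unsigned(1) == 1.0
assert i1_to_double_signed(1) == -1.0
```

In a 1-bit signed integer the only bit is the sign bit, so a "true" comparison would convert to -1.0 under ``sitofp``, which is why the tutorial uses ``uitofp``.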
.. code-block:: python
def CodeGen(self): # Look up the name in the
global module table. callee =
g_llvm_module.get_function_named(self.callee)
::
# Check for argument mismatch error.
if len(callee.args) != len(self.args):
raise RuntimeError('Incorrect number of arguments passed.')
arg_values = [i.CodeGen() for i in self.args]
return g_llvm_builder.call(callee, arg_values, 'calltmp')
Code generation for function calls is quite straightforward with LLVM.
The code above initially does a function name lookup in the LLVM
Module's symbol table. Recall that the LLVM Module is the container that
holds all of the functions we are JIT'ing. By giving each function the
same name as what the user specifies, we can use the LLVM symbol table
to resolve function names for us.
Once we have the function to call, we codegen each argument that is to
be passed in, and create an LLVM `call
instruction <http://www.llvm.org/docs/LangRef.html#i_call>`_. Note that
LLVM uses the native C calling conventions by default, allowing these
calls to also call into standard library functions like "sin" and "cos",
with no additional effort.
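To see that C-calling-convention externs really do resolve to ordinary libm symbols, you can call ``cos`` through ``ctypes``, independently of LLVM. This is a side demonstration, assuming a platform where ``ctypes.util.find_library`` can locate the math library (with a fallback to the symbols already linked into the running process, which works on glibc systems):

```python
import ctypes
import ctypes.util
import math

# Load the C math library (assumes find_library succeeds, or that libm's
# symbols are reachable through the main process, as on glibc).
libm = ctypes.CDLL(ctypes.util.find_library('m') or None)
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]

# The same cos that a JIT-ed 'extern cos(x)' reaches via the C convention.
assert abs(libm.cos(1.234) - math.cos(1.234)) < 1e-12
```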
This wraps up our handling of the four basic expressions that we have so
far in Kaleidoscope. Feel free to go in and add some more. For example,
by browsing the `LLVM language
reference <http://www.llvm.org/docs/LangRef.html>`_ you'll find several
other interesting instructions that are really easy to plug into our
basic framework.
--------------
Function Code Generation
========================
Code generation for prototypes and functions must handle a number of
details, which make their code less beautiful than expression code
generation, but allows us to illustrate some important points. First,
let's talk about code generation for prototypes: they are used both for
function bodies and external function declarations. The code starts
with:
.. code-block:: python

    def CodeGen(self):
        # Make the function type, e.g. double(double,double).
        funct_type = Type.function(
            Type.double(), [Type.double()] * len(self.args), False)

        function = Function.new(g_llvm_module, funct_type, self.name)
The call to ``Type.function`` creates the ``FunctionType`` that should
be used for a given Prototype. Since all function arguments in
Kaleidoscope are of type double, the first line creates a list of "N"
LLVM double types. It then uses the ``Type.function`` method to create a
function type that takes "N" doubles as arguments, returns one double as
a result, and that is not vararg (the False parameter indicates this).
Note that Types in LLVM are uniqued just like Constants are, so you
don't instantiate them directly.
The final line above actually creates the function that the prototype
will correspond to. This indicates the type and name to use, as well as
which module to insert into. Note that by default, the function will
have `external
linkage <http://www.llvm.org/docs/LangRef.html#linkage>`_, which means
that the function may be defined outside the current module and/or that
it is callable by functions outside the module. The name passed in is
the name the user specified: since ``g_llvm_module`` is specified, this
name is registered in ``g_llvm_module``'s symbol table, which is used by
the function call code above.
.. code-block:: python

    # If the name conflicted, there was already something with the same name.
    # If it has a body, don't allow redefinition or reextern.
    if function.name != self.name:
        function.delete()
        function = g_llvm_module.get_function_named(self.name)
The Module symbol table works just like the Function symbol table when
it comes to name conflicts: if a new function is created with a name
that was previously added to the symbol table, the new one will get
implicitly renamed when added to the Module. The code above exploits
this fact to determine if there was a previous definition of this
function.
In Kaleidoscope, we choose to allow redefinitions of functions in two
cases: first, we want to allow 'extern'ing a function more than once, as
long as the prototypes for the externs match (since all arguments have
the same type, we just have to check that the number of arguments
match). Second, we want to allow 'extern'ing a function and then
defining a body for it. This is useful when defining mutually recursive
functions.
In order to implement this, the code above first checks to see if there
is a collision on the name of the function. If so, it deletes the
function we just created (by calling ``delete``) and then calls
``get_function_named`` to get the existing function with the specified
name.
.. code-block:: python

    # If the function already has a body, reject this.
    if not function.is_declaration:
        raise RuntimeError('Redefinition of function.')

    # If F took a different number of args, reject.
    if len(function.args) != len(self.args):
        raise RuntimeError('Redeclaration of a function with different number '
                           'of args.')
In order to verify the logic above, we first check to see if the
pre-existing function is a forward declaration. Since we don't allow
anything after a full definition of the function, the code rejects this
case. If the previous reference to a function was an 'extern', we simply
verify that the number of arguments for that definition and this one
match up. If not, we emit an error.
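The two acceptance rules can be summarized in a small, hypothetical checker (plain integers and booleans stand in for LLVM ``Function`` objects here; this is a sketch, not the tutorial's code):

```python
# Hypothetical checker summarizing the rules: re-extern is fine if the
# argument counts match; redefining a function that has a body is not.
def check_redeclaration(existing_argc, existing_has_body, new_argc):
    if existing_has_body:
        raise RuntimeError('Redefinition of function.')
    if existing_argc != new_argc:
        raise RuntimeError('Redeclaration of a function with different number '
                           'of args.')

check_redeclaration(2, False, 2)   # extern foo(a b) twice: allowed
```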
.. code-block:: python

    # Set names for all arguments and add them to the variables symbol table.
    for arg, arg_name in zip(function.args, self.args):
        arg.name = arg_name
        # Add arguments to variable symbol table.
        g_named_values[arg_name] = arg

    return function
The last bit of code for prototypes loops over all of the arguments in
the function, setting the name of the LLVM Argument objects to match,
and registering the arguments in the ``g_named_values`` map for future
use by the ``VariableExpressionNode``. Note that we don't check for
conflicting argument names here (e.g. "extern foo(a b a)"). Doing so
would be very straightforward with the mechanics we have already used
above. Once this is all set up, it returns the Function object to the
caller.
.. code-block:: python

    def CodeGen(self):
        # Clear scope.
        g_named_values.clear()

        # Create a function object.
        function = self.prototype.CodeGen()
Code generation for function definitions starts out simply enough: we
just clear out the ``g_named_values`` dictionary to make sure that there
isn't anything in it from the last function we compiled and codegen the
prototype. Code generation of the prototype ensures that there is an
LLVM Function object that is ready to go for us.
.. code-block:: python

    # Create a new basic block to start insertion into.
    block = function.append_basic_block('entry')
    global g_llvm_builder
    g_llvm_builder = Builder.new(block)
Now we get to the point where ``g_llvm_builder`` is set up. The first
line creates a new `basic
block <http://en.wikipedia.org/wiki/Basic_block>`_ (named "entry"),
which is inserted into the function. The second line declares that the
global ``g_llvm_builder`` object is to be changed. The last line creates
a new builder that is set up to insert new instructions into the basic
block we just created. Basic blocks in LLVM are an important part of
functions that define the `Control Flow
Graph <http://en.wikipedia.org/wiki/Control_flow_graph>`_. Since we
don't have any control flow, our functions will only contain one block
at this point. We'll fix this in `Chapter 5 <PythonLangImpl5.html>`_ :).
.. code-block:: python

    # Finish off the function.
    try:
        return_value = self.body.CodeGen()
        g_llvm_builder.ret(return_value)

        # Validate the generated code, checking for consistency.
        function.verify()
Once the insertion point is set up, we call the ``CodeGen`` method for
the root expression of the function. If no error happens, this emits
code to compute the expression into the entry block and returns the
value that was computed. Assuming no error, we then create an LLVM `ret
instruction <http://www.llvm.org/docs/LangRef.html#i_ret>`_, which
completes the function. Once the function is built, we call ``verify``,
which is provided by LLVM. This function does a variety of consistency
checks on the generated code, to determine if our compiler is doing
everything right. Using this is important: it can catch a lot of bugs.
Once the function is finished and validated, we return it.
.. code-block:: python

    except:
        function.delete()
        raise

    return function
The only piece left here is handling of the error case. For simplicity,
we handle this by merely deleting the function we produced with the
``delete`` method. This allows the user to redefine a function that they
incorrectly typed in before: if we didn't delete it, it would live in
the symbol table, with a body, preventing future redefinition.
This code does have a bug, though. Since the ``PrototypeNode::CodeGen``
can return a previously defined forward declaration, our code can
actually delete a forward declaration. There are a number of ways to fix
this bug; see what you can come up with! Here is a testcase:
.. code-block:: python

    extern foo(a b)      # ok, defines foo.
    def foo(a b) c       # error, 'c' is invalid.
    def bar() foo(1, 2)  # error, unknown function "foo"
--------------
Driver Changes and Closing Thoughts
===================================
For now, code generation to LLVM doesn't really get us much, except that
we can look at the pretty IR calls. The sample code inserts calls to
CodeGen into the ``Handle*`` functions, and then dumps out the LLVM IR.
This gives a nice way to look at the LLVM IR for simple functions. For
example:
.. code-block:: bash

    ready> 4+5
    Read a top-level expression:
    define double @0() {
    entry:
        ret double 9.000000e+00
    }
Note how the parser turns the top-level expression into anonymous
functions for us. This will be handy when we add JIT support in the next
chapter. Also note that the code is very literally transcribed, no
optimizations are being performed except simple constant folding done by
the Builder. We will add optimizations explicitly in the next chapter.
.. code-block:: bash

    ready> def foo(a b) a*a + 2*a*b + b*b
    Read a function definition:
    define double @foo(double %a, double %b) {
    entry:
        %multmp = fmul double %a, %a              ; [#uses=1]
        %multmp1 = fmul double 2.000000e+00, %a   ; [#uses=1]
        %multmp2 = fmul double %multmp1, %b       ; [#uses=1]
        %addtmp = fadd double %multmp, %multmp2   ; [#uses=1]
        %multmp3 = fmul double %b, %b             ; [#uses=1]
        %addtmp4 = fadd double %addtmp, %multmp3  ; [#uses=1]
        ret double %addtmp4
    }
This shows some simple arithmetic. Notice the striking similarity to the
LLVM builder calls that we use to create the instructions.
.. code-block:: bash

    ready> def bar(a) foo(a, 4.0) + bar(31337)
    Read a function definition:
    define double @bar(double %a) {
    entry:
        %calltmp = call double @foo(double %a, double 4.000000e+00)  ; [#uses=1]
        %calltmp1 = call double @bar(double 3.133700e+04)            ; [#uses=1]
        %addtmp = fadd double %calltmp, %calltmp1                    ; [#uses=1]
        ret double %addtmp
    }
This shows some function calls. Note that this function will take a long
time to execute if you call it. In the future we'll add conditional
control flow to actually make recursion useful :).
.. code-block:: bash

    ready> extern cos(x)
    Read extern:
    declare double @cos(double)

    ready> cos(1.234)
    Read a top-level expression:
    define double @1() {
    entry:
        %calltmp = call double @cos(double 1.234000e+00)  ; [#uses=1]
        ret double %calltmp
    }
This shows an extern for the libm "cos" function, and a call to it.
.. code-block:: bash

    ready> ^C
    ; ModuleID = 'my cool jit'

    define double @0() {
    entry:
        ret double 9.000000e+00
    }

    define double @foo(double %a, double %b) {
    entry:
        %multmp = fmul double %a, %a              ; [#uses=1]
        %multmp1 = fmul double 2.000000e+00, %a   ; [#uses=1]
        %multmp2 = fmul double %multmp1, %b       ; [#uses=1]
        %addtmp = fadd double %multmp, %multmp2   ; [#uses=1]
        %multmp3 = fmul double %b, %b             ; [#uses=1]
        %addtmp4 = fadd double %addtmp, %multmp3  ; [#uses=1]
        ret double %addtmp4
    }

    define double @bar(double %a) {
    entry:
        %calltmp = call double @foo(double %a, double 4.000000e+00)  ; [#uses=1]
        %calltmp1 = call double @bar(double 3.133700e+04)            ; [#uses=1]
        %addtmp = fadd double %calltmp, %calltmp1                    ; [#uses=1]
        ret double %addtmp
    }

    declare double @cos(double)

    define double @1() {
    entry:
        %calltmp = call double @cos(double 1.234000e+00)  ; [#uses=1]
        ret double %calltmp
    }
When you quit the current demo, it dumps out the IR for the entire
module generated. Here you can see the big picture with all the
functions referencing each other.
This wraps up the third chapter of the Kaleidoscope tutorial. Up next,
we'll describe how to `add JIT codegen and optimizer
support <PythonLangImpl4.html>`_ to this so we can actually start
running code!
--------------
Full Code Listing
=================
Here is the complete code listing for our running example, enhanced with
the LLVM code generator. Because this uses the llvmpy libraries, you
need to `download <../download.html>`_ and
`install <../userguide.html#install>`_ them.
.. code-block:: python

    #!/usr/bin/env python

    import re
    from llvm.core import Module, Constant, Type, Function, Builder, FCMP_ULT
Globals
-------
.. code-block:: python

    # The LLVM module, which holds all the IR code.
    g_llvm_module = Module.new('my cool jit')

    # The LLVM instruction builder. Created whenever a new function is entered.
    g_llvm_builder = None

    # A dictionary that keeps track of which values are defined in the current scope
    # and what their LLVM representation is.
    g_named_values = {}
Lexer
-----
.. code-block:: python

    # The lexer yields one of these types for each token.
    class EOFToken(object):
        pass

    class DefToken(object):
        pass

    class ExternToken(object):
        pass

    class IdentifierToken(object):
        def __init__(self, name):
            self.name = name

    class NumberToken(object):
        def __init__(self, value):
            self.value = value

    class CharacterToken(object):
        def __init__(self, char):
            self.char = char
        def __eq__(self, other):
            return isinstance(other, CharacterToken) and self.char == other.char
        def __ne__(self, other):
            return not self == other

    # Regular expressions that match tokens and comments of our language.
    REGEX_NUMBER = re.compile('[0-9]+(?:\.[0-9]+)?')
    REGEX_IDENTIFIER = re.compile('[a-zA-Z][a-zA-Z0-9]*')
    REGEX_COMMENT = re.compile('#.*')

    def Tokenize(string):
        while string:
            # Skip whitespace.
            if string[0].isspace():
                string = string[1:]
                continue

            # Run regexes.
            comment_match = REGEX_COMMENT.match(string)
            number_match = REGEX_NUMBER.match(string)
            identifier_match = REGEX_IDENTIFIER.match(string)

            # Check if any of the regexes matched and yield the appropriate result.
            if comment_match:
                comment = comment_match.group(0)
                string = string[len(comment):]
            elif number_match:
                number = number_match.group(0)
                yield NumberToken(float(number))
                string = string[len(number):]
            elif identifier_match:
                identifier = identifier_match.group(0)
                # Check if we matched a keyword.
                if identifier == 'def':
                    yield DefToken()
                elif identifier == 'extern':
                    yield ExternToken()
                else:
                    yield IdentifierToken(identifier)
                string = string[len(identifier):]
            else:
                # Yield the ASCII value of the unknown character.
                yield CharacterToken(string[0])
                string = string[1:]

        yield EOFToken()
Abstract Syntax Tree (aka Parse Tree)
-------------------------------------
.. code-block:: python

    # Base class for all expression nodes.
    class ExpressionNode(object):
        pass

    # Expression class for numeric literals like "1.0".
    class NumberExpressionNode(ExpressionNode):

        def __init__(self, value):
            self.value = value

        def CodeGen(self):
            return Constant.real(Type.double(), self.value)

    # Expression class for referencing a variable, like "a".
    class VariableExpressionNode(ExpressionNode):

        def __init__(self, name):
            self.name = name

        def CodeGen(self):
            if self.name in g_named_values:
                return g_named_values[self.name]
            else:
                raise RuntimeError('Unknown variable name: ' + self.name)

    # Expression class for a binary operator.
    class BinaryOperatorExpressionNode(ExpressionNode):

        def __init__(self, operator, left, right):
            self.operator = operator
            self.left = left
            self.right = right

        def CodeGen(self):
            left = self.left.CodeGen()
            right = self.right.CodeGen()

            if self.operator == '+':
                return g_llvm_builder.fadd(left, right, 'addtmp')
            elif self.operator == '-':
                return g_llvm_builder.fsub(left, right, 'subtmp')
            elif self.operator == '*':
                return g_llvm_builder.fmul(left, right, 'multmp')
            elif self.operator == '<':
                result = g_llvm_builder.fcmp(FCMP_ULT, left, right, 'cmptmp')
                # Convert bool 0 or 1 to double 0.0 or 1.0.
                return g_llvm_builder.uitofp(result, Type.double(), 'booltmp')
            else:
                raise RuntimeError('Unknown binary operator.')

    # Expression class for function calls.
    class CallExpressionNode(ExpressionNode):

        def __init__(self, callee, args):
            self.callee = callee
            self.args = args

        def CodeGen(self):
            # Look up the name in the global module table.
            callee = g_llvm_module.get_function_named(self.callee)

            # Check for argument mismatch error.
            if len(callee.args) != len(self.args):
                raise RuntimeError('Incorrect number of arguments passed.')

            arg_values = [i.CodeGen() for i in self.args]

            return g_llvm_builder.call(callee, arg_values, 'calltmp')

    # This class represents the "prototype" for a function, which captures its name,
    # and its argument names (thus implicitly the number of arguments the function
    # takes).
    class PrototypeNode(object):

        def __init__(self, name, args):
            self.name = name
            self.args = args

        def CodeGen(self):
            # Make the function type, e.g. double(double,double).
            funct_type = Type.function(
                Type.double(), [Type.double()] * len(self.args), False)

            function = Function.new(g_llvm_module, funct_type, self.name)

            # If the name conflicted, there was already something with the same name.
            # If it has a body, don't allow redefinition or reextern.
            if function.name != self.name:
                function.delete()
                function = g_llvm_module.get_function_named(self.name)

                # If the function already has a body, reject this.
                if not function.is_declaration:
                    raise RuntimeError('Redefinition of function.')

                # If F took a different number of args, reject.
                if len(function.args) != len(self.args):
                    raise RuntimeError('Redeclaration of a function with different '
                                       'number of args.')

            # Set names for all arguments and add them to the variables symbol table.
            for arg, arg_name in zip(function.args, self.args):
                arg.name = arg_name
                # Add arguments to variable symbol table.
                g_named_values[arg_name] = arg

            return function

    # This class represents a function definition itself.
    class FunctionNode(object):

        def __init__(self, prototype, body):
            self.prototype = prototype
            self.body = body

        def CodeGen(self):
            # Clear scope.
            g_named_values.clear()

            # Create a function object.
            function = self.prototype.CodeGen()

            # Create a new basic block to start insertion into.
            block = function.append_basic_block('entry')
            global g_llvm_builder
            g_llvm_builder = Builder.new(block)

            # Finish off the function.
            try:
                return_value = self.body.CodeGen()
                g_llvm_builder.ret(return_value)

                # Validate the generated code, checking for consistency.
                function.verify()
            except:
                function.delete()
                raise

            return function
Parser
------
.. code-block:: python

    class Parser(object):

        def __init__(self, tokens, binop_precedence):
            self.tokens = tokens
            self.binop_precedence = binop_precedence
            self.Next()

        # Provide a simple token buffer. Parser.current is the current token the
        # parser is looking at. Parser.Next() reads another token from the lexer
        # and updates Parser.current with its results.
        def Next(self):
            self.current = self.tokens.next()

        # Gets the precedence of the current token, or -1 if the token is not a
        # binary operator.
        def GetCurrentTokenPrecedence(self):
            if isinstance(self.current, CharacterToken):
                return self.binop_precedence.get(self.current.char, -1)
            else:
                return -1

        # identifierexpr ::= identifier | identifier '(' expression* ')'
        def ParseIdentifierExpr(self):
            identifier_name = self.current.name
            self.Next()  # eat identifier.

            if self.current != CharacterToken('('):  # Simple variable reference.
                return VariableExpressionNode(identifier_name)

            # Call.
            self.Next()  # eat '('.
            args = []
            if self.current != CharacterToken(')'):
                while True:
                    args.append(self.ParseExpression())
                    if self.current == CharacterToken(')'):
                        break
                    elif self.current != CharacterToken(','):
                        raise RuntimeError('Expected ")" or "," in argument list.')
                    self.Next()

            self.Next()  # eat ')'.
            return CallExpressionNode(identifier_name, args)

        # numberexpr ::= number
        def ParseNumberExpr(self):
            result = NumberExpressionNode(self.current.value)
            self.Next()  # consume the number.
            return result

        # parenexpr ::= '(' expression ')'
        def ParseParenExpr(self):
            self.Next()  # eat '('.

            contents = self.ParseExpression()

            if self.current != CharacterToken(')'):
                raise RuntimeError('Expected ")".')
            self.Next()  # eat ')'.

            return contents

        # primary ::= identifierexpr | numberexpr | parenexpr
        def ParsePrimary(self):
            if isinstance(self.current, IdentifierToken):
                return self.ParseIdentifierExpr()
            elif isinstance(self.current, NumberToken):
                return self.ParseNumberExpr()
            elif self.current == CharacterToken('('):
                return self.ParseParenExpr()
            else:
                raise RuntimeError('Unknown token when expecting an expression.')

        # binoprhs ::= (operator primary)*
        def ParseBinOpRHS(self, left, left_precedence):
            # If this is a binary operator, find its precedence.
            while True:
                precedence = self.GetCurrentTokenPrecedence()

                # If this is a binary operator that binds at least as tightly as
                # the current one, consume it; otherwise we are done.
                if precedence < left_precedence:
                    return left

                binary_operator = self.current.char
                self.Next()  # eat the operator.

                # Parse the primary expression after the binary operator.
                right = self.ParsePrimary()

                # If binary_operator binds less tightly with right than the
                # operator after right, let the pending operator take right as
                # its left.
                next_precedence = self.GetCurrentTokenPrecedence()
                if precedence < next_precedence:
                    right = self.ParseBinOpRHS(right, precedence + 1)

                # Merge left/right.
                left = BinaryOperatorExpressionNode(binary_operator, left, right)

        # expression ::= primary binoprhs
        def ParseExpression(self):
            left = self.ParsePrimary()
            return self.ParseBinOpRHS(left, 0)

        # prototype ::= id '(' id* ')'
        def ParsePrototype(self):
            if not isinstance(self.current, IdentifierToken):
                raise RuntimeError('Expected function name in prototype.')

            function_name = self.current.name
            self.Next()  # eat function name.

            if self.current != CharacterToken('('):
                raise RuntimeError('Expected "(" in prototype.')
            self.Next()  # eat '('.

            arg_names = []
            while isinstance(self.current, IdentifierToken):
                arg_names.append(self.current.name)
                self.Next()

            if self.current != CharacterToken(')'):
                raise RuntimeError('Expected ")" in prototype.')

            # Success.
            self.Next()  # eat ')'.

            return PrototypeNode(function_name, arg_names)

        # definition ::= 'def' prototype expression
        def ParseDefinition(self):
            self.Next()  # eat def.
            proto = self.ParsePrototype()
            body = self.ParseExpression()
            return FunctionNode(proto, body)

        # toplevelexpr ::= expression
        def ParseTopLevelExpr(self):
            proto = PrototypeNode('', [])
            return FunctionNode(proto, self.ParseExpression())

        # external ::= 'extern' prototype
        def ParseExtern(self):
            self.Next()  # eat extern.
            return self.ParsePrototype()

        # Top-Level parsing
        def HandleDefinition(self):
            self.Handle(self.ParseDefinition, 'Read a function definition:')

        def HandleExtern(self):
            self.Handle(self.ParseExtern, 'Read an extern:')

        def HandleTopLevelExpression(self):
            self.Handle(self.ParseTopLevelExpr, 'Read a top-level expression:')

        def Handle(self, function, message):
            try:
                print message, function().CodeGen()
            except Exception, e:
                print 'Error:', e
                try:
                    self.Next()  # Skip for error recovery.
                except:
                    pass
Main driver code.
-----------------
.. code-block:: python

    def main():
        # Install standard binary operators.
        # 1 is lowest possible precedence. 40 is the highest.
        operator_precedence = {
            '<': 10,
            '+': 20,
            '-': 20,
            '*': 40
        }

        # Run the main "interpreter loop".
        while True:
            print 'ready>',
            try:
                raw = raw_input()
            except KeyboardInterrupt:
                break

            parser = Parser(Tokenize(raw), operator_precedence)
            while True:
                # top ::= definition | external | expression | EOF
                if isinstance(parser.current, EOFToken):
                    break
                if isinstance(parser.current, DefToken):
                    parser.HandleDefinition()
                elif isinstance(parser.current, ExternToken):
                    parser.HandleExtern()
                else:
                    parser.HandleTopLevelExpression()

        # Print out all of the generated code.
        print '\n', g_llvm_module

    if __name__ == '__main__':
        main()

*************************************************
Chapter 4: Adding JIT and Optimizer Support
*************************************************
Written by `Chris Lattner <mailto:sabre@nondot.org>`_ and `Max
Shawabkeh <http://max99x.com>`_
Introduction
============
Welcome to Chapter 4 of the `Implementing a language with
LLVM <http://www.llvm.org/docs/tutorial/index.html>`_ tutorial. Chapters
1-3 described the implementation of a simple language and added support
for generating LLVM IR. This chapter describes two new techniques:
adding optimizer support to your language, and adding JIT compiler
support. These additions will demonstrate how to get nice, efficient
code for the Kaleidoscope language.
--------------
Trivial Constant Folding
========================
Our demonstration for Chapter 3 is elegant and easy to extend.
Unfortunately, it does not produce wonderful code. The LLVM Builder,
however, does give us obvious optimizations when compiling simple code:
.. code-block:: bash

    ready> def test(x) 1+2+x
    Read function definition:
    define double @test(double %x) {
    entry:
        %addtmp = fadd double 3.000000e+00, %x
        ret double %addtmp
    }
This code is not a literal transcription of the AST built by parsing the
input. That would be:
.. code-block:: bash

    ready> def test(x) 1+2+x
    Read function definition:
    define double @test(double %x) {
    entry:
        %addtmp = fadd double 2.000000e+00, 1.000000e+00
        %addtmp1 = fadd double %addtmp, %x
        ret double %addtmp1
    }
Constant folding, as seen above, is a very common and very important
optimization: so much so that many language implementors implement
constant folding support directly in their AST representation.
With LLVM, you don't need this support in the AST. Since all calls to
build LLVM IR go through the LLVM IR builder, the builder itself checks
for a constant folding opportunity when you call it. If there is one, it
simply does the constant fold and returns the constant instead of
creating an instruction.
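A minimal sketch of what such a folding builder does, with plain floats standing in for ``ConstantFP`` values and a list standing in for the instruction stream (purely illustrative, not llvmpy's internals; ``build_fadd`` is a hypothetical name):

```python
# Illustrative stand-in for the builder's fold-on-construction behavior:
# floats play the role of constants, strings the role of SSA values.
def build_fadd(emit, left, right):
    if isinstance(left, float) and isinstance(right, float):
        return left + right           # folded: no instruction emitted
    emit(('fadd', left, right))       # otherwise emit a real instruction
    return 'addtmp'

instructions = []
tmp = build_fadd(instructions.append, 1.0, 2.0)      # 1+2 folds to 3.0
result = build_fadd(instructions.append, tmp, 'x')   # 3.0+x must be emitted
assert tmp == 3.0
assert instructions == [('fadd', 3.0, 'x')]
```

This is why ``1+2+x`` above produced a single ``fadd`` of ``3.000000e+00`` and ``%x``: the first add never reached the instruction stream.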
Well, that was easy :). In practice, we recommend always using
``llvm.core.Builder`` when generating code like this. It has no
"syntactic overhead" for its use (you don't have to uglify your compiler
with constant checks everywhere) and it can dramatically reduce the
amount of LLVM IR that is generated in some cases (particularly for
languages with a macro preprocessor or that use a lot of constants).
On the other hand, the ``Builder`` is limited by the fact that it does
all of its analysis inline with the code as it is built. If you take a
slightly more complex example:
.. code-block:: bash

    ready> def test(x) (1+2+x)*(x+(1+2))
    Read a function definition:
    define double @test(double %x) {
    entry:
        %addtmp = fadd double 3.000000e+00, %x   ; [#uses=1]
        %addtmp1 = fadd double %x, 3.000000e+00  ; [#uses=1]
        %multmp = fmul double %addtmp, %addtmp1  ; [#uses=1]
        ret double %multmp
    }
In this case, the LHS and RHS of the multiplication are the same value.
We'd really like to see this generate ``tmp = x+3; result = tmp*tmp;``
instead of computing ``x+3`` twice.
Unfortunately, no amount of local analysis will be able to detect and
correct this. This requires two transformations: reassociation of
expressions (to make the adds lexically identical) and Common
Subexpression Elimination (CSE) to delete the redundant add instruction.
Fortunately, LLVM provides a broad range of optimizations that you can
use, in the form of "passes".
--------------
LLVM Optimization Passes # {#optimizerpasses}
=============================================
LLVM provides many optimization passes, which do many different sorts of
things and have different tradeoffs. Unlike other systems, LLVM doesn't
hold to the mistaken notion that one set of optimizations is right for
all languages and for all situations. LLVM allows a compiler implementor
to make complete decisions about what optimizations to use, in which
order, and in what situation.
As a concrete example, LLVM supports both "whole module" passes, which
look across as large a body of code as they can (often a whole file,
but if run at link time, this can be a substantial portion of the whole
program). It also supports and includes "per-function" passes which just
operate on a single function at a time, without looking at other
functions. For more information on passes and how they are run, see the
`How to Write a Pass <http://www.llvm.org/docs/WritingAnLLVMPass.html>`_
document and the `List of LLVM
Passes <http://www.llvm.org/docs/Passes.html>`_.
For Kaleidoscope, we are currently generating functions on the fly, one
at a time, as the user types them in. We aren't shooting for the
ultimate optimization experience in this setting, but we also want to
catch the easy and quick stuff where possible. As such, we will choose
to run a few per-function optimizations as the user types the function
in. If we wanted to make a "static Kaleidoscope compiler", we would use
exactly the code we have now, except that we would defer running the
optimizer until the entire file has been parsed.
In order to get per-function optimizations going, we need to set up a
`FunctionPassManager <http://www.llvm.org/docs/WritingAnLLVMPass.html#passmanager>`_
to hold and organize the LLVM optimizations that we want to run. Once we
have that, we can add a set of optimizations to run. The code looks like
this:
.. code-block:: python

   # The function optimization passes manager.
   g_llvm_pass_manager = FunctionPassManager.new(g_llvm_module)

   # The LLVM execution engine.
   g_llvm_executor = ExecutionEngine.new(g_llvm_module)

   ...

   def main():
       # Set up the optimizer pipeline. Start with registering info about how the
       # target lays out data structures.
       g_llvm_pass_manager.add(g_llvm_executor.target_data)
       # Do simple "peephole" optimizations and bit-twiddling optzns.
       g_llvm_pass_manager.add(PASS_INSTRUCTION_COMBINING)
       # Reassociate expressions.
       g_llvm_pass_manager.add(PASS_REASSOCIATE)
       # Eliminate Common SubExpressions.
       g_llvm_pass_manager.add(PASS_GVN)
       # Simplify the control flow graph (deleting unreachable blocks, etc).
       g_llvm_pass_manager.add(PASS_CFG_SIMPLIFICATION)
       g_llvm_pass_manager.initialize()
This code defines a ``FunctionPassManager``, ``g_llvm_pass_manager``.
Once it is set up, we use a series of "add" calls to add a bunch of LLVM
passes. The first pass is basically boilerplate; it adds information so
that later optimizations know how the data structures in the program are
laid out. (The ``g_llvm_executor`` variable is related to the JIT, which
we will get to in the next section.) In this case, we choose to add 4
optimization passes. The passes we chose here are a pretty standard set
of "cleanup" optimizations that are useful for a wide variety of code. I
won't delve into what they do but, believe me, they are a good starting
place :).
Once the pass manager is set up, we need to make use of it. We do this
by running it after our newly created function is constructed (in
``FunctionNode.CodeGen``), but before it is returned to the client:
.. code-block:: python

   return_value = self.body.CodeGen()
   g_llvm_builder.ret(return_value)

   # Validate the generated code, checking for consistency.
   function.verify()

   # Optimize the function.
   g_llvm_pass_manager.run(function)
As you can see, this is pretty straightforward. The
``FunctionPassManager`` optimizes and updates the LLVM Function in
place, improving (hopefully) its body. With this in place, we can try
our test above again:
.. code-block:: bash

   ready> def test(x) (1+2+x)*(x+(1+2))
   Read a function definition:
   define double @test(double %x) {
   entry:
     %addtmp = fadd double %x, 3.000000e+00          ; [#uses=2]
     %multmp = fmul double %addtmp, %addtmp          ; [#uses=1]
     ret double %multmp
   }
As expected, we now get our nicely optimized code, saving a floating
point add instruction from every execution of this function.
LLVM provides a wide variety of optimizations that can be used in
certain circumstances. Some `documentation about the various
passes <http://www.llvm.org/docs/Passes.html>`_ is available, but it
isn't very complete. Another good source of ideas can come from looking
at the passes that ``llvm-gcc`` or ``llvm-ld`` run to get started. The
``opt`` tool allows you to experiment with passes from the command line,
so you can see if they do anything.
Now that we have reasonable code coming out of our front-end, let's talk
about executing it!
--------------
Adding a JIT Compiler # {#jit}
==============================
Code that is available in LLVM IR can have a wide variety of tools
applied to it. For example, you can run optimizations on it (as we did
above), you can dump it out in textual or binary forms, you can compile
the code to an assembly file (.s) for some target, or you can JIT
compile it. The nice thing about the LLVM IR representation is that it
is the "common currency" between many different parts of the compiler.
In this section, we'll add JIT compiler support to our interpreter. The
basic idea that we want for Kaleidoscope is to have the user enter
function bodies as they do now, but immediately evaluate the top-level
expressions they type in. For example, if they type in "1 + 2", we
should evaluate and print out 3. If they define a function, they should
be able to call it from the command line.
In order to do this, we first declare and initialize the JIT. This is
done by adding and initializing a global variable:
.. code-block:: python

   # The LLVM execution engine.
   g_llvm_executor = ExecutionEngine.new(g_llvm_module)
This creates an abstract "Execution Engine" which can be either a JIT
compiler or the LLVM interpreter. LLVM will automatically pick a JIT
compiler for you if one is available for your platform, otherwise it
will fall back to the interpreter.
Once the ``ExecutionEngine`` is created, the JIT is ready to be used. We
can use the ``run_function`` method of the execution engine to execute a
compiled function and get its return value. In our case, this means that
we can change the code that parses a top-level expression to look like
this:
.. code-block:: python

   def HandleTopLevelExpression(self):
       try:
           function = self.ParseTopLevelExpr().CodeGen()
           result = g_llvm_executor.run_function(function, [])
           print 'Evaluated to:', result.as_real(Type.double())
       except Exception, e:
           print 'Error:', e
           try:
               self.Next()  # Skip for error recovery.
           except:
               pass
Recall that we compile top-level expressions into a self-contained LLVM
function that takes no arguments and returns the computed double.
With just these two changes, let's see how Kaleidoscope works now!
.. code-block:: bash

   ready> 4+5
   Read a top level expression:
   define double @0() {
   entry:
     ret double 9.000000e+00
   }

   Evaluated to: 9.0
Well, this looks like it is basically working. The dump of the function
shows the "no argument function that always returns double" that we
synthesize for each top-level expression that is typed in. This
demonstrates very basic functionality, but can we do more?
.. code-block:: bash

   ready> def testfunc(x y) x + y*2
   Read a function definition:
   define double @testfunc(double %x, double %y) {
   entry:
     %multmp = fmul double %y, 2.000000e+00          ; [#uses=1]
     %addtmp = fadd double %multmp, %x               ; [#uses=1]
     ret double %addtmp
   }

   ready> testfunc(4, 10)
   Read a top level expression:
   define double @0() {
   entry:
     %calltmp = call double @testfunc(double 4.000000e+00, double 1.000000e+01)  ; [#uses=1]
     ret double %calltmp
   }

   Evaluated to: 24.0
This illustrates that we can now call user code, but there is something
a bit subtle going on here. Note that we only invoke the JIT on the
anonymous functions that *call testfunc*; we never invoke it on
*testfunc* itself. What actually happened here is that the JIT scanned
for all non-JIT'd functions transitively called from the anonymous
function and compiled all of them before returning from
``run_function()``.
The JIT provides a number of other more advanced interfaces for things
like freeing allocated machine code, rejit'ing functions to update them,
etc. However, even with this simple code, we get some surprisingly
powerful capabilities - check this out (I removed the dump of the
anonymous functions, you should get the idea by now :) :
.. code-block:: bash

   ready> extern sin(x)
   Read an extern: declare double @sin(double)

   ready> extern cos(x)
   Read an extern: declare double @cos(double)

   ready> sin(1.0)
   Evaluated to: 0.841470984808

   ready> def foo(x) sin(x)*sin(x) + cos(x)*cos(x)
   Read a function definition:
   define double @foo(double %x) {
   entry:
     %calltmp = call double @sin(double %x)          ; [#uses=1]
     %calltmp1 = call double @sin(double %x)         ; [#uses=1]
     %multmp = fmul double %calltmp, %calltmp1       ; [#uses=1]
     %calltmp2 = call double @cos(double %x)         ; [#uses=1]
     %calltmp3 = call double @cos(double %x)         ; [#uses=1]
     %multmp4 = fmul double %calltmp2, %calltmp3     ; [#uses=1]
     %addtmp = fadd double %multmp, %multmp4         ; [#uses=1]
     ret double %addtmp
   }

   ready> foo(4.0)
   Evaluated to: 1.000000
Whoa, how does the JIT know about sin and cos? The answer is
surprisingly simple: in this example, the JIT started execution of a
function and got to a function call. It realized that the function was
not yet JIT compiled and invoked the standard set of routines to resolve
the function. In this case, there is no body defined for the function,
so the JIT ended up calling ``dlsym("sin")`` on the Python process that
is hosting our Kaleidoscope prompt. Since ``sin`` is defined within the
JIT's address space, it simply patches up calls in the module to call
the libm version of ``sin`` directly.
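You can reproduce this kind of in-process symbol resolution from plain
Python with ``ctypes`` (an analogy for what the JIT's resolver does,
not llvmpy API; the library name lookup is platform-dependent):

.. code-block:: python

   import ctypes
   import ctypes.util
   import math

   # Ask the dynamic loader for libm and resolve `sin`, much like the
   # JIT resolver falls back to dlsym for symbols without IR bodies.
   libm = ctypes.CDLL(ctypes.util.find_library('m') or 'libm.so.6')
   libm.sin.restype = ctypes.c_double
   libm.sin.argtypes = [ctypes.c_double]

   print(libm.sin(1.0))  # the same value math.sin(1.0) returns

Any symbol already visible in the process, whether from libm or from a
library you loaded yourself, can be resolved this way.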
One interesting application of this is that we can now extend the
language by writing arbitrary C code to implement operations. For
example, we can create a C file with the following simple function:
.. code-block:: c

   #include <stdio.h>

   double putchard(double x) {
     putchar((char)x);
     return 0;
   }
We can then compile this into a shared library with GCC:
.. code-block:: bash

   gcc -shared -fPIC -o putchard.so putchard.c
Now we can load this library into the Python process using
``llvm.core.load_library_permanently`` and access it from Kaleidoscope
to produce simple output to the console:
.. code-block:: python

   >>> import llvm.core
   >>> llvm.core.load_library_permanently('/home/max/llvmpy-tutorial/putchard.so')
   >>> import kaleidoscope
   >>> kaleidoscope.main()
   ready> extern putchard(x)
   Read an extern: declare double @putchard(double)

   ready> putchard(65) + putchard(66) + putchard(67) + putchard(10)
   ABC
   Evaluated to: 0.0
Similar code could be used to implement file I/O, console input, and
many other capabilities in Kaleidoscope.
This completes the JIT and optimizer chapter of the Kaleidoscope
tutorial. At this point, we can compile a non-Turing-complete
programming language, optimize and JIT compile it in a user-driven way.
Next up we'll look into `extending the language with control flow
constructs <PythonLangImpl5.html>`_, tackling some interesting LLVM IR
issues along the way.
--------------
Full Code Listing # {#code}
===========================
Here is the complete code listing for our running example, enhanced with
the LLVM JIT and optimizer:
.. code-block:: python

   #!/usr/bin/env python

   import re
   from llvm.core import Module, Constant, Type, Function, Builder, FCMP_ULT
   from llvm.ee import ExecutionEngine, TargetData
   from llvm.passes import FunctionPassManager
   from llvm.passes import (PASS_INSTRUCTION_COMBINING,
                            PASS_REASSOCIATE,
                            PASS_GVN,
                            PASS_CFG_SIMPLIFICATION)
Globals
-------
.. code-block:: python

   # The LLVM module, which holds all the IR code.
   g_llvm_module = Module.new('my cool jit')

   # The LLVM instruction builder. Created whenever a new function is entered.
   g_llvm_builder = None

   # A dictionary that keeps track of which values are defined in the current
   # scope and what their LLVM representation is.
   g_named_values = {}

   # The function optimization passes manager.
   g_llvm_pass_manager = FunctionPassManager.new(g_llvm_module)

   # The LLVM execution engine.
   g_llvm_executor = ExecutionEngine.new(g_llvm_module)
Lexer
-----
.. code-block:: python

   # The lexer yields one of these types for each token.
   class EOFToken(object):
       pass

   class DefToken(object):
       pass

   class ExternToken(object):
       pass

   class IdentifierToken(object):
       def __init__(self, name):
           self.name = name

   class NumberToken(object):
       def __init__(self, value):
           self.value = value

   class CharacterToken(object):
       def __init__(self, char):
           self.char = char
       def __eq__(self, other):
           return isinstance(other, CharacterToken) and self.char == other.char
       def __ne__(self, other):
           return not self == other
.. code-block:: python

   # Regular expressions that match tokens and comments of our language.
   REGEX_NUMBER = re.compile('[0-9]+(?:\.[0-9]+)?')
   REGEX_IDENTIFIER = re.compile('[a-zA-Z][a-zA-Z0-9]*')
   REGEX_COMMENT = re.compile('#.*')

   def Tokenize(string):
       while string:
           # Skip whitespace.
           if string[0].isspace():
               string = string[1:]
               continue
::
# Run regexes.
comment_match = REGEX_COMMENT.match(string)
number_match = REGEX_NUMBER.match(string)
identifier_match = REGEX_IDENTIFIER.match(string)
# Check if any of the regexes matched and yield the appropriate result.
if comment_match:
comment = comment_match.group(0)
string = string[len(comment):]
elif number_match:
number = number_match.group(0)
yield NumberToken(float(number))
string = string[len(number):]
elif identifier_match:
identifier = identifier_match.group(0)
# Check if we matched a keyword.
if identifier == 'def':
yield DefToken()
elif identifier == 'extern':
yield ExternToken()
else:
yield IdentifierToken(identifier)
string = string[len(identifier):]
else:
# Yield the ASCII value of the unknown character.
yield CharacterToken(string[0])
string = string[1:]
yield EOFToken()
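As a quick sanity check of this lexing approach, here is a condensed
standalone variant of the tokenizer, using plain tuples instead of the
token classes above (the tuple tags are invented for this sketch):

.. code-block:: python

   import re

   # One regex with three groups: number, identifier, or any single char.
   TOKEN = re.compile(r'\s*(?:(\d+(?:\.\d+)?)|([a-zA-Z][a-zA-Z0-9]*)|(\S))')

   def tokenize(s):
       for num, ident, char in TOKEN.findall(s):
           if num:
               yield ('num', float(num))
           elif ident:
               # 'def' and 'extern' are keywords; everything else is a name.
               yield ('kw', ident) if ident in ('def', 'extern') else ('id', ident)
           else:
               yield ('char', char)

   print(list(tokenize('def f(x) x+1')))

For ``'def f(x) x+1'`` this yields ``('kw', 'def')``, ``('id', 'f')``,
``('char', '(')`` and so on, mirroring the token stream ``Tokenize``
produces.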
Abstract Syntax Tree (aka Parse Tree)
-------------------------------------
.. code-block:: python

   # Base class for all expression nodes.
   class ExpressionNode(object):
       pass

   # Expression class for numeric literals like "1.0".
   class NumberExpressionNode(ExpressionNode):
       def __init__(self, value):
           self.value = value

       def CodeGen(self):
           return Constant.real(Type.double(), self.value)

   # Expression class for referencing a variable, like "a".
   class VariableExpressionNode(ExpressionNode):
       def __init__(self, name):
           self.name = name

       def CodeGen(self):
           if self.name in g_named_values:
               return g_named_values[self.name]
           else:
               raise RuntimeError('Unknown variable name: ' + self.name)

   # Expression class for a binary operator.
   class BinaryOperatorExpressionNode(ExpressionNode):
       def __init__(self, operator, left, right):
           self.operator = operator
           self.left = left
           self.right = right

       def CodeGen(self):
           left = self.left.CodeGen()
           right = self.right.CodeGen()
::
if self.operator == '+':
return g_llvm_builder.fadd(left, right, 'addtmp')
elif self.operator == '-':
return g_llvm_builder.fsub(left, right, 'subtmp')
elif self.operator == '*':
return g_llvm_builder.fmul(left, right, 'multmp')
elif self.operator == '<':
result = g_llvm_builder.fcmp(FCMP_ULT, left, right, 'cmptmp')
# Convert bool 0 or 1 to double 0.0 or 1.0.
return g_llvm_builder.uitofp(result, Type.double(), 'booltmp')
else:
raise RuntimeError('Unknown binary operator.')
.. code-block:: python

   # Expression class for function calls.
   class CallExpressionNode(ExpressionNode):
       def __init__(self, callee, args):
           self.callee = callee
           self.args = args

       def CodeGen(self):
           # Look up the name in the global module table.
           callee = g_llvm_module.get_function_named(self.callee)
::
# Check for argument mismatch error.
if len(callee.args) != len(self.args):
raise RuntimeError('Incorrect number of arguments passed.')
arg_values = [i.CodeGen() for i in self.args]
return g_llvm_builder.call(callee, arg_values, 'calltmp')
.. code-block:: python

   # This class represents the "prototype" for a function, which captures its
   # name, and its argument names (thus implicitly the number of arguments the
   # function takes).
   class PrototypeNode(object):
       def __init__(self, name, args):
           self.name = name
           self.args = args

       def CodeGen(self):
           # Make the function type, eg. double(double,double).
           funct_type = Type.function(
               Type.double(), [Type.double()] * len(self.args), False)
::
function = Function.new(g_llvm_module, funct_type, self.name)
# If the name conflicted, there was already something with the same name.
# If it has a body, don't allow redefinition or reextern.
if function.name != self.name:
function.delete()
function = g_llvm_module.get_function_named(self.name)
# If the function already has a body, reject this.
if not function.is_declaration:
raise RuntimeError('Redefinition of function.')
# If F took a different number of args, reject.
    if len(function.args) != len(self.args):
raise RuntimeError('Redeclaration of a function with different number '
'of args.')
# Set names for all arguments and add them to the variables symbol table.
for arg, arg_name in zip(function.args, self.args):
arg.name = arg_name
# Add arguments to variable symbol table.
g_named_values[arg_name] = arg
return function
.. code-block:: python

   # This class represents a function definition itself.
   class FunctionNode(object):
       def __init__(self, prototype, body):
           self.prototype = prototype
           self.body = body

       def CodeGen(self):
           # Clear scope.
           g_named_values.clear()
::
# Create a function object.
function = self.prototype.CodeGen()
# Create a new basic block to start insertion into.
block = function.append_basic_block('entry')
global g_llvm_builder
g_llvm_builder = Builder.new(block)
# Finish off the function.
try:
return_value = self.body.CodeGen()
g_llvm_builder.ret(return_value)
# Validate the generated code, checking for consistency.
function.verify()
# Optimize the function.
g_llvm_pass_manager.run(function)
except:
function.delete()
raise
return function
Parser
------
.. code-block:: python

   class Parser(object):
       def __init__(self, tokens, binop_precedence):
           self.tokens = tokens
           self.binop_precedence = binop_precedence
           self.Next()

       # Provide a simple token buffer. Parser.current is the current token the
       # parser is looking at. Parser.Next() reads another token from the lexer
       # and updates Parser.current with its results.
       def Next(self):
           self.current = self.tokens.next()

       # Gets the precedence of the current token, or -1 if the token is not a
       # binary operator.
       def GetCurrentTokenPrecedence(self):
           if isinstance(self.current, CharacterToken):
               return self.binop_precedence.get(self.current.char, -1)
           else:
               return -1

       # identifierexpr ::= identifier | identifier '(' expression* ')'
       def ParseIdentifierExpr(self):
           identifier_name = self.current.name
           self.Next()  # eat identifier.
::
if self.current != CharacterToken('('): # Simple variable reference.
return VariableExpressionNode(identifier_name)
# Call.
self.Next() # eat '('.
args = []
if self.current != CharacterToken(')'):
while True:
args.append(self.ParseExpression())
if self.current == CharacterToken(')'):
break
elif self.current != CharacterToken(','):
raise RuntimeError('Expected ")" or "," in argument list.')
self.Next()
self.Next() # eat ')'.
return CallExpressionNode(identifier_name, args)
.. code-block:: python

   # numberexpr ::= number
   def ParseNumberExpr(self):
       result = NumberExpressionNode(self.current.value)
       self.Next()  # consume the number.
       return result

   # parenexpr ::= '(' expression ')'
   def ParseParenExpr(self):
       self.Next()  # eat '('.
::
contents = self.ParseExpression()
if self.current != CharacterToken(')'):
raise RuntimeError('Expected ")".')
self.Next() # eat ')'.
return contents
.. code-block:: python

   # primary ::= identifierexpr | numberexpr | parenexpr
   def ParsePrimary(self):
       if isinstance(self.current, IdentifierToken):
           return self.ParseIdentifierExpr()
       elif isinstance(self.current, NumberToken):
           return self.ParseNumberExpr()
       elif self.current == CharacterToken('('):
           return self.ParseParenExpr()
       else:
           raise RuntimeError('Unknown token when expecting an expression.')

   # binoprhs ::= (operator primary)*
   def ParseBinOpRHS(self, left, left_precedence):
       # If this is a binary operator, find its precedence.
       while True:
           precedence = self.GetCurrentTokenPrecedence()
::
# If this is a binary operator that binds at least as tightly as the
# current one, consume it; otherwise we are done.
if precedence < left_precedence:
return left
binary_operator = self.current.char
self.Next() # eat the operator.
# Parse the primary expression after the binary operator.
right = self.ParsePrimary()
# If binary_operator binds less tightly with right than the operator after
# right, let the pending operator take right as its left.
next_precedence = self.GetCurrentTokenPrecedence()
if precedence < next_precedence:
right = self.ParseBinOpRHS(right, precedence + 1)
# Merge left/right.
left = BinaryOperatorExpressionNode(binary_operator, left, right)
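The loop above is the classic operator-precedence ("precedence
climbing") algorithm. Stripped of the AST classes, it can be shown on a
flat token list (a simplified sketch: primaries are bare numbers,
operators are single characters, and the helper names are invented
here):

.. code-block:: python

   PRECEDENCE = {'<': 10, '+': 20, '-': 20, '*': 40}

   def parse(tokens):
       # tokens is a list like [1.0, '+', 2.0, '*', 3.0].
       pos = [0]
       def peek():
           return tokens[pos[0]] if pos[0] < len(tokens) else None
       def advance():
           tok = tokens[pos[0]]
           pos[0] += 1
           return tok
       def binop_rhs(left, min_prec):
           while True:
               prec = PRECEDENCE.get(peek(), -1)
               # Operator binds less tightly than required: we are done.
               if prec < min_prec:
                   return left
               op = advance()
               right = advance()  # primary: a bare number in this sketch
               # A tighter-binding operator on the right takes `right` first.
               if prec < PRECEDENCE.get(peek(), -1):
                   right = binop_rhs(right, prec + 1)
               left = (op, left, right)
       return binop_rhs(advance(), 0)

   print(parse([1.0, '+', 2.0, '*', 3.0]))  # ('+', 1.0, ('*', 2.0, 3.0))

Note how `*` captures its operands before the pending `+` does, purely
because of the precedence comparison, with no grammar rewriting needed.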
.. code-block:: python

   # expression ::= primary binoprhs
   def ParseExpression(self):
       left = self.ParsePrimary()
       return self.ParseBinOpRHS(left, 0)

   # prototype ::= id '(' id* ')'
   def ParsePrototype(self):
       if not isinstance(self.current, IdentifierToken):
           raise RuntimeError('Expected function name in prototype.')
::
function_name = self.current.name
self.Next() # eat function name.
if self.current != CharacterToken('('):
raise RuntimeError('Expected "(" in prototype.')
self.Next() # eat '('.
arg_names = []
while isinstance(self.current, IdentifierToken):
arg_names.append(self.current.name)
self.Next()
if self.current != CharacterToken(')'):
raise RuntimeError('Expected ")" in prototype.')
# Success.
self.Next() # eat ')'.
return PrototypeNode(function_name, arg_names)
.. code-block:: python

   # definition ::= 'def' prototype expression
   def ParseDefinition(self):
       self.Next()  # eat def.
       proto = self.ParsePrototype()
       body = self.ParseExpression()
       return FunctionNode(proto, body)

   # toplevelexpr ::= expression
   def ParseTopLevelExpr(self):
       proto = PrototypeNode('', [])
       return FunctionNode(proto, self.ParseExpression())

   # external ::= 'extern' prototype
   def ParseExtern(self):
       self.Next()  # eat extern.
       return self.ParsePrototype()

   # Top-Level parsing
   def HandleDefinition(self):
       self.Handle(self.ParseDefinition, 'Read a function definition:')

   def HandleExtern(self):
       self.Handle(self.ParseExtern, 'Read an extern:')

   def HandleTopLevelExpression(self):
       try:
           function = self.ParseTopLevelExpr().CodeGen()
           result = g_llvm_executor.run_function(function, [])
           print 'Evaluated to:', result.as_real(Type.double())
       except Exception, e:
           print 'Error:', e
           try:
               self.Next()  # Skip for error recovery.
           except:
               pass

   def Handle(self, function, message):
       try:
           print message, function().CodeGen()
       except Exception, e:
           print 'Error:', e
           try:
               self.Next()  # Skip for error recovery.
           except:
               pass
Main driver code.
-----------------
.. code-block:: python

   def main():
       # Set up the optimizer pipeline. Start with registering info about how the
       # target lays out data structures.
       g_llvm_pass_manager.add(g_llvm_executor.target_data)
       # Do simple "peephole" optimizations and bit-twiddling optzns.
       g_llvm_pass_manager.add(PASS_INSTRUCTION_COMBINING)
       # Reassociate expressions.
       g_llvm_pass_manager.add(PASS_REASSOCIATE)
       # Eliminate Common SubExpressions.
       g_llvm_pass_manager.add(PASS_GVN)
       # Simplify the control flow graph (deleting unreachable blocks, etc).
       g_llvm_pass_manager.add(PASS_CFG_SIMPLIFICATION)
       g_llvm_pass_manager.initialize()

       # Install standard binary operators.
       # 1 is lowest possible precedence. 40 is the highest.
       operator_precedence = {
           '<': 10,
           '+': 20,
           '-': 20,
           '*': 40
       }

       # Run the main "interpreter loop".
       while True:
           print 'ready>',
           try:
               raw = raw_input()
           except KeyboardInterrupt:
               break
::
parser = Parser(Tokenize(raw), operator_precedence)
while True:
# top ::= definition | external | expression | EOF
if isinstance(parser.current, EOFToken):
break
if isinstance(parser.current, DefToken):
parser.HandleDefinition()
elif isinstance(parser.current, ExternToken):
parser.HandleExtern()
else:
parser.HandleTopLevelExpression()
.. code-block:: python

       # Print out all of the generated code.
       print '', g_llvm_module

   if __name__ == '__main__':
       main()
Kaleidoscope
--------------
Implementing a Language with LLVM
The LLVM `Kaleidoscope <http://www.llvm.org/docs/tutorial/>`_ tutorial
has been ported to llvmpy by Max Shawabkeh.
.. toctree::
:titlesonly:
:numbered:
PythonLangImpl1.rst
PythonLangImpl2.rst
PythonLangImpl3.rst
PythonLangImpl4.rst
PythonLangImpl5.rst
PythonLangImpl6.rst
PythonLangImpl7.rst
PythonLangImpl8.rst
+-------------------------------+
| layout: page |
+-------------------------------+
| title: Argument (llvm.core) |
+-------------------------------+
The ``args`` property of ``llvm.core.Function`` objects yields
``llvm.core.Argument`` objects. This allows for setting attributes on
function arguments. ``Argument`` objects cannot be constructed from
user code; the only way to get a reference to one is from a
``Function`` object.
The methods ``add_attribute`` and ``remove_attribute`` can be used to add
or remove the following attributes:
=====================  =================================
Value                  Equivalent LLVM Assembly Keyword
=====================  =================================
``ATTR_ZEXT``          ``zeroext``
``ATTR_SEXT``          ``signext``
``ATTR_IN_REG``        ``inreg``
``ATTR_BY_VAL``        ``byval``
``ATTR_STRUCT_RET``    ``sret``
``ATTR_NO_ALIAS``      ``noalias``
``ATTR_NO_CAPTURE``    ``nocapture``
``ATTR_NEST``          ``nest``
=====================  =================================
These methods work exactly like the `corresponding
methods <functions.html#fnattr>`_ of the ``Function`` class. Refer to
the `LLVM docs <http://www.llvm.org/docs/LangRef.html#paramattrs>`_ for
information on what each attribute means.
The alignment of any argument can be set via the ``alignment`` property,
to any power of 2.
llvm.core.Argument
==================
Base Class
----------
- `llvm.core.Value <llvm.core.Value.html>`_
Properties
----------
``alignment``
~~~~~~~~~~~~~
The alignment of the argument. Must be a power of 2.
Methods
-------
``add_attribute(attr)``
~~~~~~~~~~~~~~~~~~~~~~~
Add an attribute ``attr`` to the argument, from the set listed above.
``remove_attribute(attr)``
~~~~~~~~~~~~~~~~~~~~~~~~~~
Remove the attribute ``attr`` of the argument.
+--------------------------------+
| layout: page |
+--------------------------------+
| title: ArrayType (llvm.core) |
+--------------------------------+
llvm.core.ArrayType
===================
Base Class
----------
- `llvm.core.Type <llvm.core.Type.html>`_
Properties
----------
``element``
~~~~~~~~~~~
[read-only]
A ``Type`` object representing the type of the element of the array.
``count``
~~~~~~~~~
[read-only]
The number of elements in the array.
+---------------------------------+
| layout: page |
+---------------------------------+
| title: BasicBlock (llvm.core) |
+---------------------------------+
A basicblock is a list of instructions. A well-formed basicblock should
end with a terminator, which is either a branch instruction or a return
instruction; ``Function.verify()`` will check this. It is not possible
to have instructions after a branch or return instruction.
llvm.core.BasicBlock
====================
Base Class
----------
- `llvm.core.Value <llvm.core.Value.html>`_
Methods
-------
``delete(self)``
~~~~~~~~~~~~~~~~
Delete this basicblock from the function (``self.function``).
``insert_before(self, name)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TODO
Properties
----------
``function``
~~~~~~~~~~~~
The parent function of this basicblock.
``instructions``
~~~~~~~~~~~~~~~~
A list of instructions in this basicblock.
+------------------------------+
| layout: page |
+------------------------------+
| title: Builder (llvm.core) |
+------------------------------+
The ``Builder`` class corresponds to the
`IRBuilder <http://llvm.org/docs/doxygen/html/classllvm_1_1IRBuilder.html>`_
in C++ LLVM. It provides a uniform API for populating
`BasicBlocks <llvm.core.BasicBlock.html>`_. Most of the methods in
``Builder`` correspond to the instructions in the LLVM IR. See the `LLVM
documentation <http://llvm.org/docs/LangRef.html>`_ for details. These
methods take a ``name`` argument for overriding the name of the result
variable. When it is an empty string (the default), LLVM assigns a
numeric ID to the result variable.
llvm.core.Builder
=================
Static Factory Method
---------------------
``new(basic_block)``
~~~~~~~~~~~~~~~~~~~~
Create an instance of ``Builder`` at
`BasicBlock <llvm.core.BasicBlock.html>`_.
Methods
-------
``add(self, lhs, rhs, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that computes ``lhs+rhs`` for integer values only.
``alloca(self, ty, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that allocates stack memory for a value of type
``ty``.
``alloca_array(self, ty, size, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that allocates stack memory for a ``size``
elements array of type ``ty``.
``and_(self, lhs, rhs, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that computes ``lhs & rhs``.
``ashr(self, lhs, rhs, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that computes ``lhs >> rhs`` using arithmetic
shift.
``bitcast(self, value, dest_ty, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that casts ``value`` to type ``dest_ty``.
``branch(self, bblk)``
~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that branches to basicblock ``bblk``.
``call(self, fn, args, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that calls function ``fn`` with an iterable of
arguments ``args``.
``cbranch(self, if_value, then_blk, else_blk)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that branches conditionally based on the predicate
``if_value``. If ``if_value`` is ``True``, branch to ``then_blk``;
otherwise, branch to ``else_blk``.
``extract_element(self, vec_val, idx_val, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that extracts an element from a value ``vec_val``
of `llvm.core.VectorType <llvm.core.VectorType.html>`_ at index
``idx_val``.
``extract_value(self, retval, idx, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that extracts an element from an aggregate value
``retval`` at index ``idx``.
``fadd(self, lhs, rhs, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that computes ``lhs + rhs`` for floating-point
values.
``fcmp(self, rpred, lhs, rhs, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that compares ``lhs`` and ``rhs`` using the
comparison operation defined by ``rpred``. See
`here <comparision.html#fcmp>`_ for a list of comparators.
``fdiv(self, lhs, rhs, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that computes ``lhs / rhs`` for floating-point
values.
``fmul(self, lhs, rhs, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that computes ``lhs * rhs`` for floating-point
values.
``fpext(self, value, dest_ty, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that extends ``value`` to the larger float type
``dest_ty``.
``fptosi(self, value, dest_ty, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that converts a floating-point value ``value`` to
a signed integer type ``dest_ty``.
``fptoui(self, value, dest_ty, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that converts a floating-point value ``value`` to
an unsigned integer type ``dest_ty``.
``fptrunc(self, value, dest_ty, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that truncates a floating-point value ``value`` to
a smaller floating-point type ``dest_ty``.
``free(self, ptr)``
~~~~~~~~~~~~~~~~~~~
Insert an instruction that performs heap deallocation on the pointer
``ptr``.
``frem(self, lhs, rhs, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that computes ``lhs % rhs`` for floating-point
values.
``fsub(self, lhs, rhs, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that computes ``lhs - rhs`` for floating-point
values.
``gep(self, ptr, indices, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
See `GEP <http://llvm.org/docs/LangRef.html#i_getelementptr>`_.
``getresult(self, retval, idx, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
same as ``extract_value``.
``icmp(self, ipred, lhs, rhs, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that compares ``lhs`` and ``rhs`` using the
comparison operation defined by ``ipred``. See
`here <comparision.html#icmp>`_ for a list of comparators.
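The signed and unsigned predicates differ only in how the operand bit
patterns are interpreted. A plain-Python sketch of the distinction (a
model of the semantics, not llvmpy code; a 32-bit width is assumed):

```python
def as_signed(x, bits=32):
    # Reinterpret a bit pattern as a two's-complement signed value.
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x

def icmp_ult(a, b, bits=32):
    # ICMP_ULT: compare the raw (unsigned) bit patterns.
    mask = (1 << bits) - 1
    return (a & mask) < (b & mask)

def icmp_slt(a, b, bits=32):
    # ICMP_SLT: compare after reinterpreting as signed values.
    return as_signed(a, bits) < as_signed(b, bits)

# 0xFFFFFFFF is the largest unsigned value, but -1 when signed:
assert icmp_ult(1, 0xFFFFFFFF)
assert not icmp_slt(1, 0xFFFFFFFF)
```

The same bit pattern thus compares differently under ``ICMP_ULT`` and
``ICMP_SLT``, which is why both families of predicates exist.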
``insert_element(self, vec_val, elt_val, idx_val, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that inserts a value ``elt_val`` into ``vec_val``
of `llvm.core.VectorType <llvm.core.VectorType.html>`_ at index
``idx_val``.
``inttoptr(self, value, dest_ty, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that converts an integer ``value`` to a pointer of
type ``dest_ty``.
``invoke(self, func, args, then_blk, catch_blk, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
See `invoke <http://llvm.org/docs/LangRef.html#i_invoke>`_
``load(self, ptr, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that loads the value from the memory pointed to by
``ptr``.
``lshr(self, lhs, rhs, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that computes ``lhs >> rhs`` using logical shift.
``malloc(self, ty, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that allocates heap memory of type ``ty``. The
instruction returns a pointer that points to a value of type ``ty``.
``malloc_array(self, ty, size, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Similar to ``malloc`` but allocates an array of ``size`` elements.
``mul(self, lhs, rhs, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that computes ``lhs * rhs`` for integer types.
``neg(self, val, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that computes ``0 - val``.
``not_(self, val, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that computes the one's complement of ``val``.
``or_(self, lhs, rhs, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that computes ``lhs | rhs``.
``phi(self, ty, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~
Create a PHI node of type ``ty``.
``position_at_beginning(self, bblk)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Position the builder at the beginning of the given block. The next
instruction inserted will be the first one in the block.
``position_at_end(self, bblk)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Position the builder at the end of the given block. The next instruction
inserted will be the last one in the block.
``position_before(self, instr)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Position the builder before the given instruction. The instruction can
belong to a basic block other than the current one.
``ptrtoint(self, value, dest_ty, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that converts a pointer ``value`` to an integer of
type ``dest_ty``.
``ret(self, value)``
~~~~~~~~~~~~~~~~~~~~
Insert an instruction that returns ``value``.
``ret_many(self, values)``
~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that returns ``values`` which is an iterable of
`llvm.core.Value <llvm.core.Value.html>`_.
``ret_void(self)``
~~~~~~~~~~~~~~~~~~
Insert an instruction that returns nothing (void).
``sdiv(self, lhs, rhs, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that computes ``lhs / rhs`` for signed integers.
``select(self, cond, then_value, else_value, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that computes ``cond ? then_value : else_value``.
``sext(self, value, dest_ty, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that sign extends an integer ``value`` to type
``dest_ty``.
``shl(self, lhs, rhs, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that computes ``lhs << rhs``.
``shuffle_vector(self, vecA, vecB, mask, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that performs a vector shuffle based on the two
vectors ``vecA`` and ``vecB`` and the bit mask ``mask``. The mask must be
a constant.
See the `LLVM document <http://llvm.org/docs/LangRef.html#i_shufflevector>`_
for details.
``sitofp(self, value, dest_ty, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that converts a signed integer ``value`` to a
floating-point type ``dest_ty``.
``srem(self, lhs, rhs, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that computes ``lhs % rhs`` for signed integers.
``store(self, value, ptr)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that stores ``value`` into the memory pointed to by
``ptr``.
``sub(self, lhs, rhs, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that computes ``lhs - rhs``.
``switch(self, value, else_blk, n=10)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that transfers control flow depending on
``value``. ``else_blk`` is the default case. ``n`` sets the expected
number of additional cases.
This method returns an instance of
`SwitchInstruction <llvm.core.Instruction.html#switchinstr>`_ for adding
cases to the switch.
``trunc(self, value, dest_ty, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that truncates an integer ``value`` to the
destination integer type ``dest_ty``.
``udiv(self, lhs, rhs, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that computes ``lhs / rhs`` for unsigned integers.
``uitofp(self, value, dest_ty, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that converts an unsigned integer ``value`` to a
floating-point type ``dest_ty``.
``unreachable(self)``
~~~~~~~~~~~~~~~~~~~~~
Insert an ``unreachable`` instruction, which has no defined semantics. See
the `LLVM document <http://llvm.org/docs/LangRef.html#i_unreachable>`_ for
details.
``urem(self, lhs, rhs, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that computes ``lhs % rhs`` for unsigned integers.
``vaarg(self, list_val, ty, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is used to access variable arguments given as ``list_val`` of type
``ty``. See the `LLVM
document <http://llvm.org/docs/LangRef.html#int_varargs>`_ about
variable-argument intrinsics.
``xor(self, lhs, rhs, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that computes ``lhs xor rhs``.
``zext(self, value, dest_ty, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert an instruction that zero extends ``value`` to type ``dest_ty``.
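The integer resizing casts above (``trunc``, ``zext``, ``sext``) differ
only in how bits are dropped or filled. A plain-Python model of their
bit-level behaviour (an illustration of the semantics, not the llvmpy
API):

```python
def trunc(x, dest_bits):
    # Keep only the low dest_bits bits; high bits are discarded.
    return x & ((1 << dest_bits) - 1)

def zext(x, src_bits, dest_bits):
    # Widen; the new high bits are zeros.
    return x & ((1 << src_bits) - 1)

def sext(x, src_bits, dest_bits):
    # Widen; the new high bits copy the old sign bit.
    x &= (1 << src_bits) - 1
    if x >> (src_bits - 1):  # sign bit of the source width is set
        x |= ((1 << dest_bits) - 1) ^ ((1 << src_bits) - 1)
    return x

assert trunc(0x1FF, 8) == 0xFF
assert zext(0xFF, 8, 32) == 0x000000FF
assert sext(0xFF, 8, 32) == 0xFFFFFFFF  # 0xFF is -1 as an i8
```

Whether ``zext`` or ``sext`` is appropriate depends on whether the source
value is meant to be read as unsigned or signed.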
Properties
----------
``basic_block``
~~~~~~~~~~~~~~~
The `BasicBlock <llvm.core.BasicBlock.html>`_ where the builder is
positioned.
``block``
~~~~~~~~~
Deprecated. Same as ``basic_block``.

+-------------------------------+
| layout: page |
+-------------------------------+
| title: Constant (llvm.core) |
+-------------------------------+
``Constant``-s represents constants that appear within the code. The
values of such objects are known at creation time. Constants can be
created from Python constants. A constant expression is also a constant
-- given a ``Constant`` object, an operation (like addition, subtraction
etc) can be specified, to yield a new ``Constant`` object. Let's see
some examples:
.. code-block:: python

   #!/usr/bin/env python

   ti = Type.int()                     # a 32-bit int type
   k1 = Constant.int(ti, 42)           # "int k1 = 42;"
   k2 = k1.add(Constant.int(ti, 10))   # "int k2 = k1 + 10;"

   tr = Type.float()
   r1 = Constant.real(tr, "3.141592")  # create from a string
   r2 = Constant.real(tr, 1.61803399)  # create from a Python float
llvm.core.Constant
==================
Static factory methods
----------------------
``null(ty)``
~~~~~~~~~~~~
A null value (all zeros) of type ``ty``
``all_ones(ty)``
~~~~~~~~~~~~~~~~
All 1's value of type ``ty``
``undef(ty)``
~~~~~~~~~~~~~
An undefined value of type ``ty``
``int(ty, value)``
~~~~~~~~~~~~~~~~~~
Integer of type ``ty``, with value ``value`` (a Python int or long)
``int_signextend(ty, value)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Integer of type ``ty``, created by sign-extending ``value`` (use for
signed types)
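For a negative Python value, the bits stored in the constant are its
two's-complement pattern at the type's width. A plain-Python illustration
of that encoding (a model of the idea, not llvmpy code):

```python
def to_bit_pattern(value, bits):
    # Two's-complement encoding of a (possibly negative) Python int
    # into an unsigned bits-wide pattern.
    return value & ((1 << bits) - 1)

assert to_bit_pattern(-1, 8) == 0xFF
assert to_bit_pattern(-1, 32) == 0xFFFFFFFF
assert to_bit_pattern(42, 8) == 42
```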
``real(ty, value)``
~~~~~~~~~~~~~~~~~~~
Floating point value of type ``ty``, with value ``value`` (a Python
float)
``stringz(value)``
~~~~~~~~~~~~~~~~~~
A null-terminated string. ``value`` is a Python string
``string(value)``
~~~~~~~~~~~~~~~~~
As ``stringz(value)``, but not null-terminated
``array(ty, consts)``
~~~~~~~~~~~~~~~~~~~~~
Array of type ``ty``, initialized with ``consts`` (an iterable yielding
``Constant`` objects of the appropriate type)
``struct(ty, consts)``
~~~~~~~~~~~~~~~~~~~~~~
Struct (unpacked) of type ``ty``, initialized with ``consts`` (an
iterable yielding ``Constant`` objects of the appropriate type)
``packed_struct(ty, consts)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
As ``struct(ty, consts)`` but packed
``vector(consts)``
~~~~~~~~~~~~~~~~~~
Vector, initialized with ``consts`` (an iterable yielding ``Constant``
objects of the appropriate type)
``sizeof(ty)``
~~~~~~~~~~~~~~
Constant value representing the size of the type ``ty``
Methods
-------
The following operations on constants are supported. For more details on
any operation, consult the `Constant
Expressions <http://www.llvm.org/docs/LangRef.html#constantexprs>`_
section of the LLVM Language Reference.
``k.neg()``
~~~~~~~~~~~
Negation; same as ``0 - k``
``k.not_()``
~~~~~~~~~~~~
1's complement of ``k``. Note trailing underscore.
``k.add(k2)``
~~~~~~~~~~~~~
``k + k2``, where ``k`` and ``k2`` are integers.
``k.fadd(k2)``
~~~~~~~~~~~~~~
``k + k2``, where ``k`` and ``k2`` are floating-point.
``k.sub(k2)``
~~~~~~~~~~~~~
``k - k2``, where ``k`` and ``k2`` are integers.
``k.fsub(k2)``
~~~~~~~~~~~~~~
``k - k2``, where ``k`` and ``k2`` are floating-point.
``k.mul(k2)``
~~~~~~~~~~~~~
``k * k2``, where ``k`` and ``k2`` are integers.
``k.fmul(k2)``
~~~~~~~~~~~~~~
``k * k2``, where ``k`` and ``k2`` are floating-point.
``k.udiv(k2)``
~~~~~~~~~~~~~~
Quotient of unsigned division of ``k`` by ``k2``
``k.sdiv(k2)``
~~~~~~~~~~~~~~
Quotient of signed division of ``k`` by ``k2``
``k.fdiv(k2)``
~~~~~~~~~~~~~~
Quotient of floating point division of ``k`` by ``k2``
``k.urem(k2)``
~~~~~~~~~~~~~~
Remainder of unsigned division of ``k`` by ``k2``
``k.srem(k2)``
~~~~~~~~~~~~~~
Remainder of signed division of ``k`` by ``k2``
``k.frem(k2)``
~~~~~~~~~~~~~~
Remainder of floating point division of ``k`` by ``k2``
``k.and_(k2)``
~~~~~~~~~~~~~~
Bitwise and of ``k`` and ``k2``. Note trailing underscore.
``k.or_(k2)``
~~~~~~~~~~~~~
Bitwise or of ``k`` and ``k2``. Note trailing underscore.
``k.xor(k2)``
~~~~~~~~~~~~~
Bitwise exclusive-or of ``k`` and ``k2``.
``k.icmp(icmp, k2)``
~~~~~~~~~~~~~~~~~~~~
Compare ``k`` with ``k2`` using the predicate ``icmp``. See
`here <comparision.html#icmp>`_ for list of predicates for integer
operands.
``k.fcmp(fcmp, k2)``
~~~~~~~~~~~~~~~~~~~~
Compare ``k`` with ``k2`` using the predicate ``fcmp``. See
`here <comparision.html#fcmp>`_ for list of predicates for real
operands.
``k.shl(k2)``
~~~~~~~~~~~~~
Shift ``k`` left by ``k2`` bits.
``k.lshr(k2)``
~~~~~~~~~~~~~~
Shift ``k`` logically right by ``k2`` bits (new bits are 0s).
``k.ashr(k2)``
~~~~~~~~~~~~~~
Shift ``k`` arithmetically right by ``k2`` bits (new bits are same as
previous sign bit).
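The difference between the two right shifts is what fills the vacated
high bits. A plain-Python model of the semantics just described (an
illustration, not llvmpy code; 32-bit width assumed):

```python
def lshr(x, n, bits=32):
    # Logical shift right: vacated high bits become zeros.
    return (x & ((1 << bits) - 1)) >> n

def ashr(x, n, bits=32):
    # Arithmetic shift right: vacated high bits copy the sign bit.
    mask = (1 << bits) - 1
    x &= mask
    if x >> (bits - 1):  # sign bit set (negative in two's complement)
        return (x >> n) | (mask ^ (mask >> n))
    return x >> n

minus_8 = 0xFFFFFFF8                   # -8 as a 32-bit pattern
assert lshr(minus_8, 1) == 0x7FFFFFFC  # sign bit lost
assert ashr(minus_8, 1) == 0xFFFFFFFC  # still negative (-4)
```

For non-negative values the two shifts agree; they diverge only when the
sign bit is set.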
``k.gep(indices)``
~~~~~~~~~~~~~~~~~~
GEP, see `LLVM docs <http://www.llvm.org/docs/GetElementPtr.html>`_.
``k.trunc(ty)``
~~~~~~~~~~~~~~~
Truncate ``k`` to a type ``ty`` of lower bitwidth.
``k.sext(ty)``
~~~~~~~~~~~~~~
Sign extend ``k`` to a type ``ty`` of higher bitwidth, while extending
the sign bit.
``k.zext(ty)``
~~~~~~~~~~~~~~
Zero extend ``k`` to a type ``ty`` of higher bitwidth; all new bits are
0s.
``k.fptrunc(ty)``
~~~~~~~~~~~~~~~~~
Truncate floating point constant ``k`` to floating point type ``ty`` of
smaller width than ``k``'s.
``k.fpext(ty)``
~~~~~~~~~~~~~~~
Extend floating point constant ``k`` to floating point type ``ty`` of
greater width than ``k``'s.
``k.uitofp(ty)``
~~~~~~~~~~~~~~~~
Convert an unsigned integer constant ``k`` to floating point constant of
type ``ty``.
``k.sitofp(ty)``
~~~~~~~~~~~~~~~~
Convert a signed integer constant ``k`` to floating point constant of
type ``ty``.
``k.fptoui(ty)``
~~~~~~~~~~~~~~~~
Convert a floating point constant ``k`` to an unsigned integer constant
of type ``ty``.
``k.fptosi(ty)``
~~~~~~~~~~~~~~~~
Convert a floating point constant ``k`` to a signed integer constant of
type ``ty``.
``k.ptrtoint(ty)``
~~~~~~~~~~~~~~~~~~
Convert a pointer constant ``k`` to an integer constant of type ``ty``.
``k.inttoptr(ty)``
~~~~~~~~~~~~~~~~~~
Convert an integer constant ``k`` to a pointer constant of type ``ty``.
``k.bitcast(ty)``
~~~~~~~~~~~~~~~~~
Convert ``k`` to an (equal-width) constant of type ``ty``.
``k.select(cond,k2,k3)``
~~~~~~~~~~~~~~~~~~~~~~~~
Evaluates to ``k2`` if the 1-bit integer constant ``cond`` is 1, else to
``k3``.
``k.extract_element(idx)``
~~~~~~~~~~~~~~~~~~~~~~~~~~
Extract value at ``idx`` (integer constant) from a vector constant
``k``.
``k.insert_element(k2,idx)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Insert value ``k2`` (scalar constant) at index ``idx`` (integer
constant) of vector constant ``k``.
``k.shuffle_vector(k2,mask)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Shuffle vector constant ``k`` based on vector constants ``k2`` and
``mask``.
--------------
Other Constant Classes
======================

The following subclasses of ``Constant`` do not provide additional
methods; **they serve only to provide richer type information.**

========================= =============================== ===================================
Subclass                  LLVM C++ Class                  Remarks
========================= =============================== ===================================
``ConstantExpr``          ``llvm::ConstantExpr``          A constant expression
``ConstantAggregateZero`` ``llvm::ConstantAggregateZero`` All-zero constant
``ConstantInt``           ``llvm::ConstantInt``           An integer constant
``ConstantFP``            ``llvm::ConstantFP``            A floating-point constant
``ConstantArray``         ``llvm::ConstantArray``         An array constant
``ConstantStruct``        ``llvm::ConstantStruct``        A structure constant
``ConstantVector``        ``llvm::ConstantVector``        A vector constant
``ConstantPointerNull``   ``llvm::ConstantPointerNull``   All-zero pointer constant
``UndefValue``            ``llvm::UndefValue``            Corresponds to ``undef`` of LLVM IR
========================= =============================== ===================================
These types are helpful in ``isinstance`` checks, like so:
.. code-block:: python

   ti = Type.int(32)
   k1 = Constant.int(ti, 42)          # int32_t k1 = 42;
   k2 = Constant.array(ti, [k1, k1])  # int32_t k2[] = { k1, k1 };

   assert isinstance(k1, ConstantInt)
   assert isinstance(k2, ConstantArray)

+-------------------------------+
| layout: page |
+-------------------------------+
| title: Function (llvm.core) |
+-------------------------------+
llvm.core.Function
==================
Base Class
----------
- `llvm.core.GlobalValue <llvm.core.GlobalValue.html>`_
Static Constructors
-------------------
``new(module_obj, func_ty, name)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Create a function named ``name`` of type ``func_ty`` in the module
``module_obj`` and return a ``Function`` object that represents it.
``get(module_obj, name)``
~~~~~~~~~~~~~~~~~~~~~~~~~
Return a ``Function`` object to represent the function named ``name`` in
the module ``module_obj`` or raise ``LLVMException`` if such a function
does not exist.
``get_or_insert(module_obj, func_ty, name)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Similar to ``get``, except that if the function does not exist it is
added first, as though with ``new``.
``intrinsic(module_obj, intrinsic_id, types)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Create and return a ``Function`` object that refers to an intrinsic
function, as described `here <functions.html#intrinsic>`_.
Properties
----------
``calling_convention``
~~~~~~~~~~~~~~~~~~~~~~
The calling convention for the function, as listed
`here <functions.html#callconv>`_.
``collector``
~~~~~~~~~~~~~
A string holding the name of the garbage collection algorithm. See `LLVM
docs <http://www.llvm.org/docs/LangRef.html#gc>`_.
``does_not_throw``
~~~~~~~~~~~~~~~~~~
Setting to True sets the ``ATTR_NO_UNWIND`` attribute, False removes it.
Shortcut to using ``f.add_attribute(ATTR_NO_UNWIND)`` and
``f.remove_attribute(ATTR_NO_UNWIND)``.
``args``
~~~~~~~~
[read-only]
List of `llvm.core.Argument <llvm.core.Argument.html>`_ objects
representing the formal arguments of the function.
``basic_block_count``
~~~~~~~~~~~~~~~~~~~~~
[read-only]
Number of basic blocks belonging to this function. Same as
``len(f.basic_blocks)`` but faster if you just want the count.
``entry_basic_block``
~~~~~~~~~~~~~~~~~~~~~
[read-only]
The `llvm.core.BasicBlock <llvm.core.BasicBlock.html>`_ object
representing the entry basic block for this function, or ``None`` if
there are no basic blocks.
``basic_blocks``
~~~~~~~~~~~~~~~~
[read-only]
List of `llvm.core.BasicBlock <llvm.core.BasicBlock.html>`_ objects
representing the basic blocks belonging to this function.
``intrinsic_id``
~~~~~~~~~~~~~~~~
[read-only]
Returns the ID of the intrinsic if this object represents an intrinsic
instruction. Otherwise 0.
Methods
-------
``delete()``
~~~~~~~~~~~~
Deletes the function from its module. Do not hold any references to
this object after calling ``delete`` on it.
``append_basic_block(name)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Add a new basic block named ``name``, and return a corresponding
`llvm.core.BasicBlock <llvm.core.BasicBlock.html>`_ object. Note that if
this is not the entry basic block, you'll have to add appropriate branch
instructions from other basic blocks yourself.
``add_attribute(attr)``
~~~~~~~~~~~~~~~~~~~~~~~
Add an attribute ``attr`` to the function, from the set listed above.
``remove_attribute(attr)``
~~~~~~~~~~~~~~~~~~~~~~~~~~
Remove the attribute ``attr`` of the function.
``viewCFG()``
~~~~~~~~~~~~~
Displays the control flow graph using the GraphViz tool.
``viewCFGOnly()``
~~~~~~~~~~~~~~~~~
Displays the control flow graph using the GraphViz tool, but omitting
function bodies.
``verify()``
~~~~~~~~~~~~
Verifies the function. See `LLVM
docs <http://llvm.org/docs/Passes.html#verify>`_.

+-----------------------------------+
| layout: page |
+-----------------------------------+
| title: FunctionType (llvm.core) |
+-----------------------------------+
llvm.core.FunctionType
======================
Base Class
----------
- `llvm.core.Type <llvm.core.Type.html>`_
Properties
----------
``return_type``
~~~~~~~~~~~~~~~
[read-only]
A `Type <llvm.core.Type.html>`_ object, representing the return type of
the function.
``vararg``
~~~~~~~~~~
[read-only]
``True`` if the function is variadic.
``args``
~~~~~~~~
[read-only]
Returns an iterable object that yields `Type <llvm.core.Type.html>`_
objects that represent, in order, the types of the arguments accepted by
the function. Used like this:
.. code-block:: python

   func_type = Type.function(Type.int(), [Type.int(), Type.int()])
   for arg in func_type.args:
       assert arg.kind == TYPE_INTEGER
       assert arg == Type.int()
   assert func_type.arg_count == len(func_type.args)

+----------------------------------+
| layout: page |
+----------------------------------+
| title: GlobalValue (llvm.core) |
+----------------------------------+
The class ``llvm.core.GlobalValue`` represents module-scope aliases,
variables and functions. Global variables are represented by the
sub-class `llvm.core.GlobalVariable <llvm.core.GlobalVariable.html>`_
and functions by `llvm.core.Function <llvm.core.Function.html>`_.
Global values have the read-write properties ``linkage``, ``section``,
``visibility`` and ``alignment``. Use one of the following constants
(from llvm.core) as values for ``linkage`` (see `LLVM
documentaion <http://www.llvm.org/docs/LangRef.html#linkage>`_ for
details on each):
================================ ================================
Value                            Equivalent LLVM Assembly Keyword
================================ ================================
``LINKAGE_EXTERNAL``             ``externally_visible``
``LINKAGE_AVAILABLE_EXTERNALLY`` ``available_externally``
``LINKAGE_LINKONCE_ANY``         ``linkonce``
``LINKAGE_LINKONCE_ODR``         ``linkonce_odr``
``LINKAGE_WEAK_ANY``             ``weak``
``LINKAGE_WEAK_ODR``             ``weak_odr``
``LINKAGE_APPENDING``            ``appending``
``LINKAGE_INTERNAL``             ``internal``
``LINKAGE_PRIVATE``              ``private``
``LINKAGE_DLLIMPORT``            ``dllimport``
``LINKAGE_DLLEXPORT``            ``dllexport``
``LINKAGE_EXTERNAL_WEAK``        ``extern_weak``
``LINKAGE_GHOST``                deprecated -- do not use
``LINKAGE_COMMON``               ``common``
``LINKAGE_LINKER_PRIVATE``       ``linker_private``
================================ ================================
The ``section`` property can be assigned strings (like ``".rodata"``),
which will be used if the target supports it. The ``visibility`` property
can be set to one of these constants (from ``llvm.core``; see also `LLVM
docs <http://www.llvm.org/docs/LangRef.html#visibility>`_):

======================== ================================
Value                    Equivalent LLVM Assembly Keyword
======================== ================================
``VISIBILITY_DEFAULT``   ``default``
``VISIBILITY_HIDDEN``    ``hidden``
``VISIBILITY_PROTECTED`` ``protected``
======================== ================================
The ``alignment`` property can be 0 (the default), or can be set to a
power of 2. The read-only property ``is_declaration`` can be used to check
whether the global is a declaration. The module to which the global
belongs can be retrieved using the ``module`` property (read-only).
llvm.core.GlobalValue
=====================
Base Class
----------
- `llvm.core.Constant <llvm.core.Constant.html>`_
Properties
----------
``linkage``
~~~~~~~~~~~
The linkage type, takes one of the constants listed above (LINKAGE\_\*).
``section``
~~~~~~~~~~~
A string like ".rodata", indicating the section into which the global is
placed into.
``visibility``
~~~~~~~~~~~~~~
The visibility type, takes one of the constants listed above
(VISIBILITY\_\*).
``alignment``
~~~~~~~~~~~~~
A power-of-2 integer indicating the boundary to align to.
``is_declaration``
~~~~~~~~~~~~~~~~~~
[read-only]
``True`` if the global is a declaration, ``False`` otherwise.
``module``
~~~~~~~~~~
[read-only]
The module object to which this global belongs.

+-------------------------------------+
| layout: page |
+-------------------------------------+
| title: GlobalVariable (llvm.core) |
+-------------------------------------+
Global variables (``llvm.core.GlobalVariable``) are subclasses of
`llvm.core.GlobalValue <llvm.core.GlobalValue.html>`_ and represent
module-level variables. These can have optional initializers and can be
marked as constants. Global variables can be created either by using the
``add_global_variable`` method of the `Module <llvm.core.Module.html>`_
class, or by using the static method ``GlobalVariable.new``.
.. code-block:: python

   # create a global variable using the add_global_variable method
   gv1 = module_obj.add_global_variable(Type.int(), "gv1")

   # or equivalently, using a static constructor method
   gv2 = GlobalVariable.new(module_obj, Type.int(), "gv2")
Existing global variables of a module can be accessed by name using
``module_obj.get_global_variable_named(name)`` or
``GlobalVariable.get``. All existing global variables can be enumerated
via iterating over the property ``module_obj.global_variables``.
.. code-block:: python

   # retrieve a reference to the global variable gv1,
   # using the get_global_variable_named method
   gv1 = module_obj.get_global_variable_named("gv1")

   # or equivalently, using the static get method:
   gv2 = GlobalVariable.get(module_obj, "gv2")

   # list all global variables in a module
   for gv in module_obj.global_variables:
       print gv.name, "of type", gv.type

+----------------------------------+
| layout: page |
+----------------------------------+
| title: Instruction (llvm.core) |
+----------------------------------+
An ``llvm.core.Instruction`` object represents an LLVM instruction. This
class is the root of a small hierarchy:
::
Instruction
CallOrInvokeInstruction
PHINode
SwitchInstruction
CompareInstruction
Instructions are not created directly, but via a builder. The builder
both creates instructions and adds them to a basic block at the same
time. One way of getting instruction objects is from basic blocks.
Being derived from `llvm.core.User <llvm.core.User.html>`_, the
instruction is-a user, i.e., an instruction in turn uses other values.
The values an instruction uses are its operands. These may be accessed
using ``operands`` property from the
`llvm.core.User <llvm.core.User.html>`_ base.
The name of the instruction (like ``add``, ``mul`` etc.) can be obtained
via the ``opcode_name`` property. The ``basic_block`` property gives the
basic block to which the instruction belongs. Note that llvmpy does not
allow free-standing instruction objects (i.e., all instructions are
created contained within a basic block).
Classes of instructions can be queried via the properties
``is_terminator``, ``is_binary_op``, ``is_shift`` etc. See below for the
full list.
llvm.core.Instruction
=====================
Base Class
----------
- `llvm.core.User <llvm.core.User.html>`_
Properties
----------
``basic_block``
~~~~~~~~~~~~~~~
[read-only] The basic block to which this instruction belongs.
``is_terminator``
~~~~~~~~~~~~~~~~~
[read-only] True if the instruction is a terminator instruction.
``is_binary_op``
~~~~~~~~~~~~~~~~
[read-only] True if the instruction is a binary operator.
``is_shift``
~~~~~~~~~~~~
[read-only] True if the instruction is a shift instruction.
``is_cast``
~~~~~~~~~~~
[read-only] True if the instruction is a cast instruction.
``is_logical_shift``
~~~~~~~~~~~~~~~~~~~~
[read-only] True if the instruction is a logical shift instruction.
``is_arithmetic_shift``
~~~~~~~~~~~~~~~~~~~~~~~
[read-only] True if the instruction is an arithmetic shift instruction.
``is_associative``
~~~~~~~~~~~~~~~~~~
[read-only] True if the instruction is associative.
``is_commutative``
~~~~~~~~~~~~~~~~~~
[read-only] True if the instruction is commutative.
``is_volatile``
~~~~~~~~~~~~~~~
[read-only] True if the instruction is a volatile load or store.
``opcode``
~~~~~~~~~~
[read-only] The numeric opcode value of the instruction. Do not rely on
the absolute value of this number; it may change between LLVM versions.
``opcode_name``
~~~~~~~~~~~~~~~
[read-only] The name of the instruction, like ``add``, ``sub`` etc.
--------------
llvm.core.CallOrInvokeInstruction
=================================
The ``llvm.core.CallOrInvokeInstruction`` is a subclass of
``llvm.core.Instruction``, and represents either a ``call`` or an
``invoke`` instruction.
Base Class
----------
- ``llvm.core.Instruction``
Properties
----------
``calling_convention``
~~~~~~~~~~~~~~~~~~~~~~
Get or set the calling convention. See
`here <functions.html#callconv>`_ for possible values.
Methods
-------
``add_parameter_attribute(idx, attr)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Add an attribute ``attr`` to the ``idx``-th argument. See
`here <llvm.core.Argument.html>`_ for possible values of ``attr``.
``remove_parameter_attribute(idx, attr)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Remove an attribute ``attr`` from the ``idx``-th argument. See
`here <llvm.core.Argument.html>`_ for possible values of ``attr``.
``set_parameter_alignment(idx, align)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Set the alignment of the ``idx``-th argument to ``align``. ``align``
should be a power of two.
--------------
llvm.core.PHINode
=================
The ``llvm.core.PHINode`` is a subclass of ``llvm.core.Instruction``,
and represents the ``phi`` instruction. When created (using
``Builder.phi``) the phi node contains no incoming blocks (nor their
corresponding values). To add an incoming arc to the phi node, use the
``add_incoming`` method, which takes a source block
(`llvm.core.BasicBlock <llvm.core.BasicBlock.html>`_ object) and a value
(object of `llvm.core.Value <llvm.core.Value.html>`_ or of a class
derived from it) that the phi node will take on if control branches in
from that block.
Base Class
----------
- ``llvm.core.Instruction``
Properties
----------
``incoming_count``
~~~~~~~~~~~~~~~~~~
[read-only] The number of incoming arcs for this phi node.
Methods
-------
``add_incoming(value, block)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Add an incoming arc, from the
`llvm.core.BasicBlock <llvm.core.BasicBlock.html>`_ object ``block``,
with the corresponding value ``value``. ``value`` should be an object of
`llvm.core.Value <llvm.core.Value.html>`_ (or of a descendent class).
``get_incoming_value(idx)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Returns the ``idx``-th incoming arc's value.
``get_incoming_block(idx)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Returns the ``idx``-th incoming arc's block.
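The behaviour of a phi node can be modelled in plain Python as a lookup
from predecessor block to value (this toy class illustrates the
semantics; it is not one of the llvmpy classes):

```python
class PhiModel:
    """Toy model of a phi node: yields the value associated with
    whichever predecessor block control flow arrived from."""

    def __init__(self):
        self.incoming = []  # list of (value, block) pairs

    def add_incoming(self, value, block):
        self.incoming.append((value, block))

    def value_for(self, pred_block):
        for value, block in self.incoming:
            if block == pred_block:
                return value
        raise KeyError(pred_block)

phi = PhiModel()
phi.add_incoming(1, "then_blk")
phi.add_incoming(2, "else_blk")
assert phi.value_for("then_blk") == 1
assert phi.value_for("else_blk") == 2
```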
llvm.core.SwitchInstruction # {#switchinstr}
============================================
(TODO describe)
Base Class
----------
- ``llvm.core.Instruction``
Methods
-------
``add_case(const, block)``
~~~~~~~~~~~~~~~~~~~~~~~~~~
Add another case to the switch statement. When the expression being
evaluated equals ``const``, then control branches to ``block``. Here
``const`` must be of type
`llvm.core.ConstantInt <llvm.core.Constant.html>`_.
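The dispatch behaviour being built up is essentially a constant-to-block
mapping with a default. A plain-Python sketch of the semantics (an
illustration only, not llvmpy code):

```python
def run_switch(value, cases, else_blk):
    # cases maps a constant to its target block; values with no
    # matching case fall through to the default block.
    return cases.get(value, else_blk)

cases = {0: "zero_blk", 1: "one_blk"}
assert run_switch(1, cases, "default_blk") == "one_blk"
assert run_switch(7, cases, "default_blk") == "default_blk"
```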
--------------
llvm.core.CompareInstruction
============================
(TODO describe)
Base Class
----------
- ``llvm.core.Instruction``
Properties
----------
``predicate``
~~~~~~~~~~~~~
[read-only]
The predicate of the compare instruction, one of the ``ICMP_*`` or
``FCMP_*`` constants.

+----------------------------------+
| layout: page |
+----------------------------------+
| title: IntegerType (llvm.core) |
+----------------------------------+
llvm.core.IntegerType
=====================
Base Class
----------
- `llvm.core.Type <llvm.core.Type.html>`_
Properties
----------
``width``
~~~~~~~~~
[read-only]
The width of the integer type, in number of bits.

+-----------------------------+
| layout: page |
+-----------------------------+
| title: Module (llvm.core) |
+-----------------------------+
Modules are top-level container objects. You need to create a module
object first, before you can add global variables, aliases or functions.
Modules are created using the static method ``Module.new``:
.. code-block:: python

   #!/usr/bin/env python

   from llvm import *
   from llvm.core import *

   # create a module
   my_module = Module.new('my_module')

+----------------------------------+
| layout: page |
+----------------------------------+
| title: PointerType (llvm.core) |
+----------------------------------+
llvm.core.PointerType
=====================
Base Class
----------
- `llvm.core.Type <llvm.core.Type.html>`_
Properties
----------
``address_space``
~~~~~~~~~~~~~~~~~
[read-only]
The address space of the pointer.
``pointee``
~~~~~~~~~~~
[read-only]
A `Type <llvm.core.Type.html>`_ object representing the type of the
value pointed to.

+---------------------------+
| layout: page |
+---------------------------+
| title: Type (llvm.core) |
+---------------------------+
llvm.core.Type
==============
Static Constructors
-------------------
``int(n)``
~~~~~~~~~~
Create an integer type of bit width ``n``.
``float()``
~~~~~~~~~~~
Create a 32-bit floating point type.
``double()``
~~~~~~~~~~~~
Create a 64-bit floating point type.
``x86_fp80()``
~~~~~~~~~~~~~~
Create an 80-bit 80x87-style floating point type.
``fp128()``
~~~~~~~~~~~
Create a 128-bit floating point type (112-bit mantissa).
``ppc_fp128()``
~~~~~~~~~~~~~~~
Create a 128-bit float (two 64-bits).
``function(ret, params, vararg=False)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Create a function type, having the return type ``ret`` (must be a
``Type``), accepting the parameters ``params``, where ``params`` is an
iterable that yields ``Type`` objects representing the type of each
function argument, in order. If ``vararg`` is ``True``, the function is
variadic.
``struct(eltys, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~
Create an unpacked structure. ``eltys`` is an iterable that yields
``Type`` objects representing the type of each element, in order.
If ``name`` evaluates to ``True`` (i.e. is not empty), create an
*identified structure*; otherwise, create a *literal structure*.
``packed_struct(eltys, name='')``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Like ``struct(eltys)``, but creates a packed struct.
``array(elty, count)``
~~~~~~~~~~~~~~~~~~~~~~
Creates an array type, holding ``count`` elements, each of type ``elty``
(which should be a ``Type``).
``pointer(pty, addrspc=0)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Create a pointer to type ``pty`` (which should be a ``Type``).
``addrspc`` is an integer that represents the address space of the
pointer (see LLVM docs or ask on llvm-dev for more info).
``void()``
~~~~~~~~~~
Creates a void type. Used for function return types.
``label()``
~~~~~~~~~~~
Creates a label type.
``opaque(name)``
~~~~~~~~~~~~~~~~
Opaque `StructType <llvm.core.StructType.html>`_, used for creating
self-referencing types.
Properties
----------
``kind``
~~~~~~~~
[read-only]
A value (enum) representing the "type" of the object. It will be one of
the following constants defined in ``llvm.core``:
.. code-block:: python

   # Warning: do not rely on actual numerical values!
   TYPE_VOID       = 0
   TYPE_FLOAT      = 1
   TYPE_DOUBLE     = 2
   TYPE_X86_FP80   = 3
   TYPE_FP128      = 4
   TYPE_PPC_FP128  = 5
   TYPE_LABEL      = 6
   TYPE_INTEGER    = 7
   TYPE_FUNCTION   = 8
   TYPE_STRUCT     = 9
   TYPE_ARRAY      = 10
   TYPE_POINTER    = 11
   TYPE_OPAQUE     = 12
   TYPE_VECTOR     = 13
   TYPE_METADATA   = 14
   TYPE_UNION      = 15
Example:
^^^^^^^^
.. code-block:: python

   assert Type.int().kind == TYPE_INTEGER
   assert Type.void().kind == TYPE_VOID
@ -1,39 +0,0 @@
+---------------------------+
| layout: page |
+---------------------------+
| title: User (llvm.core) |
+---------------------------+
``User``-s are values that refer to other values. The values so referred
to can be retrieved via the properties of ``User``. This is the reverse
of ``Value.uses``. Together these can be used to traverse the use-def
chains of the SSA form.
--------------
llvm.core.User # {#user}
========================
Base Class
----------
- `llvm.core.Value <llvm.core.Value.html>`_
Properties
----------
``operands``
~~~~~~~~~~~~
[read-only]
The list of operands (values, of type
`llvm.core.Value <llvm.core.Value.html>`_) that this value refers to.
``operand_count``
~~~~~~~~~~~~~~~~~
[read-only]
The number of operands that this value refers to. Same as
``len(val.operands)`` but faster if you just want the count.
@ -1,65 +0,0 @@
+----------------------------+
| layout: page |
+----------------------------+
| title: Value (llvm.core) |
+----------------------------+
llvm.core.Value
===============
Properties
----------
``name``
~~~~~~~~
The name of the value.
``type``
~~~~~~~~
[read-only]
An ``llvm.core.Type`` object representing the type of the value.
``uses``
~~~~~~~~
[read-only]
The list of values (``llvm.core.Value``) that use this value.
``use_count``
~~~~~~~~~~~~~
[read-only]
The number of values that use (refer) this value. Same as
``len(val.uses)`` but faster if you just want the count.
``value_id``
~~~~~~~~~~~~
[read-only]
Returns ``llvm::Value::getValueID()``. Refer to the LLVM documentation
for more info.
Special Methods
---------------
``__str__``
~~~~~~~~~~~
``Value`` objects can be stringified into their LLVM assembly language
representation.
``__eq__``
~~~~~~~~~~
``Value`` objects can be compared for equality. Internally, this
converts both arguments into their LLVM assembly representations and
compares the resultant strings.
@ -1,31 +0,0 @@
+---------------------------------+
| layout: page |
+---------------------------------+
| title: VectorType (llvm.core) |
+---------------------------------+
llvm.core.VectorType
====================
Base Class
----------
- `llvm.core.Type <llvm.core.Type.html>`_
Properties
----------
``element``
~~~~~~~~~~~
[read-only]
A `Type <llvm.core.Type.html>`_ object representing the type of the
element of the vector.
``count``
~~~~~~~~~
[read-only]
The number of elements in the vector.
@ -1,56 +0,0 @@
+----------------------------------+
| layout: page |
+----------------------------------+
| title: EngineBuilder (llvm.ee) |
+----------------------------------+
llvm.ee.EngineBuilder
=====================
A convenient class for building
`llvm.ee.ExecutionEngine <llvm.ee.ExecutionEngine.html>`_. Each
``EngineBuilder`` instance can only create one ``ExecutionEngine``.
Methods
-------
``create(self)``
~~~~~~~~~~~~~~~~
Create and return a new
`ExecutionEngine <llvm.ee.ExecutionEngine.html>`_ instance.
Raises ``llvm.LLVMException`` if the builder cannot create an
``ExecutionEngine`` based on the given configuration.
``force_interpreter(self)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Force the output ``ExecutionEngine`` to be an LLVM IR interpreter.
``force_jit(self)``
~~~~~~~~~~~~~~~~~~~
Force the output ``ExecutionEngine`` to be a JIT engine.
``opt(self, level)``
~~~~~~~~~~~~~~~~~~~~
Set the code generation optimization level for a JIT engine. Valid
values of ``level`` are 0-3, inclusive. The default setting is 2. To use
vector instructions, such as SSE on Intel processors, ``level`` must be
3 (aggressive).
Static Factory Methods
----------------------
``new(module)``
~~~~~~~~~~~~~~~
Create a new ``EngineBuilder``. ``module`` must be a
`llvm.core.Module <llvm.core.Module.html>`_ instance. Its ownership is
transferred to the resulting
`ExecutionEngine <llvm.ee.ExecutionEngine.html>`_. Therefore, it is
impossible to create more than one ``ExecutionEngine`` with a single
``EngineBuilder``.
@ -1,62 +0,0 @@
+------------------------------------+
| layout: page |
+------------------------------------+
| title: ExecutionEngine (llvm.ee) |
+------------------------------------+
llvm.ee.ExecutionEngine
=======================
Methods
-------
``add_module(self, module)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Add a new module to the ``ExecutionEngine``. Ownership of ``module`` is
transferred. When the ``ExecutionEngine`` is destroyed, the module is
destroyed with it.
``free_machine_code_for(self, fn)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Release memory used for the machine code generated for the function
``fn``.
``get_pointer_to_function(self, fn)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Obtain the pointer to the function ``fn``. This forces the
ExecutionEngine to generate the machine code in lazy mode.
If ``fn`` is not defined, the ``ExecutionEngine`` will look up the
symbol through ``dlsym``.
The returned function pointer can be wrapped as a ``ctypes`` function.
``remove_module(self, module)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Remove the ``module``.
``run_function(self, fn, args)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Execute the function ``fn`` with an iterable of ``GenericValue``
arguments ``args``. This method returns whatever ``fn`` returns, as a
``GenericValue``.
``run_static_ctors(self)``
~~~~~~~~~~~~~~~~~~~~~~~~~~
``run_static_dtors(self)``
~~~~~~~~~~~~~~~~~~~~~~~~~~
Properties
----------
``target_data``
~~~~~~~~~~~~~~~
Access the `TargetData <llvm.ee.TargetData.html>`_ instance associated
with the ``ExecutionEngine``.
@ -1,64 +0,0 @@
+---------------------------------+
| layout: page |
+---------------------------------+
| title: GenericValue (llvm.ee) |
+---------------------------------+
llvm.ee.GenericValue
====================
Methods
-------
``as_int(self)``
~~~~~~~~~~~~~~~~
Return the value of this ``GenericValue`` instance as an unsigned
integer.
``as_int_signed(self)``
~~~~~~~~~~~~~~~~~~~~~~~
Return the value of this ``GenericValue`` instance as a signed integer.
``as_pointer(self)``
~~~~~~~~~~~~~~~~~~~~
Return the value of this ``GenericValue`` instance as a pointer. The
type of the return value is ``int``.
``as_real(self, ty)``
~~~~~~~~~~~~~~~~~~~~~
Return the value of this ``GenericValue`` instance as a real number
whose type is specified by ``ty``. ``ty`` must be a
`Type <llvm.core.Type.html>`_ instance of a real number type.
Static Factory Methods
----------------------
``int(ty, intval)``
~~~~~~~~~~~~~~~~~~~
Create a ``GenericValue`` instance from an ``int`` value, which is
zero-extended if necessary. The type of the value is specified by
``ty``, which is a `Type <llvm.core.Type.html>`_ instance.
``int_signed(ty, intval)``
~~~~~~~~~~~~~~~~~~~~~~~~~~
Create a ``GenericValue`` instance from an ``int`` value, which is
sign-extended if necessary. The type of the value is specified by
``ty``, which is a `Type <llvm.core.Type.html>`_ instance.
``pointer(ty, addr)`` or ``pointer(addr)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Create a ``GenericValue`` instance from an ``int`` value representing a
pointer.
The two argument version is **deprecated**. The old code never used
``ty`` anyway.
@ -1,70 +0,0 @@
+-------------------------------+
| layout: page |
+-------------------------------+
| title: TargetData (llvm.ee) |
+-------------------------------+
llvm.ee.TargetData
==================
Methods
-------
``abi_alignment(self, ty)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Returns the minimum ABI-required alignment for the specified type
``ty``.
``abi_size(self, ty)``
~~~~~~~~~~~~~~~~~~~~~~
``callframe_alignment(self, ty)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Returns the minimum ABI-required alignment for the specified type ``ty``
when it is part of a call frame.
``element_at_offset(self, ty, ofs)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``offset_of_element(self, ty, el)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``preferred_alignment(self, ty_or_gv)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``size(self, ty)``
~~~~~~~~~~~~~~~~~~
``store_size(self, ty)``
~~~~~~~~~~~~~~~~~~~~~~~~
``__str__(self)``
~~~~~~~~~~~~~~~~~
Returns the string representation.
Static Factory Methods
----------------------
``new(strrep)``
~~~~~~~~~~~~~~~
Construct a new ``TargetData`` instance from the string representation
Properties
----------
``byte_order``
~~~~~~~~~~~~~~
``pointer_size``
~~~~~~~~~~~~~~~~
``target_integer_type``
~~~~~~~~~~~~~~~~~~~~~~~
@ -1,42 +0,0 @@
+--------------------------------------------+
| layout: page |
+--------------------------------------------+
| title: FunctionPassManager (llvm.passes) |
+--------------------------------------------+
llvm.passes.FunctionPassManager
===============================
Base Classes
------------
- `llvm.passes.PassManager <llvm.passes.PassManager.html>`_
Methods
-------
``finalize(self)``
~~~~~~~~~~~~~~~~~~
Finalizes all associated function passes in the LLVM system.
Beware that this destroys all associated passes even if another pass
manager is using those passes. This may result in a segfault.
``initialize(self)``
~~~~~~~~~~~~~~~~~~~~
Initializes all associated function passes in the LLVM system.
``run(self, fn)``
~~~~~~~~~~~~~~~~~
Run all passes on the given function ``fn``.
Static Factory Methods
----------------------
``new(module)``
~~~~~~~~~~~~~~~
Create a ``FunctionPassManager`` instance for a given ``module``.
@ -1,29 +0,0 @@
+------------------------------------+
| layout: page |
+------------------------------------+
| title: PassManager (llvm.passes) |
+------------------------------------+
llvm.passes.PassManager
=======================
Methods
-------
``add(self, tgt_data_or_pass_id)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Add a pass by its ID. Pass IDs are defined as the ``PASS_*`` constants.
``run(self, module)``
~~~~~~~~~~~~~~~~~~~~~
Run all passes on the given ``module``.
Static Factory Methods
----------------------
``new()``
~~~~~~~~~
Creates a new ``PassManager`` instance.
@ -1,72 +0,0 @@
+-------------------------------------------+
| layout: page |
+-------------------------------------------+
| title: PassManagerBuilder (llvm.passes) |
+-------------------------------------------+
llvm.passes.PassManagerBuilder
==============================
Provides a simple API to populate pass managers for languages like
C/C++. Refer to the `LLVM API
Documentation <http://llvm.org/docs/doxygen/html/classllvm_1_1PassManagerBuilder.html>`_
for details.
Methods
-------
``populate(self, pm)``
~~~~~~~~~~~~~~~~~~~~~~
Populate a `FunctionPassManager <llvm.passes.FunctionPassManager.html>`_
or `PassManager <llvm.passes.PassManager.html>`_ given as ``pm``.
``use_inliner_with_threshold(self, threshold)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Use an inliner pass with the given ``threshold``.
Properties
----------
The following properties can be overridden to customize how pass
managers are populated.
``disable_simplify_lib_calls``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Boolean. Default is ``False``.
``disable_unit_at_a_time``
~~~~~~~~~~~~~~~~~~~~~~~~~~
Boolean. Default is ``False``.
``disable_unroll_loops``
~~~~~~~~~~~~~~~~~~~~~~~~
Boolean. Default is ``False``.
``opt_level``
~~~~~~~~~~~~~
Default is ``2``. Valid values are 0-3. Corresponds to O0, O1, O2, O3 as
in C/C++ optimization options.
``size_level``
~~~~~~~~~~~~~~
Default is ``0``.
``vectorize``
~~~~~~~~~~~~~
Default is ``False``.
Static Factory Methods
----------------------
``new()``
~~~~~~~~~
Creates a new ``PassManagerBuilder`` instance.
@ -1,236 +0,0 @@
********************
LLVM Concepts
********************
This section explains a few concepts related to LLVM, not specific to
llvmpy.
.. toctree::
:hidden:
Intermediate Representation
===========================
The intermediate representation, or IR for short, is an in-memory data
structure that represents executable code. The IR data structures allow
for creation of types, constants, functions, function arguments,
instructions, global variables and so on. For example, to create a
function *sum* that takes two integers and returns their sum, we need to
follow these steps:
- create an integer type *ti* of required bitwidth
- create a function type *tf* which takes two *ti* -s and returns
another *ti*
- create a function of type *tf* named *sum*
- add a *basic block* to the function
- using a helper object called an *instruction builder*, add two
  instructions into the basic block: an instruction to add the two
  arguments and store the result into a temporary variable, and a
  return instruction to return the value of the temporary variable
(A basic block is a block of instructions.)
LLVM has its own instruction set; the instructions used above (*add*
and *ret*) are from this set. The LLVM instructions are at a higher
level than the usual assembly language; for example there are
instructions related to variable argument handling, exception handling,
and garbage collection. These allow high-level languages to be
represented cleanly in the IR.
SSA Form and PHI Nodes
======================
All LLVM instructions are represented in the *Static Single Assignment*
(SSA) form. Essentially, this means that any variable can be assigned to
only once. Such a representation facilitates better optimization, among
other benefits.
A consequence of single assignment is the PHI (Φ) node. PHI nodes are
required when a variable can be assigned a different value based on the
path of control flow. For example, the value of *b* at the end of
execution of the snippet below:
.. code-block:: c
a = 1;
if (v < 10)
a = 2;
b = a;
cannot be determined statically. The value '2' cannot be assigned to
the 'original' *a*, since *a* can be assigned only once. There are two
*a*'s here, and the last assignment has to choose which version to
pick. This is accomplished by adding a PHI node:
.. code-block:: c
a1 = 1;
if (v < 10)
a2 = 2;
b = PHI(a1, a2);
The PHI node selects *a1* or *a2*, depending on where the control
reached the PHI node. The argument *a1* of the PHI node is associated
with the block *"a1 = 1;"* and *a2* with the block *"a2 = 2;"*.
PHI nodes have to be explicitly created in the LLVM IR. Accordingly the
LLVM instruction set has an instruction called *phi*.
LLVM Assembly Language
======================
The LLVM IR can be represented offline in two formats:
- a textual, human-readable form, similar to assembly language, called
the LLVM assembly language (files with .ll extension)
- a binary form, called the LLVM bitcode (files with .bc extension)
All three formats (the in-memory IR, the LLVM assembly language and the
LLVM bitcode) represent the *same* information. Each format can be
converted into the other two formats (using LLVM APIs).
The `LLVM demo page <http://www.llvm.org/demo/>`_ lets you type in C or
C++ code, converts it into LLVM IR and outputs the IR as LLVM assembly
language code.
Just to get a feel of the LLVM assembly language, here's a function in
C, and the corresponding LLVM assembly (as generated by the demo page):
.. code-block:: c
/* compute sum of 1..n */
unsigned sum(unsigned n) {
if (n == 0)
return 0;
else
return n + sum(n-1);
}
The corresponding LLVM assembly:
.. code-block:: llvm
; ModuleID = '/tmp/webcompile/_7149_0.bc'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
target triple = "x86_64-linux-gnu"
define i32 @sum(i32 %n) nounwind readnone {
entry:
%0 = icmp eq i32 %n, 0 ; [#uses=1]
br i1 %0, label %bb2, label %bb1
bb1: ; preds = %entry
%1 = add i32 %n, -1 ; [#uses=2]
%2 = icmp eq i32 %1, 0 ; [#uses=1]
br i1 %2, label %sum.exit, label %bb1.i
bb1.i: ; preds = %bb1
%3 = add i32 %n, -2 ; [#uses=1]
%4 = tail call i32 @sum(i32 %3) nounwind ; [#uses=1]
%5 = add i32 %4, %1 ; [#uses=1]
br label %sum.exit
sum.exit: ; preds = %bb1.i, %bb1
%6 = phi i32 [ %5, %bb1.i ], [ 0, %bb1 ] ; [#uses=1]
%7 = add i32 %6, %n ; [#uses=1]
ret i32 %7
bb2: ; preds = %entry
ret i32 0
}
Note the usage of SSA form. The long string called ``target datalayout``
is a specification of the platform ABI (like endianness, sizes of types,
alignment etc.).
The `LLVM Language Reference <http://www.llvm.org/docs/LangRef.html>`_
defines the LLVM assembly language including the entire instruction set.
Modules
=======
`Modules <./llvm.core.Module.html>`_, in the LLVM IR, are similar to a
single *C* language source file (.c file). A module contains:
- functions (declarations and definitions)
- global variables and constants
- global type aliases for structures
Modules are top-level containers; all executable code representation is
contained within modules. Modules may be combined (linked) together to
give a bigger resultant module. During this process LLVM attempts to
reconcile the references between the combined modules.
Optimization and Passes
=======================
LLVM provides quite a few optimization algorithms that work on the IR.
These algorithms are organized as *passes*. Each pass does something
specific, like combining redundant instructions. Passes need not always
optimize the IR, it can also do other operations like inserting
instrumentation code, or analyzing the IR (the result of which can be
used by passes that do optimizations) or even printing call graphs.
This LLVM `documentation page <http://www.llvm.org/docs/Passes.html>`_
describes all the available passes, and what they do.
LLVM does not automatically run any passes. Passes have to be
explicitly selected and run on each module. This gives you the
flexibility to choose the transformations and optimizations that are
most suitable for the code in the module.
There is an LLVM binary called
`opt <http://www.llvm.org/cmds/opt.html>`_, which lets you run passes on
bitcode files from the command line. You can write your own passes (in
C/C++, as a shared library), which can then be loaded and executed by
``opt``.
(Although llvmpy does not allow you to write your own passes, it does
allow you to navigate the entire IR at any stage, and perform any
transforms on it as you like.)
A "pass manager" is responsible for loading passes, selecting the
correct objects to run them on (for example, a pass may work only on
functions, individually) and actually runs them. ``opt`` is a
command-line wrapper for the pass manager.
LLVM defines two kinds of pass managers:
- The
`FunctionPassManager <http://llvm.org/docs/doxygen/html/classllvm_1_1FunctionPassManager.html>`_
manages function or basic-block passes. These lighter weight passes
can be used immediately after each generated function to reduce
memory footprint.
- The
`PassManager <http://llvm.org/docs/doxygen/html/classllvm_1_1PassManager.html>`_
manages module passes for optimizing the entire module.
Bitcode
=======
LLVM IR can be represented in a bitcode format for disk storage. It is
`suitable for fast loading by a JIT
compiler <http://llvm.org/docs/LangRef.html#introduction>`_. See the
`LLVM documentation <http://llvm.org/docs/BitCodeFormat.html>`_ for
details about the bitcode format.
Execution Engine, JIT and Interpreter
=====================================
The *execution engine* implements execution of LLVM IR through an
interpreter or a JIT dynamic compiler. An *execution engine* can contain
multiple modules.
**Note**
Inter-module references are not possible. That is, module ``A`` cannot
directly call a function in module ``B``.
@ -1,89 +0,0 @@
***********************
The llvmpy Package
***********************
llvmpy is a Python package, consisting of 6 modules, that wraps enough
of the LLVM APIs to allow the implementation of your own compiler/VM
backend in pure Python. If you've come this far, you probably know why
this is a good idea.
Out of the 6 modules, one is an "extension" module (i.e., it is written
in C), and another one is a small private utility module, which leaves 4
public modules. These are:
- *llvm* -- top-level package, common classes (like exceptions)
- *llvm.core* -- IR-related APIs
- *llvm.ee* -- execution engine related APIs
- *llvm.passes* -- pass manager and passes related APIs
The modules contain only classes and (integer) constants. Mostly simple
Python constructs are used (deliberately) --
`property() <http://docs.python.org/lib/built-in-funcs.html>`_ and
`property
decorators <http://wiki.python.org/moin/PythonDecoratorLibrary>`_ are
probably the most exotic animals around. All classes are "new style"
classes. The APIs are designed to be navigable (and guessable!) once you
know a few conventions. These conventions are highlighted in the
sections below.
Here is a quick overview of the contents of each package:
llvm
----
- LLVMException -- exception class (currently the only one)
llvm.core
---------
- `Module <llvm.core.Module.html>`_ -- represents an LLVM Module
- `Type <types.html>`_ -- represents an LLVM Type
- `Value <values.html>`_ -- represents an LLVM Value, including:
globals, constants, variables, arguments, functions, instructions,
etc..
- `BasicBlock <llvm.core.BasicBlock.html>`_ -- another derived of
Value, represents an LLVM basic block
- `Builder <llvm.core.Builder.html>`_ -- used for creating
instructions, wraps LLVM IRBuilder helper class
- constants *TYPE\_\** that represents various types
- constants *CC\_\** that represent calling conventions
- constants *ICMP\_\** and *FCMP\_\** that represent integer and real
comparison predicates (like less than, greater than etc.)
- constants *LINKAGE\_\** that represent linkage of symbols (external,
internal etc.)
- constants *VISIBILITY\_\** that represents visibility of symbols
(default, hidden, protected)
- constants *ATTR\_\** that represent function parameter attributes
llvm.ee
-------
- `ExecutionEngine <llvm.ee.ExecutionEngine.html>`_ -- represents an
execution engine (which can be an either an interpreter or a JIT)
- `TargetData <llvm.ee.TargetData.html>`_ -- represents the ABI of the
  target platform (details like sizes and alignment of primitive types,
  endianness, etc.)
llvm.passes
-----------
- `PassManager <llvm.passes.PassManager.html>`_ -- represents an LLVM
pass manager
- `FunctionPassManager <llvm.passes.FunctionPassManager.html>`_ --
represents an LLVM function pass manager
- constants *PASS\_\** that represent various passes
A note on the importing of these modules
----------------------------------------
Pythonically, modules are imported with the statement
``import llvm.core``. However, you might find it more convenient to
import llvmpy modules thus:
.. code-block:: python
from llvm import *
from llvm.core import *
from llvm.ee import *
from llvm.passes import *
@ -1,123 +0,0 @@
+----------------+
| layout: page |
+----------------+
| title: Types |
+----------------+
Types are what you think they are. An instance of
`llvm.core.Type <llvm.core.Type.html>`_, or one of its derived classes,
represents a type. llvmpy does not use as many classes to represent
types as does LLVM itself. Some types are represented using
`llvm.core.Type <llvm.core.Type.html>`_ itself and the rest are
represented using derived classes of
`llvm.core.Type <llvm.core.Type.html>`_. As usual, an instance is
created via one of the static methods of `Type <llvm.core.Type.html>`_.
These methods return an instance of either
`llvm.core.Type <llvm.core.Type.html>`_ itself or one of its derived
classes.
The following table lists all the available types along with the static
method which has to be used to construct it and the name of the class
whose object is actually returned by the static method.
.. list-table::
   :header-rows: 1

   * - Name
     - Constructor Method
     - Class
   * - integer of bitwidth *n*
     - ``Type.int(n)``
     - `IntegerType <llvm.core.IntegerType.html>`_
   * - 32-bit float
     - ``Type.float()``
     - `Type <llvm.core.Type.html>`_
   * - 64-bit double
     - ``Type.double()``
     - `Type <llvm.core.Type.html>`_
   * - 80-bit float
     - ``Type.x86_fp80()``
     - `Type <llvm.core.Type.html>`_
   * - 128-bit float (112-bit mantissa)
     - ``Type.fp128()``
     - `Type <llvm.core.Type.html>`_
   * - 128-bit float (two 64-bits)
     - ``Type.ppc_fp128()``
     - `Type <llvm.core.Type.html>`_
   * - function
     - ``Type.function(r, p, v)``
     - `FunctionType <llvm.core.FunctionType.html>`_
   * - unpacked struct
     - ``Type.struct(eltys, name)``
     - `StructType <llvm.core.StructType.html>`_
   * - packed struct
     - ``Type.packed_struct(eltys, name)``
     - `StructType <llvm.core.StructType.html>`_
   * - opaque struct
     - ``Type.opaque(name)``
     - `StructType <llvm.core.StructType.html>`_
   * - array
     - ``Type.array(elty, count)``
     - `ArrayType <llvm.core.ArrayType.html>`_
   * - pointer to value of type *pty*
     - ``Type.pointer(pty, addrspc)``
     - `PointerType <llvm.core.PointerType.html>`_
   * - vector
     - ``Type.vector(elty, count)``
     - `VectorType <llvm.core.VectorType.html>`_
   * - void
     - ``Type.void()``
     - `Type <llvm.core.Type.html>`_
   * - label
     - ``Type.label()``
     - `Type <llvm.core.Type.html>`_
The class hierarchy is:
::
Type
IntegerType
FunctionType
StructType
ArrayType
PointerType
VectorType
--------------
An Example
----------
Here is an example that demonstrates the creation of types:
.. code-block:: python

   #!/usr/bin/env python

   from llvm.core import *

   # integers
   int_ty    = Type.int()
   bool_ty   = Type.int(1)
   int_64bit = Type.int(64)

   # floats
   sprec_real = Type.float()
   dprec_real = Type.double()

   # arrays and vectors
   intar_ty = Type.array(int_ty, 10)     # "typedef int intar_ty[10];"
   twodim   = Type.array(intar_ty, 10)   # "typedef int twodim[10][10];"
   vec      = Type.vector(int_ty, 10)    # a vector of 10 ints

   # structures
   s1_ty = Type.struct([int_ty, sprec_real])  # "struct s1_ty { int v1; float v2; };"

   # pointers
   intptr_ty = Type.pointer(int_ty)      # "typedef int *intptr_ty;"

   # functions
   f1 = Type.function(int_ty, [int_ty])               # takes one int_ty, returns an int_ty
   f2 = Type.function(Type.void(), [int_ty, int_ty])  # takes two int_tys, returns nothing
   f3 = Type.function(Type.void(), (int_ty, int_ty))  # same as f2; any iterable can be used

   fnargs = [Type.pointer(Type.int(8))]
   printf = Type.function(Type.int(), fnargs, True)   # variadic function
--------------
Another Example: Recursive Type
-------------------------------
The type system was rewritten in LLVM 3.0. The old opaque type was
removed. Instead, an identified ``StructType`` can now be defined
without a body; doing so creates an opaque structure. The body can then
be set after the structure is constructed.
(See the `LLVM
Blog <http://blog.llvm.org/2011/11/llvm-30-type-system-rewrite.html>`_
for details about the new type system.)
The following code defines an opaque structure named "mystruct". The
body is defined after construction using ``StructType.set_body``. The
second subtype is a pointer to the "mystruct" type itself.
.. code-block:: python
ts = Type.opaque('mystruct')
ts.set_body([Type.int(), Type.pointer(ts)])
@ -1,17 +0,0 @@
************
User Guide
************
llvmpy provides Python bindings for LLVM. This document explains how
you can set it up and use it. A working knowledge of Python and a basic
idea of LLVM is assumed.
.. toctree::
:maxdepth: 1
getting_started.rst
llvm_concepts.rst
llvmpy_package.rst
@ -1,78 +0,0 @@
+-----------------+
| layout: page |
+-----------------+
| title: Values |
+-----------------+
`llvm.core.Value <llvm.core.Value.html>`_ is the base class of all
values computed by a program that may be used as operands to other
values. A value has a type associated with it (an object of
`llvm.core.Type <types.html>`_).
The class hierarchy is:
::
Value
User
Constant
ConstantExpr
ConstantAggregateZero
ConstantInt
ConstantFP
ConstantArray
ConstantStruct
ConstantVector
ConstantPointerNull
UndefValue
GlobalValue
GlobalVariable
Function
Instruction
CallOrInvokeInstruction
PHINode
SwitchInstruction
CompareInstruction
Argument
BasicBlock
The `Value <llvm.core.Value.html>`_ class is abstract; it is not meant
to be instantiated. `User <llvm.core.User.html>`_ is a
`Value <llvm.core.Value.html>`_ that in turn uses (i.e., can refer to)
other values (for example, the constant expression 1+2 refers to the
two constant values 1 and 2).
`Constant <llvm.core.Constant.html>`_-s represent constants that appear
within code or as initializers of globals. They are constructed using
static methods of `Constant <llvm.core.Constant.html>`_. Various types
of constants are represented by various subclasses of
`Constant <llvm.core.Constant.html>`_. However, most of them are empty
and do not provide any additional attributes or methods over
`Constant <llvm.core.Constant.html>`_.
The `Function <functions.html>`_ object represents an instance of a
function type. Such objects contain
`Argument <llvm.core.Argument.html>`_ objects, which represent the
actual, local-variable-like arguments of the function (not to be
confused with the arguments returned by a function *type* object --
these represent the *type* of the arguments).
The various `Instruction <llvm.core.Instruction.html>`_-s are created by
the `Builder <llvm.core.Builder.html>`_ class. Most instructions are
represented by `Instruction <llvm.core.Instruction.html>`_ itself, but
there are a few subclasses that represent interesting instructions.
`Value <llvm.core.Value.html>`_ objects have a type (read-only), and a
name (read-write).
**Related Links** `functions <functions.html>`_,
`comparision <comparision.html>`_,
`llvm.core.Value <llvm.core.Value.html>`_,
`llvm.core.User <llvm.core.User.html>`_,
`llvm.core.Constant <llvm.core.Constant.html>`_,
`llvm.core.GlobalValue <llvm.core.GlobalValue.html>`_,
`llvm.core.GlobalVariable <llvm.core.GlobalVariable.html>`_,
`llvm.core.Argument <llvm.core.Argument.html>`_,
`llvm.core.Instruction <llvm.core.Instruction.html>`_,
`llvm.core.Builder <llvm.core.Builder.html>`_,
`llvm.core.BasicBlock <llvm.core.BasicBlock.html>`_


@ -1,25 +0,0 @@
.. llvmpy documentation master file, created by
sphinx-quickstart on Wed Aug 8 17:33:58 2012.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
Documentation for llvmpy
========================
Contents:
.. toctree::
:titlesonly:
:maxdepth: 2
doc/userguide.rst
doc/examples.rst
Indices and tables
==================
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`


@ -1,540 +0,0 @@
/*
* basic.css
* ~~~~~~~~~
*
* Sphinx stylesheet -- basic theme.
*
* :copyright: Copyright 2007-2011 by the Sphinx team, see AUTHORS.
* :license: BSD, see LICENSE for details.
*
*/
/* -- main layout ----------------------------------------------------------- */
div.clearer {
clear: both;
}
/* -- relbar ---------------------------------------------------------------- */
div.related {
width: 100%;
font-size: 90%;
}
div.related h3 {
display: none;
}
div.related ul {
margin: 0;
padding: 0 0 0 10px;
list-style: none;
}
div.related li {
display: inline;
}
div.related li.right {
float: right;
margin-right: 5px;
}
/* -- sidebar --------------------------------------------------------------- */
div.sphinxsidebarwrapper {
padding: 10px 5px 0 10px;
}
div.sphinxsidebar {
float: left;
width: 230px;
margin-left: -100%;
font-size: 90%;
}
div.sphinxsidebar ul {
list-style: none;
}
div.sphinxsidebar ul ul,
div.sphinxsidebar ul.want-points {
margin-left: 20px;
list-style: square;
}
div.sphinxsidebar ul ul {
margin-top: 0;
margin-bottom: 0;
}
div.sphinxsidebar form {
margin-top: 10px;
}
div.sphinxsidebar input {
border: 1px solid #98dbcc;
font-family: sans-serif;
font-size: 1em;
}
div.sphinxsidebar input[type="text"] {
width: 170px;
}
div.sphinxsidebar input[type="submit"] {
width: 30px;
}
img {
border: 0;
}
/* -- search page ----------------------------------------------------------- */
ul.search {
margin: 10px 0 0 20px;
padding: 0;
}
ul.search li {
padding: 5px 0 5px 20px;
background-image: url(file.png);
background-repeat: no-repeat;
background-position: 0 7px;
}
ul.search li a {
font-weight: bold;
}
ul.search li div.context {
color: #888;
margin: 2px 0 0 30px;
text-align: left;
}
ul.keywordmatches li.goodmatch a {
font-weight: bold;
}
/* -- index page ------------------------------------------------------------ */
table.contentstable {
width: 90%;
}
table.contentstable p.biglink {
line-height: 150%;
}
a.biglink {
font-size: 1.3em;
}
span.linkdescr {
font-style: italic;
padding-top: 5px;
font-size: 90%;
}
/* -- general index --------------------------------------------------------- */
table.indextable {
width: 100%;
}
table.indextable td {
text-align: left;
vertical-align: top;
}
table.indextable dl, table.indextable dd {
margin-top: 0;
margin-bottom: 0;
}
table.indextable tr.pcap {
height: 10px;
}
table.indextable tr.cap {
margin-top: 10px;
background-color: #f2f2f2;
}
img.toggler {
margin-right: 3px;
margin-top: 3px;
cursor: pointer;
}
div.modindex-jumpbox {
border-top: 1px solid #ddd;
border-bottom: 1px solid #ddd;
margin: 1em 0 1em 0;
padding: 0.4em;
}
div.genindex-jumpbox {
border-top: 1px solid #ddd;
border-bottom: 1px solid #ddd;
margin: 1em 0 1em 0;
padding: 0.4em;
}
/* -- general body styles --------------------------------------------------- */
a.headerlink {
visibility: hidden;
}
h1:hover > a.headerlink,
h2:hover > a.headerlink,
h3:hover > a.headerlink,
h4:hover > a.headerlink,
h5:hover > a.headerlink,
h6:hover > a.headerlink,
dt:hover > a.headerlink {
visibility: visible;
}
div.body p.caption {
text-align: inherit;
}
div.body td {
text-align: left;
}
.field-list ul {
padding-left: 1em;
}
.first {
margin-top: 0 !important;
}
p.rubric {
margin-top: 30px;
font-weight: bold;
}
img.align-left, .figure.align-left, object.align-left {
clear: left;
float: left;
margin-right: 1em;
}
img.align-right, .figure.align-right, object.align-right {
clear: right;
float: right;
margin-left: 1em;
}
img.align-center, .figure.align-center, object.align-center {
display: block;
margin-left: auto;
margin-right: auto;
}
.align-left {
text-align: left;
}
.align-center {
text-align: center;
}
.align-right {
text-align: right;
}
/* -- sidebars -------------------------------------------------------------- */
div.sidebar {
margin: 0 0 0.5em 1em;
border: 1px solid #ddb;
padding: 7px 7px 0 7px;
background-color: #ffe;
width: 40%;
float: right;
}
p.sidebar-title {
font-weight: bold;
}
/* -- topics ---------------------------------------------------------------- */
div.topic {
border: 1px solid #ccc;
padding: 7px 7px 0 7px;
margin: 10px 0 10px 0;
}
p.topic-title {
font-size: 1.1em;
font-weight: bold;
margin-top: 10px;
}
/* -- admonitions ----------------------------------------------------------- */
div.admonition {
margin-top: 10px;
margin-bottom: 10px;
padding: 7px;
}
div.admonition dt {
font-weight: bold;
}
div.admonition dl {
margin-bottom: 0;
}
p.admonition-title {
margin: 0px 10px 5px 0px;
font-weight: bold;
}
div.body p.centered {
text-align: center;
margin-top: 25px;
}
/* -- tables ---------------------------------------------------------------- */
table.docutils {
border: 0;
border-collapse: collapse;
}
table.docutils td, table.docutils th {
padding: 1px 8px 1px 5px;
border-top: 0;
border-left: 0;
border-right: 0;
border-bottom: 1px solid #aaa;
}
table.field-list td, table.field-list th {
border: 0 !important;
}
table.footnote td, table.footnote th {
border: 0 !important;
}
th {
text-align: left;
padding-right: 5px;
}
table.citation {
border-left: solid 1px gray;
margin-left: 1px;
}
table.citation td {
border-bottom: none;
}
/* -- other body styles ----------------------------------------------------- */
ol.arabic {
list-style: decimal;
}
ol.loweralpha {
list-style: lower-alpha;
}
ol.upperalpha {
list-style: upper-alpha;
}
ol.lowerroman {
list-style: lower-roman;
}
ol.upperroman {
list-style: upper-roman;
}
dl {
margin-bottom: 15px;
}
dd p {
margin-top: 0px;
}
dd ul, dd table {
margin-bottom: 10px;
}
dd {
margin-top: 3px;
margin-bottom: 10px;
margin-left: 30px;
}
dt:target, .highlighted {
background-color: #fbe54e;
}
dl.glossary dt {
font-weight: bold;
font-size: 1.1em;
}
.field-list ul {
margin: 0;
padding-left: 1em;
}
.field-list p {
margin: 0;
}
.refcount {
color: #060;
}
.optional {
font-size: 1.3em;
}
.versionmodified {
font-style: italic;
}
.system-message {
background-color: #fda;
padding: 5px;
border: 3px solid red;
}
.footnote:target {
background-color: #ffa;
}
.line-block {
display: block;
margin-top: 1em;
margin-bottom: 1em;
}
.line-block .line-block {
margin-top: 0;
margin-bottom: 0;
margin-left: 1.5em;
}
.guilabel, .menuselection {
font-family: sans-serif;
}
.accelerator {
text-decoration: underline;
}
.classifier {
font-style: oblique;
}
abbr, acronym {
border-bottom: dotted 1px;
cursor: help;
}
/* -- code displays --------------------------------------------------------- */
pre {
overflow: auto;
overflow-y: hidden; /* fixes display issues on Chrome browsers */
}
td.linenos pre {
padding: 5px 0px;
border: 0;
background-color: transparent;
color: #aaa;
}
table.highlighttable {
margin-left: 0.5em;
}
table.highlighttable td {
padding: 0 0.5em 0 0.5em;
}
tt.descname {
background-color: transparent;
font-weight: bold;
font-size: 1.2em;
}
tt.descclassname {
background-color: transparent;
}
tt.xref, a tt {
background-color: transparent;
font-weight: bold;
}
h1 tt, h2 tt, h3 tt, h4 tt, h5 tt, h6 tt {
background-color: transparent;
}
.viewcode-link {
float: right;
}
.viewcode-back {
float: right;
font-family: sans-serif;
}
div.viewcode-block:target {
margin: -1px -10px;
padding: 0 10px;
}
/* -- math display ---------------------------------------------------------- */
img.math {
vertical-align: middle;
}
div.body div.math p {
text-align: center;
}
span.eqno {
float: right;
}
/* -- printout stylesheet --------------------------------------------------- */
@media print {
div.document,
div.documentwrapper,
div.bodywrapper {
margin: 0 !important;
width: 100%;
}
div.sphinxsidebar,
div.related,
div.footer,
#top-link {
display: none;
}
}

Some files were not shown because too many files have changed in this diff.