
The SWIG documentation is being updated to reflect new SWIG features and enhancements. However, this update process is not quite finished--there is a lot of old SWIG-1.1 documentation and it is taking some time to update all of it. Please pardon our dust (or volunteer to help!).
This documentation has not been completely updated from SWIG-1.1, but most of the topics still apply to the current release. Make sure you read the SWIG Basics chapter before reading any of these chapters. Also, SWIG-1.3.10 features extensive changes to the implementation of typemaps. Make sure you read the Typemaps chapter above if you are using this feature.
SWIG (Simplified Wrapper and Interface Generator) is a software development tool for building scripting language interfaces to C and C++ programs. Originally developed in 1995, SWIG was first used by scientists in the Theoretical Physics Division at Los Alamos National Laboratory for building user interfaces to simulation codes running on the Connection Machine 5 supercomputer. In this environment, scientists needed to work with huge amounts of simulation data, complex hardware, and a constantly changing code base. The use of a scripting language interface provided a simple yet highly flexible foundation for solving these types of problems. SWIG simplifies development by largely automating the task of scripting language integration--allowing developers and users to focus on more important problems.
Although SWIG was originally developed for scientific applications, it has since evolved into a general purpose tool that is used in a wide variety of applications--in fact almost anything where C/C++ programming is involved.
Since SWIG was released in 1996, its user base and applicability have continued to grow. Although its rate of development has varied, an active development effort has continued to make improvements to the system. Today, nearly a dozen developers are working to create SWIG-2.0---a system that aims to provide wrapping support for nearly all of the ANSI C++ standard and approximately ten target languages including Guile, Java, Mzscheme, Ocaml, Perl, Pike, PHP, Python, Ruby, and Tcl.
For several years, the most stable version of SWIG has been release 1.1p5. Starting with version 1.3, a new version numbering scheme has been adopted. Odd version numbers (1.3, 1.5, etc.) represent development versions of SWIG. Even version numbers (1.4, 1.6, etc.) represent stable releases. Currently, developers are working to create a stable SWIG-2.0 release. Don't let the development status of SWIG-1.3 scare you---it is much more stable (and capable) than SWIG-1.1p5.
The official location of SWIG related material is
This site contains the latest version of the software, users guide, and information regarding bugs, installation problems, and implementation tricks.
You can also subscribe to the swig-user mailing list by visiting the page
The mailing list often discusses some of the more technical aspects of SWIG along with information about beta releases and future work.
SVN access to the latest version of SWIG is also available. More information about this can be obtained at:
This manual assumes that you know how to write C/C++ programs and that you have at least heard of scripting languages such as Tcl, Python, and Perl. A detailed knowledge of these scripting languages is not required although some familiarity won't hurt. No prior experience with building C extensions to these languages is required---after all, this is what SWIG does automatically. However, you should be reasonably familiar with the use of compilers, linkers, and makefiles since making scripting language extensions is somewhat more complicated than writing a normal C program.
Recent SWIG releases have become significantly more capable in their C++ handling--especially support for advanced features like namespaces, overloaded operators, and templates. Whenever possible, this manual tries to cover the technicalities of this interface. However, this isn't meant to be a tutorial on C++ programming. For many of the gory details, you will almost certainly want to consult a good C++ reference. If you don't program in C++, you may just want to skip those parts of the manual.
The first few chapters of this manual describe SWIG in general and provide an overview of its capabilities. The remaining chapters are devoted to specific SWIG language modules and are self contained. Thus, if you are using SWIG to build Python interfaces, you can probably skip to that chapter and find almost everything you need to know. Caveat: we are currently working on a documentation rewrite and many of the older language module chapters are still somewhat out of date.
If you hate reading manuals, glance at the "Introduction" which contains a few simple examples. These examples contain about 95% of everything you need to know to use SWIG. After that, simply use the language-specific chapters as a reference. The SWIG distribution also comes with a large directory of examples that illustrate different topics.
If you are a previous user of SWIG, don't expect recent versions of SWIG to provide backwards compatibility. In fact, backwards compatibility issues may arise even between successive 1.3.x releases. Although these incompatibilities are regrettable, SWIG-1.3 is an active development project. The primary goal of this effort is to make SWIG better---a process that would simply be impossible if the developers were constantly bogged down with backwards compatibility issues.
On a positive note, a few incompatibilities are a small price to pay for the large number of new features that have been added---namespaces, templates, smart pointers, overloaded methods, operators, and more.
If you need to work with different versions of SWIG and backwards compatibility is an issue, you can use the SWIG_VERSION preprocessor symbol which holds the version of SWIG being executed. SWIG_VERSION is a hexadecimal integer such as 0x010311 (corresponding to SWIG-1.3.11). This can be used in an interface file to define different typemaps, take advantage of different features etc:
#if SWIG_VERSION >= 0x010311
/* Use some fancy new feature */
#endif
Note: The version symbol is not defined in the generated SWIG wrapper file. The SWIG preprocessor has defined SWIG_VERSION since SWIG-1.3.11.
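The encoding of SWIG_VERSION can be illustrated with a small C sketch. It assumes, based only on the 0x010311 example above, that each pair of hexadecimal digits is read as two decimal digits (so 0x11 means 11). The type swig_version and the function decode_swig_version() are hypothetical names introduced for this illustration and are not part of SWIG.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type, not part of SWIG. */
struct swig_version {
    int major;
    int minor;
    int patch;
};

/* Read one two-hex-digit field as two decimal digits, so 0x11 -> 11.
   Assumption: this matches the 0x010311 <-> 1.3.11 example in the text. */
static int field_to_decimal(unsigned int field) {
    return (int)(((field >> 4) & 0xf) * 10 + (field & 0xf));
}

/* Split a SWIG_VERSION-style value into major, minor and patch numbers. */
void decode_swig_version(unsigned int value, struct swig_version *out) {
    assert(out != NULL);
    out->major = field_to_decimal((value >> 16) & 0xff);
    out->minor = field_to_decimal((value >> 8) & 0xff);
    out->patch = field_to_decimal(value & 0xff);
}
```

Under this reading, later releases always have numerically larger values, which is why a plain >= comparison works in the #if test.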
SWIG is an unfunded project that would not be possible without the contributions of many people. Most recent SWIG development has been supported by Matthias Köppe, William Fulton, Lyle Johnson, Richard Palmer, Thien-Thi Nguyen, Jason Stewart, Loic Dachary, Masaki Fukushima, Luigi Ballabio, Sam Liddicott, Art Yerkes, Marcelo Matus, Harco de Hilster, John Lenz, and Surendra Singhi.
Historically, the following people contributed to early versions of SWIG. Peter Lomdahl, Brad Holian, Shujia Zhou, Niels Jensen, and Tim Germann at Los Alamos National Laboratory were the first users. Patrick Tullmann at the University of Utah suggested the idea of automatic documentation generation. John Schmidt and Kurtis Bleeker at the University of Utah tested out the early versions. Chris Johnson supported SWIG's development at the University of Utah. John Buckman, Larry Virden, and Tom Schwaller provided valuable input on the first releases and improving the portability of SWIG. David Fletcher and Gary Holt have provided a great deal of input on improving SWIG's Perl5 implementation. Kevin Butler contributed the first Windows NT port.
Although every attempt has been made to make SWIG bug-free, we are also trying to make feature improvements that may introduce bugs. To report a bug, either send mail to the swig-devel mailing list or file a report in the SWIG bug tracker. In your report, be as specific as possible, including (if applicable) error messages, tracebacks (if a core dump occurred), corresponding portions of the SWIG interface file used, and any important pieces of the SWIG generated wrapper code. We can only fix bugs if we know about them.
SWIG is a software development tool that simplifies the task of interfacing different languages to C and C++ programs. In a nutshell, SWIG is a compiler that takes C declarations and creates the wrappers needed to access those declarations from other languages including Perl, Python, Tcl, Ruby, Guile, and Java. SWIG normally requires no modifications to existing code and can often be used to build a usable interface in only a few minutes. Possible applications of SWIG include:
SWIG was originally designed to make it extremely easy for scientists and engineers to build extensible scientific software without having to get a degree in software engineering. Because of this, the use of SWIG tends to be somewhat informal and ad-hoc (e.g., SWIG does not require users to provide formal interface specifications as you would find in a dedicated IDL compiler). Although this style of development isn't appropriate for every project, it is particularly well suited to software development in the small; especially the research and development work that is commonly found in scientific and engineering projects.
As stated in the previous section, the primary purpose of SWIG is to simplify the task of integrating C/C++ with other programming languages. However, why would anyone want to do that? To answer that question, it is useful to list a few strengths of C/C++ programming:
Next, let's list a few problems with C/C++ programming
To address these limitations, many programmers have arrived at the conclusion that it is much easier to use different programming languages for different tasks. For instance, writing a graphical user interface may be significantly easier in a scripting language like Python or Tcl (consider the reasons why millions of programmers have used languages like Visual Basic if you need more proof). An interactive interpreter might also serve as a useful debugging and testing tool. Other languages like Java might greatly simplify the task of writing distributed computing software. The key point is that different programming languages offer different strengths and weaknesses. Moreover, it is extremely unlikely that any programming language is ever going to be perfect. Therefore, by combining languages together, you can utilize the best features of each language and greatly simplify certain aspects of software development.
From the standpoint of C/C++, a lot of people use SWIG because they want to break out of the traditional monolithic C programming model which usually results in programs that resemble this:
Instead of going down that route, incorporating C/C++ into a higher level language often results in a more modular design, less code, better flexibility, and increased programmer productivity.
SWIG tries to make the problem of C/C++ integration as painless as possible. This allows you to focus on the underlying C program and on using the high-level language interface, rather than on the tedious and complex chore of making the two languages talk to each other. At the same time, SWIG recognizes that all applications are different. Therefore, it provides a wide variety of customization features that let you change almost every aspect of the language bindings. This is the main reason why SWIG has such a large user manual ;-).
The best way to illustrate SWIG is with a simple example. Consider the following C code:
/* File : example.c */
double My_variable = 3.0;
/* Compute factorial of n */
int fact(int n) {
    if (n <= 1) return 1;
    else return n*fact(n-1);
}
/* Compute n mod m */
int my_mod(int n, int m) {
    return(n % m);
}
Suppose that you wanted to access these functions and the global variable My_variable from Tcl. You start by making a SWIG interface file as shown below (by convention, these files carry a .i suffix) :
/* File : example.i */
%module example
%{
/* Put headers and other declarations here */
extern double My_variable;
extern int fact(int);
extern int my_mod(int n, int m);
%}
extern double My_variable;
extern int fact(int);
extern int my_mod(int n, int m);
The interface file contains ANSI C function prototypes and variable declarations. The %module directive defines the name of the module that will be created by SWIG. The %{,%} block provides a location for inserting additional code such as C header files or additional C declarations.
SWIG is invoked using the swig command. We can use this to build a Tcl module (under Linux) as follows :
unix > swig -tcl example.i
unix > gcc -c -fpic example.c example_wrap.c -I/usr/local/include
unix > gcc -shared example.o example_wrap.o -o example.so
unix > tclsh
% load ./example.so
% fact 4
24
% my_mod 23 7
2
% expr $My_variable + 4.5
7.5
%
The swig command produced a new file called example_wrap.c that should be compiled along with the example.c file. Most operating systems and scripting languages now support dynamic loading of modules. In our example, our Tcl module has been compiled into a shared library that can be loaded into Tcl. When loaded, Tcl can now access the functions and variables declared in the SWIG interface. A look at the file example_wrap.c reveals a hideous mess. However, you almost never need to worry about it.
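The values printed in the Tcl session can be cross-checked against plain C before any wrapping takes place. The following is a minimal sketch and not part of the SWIG distribution: it repeats the definitions from example.c so that it compiles on its own, and the helper count_mismatches() is a hypothetical name introduced here.

```c
#include <assert.h>

/* Definitions repeated from example.c so the sketch is self-contained. */
double My_variable = 3.0;

int fact(int n) {
    assert(n >= 0);   /* the example only makes sense for non-negative n */
    if (n <= 1) return 1;
    else return n * fact(n - 1);
}

int my_mod(int n, int m) {
    return (n % m);
}

/* Hypothetical helper: count how many of the three results shown in the
   Tcl session differ from what the C code computes directly. */
int count_mismatches(void) {
    int mismatches = 0;
    if (fact(4) != 24) mismatches++;
    if (my_mod(23, 7) != 2) mismatches++;
    if (My_variable + 4.5 != 7.5) mismatches++;
    return mismatches;
}
```

If count_mismatches() returns 0, the scripting session and the C code agree, which is the behaviour a correct wrapper should preserve.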
Now, let's turn these functions into a Perl5 module. Without making any changes type the following (shown for Solaris):
unix > swig -perl5 example.i
unix > gcc -c example.c example_wrap.c \
        -I/usr/local/lib/perl5/sun4-solaris/5.003/CORE
unix > ld -G example.o example_wrap.o -o example.so    # This is for Solaris
unix > perl5.003
use example;
print example::fact(4), "\n";
print example::my_mod(23,7), "\n";
print $example::My_variable + 4.5, "\n";
<ctrl-d>
24
2
7.5
unix >
Finally, let's build a module for Python (shown for Linux).
unix > swig -python example.i
unix > gcc -c -fpic example.c example_wrap.c -I/usr/local/include/python2.0
unix > gcc -shared example.o example_wrap.o -o _example.so
unix > python
Python 2.0 (#6, Feb 21 2001, 13:29:45)
[GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> import example
>>> example.fact(4)
24
>>> example.my_mod(23,7)
2
>>> example.cvar.My_variable + 4.5
7.5
The truly lazy programmer may wonder why we needed the extra interface file at all. As it turns out, you can often do without it. For example, you could also build a Perl5 module by just running SWIG on the C header file and specifying a module name as follows:
unix > swig -perl5 -module example example.h
unix > gcc -c example.c example_wrap.c \
        -I/usr/local/lib/perl5/sun4-solaris/5.003/CORE
unix > ld -G example.o example_wrap.o -o example.so
unix > perl5.003
use example;
print example::fact(4), "\n";
print example::my_mod(23,7), "\n";
print $example::My_variable + 4.5, "\n";
<ctrl-d>
24
2
7.5
A primary goal of the SWIG project is to make the language binding process extremely easy. Although a few simple examples have been shown, SWIG is quite capable in supporting most of C++. Some of the major features include:
Currently, the only major C++ feature not supported is nested classes--a limitation that will be removed in a future release.
It is important to stress that SWIG is not a simplistic C++ lexing tool like several apparently similar wrapper generation tools. SWIG not only parses C++, it implements the full C++ type system and it is able to understand C++ semantics. SWIG generates its wrappers with full knowledge of this information. As a result, you will find SWIG to be just as capable of dealing with nasty corner cases as it is in wrapping simple C++ code. In fact, SWIG is able to handle C++ code that stresses the very limits of many C++ compilers.
When used as intended, SWIG requires minimal (if any) modification to existing C or C++ code. This makes SWIG extremely easy to use with existing packages and promotes software reuse and modularity. By making the C/C++ code independent of the high level interface, you can change the interface and reuse the code in other applications. It is also possible to support different types of interfaces depending on the application.
SWIG is a command line tool and as such can be incorporated into any build system that supports invoking external tools/compilers. SWIG is most commonly invoked from within a Makefile, but is also known to be invoked from popular IDEs such as Microsoft Visual Studio.
If you are using the GNU Autotools (Autoconf / Automake / Libtool) to configure SWIG use in your project, the SWIG Autoconf macros can be used. The primary macro is ac_pkg_swig; see http://www.gnu.org/software/ac-archive/htmldoc/ac_pkg_swig.html. The ac_python_devel macro is also helpful for generating Python extensions. See the Autoconf Macro Archive for further information on this and other Autoconf macros.
There is growing support for SWIG in some build tools. For example, CMake is a cross-platform, open-source build manager with built-in support for SWIG. CMake can detect the SWIG executable and many of the target language libraries for linking against. CMake knows how to build shared libraries and loadable modules on many different operating systems. This allows easy cross platform SWIG development. It can also generate the custom commands necessary for driving SWIG from IDEs and makefiles. All of this can be done from a single cross platform input file. The following example is a CMake input file for creating a Python wrapper for the SWIG interface file, example.i:
# This is a CMake example for Python
FIND_PACKAGE(SWIG REQUIRED)
INCLUDE(${SWIG_USE_FILE})
FIND_PACKAGE(PythonLibs)
INCLUDE_DIRECTORIES(${PYTHON_INCLUDE_PATH})
INCLUDE_DIRECTORIES(${CMAKE_CURRENT_SOURCE_DIR})
SET(CMAKE_SWIG_FLAGS "")
SET_SOURCE_FILES_PROPERTIES(example.i PROPERTIES CPLUSPLUS ON)
SET_SOURCE_FILES_PROPERTIES(example.i PROPERTIES SWIG_FLAGS "-includeall")
SWIG_ADD_MODULE(example python example.i example.cxx)
SWIG_LINK_LIBRARIES(example ${PYTHON_LIBRARIES})
The above example will generate native build files such as makefiles, nmake files and Visual Studio projects which will invoke SWIG and compile the generated C++ files into _example.so (UNIX) or _example.dll (Windows).
SWIG is designed to produce working code that needs no hand-modification (in fact, if you look at the output, you probably won't want to modify it). You should think of your target language interface being defined entirely by the input to SWIG, not the resulting output file. While this approach may limit flexibility for hard-core hackers, it allows others to forget about the low-level implementation details.
No, this isn't a special section on the sorry state of world politics. However, it may be useful to know that SWIG was written with a certain "philosophy" about programming---namely that programmers are smart and that tools should just stay out of their way. Because of that, you will find that SWIG is extremely permissive in what it lets you get away with. In fact, you can use SWIG to go well beyond "shooting yourself in the foot" if dangerous programming is your goal. On the other hand, this kind of freedom may be exactly what is needed to work with complicated and unusual C/C++ applications.
Ironically, the freedom that SWIG provides is countered by an extremely conservative approach to code generation. At its core, SWIG tries to distill even the most advanced C++ code down to a small well-defined set of interface building techniques based on ANSI C programming. Because of this, you will find that SWIG interfaces can be easily compiled by virtually every C/C++ compiler and that they can be used on any platform. Again, this is an important part of staying out of the programmer's way---the last thing any developer wants to do is to spend their time debugging the output of a tool that relies on non-portable or unreliable programming features.
This chapter describes SWIG usage on Microsoft Windows. Installing SWIG and running the examples is covered as well as building the SWIG executable. Usage within the Unix like environments MinGW and Cygwin is also detailed.
SWIG does not come with the usual Windows type installation program, however it is quite easy to get started. The main steps are:
The swigwin distribution contains the SWIG Windows executable, swig.exe, which will run on 32 bit versions of Windows, i.e. Windows 95/98/ME/NT/2000/XP. If you want to build your own swig.exe, have a look at Building swig.exe on Windows.
Using Microsoft Visual C++ is the most common approach to compiling and linking SWIG's output. The Examples directory has a few Visual C++ project files (.dsp files). These were produced by Visual C++ 6, although they should also work in Visual C++ 5. Later versions of Visual Studio should also be able to open and convert these project files. The C# examples come with .NET 2003 solution (.sln) and project files instead of Visual C++ 6 project files. The project files have been set up to execute SWIG in a custom build rule for the SWIG interface (.i) file. Alternatively run the examples using Cygwin.
More information on each of the examples is available with the examples distributed with SWIG (Examples/index.html).
Ensure the SWIG executable is in the SWIG root directory, as supplied, in order for the examples to work. Most languages require some environment variables to be set before running Visual C++. Note that Visual C++ must be re-started to pick up any changes in environment variables. Open up an example .dsp file; Visual C++ will create a workspace for you (.dsw file). Ensure the Release build is selected, then do a Rebuild All from the Build menu. The required environment variables are displayed with their current values.
The required environment variables for each language module are also listed below. They are usually set from the Control Panel and System properties, but this depends on which flavour of Windows you are running. If you don't want to use environment variables, then replace all occurrences of the environment variables in the .dsp files with hard-coded values. If you are interested in how the project files are set up, there is explanatory information in some of the language modules' documentation.
The C# examples do not require any environment variables to be set as a C# project file is included. Just open up the .sln solution file in Visual Studio .NET 2003 or later, select Release Build, and do a Rebuild All from the Build menu. The accompanying C# and C++ project files are automatically used by the solution file.
JAVA_INCLUDE : Set this to the directory containing jni.h
JAVA_BIN : Set this to the bin directory containing javac.exe
Example using JDK1.3:
JAVA_INCLUDE: D:\jdk1.3\include
JAVA_BIN: D:\jdk1.3\bin
PERL5_INCLUDE : Set this to the directory containing perl.h
PERL5_LIB : Set this to the Perl library including path for linking
Example using nsPerl 5.004_04:
PERL5_INCLUDE: D:\nsPerl5.004_04\lib\CORE
PERL5_LIB: D:\nsPerl5.004_04\lib\CORE\perl.lib
PYTHON_INCLUDE : Set this to the directory that contains python.h
PYTHON_LIB : Set this to the python library including path for linking
Example using Python 2.1.1:
PYTHON_INCLUDE: D:\python21\include
PYTHON_LIB: D:\python21\libs\python21.lib
TCL_INCLUDE : Set this to the directory containing tcl.h
TCL_LIB : Set this to the TCL library including path for linking
Example using ActiveTcl 8.3.3.3
TCL_INCLUDE: D:\tcl\include
TCL_LIB: D:\tcl\lib\tcl83.lib
R_INCLUDE : Set this to the directory containing R.h
R_LIB : Set this to the R library (Rdll.lib) including path for linking. The library needs to be built as described in the R README.packages file (the pexports.exe approach is the easiest).
Example using R 2.5.1:
R_INCLUDE: C:\Program Files\R\R-2.5.1\include
R_LIB: C:\Program Files\R\R-2.5.1\bin\Rdll.lib
RUBY_INCLUDE : Set this to the directory containing ruby.h
RUBY_LIB : Set this to the ruby library including path for linking
Example using Ruby 1.6.4:
RUBY_INCLUDE: D:\ruby\lib\ruby\1.6\i586-mswin32
RUBY_LIB: D:\ruby\lib\mswin32-ruby16.lib
If you do not have access to Visual C++ you will have to set up project files / Makefiles for your chosen compiler. There is a section in each of the language modules detailing what needs setting up using Visual C++ which may be of some guidance. Alternatively you may want to use Cygwin as described in the following section.
SWIG can also be compiled and run using Cygwin or MinGW, which provide a Unix like front end to Windows and come free with gcc, an ANSI C/C++ compiler. However, this is not a recommended approach as the prebuilt executable is supplied.
If you want to replicate the build of swig.exe that comes with the download, follow the MinGW instructions below. This is not necessary to use the supplied swig.exe. This information is provided for those that want to modify the SWIG source code in a Windows environment. Normally this is not needed, so most people will want to ignore this section.
The short abbreviated instructions follow...
The step by step instructions to download and install MinGW and MSYS, then download and build the latest version of SWIG from SVN follow... Note that the instructions for obtaining SWIG from SVN are also online at SWIG SVN.
Pitfall note: Execute the steps in the order shown and don't use spaces in path names. In fact it is best to use the default installation directories.
cd /
tar -jxf msys-automake-1.8.2.tar.bz2
tar -jxf msys-autoconf-2.59.tar.bz2
tar -zxf bison-2.0-MSYS.tar.gz
mkdir /usr/src
cd /usr/src
svn co https://swig.svn.sourceforge.net/svnroot/swig/trunk swig
cd /usr/src/swig
./autogen.sh
./configure
make
Note that SWIG can also be built using Cygwin. However, SWIG will then require the Cygwin DLL when executing. Follow the Unix instructions in the README file in the SWIG root directory. Note that the Cygwin environment will also allow one to regenerate the autotool generated files which are supplied with the release distribution. These files are generated using the autogen.sh script and will only need regenerating in circumstances such as changing the build system.
If you don't want to install Cygwin or MinGW, use a different compiler to build SWIG. For example, all the source code files can be added to a Visual C++ project file in order to build swig.exe from the Visual C++ IDE.
The examples and test-suite work as successfully on Cygwin as on any other Unix operating system. The modules which are known to work are Python, Tcl, Perl, Ruby, Java and C#. Follow the Unix instructions in the README file in the SWIG root directory to build the examples.
A common problem when using SWIG on Windows is the set of Microsoft function calling conventions, which are not in the C++ standard. SWIG parses ISO C/C++, so it cannot deal with proprietary conventions such as __declspec(dllimport), __stdcall etc. There is, however, a Windows interface file, windows.i, to deal with these calling conventions. The file also contains typemaps for handling commonly used Windows specific types such as __int64, BOOL, DWORD etc. Include it like you would any other interface file, for example:
%include <windows.i>
__declspec(dllexport) ULONG __stdcall foo(DWORD, __int32);
This chapter provides a brief overview of scripting language extension programming and the mechanisms by which scripting language interpreters access C and C++ code.
When a scripting language is used to control a C program, the resulting system tends to look as follows:

In this programming model, the scripting language interpreter is used for high level control whereas the underlying functionality of the C/C++ program is accessed through special scripting language "commands." If you have ever tried to write your own simple command interpreter, you might view the scripting language approach to be a highly advanced implementation of that. Likewise, if you have ever used a package such as MATLAB or IDL, it is a very similar model--the interpreter executes user commands and scripts. However, most of the underlying functionality is written in a low-level language like C or Fortran.
The two-language model of computing is extremely powerful because it exploits the strengths of each language. C/C++ can be used for maximal performance and complicated systems programming tasks. Scripting languages can be used for rapid prototyping, interactive debugging, scripting, and access to high-level data structures such as associative arrays.
Scripting languages are built around a parser that knows how to execute commands and scripts. Within this parser, there is a mechanism for executing commands and accessing variables. Normally, this is used to implement the builtin features of the language. However, by extending the interpreter, it is usually possible to add new commands and variables. To do this, most languages define a special API for adding new commands. Furthermore, a special foreign function interface defines how these new commands are supposed to hook into the interpreter.
Typically, when you add a new command to a scripting interpreter you need to do two things. First, you need to write a special "wrapper" function that serves as the glue between the interpreter and the underlying C function. Then you need to give the interpreter information about the wrapper by providing details about the name of the function, arguments, and so forth. The next few sections illustrate the process.
Suppose you have an ordinary C function like this :
int fact(int n) {
    if (n <= 1) return 1;
    else return n*fact(n-1);
}
In order to access this function from a scripting language, it is necessary to write a special "wrapper" function that serves as the glue between the scripting language and the underlying C function. A wrapper function must do three things :
As an example, the Tcl wrapper function for the fact() function above might look like the following :
int wrap_fact(ClientData clientData, Tcl_Interp *interp,
              int argc, char *argv[]) {
    int result;
    int arg0;
    if (argc != 2) {
        interp->result = "wrong # args";
        return TCL_ERROR;
    }
    arg0 = atoi(argv[1]);
    result = fact(arg0);
    sprintf(interp->result,"%d", result);
    return TCL_OK;
}
Once you have created a wrapper function, the final step is to tell the scripting language about the new function. This is usually done in an initialization function called by the language when the module is loaded. For example, adding the above function to the Tcl interpreter requires code like the following :
int Wrap_Init(Tcl_Interp *interp) {
    Tcl_CreateCommand(interp, "fact", wrap_fact, (ClientData) NULL,
                      (Tcl_CmdDeleteProc *) NULL);
    return TCL_OK;
}
When executed, Tcl will now have a new command called "fact" that you can use like any other Tcl command.
Although the process of adding a new function to Tcl has been illustrated, the procedure is almost identical for Perl and Python. Both require special wrappers to be written and both need additional initialization code. Only the specific details are different.
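The two steps described in this section, writing a wrapper and registering it, do not depend on any particular interpreter. The following sketch shows both against a toy command table instead of a real scripting language. Everything prefixed with toy_ or TOY_ is a hypothetical name invented for this illustration; none of it is Tcl, Perl, or Python API.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* The ordinary C function being wrapped. */
int fact(int n) {
    if (n <= 1) return 1;
    else return n * fact(n - 1);
}

#define TOY_OK 0
#define TOY_ERROR 1
#define TOY_MAX_COMMANDS 16

/* Hypothetical toy interpreter: every command receives string arguments
   (argv[0] is the command name) and writes a string result. */
typedef int (*toy_cmd_proc)(int argc, char *argv[], char *result, size_t len);

struct toy_command {
    const char *name;
    toy_cmd_proc proc;
};

static struct toy_command toy_table[TOY_MAX_COMMANDS];
static int toy_count = 0;

/* Registration step: tell the interpreter about a new command. */
int toy_create_command(const char *name, toy_cmd_proc proc) {
    assert(name != NULL && proc != NULL);
    if (toy_count >= TOY_MAX_COMMANDS) return TOY_ERROR;
    toy_table[toy_count].name = name;
    toy_table[toy_count].proc = proc;
    toy_count++;
    return TOY_OK;
}

/* Wrapper: check the arguments, convert them, call fact(), and convert
   the return value back into the interpreter's string form. */
int toy_wrap_fact(int argc, char *argv[], char *result, size_t len) {
    int value;
    if (argc != 2) {
        snprintf(result, len, "wrong # args");
        return TOY_ERROR;
    }
    value = fact(atoi(argv[1]));
    snprintf(result, len, "%d", value);
    return TOY_OK;
}

/* Dispatch: look the command up by name and run its wrapper. */
int toy_eval(int argc, char *argv[], char *result, size_t len) {
    int i;
    if (argc < 1) {
        snprintf(result, len, "empty command");
        return TOY_ERROR;
    }
    for (i = 0; i < toy_count; i++) {
        if (strcmp(toy_table[i].name, argv[0]) == 0)
            return toy_table[i].proc(argc, argv, result, len);
    }
    snprintf(result, len, "unknown command");
    return TOY_ERROR;
}
```

After toy_create_command("fact", toy_wrap_fact) has been called, evaluating the words "fact" and "4" through toy_eval() leaves the string "24" in the result buffer. A real interpreter differs in its data types and API, but the division of labour is the same.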
Variable linking refers to the problem of mapping a C/C++ global variable to a variable in the scripting language interpreter. For example, suppose you had the following variable:
double Foo = 3.5;
It might be nice to access it from a script as follows (shown for Perl):
$a = $Foo * 2.3;      # Evaluation
$Foo = $a + 2.0;      # Assignment
To provide such access, variables are commonly manipulated using a pair of get/set functions. For example, whenever the value of a variable is read, a "get" function is invoked. Similarly, whenever the value of a variable is changed, a "set" function is called.
In many languages, calls to the get/set functions can be attached to evaluation and assignment operators. Therefore, evaluating a variable such as $Foo might implicitly call the get function. Similarly, typing $Foo = 4 would call the underlying set function to change the value.
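As a rough sketch of the get/set approach, the C code below links the global Foo to a script-visible name through a pair of accessor functions. The linked_var structure and the script_read() and script_write() functions are hypothetical stand-ins for whatever hooks a real interpreter attaches to evaluation and assignment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

double Foo = 3.5;   /* the C global variable from the text */

/* The get/set pair through which the interpreter touches Foo. */
static double Foo_get(void) { return Foo; }
static void Foo_set(double value) { Foo = value; }

/* Hypothetical link record: a script-visible name plus its accessors. */
struct linked_var {
    const char *name;
    double (*get)(void);
    void (*set)(double);
};

static struct linked_var linked_vars[] = {
    { "Foo", Foo_get, Foo_set }
};

static struct linked_var *find_var(const char *name) {
    size_t i;
    assert(name != NULL);
    for (i = 0; i < sizeof linked_vars / sizeof linked_vars[0]; i++) {
        if (strcmp(linked_vars[i].name, name) == 0)
            return &linked_vars[i];
    }
    return NULL;
}

/* What an interpreter might do when a script evaluates a variable.
   Returns 0 on success and -1 if the name is not linked. */
int script_read(const char *name, double *out) {
    struct linked_var *var = find_var(name);
    if (var == NULL) return -1;
    *out = var->get();
    return 0;
}

/* What an interpreter might do when a script assigns to a variable. */
int script_write(const char *name, double value) {
    struct linked_var *var = find_var(name);
    if (var == NULL) return -1;
    var->set(value);
    return 0;
}
```

Because the interpreter only ever calls the accessors, a set function could also validate or convert the value before storing it.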
In many cases, a C program or library may define a large collection of constants. For example:
#define RED 0xff0000
#define BLUE 0x0000ff
#define GREEN 0x00ff00
To make constants available, their values can be stored in scripting language variables such as $RED, $BLUE, and $GREEN. Virtually all scripting languages provide C functions for creating variables so installing constants is usually a trivial exercise.
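One simple way to picture this is a table that pairs each constant's name with its value, which initialization code could walk when creating the script variables. The sketch below is only an illustration: the script_constant structure and lookup_constant() are hypothetical names, and a real module would instead call the variable-creation functions of its target language.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RED   0xff0000
#define BLUE  0x0000ff
#define GREEN 0x00ff00

/* Hypothetical record pairing a script-visible name with a C value. */
struct script_constant {
    const char *name;
    long value;
};

/* The preprocessor has already replaced RED, BLUE and GREEN with their
   numeric values, so the names must be repeated as strings. */
static const struct script_constant constants[] = {
    { "RED",   RED   },
    { "BLUE",  BLUE  },
    { "GREEN", GREEN }
};

/* Look a constant up by name. Returns 0 on success, -1 if unknown. */
int lookup_constant(const char *name, long *out) {
    size_t i;
    assert(name != NULL && out != NULL);
    for (i = 0; i < sizeof constants / sizeof constants[0]; i++) {
        if (strcmp(constants[i].name, name) == 0) {
            *out = constants[i].value;
            return 0;
        }
    }
    return -1;
}
```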
Although scripting languages have no trouble accessing simple functions and variables, accessing C/C++ structures and classes presents a different problem. This is because the implementation of structures is largely related to the problem of data representation and layout. Furthermore, certain language features are difficult to map to an interpreter. For instance, what does C++ inheritance mean in a Perl interface?
The most straightforward technique for handling structures is to implement a collection of accessor functions that hide the underlying representation of a structure. For example,
struct Vector {
    Vector();
    ~Vector();
    double x,y,z;
};
can be transformed into the following set of functions:
Vector *new_Vector();
void delete_Vector(Vector *v);
double Vector_x_get(Vector *v);
double Vector_y_get(Vector *v);
double Vector_z_get(Vector *v);
void Vector_x_set(Vector *v, double x);
void Vector_y_set(Vector *v, double y);
void Vector_z_set(Vector *v, double z);
Now, from an interpreter these functions might be used as follows:
% set v [new_Vector]
% Vector_x_set $v 3.5
% Vector_y_get $v
% delete_Vector $v
% ...
Since accessor functions provide a mechanism for accessing the internals of an object, the interpreter does not need to know anything about the actual representation of a Vector.
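For illustration, the accessor functions listed above could be implemented along the following lines. This is a sketch in plain C, so the constructor and destructor of the C++ struct are replaced by malloc() and free(); the bodies are an assumption and not the code SWIG generates.

```c
#include <stdlib.h>

/* Plain C version of the Vector structure (no constructor or destructor). */
typedef struct Vector {
    double x, y, z;
} Vector;

/* Allocate a Vector with all components set to zero; NULL on failure. */
Vector *new_Vector(void) {
    Vector *v = (Vector *) malloc(sizeof(Vector));
    if (v != NULL) {
        v->x = 0.0;
        v->y = 0.0;
        v->z = 0.0;
    }
    return v;
}

void delete_Vector(Vector *v) {
    free(v);
}

/* Readers: the caller never sees the structure layout. */
double Vector_x_get(Vector *v) { return v->x; }
double Vector_y_get(Vector *v) { return v->y; }
double Vector_z_get(Vector *v) { return v->z; }

/* Writers. */
void Vector_x_set(Vector *v, double x) { v->x = x; }
void Vector_y_set(Vector *v, double y) { v->y = y; }
void Vector_z_set(Vector *v, double z) { v->z = z; }
```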
In certain cases, it is possible to use the low-level accessor functions to create a proxy class, also known as a shadow class. A proxy class is a special kind of object that gets created in a scripting language to access a C/C++ class (or struct) in a way that looks like the original structure (that is, it proxies the real C++ class). For example, if you have the following C++ definition:
class Vector {
public:
    Vector();
    ~Vector();
    double x,y,z;
};
A proxy class mechanism would allow you to access the structure in a more natural manner from the interpreter. For example, in Python, you might want to do this:
>>> v = Vector()
>>> v.x = 3
>>> v.y = 4
>>> v.z = -13
>>> ...
>>> del v
Similarly, in Perl5 you may want the interface to work like this:
$v = new Vector;
$v->{x} = 3;
$v->{y} = 4;
$v->{z} = -13;
Finally, in Tcl:
Vector v
v configure -x 3 -y 4 -z 13
When proxy classes are used, two objects are really at work--one in the scripting language, and an underlying C/C++ object. Operations affect both objects equally and for all practical purposes, it appears as if you are simply manipulating a C/C++ object.
The final step in using a scripting language with your C/C++ application is adding your extensions to the scripting language itself. There are two primary approaches for doing this. The preferred technique is to build a dynamically loadable extension in the form of a shared library. Alternatively, you can recompile the scripting language interpreter with your extensions added to it.
To create a shared library or DLL, you often need to look at the manual pages for your compiler and linker. However, the procedure for a few common machines is shown below:
# Build a shared library for Solaris
gcc -c example.c example_wrap.c -I/usr/local/include
ld -G example.o example_wrap.o -o example.so

# Build a shared library for Linux
gcc -fpic -c example.c example_wrap.c -I/usr/local/include
gcc -shared example.o example_wrap.o -o example.so

# Build a shared library for Irix
gcc -c example.c example_wrap.c -I/usr/local/include
ld -shared example.o example_wrap.o -o example.so
To use your shared library, you simply use the corresponding command in the scripting language (load, import, use, etc...). This will import your module and allow you to start using it. For example:
% load ./example.so
% fact 4
24
%
When working with C++ code, the process of building shared libraries may be more complicated--primarily due to the fact that C++ modules may need additional code in order to operate correctly. On many machines, you can build a shared C++ module by following the above procedures, but changing the link line to the following:
c++ -shared example.o example_wrap.o -o example.so
When building extensions as shared libraries, it is not uncommon for your extension to rely upon other shared libraries on your machine. In order for the extension to work, it needs to be able to find all of these libraries at run-time. Otherwise, you may get an error such as the following:
>>> import graph
Traceback (innermost last):
File "<stdin>", line 1, in ?
File "/home/sci/data1/beazley/graph/graph.py", line 2, in ?
import graphc
ImportError: 1101:/home/sci/data1/beazley/bin/python: rld: Fatal Error: cannot
successfully map soname 'libgraph.so' under any of the filenames /usr/lib/libgraph.so:/
lib/libgraph.so:/lib/cmplrs/cc/libgraph.so:/usr/lib/cmplrs/cc/libgraph.so:
>>>
What this error means is that the extension module created by SWIG depends upon a shared library called "libgraph.so" that the system was unable to locate. To fix this problem, there are a few approaches you can take.
With static linking, you rebuild the scripting language interpreter with extensions. The process usually involves compiling a short main program that adds your customized commands to the language and starts the interpreter. You then link your program with a library to produce a new scripting language executable.
Although static linking is supported on all platforms, this is not the preferred technique for building scripting language extensions. In fact, there are very few practical reasons for doing this--consider using shared libraries instead.
This chapter describes the basic operation of SWIG, the structure of its input files, and how it handles standard ANSI C declarations. C++ support is described in the next chapter. However, C++ programmers should still read this chapter to understand the basics. Specific details about each target language are described in later chapters.
To run SWIG, use the swig command with options and a filename like this:
swig [ options ] filename
where filename is a SWIG interface file or a C/C++ header file. Below is a subset of options that can be used. Additional options are also defined for each target language. A full list can be obtained by typing swig -help or swig -lang -help, where lang is the target language.
-allegrocl            Generate ALLEGROCL wrappers
-chicken              Generate CHICKEN wrappers
-clisp                Generate CLISP wrappers
-cffi                 Generate CFFI wrappers
-csharp               Generate C# wrappers
-guile                Generate Guile wrappers
-java                 Generate Java wrappers
-lua                  Generate Lua wrappers
-modula3              Generate Modula 3 wrappers
-mzscheme             Generate Mzscheme wrappers
-ocaml                Generate Ocaml wrappers
-perl                 Generate Perl wrappers
-php                  Generate PHP wrappers
-pike                 Generate Pike wrappers
-python               Generate Python wrappers
-r                    Generate R (aka GNU S) wrappers
-ruby                 Generate Ruby wrappers
-sexp                 Generate Lisp S-Expressions wrappers
-tcl                  Generate Tcl wrappers
-uffi                 Generate Common Lisp / UFFI wrappers
-xml                  Generate XML wrappers

-c++                  Enable C++ parsing
-Dsymbol              Define a preprocessor symbol
-Fstandard            Display error/warning messages in commonly used format
-Fmicrosoft           Display error/warning messages in Microsoft format
-help                 Display all options
-Idir                 Add a directory to the file include path
-lfile                Include a SWIG library file.
-module name          Set the name of the SWIG module
-o outfile            Name of output file
-outcurrentdir        Set default output dir to current dir instead of input file's path
-outdir dir           Set language specific files output directory
-swiglib              Show location of SWIG library
-version              Show SWIG version number
As input, SWIG expects a file containing ANSI C/C++ declarations and special SWIG directives. More often than not, this is a special SWIG interface file which is usually denoted with a special .i or .swg suffix. In certain cases, SWIG can be used directly on raw header files or source files. However, this is not the most typical case and there are several reasons why you might not want to do this (described later).
The most common format of a SWIG interface is as follows:
%module mymodule
%{
#include "myheader.h"
%}
// Now list ANSI C/C++ declarations
int foo;
int bar(int x);
...
The name of the module is supplied using the special %module directive (or the -module command line option). This directive must appear at the beginning of the file and is used to name the resulting extension module (in addition, this name often defines a namespace in the target language). If the module name is supplied on the command line, it overrides the name specified with the %module directive.
Everything in the %{ ... %} block is simply copied verbatim to the resulting wrapper file created by SWIG. This section is almost always used to include header files and other declarations that are required to make the generated wrapper code compile. It is important to emphasize that just because you include a declaration in a SWIG input file, that declaration does not automatically appear in the generated wrapper code---therefore you need to make sure you include the proper header files in the %{ ... %} section. It should be noted that the text enclosed in %{ ... %} is not parsed or interpreted by SWIG. The %{...%} syntax and semantics in SWIG is analogous to that of the declarations section used in input files to parser generation tools such as yacc or bison.
The output of SWIG is a C/C++ file that contains all of the wrapper code needed to build an extension module. SWIG may generate some additional files depending on the target language. By default, an input file with the name file.i is transformed into a file file_wrap.c or file_wrap.cxx (depending on whether or not the -c++ option has been used). The name of the output file can be changed using the -o option. In certain cases, file suffixes are used by the compiler to determine the source language (C, C++, etc.). Therefore, you have to use the -o option to change the suffix of the SWIG-generated wrapper file if you want something different than the default. For example:
$ swig -c++ -python -o example_wrap.cpp example.i
The C/C++ output file created by SWIG often contains everything that is needed to construct an extension module for the target scripting language. SWIG is not a stub compiler nor is it usually necessary to edit the output file (and if you look at the output, you probably won't want to). To build the final extension module, the SWIG output file is compiled and linked with the rest of your C/C++ program to create a shared library.
Many target languages will also generate proxy class files in the target language. The default output directory for these language specific files is the same directory as the generated C/C++ file. This can be modified using the -outdir option. For example:
$ swig -c++ -python -outdir pyfiles -o cppfiles/example_wrap.cpp example.i
If the directories cppfiles and pyfiles exist, the following will be generated:
cppfiles/example_wrap.cpp
pyfiles/example.py
If the -outcurrentdir option is used (without -o) then SWIG behaves like a typical C/C++ compiler and the default output directory is then the current directory. Without this option the default output directory is the path to the input file. If -o and -outcurrentdir are used together, -outcurrentdir is effectively ignored as the output directory for the language files is the same directory as the generated C/C++ file if not overridden with -outdir.
C and C++ style comments may appear anywhere in interface files. In previous versions of SWIG, comments were used to generate documentation files. However, this feature is currently under repair and will reappear in a later SWIG release.
Like C, SWIG preprocesses all input files through an enhanced version of the C preprocessor. All standard preprocessor features are supported including file inclusion, conditional compilation and macros. However, #include statements are ignored unless the -includeall command line option has been supplied. The reason for disabling includes is that SWIG is sometimes used to process raw C header files. In this case, you usually only want the extension module to include functions in the supplied header file rather than everything that might be included by that header file (i.e., system headers, C library functions, etc.).
It should also be noted that the SWIG preprocessor skips all text enclosed inside a %{...%} block. In addition, the preprocessor includes a number of macro handling enhancements that make it more powerful than the normal C preprocessor. These extensions are described in the "Preprocessor" chapter.
Most of SWIG's operation is controlled by special directives that are always preceded by a "%" to distinguish them from normal C declarations. These directives are used to give SWIG hints or to alter SWIG's parsing behavior in some manner.
Since SWIG directives are not legal C syntax, it is generally not possible to include them in header files. However, SWIG directives can be included in C header files using conditional compilation like this:
/* header.h --- Some header file */

/* SWIG directives -- only seen if SWIG is running */
#ifdef SWIG
%module foo
#endif
SWIG is a special preprocessing symbol defined by SWIG when it is parsing an input file.
Although SWIG can parse most C/C++ declarations, it does not provide a complete C/C++ parser implementation. Most of these limitations pertain to very complicated type declarations and certain advanced C++ features. Specifically, the following features are not currently supported:
/* Non-conventional placement of storage specifier (extern) */
const int extern Number;

/* Extra declarator grouping */
Matrix (foo);    // A global variable

/* Extra declarator grouping in parameters */
void bar(Spam (Grok)(Doh));
In practice, few (if any) C programmers actually write code like this since this style is never featured in programming books. However, if you're feeling particularly obfuscated, you can certainly break SWIG (although why would you want to?).
/* Not supported by SWIG */
int foo::bar(int) {
    ... whatever ...
}
In the event of a parsing error, conditional compilation can be used to skip offending code. For example:
#ifndef SWIG
... some bad declarations ...
#endif
Alternatively, you can just delete the offending code from the interface file.
One of the reasons why SWIG does not provide a full C++ parser implementation is that it has been designed to work with incomplete specifications and to be very permissive in its handling of C/C++ datatypes (e.g., SWIG can generate interfaces even when there are missing class declarations or opaque datatypes). Unfortunately, this approach makes it extremely difficult to implement certain parts of a C/C++ parser as most compilers use type information to assist in the parsing of more complex declarations (for the truly curious, the primary complication in the implementation is that the SWIG parser does not utilize a separate typedef-name terminal symbol as described on p. 234 of K&R).
SWIG wraps simple C declarations by creating an interface that closely matches the way in which the declarations would be used in a C program. For example, consider the following interface file:
%module example
%inline %{
extern double sin(double x);
extern int strcmp(const char *, const char *);
extern int Foo;
%}
#define STATUS 50
#define VERSION "1.1"
In this file, there are two functions sin() and strcmp(), a global variable Foo, and two constants STATUS and VERSION. When SWIG creates an extension module, these declarations are accessible as scripting language functions, variables, and constants respectively. For example, in Tcl:
% sin 3
0.1411200080598672
% strcmp Dave Mike
-1
% puts $Foo
42
% puts $STATUS
50
% puts $VERSION
1.1
Or in Python:
>>> example.sin(3)
0.1411200080598672
>>> example.strcmp('Dave','Mike')
-1
>>> print example.cvar.Foo
42
>>> print example.STATUS
50
>>> print example.VERSION
1.1
Whenever possible, SWIG creates an interface that closely matches the underlying C/C++ code. However, due to subtle differences between languages, run-time environments, and semantics, it is not always possible to do so. The next few sections describe various aspects of this mapping.
In order to build an interface, SWIG has to convert C/C++ datatypes to equivalent types in the target language. Generally, scripting languages provide a more limited set of primitive types than C. Therefore, this conversion process involves a certain amount of type coercion.
Most scripting languages provide a single integer type that is implemented using the int or long datatype in C. The following list shows all of the C datatypes that SWIG will convert to and from integers in the target language:
int
short
long
unsigned
signed
unsigned short
unsigned long
unsigned char
signed char
bool
When an integral value is converted from C, a cast is used to convert it to the representation in the target language. Thus, a 16 bit short in C may be promoted to a 32 bit integer. When integers are converted in the other direction, the value is cast back into the original C type. If the value is too large to fit, it is silently truncated.
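The effect of the casts can be seen in a few lines of C. This sketch uses a 16-bit unsigned type because the C standard defines conversion to an unsigned type as wrapping modulo 2^16; for signed types the result of an out-of-range conversion is implementation-defined. The function names are hypothetical.

```c
#include <stdint.h>

/* C short -> wider integer used by the interpreter: always value-preserving. */
long short_to_script(short value) {
    return (long) value;
}

/* Interpreter integer -> 16-bit unsigned C type.
   Values that do not fit are silently reduced modulo 65536. */
uint16_t script_to_u16(long value) {
    return (uint16_t) value;
}
```

For example, passing 70000 yields 4464 (70000 - 65536), with no error reported to the script.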
unsigned char and signed char are special cases that are handled as small 8-bit integers. Normally, the char datatype is mapped as a one-character ASCII string.
The bool datatype is cast to and from an integer value of 0 and 1 unless the target language provides a special boolean type.
Some care is required when working with large integer values. Most scripting languages use 32-bit integers so mapping a 64-bit long integer may lead to truncation errors. Similar problems may arise with 32 bit unsigned integers (which may appear as large negative numbers). As a rule of thumb, the int datatype and all variations of char and short datatypes are safe to use. For unsigned int and long datatypes, you will need to carefully check the correct operation of your program after it has been wrapped with SWIG.
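The "large negative number" effect comes from viewing the same 32 bits as a signed quantity. A minimal sketch, assuming a hypothetical helper named as_signed32:

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the bit pattern of a 32-bit unsigned value as a 32-bit
   signed value, which is what a signed-only interpreter integer would show. */
int32_t as_signed32(uint32_t value) {
    int32_t result;
    memcpy(&result, &value, sizeof result);   /* copies the bits unchanged */
    return result;
}
```

With this helper, the unsigned value 4000000000 appears as -294967296, while small values such as 42 are unaffected.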
Although the SWIG parser supports the long long datatype, not all language modules support it. This is because long long usually exceeds the integer precision available in the target language. In certain modules such as Tcl and Perl5, long long integers are encoded as strings. This allows the full range of these numbers to be represented. However, it does not allow long long values to be used in arithmetic expressions. It should also be noted that although long long is part of the ISO C99 standard, it is not universally supported by all C compilers. Make sure you are using a compiler that supports long long before trying to use this type with SWIG.
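The string encoding can be sketched with the standard C library. The function names below are hypothetical and the code is not taken from any SWIG language module; it only shows why the full range survives the round trip.

```c
#include <stdio.h>
#include <stdlib.h>

/* Encode a long long as a decimal string.
   Returns 0 on success, -1 if the buffer is too small. */
int longlong_to_string(long long value, char *buf, size_t size) {
    int n = snprintf(buf, size, "%lld", value);
    return (n >= 0 && (size_t) n < size) ? 0 : -1;
}

/* Decode a decimal string back into a long long. */
long long string_to_longlong(const char *s) {
    return strtoll(s, NULL, 10);
}
```

A value such as 9007199254740993 (2^53 + 1) cannot be stored exactly in a double, but it passes through the string form unchanged. The cost is that the script sees text rather than a number.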
SWIG recognizes the following floating point types:
float
double
Floating point numbers are mapped to and from the natural representation of floats in the target language. This is almost always a C double. The rarely used datatype of long double is not supported by SWIG.
The char datatype is mapped into a NULL terminated ASCII string with a single character. When used in a scripting language it shows up as a tiny string containing the character value. When converting the value back into C, SWIG takes a character string from the scripting language and strips off the first character as the char value. Thus if the value "foo" is assigned to a char datatype, it gets the value `f'.
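In C terms, the two directions of this conversion amount to the following sketch (the function names are hypothetical):

```c
/* C char -> one-character, NULL-terminated string for the script. */
void char_to_string(char c, char out[2]) {
    out[0] = c;
    out[1] = '\0';
}

/* Script string -> C char: only the first character is used. */
char string_to_char(const char *s) {
    return s[0];
}
```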
The char * datatype is handled as a NULL-terminated ASCII string. SWIG maps this into a 8-bit character string in the target scripting language. SWIG converts character strings in the target language to NULL terminated strings before passing them into C/C++. The default handling of these strings does not allow them to have embedded NULL bytes. Therefore, the char * datatype is not generally suitable for passing binary data. However, it is possible to change this behavior by defining a SWIG typemap. See the chapter on Typemaps for details about this.
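The limitation with embedded NULL bytes follows directly from how C measures string length. A small sketch, with hypothetical names:

```c
#include <stddef.h>
#include <string.h>

/* Five data bytes with a zero byte in the middle (plus the final terminator). */
const char binary_data[] = "ab\0cd";

/* Length as seen by any code that treats the data as a C string. */
size_t visible_length(const char *s) {
    return strlen(s);
}

/* Actual number of data bytes, excluding the final terminator. */
size_t actual_length(void) {
    return sizeof binary_data - 1;
}
```

Here visible_length(binary_data) is 2 while actual_length() is 5, so everything after the first zero byte would be lost in a default char * conversion.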
At this time, SWIG provides limited support for Unicode and wide-character strings (the C wchar_t type). Some languages provide typemaps for wchar_t, but bear in mind these might not be portable across different operating systems. This is a delicate topic that is poorly understood by many programmers and not implemented in a consistent manner across languages. For those scripting languages that provide Unicode support, Unicode strings are often available in an 8-bit representation such as UTF-8 that can be mapped to the char * type (in which case the SWIG interface will probably work). If the program you are wrapping uses Unicode, there is no guarantee that Unicode characters in the target language will use the same internal representation (e.g., UCS-2 vs. UCS-4). You may need to write some special conversion functions.
Whenever possible, SWIG maps C/C++ global variables into scripting language variables. For example,
%module example
double foo;
results in a scripting language variable like this:
# Tcl
set foo 3.5                   ;# Set foo to 3.5
puts $foo                     ;# Print the value of foo

# Python
cvar.foo = 3.5                # Set foo to 3.5
print cvar.foo                # Print value of foo

# Perl
$foo = 3.5;                   # Set foo to 3.5
print $foo,"\n";              # Print value of foo

# Ruby
Module.foo = 3.5              # Set foo to 3.5
print Module.foo, "\n"        # Print value of foo
Whenever the scripting language variable is used, the underlying C global variable is accessed. Although SWIG makes every attempt to make global variables work like scripting language variables, it is not always possible to do so. For instance, in Python, all global variables must be accessed through a special variable object known as cvar (shown above). In Ruby, variables are accessed as attributes of the module. Other languages may convert variables to a pair of accessor functions. For example, the Java module generates a pair of functions double get_foo() and set_foo(double val) that are used to manipulate the value.
Finally, if a global variable has been declared as const, it only supports read-only access. Note: this behavior is new to SWIG-1.3. Earlier versions of SWIG incorrectly handled const and created constants instead.
Constants can be created using #define, enumerations, or a special %constant directive. The following interface file shows a few valid constant declarations:
#define I_CONST 5 // An integer constant
#define PI 3.14159 // A Floating point constant
#define S_CONST "hello world" // A string constant
#define NEWLINE '\n' // Character constant
enum boolean {NO=0, YES=1};
enum months {JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG,
             SEP, OCT, NOV, DEC};
%constant double BLAH = 42.37;
#define F_CONST (double) 5 // A floating point constant with cast
#define PI_4 PI/4
#define FLAGS 0x04 | 0x08 | 0x40
In #define declarations, the type of a constant is inferred by syntax. For example, a number with a decimal point is assumed to be floating point. In addition, SWIG must be able to fully resolve all of the symbols used in a #define in order for a constant to actually be created. This restriction is necessary because #define is also used to define preprocessor macros that are definitely not meant to be part of the scripting language interface. For example:
#define EXTERN extern

EXTERN void foo();
In this case, you probably don't want to create a constant called EXTERN (what would the value be?). In general, SWIG will not create constants for macros unless the value can be completely determined by the preprocessor. For instance, in the above example, the declaration
#define PI_4 PI/4
defines a constant because PI was already defined as a constant and the value is known.
The use of constant expressions is allowed, but SWIG does not evaluate them. Rather, it passes them through to the output file and lets the C compiler perform the final evaluation (SWIG does perform a limited form of type-checking however).
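Since the expressions reach the C compiler as written, their values are whatever the compiler computes. The sketch below wraps two of the macros from the earlier example in hypothetical functions to show this.

```c
#define PI     3.14159
#define PI_4   PI/4
#define FLAGS  0x04 | 0x08 | 0x40

/* The C compiler, not SWIG, evaluates these expressions. */
double pi_4_value(void) {
    return PI_4;     /* expands to 3.14159/4 */
}

int flags_value(void) {
    return FLAGS;    /* expands to 0x04 | 0x08 | 0x40, which is 0x4c */
}
```

Because macros are expanded textually, an unparenthesized definition such as FLAGS can interact with surrounding operators: FLAGS == 0x4c parses as 0x04 | 0x08 | (0x40 == 0x4c). Wrapping the expression in a function, or adding parentheses to the macro, avoids that surprise.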
For enumerations, it is critical that the original enum definition be included somewhere in the interface file (either in a header file or in the %{,%} block). SWIG only translates the enumeration into code needed to add the constants to a scripting language. It needs the original enumeration declaration in order to get the correct enum values as assigned by the C compiler.
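The reason the original definition is needed can be seen in a sketch of the kind of table a wrapper might contain. The enum_constant structure and table name are hypothetical; the point is that the initializers refer to the enumerators by name, so the C compiler must have seen the enum in order to supply the values.

```c
enum months {JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG,
             SEP, OCT, NOV, DEC};

/* Hypothetical table entry pairing a script-visible name with its value. */
typedef struct {
    const char *name;
    int value;
} enum_constant;

/* The values are filled in by the C compiler from the enum definition. */
const enum_constant month_constants[] = {
    { "JAN", JAN }, { "FEB", FEB }, { "MAR", MAR }, { "APR", APR },
    { "MAY", MAY }, { "JUN", JUN }, { "JUL", JUL }, { "AUG", AUG },
    { "SEP", SEP }, { "OCT", OCT }, { "NOV", NOV }, { "DEC", DEC }
};

/* Number of entries in the table. */
int month_constant_count(void) {
    return (int) (sizeof month_constants / sizeof month_constants[0]);
}
```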
The %constant directive is used to more precisely create constants corresponding to different C datatypes. Although it is not usually needed for simple values, it is more useful when working with pointers and other more complex datatypes. Typically, %constant is only used when you want to add constants to the scripting language interface that are not defined in the original header file.
A common confusion with C programming is the semantic meaning of the const qualifier in declarations--especially when it is mixed with pointers and other type modifiers. In fact, previous versions of SWIG handled const incorrectly--a situation that SWIG-1.3.7 and newer releases have fixed.
Starting with SWIG-1.3, all variable declarations, regardless of any use of const, are wrapped as global variables. If a declaration happens to be declared as const, it is wrapped as a read-only variable. To tell if a variable is const or not, you need to look at the right-most occurrence of the const qualifier (that appears before the variable name). If the right-most const occurs after all other type modifiers (such as pointers), then the variable is const. Otherwise, it is not.
Here are some examples of const declarations.
const char a;           // A constant character
char const b;           // A constant character (the same)
char *const c;          // A constant pointer to a character
const char *const d;    // A constant pointer to a constant character
Here is an example of a declaration that is not const:
const char *e; // A pointer to a constant character. The pointer
// may be modified.
In this case, the pointer e can change---it's only the value being pointed to that is read-only.
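The difference between the two pointer forms can be checked with a short piece of C. The function name const_demo is hypothetical; the commented-out lines are the ones a C compiler rejects.

```c
#include <string.h>

/* Returns 1 if both declarations behave as described in the text. */
int const_demo(void) {
    char buffer1[] = "abc";
    char buffer2[] = "xyz";

    const char *e = buffer1;   /* pointer to a constant character */
    e = buffer2;               /* OK: the pointer itself may change */
    /* e[0] = 'X';                error: data is read-only through e */

    char *const c = buffer1;   /* constant pointer to a character */
    c[0] = 'A';                /* OK: the data may change */
    /* c = buffer2;               error: the pointer itself is const */

    return e == buffer2 && strcmp(buffer1, "Abc") == 0;
}
```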
Compatibility Note: One reason for changing SWIG to handle const declarations as read-only variables is that there are many situations where the value of a const variable might change. For example, a library might export a symbol as const in its public API to discourage modification, but still allow the value to change through some other kind of internal mechanism. Furthermore, programmers often overlook the fact that with a constant declaration like char *const, the underlying data being pointed to can be modified--it's only the pointer itself that is constant. In an embedded system, a const declaration might refer to a read-only memory address such as the location of a memory-mapped I/O device port (where the value changes, but writing to the port is not supported by the hardware). Rather than trying to build a bunch of special cases into the const qualifier, the new interpretation of const as "read-only" is simple and exactly matches the actual semantics of const in C/C++. If you really want to create a constant as in older versions of SWIG, use the %constant directive instead. For example:
%constant double PI = 3.14159;
or
#ifdef SWIG
#define const %constant
#endif
const double foo = 3.4;
const double bar = 23.4;
const int spam = 42;
#ifdef SWIG
#undef const
#endif
...
Before going any further, there is one bit of caution involving char * that must now be mentioned. When strings are passed from a scripting language to a C char *, the pointer usually points to string data stored inside the interpreter. It is almost always a really bad idea to modify this data. Furthermore, some languages may explicitly disallow it. For instance, in Python, strings are supposed to be immutable. If you violate this, you will probably receive a vast amount of wrath when you unleash your module on the world.
The primary source of problems is functions that might modify string data in place. A classic example would be a function like this:
char *strcat(char *s, const char *t)
Although SWIG will certainly generate a wrapper for this, its behavior will be undefined. In fact, it will probably cause your application to crash with a segmentation fault or other memory related problem. This is because s refers to some internal data in the target language---data that you shouldn't be touching.
The bottom line: don't rely on char * for anything other than read-only input values. However, it must be noted that you could change the behavior of SWIG using typemaps.
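One way to stay within the read-only rule is to wrap a helper that leaves its inputs alone and returns a new buffer. The function concat_copy below is a hypothetical alternative to strcat(), shown only as a sketch; ownership of the returned memory still has to be handled by whoever calls it.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string holding s followed by t, or NULL if
   allocation fails. Neither input is modified; the caller frees the result. */
char *concat_copy(const char *s, const char *t) {
    size_t ls = strlen(s);
    size_t lt = strlen(t);
    char *result = (char *) malloc(ls + lt + 1);
    if (result == NULL) return NULL;
    memcpy(result, s, ls);
    memcpy(result + ls, t, lt + 1);   /* also copies the terminating zero */
    return result;
}
```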
Most C programs manipulate arrays, structures, and other types of objects. This section discusses the handling of these datatypes.
Pointers to primitive C datatypes such as
int *
double ***
char **
are fully supported by SWIG. Rather than trying to convert the data being pointed to into a scripting representation, SWIG simply encodes the pointer itself into a representation that contains the actual value of the pointer and a type-tag. Thus, the SWIG representation of the above pointers (in Tcl), might look like this:
_10081012_p_int
_1008e124_ppp_double
_f8ac_pp_char
A NULL pointer is represented by the string "NULL" or the value 0 encoded with type information.
All pointers are treated as opaque objects by SWIG. Thus, a pointer may be returned by a function and passed around to other C functions as needed. For all practical purposes, the scripting language interface works in exactly the same way as you would use the pointer in a C program. The only difference is that there is no mechanism for dereferencing the pointer since this would require the target language to understand the memory layout of the underlying object.
The scripting language representation of a pointer value should never be manipulated directly. Even though the values shown look like hexadecimal addresses, the numbers used may differ from the actual machine address (e.g., on little-endian machines, the digits may appear in reverse order). Furthermore, SWIG does not normally map pointers into high-level objects such as associative arrays or lists (for example, converting an int * into a list of integers). There are several reasons why SWIG does not do this:
By allowing pointers to be manipulated from a scripting language, extension modules effectively bypass compile-time type checking in the C/C++ compiler. To prevent errors, a type signature is encoded into all pointer values and is used to perform run-time type checking. This type-checking process is an integral part of SWIG and can not be disabled or modified without using typemaps (described in later chapters).
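The idea of a type-tagged pointer string can be sketched in a few lines of C. This is an illustration only and not SWIG's actual encoding: it mimics the "_address_type" shape of the examples above, and the names encode_pointer and decode_pointer are hypothetical.

```c
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Encode a pointer as "_<hex address>_<type tag>".
   Returns 0 on success, -1 if the buffer is too small. */
int encode_pointer(const void *ptr, const char *type_tag,
                   char *buf, size_t size) {
    int n = snprintf(buf, size, "_%" PRIxPTR "_%s",
                     (uintptr_t) ptr, type_tag);
    return (n >= 0 && (size_t) n < size) ? 0 : -1;
}

/* Decode a string produced by encode_pointer().
   Returns -1 if the text is malformed or the type tag does not match. */
int decode_pointer(const char *text, const char *expected_tag, void **out) {
    uintptr_t value;
    int consumed = 0;
    if (sscanf(text, "_%" SCNxPTR "_%n", &value, &consumed) != 1
        || consumed == 0)
        return -1;                     /* not in "_hex_tag" form */
    if (strcmp(text + consumed, expected_tag) != 0)
        return -1;                     /* run-time type mismatch */
    *out = (void *) value;
    return 0;
}
```

A wrapper that expects an int * would call decode_pointer() with the tag "p_int" and raise a type error in the scripting language if it returns -1, which is the run-time check that stands in for the compile-time check the C compiler can no longer perform.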
Like C, void * matches any kind of pointer. Furthermore, NULL pointers can be passed to any function that expects to receive a pointer. Although this has the potential to cause a crash, NULL pointers are also sometimes used as sentinel values or to denote a missing/empty value. Therefore, SWIG leaves NULL pointer checking up to the application.
For everything else (structs, classes, arrays, etc...) SWIG applies a very simple rule: everything else is a pointer.
In other words, SWIG manipulates everything else by reference. This model makes sense because most C/C++ programs make heavy use of pointers and SWIG can use the type-checked pointer mechanism already present for handling pointers to basic datatypes.
Although this probably sounds complicated, it's really quite simple. Suppose you have an interface file like this:
%module fileio
FILE *fopen(char *, char *);
int fclose(FILE *);
unsigned fread(void *ptr, unsigned size, unsigned nobj, FILE *);
unsigned fwrite(void *ptr, unsigned size, unsigned nobj, FILE *);
void *malloc(int nbytes);
void free(void *);
In this file, SWIG doesn't know what a FILE is, but since it's used as a pointer, it doesn't really matter what it is. If you wrapped this module into Python, you can use the functions just like you expect:
# Copy a file
def filecopy(source,target):
    f1 = fopen(source,"r")
    f2 = fopen(target,"w")
    buffer = malloc(8192)
    nbytes = fread(buffer,8192,1,f1)
    while (nbytes > 0):
        fwrite(buffer,8192,1,f2)
        nbytes = fread(buffer,8192,1,f1)
    free(buffer)
In this case f1, f2, and buffer are all opaque objects containing C pointers. It doesn't matter what value they contain--our program works just fine without this knowledge.
When SWIG encounters an undeclared datatype, it automatically assumes that it is a structure or class. For example, suppose the following function appeared in a SWIG input file:
void matrix_multiply(Matrix *a, Matrix *b, Matrix *c);
SWIG has no idea what a "Matrix" is. However, it is obviously a pointer to something so SWIG generates a wrapper using its generic pointer handling code.
Unlike C or C++, SWIG does not actually care whether Matrix has been previously defined in the interface file or not. This allows SWIG to generate interfaces from only partial or limited information. In some cases, you may not care what a Matrix really is as long as you can pass an opaque reference to one around in the scripting language interface.
An important detail to mention is that SWIG will gladly generate wrappers for an interface when there are unspecified type names. However, all unspecified types are internally handled as pointers to structures or classes! For example, consider the following declaration:
void foo(size_t num);
If size_t is undeclared, SWIG generates wrappers that expect to receive a type of size_t * (this mapping is described shortly). As a result, the scripting interface might behave strangely. For example:
foo(40);
TypeError: expected a _p_size_t.
The only way to fix this problem is to make sure you properly declare type names using typedef.
Like C, typedef can be used to define new type names in SWIG. For example:
typedef unsigned int size_t;
typedef definitions appearing in a SWIG interface are not propagated to the generated wrapper code. Therefore, they either need to be defined in an included header file or placed in the declarations section like this:
%{
/* Include in the generated wrapper file */
typedef unsigned int size_t;
%}
/* Tell SWIG about it */
typedef unsigned int size_t;
or
%inline %{
typedef unsigned int size_t;
%}
In certain cases, you might be able to include other header files to collect type information. For example:
%module example
%import "sys/types.h"
In this case, you might run SWIG as follows:
$ swig -I/usr/include -includeall example.i
It should be noted that your mileage will vary greatly here. System headers are notoriously complicated and may rely upon a variety of non-standard C coding extensions (e.g., special directives to GCC). Unless you specify exactly the right include directories and preprocessor symbols, this may not work correctly (you will have to experiment).
SWIG tracks typedef declarations and uses this information for run-time type checking. For instance, if you use the above typedef and have the following function declaration:
void foo(unsigned int *ptr);
The corresponding wrapper function will accept arguments of type unsigned int * or size_t *.
So far, this chapter has presented almost everything you need to know to use SWIG for simple interfaces. However, some C programs use idioms that are somewhat more difficult to map to a scripting language interface. This section describes some of these issues.
Sometimes a C function takes structure parameters that are passed by value. For example, consider the following function:
double dot_product(Vector a, Vector b);
To deal with this, SWIG transforms the function to use pointers by creating a wrapper equivalent to the following:
double wrap_dot_product(Vector *a, Vector *b) {
Vector x = *a;
Vector y = *b;
return dot_product(x,y);
}
In the target language, the dot_product() function now accepts pointers to Vectors instead of Vectors. For the most part, this transformation is transparent so you might not notice.
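The transformation above can be tried out in plain C. The following sketch is self-contained; the layout of Vector and the body of dot_product() are assumptions made for illustration, since the text only gives the prototype.

```c
/* Assumed layout; the text does not define Vector at this point. */
typedef struct { double x, y, z; } Vector;

/* Original library function: takes both structures by value. */
double dot_product(Vector a, Vector b) {
    return a.x*b.x + a.y*b.y + a.z*b.z;
}

/* Wrapper in the style shown above: receives pointers, dereferences
   them into local copies, and forwards the copies by value. */
double wrap_dot_product(Vector *a, Vector *b) {
    Vector x = *a;
    Vector y = *b;
    return dot_product(x, y);
}
```

Because the wrapper copies its arguments, the objects passed in from the target language are never modified by the call.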
C functions that return structure or class datatypes by value are more difficult to handle. Consider the following function:
Vector cross_product(Vector v1, Vector v2);
This function wants to return Vector, but SWIG only really supports pointers. As a result, SWIG creates a wrapper like this:
Vector *wrap_cross_product(Vector *v1, Vector *v2) {
Vector x = *v1;
Vector y = *v2;
Vector *result;
result = (Vector *) malloc(sizeof(Vector));
*(result) = cross_product(x,y);
return result;
}
or if SWIG was run with the -c++ option:
Vector *wrap_cross_product(Vector *v1, Vector *v2) {
Vector x = *v1;
Vector y = *v2;
Vector *result = new Vector(cross_product(x,y)); // Uses default copy constructor
return result;
}
In both cases, SWIG allocates a new object and returns a reference to it. It is up to the user to delete the returned object when it is no longer in use. Clearly, this will leak memory if you are unaware of the implicit memory allocation and don't take steps to free the result. That said, it should be noted that some language modules can now automatically track newly created objects and reclaim memory for you. Consult the documentation for each language module for more details.
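To make the ownership issue concrete, here is a small C sketch of the C-mode wrapper together with a caller that releases the result. The Vector layout, the body of cross_product(), and the cross_z() helper are assumptions introduced only for this example.

```c
#include <stdlib.h>

typedef struct { double x, y, z; } Vector;   /* assumed layout */

/* Stand-in for the library function that returns a structure by value. */
Vector cross_product(Vector a, Vector b) {
    Vector r;
    r.x = a.y*b.z - a.z*b.y;
    r.y = a.z*b.x - a.x*b.z;
    r.z = a.x*b.y - a.y*b.x;
    return r;
}

/* C-mode wrapper: the result is copied into heap storage that the
   caller now owns. */
Vector *wrap_cross_product(Vector *v1, Vector *v2) {
    Vector *result = (Vector *) malloc(sizeof(Vector));
    if (result) *result = cross_product(*v1, *v2);
    return result;
}

/* Hypothetical caller: reads one component, then frees the result. */
double cross_z(Vector *v1, Vector *v2) {
    Vector *r = wrap_cross_product(v1, v2);
    double z = r ? r->z : 0.0;
    free(r);                 /* omitting this leaks one Vector per call */
    return z;
}
```

In a scripting language the same responsibility falls on whoever receives the returned object, unless the language module tracks ownership for you.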
It should also be noted that the handling of pass/return by value in C++ has some special cases. For example, the above code fragments don't work correctly if Vector doesn't define a default constructor. The section on SWIG and C++ has more information about this case.
When global variables or class members involving structures are encountered, SWIG handles them as pointers. For example, a global variable like this
Vector unit_i;
gets mapped to an underlying pair of set/get functions like this :
Vector *unit_i_get() {
return &unit_i;
}
void unit_i_set(Vector *value) {
unit_i = *value;
}
Again some caution is in order. A global variable created in this manner will show up as a pointer in the target scripting language. It would be an extremely bad idea to free or destroy such a pointer. Also, C++ classes must supply a properly defined copy constructor in order for assignment to work correctly.
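The following C sketch shows why freeing such a pointer is dangerous: the get function returns the address of the global itself, not a heap copy, while the set function copies the value in. The initial value of unit_i is an assumption made for the example.

```c
typedef struct { double x, y, z; } Vector;   /* assumed layout */

Vector unit_i = { 1.0, 0.0, 0.0 };           /* assumed initial value */

/* Returns the address of the global; this storage was never
   allocated with malloc(), so it must not be passed to free(). */
Vector *unit_i_get(void) {
    return &unit_i;
}

/* Copies the pointed-to value into the global by structure assignment. */
void unit_i_set(Vector *value) {
    unit_i = *value;
}
```

After a call to unit_i_set(), later changes to the source object do not affect the global, because the assignment made a copy.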
When a global variable of type char * appears, SWIG uses malloc() or new to allocate memory for the new value. Specifically, if you have a variable like this
char *foo;
SWIG generates the following code:
/* C mode */
void foo_set(char *value) {
if (foo) free(foo);
foo = (char *) malloc(strlen(value)+1);
strcpy(foo,value);
}
/* C++ mode. When -c++ option is used */
void foo_set(char *value) {
if (foo) delete [] foo;
foo = new char[strlen(value)+1];
strcpy(foo,value);
}
If this is not the behavior that you want, consider making the variable read-only using the %immutable directive. Alternatively, you might write a short assist-function to set the value exactly like you want. For example:
%inline %{
void set_foo(char *value) {
strncpy(foo,value, 50);
}
%}
Note: If you write an assist function like this, you will have to call it as a function from the target scripting language (it does not work like a variable). For example, in Python you will have to write:
>>> set_foo("Hello World")
A common mistake with char * variables is to link to a variable declared like this:
char *VERSION = "1.0";
In this case, the variable will be readable, but any attempt to change the value results in a segmentation or general protection fault. This is due to the fact that SWIG is trying to release the old value using free or delete when the string literal value currently assigned to the variable wasn't allocated using malloc() or new. To fix this behavior, you can either mark the variable as read-only, write a typemap (as described in Chapter 6), or write a special set function as shown. Another alternative is to declare the variable as an array:
char VERSION[64] = "1.0";
When variables of type const char * are declared, SWIG still generates functions for setting and getting the value. However, the default behavior does not release the previous contents (resulting in a possible memory leak). In fact, you may get a warning message such as this when wrapping such a variable:
example.i:20. Typemap warning. Setting const char * variable may leak memory
The reason for this behavior is that const char * variables are often used to point to string literals. For example:
const char *foo = "Hello World\n";
Therefore, it's a really bad idea to call free() on such a pointer. On the other hand, it is legal to change the pointer to point to some other value. When setting a variable of this type, SWIG allocates a new string (using malloc or new) and changes the pointer to point to the new value. However, repeated modifications of the value will result in a memory leak since the old value is not released.
Arrays are fully supported by SWIG, but they are always handled as pointers instead of mapping them to a special array object or list in the target language. Thus, the following declarations :
int foobar(int a[40]);
void grok(char *argv[]);
void transpose(double a[20][20]);
are processed as if they were really declared like this:
int foobar(int *a);
void grok(char **argv);
void transpose(double (*a)[20]);
Like C, SWIG does not perform array bounds checking. It is up to the user to make sure the pointer points to a suitably allocated region of memory.
Multi-dimensional arrays are transformed into a pointer to an array of one less dimension. For example:
int [10];             // Maps to int *
int [10][20];         // Maps to int (*)[20]
int [10][20][30];     // Maps to int (*)[20][30]
It is important to note that in the C type system, a multidimensional array a[][] is NOT equivalent to a single pointer *a or a double pointer such as **a. Instead, a pointer to an array is used (as shown above) where the actual value of the pointer is the starting memory location of the array. The reader is strongly advised to dust off their C book and re-read the section on arrays before using them with SWIG.
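A short C sketch may help here. Given a pointer of type int (*)[20], the expression p[i][j] refers to element i*20+j of one contiguous block of memory; no second level of pointers is involved. The flat_index() function is a hypothetical helper written only to make that arithmetic visible.

```c
#include <stddef.h>

/* Returns the element offset of p[i][j] from the start of the array.
   Each step of i advances by a whole row of 20 ints. */
size_t flat_index(int (*p)[20], int i, int j) {
    return (size_t)((char *) &p[i][j] - (char *) p) / sizeof(int);
}
```

For an array declared as int a[10][20], flat_index(a, 2, 3) evaluates to 43. Passing an int ** in place of a would be a type error in C, because an int ** expects to find stored pointers rather than rows of integers.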
Array variables are supported, but are read-only by default. For example:
int a[100][200];
In this case, reading the variable 'a' returns a pointer of type int (*)[200] that points to the first element of the array &a[0][0]. Trying to modify 'a' results in an error. This is because SWIG does not know how to copy data from the target language into the array. To work around this limitation, you may want to write a few simple assist functions like this:
%inline %{
void a_set(int i, int j, int val) {
a[i][j] = val;
}
int a_get(int i, int j) {
return a[i][j];
}
%}
To dynamically create arrays of various sizes and shapes, it may be useful to write some helper functions in your interface. For example:
// Some array helpers
%inline %{
/* Create any sort of [size] array */
int *int_array(int size) {
return (int *) malloc(size*sizeof(int));
}
/* Create a two-dimension array [size][10] */
int (*int_array_10(int size))[10] {
return (int (*)[10]) malloc(size*10*sizeof(int));
}
%}
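Since the target language only sees an opaque pointer, helpers for element access and cleanup are usually needed as well. The following C sketch extends the idea; int_array_set(), int_array_get(), and int_array_destroy() are hypothetical names chosen for this example, and in an interface file they would be placed inside an %inline block like the one above.

```c
#include <stdlib.h>

/* Create an uninitialized array of `size` ints. */
int *int_array(int size) {
    return (int *) malloc((size_t) size * sizeof(int));
}

/* Store a value; no bounds checking is performed. */
void int_array_set(int *a, int i, int value) {
    a[i] = value;
}

/* Fetch a value; no bounds checking is performed. */
int int_array_get(int *a, int i) {
    return a[i];
}

/* Release an array created with int_array(). */
void int_array_destroy(int *a) {
    free(a);
}
```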
Arrays of char are handled as a special case by SWIG. In this case, strings in the target language can be stored in the array. For example, if you have a declaration like this,
char pathname[256];
SWIG generates functions for both getting and setting the value that are equivalent to the following code:
char *pathname_get() {
return pathname;
}
void pathname_set(char *value) {
strncpy(pathname,value,256);
}
In the target language, the value can be set like a normal variable.
A read-only variable can be created by using the %immutable directive as shown :
// File : interface.i
int a;            // Can read/write
%immutable;
int b,c,d;        // Read only variables
%mutable;
double x,y;       // read/write
The %immutable directive enables read-only mode until it is explicitly disabled using the %mutable directive. As an alternative to turning read-only mode off and on like this, individual declarations can also be tagged as immutable. For example:
%immutable x;     // Make x read-only
...
double x;         // Read-only (from earlier %immutable directive)
double y;         // Read-write
...
The %mutable and %immutable directives are actually %feature directives defined like this:
#define %immutable %feature("immutable")
#define %mutable %feature("immutable","")
If you wanted to make all wrapped variables read-only, barring one or two, it might be easier to take this approach:
%immutable; // Make all variables read-only
%feature("immutable","0") x; // except, make x read/write
...
double x;
double y;
double z;
...
Read-only variables are also created when declarations are declared as const. For example:
const int foo;                  /* Read only variable */
char * const version="1.0";     /* Read only variable */
Compatibility note: Read-only access used to be controlled by a pair of directives %readonly and %readwrite. Although these directives still work, they generate a warning message. Simply change the directives to %immutable; and %mutable; to silence the warning. Don't forget the extra semicolon!
Normally, the name of a C declaration is used when that declaration is wrapped into the target language. However, this may generate a conflict with a keyword or already existing function in the scripting language. To resolve a name conflict, you can use the %rename directive as shown :
// interface.i
%rename(my_print) print;
extern void print(char *);
%rename(foo) a_really_long_and_annoying_name;
extern int a_really_long_and_annoying_name;
SWIG still calls the correct C function, but in this case the function print() will really be called "my_print()" in the target language.
The placement of the %rename directive is arbitrary as long as it appears before the declarations to be renamed. A common technique is to write code for wrapping a header file like this:
// interface.i
%rename(my_print) print;
%rename(foo) a_really_long_and_annoying_name;
%include "header.h"
%rename applies a renaming operation to all future occurrences of a name. The renaming applies to functions, variables, class and structure names, member functions, and member data. For example, if you had two-dozen C++ classes, all with a member function named `print' (which is a keyword in Python), you could rename them all to `output' by specifying :
%rename(output) print; // Rename all `print' functions to `output'
SWIG does not normally perform any checks to see if the functions it wraps are already defined in the target scripting language. However, if you are careful about namespaces and your use of modules, you can usually avoid these problems.
Closely related to %rename is the %ignore directive. %ignore instructs SWIG to ignore declarations that match a given identifier. For example:
%ignore print;          // Ignore all declarations named print
%ignore _HAVE_FOO_H;    // Ignore an include guard constant
...
%include "foo.h"        // Grab a header file
...
Any function, variable etc which matches %ignore will not be wrapped and therefore will not be available from the target language. A common usage of %ignore is to selectively remove certain declarations from a header file without having to add conditional compilation to the header. However, it should be stressed that this only works for simple declarations. If you need to remove a whole section of problematic code, the SWIG preprocessor should be used instead.
More powerful variants of %rename and %ignore directives can be used to help wrap C++ overloaded functions and methods or C++ methods which use default arguments. This is described in the Ambiguity resolution and renaming section in the C++ chapter.
Compatibility note: Older versions of SWIG provided a special %name directive for renaming declarations. For example:
%name(output) extern void print(char *);
This directive is still supported, but it is deprecated and should probably be avoided. The %rename directive is more powerful and better supports wrapping of raw header file information.
SWIG supports default arguments in both C and C++ code. For example:
int plot(double x, double y, int color=WHITE);
In this case, SWIG generates wrapper code where the default arguments are optional in the target language. For example, this function could be used in Tcl as follows :
% plot -3.4 7.5        # Use default value
% plot -3.4 7.5 10     # set color to 10 instead
Although the ANSI C standard does not allow default arguments, default arguments specified in a SWIG interface work with both C and C++.
Note: There is a subtle semantic issue concerning the use of default arguments and the SWIG generated wrapper code. When default arguments are used in C code, the default values are emitted into the wrappers and the function is invoked with a full set of arguments. This is different to when wrapping C++ where an overloaded wrapper method is generated for each defaulted argument. Please refer to the section on default arguments in the C++ chapter for further details.
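The C-mode behavior described in the note can be pictured with the following sketch. It is only a model: the argc/argv calling convention of wrap_plot(), the value of WHITE, and the body of plot() are all assumptions made for this example, and real wrappers use each target language's own argument-passing API.

```c
#define WHITE 0   /* assumed value; the real constant comes from the library */

/* Stand-in for the library function; returns the color it was given. */
int plot(double x, double y, int color) {
    (void) x;
    (void) y;
    return color;
}

/* Hypothetical wrapper: argc counts the arguments the script supplied. */
int wrap_plot(int argc, const double *argv) {
    double x = argv[0];
    double y = argv[1];
    int color = WHITE;                    /* default value emitted into the wrapper */
    if (argc > 2) color = (int) argv[2];  /* overridden only when supplied */
    return plot(x, y, color);             /* always called with all arguments */
}
```

The key point is that plot() itself never sees a short argument list; the default lives in the wrapper.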
Occasionally, a C library may include functions that expect to receive pointers to functions--possibly to serve as callbacks. SWIG provides full support for function pointers provided that the callback functions are defined in C and not in the target language. For example, consider a function like this:
int binary_op(int a, int b, int (*op)(int,int));
When you first wrap something like this into an extension module, you may find the function to be impossible to use. For instance, in Python:
>>> def add(x,y):
...     return x+y
...
>>> binary_op(3,4,add)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: Type error. Expected _p_f_int_int__int
>>>
The reason for this error is that SWIG doesn't know how to map a scripting language function into a C callback. However, existing C functions can be used as arguments provided you install them as constants. One way to do this is to use the %constant directive like this:
/* Function with a callback */
int binary_op(int a, int b, int (*op)(int,int));
/* Some callback functions */
%constant int add(int,int);
%constant int sub(int,int);
%constant int mul(int,int);
In this case, add, sub, and mul become function pointer constants in the target scripting language. This allows you to use them as follows:
>>> binary_op(3,4,add)
7
>>> binary_op(3,4,mul)
12
>>>
Unfortunately, by declaring the callback functions as constants, they are no longer accessible as functions. For example:
>>> add(3,4)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: object is not callable: '_ff020efc_p_f_int_int__int'
>>>
If you want to make a function available as both a callback function and a function, you can use the %callback and %nocallback directives like this:
/* Function with a callback */
int binary_op(int a, int b, int (*op)(int,int));
/* Some callback functions */
%callback("%s_cb");
int add(int,int);
int sub(int,int);
int mul(int,int);
%nocallback;
The argument to %callback is a printf-style format string that specifies the naming convention for the callback constants (%s gets replaced by the function name). The callback mode remains in effect until it is explicitly disabled using %nocallback. When you do this, the interface now works as follows:
>>> binary_op(3,4,add_cb)
7
>>> binary_op(3,4,mul_cb)
12
>>> add(3,4)
7
>>> mul(3,4)
12
Notice that when the function is used as a callback, special names such as add_cb are used instead. To call the function normally, just use the original function name such as add().
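For reference, the C side of this example might look like the sketch below. Only the prototypes appear in the interface; the function bodies here are assumptions chosen to be consistent with the results shown in the Python sessions.

```c
/* Candidate callback functions (bodies assumed). */
int add(int a, int b) { return a + b; }
int sub(int a, int b) { return a - b; }
int mul(int a, int b) { return a * b; }

/* Applies the supplied C function pointer to the two operands. */
int binary_op(int a, int b, int (*op)(int, int)) {
    return op(a, b);
}
```

A callback constant such as add_cb simply holds the address of add, so binary_op() receives an ordinary C function pointer and never calls back into the scripting language.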
SWIG provides a number of extensions to standard C printf formatting that may be useful in this context. For instance, the following variation installs the callbacks as all upper-case constants such as ADD, SUB, and MUL:
/* Some callback functions */
%callback("%(upper)s");
int add(int,int);
int sub(int,int);
int mul(int,int);
%nocallback;
A format string of "%(lower)s" converts all characters to lower-case. A string of "%(title)s" capitalizes the first character and converts the rest to lower case.
And now, a final note about function pointer support. Although SWIG does not normally allow callback functions to be written in the target language, this can be accomplished with the use of typemaps and other advanced SWIG features. This is described in a later chapter.
This section describes the behavior of SWIG when processing ANSI C structures and union declarations. Extensions to handle C++ are described in the next section.
If SWIG encounters the definition of a structure or union, it creates a set of accessor functions. Although SWIG does not need structure definitions to build an interface, providing definitions makes it possible to access structure members. The accessor functions generated by SWIG simply take a pointer to an object and allow access to an individual member. For example, the declaration :
struct Vector {
double x,y,z;
};
gets transformed into the following set of accessor functions :
double Vector_x_get(struct Vector *obj) {
return obj->x;
}
double Vector_y_get(struct Vector *obj) {
return obj->y;
}
double Vector_z_get(struct Vector *obj) {
return obj->z;
}
void Vector_x_set(struct Vector *obj, double value) {
obj->x = value;
}
void Vector_y_set(struct Vector *obj, double value) {
obj->y = value;
}
void Vector_z_set(struct Vector *obj, double value) {
obj->z = value;
}
In addition, SWIG creates default constructor and destructor functions if none are defined in the interface. For example:
struct Vector *new_Vector() {
return (struct Vector *) calloc(1,sizeof(struct Vector));
}
void delete_Vector(struct Vector *obj) {
free(obj);
}
Using these low-level accessor functions, an object can be minimally manipulated from the target language using code like this:
v = new_Vector()
Vector_x_set(v,2)
Vector_y_set(v,10)
Vector_z_set(v,-5)
...
delete_Vector(v)
However, most of SWIG's language modules also provide a high-level interface that is more convenient. Keep reading.
SWIG supports the following construct which is quite common in C programs :
typedef struct {
double x,y,z;
} Vector;
When encountered, SWIG assumes that the name of the object is `Vector' and creates accessor functions like before. The only difference is that the use of typedef allows SWIG to drop the struct keyword on its generated code. For example:
double Vector_x_get(Vector *obj) {
return obj->x;
}
If two different names are used like this :
typedef struct vector_struct {
double x,y,z;
} Vector;
the name Vector is used instead of vector_struct since this is more typical C programming style. If declarations defined later in the interface use the type struct vector_struct, SWIG knows that this is the same as Vector and it generates the appropriate type-checking code.
Structures involving character strings require some care. SWIG assumes that all members of type char * have been dynamically allocated using malloc() and that they are NULL-terminated ASCII strings. When such a member is modified, the previous contents are released and the new contents are allocated. For example :
%module mymodule
...
struct Foo {
char *name;
...
};
This results in the following accessor functions :
char *Foo_name_get(Foo *obj) {
return obj->name;
}
char *Foo_name_set(Foo *obj, char *c) {
if (obj->name) free(obj->name);
obj->name = (char *) malloc(strlen(c)+1);
strcpy(obj->name,c);
return obj->name;
}
If this behavior differs from what you need in your applications, the SWIG "memberin" typemap can be used to change it. See the typemaps chapter for further details.
Note: If the -c++ option is used, new and delete are used to perform memory allocation.
Arrays may appear as the members of structures, but they will be read-only. SWIG will write an accessor function that returns the pointer to the first element of the array, but will not write a function to change the contents of the array itself. When this situation is detected, SWIG may generate a warning message such as the following :
interface.i:116. Warning. Array member will be read-only
To eliminate the warning message, typemaps can be used, but this is discussed in a later chapter. In many cases, the warning message is harmless.
Occasionally, a structure will contain data members that are themselves structures. For example:
typedef struct Foo {
int x;
} Foo;
typedef struct Bar {
int y;
Foo f; /* struct member */
} Bar;
When a structure member is wrapped, it is handled as a pointer, unless the %naturalvar directive is used, in which case it is handled more like a C++ reference (see C++ Member data). The accessors to the member variable, treated as a pointer, are effectively wrapped as follows:
Foo *Bar_f_get(Bar *b) {
return &b->f;
}
void Bar_f_set(Bar *b, Foo *value) {
b->f = *value;
}
The reasons for this are somewhat subtle but have to do with the problem of modifying and accessing data inside the data member. For example, suppose you wanted to modify the value of f.x of a Bar object like this:
Bar *b;
b->f.x = 37;
Translating this assignment to function calls (as would be used inside the scripting language interface) results in the following code:
Bar *b;
Foo_x_set(Bar_f_get(b),37);
In this code, if the Bar_f_get() function were to return a Foo instead of a Foo *, then the resulting modification would be applied to a copy of f and not the data member f itself. Clearly that's not what you want!
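The difference is easy to demonstrate in C. In the sketch below, Bar_f_get() and Foo_x_set() follow the accessor style shown above, while Bar_f_copy() is a hypothetical by-value getter added only for contrast.

```c
typedef struct Foo {
    int x;
} Foo;

typedef struct Bar {
    int y;
    Foo f;                 /* struct member */
} Bar;

/* Pointer-returning accessor: refers to the member inside *b. */
Foo *Bar_f_get(Bar *b) {
    return &b->f;
}

void Foo_x_set(Foo *f, int value) {
    f->x = value;
}

/* Hypothetical by-value getter: returns a copy of the member. */
Foo Bar_f_copy(Bar *b) {
    return b->f;
}
```

Calling Foo_x_set(Bar_f_get(&b), 37) changes b.f.x, whereas setting x on the object returned by Bar_f_copy(&b) changes only the temporary copy and leaves b untouched.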
It should be noted that this transformation to pointers only occurs if SWIG knows that a data member is a structure or class. For instance, if you had a structure like this,
struct Foo {
WORD w;
};
and nothing was known about WORD, then SWIG will generate more normal accessor functions like this:
WORD Foo_w_get(Foo *f) {
return f->w;
}
void Foo_w_set(Foo *f, WORD value) {
f->w = value;
}
Compatibility Note: SWIG-1.3.11 and earlier releases transformed all non-primitive member datatypes to pointers. Starting in SWIG-1.3.12, this transformation only occurs if a datatype is known to be a structure, class, or union. This is unlikely to break existing code. However, if you need to tell SWIG that an undeclared datatype is really a struct, simply use a forward struct declaration such as "struct Foo;".
When wrapping structures, it is generally useful to have a mechanism for creating and destroying objects. If you don't do anything, SWIG will automatically generate functions for creating and destroying objects using malloc() and free(). Note: the use of malloc() only applies when SWIG is used on C code (i.e., when the -c++ option is not supplied on the command line). C++ is handled differently.
If you don't want SWIG to generate default constructors for your interfaces, you can use the %nodefaultctor directive or the -nodefaultctor command line option. For example:
swig -nodefaultctor example.i
or
%module foo
...
%nodefaultctor;         // Don't create default constructors
... declarations ...
%clearnodefaultctor;    // Re-enable default constructors
If you need more precise control, %nodefaultctor can selectively target individual structure definitions. For example:
%nodefaultctor Foo; // No default constructor for Foo
...
struct Foo { // No default constructor generated.
};
struct Bar { // Default constructor generated.
};
Since ignoring the implicit or default destructors usually produces memory leaks, SWIG always tries to generate them. If needed, however, you can selectively disable the generation of the default/implicit destructor by using %nodefaultdtor:
%nodefaultdtor Foo; // No default/implicit destructor for Foo
...
struct Foo { // No default destructor is generated.
};
struct Bar { // Default destructor generated.
};
Compatibility note: Prior to SWIG-1.3.7, SWIG did not generate default constructors or destructors unless you explicitly turned them on using -make_default. However, it appears that most users want to have constructor and destructor functions so it has now been enabled as the default behavior.
Note: There are also the -nodefault option and %nodefault directive, which disable both default constructor and default/implicit destructor generation. This could lead to memory leaks across the target languages, so it is highly recommended that you do not use them.
Most languages provide a mechanism for creating classes and supporting object oriented programming. From a C standpoint, object oriented programming really just boils down to the process of attaching functions to structures. These functions normally operate on an instance of the structure (or object). Although there is a natural mapping of C++ to such a scheme, there is no direct mechanism for utilizing it with C code. However, SWIG provides a special %extend directive that makes it possible to attach methods to C structures for purposes of building an object oriented interface. Suppose you have a C header file with the following declaration :
/* file : vector.h */
...
typedef struct {
double x,y,z;
} Vector;
You can make a Vector look a lot like a class by writing a SWIG interface like this:
// file : vector.i
%module mymodule
%{
#include "vector.h"
%}
%include "vector.h" // Just grab original C header file
%extend Vector { // Attach these functions to struct Vector
Vector(double x, double y, double z) {
Vector *v;
v = (Vector *) malloc(sizeof(Vector));
v->x = x;
v->y = y;
v->z = z;
return v;
}
~Vector() {
free($self);
}
double magnitude() {
return sqrt($self->x*$self->x+$self->y*$self->y+$self->z*$self->z);
}
void print() {
printf("Vector [%g, %g, %g]\n", $self->x,$self->y,$self->z);
}
};
Note the usage of the $self special variable. Its usage is identical to a C++ 'this' pointer and should be used whenever access to the struct instance is required.
Now, when used with proxy classes in Python, you can do things like this :
>>> v = Vector(3,4,0)          # Create a new vector
>>> print v.magnitude()        # Print magnitude
5.0
>>> v.print()                  # Print it out
Vector [3, 4, 0]
>>> del v                      # Destroy it
The %extend directive can also be used inside the definition of the Vector structure. For example:
// file : vector.i
%module mymodule
%{
#include "vector.h"
%}
typedef struct {
double x,y,z;
%extend {
Vector(double x, double y, double z) { ... }
~Vector() { ... }
...
}
} Vector;
Finally, %extend can be used to access externally written functions provided they follow the naming convention used in this example :
/* File : vector.c */
/* Vector methods */
#include "vector.h"
Vector *new_Vector(double x, double y, double z) {
Vector *v;
v = (Vector *) malloc(sizeof(Vector));
v->x = x;
v->y = y;
v->z = z;
return v;
}
void delete_Vector(Vector *v) {
free(v);
}
double Vector_magnitude(Vector *v) {
return sqrt(v->x*v->x+v->y*v->y+v->z*v->z);
}
// File : vector.i
// Interface file
%module mymodule
%{
#include "vector.h"
%}
typedef struct {
double x,y,z;
%extend {
Vector(double,double,double); // This calls new_Vector()
~Vector(); // This calls delete_Vector()
double magnitude(); // This will call Vector_magnitude()
...
}
} Vector;
A little known feature of the %extend directive is that it can also be used to add synthesized attributes or to modify the behavior of existing data attributes. For example, suppose you wanted to make magnitude a read-only attribute of Vector instead of a method. To do this, you might write some code like this:
// Add a new attribute to Vector
%extend Vector {
const double magnitude;
}
// Now supply the implementation of the Vector_magnitude_get function
%{
const double Vector_magnitude_get(Vector *v) {
return (const double) sqrt(v->x*v->x+v->y*v->y+v->z*v->z);
}
%}
Now, for all practical purposes, magnitude will appear like an attribute of the object.
A similar technique can also be used to work with problematic data members. For example, consider this interface:
struct Person {
char name[50];
...
};
By default, the name attribute is read-only because SWIG does not normally know how to modify arrays. However, you can rewrite the interface as follows to change this:
struct Person {
%extend {
char *name;
}
...
};
// Specific implementation of set/get functions
%{
char *Person_name_get(Person *p) {
return p->name;
}
void Person_name_set(Person *p, char *val) {
strncpy(p->name,val,50);
}
%}
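One detail is worth guarding against when writing a set function like this: strncpy() does not write a terminating NUL when the source string is as long as or longer than the limit. The following variation is a sketch under the assumption that silently truncating an over-long name is acceptable; it copies at most 49 characters and always terminates the buffer.

```c
#include <string.h>

struct Person {
    char name[50];
};

char *Person_name_get(struct Person *p) {
    return p->name;
}

/* Copies at most sizeof(name)-1 characters and always NUL-terminates. */
void Person_name_set(struct Person *p, const char *val) {
    strncpy(p->name, val, sizeof(p->name) - 1);
    p->name[sizeof(p->name) - 1] = '\0';
}
```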
Finally, it should be stressed that even though %extend can be used to add new data members, these new members can not require the allocation of additional storage in the object (e.g., their values must be entirely synthesized from existing attributes of the structure).
Compatibility note: The %extend directive is a new name for the %addmethods directive. Since %addmethods could be used to extend a structure with more than just methods, a more suitable directive name has been chosen.
Occasionally, a C program will involve structures like this :
typedef struct Object {
int objtype;
union {
int ivalue;
double dvalue;
char *strvalue;
void *ptrvalue;
} intRep;
} Object;
When SWIG encounters this, it performs a structure splitting operation that transforms the declaration into the equivalent of the following:
typedef union {
int ivalue;
double dvalue;
char *strvalue;
void *ptrvalue;
} Object_intRep;
typedef struct Object {
int objtype;
Object_intRep intRep;
} Object;
SWIG will then create an Object_intRep structure for use inside the interface file. Accessor functions will be created for both structures. In this case, functions like this would be created :
Object_intRep *Object_intRep_get(Object *o) {
return (Object_intRep *) &o->intRep;
}
int Object_intRep_ivalue_get(Object_intRep *o) {
return o->ivalue;
}
int Object_intRep_ivalue_set(Object_intRep *o, int value) {
return (o->ivalue = value);
}
double Object_intRep_dvalue_get(Object_intRep *o) {
return o->dvalue;
}
... etc ...
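To see how the split declarations and their accessors fit together, here is a small self-contained C++ sketch. It only restates the declarations above in compilable form; the accessors are written out by hand here, which is an assumption made for illustration and not the literal code SWIG emits.

```cpp
#include <cassert>

// The split declarations, as described in the text.
typedef union {
    int ivalue;
    double dvalue;
    char *strvalue;
    void *ptrvalue;
} Object_intRep;

typedef struct Object {
    int objtype;
    Object_intRep intRep;
} Object;

// Accessor for the nested member: returns a pointer into the parent object.
Object_intRep *Object_intRep_get(Object *o) {
    assert(o != nullptr);
    return &o->intRep;
}

// Accessors for one field of the nested union.
int Object_intRep_ivalue_get(Object_intRep *o) {
    return o->ivalue;
}
int Object_intRep_ivalue_set(Object_intRep *o, int value) {
    return (o->ivalue = value);
}
```

Since Object_intRep_get() returns a pointer into the parent, a write through it changes the original Object and not a copy.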
Although this process is a little hairy, it works like you would expect in the target scripting language--especially when proxy classes are used. For instance, in Perl:
# Perl5 script for accessing nested member
$o = CreateObject(); # Create an object somehow
$o->{intRep}->{ivalue} = 7; # Change value of o.intRep.ivalue
If you have a lot of nested structure declarations, it is advisable to double-check them after running SWIG. Although there is a good chance that they will work, you may have to modify the interface file in certain cases.
SWIG doesn't care if the declaration of a structure in a .i file exactly matches that used in the underlying C code (except in the case of nested structures). For this reason, there are no problems omitting problematic members or simply omitting the structure definition altogether. If you are happy passing pointers around, this can be done without ever giving SWIG a structure definition.
Starting with SWIG-1.3, a number of improvements have been made to SWIG's code generator. Specifically, even though structure access has been described in terms of high-level accessor functions such as this,
double Vector_x_get(Vector *v) {
return v->x;
}
most of the generated code is actually inlined directly into wrapper functions. Therefore, no function Vector_x_get() actually exists in the generated wrapper file. For example, when creating a Tcl module, the following function is generated instead:
static int
_wrap_Vector_x_get(ClientData clientData, Tcl_Interp *interp,
int objc, Tcl_Obj *CONST objv[]) {
struct Vector *arg1 ;
double result ;
if (SWIG_GetArgs(interp, objc, objv,"p:Vector_x_get self ",&arg1,
SWIGTYPE_p_Vector) == TCL_ERROR)
return TCL_ERROR;
result = (double ) (arg1->x);
Tcl_SetObjResult(interp,Tcl_NewDoubleObj((double) result));
return TCL_OK;
}
The only exception to this rule is methods defined with %extend. In this case, the added code is contained in a separate function.
Finally, it is important to note that most language modules may choose to build a more advanced interface. Although you may never use the low-level interface described here, most of SWIG's language modules use it in some way or another.
Sometimes it is necessary to insert special code into the resulting wrapper file generated by SWIG. For example, you may want to include additional C code to perform initialization or other operations. There are four common ways to insert code, but it's useful to know how the output of SWIG is structured first.
When SWIG creates its output file, it is broken up into five sections corresponding to the begin section, runtime code, headers, wrapper functions, and module initialization code (in that order).
Code is inserted into the appropriate code section by using one of the code insertion directives listed below. The order of the sections in the wrapper file is as shown:
%begin %{
... code in begin section ...
%}
%runtime %{
... code in runtime section ...
%}
%header %{
... code in header section ...
%}
%wrapper %{
... code in wrapper section ...
%}
%init %{
... code in init section ...
%}
The bare %{ ... %} directive is a shortcut that is the same as %header %{ ... %}.
The %begin section is effectively empty as it just contains the SWIG banner by default. This section is provided as a way for users to insert code at the top of the wrapper file before any other code is generated. Everything in a code insertion block is copied verbatim into the output file and is not parsed by SWIG. Most SWIG input files have at least one such block to include header files and support C code. Additional code blocks may be placed anywhere in a SWIG file as needed.
%module mymodule
%{
#include "my_header.h"
%}
... Declare functions here
%{
void some_extra_function() {
...
}
%}
A common use for code blocks is to write "helper" functions. These are functions that are used specifically for the purpose of building an interface, but which are generally not visible to the normal C program. For example :
%{
/* Create a new vector */
static Vector *new_Vector() {
return (Vector *) malloc(sizeof(Vector));
}
%}
// Now wrap it
Vector *new_Vector();
Since the process of writing helper functions is fairly common, there is a special inlined form of code block that is used as follows :
%inline %{
/* Create a new vector */
Vector *new_Vector() {
return (Vector *) malloc(sizeof(Vector));
}
%}
The %inline directive inserts all of the code that follows verbatim into the header portion of an interface file. The code is then parsed by both the SWIG preprocessor and parser. Thus, the above example creates a new command new_Vector using only one declaration. Since the code inside an %inline %{ ... %} block is given to both the C compiler and SWIG, it is illegal to include any SWIG directives inside a %{ ... %} block.
When code is included in the %init section, it is copied directly into the module initialization function. For example, if you needed to perform some extra initialization on module loading, you could write this:
%init %{
init_variables();
%}
This section describes the general approach for building interfaces with SWIG. The specifics related to a particular scripting language are found in later chapters.
SWIG doesn't require modifications to your C code, but if you feed it a collection of raw C header files or source code, the results might not be what you expect---in fact, they might be awful. Here's a series of steps you can follow to make an interface for a C program :
Although this may sound complicated, the process turns out to be fairly easy once you get the hang of it.
In the process of building an interface, SWIG may encounter syntax errors or other problems. The best way to deal with this is to simply copy the offending code into a separate interface file and edit it. However, the SWIG developers have worked very hard to improve the SWIG parser--you should report parsing errors to the swig-devel mailing list or to the SWIG bug tracker.
The preferred method of using SWIG is to generate a separate interface file. Suppose you have the following C header file :
/* File : header.h */
#include <stdio.h>
#include <math.h>
extern int foo(double);
extern double bar(int, int);
extern void dump(FILE *f);
A typical SWIG interface file for this header file would look like the following :
/* File : interface.i */
%module mymodule
%{
#include "header.h"
%}
extern int foo(double);
extern double bar(int, int);
extern void dump(FILE *f);
Of course, in this case, our header file is pretty simple so we could have made an interface file like this as well:
/* File : interface.i */
%module mymodule
%include header.h
Naturally, your mileage may vary.
Although SWIG can parse many header files, it is more common to write a special .i file defining the interface to a package. There are several reasons why you might want to do this:
Sometimes, it is necessary to use certain header files in order for the code generated by SWIG to compile properly. Make sure you include these header files by using a %{ ... %} block like this:
%module graphics
%{
#include <GL/gl.h>
#include <GL/glu.h>
%}
// Put rest of declarations here
...
If your program defines a main() function, you may need to get rid of it or rename it in order to use a scripting language. Most scripting languages define their own main() procedure that is called instead. main() also makes no sense when working with dynamic loading. There are a few approaches to solving the main() conflict :
Getting rid of main() may cause potential initialization problems of a program. To handle this problem, you may consider writing a special function called program_init() that initializes your program upon startup. This function could then be called either from the scripting language as the first operation, or when the SWIG generated module is loaded.
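One possible shape for such an initialization function is sketched below in C++. Everything except the name program_init() is hypothetical: the two state variables, the grid_size parameter, and the cells() function stand in for whatever your program's main() used to set up.

```cpp
#include <cassert>

// Hypothetical program state that main() used to set up.
static bool g_initialized = false;
static int g_grid_size = 0;

// Initialization formerly done at the top of main().
// Safe to call more than once; returns true only on the first call.
bool program_init(int grid_size = 64) {
    if (g_initialized) return false;
    g_grid_size = grid_size;
    g_initialized = true;
    return true;
}

// An ordinary entry point that relies on prior initialization.
int cells() {
    assert(g_initialized);
    return g_grid_size * g_grid_size;
}
```

Making the function idempotent means it does no harm if it is called both at module load time and again from a script.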
As a general note, many C programs only use the main() function to parse command line options and to set parameters. However, by using a scripting language, you are probably trying to create a program that is more interactive. In many cases, the old main() program can be completely replaced by a Perl, Python, or Tcl script.
Note: In some cases, you might be inclined to create a scripting language wrapper for main(). If you do this, the compilation will probably work and your module might even load correctly. The only trouble is that when you call your main() wrapper, you will find that it actually invokes the main() of the scripting language interpreter itself! This behavior is a side effect of the symbol binding mechanism used in the dynamic linker. The bottom line: don't do this.
This chapter describes SWIG's support for wrapping C++. As a prerequisite, you should first read the chapter SWIG Basics to see how SWIG wraps ANSI C. Support for C++ builds upon ANSI C wrapping and that material will be useful in understanding this chapter.
Because of its complexity and the fact that C++ can be difficult to integrate with itself let alone other languages, SWIG only provides support for a subset of C++ features. Fortunately, this is now a rather large subset.
In part, the problem with C++ wrapping is that there is no semantically obvious (or automatic) way to map many of its advanced features into other languages. As a simple example, consider the problem of wrapping C++ multiple inheritance to a target language with no such support. Similarly, the use of overloaded operators and overloaded functions can be problematic when no such capability exists in a target language.
A more subtle issue with C++ has to do with the way that some C++ programmers think about programming libraries. In the world of SWIG, you are really trying to create binary-level software components for use in other languages. In order for this to work, a "component" has to contain real executable instructions and there has to be some kind of binary linking mechanism for accessing its functionality. In contrast, C++ has increasingly relied upon generic programming and templates for much of its functionality. Although templates are a powerful feature, they are largely orthogonal to the whole notion of binary components and libraries. For example, an STL vector does not define any kind of binary object for which SWIG can just create a wrapper. To further complicate matters, these libraries often utilize a lot of behind the scenes magic in which the semantics of seemingly basic operations (e.g., pointer dereferencing, procedure call, etc.) can be changed in dramatic and sometimes non-obvious ways. Although this "magic" may present few problems in a C++-only universe, it greatly complicates the problem of crossing language boundaries and provides many opportunities to shoot yourself in the foot. You will just have to be careful.
To wrap C++, SWIG uses a layered approach to code generation. At the lowest level, SWIG generates a collection of procedural ANSI-C style wrappers. These wrappers take care of basic type conversion, type checking, error handling, and other low-level details of the C++ binding. These wrappers are also sufficient to bind C++ into any target language that supports built-in procedures. In some sense, you might view this layer of wrapping as providing a C library interface to C++. On top of the low-level procedural (flattened) interface, SWIG generates proxy classes that provide a natural object-oriented (OO) interface to the underlying code. The proxy classes are typically written in the target language itself. For instance, in Python, a real Python class is used to provide a wrapper around the underlying C++ object.
It is important to emphasize that SWIG takes a deliberately conservative and non-intrusive approach to C++ wrapping. SWIG does not encapsulate C++ classes inside a special C++ adaptor, it does not rely upon templates, nor does it add in additional C++ inheritance when generating wrappers. The last thing that most C++ programs need is even more compiler magic. Therefore, SWIG tries to maintain a very strict and clean separation between the implementation of your C++ application and the resulting wrapper code. You might say that SWIG has been written to follow the principle of least surprise--it does not play sneaky tricks with the C++ type system, it doesn't mess with your class hierarchies, and it doesn't introduce new semantics. Although this approach might not provide the most seamless integration with C++, it is safe, simple, portable, and debuggable.
Some of this chapter focuses on the low-level procedural interface to C++ that is used as the foundation for all language modules. Keep in mind that the target languages also provide the high-level OO interface via proxy classes. More detailed coverage can be found in the documentation for each target language.
SWIG currently supports most C++ features including the following:
The following C++ features are not currently supported:
As a rule of thumb, SWIG should not be used on raw C++ source files; use header files only.
SWIG's C++ support is an ongoing project so some of these limitations may be lifted in future releases. However, we make no promises. Also, submitting a bug report is a very good way to get problems fixed (wink).
When wrapping C++ code, it is critical that SWIG be called with the `-c++' option. This changes the way a number of critical features such as memory management are handled. It also enables the recognition of C++ keywords. Without the -c++ flag, SWIG will either issue a warning or a large number of syntax errors if it encounters C++ code in an interface file.
When compiling and linking the resulting wrapper file, it is normal to use the C++ compiler. For example:
$ swig -c++ -tcl example.i
$ c++ -c example_wrap.cxx
$ c++ example_wrap.o $(OBJS) -o example.so
Unfortunately, the process varies slightly on each platform. Make sure you refer to the documentation on each target language for further details. The SWIG Wiki also has further details.
Compatibility Note: Early versions of SWIG generated just a flattened low-level C style API to C++ classes by default. The -noproxy commandline option is recognised by many target languages and will generate just this interface as in earlier versions.
In order to provide a natural mapping from C++ classes to the target language classes, SWIG's target languages mostly wrap C++ classes with special proxy classes. These proxy classes are typically implemented in the target language itself. For example, if you're building a Python module, each C++ class is wrapped by a Python proxy class. Or if you're building a Java module, each C++ class is wrapped by a Java proxy class.
Proxy classes are always constructed as an extra layer of wrapping that uses low-level accessor functions. To illustrate, suppose you had a C++ class like this:
class Foo {
public:
Foo();
~Foo();
int bar(int x);
int x;
};
Using C++ as pseudocode, a proxy class looks something like this:
class FooProxy {
private:
Foo *self;
public:
FooProxy() {
self = new_Foo();
}
~FooProxy() {
delete_Foo(self);
}
int bar(int x) {
return Foo_bar(self,x);
}
int x_get() {
return Foo_x_get(self);
}
void x_set(int x) {
Foo_x_set(self,x);
}
};
Of course, always keep in mind that the real proxy class is written in the target language. For example, in Python, the proxy might look roughly like this:
class Foo:
def __init__(self):
self.this = new_Foo()
def __del__(self):
delete_Foo(self.this)
def bar(self,x):
return Foo_bar(self.this,x)
def __getattr__(self,name):
if name == 'x':
return Foo_x_get(self.this)
...
def __setattr__(self,name,value):
if name == 'x':
Foo_x_set(self.this,value)
...
Again, it's important to emphasize that the low-level accessor functions are always used by the proxy classes. Whenever possible, proxies try to take advantage of language features that are similar to C++. This might include operator overloading, exception handling, and other features.
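To make the layering concrete, here is a compilable C++ sketch of the two levels for the Foo class. The bodies of Foo's members are assumptions made so that the example has observable behavior, and the deleted copy operations are an addition that keeps this standalone proxy from deleting its object twice. Real proxies are written in the target language, so this is only an illustration of the structure.

```cpp
#include <cassert>

// The class being wrapped (member bodies are assumptions for illustration).
class Foo {
public:
    Foo() : x(0) {}
    ~Foo() {}
    int bar(int v) { return x + v; }
    int x;
};

// Flattened, procedural layer.
Foo *new_Foo() { return new Foo(); }
void delete_Foo(Foo *f) { delete f; }
int Foo_bar(Foo *f, int v) { return f->bar(v); }
int Foo_x_get(Foo *f) { return f->x; }
void Foo_x_set(Foo *f, int v) { f->x = v; }

// Proxy layer: touches Foo only through the flattened functions.
class FooProxy {
    Foo *self;
public:
    FooProxy() : self(new_Foo()) {}
    ~FooProxy() { delete_Foo(self); }
    FooProxy(const FooProxy &) = delete;             // avoid a double delete
    FooProxy &operator=(const FooProxy &) = delete;
    int bar(int v) { return Foo_bar(self, v); }
    int x_get() { return Foo_x_get(self); }
    void x_set(int v) { Foo_x_set(self, v); }
};
```

Nothing in FooProxy calls a Foo member directly, which is the property that lets the same flattened layer serve every target language.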
A major issue with proxies concerns the memory management of wrapped objects. Consider the following C++ code:
class Foo {
public:
Foo();
~Foo();
int bar(int x);
int x;
};
class Spam {
public:
Foo *value;
...
};
Consider some script code that uses these classes:
f = Foo() # Creates a new Foo
s = Spam() # Creates a new Spam
s.value = f # Stores a reference to f inside s
g = s.value # Returns stored reference
g = 4 # Reassign g to some other value
del f # Destroy f
Now, ponder the resulting memory management issues. When objects are created in the script, the objects are wrapped by newly created proxy classes. That is, there is both a new proxy class instance and a new instance of the underlying C++ class. In this example, both f and s are created in this way. However, the statement s.value = f is rather curious---when executed, a pointer to f is stored inside another object. This means that the scripting proxy class AND another C++ class share a reference to the same object. To make matters even more interesting, consider the statement g = s.value. When executed, this creates a new proxy class g that provides a wrapper around the C++ object stored in s.value. In general, there is no way to know where this object came from---it could have been created by the script, but it could also have been generated internally. In this particular example, the assignment of g results in a second proxy class for f. In other words, a reference to f is now shared by two proxy classes and a C++ class.
Finally, consider what happens when objects are destroyed. In the statement, g=4, the variable g is reassigned. In many languages, this makes the old value of g available for garbage collection. Therefore, this causes one of the proxy classes to be destroyed. Later on, the statement del f destroys the other proxy class. Of course, there is still a reference to the original object stored inside another C++ object. What happens to it? Is the object still valid?
To deal with memory management problems, proxy classes provide an API for controlling ownership. In C++ pseudocode, ownership control might look roughly like this:
class FooProxy {
public:
Foo *self;
int thisown;
FooProxy() {
self = new_Foo();
thisown = 1; // Newly created object
}
~FooProxy() {
if (thisown) delete_Foo(self);
}
...
// Ownership control API
void disown() {
thisown = 0;
}
void acquire() {
thisown = 1;
}
};
class FooPtrProxy: public FooProxy {
public:
FooPtrProxy(Foo *s) {
self = s;
thisown = 0;
}
};
class SpamProxy {
...
FooProxy *value_get() {
return new FooPtrProxy(Spam_value_get(self));
}
void value_set(FooProxy *v) {
Spam_value_set(self,v->self);
v->disown();
}
...
};
Looking at this code, there are a few central features:
Given the tricky nature of C++ memory management, it is impossible for proxy classes to automatically handle every possible memory management problem. However, proxies do provide a mechanism for manual control that can be used (if necessary) to address some of the more tricky memory management problems.
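The ownership flag can be demonstrated with a small runnable C++ sketch. It is a simplification under stated assumptions: the live_foos counter exists only to make destruction observable, the non-owning wrapper is a second FooProxy constructor and not a separate FooPtrProxy class, and set_value() is a hypothetical helper that merges the low-level setter with the disown step.

```cpp
#include <cassert>

static int live_foos = 0;  // counts underlying objects, for demonstration only

class Foo {
public:
    Foo() { ++live_foos; }
    ~Foo() { --live_foos; }
};

Foo *new_Foo() { return new Foo(); }
void delete_Foo(Foo *f) { delete f; }

class FooProxy {
public:
    Foo *self;
    bool thisown;
    FooProxy() : self(new_Foo()), thisown(true) {}   // newly created object
    // Wrap an existing object without taking ownership.
    explicit FooProxy(Foo *existing) : self(existing), thisown(false) {}
    ~FooProxy() { if (thisown) delete_Foo(self); }
    FooProxy(const FooProxy &) = delete;
    FooProxy &operator=(const FooProxy &) = delete;
    void disown() { thisown = false; }
    void acquire() { thisown = true; }
};

struct Spam {
    Foo *value = nullptr;
};

// Hypothetical helper: storing a proxy's object in Spam makes the proxy give up ownership.
void set_value(Spam *s, FooProxy *v) {
    s->value = v->self;
    v->disown();
}
```

After set_value(), destroying either proxy leaves the Foo alive, so the pointer held by Spam stays valid. Something else must then delete the object eventually, or a proxy must call acquire() to take it back.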
Language specific details on proxy classes are contained in the chapters describing each target language. This chapter has merely introduced the topic in a very general way.
The following code shows a SWIG interface file for a simple C++ class.
%module list
%{
#include "list.h"
%}
// Very simple C++ example for linked list
class List {
public:
List();
~List();
int search(char *value);
void insert(char *);
void remove(char *);
char *get(int n);
int length;
static void print(List *l);
};
To generate wrappers for this class, SWIG first reduces the class to a collection of low-level C-style accessor functions which are then used by the proxy classes.
C++ constructors and destructors are translated into accessor functions such as the following :
List * new_List(void) {
return new List;
}
void delete_List(List *l) {
delete l;
}
Following the C++ rules for implicit constructors and destructors, SWIG will automatically assume that they exist even when they are not explicitly declared in the class interface.
In general then:
And as in C++, a few rules alter the previous behavior:
SWIG should never generate a default constructor, copy constructor or default destructor wrapper for a class in which it is illegal to do so. In some cases, however, it could be necessary (if the complete class declaration is not visible from SWIG, and one of the above rules is violated) or desired (to reduce the size of the final interface) to manually disable the implicit constructor/destructor generation.
To manually disable these, the %nodefaultctor and %nodefaultdtor feature flag directives can be used. Note that these directives only affect the implicit generation, and they have no effect if the default/copy constructors or destructor are explicitly declared in the class interface.
For example:
%nodefaultctor Foo; // Disable the default constructor for class Foo.
class Foo { // No default constructor is generated, unless one is declared
...
};
class Bar { // A default constructor is generated, if possible
...
};
The directive %nodefaultctor can also be applied "globally", as in:
%nodefaultctor; // Disable creation of default constructors
class Foo { // No default constructor is generated, unless one is declared
...
};
class Bar {
public:
Bar(); // The default constructor is generated, since one is declared
};
%clearnodefaultctor; // Enable the creation of default constructors again
The corresponding %nodefaultdtor directive can be used to disable the generation of the default or implicit destructor, if needed. Be aware, however, that this could lead to memory leaks in the target language. Hence, it is recommended to use this directive only in well known cases. For example:
%nodefaultdtor Foo; // Disable the implicit/default destructor for class Foo.
class Foo { // No destructor is generated, unless one is declared
...
};
Compatibility Note: The generation of default constructors/implicit destructors was made the default behavior in SWIG 1.3.7. This may break certain older modules, but the old behavior can be easily restored using %nodefault or the -nodefault command line option. Furthermore, in order for SWIG to properly generate (or not generate) default constructors, it must be able to gather information from both the private and protected sections (specifically, it needs to know if a private or protected constructor/destructor is defined). In older versions of SWIG, it was fairly common to simply remove or comment out the private and protected sections of a class due to parser limitations. However, this removal may now cause SWIG to erroneously generate constructors for classes that define a constructor in those sections. Consider restoring those sections in the interface or using %nodefault to fix the problem.
Note: The %nodefault directive/-nodefault options described above, which disable both the default constructor and the implicit destructors, could lead to memory leaks, and so it is strongly recommended to not use them.
If a class defines a constructor, SWIG normally tries to generate a wrapper for it. However, SWIG will not generate a constructor wrapper if it thinks that it will result in illegal wrapper code. There are really two cases where this might show up.
First, SWIG won't generate wrappers for protected or private constructors. For example:
class Foo {
protected:
Foo(); // Not wrapped.
public:
...
};
Next, SWIG won't generate wrappers for a class if it appears to be abstract--that is, it has undefined pure virtual methods. Here are some examples:
class Bar {
public:
Bar(); // Not wrapped. Bar is abstract.
virtual void spam(void) = 0;
};
class Grok : public Bar {
public:
Grok(); // Not wrapped. No implementation of abstract spam().
};
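The abstractness rule mirrors what the C++ compiler itself enforces, which can be checked with the standard std::is_abstract type trait. The sketch below reuses Bar and Grok from the example above, adds a virtual destructor to Bar, and introduces a hypothetical concrete subclass Eggs for contrast. It illustrates the C++ rule only, not how SWIG detects abstract classes internally.

```cpp
#include <cassert>
#include <type_traits>

class Bar {
public:
    Bar() {}
    virtual ~Bar() {}
    virtual void spam(void) = 0;   // pure virtual: Bar is abstract
};

class Grok : public Bar {
public:
    Grok() {}                      // spam() still unimplemented
};

class Eggs : public Bar {          // hypothetical concrete subclass
public:
    Eggs() {}
    void spam(void) override {}
};

static_assert(std::is_abstract<Bar>::value, "Bar has a pure virtual method");
static_assert(std::is_abstract<Grok>::value, "Grok inherits it unimplemented");
static_assert(!std::is_abstract<Eggs>::value, "Eggs overrides spam()");

// A constructor wrapper is only legal for the concrete class.
Bar *new_Eggs() { return new Eggs(); }
```

Writing new Grok() here would be rejected by the compiler, which is why a generated constructor wrapper for Grok would be illegal code.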
Some users are surprised (or confused) to find missing constructor wrappers in their interfaces. In almost all cases, this is caused when classes are determined to be abstract. To see if this is the case, run SWIG with all of its warnings turned on:
% swig -Wall -python module.i
In this mode, SWIG will issue a warning for all abstract classes. It is possible to force a class to be non-abstract using this:
%feature("notabstract") Foo;
class Foo : public Bar {
public:
Foo(); // Generated no matter what---not abstract.
...
};
More information about %feature can be found in the Customization features chapter.
If a class defines more than one constructor, its behavior depends on the capabilities of the target language. If overloading is supported, the copy constructor is accessible using the normal constructor function. For example, if you have this:
class List {
public:
List();
List(const List &); // Copy constructor
...
};
then the copy constructor can be used as follows:
x = List() # Create a list
y = List(x) # Copy list x
If the target language does not support overloading, then the copy constructor is available through a special function like this:
List *copy_List(List *f) {
return new List(*f);
}
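As a quick check of what the flattened copy function does, the following C++ sketch uses a deliberately tiny List with a single length member. That stand-in class is an assumption for illustration, not the linked-list class from earlier in the chapter.

```cpp
#include <cassert>

// Minimal stand-in for List; only what the example needs.
class List {
public:
    List() : length(0) {}
    List(const List &other) : length(other.length) {}  // Copy constructor
    int length;
};

List *new_List() { return new List(); }

// Flattened copy constructor: allocates an independent copy of *f.
List *copy_List(List *f) {
    assert(f != nullptr);
    return new List(*f);
}

void delete_List(List *l) { delete l; }
```

The returned object is a separate allocation, so it must be destroyed separately from the original.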
Note: For a class X, SWIG only treats a constructor as a copy constructor if it can be applied to an object of type X or X *. If more than one copy constructor is defined, only the first definition that appears is used as the copy constructor--other definitions will result in a name-clash. Constructors such as X(const X &), X(X &), and X(X *) are handled as copy constructors in SWIG.
Note: SWIG does not generate a copy constructor wrapper unless one is explicitly declared in the class. This differs from the treatment of default constructors and destructors. However, copy constructor wrappers can be generated if using the copyctor feature flag. For example:
%copyctor List;
class List {
public:
List();
};
This will generate a copy constructor wrapper for List.
Compatibility note: Special support for copy constructors was not added until SWIG-1.3.12. In previous versions, copy constructors could be wrapped, but they had to be renamed. For example:
class Foo {
public:
Foo();
%name(CopyFoo) Foo(const Foo &);
...
};
For backwards compatibility, SWIG does not perform any special copy-constructor handling if the constructor has been manually renamed. For instance, in the above example, the name of the constructor is set to new_CopyFoo(). This is the same as in older versions.
All member functions are roughly translated into accessor functions like this :
int List_search(List *obj, char *value) {
return obj->search(value);
}
This translation is the same even if the member function has been declared as virtual.
It should be noted that SWIG does not actually create a C accessor function in the code it generates. Instead, member access such as obj->search(value) is directly inlined into the generated wrapper functions. However, the name and calling convention of the low-level procedural wrappers match the accessor function prototype described above.
Static member functions are called directly without making any special transformations. For example, the static member function print(List *l) directly invokes List::print(List *l) in the generated wrapper code.
Member data is handled in exactly the same manner as for C structures. A pair of accessor functions are effectively created. For example :
int List_length_get(List *obj) {
return obj->length;
}
int List_length_set(List *obj, int value) {
obj->length = value;
return value;
}
A read-only member can be created using the %immutable and %mutable feature flag directives. For example, we probably wouldn't want the user to change the length of a list so we could do the following to make the value available, but read-only.
class List {
public:
...
%immutable;
int length;
%mutable;
...
};
Alternatively, you can specify an immutable member in advance like this:
%immutable List::length;
...
class List {
...
int length; // Immutable by above directive
...
};
Similarly, all data attributes declared as const are wrapped as read-only members.
There are some subtle issues when wrapping data members that are themselves classes. For instance, if you had another class like this,
class Foo {
public:
List items;
...
};
then the low-level accessor to the items member actually uses pointers. For example:
List *Foo_items_get(Foo *self) {
return &self->items;
}
void Foo_items_set(Foo *self, List *value) {
self->items = *value;
}
More information about this can be found in the SWIG Basics chapter, Structure data members section.
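The practical consequence of pointer-based accessors is that the getter exposes the member in place, while the setter copies into it. A minimal C++ sketch, assuming a stand-in List with just a length member:

```cpp
#include <cassert>

// Minimal stand-in for List; only what the example needs.
class List {
public:
    List() : length(0) {}
    int length;
};

class Foo {
public:
    List items;
};

// Getter: returns the address of the embedded member, not a copy.
List *Foo_items_get(Foo *self) {
    return &self->items;
}

// Setter: copies the pointed-to value into the embedded member.
void Foo_items_set(Foo *self, List *value) {
    assert(value != nullptr);
    self->items = *value;
}
```

A change made through the pointer returned by Foo_items_get() is therefore visible in the containing Foo. The object passed to Foo_items_set() stays independent after the call.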
The wrapper code to generate the accessors for classes comes from the pointer typemaps. This can be somewhat unnatural for some types. For example, a user would expect the STL std::string class member variables to be wrapped as a string in the target language, rather than a pointer to this class. The const reference typemaps offer this type of marshalling, so there is a feature to tell SWIG to use the const reference typemaps rather than the pointer typemaps. It is the %naturalvar directive and is used as follows:
// All List variables will use const List& typemaps
%naturalvar List;
// Only Foo::myList will use const List& typemaps
%naturalvar Foo::myList;
struct Foo {
List myList;
};
// All variables will use const reference typemaps
%naturalvar;
The observant reader will notice that %naturalvar works like any other feature flag directive, except it can also be attached to class types. The first of the example usages above shows %naturalvar attaching to the List class. Effectively this feature changes the way accessors are generated to the following:
const List &Foo_items_get(Foo *self) {
return self->items;
}
void Foo_items_set(Foo *self, const List &value) {
self->items = value;
}
In fact it is generally a good idea to use this feature globally as the reference typemaps have extra NULL checking compared to the pointer typemaps. A pointer can be NULL, whereas a reference cannot, so the extra checking ensures that the target language user does not pass in a value that translates to a NULL pointer, thereby preventing any potential NULL pointer dereferences. The %naturalvar feature will apply to global variables in addition to member variables in some language modules, eg C# and Java.
Other alternatives for turning this feature on globally are to use the swig -naturalvar commandline option or the module mode option, %module(naturalvar=1).
Compatibility note: The %naturalvar feature was introduced in SWIG-1.3.28, prior to which it was necessary to manually apply the const reference typemaps, eg %apply const std::string & { std::string * }, but this example would also apply the typemaps to methods taking a std::string pointer.
Compatibility note: Read-only access used to be controlled by a pair of directives %readonly and %readwrite. Although these directives still work, they generate a warning message. Simply change the directives to %immutable; and %mutable; to silence the warning. Don't forget the extra semicolon!
Compatibility note: Prior to SWIG-1.3.12, all members of unknown type were wrapped into accessor functions using pointers. For example, if you had a structure like this
struct Foo {
size_t len;
};
and nothing was known about size_t, then accessors would be written to work with size_t *. Starting in SWIG-1.3.12, this behavior has been modified. Specifically, pointers will only be used if SWIG knows that a datatype corresponds to a structure or class. Therefore, the above code would be wrapped into accessors involving size_t. This change is subtle, but it smooths over a few problems related to structure wrapping and some of SWIG's customization features.
SWIG will wrap all types of functions that have default arguments. For example member functions:
class Foo {
public:
void bar(int x, int y = 3, int z = 4);
};
SWIG handles default arguments by generating an extra overloaded method for each defaulted argument. SWIG is effectively handling methods with default arguments as if it was wrapping the equivalent overloaded methods. Thus for the example above, it is as if we had instead given the following to SWIG:
class Foo {
public:
void bar(int x, int y, int z);
void bar(int x, int y);
void bar(int x);
};
The wrappers produced are exactly the same as if the above code was instead fed into SWIG. Details of this are covered later in the Wrapping Overloaded Functions and Methods section. This approach allows SWIG to wrap all possible default arguments, but can be verbose. For example if a method has ten default arguments, then eleven wrapper methods are generated.
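The equivalence can be checked in plain C++. The following is a minimal sketch, not SWIG output: FooDefaults uses default arguments, while FooOverloads (a hypothetical name) spells out the three overloads listed above. Here bar() returns a value, unlike the void method in the example, purely so that the results can be compared.

```cpp
// Sketch only: FooDefaults and FooOverloads are illustrative names.
class FooDefaults {
public:
  // One declaration with two defaulted arguments.
  int bar(int x, int y = 3, int z = 4) { return x * 100 + y * 10 + z; }
};

class FooOverloads {
public:
  // The equivalent overload set: one method per defaulted argument, plus one.
  int bar(int x, int y, int z) { return x * 100 + y * 10 + z; }
  int bar(int x, int y) { return bar(x, y, 4); }  // z takes its default
  int bar(int x) { return bar(x, 3, 4); }         // y and z take their defaults
};
```

Both classes accept exactly the same calls, which is why the generated wrappers can treat the two forms identically.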
Please see the Features and default arguments section for more information on using %feature with functions with default arguments. The Ambiguity resolution and renaming section also deals with using %rename and %ignore on methods with default arguments. If you are writing your own typemaps for types used in methods with default arguments, you may also need to write a typecheck typemap. See the Typemaps and overloading section for details or otherwise use the compactdefaultargs feature flag as mentioned below.
Compatibility note: Versions of SWIG prior to SWIG-1.3.23 wrapped default arguments slightly differently. Instead, a single wrapper method was generated and the default values were copied into the C++ wrappers so that the method being wrapped was then called with all the arguments specified. If the size of the wrappers is a concern, then this approach to wrapping methods with default arguments can be re-activated by using the compactdefaultargs feature flag.
%feature("compactdefaultargs") Foo::bar;
class Foo {
public:
void bar(int x, int y = 3, int z = 4);
};
This is great for reducing the size of the wrappers, but the caveat is that it does not work for the statically typed languages, such as C# and Java, which don't have optional arguments in the language. Another restriction of this feature is that it cannot handle default arguments that are not public. The following example illustrates this:
class Foo {
private:
static const int spam;
public:
void bar(int x, int y = spam); // Won't work with %feature("compactdefaultargs") -
// private default value
};
This produces uncompileable wrapper code because default values in C++ are evaluated in the same scope as the member function whereas SWIG evaluates them in the scope of a wrapper function (meaning that the values have to be public).
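The scope problem can be seen in a hand-written stand-in for a compact wrapper. This is a sketch under assumptions, not actual SWIG output: the name wrap_Foo_bar, the argc/argv calling convention, the value 7 for spam and the int return type are all invented for illustration. Here spam is public so that the code compiles; with spam private, the line that copies the default into the wrapper would be rejected by the compiler.

```cpp
class Foo {
public:
  static const int spam = 7;  // public here; the value 7 is an assumption
  int bar(int x, int y = spam) { return x + y; }
};

// Hypothetical single "compact" wrapper. argc/argv model the arguments
// received from the target language.
int wrap_Foo_bar(Foo *self, int argc, const int *argv) {
  int arg1 = argv[0];
  int arg2 = Foo::spam;  // default value evaluated in the wrapper's scope
  if (argc > 1) arg2 = argv[1];
  return self->bar(arg1, arg2);  // always called with all arguments
}
```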
This feature is automatically turned on when wrapping C code with default arguments and whenever keyword arguments (kwargs) are specified for either C or C++ code. Keyword arguments are a language feature of some scripting languages, for example Ruby and Python. SWIG is unable to support kwargs when wrapping overloaded methods, so the default approach cannot be used.
SWIG wraps class members that are public following the C++ conventions, i.e., by explicit public declaration or by the use of the using directive. In general, anything specified in a private or protected section will be ignored, although the internal code generator sometimes looks at the contents of the private and protected sections so that it can properly generate code for default constructors and destructors. Directors could also modify the way non-public virtual protected members are treated.
By default, members of a class definition are assumed to be private until you explicitly give a public: declaration (this is the same convention used by C++).
Enumerations and constants are handled differently by the different language modules and are described in detail in the appropriate language chapter. However, many languages map enums and constants in a class definition into constants with the classname as a prefix. For example:
class Swig {
public:
enum {ALE, LAGER, PORTER, STOUT};
};
Generates the following set of constants in the target scripting language:
Swig_ALE = Swig::ALE
Swig_LAGER = Swig::LAGER
Swig_PORTER = Swig::PORTER
Swig_STOUT = Swig::STOUT
Members declared as const are wrapped as read-only members and do not create constants.
Friend declarations are recognised by SWIG. For example, if you have this code:
class Foo {
public:
...
friend void blah(Foo *f);
...
};
then the friend declaration results in wrapper code equivalent to that generated for the following declaration
class Foo {
public:
...
};
void blah(Foo *f);
A friend declaration, as in C++, is understood to be in the same scope where the class is declared, hence, you can have
%ignore bar::blah(Foo *f);
namespace bar {
class Foo {
public:
...
friend void blah(Foo *f);
...
};
}
and a wrapper for the method 'blah' will not be generated.
C++ references are supported, but SWIG transforms them back into pointers. For example, a declaration like this:
class Foo {
public:
double bar(double &a);
};
has a low-level accessor
double Foo_bar(Foo *obj, double *a) {
return obj->bar(*a);
}
As a special case, most language modules pass const references to primitive datatypes (int, short, float, etc.) by value instead of pointers. For example, if you have a function like this,
void foo(const int &x);
it is called from a script as follows:
foo(3) # Notice pass by value
Functions that return a reference are remapped to return a pointer instead. For example:
class Bar {
public:
Foo &spam();
};
Generates an accessor like this:
Foo *Bar_spam(Bar *obj) {
Foo &result = obj->spam();
return &result;
}
However, functions that return const references to primitive datatypes (int, short, etc.) normally return the result as a value rather than a pointer. For example, a function like this,
const int &bar();
will return integers such as 37 or 42 in the target scripting language rather than a pointer to an integer.
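The three cases above can be put side by side in compilable form. This is only a sketch: the bodies of Foo::bar and Bar::spam, the scale member, and the twice/wrap_twice pair are assumptions made so that the accessors have something to call.

```cpp
class Foo {
public:
  double scale;
  Foo() : scale(2.0) {}
  // Takes its argument by reference and modifies it.
  double bar(double &a) { a *= scale; return a; }
};

class Bar {
  Foo foo_;
public:
  Foo &spam() { return foo_; }  // returns a reference to a member
};

// Reference parameter becomes a pointer parameter.
double Foo_bar(Foo *obj, double *a) { return obj->bar(*a); }

// Reference result becomes a pointer result; no copy of the Foo is made.
Foo *Bar_spam(Bar *obj) {
  Foo &result = obj->spam();
  return &result;
}

// A const reference to a primitive is simply passed by value.
int twice(const int &x) { return 2 * x; }
int wrap_twice(int x) { return twice(x); }
```

Because Bar_spam returns the address of the original object, the object must outlive the pointer handed to the target language.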
Don't return references to objects allocated as local variables on the stack. SWIG doesn't make a copy of the objects so this will probably cause your program to crash.
Note: The special treatment for references to primitive datatypes is necessary to provide more seamless integration with more advanced C++ wrapping applications---especially related to templates and the STL. This was first added in SWIG-1.3.12.
Occasionally, a C++ program will pass and return class objects by value. For example, a function like this might appear:
Vector cross_product(Vector a, Vector b);
If no information is supplied about Vector, SWIG creates a wrapper function similar to the following:
Vector *wrap_cross_product(Vector *a, Vector *b) {
Vector x = *a;
Vector y = *b;
Vector r = cross_product(x,y);
return new Vector(r);
}
In order for the wrapper code to compile, Vector must define a copy constructor and a default constructor.
If Vector is defined as a class in the interface, but it does not support a default constructor, SWIG changes the wrapper code by encapsulating the arguments inside a special C++ template wrapper class, through a process called the "Fulton Transform". This produces a wrapper that looks like this:
Vector *wrap_cross_product(Vector *a, Vector *b) {
SwigValueWrapper<Vector> x = *a;
SwigValueWrapper<Vector> y = *b;
SwigValueWrapper<Vector> r = cross_product(x,y);
return new Vector(r);
}
This transformation is a little sneaky, but it provides support for pass-by-value even when a class does not provide a default constructor and it makes it possible to properly support a number of SWIG's customization options. The definition of SwigValueWrapper can be found by reading the SWIG wrapper code. This class is really nothing more than a thin wrapper around a pointer.
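To make the idea of "a thin wrapper around a pointer" concrete, here is a simplified stand-in. ValueHolder is a hypothetical class written for this sketch; it is not the definition of SwigValueWrapper, which should be read from the generated wrapper code. The point it illustrates is that the holder itself can be default-constructed even though Vector cannot.

```cpp
// Hypothetical, simplified stand-in for a value wrapper (not SWIG's own class).
template <typename T>
class ValueHolder {
  T *ptr_;
public:
  ValueHolder() : ptr_(nullptr) {}  // no T is constructed here
  ~ValueHolder() { delete ptr_; }
  ValueHolder(const ValueHolder &) = delete;
  ValueHolder &operator=(const ValueHolder &) = delete;
  // Assigning a T stores a heap copy made with T's copy constructor.
  ValueHolder &operator=(const T &t) {
    T *copy = new T(t);
    delete ptr_;
    ptr_ = copy;
    return *this;
  }
  // Lets the holder be used wherever a T is expected.
  operator T &() const { return *ptr_; }
};

// A class with a copy constructor but no default constructor.
struct Vector {
  double x, y, z;
  Vector(double x_, double y_, double z_) : x(x_), y(y_), z(z_) {}
};

Vector cross_product(Vector a, Vector b) {
  return Vector(a.y * b.z - a.z * b.y,
                a.z * b.x - a.x * b.z,
                a.x * b.y - a.y * b.x);
}

Vector *wrap_cross_product(Vector *a, Vector *b) {
  ValueHolder<Vector> x, y, r;  // would not compile as plain "Vector x, y, r;"
  x = *a;
  y = *b;
  r = cross_product(x, y);
  return new Vector(static_cast<Vector &>(r));
}
```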
Although SWIG usually detects the classes to which the Fulton Transform should be applied, in some situations it's necessary to override it. That's done with %feature("valuewrapper") to ensure it is used and %feature("novaluewrapper") to ensure it is not used:
%feature("novaluewrapper") A;
class A;
%feature("valuewrapper") B;
struct B {
B();
// ....
};
It is well worth considering turning this feature on for classes that do have a default constructor. It will remove a redundant constructor call at the point of the variable declaration in the wrapper, so will generate notably better performance for large objects or for classes with expensive construction. Alternatively consider returning a reference or a pointer.
Note: this transformation has no effect on typemaps or any other part of SWIG---it should be transparent except that you may see this code when reading the SWIG output file.
Note: This template transformation is new in SWIG-1.3.11 and may be refined in future SWIG releases. In practice, it is only absolutely necessary to do this for classes that don't define a default constructor.
Note: The use of this template only occurs when objects are passed or returned by value. It is not used for C++ pointers or references.
SWIG supports C++ inheritance of classes and allows both single and multiple inheritance, as limited or allowed by the target language. The SWIG type-checker knows about the relationship between base and derived classes and allows pointers to any object of a derived class to be used in functions of a base class. The type-checker properly casts pointer values and is safe to use with multiple inheritance.
SWIG treats private or protected inheritance as close to the C++ spirit, and target language capabilities, as possible. In most cases, this means that SWIG will parse the non-public inheritance declarations, but that will have no effect in the generated code, besides the implicit policies derived for constructors and destructors.
The following example shows how SWIG handles inheritance. For clarity, the full C++ code has been omitted.
// shapes.i
%module shapes
%{
#include "shapes.h"
%}
class Shape {
public:
double x,y;
virtual double area() = 0;
virtual double perimeter() = 0;
void set_location(double x, double y);
};
class Circle : public Shape {
public:
Circle(double radius);
~Circle();
double area();
double perimeter();
};
class Square : public Shape {
public:
Square(double size);
~Square();
double area();
double perimeter();
};
When wrapped into Python, we can perform the following operations (shown using the low level Python accessors):
$ python
>>> import shapes
>>> circle = shapes.new_Circle(7)
>>> square = shapes.new_Square(10)
>>> print shapes.Circle_area(circle)
153.93804004599999757
>>> print shapes.Shape_area(circle)
153.93804004599999757
>>> print shapes.Shape_area(square)
100.00000000000000000
>>> shapes.Shape_set_location(square,2,-3)
>>> print shapes.Shape_perimeter(square)
40.00000000000000000
>>>
In this example, Circle and Square objects have been created. Member functions can be invoked on each object by making calls to Circle_area, Square_area, and so on. However, the same results can be accomplished by simply using the Shape_area function on either object.
One important point concerning inheritance is that the low-level accessor functions are only generated for classes in which they are actually declared. For instance, in the above example, the method set_location() is only accessible as Shape_set_location() and not as Circle_set_location() or Square_set_location(). Of course, the Shape_set_location() function will accept any kind of object derived from Shape. Similarly, accessor functions for the attributes x and y are generated as Shape_x_get(), Shape_x_set(), Shape_y_get(), and Shape_y_set(). Functions such as Circle_x_get() are not available--instead you should use Shape_x_get().
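A hand-written approximation of such accessors shows why one set per declaring class is enough. This is a sketch, not generated code: the method bodies, the size_ member and the virtual destructor are assumptions, and only a subset of the accessors is shown (Circle and the Square-specific accessors are omitted for brevity).

```cpp
class Shape {
public:
  double x, y;
  Shape() : x(0), y(0) {}
  virtual ~Shape() {}  // added for safe deletion; not in the example above
  virtual double area() = 0;
  virtual double perimeter() = 0;
  void set_location(double x_, double y_) { x = x_; y = y_; }
};

class Square : public Shape {
  double size_;
public:
  Square(double size) : size_(size) {}
  double area() { return size_ * size_; }
  double perimeter() { return 4 * size_; }
};

// Constructor and destructor accessors for the concrete class.
Square *new_Square(double size) { return new Square(size); }
void delete_Square(Square *self) { delete self; }

// Accessors named after the class that declares each member. They take a
// Shape *, so any derived object can be passed; virtual dispatch does the rest.
double Shape_area(Shape *self) { return self->area(); }
double Shape_perimeter(Shape *self) { return self->perimeter(); }
void Shape_set_location(Shape *self, double x, double y) { self->set_location(x, y); }
double Shape_x_get(Shape *self) { return self->x; }
double Shape_y_get(Shape *self) { return self->y; }
```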
Note that there is a one to one correlation between the low-level accessor functions and the proxy methods and therefore there is also a one to one correlation between the C++ class methods and the generated proxy class methods.
Note: For the best results, SWIG requires all base classes to be defined in an interface. Otherwise, you may get a warning message like this:
example.i:18: Warning(401): Nothing known about base class 'Foo'. Ignored.
If any base class is undefined, SWIG still generates correct type relationships. For instance, a function accepting a Foo * will accept any object derived from Foo regardless of whether or not SWIG actually wrapped the Foo class. If you really don't want to generate wrappers for the base class, but you want to silence the warning, you might consider using the %import directive to include the file that defines Foo. %import simply gathers type information, but doesn't generate wrappers. Alternatively, you could just define Foo as an empty class in the SWIG interface or use warning suppression.
Note: typedef-names can be used as base classes. For example:
class Foo {
...
};
typedef Foo FooObj;
class Bar : public FooObj { // Ok. Base class is Foo
...
};
Similarly, typedef allows unnamed structures to be used as base classes. For example:
typedef struct {
...
} Foo;
class Bar : public Foo { // Ok.
...
};
Compatibility Note: Starting in version 1.3.7, SWIG only generates low-level accessor wrappers for the declarations that are actually defined in each class. This differs from SWIG1.1 which used to inherit all of the declarations defined in base classes and regenerate specialized accessor functions such as Circle_x_get(), Square_x_get(), Circle_set_location(), and Square_set_location(). This behavior resulted in huge amounts of replicated code for large class hierarchies and made it awkward to build applications spread across multiple modules (since accessor functions are duplicated in every single module). It is also unnecessary to have such wrappers when advanced features like proxy classes are used. Note: Further optimizations are enabled when using the -fvirtual option, which avoids the regenerating of wrapper functions for virtual members that are already defined in a base class.
When a target scripting language refers to a C++ object, it normally uses a tagged pointer object that contains both the value of the pointer and a type string. For example, in Tcl, a C++ pointer might be encoded as a string like this:
_808fea88_p_Circle
A somewhat common question is whether or not the type-tag could be safely removed from the pointer. For instance, to get better performance, could you strip all type tags and just use simple integers instead?
In general, the answer to this question is no. In the wrappers, all pointers are converted into a common data representation in the target language. Typically this is the equivalent of casting a pointer to void *. This means that any C++ type information associated with the pointer is lost in the conversion.
The problem with losing type information is that it is needed to properly support many advanced C++ features--especially multiple inheritance. For example, suppose you had code like this:
class A {
public:
int x;
};
class B {
public:
int y;
};
class C : public A, public B {
};
int A_function(A *a) {
return a->x;
}
int B_function(B *b) {
return b->y;
}
Now, consider the following code that uses void *.
C *c = new C();
void *p = (void *) c;
...
int x = A_function((A *) p);
int y = B_function((B *) p);
In this code, both A_function() and B_function() may legally accept an object of type C * (via inheritance). However, one of the functions will always return the wrong result when used as shown. The reason for this is that even though p points to an object of type C, the casting operation doesn't work like you would expect. Internally, this has to do with the data representation of C. With multiple inheritance, the data from each base class is stacked together. For example:
------------ <--- (C *), (A *)
| A |
|------------| <--- (B *)
| B |
------------
Because of this stacking, a pointer of type C * may change value when it is converted to a A * or B *. However, this adjustment does not occur if you are converting from a void *.
The use of type tags marks all pointers with the real type of the underlying object. This extra information is then used by SWIG generated wrappers to correctly cast pointer values under inheritance (avoiding the above problem).
Some of the language modules are able to solve the problem by storing multiple instances of the pointer, for example, A *, in the A proxy class as well as C * in the C proxy class. The correct cast can then be made by choosing the correct void * pointer to use and is guaranteed to work as the cast to a void pointer and back to the same type does not lose any type information:
C *c = new C();
void *p = (void *) c;
void *pA = (void *) (A *) c;
void *pB = (void *) (B *) c;
...
int x = A_function((A *) pA);
int y = B_function((B *) pB);
In practice, the pointer is held as an integral number in the target language proxy class.
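The per-type storage scheme can be sketched as follows. CHandles, make_handles, read_x and read_y are hypothetical names used only for this illustration; a real language module keeps the pointers inside its proxy classes. The key step is converting to the exact base type before converting to void *, so that each void * is later cast back to the same type it came from.

```cpp
class A { public: int x; };
class B { public: int y; };
class C : public A, public B {};

int A_function(A *a) { return a->x; }
int B_function(B *b) { return b->y; }

// Hypothetical proxy-side storage: one void * per type.
struct CHandles {
  void *pC;
  void *pA;
  void *pB;
};

CHandles make_handles(C *c) {
  CHandles h;
  h.pC = static_cast<void *>(c);
  h.pA = static_cast<void *>(static_cast<A *>(c));  // adjusted for the A part
  h.pB = static_cast<void *>(static_cast<B *>(c));  // adjusted for the B part
  return h;
}

// Each void * is cast back to exactly the type it was created from.
int read_x(const CHandles &h) { return A_function(static_cast<A *>(h.pA)); }
int read_y(const CHandles &h) { return B_function(static_cast<B *>(h.pB)); }
```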
In many language modules, SWIG provides partial support for overloaded functions, methods, and constructors. For example, if you supply SWIG with overloaded functions like this:
void foo(int x) {
printf("x is %d\n", x);
}
void foo(char *x) {
printf("x is '%s'\n", x);
}
The function is used in a completely natural way. For example:
>>> foo(3)
x is 3
>>> foo("hello")
x is 'hello'
>>>
Overloading works in a similar manner for methods and constructors. For example if you have this code,
class Foo {
public:
Foo();
Foo(const Foo &); // Copy constructor
void bar(int x);
void bar(char *s, int y);
};
it might be used like this
>>> f = Foo() # Create a Foo
>>> f.bar(3)
>>> g = Foo(f) # Copy Foo
>>> f.bar("hello",2)
The implementation of overloaded functions and methods is somewhat complicated due to the dynamic nature of scripting languages. Unlike C++, which binds overloaded methods at compile time, SWIG must determine the proper function as a runtime check for scripting language targets. This check is further complicated by the typeless nature of certain scripting languages. For instance, in Tcl, all types are simply strings. Therefore, if you have two overloaded functions like this,
void foo(char *x);
void foo(int x);
the order in which the arguments are checked plays a rather critical role.
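A small model shows why the checking order matters when every value arrives as a string. The functions below are assumptions made for this sketch (they are not part of SWIG): looks_like_int and looks_like_string stand in for type checks, and dispatch_foo reports which overload a dispatcher would select.

```cpp
#include <cstdlib>
#include <string>

// True if the whole string parses as a base-10 integer.
bool looks_like_int(const std::string &s) {
  if (s.empty()) return false;
  char *end = 0;
  std::strtol(s.c_str(), &end, 10);
  return *end == '\0';
}

// Any string is acceptable where a char * is expected.
bool looks_like_string(const std::string &) { return true; }

// Returns the overload selected for a string-typed argument.
std::string dispatch_foo(const std::string &arg, bool check_int_first) {
  if (check_int_first) {
    if (looks_like_int(arg)) return "foo(int)";
    if (looks_like_string(arg)) return "foo(char *)";
  } else {
    if (looks_like_string(arg)) return "foo(char *)";
    if (looks_like_int(arg)) return "foo(int)";  // never reached
  }
  return "no match";
}
```

Checking the string overload first makes foo(int) unreachable, because the string check accepts everything.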
For statically typed languages, SWIG uses the language's method overloading mechanism. To implement overloading for the scripting languages, SWIG generates a dispatch function that checks the number of passed arguments and their types. To create this function, SWIG first examines all of the overloaded methods and ranks them according to the following rules:
1. Number of required arguments. Methods are sorted by increasing number of required arguments.
2. Argument type precedence. All C++ datatypes are assigned a numeric type precedence value (which is determined by the language module).
Type              Precedence
----------------  ----------
TYPE *            0     (High)
void *            20
Integers          40
Floating point    60
char              80
Strings           100   (Low)
Using these precedence values, overloaded methods with the same number of required arguments are sorted in increasing order of precedence values.
This may sound very confusing, but an example will help. Consider the following collection of overloaded methods:
void foo(double);
void foo(int);
void foo(Bar *);
void foo();
void foo(int x, int y, int z, int w);
void foo(int x, int y, int z = 3);
void foo(double x, double y);
void foo(double x, Bar *z);
The first rule simply ranks the functions by required argument count. This would produce the following list:
rank
-----
[0] foo()
[1] foo(double);
[2] foo(int);
[3] foo(Bar *);
[4] foo(int x, int y, int z = 3);
[5] foo(double x, double y)
[6] foo(double x, Bar *z)
[7] foo(int x, int y, int z, int w);
The second rule simply refines the ranking by looking at argument type precedence values.
rank
-----
[0] foo()
[1] foo(Bar *);
[2] foo(int);
[3] foo(double);
[4] foo(int x, int y, int z = 3);
[5] foo(double x, Bar *z)
[6] foo(double x, double y)
[7] foo(int x, int y, int z, int w);
Finally, to generate the dispatch function, the arguments passed to an overloaded method are simply checked in the same order as they appear in this ranking.
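The ranking in this example can be reproduced with an ordinary sort. The sketch below is an illustration, not SWIG's implementation: the Overload struct and the function names are hypothetical, and comparing the per-argument precedence values lexicographically is an assumption chosen because it yields the final ranking shown above.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct Overload {
  std::string signature;
  int required;                 // number of required (non-defaulted) arguments
  std::vector<int> precedence;  // one precedence value per parameter
};

// Required argument count first, then argument type precedence.
bool ranks_before(const Overload &a, const Overload &b) {
  if (a.required != b.required) return a.required < b.required;
  return a.precedence < b.precedence;  // lexicographic comparison (assumed)
}

std::vector<std::string> rank_overloads(std::vector<Overload> methods) {
  std::stable_sort(methods.begin(), methods.end(), ranks_before);
  std::vector<std::string> order;
  for (const Overload &m : methods) order.push_back(m.signature);
  return order;
}

// The overloads from the example, in declaration order, using the
// precedence values from the table (pointer 0, integer 40, floating point 60).
std::vector<Overload> example_overloads() {
  const int PTR = 0, INT = 40, DBL = 60;
  std::vector<Overload> m;
  m.push_back(Overload{"foo(double)", 1, {DBL}});
  m.push_back(Overload{"foo(int)", 1, {INT}});
  m.push_back(Overload{"foo(Bar *)", 1, {PTR}});
  m.push_back(Overload{"foo()", 0, {}});
  m.push_back(Overload{"foo(int x, int y, int z, int w)", 4, {INT, INT, INT, INT}});
  m.push_back(Overload{"foo(int x, int y, int z = 3)", 2, {INT, INT, INT}});
  m.push_back(Overload{"foo(double x, double y)", 2, {DBL, DBL}});
  m.push_back(Overload{"foo(double x, Bar *z)", 2, {DBL, PTR}});
  return m;
}
```

Calling rank_overloads(example_overloads()) returns the signatures in the order a dispatch function would test them.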
If you're still confused, don't worry about it---SWIG is probably doing the right thing.
Regrettably, SWIG is not able to support every possible use of valid C++ overloading. Consider the following example:
void foo(int x);
void foo(long x);
In C++, this is perfectly legal. However, in a scripting language, there is generally only one kind of integer object. Therefore, which one of these functions do you pick? Clearly, there is no way to truly make a distinction just by looking at the value of the integer itself ( int and long may even be the same precision). Therefore, when SWIG encounters this situation, it may generate a warning message like this for scripting languages:
example.i:4: Warning(509): Overloaded foo(long) is shadowed by foo(int) at example.i:3.
or for statically typed languages like Java:
example.i:4: Warning(516): Overloaded method foo(long) ignored. Method foo(int) at example.i:3 used.
This means that the second overloaded function will be inaccessible from a scripting interface or the method won't be wrapped at all. This is done as SWIG does not know how to disambiguate it from an earlier method.
Ambiguity problems are known to arise in the following situations:
When an ambiguity arises, methods are checked in the same order as they appear in the interface file. Therefore, earlier methods will shadow methods that appear later.
When wrapping an overloaded function, there is a chance that you will get an error message like this:
example.i:3: Warning(467): Overloaded foo(int) not supported (no type checking rule for 'int').
This error means that the target language module supports overloading, but for some reason there is no type-checking rule that can be used to generate a working dispatch function. The resulting behavior is then undefined. You should report this as a bug to the SWIG bug tracking database.
If you get an error message such as the following,
foo.i:6. Overloaded declaration ignored. Spam::foo(double )
foo.i:5. Previous declaration is Spam::foo(int )
foo.i:7. Overloaded declaration ignored. Spam::foo(Bar *,Spam *,int )
foo.i:5. Previous declaration is Spam::foo(int )
it means that the target language module has not yet implemented support for overloaded functions and methods. The only way to fix the problem is to read the next section.
If an ambiguity in overload resolution occurs or if a module doesn't allow overloading, there are a few strategies for dealing with the problem. First, you can tell SWIG to ignore one of the methods. This is easy---simply use the %ignore directive. For example:
%ignore foo(long);
void foo(int);
void foo(long); // Ignored. Oh well.
The other alternative is to rename one of the methods. This can be done using %rename. For example:
%rename("foo_short") foo(short);
%rename(foo_long) foo(long);
void foo(int);
void foo(short); // Accessed as foo_short()
void foo(long); // Accessed as foo_long()
Note that the quotes around the new name are optional. However, should the new name be a C/C++ keyword, they are essential in order to avoid a parsing error. The %ignore and %rename directives are both rather powerful in their ability to match declarations. When used in their simple form, they apply to both global functions and methods. For example:
/* Forward renaming declarations */
%rename(foo_i) foo(int);
%rename(foo_d) foo(double);
...
void foo(int); // Becomes 'foo_i'
void foo(char *c); // Stays 'foo' (not renamed)
class Spam {
public:
void foo(int); // Becomes 'foo_i'
void foo(double); // Becomes 'foo_d'
...
};
If you only want the renaming to apply to a certain scope, the C++ scope resolution operator (::) can be used. For example:
%rename(foo_i) ::foo(int); // Only rename foo(int) in the global scope.
// (will not rename class members)
%rename(foo_i) Spam::foo(int); // Only rename foo(int) in class Spam
When a renaming operator is applied to a class as in Spam::foo(int), it is applied to that class and all derived classes. This can be used to apply a consistent renaming across an entire class hierarchy with only a few declarations. For example:
%rename(foo_i) Spam::foo(int);
%rename(foo_d) Spam::foo(double);
class Spam {
public:
virtual void foo(int); // Renamed to foo_i
virtual void foo(double); // Renamed to foo_d
...
};
class Bar : public Spam {
public:
virtual void foo(int); // Renamed to foo_i
virtual void foo(double); // Renamed to foo_d
...
};
class Grok : public Bar {
public:
virtual void foo(int); // Renamed to foo_i
virtual void foo(double); // Renamed to foo_d
...
};
It is also possible to include %rename specifications in the class definition itself. For example:
class Spam {
%rename(foo_i) foo(int);
%rename(foo_d) foo(double);
public:
virtual void foo(int); // Renamed to foo_i
virtual void foo(double); // Renamed to foo_d
...
};
class Bar : public Spam {
public:
virtual void foo(int); // Renamed to foo_i
virtual void foo(double); // Renamed to foo_d
...
};
In this case, the %rename directives still get applied across the entire inheritance hierarchy, but it's no longer necessary to explicitly specify the class prefix Spam::.
A special form of %rename can be used to apply a renaming just to class members (of all classes):
%rename(foo_i) *::foo(int); // Only rename foo(int) if it appears in a class.
Note: the *:: syntax is non-standard C++, but the '*' is meant to be a wildcard that matches any class name (we couldn't think of a better alternative, so if you have a better idea, send email to the swig-devel mailing list).
Although this discussion has primarily focused on %rename, all of the same rules also apply to %ignore. For example:
%ignore foo(double); // Ignore all foo(double)
%ignore Spam::foo; // Ignore foo in class Spam
%ignore Spam::foo(double); // Ignore foo(double) in class Spam
%ignore *::foo(double); // Ignore foo(double) in all classes
When applied to a base class, %ignore forces all definitions in derived classes to disappear. For example, %ignore Spam::foo(double) will eliminate foo(double) in Spam and all classes derived from Spam.
Notes on %rename and %ignore:
Since the %rename declaration is used to declare a renaming in advance, it can be placed at the start of an interface file. This makes it possible to apply a consistent name resolution without having to modify header files. For example:
%module foo
/* Rename these overloaded functions */
%rename(foo_i) foo(int);
%rename(foo_d) foo(double);
%include "header.h"
The scope qualifier (::) can also be used on simple names. For example:
%rename(bar) ::foo; // Rename foo to bar in global scope only
%rename(bar) Spam::foo; // Rename foo to bar in class Spam only
%rename(bar) *::foo; // Rename foo in classes only
Name matching tries to find the most specific match that is defined. A qualified name such as Spam::foo always has higher precedence than an unqualified name foo. Spam::foo has higher precedence than *::foo and *::foo has higher precedence than foo. A parameterized name has higher precedence than an unparameterized name within the same scope level. However, an unparameterized name with a scope qualifier has higher precedence than a parameterized name in global scope (e.g., a renaming of Spam::foo takes precedence over a renaming of foo(int) ).
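One way to picture these precedence rules is as a score in which the scope level dominates and the presence of parameters only breaks ties. The model below is purely illustrative: RenameRule, specificity and pick_name are hypothetical names, it is not how SWIG stores or matches declarations, and treating *::foo as outranking a parameterized global name such as foo(int) is an assumption that goes slightly beyond what the text states.

```cpp
#include <string>
#include <vector>

// Scope levels for foo, *::foo and Spam::foo respectively.
enum Scope { UNQUALIFIED = 0, ANY_CLASS = 1, NAMED_CLASS = 2 };

// Hypothetical model of one %rename pattern that matches a declaration.
struct RenameRule {
  Scope scope;
  bool has_parameters;   // true for foo(int), false for foo
  std::string new_name;
};

// Scope dominates; parameters only break ties within one scope level.
int specificity(const RenameRule &r) {
  return 2 * r.scope + (r.has_parameters ? 1 : 0);
}

// Picks the most specific rule among those that match a declaration.
std::string pick_name(const std::vector<RenameRule> &matching) {
  const RenameRule *best = 0;
  for (const RenameRule &r : matching)
    if (!best || specificity(r) > specificity(*best)) best = &r;
  return best ? best->new_name : std::string();
}
```

With rules modeling foo(int), Spam::foo and foo all matching Spam::foo(int), the Spam::foo rule wins, in line with the example given in the text.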
The order in which %rename directives are defined does not matter as long as they appear before the declarations to be renamed. Thus, there is no difference between saying:
%rename(bar) foo;
%rename(foo_i) Spam::foo(int);
%rename(Foo) Spam::foo;
and this
%rename(Foo) Spam::foo;
%rename(bar) foo;
%rename(foo_i) Spam::foo(int);
(the declarations are not stored in a linked list and order has no importance). Of course, a repeated %rename directive will change the setting for a previous %rename directive if exactly the same name, scope, and parameters are supplied.
The name matching rules strictly follow member qualification rules. For example, if you have a class like this:
class Spam {
public:
...
void bar() const;
...
};
the declaration
%rename(name) Spam::bar();
will not apply as there is no unqualified member bar(). The following will apply as the qualifier matches correctly:
%rename(name) Spam::bar() const;
An often overlooked C++ feature is that classes can define two different overloaded members that differ only in their qualifiers, like this:
class Spam {
public:
...
void bar(); // Unqualified member
void bar() const; // Qualified member
...
};
%rename can then be used to target each of the overloaded methods individually. For example we can give them separate names in the target language:
%rename(name1) Spam::bar();
%rename(name2) Spam::bar() const;
Similarly, if you merely wanted to ignore one of the declarations, use %ignore with the full qualification. For example, the following directive would tell SWIG to ignore the const version of bar() above:
%ignore Spam::bar() const; // Ignore bar() const, but leave other bar() alone
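For readers unfamiliar with qualifier-based overloading, the following sketch shows the C++ behaviour that makes the two members distinct targets. The return values and the helper functions call_bar and call_bar_const are assumptions added so that the selected overload can be observed; the original methods return void.

```cpp
class Spam {
public:
  int bar() { return 1; }        // unqualified member
  int bar() const { return 2; }  // const-qualified member
};

// The overload is chosen by the constness of the object expression.
int call_bar(Spam &s) { return s.bar(); }              // selects bar()
int call_bar_const(const Spam &s) { return s.bar(); }  // selects bar() const
```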
Currently no resolution is performed in order to match function parameters. This means function parameter types must match exactly. For example, namespace qualifiers and typedefs will not work. The following usage of typedefs demonstrates this:
typedef int Integer;
%rename(foo_i) foo(int);
class Spam {
public:
void foo(Integer); // Stays 'foo' (not renamed)
};
class Ham {
public:
void foo(int); // Renamed to foo_i
};
The name matching rules also use default arguments for finer control when wrapping methods that have default arguments. Recall that methods with default arguments are wrapped as if the equivalent overloaded methods had been parsed (Default arguments section). Let's consider the following example class:
class Spam {
public:
...
void bar(int i=-1, double d=0.0);
...
};
The following %rename will match exactly and apply to all the target language overloaded methods because the declaration with the default arguments exactly matches the wrapped method:
%rename(newbar) Spam::bar(int i=-1, double d=0.0);
The C++ method can then be called from the target language with the new name no matter how many arguments are specified, for example: newbar(2, 2.0), newbar(2) or newbar(). However, if the %rename does not contain the default arguments, it will only apply to the single equivalent target language overloaded method. So if instead we have:
%rename(newbar) Spam::bar(int i, double d);
The C++ method must then be called from the target language with the new name newbar(2, 2.0) when both arguments are supplied or with the original name as bar(2) (one argument) or bar() (no arguments). In fact it is possible to use %rename on the equivalent overloaded methods, to rename all the equivalent overloaded methods:
%rename(bar_2args) Spam::bar(int i, double d);
%rename(bar_1arg) Spam::bar(int i);
%rename(bar_default) Spam::bar();
Similarly, the extra overloaded methods can be selectively ignored using %ignore.
Compatibility note: The %rename directive introduced the default argument matching rules in SWIG-1.3.23 at the same time as the changes to wrapping methods with default arguments were introduced.
Support for overloaded methods was first added in SWIG-1.3.14. The implementation is somewhat unusual when compared to similar tools. For instance, the order in which declarations appear is largely irrelevant in SWIG. Furthermore, SWIG does not rely upon trial execution or exception handling to figure out which method to invoke.
Internally, the overloading mechanism is completely configurable by the target language module. Therefore, the degree of overloading support may vary from language to language. As a general rule, statically typed languages like Java are able to provide more support than dynamically typed languages like Perl, Python, Ruby, and Tcl.
C++ overloaded operator declarations can be wrapped. For example, consider a class like this:
class Complex {
private:
double rpart, ipart;
public:
Complex(double r = 0, double i = 0) : rpart(r), ipart(i) { }
Complex(const Complex &c) : rpart(c.rpart), ipart(c.ipart) { }
Complex &operator=(const Complex &c) {
rpart = c.rpart;
ipart = c.ipart;
return *this;
}
Complex operator+(const Complex &c) const {
return Complex(rpart+c.rpart, ipart+c.ipart);
}
Complex operator-(const Complex &c) const {
return Complex(rpart-c.rpart, ipart-c.ipart);
}
Complex operator*(const Complex &c) const {
return Complex(rpart*c.rpart - ipart*c.ipart,
rpart*c.ipart + c.rpart*ipart);
}
Complex operator-() const {
return Complex(-rpart, -ipart);
}
double re() const { return rpart; }
double im() const { return ipart; }
};
When operator declarations appear, they are handled in exactly the same manner as regular methods. However, the names of these methods are set to strings like "operator +" or "operator -". The problem with these names is that they are illegal identifiers in most scripting languages. For instance, you can't just create a method called "operator +" in Python--there won't be any way to call it.
Some language modules already know how to automatically handle certain operators (mapping them into operators in the target language). However, the underlying implementation of this is really managed in a very general way using the %rename directive. For example, in Python a declaration similar to this is used:
%rename(__add__) Complex::operator+;
This binds the + operator to a method called __add__ (which is conveniently the same name used to implement the Python + operator). Internally, the generated wrapper code for a wrapped operator will look something like this pseudocode:
_wrap_Complex___add__(args) {
... get args ...
obj->operator+(args);
...
}
When used in the target language, it may now be possible to use the overloaded operator normally. For example:
>>> a = Complex(3,4)
>>> b = Complex(5,2)
>>> c = a + b # Invokes __add__ method
It is important to realize that there is nothing magical happening here. The %rename directive really only picks a valid method name. If you wrote this:
%rename(add) operator+;
The resulting scripting interface might work like this:
a = Complex(3,4)
b = Complex(5,2)
c = a.add(b) # Call a.operator+(b)
All of the techniques described to deal with overloaded functions also apply to operators. For example:
%ignore Complex::operator=;             // Ignore = in class Complex
%ignore *::operator=;                   // Ignore = in all classes
%ignore operator=;                      // Ignore = everywhere.
%rename(__sub__) Complex::operator-;
%rename(__neg__) Complex::operator-();  // Unary -
The last part of this example illustrates how multiple definitions of the operator- method might be handled.
Handling operators in this manner is mostly straightforward. However, there are a few subtle issues to keep in mind:
In C++, it is fairly common to define different versions of the operators to account for different types. For example, a class might also include a friend function like this:
class Complex {
public:
friend Complex operator+(Complex &, double);
};
Complex operator+(Complex &, double);
SWIG simply ignores all friend declarations. Furthermore, it doesn't know how to associate the global operator+ with the class (because it's not a member of the class).
It's still possible to make a wrapper for this operator, but you'll have to handle it like a normal function. For example:
%rename(add_complex_double) operator+(Complex &, double);
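On the C++ side, treating the operator "like a normal function" simply means the wrapper calls the non-member operator+ under an ordinary name. The following is a minimal hand-written sketch of that idea, assuming a cut-down Complex class; the forwarding function add_complex_double is written out here for illustration only and is not the code SWIG generates.

```cpp
// Cut-down Complex with public accessors (illustrative, not the full class above).
class Complex {
  double rpart, ipart;
public:
  Complex(double r = 0, double i = 0) : rpart(r), ipart(i) { }
  double re() const { return rpart; }
  double im() const { return ipart; }
  friend Complex operator+(Complex &, double);
};

// Non-member operator: it is not part of the class, so it is wrapped like any function.
Complex operator+(Complex &c, double d) {
  return Complex(c.rpart + d, c.ipart);
}

// Hand-written stand-in for what the renamed wrapper ends up calling.
Complex add_complex_double(Complex &c, double d) {
  return c + d;
}
```

From the target language, the call then looks like a plain function call taking a Complex and a number, rather than a use of the + operator.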
Certain operators are ignored by default. For instance, new and delete operators are ignored as well as conversion operators.
New methods can be added to a class using the %extend directive. This directive is primarily used in conjunction with proxy classes to add additional functionality to an existing class. For example:
%module vector
%{
#include "vector.h"
%}
class Vector {
public:
double x,y,z;
Vector();
~Vector();
... bunch of C++ methods ...
%extend {
char *__str__() {
static char temp[256];
sprintf(temp,"[ %g, %g, %g ]", $self->x,$self->y,$self->z);
return &temp[0];
}
}
};
This code adds a __str__ method to our class for producing a string representation of the object. In Python, such a method would allow us to print the value of an object using the print command.
>>> v = Vector();
>>> v.x = 3
>>> v.y = 4
>>> v.z = 0
>>> print(v)
[ 3, 4, 0 ]
>>>
The C++ 'this' pointer is often needed to access member variables, methods etc. The $self special variable should be used wherever you could use 'this'. The example above demonstrates this for accessing member variables. Note that the members dereferenced by $self must be public members as the code is ultimately generated into a global function and so will not have any access to non-public members. The implicit 'this' pointer that is present in C++ methods is not present in %extend methods. In order to access anything in the extended class or its base class, an explicit 'this' is required. The following example shows how one could access base class members:
struct Base {
virtual void method(int v) {
...
}
int value;
};
struct Derived : Base {
};
%extend Derived {
virtual void method(int v) {
$self->Base::method(v); // akin to this->Base::method(v);
$self->value = v; // akin to this->value = v;
...
}
}
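To make the point about $self concrete: the body of an %extend method ends up in an ordinary global function that receives the object as an explicit pointer, which is why only public members are reachable. The sketch below imitates this by hand for the Vector example. The function name Vector_str and the std::string return type are assumptions chosen for illustration and do not reflect SWIG's actual generated names.

```cpp
#include <cstdio>
#include <string>

struct Vector {
  double x, y, z;
};

// Hypothetical stand-in for the global function produced from the %extend
// block: the explicit pointer parameter plays the role of $self.
std::string Vector_str(Vector *self) {
  char temp[256];
  std::snprintf(temp, sizeof(temp), "[ %g, %g, %g ]", self->x, self->y, self->z);
  return std::string(temp);
}
```

Because Vector_str is not a member of Vector, it could not read x, y or z if they were declared private.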
The %extend directive follows all of the same conventions as its use with C structures. Please refer to the Adding member functions to C structures section for further details.
Compatibility note: The %extend directive is a new name for the %addmethods directive in SWIG1.1. Since %addmethods could be used to extend a structure with more than just methods, a more suitable directive name has been chosen.
Template type names may appear anywhere a type is expected in an interface file. For example:
void foo(vector<int> *a, int n);
void bar(list<int,100> *x);
There are some restrictions on the use of non-type arguments. Simple literals are supported, and so are some constant expressions. However, use of '<' and '>' within a constant expression is currently not supported by SWIG ('<=' and '>=' are though). For example:
void bar(list<int,100> *x);                // OK
void bar(list<int,2*50> *x);               // OK
void bar(list<int,(2>1 ? 100 : 50)> *x);   // Not supported
The type system is smart enough to figure out clever games you might try to play with typedef. For instance, consider this code:
typedef int Integer;
void foo(vector<int> *x, vector<Integer> *y);
In this case, vector<Integer> is exactly the same type as vector<int>. The wrapper for foo() will accept either variant.
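The same equivalence holds in C++ itself, which can be checked at compile time. This is a small sketch using std::vector as a stand-in for the vector template in the text; the helper name length_of is hypothetical.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

typedef int Integer;

// A typedef is only an alias, so both spellings name one and the same type.
static_assert(std::is_same<std::vector<int>, std::vector<Integer>>::value,
              "vector<Integer> and vector<int> are the same type");

// Returns the number of elements; callable with either spelling of the type.
std::size_t length_of(const std::vector<int> *x) { return x->size(); }
```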
Starting with SWIG-1.3.7, simple C++ template declarations can also be wrapped. SWIG-1.3.12 greatly expands upon the earlier implementation. Before discussing this any further, there are a few things you need to know about template wrapping. First, a bare C++ template does not define any sort of runnable object-code for which SWIG can normally create a wrapper. Therefore, in order to wrap a template, you need to give SWIG information about a particular template instantiation (e.g., vector<int>, array<double>, etc.). Second, an instantiation name such as vector<int> is generally not a valid identifier name in most target languages. Thus, you will need to give the template instantiation a more suitable name such as intvector when creating a wrapper.
To illustrate, consider the following template definition:
template<class T> class List {
private:
T *data;
int nitems;
int maxitems;
public:
List(int max) {
data = new T [max];
nitems = 0;
maxitems = max;
}
~List() {
delete [] data;
};
void append(T obj) {
if (nitems < maxitems) {
data[nitems++] = obj;
}
}
int length() {
return nitems;
}
T get(int n) {
return data[n];
}
};
By itself, this template declaration is useless--SWIG simply ignores it because it doesn't know how to generate any code unless a definition of T is provided.
One way to create wrappers for a specific template instantiation is to simply provide an expanded version of the class directly like this:
%rename(intList) List<int>; // Rename to a suitable identifier
class List<int> {
private:
int *data;
int nitems;
int maxitems;
public:
List(int max);
~List();
void append(int obj);
int length();
int get(int n);
};
The %rename directive is needed to give the template class an appropriate identifier name in the target language (most languages would not recognize C++ template syntax as a valid class name). The rest of the code is the same as what would appear in a normal class definition.
Since manual expansion of templates gets old in a hurry, the %template directive can be used to create instantiations of a template class. Semantically, %template is simply a shortcut---it expands template code in exactly the same way as shown above. Here are some examples:
/* Instantiate a few different versions of the template */
%template(intList) List<int>;
%template(doubleList) List<double>;
The argument to %template() is the name of the instantiation in the target language. The name you choose should not conflict with any other declarations in the interface file with one exception---it is okay for the template name to match that of a typedef declaration. For example:
%template(intList) List<int>;
...
typedef List<int> intList; // OK
SWIG can also generate wrappers for function templates using a similar technique. For example:
// Function template
template<class T> T max(T a, T b) { return a > b ? a : b; }
// Make some different versions of this function
%template(maxint) max<int>;
%template(maxdouble) max<double>;
In this case, maxint and maxdouble become unique names for specific instantiations of the function.
The number of arguments supplied to %template should match that in the original template definition. Template default arguments are supported. For example:
template<typename T, int max=100> class vector {
...
};
%template(intvec) vector<int>; // OK
%template(vec1000) vector<int,1000>; // OK
The %template directive should not be used to wrap the same template instantiation more than once in the same scope. This will generate an error. For example:
%template(intList) List<int>;
%template(Listint) List<int>; // Error. Template already wrapped.
This error is caused because the template expansion results in two identical classes with the same name. This generates a symbol table conflict. Besides, it is probably more efficient to wrap a specific instantiation only once in order to reduce the potential for code bloat.
Since the type system knows how to handle typedef, it is generally not necessary to instantiate different versions of a template for typenames that are equivalent. For instance, consider this code:
%template(intList) vector<int>;
typedef int Integer;
...
void foo(vector<Integer> *x);
In this case, vector<Integer> is exactly the same type as vector<int>. Any use of vector<Integer> is mapped back to the instantiation of vector<int> created earlier. Therefore, it is not necessary to instantiate a new class for the type Integer (doing so is redundant and will simply result in code bloat).
When a template is instantiated using %template, information about that class is saved by SWIG and used elsewhere in the program. For example, if you wrote code like this,
...
%template(intList) List<int>;
...
class UltraList : public List<int> {
...
};
then SWIG knows that List<int> was already wrapped as a class called intList and arranges to handle the inheritance correctly. If, on the other hand, nothing is known about List<int>, you will get a warning message similar to this:
example.h:42. Nothing known about class 'List<int >' (ignored).
example.h:42. Maybe you forgot to instantiate 'List<int >' using %template.
If a template class inherits from another template class, you need to make sure that base classes are instantiated before derived classes. For example:
template<class T> class Foo {
...
};
template<class T> class Bar : public Foo<T> {
...
};
// Instantiate base classes first
%template(intFoo) Foo<int>;
%template(doubleFoo) Foo<double>;
// Now instantiate derived classes
%template(intBar) Bar<int>;
%template(doubleBar) Bar<double>;
The order is important since SWIG uses the instantiation names to properly set up the inheritance hierarchy in the resulting wrapper code (and base classes need to be wrapped before derived classes). Don't worry--if you get the order wrong, SWIG should generate a warning message.
Occasionally, you may need to tell SWIG about base classes that are defined by templates, but which aren't supposed to be wrapped. Since SWIG is not able to automatically instantiate templates for this purpose, you must do it manually. To do this, simply use %template with no name. For example:
// Instantiate traits<double,double>, but don't wrap it.
%template() traits<double,double>;
If you have to instantiate a lot of different classes for many different types, you might consider writing a SWIG macro. For example:
%define TEMPLATE_WRAP(prefix, T...)
%template(prefix ## Foo) Foo<T >;
%template(prefix ## Bar) Bar<T >;
...
%enddef
TEMPLATE_WRAP(int, int)
TEMPLATE_WRAP(double, double)
TEMPLATE_WRAP(String, char *)
TEMPLATE_WRAP(PairStringInt, std::pair<string, int>)
...
Note the use of a vararg macro for the type T. If this wasn't used, the comma in the templated type in the last example would not be possible.
The SWIG template mechanism does support specialization. For instance, if you define a class like this,
template<> class List<int> {
private:
int *data;
int nitems;
int maxitems;
public:
List(int max);
~List();
void append(int obj);
int length();
int get(int n);
};
then SWIG will use this code whenever the user expands List<int>. In practice, this may have very little effect on the underlying wrapper code since specialization is often used to provide slightly modified method bodies (which are ignored by SWIG). However, special SWIG directives such as %typemap, %extend, and so forth can be attached to a specialization to provide customization for specific types.
Partial template specialization is partially supported by SWIG. For example, this code defines a template that is applied when the template argument is a pointer.
template<class T> class List<T*> {
private:
T *data;
int nitems;
int maxitems;
public:
List(int max);
~List();
void append(T obj);
int length();
T get(int n);
};
SWIG should be able to handle most simple uses of partial specialization. However, it may fail to match templates properly in more complicated cases. For example, if you have this code,
template<class T1, class T2> class Foo<T1, T2 *> { };
SWIG isn't able to match it properly for instantiations like Foo<int *, int *>. This problem is not due to parsing, but due to the fact that SWIG does not currently implement all of the C++ argument deduction rules.
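For comparison, a conforming C++ compiler does select the partial specialization for this instantiation. The sketch below adds a primary template and a marker constant (from_partial, a name invented for this illustration) so that the choice can be checked at compile time. It shows the behavior of the C++ deduction rules only, not of SWIG's matcher.

```cpp
// Primary template and the partial specialization from the text, each tagged
// with a constant so the selected definition can be observed.
template<class T1, class T2> struct Foo {
  static constexpr bool from_partial = false;
};
template<class T1, class T2> struct Foo<T1, T2 *> {
  static constexpr bool from_partial = true;
};

// A C++ compiler deduces T1 = int*, T2 = int and picks the specialization.
static_assert(Foo<int *, int *>::from_partial, "specialization selected");
// With a non-pointer second argument only the primary template matches.
static_assert(!Foo<int *, int>::from_partial, "primary template selected");
```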
Member function templates are supported. The underlying principle is the same as for normal templates--SWIG can't create a wrapper unless you provide more information about types. For example, a class with a member template might look like this:
class Foo {
public:
template<class T> void bar(T x, T y) { ... };
...
};
To expand the template, simply use %template inside the class.
class Foo {
public:
template<class T> void bar(T x, T y) { ... };
...
%template(barint) bar<int>;
%template(bardouble) bar<double>;
};
Or, if you want to leave the original class definition alone, just do this:
class Foo {
public:
template<class T> void bar(T x, T y) { ... };
...
};
...
%extend Foo {
%template(barint) bar<int>;
%template(bardouble) bar<double>;
};
or simply
class Foo {
public:
template<class T> void bar(T x, T y) { ... };
...
};
...
%template(bari) Foo::bar<int>;
%template(bard) Foo::bar<double>;
In this case, the %extend directive is not needed, and %template does exactly the same job, i.e., it adds two new methods to the Foo class.
Note: because of the way that templates are handled, the %template directive must always appear after the definition of the template to be expanded.
Now, if your target language supports overloading, you can even try
%template(bar) Foo::bar<int>;
%template(bar) Foo::bar<double>;
and since the two new wrapped methods have the same name 'bar', they will be overloaded, and when called, the correct method will be dispatched depending on the argument type.
When used with members, the %template directive may be placed in another template class. Here is a slightly perverse example:
// A template
template<class T> class Foo {
public:
// A member template
template<class S> T bar(S x, S y) { ... };
...
};
// Expand a few member templates
%extend Foo {
%template(bari) bar<int>;
%template(bard) bar<double>;
}
// Create some wrappers for the template
%template(Fooi) Foo<int>;
%template(Food) Foo<double>;
Miraculously, you will find that each expansion of Foo has member functions bari() and bard() added.
A common use of member templates is to define constructors for copies and conversions. For example:
template<class T1, class T2> struct pair {
T1 first;
T2 second;
pair() : first(T1()), second(T2()) { }
pair(const T1 &x, const T2 &y) : first(x), second(y) { }
template<class U1, class U2> pair(const pair<U1,U2> &x)
: first(x.first),second(x.second) { }
};
This declaration is perfectly acceptable to SWIG, but the constructor template will be ignored unless you explicitly expand it. To do that, you could expand a few versions of the constructor in the template class itself. For example:
%extend pair {
%template(pair) pair<T1,T2>; // Generate default copy constructor
};
When using %extend in this manner, notice how you can still use the template parameters in the original template definition.
Alternatively, you could expand the constructor template in selected instantiations. For example:
// Instantiate a few versions
%template(pairii) pair<int,int>;
%template(pairdd) pair<double,double>;
// Create a default constructor only
%extend pair<int,int> {
%template(paird) pair<int,int>; // Default constructor
};
// Create default and conversion constructors
%extend pair<double,double> {
%template(paird) pair<double,double>; // Default constructor
%template(pairc) pair<int,int>; // Conversion constructor
};
And if your target language supports overloading, then you can try instead:
// Create default and conversion constructors
%extend pair<double,double> {
%template(pair) pair<double,double>; // Default constructor
%template(pair) pair<int,int>; // Conversion constructor
};
In this case, the default and conversion constructors have the same name. Hence, SWIG will overload them and define a unique visible constructor that will dispatch the proper call depending on the argument type.
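In plain C++ terms, the conversion constructor expanded above is the one that builds a pair<double,double> from a pair<int,int>. The following sketch repeats the pair template inside a hypothetical namespace demo (used only to avoid any clash with std::pair) and adds a helper, to_double_pair, that is likewise invented for illustration.

```cpp
namespace demo {  // hypothetical namespace, used only to avoid clashing with std::pair

template<class T1, class T2> struct pair {
  T1 first;
  T2 second;
  pair() : first(T1()), second(T2()) { }
  pair(const T1 &x, const T2 &y) : first(x), second(y) { }
  // Constructor template: converts from a pair with different element types.
  template<class U1, class U2> pair(const pair<U1, U2> &x)
    : first(x.first), second(x.second) { }
};

// Roughly what a conversion-constructor wrapper has to do for pair<double,double>.
inline pair<double, double> to_double_pair(const pair<int, int> &p) {
  return pair<double, double>(p);
}

}  // namespace demo
```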
If all of this isn't quite enough and you really want to make someone's head explode, SWIG directives such as %rename, %extend, and %typemap can be included directly in template definitions. For example:
// File : list.h
template<class T> class List {
...
public:
%rename(__getitem__) get(int);
List(int max);
~List();
...
T get(int index);
%extend {
char *__str__() {
/* Make a string representation */
...
}
}
};
In this example, the extra SWIG directives are propagated to every template instantiation.
It is also possible to separate these declarations from the template class. For example:
%rename(__getitem__) List::get;
%extend List {
char *__str__() {
/* Make a string representation */
...
}
/* Make a copy */
List<T> *__copy__() {
return new List<T>(*$self);
}
};
...
template<class T> class List {
...
public:
List() { };
T get(int index);
...
};
When %extend is decoupled from the class definition, it is legal to use the same template parameters as provided in the class definition. These are replaced when the template is expanded. In addition, the %extend directive can be used to add additional methods to a specific instantiation. For example:
%template(intList) List<int>;
%extend List<int> {
void blah() {
printf("Hey, I'm a List<int>!\n");
}
};
SWIG even supports overloaded templated functions. As usual the %template directive is used to wrap templated functions. For example:
template<class T> void foo(T x) { };
template<class T> void foo(T x, T y) { };
%template(foo) foo<int>;
This will generate two overloaded wrapper methods, the first will take a single integer as an argument and the second will take two integer arguments.
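The two C++ overloads that these wrappers correspond to can be seen with a slightly modified version of the templates. In this sketch the functions return the number of arguments they received, a change made purely so the selected overload is observable.

```cpp
// Same shape as the templates in the text, but returning the number of
// arguments received so that the chosen overload can be checked.
template<class T> int foo(T) { return 1; }
template<class T> int foo(T, T) { return 2; }
```

Calling foo<int>(1) selects the first template and foo<int>(1, 2) the second, which mirrors the dispatch expected of the two generated wrappers.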
Needless to say, SWIG's template support provides plenty of opportunities to break the universe. That said, an important final point is that SWIG does not perform extensive error checking of templates! Specifically, SWIG does not perform type checking nor does it check to see if the actual contents of the template declaration make any sense. Since the C++ compiler will hopefully check this when it compiles the resulting wrapper file, there is no practical reason for SWIG to duplicate this functionality (besides, none of the SWIG developers are masochistic enough to want to implement this right now).
Compatibility Note: The first implementation of template support relied heavily on macro expansion in the preprocessor. Templates have been more tightly integrated into the parser and type system in SWIG-1.3.12 and the preprocessor is no longer used. Code that relied on preprocessing features in template expansion will no longer work. However, SWIG still allows the # operator to be used to generate a string from a template argument.
Compatibility Note: In earlier versions of SWIG, the %template directive introduced a new class name. This name could then be used with other directives. For example:
%template(vectori) vector<int>;
%extend vectori {
void somemethod() { }
};
This behavior is no longer supported. Instead, you should use the original template name as the class name. For example:
%template(vectori) vector<int>;
%extend vector<int> {
void somemethod() { }
};
Similar changes apply to typemaps and other customization features.
Support for C++ namespaces is a relatively late addition to SWIG, first appearing in SWIG-1.3.12. Before describing the implementation, it is worth noting that the semantics of C++ namespaces is extremely non-trivial--especially with regard to the C++ type system and class machinery. At a most basic level, namespaces are sometimes used to encapsulate common functionality. For example:
namespace math {
double sin(double);
double cos(double);
class Complex {
double im,re;
public:
...
};
...
};
Members of the namespace are accessed in C++ by prepending the namespace prefix to names. For example:
double x = math::sin(1.0);
double magnitude(math::Complex *c);
math::Complex c;
...
At this level, namespaces are relatively easy to manage. However, things start to get very ugly when you throw in the other ways a namespace can be used. For example, selective symbols can be exported from a namespace with using.
using math::Complex;
double magnitude(Complex *c); // Namespace prefix stripped
Similarly, the contents of an entire namespace can be made available like this:
using namespace math;
double x = sin(1.0);
double magnitude(Complex *c);
Alternatively, a namespace can be aliased:
namespace M = math;
double x = M::sin(1.0);
double magnitude(M::Complex *c);
Using combinations of these features, it is possible to write head-exploding code like this:
namespace A {
class Foo {
};
}
namespace B {
namespace C {
using namespace A;
}
typedef C::Foo FooClass;
}
namespace BIGB = B;
namespace D {
using BIGB::FooClass;
class Bar : public FooClass {
};
}
class Spam : public D::Bar {
};
void evil(A::Foo *a, B::FooClass *b, B::C::Foo *c, BIGB::FooClass *d,
BIGB::C::Foo *e, D::FooClass *f);
Given the possibility for such perversion, it's hard to imagine how every C++ programmer might want such code wrapped into the target language. Clearly this code defines three different classes. However, one of those classes is accessible under at least six different names!
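The claim that one class hides behind six names can be verified with a C++ compiler. The sketch below restates the declarations compactly and uses std::is_same to compare each spelling from evil() against A::Foo; nothing in it is specific to SWIG.

```cpp
#include <type_traits>

namespace A { class Foo { }; }
namespace B {
  namespace C { using namespace A; }
  typedef C::Foo FooClass;
}
namespace BIGB = B;
namespace D {
  using BIGB::FooClass;
  class Bar : public FooClass { };
}
class Spam : public D::Bar { };

// Every spelling used in evil() names the single class A::Foo.
static_assert(std::is_same<A::Foo, B::FooClass>::value, "B::FooClass");
static_assert(std::is_same<A::Foo, B::C::Foo>::value, "B::C::Foo");
static_assert(std::is_same<A::Foo, BIGB::FooClass>::value, "BIGB::FooClass");
static_assert(std::is_same<A::Foo, BIGB::C::Foo>::value, "BIGB::C::Foo");
static_assert(std::is_same<A::Foo, D::FooClass>::value, "D::FooClass");
```

The three distinct classes are therefore A::Foo, D::Bar and Spam, with Spam deriving from A::Foo through D::Bar.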
SWIG fully supports C++ namespaces in its internal type system and class handling code. If you feed SWIG the above code, it will be parsed correctly, it will generate compilable wrapper code, and it will produce a working scripting language module. However, the default wrapping behavior is to flatten namespaces in the target language. This means that the contents of all namespaces are merged together in the resulting scripting language module. For example, if you have code like this,
%module foo
namespace foo {
void bar(int);
void spam();
}
namespace bar {
void blah();
}
then SWIG simply creates three wrapper functions bar(), spam(), and blah() in the target language. SWIG does not prepend the names with a namespace prefix nor are the functions packaged in any kind of nested scope.
There is some rationale for taking this approach. Since C++ namespaces are often used to define modules in C++, there is a natural correlation between the likely contents of a SWIG module and the contents of a namespace. For instance, it would not be unreasonable to assume that a programmer might make a separate extension module for each C++ namespace. In this case, it would be redundant to prepend everything with an additional namespace prefix when the module itself already serves as a namespace in the target language. Or put another way, if you want SWIG to keep namespaces separate, simply wrap each namespace with its own SWIG interface.
Because namespaces are flattened, it is possible for symbols defined in different namespaces to generate a name conflict in the target language. For example:
namespace A {
void foo(int);
}
namespace B {
void foo(double);
}
When this conflict occurs, you will get an error message that resembles this:
example.i:26. Error. 'foo' is multiply defined in the generated module.
example.i:23. Previous declaration of 'foo'
To resolve this error, simply use %rename to disambiguate the declarations. For example:
%rename(B_foo) B::foo;
...
namespace A {
void foo(int);
}
namespace B {
void foo(double); // Gets renamed to B_foo
}
Similarly, %ignore can be used to ignore declarations.
using declarations do not have any effect on the generated wrapper code. They are ignored by SWIG language modules and they do not result in any code. However, these declarations are used by the internal type system to track type-names. Therefore, if you have code like this:
namespace A {
typedef int Integer;
}
using namespace A;
void foo(Integer x);
SWIG knows that Integer is the same as A::Integer which is the same as int.
Namespaces may be combined with templates. If necessary, the %template directive can be used to expand a template defined in a different namespace. For example:
namespace foo {
template<typename T> T max(T a, T b) { return a > b ? a : b; }
}
using foo::max;
%template(maxint) max<int>; // Okay.
%template(maxfloat) foo::max<float>; // Okay (qualified name).
namespace bar {
using namespace foo;
%template(maxdouble) max<double>; // Okay.
}
The combination of namespaces and other SWIG directives may introduce subtle scope-related problems. The key thing to keep in mind is that all SWIG generated wrappers are produced in the global namespace. Symbols from other namespaces are always accessed using fully qualified names---names are never imported into the global space unless the interface happens to do so with a using declaration. In almost all cases, SWIG adjusts typenames and symbols to be fully qualified. However, this is not done in code fragments such as function bodies, typemaps, exception handlers, and so forth. For example, consider the following:
namespace foo {
typedef int Integer;
class bar {
public:
...
};
}
%extend foo::bar {
Integer add(Integer x, Integer y) {
Integer r = x + y; // Error. Integer not defined in this scope
return r;
}
};
In this case, SWIG correctly resolves the added method parameters and return type to foo::Integer. However, since function bodies aren't parsed and such code is emitted in the global namespace, this code produces a compiler error about Integer. To fix the problem, make sure you use fully qualified names. For example:
%extend foo::bar {
Integer add(Integer x, Integer y) {
foo::Integer r = x + y; // Ok.
return r;
}
};
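The reason for the qualification becomes clearer when the added method is pictured as the global function it turns into. The following hand-written sketch assumes the emitted function is called foo_bar_add and takes the object as an explicit self pointer; both the name and the exact signature are assumptions for illustration, not SWIG's literal output.

```cpp
namespace foo {
  typedef int Integer;
  class bar { };
}

// Hypothetical stand-in for the emitted function: it lives in the global
// namespace, so every name from foo must be qualified inside the body.
foo::Integer foo_bar_add(foo::bar *self, foo::Integer x, foo::Integer y) {
  (void)self;                 // unused in this sketch
  foo::Integer r = x + y;     // an unqualified Integer would not compile here
  return r;
}
```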
Note: SWIG does not propagate using declarations to the resulting wrapper code. If these declarations appear in an interface, they should also appear in any header files that might have been included in a %{ ... %} section. In other words, don't insert extra using declarations into a SWIG interface unless they also appear in the underlying C++ code.
Note: Code inclusion directives such as %{ ... %} or %inline %{ ... %} should not be placed inside a namespace declaration. The code emitted by these directives will not be enclosed in a namespace and you may get very strange results. If you need to use namespaces with these directives, consider the following:
// Good version
%inline %{
namespace foo {
void bar(int) { ... }
...
}
%}
// Bad version. Emitted code not placed in namespace.
namespace foo {
%inline %{
void bar(int) { ... } /* I'm bad */
...
%}
}
Note: When the %extend directive is used inside a namespace, the namespace name is included in the generated functions. For example, if you have code like this,
namespace foo {
class bar {
public:
%extend {
int blah(int x);
};
};
}
the added method blah() is mapped to a function int foo_bar_blah(foo::bar *self, int x). This function resides in the global namespace.
Note: Although namespaces are flattened in the target language, the SWIG generated wrapper code observes the same namespace conventions as used in the input file. Thus, if there are no symbol conflicts in the input, there will be no conflicts in the generated code.
Note: In the same way that no resolution is performed on parameters, a conversion operator name must match exactly to how it is defined. Do not change the qualification of the operator. For example, suppose you had an interface like this:
namespace foo {
class bar;
class spam {
public:
...
operator bar(); // Conversion of spam -> bar
...
};
}
The following is how the feature is expected to be written for a successful match:
%rename(tofoo) foo::spam::operator bar();
The following does not work as no namespace resolution is performed in the matching of conversion operator names:
%rename(tofoo) foo::spam::operator foo::bar();
Note, however, that if the operator is defined using a qualifier in its name, then the feature must use it too...
%rename(tofoo) foo::spam::operator bar(); // will not match
%rename(tofoo) foo::spam::operator foo::bar(); // will match
namespace foo {
class bar;
class spam {
public:
...
operator foo::bar();
...
};
}
Compatibility Note: Versions of SWIG prior to 1.3.32 were inconsistent in this approach. A fully qualified name was usually required, but would not work in some situations.
Note: The flattening of namespaces is only intended to serve as a basic namespace implementation. None of the target language modules are currently programmed with any namespace awareness. In the future, language modules may or may not provide more advanced namespace support.
As has been mentioned, when %rename includes parameters, the parameter types must match exactly (no typedef or namespace resolution is performed). SWIG treats templated types slightly differently and has an additional matching rule so unlike non-templated types, an exact match is not always required. If the fully qualified templated type is specified, it will have a higher precedence over the generic template type. In the example below, the generic template type is used to rename to bbb and the fully qualified type is used to rename to ccc.
%rename(bbb) Space::ABC::aaa(T t); // will match but with lower precedence than ccc
%rename(ccc) Space::ABC<Space::XYZ>::aaa(Space::XYZ t); // will match but with higher precedence than bbb
namespace Space {
class XYZ {};
template<typename T> struct ABC {
void aaa(T t) {}
};
}
%template(ABCXYZ) Space::ABC<Space::XYZ>;
It should now be apparent that there are many ways to achieve a renaming with %rename. This is demonstrated by the following two examples, which are effectively the same as the above example. Below shows how %rename can be placed inside a namespace.
namespace Space {
%rename(bbb) ABC::aaa(T t); // will match but with lower precedence than ccc
%rename(ccc) ABC<Space::XYZ>::aaa(Space::XYZ t); // will match but with higher precedence than bbb
%rename(ddd) ABC<Space::XYZ>::aaa(XYZ t); // will not match
}
namespace Space {
class XYZ {};
template<typename T> struct ABC {
void aaa(T t) {}
};
}
%template(ABCXYZ) Space::ABC<Space::XYZ>;
Note that ddd does not match as there is no namespace resolution for parameter types and the fully qualified type must be specified for template type expansion. The following example shows how %rename can be placed within %extend.
namespace Space {
%extend ABC {
%rename(bbb) aaa(T t); // will match but with lower precedence than ccc
}
%extend ABC<Space::XYZ> {
%rename(ccc) aaa(Space::XYZ t); // will match but with higher precedence than bbb
%rename(ddd) aaa(XYZ t); // will not match
}
}
namespace Space {
class XYZ {};
template<typename T> struct ABC {
void aaa(T t) {}
};
}
%template(ABCXYZ) Space::ABC<Space::XYZ>;
When C++ programs utilize exceptions, exceptional behavior is sometimes specified as part of a function or method declaration. For example:
class Error { };
class Foo {
public:
...
void blah() throw(Error);
...
};
If an exception specification is used, SWIG automatically generates wrapper code for catching the indicated exception and, when possible, rethrowing it into the target language, or converting it into an error in the target language otherwise. For example, in Python, you can write code like this:
f = Foo()
try:
    f.blah()
except Error as e:
    # e is a wrapped instance of "Error"
    pass
Details of how to tailor code for handling the caught C++ exception and converting it into the target language's exception/error handling mechanism are outlined in the "throws" typemap section.
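The following is a minimal sketch of the idea, not actual SWIG output. It shows a hand-written wrapper that catches the type named in the exception specification and records it for the scripting side. The names ScriptError and wrap_Foo_blah are hypothetical and stand in for whatever error mechanism the target language provides.

```cpp
#include <string>

class Error {};

class Foo {
public:
    // Declared as "void blah() throw(Error);" in the interface above.
    // The specification is left off here because dynamic exception
    // specifications were removed from the language in C++17.
    void blah() { throw Error(); }
};

// Hypothetical stand-in for the target language's error state.
struct ScriptError {
    bool raised;
    std::string type_name;
    ScriptError() : raised(false) {}
};

// Hand-written approximation of a wrapper for Foo::blah(): one catch
// handler for each type named in the exception specification.
void wrap_Foo_blah(Foo *self, ScriptError &err) {
    try {
        self->blah();
    } catch (Error &) {
        err.raised = true;
        err.type_name = "Error";
    }
}
```

Calling wrap_Foo_blah() on a Foo leaves err.raised set and err.type_name equal to "Error", which is the information a language module would need to raise the corresponding error.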
Since exception specifications are sometimes only used sparingly, this alone may not be enough to properly handle C++ exceptions. To do that, a different set of special SWIG directives are used. Consult the "Exception handling with %exception" section for details. The next section details a way of simulating an exception specification or replacing an existing one.
Exceptions are automatically handled for methods with an exception specification. Similar handling can be achieved for methods without exception specifications through the %catches feature. It is also possible to replace any declared exception specification using the %catches feature. In fact, %catches uses the same "throws" typemaps that SWIG uses for exception specifications in handling exceptions. The %catches feature must contain a list of possible types that can be thrown. For each type that is in the list, SWIG will generate a catch handler, in the same way that it would for types declared in the exception specification. Note that the list can also include the catch all specification "...". For example,
struct EBase { virtual ~EBase(); };
struct Error1 : EBase { };
struct Error2 : EBase { };
struct Error3 : EBase { };
struct Error4 : EBase { };
%catches(Error1,Error2,...) Foo::bar();
%catches(EBase) Foo::blah();
class Foo {
public:
...
void bar();
void blah() throw(Error1,Error2,Error3,Error4);
...
};
For the Foo::bar() method, which can throw anything, SWIG will generate catch handlers for Error1 and Error2 as well as a catch-all handler (...). Each typed catch handler converts the caught exception into a target language error/exception. The catch-all handler converts the caught exception into an unknown error/exception.
Without the %catches feature being attached to Foo::blah(), SWIG will generate catch handlers for all of the types in the exception specification, that is, Error1, Error2, Error3, Error4. However, with the %catches feature above, just a single catch handler for the base class, EBase, will be generated to convert the C++ exception into a target language error/exception.
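The handler layout described above can be sketched in plain C++. This is an illustration under assumptions rather than generated code: call_bar, call_blah and the throws_* helpers are hypothetical names, and the returned strings stand in for target language errors.

```cpp
#include <string>

struct EBase { virtual ~EBase() {} };
struct Error1 : EBase {};
struct Error2 : EBase {};
struct Error3 : EBase {};
struct Error4 : EBase {};

// Models %catches(Error1,Error2,...): two typed handlers, then a catch-all.
std::string call_bar(void (*body)()) {
    try {
        body();
    } catch (Error1 &) {
        return "Error1";
    } catch (Error2 &) {
        return "Error2";
    } catch (...) {
        return "unknown";
    }
    return "no error";
}

// Models %catches(EBase): a single handler for the base class, which also
// catches Error1 through Error4 because they derive from EBase.
std::string call_blah(void (*body)()) {
    try {
        body();
    } catch (EBase &) {
        return "EBase";
    }
    return "no error";
}

// Hypothetical method bodies used to exercise the handlers.
void throws_error2() { throw Error2(); }
void throws_error4() { throw Error4(); }
void throws_int() { throw 42; }
```

With this layout an Error4 thrown inside call_bar() falls through to the catch-all handler, while the same exception inside call_blah() is caught by the EBase handler.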
Starting with SWIG-1.3.7, there is limited parsing support for pointers to C++ class members. For example:
double do_op(Object *o, double (Object::*callback)(double,double));
extern double (Object::*fooptr)(double,double);
%constant double (Object::*FOO)(double,double) = &Object::foo;
Although these kinds of pointers can be parsed and represented by the SWIG type system, few language modules know how to handle them due to implementation differences from standard C pointers. Readers are strongly advised to consult an advanced text such as "The Annotated C++ Reference Manual" for specific details.
When pointers to members are supported, the pointer value might appear as a special string like this:
>>> print example.FOO
_ff0d54a800000000_m_Object__f_double_double__double
>>>
In this case, the hexadecimal digits represent the entire value of the pointer which is usually the contents of a small C++ structure on most machines.
SWIG's type-checking mechanism is also more limited when working with member pointers. Normally SWIG tries to keep track of inheritance when checking types. However, no such support is currently provided for member pointers.
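For readers less familiar with the C++ side, here is a small sketch of how a pointer to a member function is declared, passed, and called. The body of do_op(), the arguments 3.0 and 4.0, and the method mul() are assumptions made for illustration. The helper member_pointer_size() reports how large the representation is on a given compiler, which is often bigger than an ordinary pointer.

```cpp
#include <cstddef>

class Object {
public:
    double foo(double a, double b) { return a + b; }
    double mul(double a, double b) { return a * b; }  // hypothetical second method
};

// Same signature as do_op() above. The body and the arguments 3.0 and 4.0
// are placeholders chosen for illustration.
double do_op(Object *o, double (Object::*callback)(double, double)) {
    return (o->*callback)(3.0, 4.0);
}

// Size of the member-pointer representation on the current compiler.
std::size_t member_pointer_size() {
    return sizeof(double (Object::*)(double, double));
}
```

A call such as do_op(&obj, &Object::foo) invokes obj.foo(3.0, 4.0) through the member pointer.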
In some C++ programs, objects are often encapsulated by smart-pointers or proxy classes. This is sometimes done to implement automatic memory management (reference counting) or persistence. Typically a smart-pointer is defined by a template class where the -> operator has been overloaded. This class is then wrapped around some other class. For example:
// Smart-pointer class
template<class T> class SmartPtr {
T *pointee;
public:
...
T *operator->() {
return pointee;
}
...
};
// Ordinary class
class Foo_Impl {
public:
int x;
virtual void bar();
...
};
// Smart-pointer wrapper
typedef SmartPtr<Foo_Impl> Foo;
// Create smart pointer Foo
Foo make_Foo() {
return SmartPtr<Foo_Impl>(new Foo_Impl());
}
// Do something with smart pointer Foo
void do_something(Foo f) {
printf("x = %d\n", f->x);
f->bar();
}
A key feature of this approach is that by defining operator-> the methods and attributes of the object wrapped by a smart pointer are transparently accessible. For example, expressions such as these (from the previous example),
f->x
f->bar()
are transparently mapped to the following
(f.operator->())->x;
(f.operator->())->bar();
When generating wrappers, SWIG tries to emulate this functionality to the extent that it is possible. To do this, whenever operator->() is encountered in a class, SWIG looks at its returned type and uses it to generate wrappers for accessing attributes of the underlying object. For example, wrapping the above code produces wrappers like this:
int Foo_x_get(Foo *f) {
return (*f)->x;
}
void Foo_x_set(Foo *f, int value) {
(*f)->x = value;
}
void Foo_bar(Foo *f) {
(*f)->bar();
}
These wrappers take a smart-pointer instance as an argument, but dereference it in a way to gain access to the object returned by operator->(). You should carefully compare these wrappers to those in the first part of this chapter (they are slightly different).
The end result is that access looks very similar to C++. For example, you could do this in Python:
>>> f = make_Foo()
>>> print f.x
0
>>> f.bar()
>>>
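The accessor wrappers shown earlier can be tried outside of SWIG. The sketch below fills in the parts elided from the example so that it compiles on its own. The SmartPtr constructor, the Foo_Impl constructor, and the bar_calls counter are assumptions added for illustration, and this SmartPtr does not manage ownership.

```cpp
template <class T> class SmartPtr {
    T *pointee;
public:
    explicit SmartPtr(T *p) : pointee(p) {}
    T *operator->() { return pointee; }
};

class Foo_Impl {
public:
    int x;
    int bar_calls;  // hypothetical counter so that calls to bar() can be observed
    Foo_Impl() : x(0), bar_calls(0) {}
    virtual ~Foo_Impl() {}
    virtual void bar() { ++bar_calls; }
};

typedef SmartPtr<Foo_Impl> Foo;

// Each wrapper takes a pointer to the smart pointer and dereferences it
// once, so that -> is then applied to the smart pointer itself.
int Foo_x_get(Foo *f) { return (*f)->x; }
void Foo_x_set(Foo *f, int value) { (*f)->x = value; }
void Foo_bar(Foo *f) { (*f)->bar(); }
```

Given a Foo_Impl object impl and a smart pointer Foo f(&impl), calling Foo_x_set(&f, 5) changes impl.x, which is the same effect as assigning f.x from the scripting language.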
When generating wrappers through a smart-pointer, SWIG tries to generate wrappers for all methods and attributes that might be accessible through operator->(). This includes any methods that might be accessible through inheritance. However, there are a number of restrictions:
If the smart-pointer class and the underlying object both define a method or variable of the same name, then the smart-pointer version has precedence. For example, if you have this code
class Foo {
public:
int x;
};
class Bar {
public:
int x;
Foo *operator->();
};
then the wrapper for Bar::x accesses the x defined in Bar, and not the x defined in Foo.
If your intent is to only expose the smart-pointer class in the interface, it is not necessary to wrap both the smart-pointer class and the class for the underlying object. However, you must still tell SWIG about both classes if you want the technique described in this section to work. To only generate wrappers for the smart-pointer class, you can use the %ignore directive. For example:
%ignore Foo;
class Foo { // Ignored
};
class Bar {
public:
Foo *operator->();
...
};
Alternatively, you can import the definition of Foo from a separate file using %import.
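The name-clash rule in the first restriction mirrors what C++ itself does. In the sketch below, which assumes a hypothetical member target and the made-up values 1 and 2, plain member access on Bar finds Bar::x, and the x defined in Foo is only reached by going through operator-> explicitly.

```cpp
class Foo {
public:
    int x;
};

class Bar {
    Foo target;  // hypothetical storage for the object returned by operator->
public:
    int x;
    Bar() : x(1) { target.x = 2; }
    Foo *operator->() { return &target; }
};

// What the wrapper for Bar::x reads: the member of the smart-pointer class.
int Bar_x_get(Bar *b) { return b->x; }

// Going through operator-> explicitly reaches the x defined in Foo.
int Bar_deref_x_get(Bar *b) { return b->operator->()->x; }
```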
Note: When a class defines operator->(), the operator itself is wrapped as a method __deref__(). For example:
f = Foo()               # Smart-pointer
p = f.__deref__()       # Raw pointer from operator->
Note: To disable the smart-pointer behavior, use %ignore to ignore operator->(). For example:
%ignore Bar::operator->;
Note: Smart pointer support was first added in SWIG-1.3.14.
using declarations are sometimes used to adjust access to members of base classes. For example:
class Foo {
public:
int blah(int x);
};
class Bar {
public:
double blah(double x);
};
class FooBar : public Foo, public Bar {
public:
using Foo::blah;
using Bar::blah;
char *blah(const char *x);
};
In this example, the using declarations make different versions of the overloaded blah() method accessible from the derived class. For example:
FooBar *f;
f->blah(3); // Ok. Invokes Foo::blah(int)
f->blah(3.5); // Ok. Invokes Bar::blah(double)
f->blah("hello"); // Ok. Invokes FooBar::blah(const char *);
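A compilable version of the same classes is sketched below. The method bodies are placeholders, and the third overload returns const char * instead of char * so that it can simply return its argument.

```cpp
class Foo {
public:
    int blah(int x) { return x + 1; }
};

class Bar {
public:
    double blah(double x) { return x * 2.0; }
};

class FooBar : public Foo, public Bar {
public:
    using Foo::blah;
    using Bar::blah;
    const char *blah(const char *x) { return x; }
};

// Without the using declarations, FooBar::blah(const char *) would hide
// both inherited overloads and these two calls would not compile.
int call_int(FooBar &f) { return f.blah(3); }
double call_double(FooBar &f) { return f.blah(3.5); }
```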
SWIG emulates the same functionality when creating wrappers. For example, if you wrap this code in Python, the module works just like you would expect:
>>> import example
>>> f = example.FooBar()
>>> f.blah(3)
>>> f.blah(3.5)
>>> f.blah("hello")
using declarations can also be used to change access when applicable. For example:
class Foo {
protected:
int x;
int blah(int x);
};
class Bar : public Foo {
public:
using Foo::x; // Make x public
using Foo::blah; // Make blah public
};
This also works in SWIG---the exposed declarations will be wrapped normally.
When using declarations are used as shown in these examples, declarations from the base classes are copied into the derived class and wrapped normally. When copied, the declarations retain any properties that might have been attached using %rename, %ignore, or %feature. Thus, if a method is ignored in a base class, it will also be ignored by a using declaration.
Because a using declaration does not provide fine-grained control over the declarations that get imported, it may be difficult to manage such declarations in applications that make heavy use of SWIG customization features. If you can't get using to work correctly, you can always change the interface to the following:
class FooBar : public Foo, public Bar {
public:
#ifndef SWIG
using Foo::blah;
using Bar::blah;
#else
int blah(int x); // explicitly tell SWIG about other declarations
double blah(double x);
#endif
char *blah(const char *x);
};
Notes:
If a derived class redefines a method defined in a base class, then a using declaration won't cause a conflict. For example:
class Foo {
public:
int blah(int );
double blah(double);
};
class Bar : public Foo {
public:
using Foo::blah; // Only imports blah(double);
int blah(int);
};
Resolving ambiguity in overloading may prevent declarations from being imported by using. For example:
%rename(blah_long) Foo::blah(long);
class Foo {
public:
int blah(int);
long blah(long); // Renamed to blah_long
};
class Bar : public Foo {
public:
using Foo::blah; // Only imports blah(int)
double blah(double x);
};
There is limited support for nested structs and unions when wrapping C code, see Nested structures for further details. However, there is no nested class/struct/union support when wrapping C++ code (using the -c++ commandline option). This may be added at a future date, however, until then some of the following workarounds can be applied.
It might be possible to use partial class information. Since SWIG does not need the entire class specification to work, conditional compilation can be used to hide the problematic nested class definition from SWIG. For example:
class Foo {
public:
#ifndef SWIG
class Bar {
public:
...
};
#endif
Foo();
~Foo();
...
};
The next workaround assumes you cannot modify the source code as was done above and it provides a solution for methods that use nested class types. Imagine we are wrapping the Outer class which contains a nested class Inner:
// File outer.h
class Outer {
public:
class Inner {
public:
int var;
Inner(int v = 0) : var(v) {}
};
void method(Inner inner);
};
The following interface file works around SWIG nested class limitations by redefining the nested class as a global class. A typedef for the compiler is also required in order for the generated wrappers to compile.
// File : example.i
%module example
// Suppress SWIG warning
#pragma SWIG nowarn=SWIGWARN_PARSE_NESTED_CLASS
// Redefine nested class in global scope in order for SWIG to generate
// a proxy class. Only SWIG parses this definition.
class Inner {
public:
int var;
Inner(int v = 0) : var(v) {}
};
%{
#include "outer.h"
%}
%include "outer.h"
%{
// SWIG thinks that Inner is a global class, so we need to trick the C++
// compiler into understanding this so called global type.
typedef Outer::Inner Inner;
%}
The downside to this approach is having to maintain two definitions of Inner, the real one and the one in the interface file that SWIG parses.
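To see why the typedef is needed, consider what the C++ compiler is given. The sketch below is a hand-written approximation, not SWIG output: wrap_Outer_method is a hypothetical wrapper that refers to Inner as if it were a global type, and method() returns int here only so that the result can be checked.

```cpp
// Contents of outer.h from the example, with method() given a body.
class Outer {
public:
    class Inner {
    public:
        int var;
        Inner(int v = 0) : var(v) {}
    };
    int method(Inner inner) { return inner.var; }
};

// The typedef from the %{ ... %} block: it gives the compiler a global
// name Inner that refers to the real nested class.
typedef Outer::Inner Inner;

// Hypothetical wrapper that, like the generated code, uses the global name.
int wrap_Outer_method(Outer *self, Inner *inner) {
    return self->method(*inner);
}
```

Removing the typedef makes the wrapper fail to compile, because Inner is then not declared at global scope.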
A common issue when working with C++ programs is dealing with all possible ways in which the const qualifier (or lack thereof) will break your program, all programs linked against your program, and all programs linked against those programs.
Although SWIG knows how to correctly deal with const in its internal type system and it knows how to generate wrappers that are free of const-related warnings, SWIG does not make any attempt to preserve const-correctness in the target language. Thus, it is possible to pass const qualified objects to non-const methods and functions. For example, consider the following code in C++:
const Object * foo();
void bar(Object *);
...
// C++ code
void blah() {
bar(foo()); // Error: bar discards const
}
Now, consider the behavior when wrapped into a Python module:
>>> bar(foo())         # Okay
>>>
Although this is clearly a violation of the C++ type-system, fixing the problem doesn't seem to be worth the added implementation complexity that would be required to support it in the SWIG run-time type system. There are no plans to change this in future releases (although we'll never rule anything out entirely).
The bottom line is that this particular issue does not appear to be a problem for most SWIG projects. Of course, you might want to consider using another tool if maintaining constness is the most important part of your project.
If you're wrapping serious C++ code, you might want to pick up a copy of "The Annotated C++ Reference Manual" by Ellis and Stroustrup. This is the reference document we use to guide a lot of SWIG's C++ support.
SWIG includes its own enhanced version of the C preprocessor. The preprocessor supports the standard preprocessor directives and macro expansion rules. However, a number of modifications and enhancements have been made. This chapter describes some of these modifications.
To include another file into a SWIG interface, use the %include directive like this:
%include "pointer.i"
Unlike #include, %include includes each file only once (and will not reload the file on subsequent %include declarations). Therefore, it is not necessary to use include guards in SWIG interfaces.
By default, #include is ignored unless you run SWIG with the -includeall option. The reason for ignoring traditional includes is that you often don't want SWIG to try to wrap everything included from standard system headers and auxiliary files.
SWIG provides another file inclusion directive with the %import directive. For example:
%import "foo.i"
The purpose of %import is to collect certain information from another SWIG interface file or a header file without actually generating any wrapper code. Such information generally includes type declarations (e.g., typedef) as well as C++ classes that might be used as base-classes for class declarations in the interface. The use of %import is also important when SWIG is used to generate extensions as a collection of related modules. This is an advanced topic and is described later in the Working with Modules chapter.
The -importall option tells SWIG to follow all #include statements as imports. This might be useful if you want to extract type definitions from system header files without generating any wrappers.
SWIG fully supports the use of #if, #ifdef, #ifndef, #else, #endif to conditionally include parts of an interface. The following symbols are predefined by SWIG when it is parsing the interface:
SWIG Always defined when SWIG is processing a file
SWIGIMPORTED Defined when SWIG is importing a file with %import
SWIGMAC Defined when running SWIG on the Macintosh
SWIGWIN Defined when running SWIG under Windows
SWIG_VERSION Hexadecimal number containing SWIG version,
such as 0x010311 (corresponding to SWIG-1.3.11).
SWIGALLEGROCL Defined when using Allegro CL
SWIGCFFI Defined when using CFFI
SWIGCHICKEN Defined when using CHICKEN
SWIGCLISP Defined when using CLISP
SWIGCSHARP Defined when using C#
SWIGGUILE Defined when using Guile
SWIGJAVA Defined when using Java
SWIGLUA Defined when using Lua
SWIGMODULA3 Defined when using Modula-3
SWIGMZSCHEME Defined when using Mzscheme
SWIGOCAML Defined when using Ocaml
SWIGOCTAVE Defined when using Octave
SWIGPERL Defined when using Perl
SWIGPHP Defined when using PHP
SWIGPIKE Defined when using Pike
SWIGPYTHON Defined when using Python
SWIGR Defined when using R
SWIGRUBY Defined when using Ruby
SWIGSEXP Defined when using S-expressions
SWIGTCL Defined when using Tcl
SWIGXML Defined when using XML
In addition, SWIG defines the following set of standard C/C++ macros:
__LINE__ Current line number
__FILE__ Current file name
__STDC__ Defined to indicate ANSI C
__cplusplus Defined when -c++ option used
Interface files can look at these symbols as necessary to change the way in which an interface is generated or to mix SWIG directives with C code. These symbols are also defined within the C code generated by SWIG (except for the symbol `SWIG' which is only defined within the SWIG compiler).
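The SWIG_VERSION value can be taken apart with a little arithmetic. The sketch below assumes, based only on the 0x010311 example above, that each byte holds one version component and that its two hexadecimal digits are read as decimal digits. The names version_field, SwigVersion, and decode_swig_version are hypothetical.

```cpp
// Decodes one byte of SWIG_VERSION, reading its two hex digits as decimal
// digits. Assumption: this digit convention is inferred from the example
// 0x010311 corresponding to SWIG-1.3.11.
constexpr int version_field(unsigned version, int shift) {
    return static_cast<int>(((version >> (shift + 4)) & 0xFu) * 10u +
                            ((version >> shift) & 0xFu));
}

struct SwigVersion {
    int major_part;
    int minor_part;
    int patch_part;
};

constexpr SwigVersion decode_swig_version(unsigned version) {
    return SwigVersion{version_field(version, 16),
                       version_field(version, 8),
                       version_field(version, 0)};
}
```

In an interface file there is usually no need to decode anything, since the value can be compared directly in a conditional such as #if SWIG_VERSION >= 0x010311.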
Traditional preprocessor macros can be used in SWIG interfaces. Be aware that the #define statement is also used to try and detect constants. Therefore, if you have something like this in your file,
#ifndef _FOO_H
#define _FOO_H 1
...
#endif
you may get some extra constants such as _FOO_H showing up in the scripting interface.
More complex macros can be defined in the standard way. For example:
#define EXTERN extern
#ifdef __STDC__
#define _ANSI(args) (args)
#else
#define _ANSI(args) ()
#endif
The following operators can appear in macro definitions:
SWIG provides an enhanced macro capability with the %define and %enddef directives. For example:
%define ARRAYHELPER(type,name)
%inline %{
type *new_ ## name (int nitems) {
return (type *) malloc(sizeof(type)*nitems);
}
void delete_ ## name(type *t) {
free(t);
}
type name ## _get(type *t, int index) {
return t[index];
}
void name ## _set(type *t, int index, type val) {
t[index] = val;
}
%}
%enddef
ARRAYHELPER(int, IntArray)
ARRAYHELPER(double, DoubleArray)
The primary purpose of %define is to define large macros of code. Unlike normal C preprocessor macros, it is not necessary to terminate each line with a continuation character (\)--the macro definition extends to the first occurrence of %enddef. Furthermore, when such macros are expanded, they are reparsed through the C preprocessor. Thus, SWIG macros can contain all other preprocessor directives except for nested %define statements.
The SWIG macro capability is a very quick and easy way to generate large amounts of code. In fact, many of SWIG's advanced features and libraries are built using this mechanism (such as C++ template support).
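For comparison, here is roughly how the ARRAYHELPER macro would have to be written for the ordinary C preprocessor. It is a sketch for illustration only: the %inline block is dropped because it is a SWIG directive, and every line but the last needs a continuation character.

```cpp
#include <cstdlib>

// Plain preprocessor version of ARRAYHELPER. Note the trailing backslash
// on each line, which %define and %enddef make unnecessary.
#define ARRAYHELPER(type, name)                                   \
    type *new_##name(int nitems) {                                \
        return (type *) std::malloc(sizeof(type) * nitems);       \
    }                                                             \
    void delete_##name(type *t) {                                 \
        std::free(t);                                             \
    }                                                             \
    type name##_get(type *t, int index) {                         \
        return t[index];                                          \
    }                                                             \
    void name##_set(type *t, int index, type val) {               \
        t[index] = val;                                           \
    }

// Each use expands to four functions, for example new_IntArray(),
// delete_IntArray(), IntArray_get() and IntArray_set().
ARRAYHELPER(int, IntArray)
ARRAYHELPER(double, DoubleArray)
```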
SWIG-1.3.12 and newer releases support variadic preprocessor macros. For example:
#define DEBUGF(fmt,...) fprintf(stderr,fmt,__VA_ARGS__)
When used, any extra arguments to ... are placed into the special variable __VA_ARGS__. This also works with special SWIG macros defined using %define.
SWIG allows a variable number of arguments to be empty. However, this often results in an extra comma (,) and syntax error in the resulting expansion. For example:
DEBUGF("hello"); --> fprintf(stderr,"hello",);
To get rid of the extra comma, use ## like this:
#define DEBUGF(fmt,...) fprintf(stderr,fmt, ##__VA_ARGS__)
SWIG also supports GNU-style variadic macros. For example:
#define DEBUGF(fmt, args...) fprintf(stdout,fmt,args)
Comment: It's not entirely clear how variadic macros might be useful to interface building. However, they are used internally to implement a number of SWIG directives and are provided to make SWIG more compatible with C99 code.
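As a small illustration of __VA_ARGS__ in ordinary C++ code, the sketch below forwards a variable argument list to snprintf(). FORMAT_INTO and format_point are hypothetical names. The macro uses the standard form, so at least one extra argument must be supplied.

```cpp
#include <cstdio>

// FORMAT_INTO is a hypothetical macro. Everything after fmt is collected
// in __VA_ARGS__ and passed on to snprintf() unchanged.
#define FORMAT_INTO(buf, fmt, ...) std::snprintf((buf), sizeof(buf), fmt, __VA_ARGS__)

// Formats a point into out and returns the character count reported by
// snprintf(), not counting the terminating null character.
int format_point(char (&out)[32], int x, int y) {
    return FORMAT_INTO(out, "x=%d y=%d", x, y);
}
```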
The SWIG preprocessor does not process any text enclosed in a code block %{ ... %}. Therefore, if you write code like this,
%{
#ifdef NEED_BLAH
int blah() {
...
}
#endif
%}
the contents of the %{ ... %} block are copied without modification to the output (including all preprocessor directives).
SWIG always runs the preprocessor on text appearing inside { ... }. However, sometimes it is desirable to make a preprocessor directive pass through to the output file. For example:
%extend Foo {
void bar() {
#ifdef DEBUG
printf("I'm in bar\n");
#endif
}
}
By default, SWIG will interpret the #ifdef DEBUG statement. However, if you really wanted that code to actually go into the wrapper file, prefix the preprocessor directives with % like this:
%extend Foo {
void bar() {
%#ifdef DEBUG
printf("I'm in bar\n");
%#endif
}
}
SWIG will strip the extra % and leave the preprocessor directive in the code.
Typemaps support a special attribute called noblock where the { ... } delimiters can be used, but the delimiters are not actually generated into the code. The effect is then similar to using "" or %{ %} delimiters but the code is run through the preprocessor. For example:
#define SWIG_macro(CAST) (CAST)$input
%typemap(in) Int {$1= SWIG_macro(int);}
might generate
{
arg1=(int)jarg1;
}
whereas
#define SWIG_macro(CAST) (CAST)$input
%typemap(in,noblock=1) Int {$1= SWIG_macro(int);}
might generate
arg1=(int)jarg1;
and
#define SWIG_macro(CAST) (CAST)$input
%typemap(in) Int %{$1=SWIG_macro(int);%}
would generate
arg1=SWIG_macro(int);
Like many compilers, SWIG supports a -E command line option to display the output from the preprocessor. When the -E switch is used, SWIG will not generate any wrappers. Instead the results after the preprocessor has run are displayed. This might be useful as an aid to debugging and viewing the results of macro expansions.
SWIG supports the commonly used #warning and #error preprocessor directives. The #warning directive will cause SWIG to issue a warning then continue processing. The #error directive will cause SWIG to exit with a fatal error. Example usage:
#error "This is a fatal error message"
#warning "This is a warning message"
The #error behaviour can be made to work like #warning if the -cpperraswarn commandline option is used. Alternatively, the #pragma directive can be used to the same effect, for example:
/* Modified behaviour: #error does not cause SWIG to exit with error */
#pragma SWIG cpperraswarn=1
/* Normal behaviour: #error does cause SWIG to exit with error */
#pragma SWIG cpperraswarn=0
To help build extension modules, SWIG is packaged with a library of support files that you can include in your own interfaces. These files often define new SWIG directives or provide utility functions that can be used to access parts of the standard C and C++ libraries. This chapter provides a reference to the current set of supported library files.
Compatibility note: Older versions of SWIG included a number of library files for manipulating pointers, arrays, and other structures. Most of these files are now deprecated and have been removed from the distribution. Alternative libraries provide similar functionality. Please read this chapter carefully if you used the old libraries.
Library files are included using the %include directive. When searching for files, directories are searched in the following order:
Within each directory, SWIG first looks for a subdirectory corresponding to a target language (e.g., python, tcl, etc.). If found, SWIG will search the language specific directory first. This allows for language-specific implementations of library files.
You can ignore the installed SWIG library by setting the SWIG_LIB environment variable. Set the environment variable to hold an alternative library directory.
The directories that are searched are displayed when using the -verbose command line option.
This section describes library modules for manipulating low-level C arrays and pointers. The primary use of these modules is in supporting C declarations that manipulate bare pointers such as int *, double *, or void *. The modules can be used to allocate memory, manufacture pointers, dereference memory, and wrap pointers as class-like objects. Since these functions provide direct access to memory, their use is potentially unsafe and you should exercise caution.
The cpointer.i module defines macros that can be used to generate wrappers around simple C pointers. The primary use of this module is in generating pointers to primitive datatypes such as int and double.
%pointer_functions(type,name)
Generates a collection of five functions for manipulating a pointer type *:
type *new_name()
Creates a new object of type type and returns a pointer to it. In C, the object is created using calloc(). In C++, new is used.
type *copy_name(type value)
Creates a new object of type type and returns a pointer to it. An initial value is set by copying it from value. In C, the object is created using calloc(). In C++, new is used.
void delete_name(type *obj)
Deletes an object of type type.
void name_assign(type *obj, type value)
Assigns *obj = value.
type name_value(type *obj)
Returns the value of *obj.
When using this macro, type may be any type and name must be a legal identifier in the target language. name should not correspond to any other name used in the interface file.
Here is a simple example of using %pointer_functions():
%module example
%include "cpointer.i"
/* Create some functions for working with "int *" */
%pointer_functions(int, intp);
/* A function that uses an "int *" */
void add(int x, int y, int *result);
Now, in Python:
>>> import example
>>> c = example.new_intp()     # Create an "int" for storing result
>>> example.add(3,4,c)         # Call function
>>> example.intp_value(c)      # Dereference
7
>>> example.delete_intp(c)     # Delete
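To make the Python session easier to follow, here is a sketch of what the helper functions for %pointer_functions(int, intp) amount to in C++. It is written by hand from the descriptions above and is not the library source. The body of add() is also an assumption.

```cpp
// Helpers corresponding to %pointer_functions(int, intp), using new and
// delete as described for C++.
int *new_intp() { return new int(); }
int *copy_intp(int value) { return new int(value); }
void delete_intp(int *obj) { delete obj; }
void intp_assign(int *obj, int value) { *obj = value; }
int intp_value(int *obj) { return *obj; }

// The function being wrapped. Its body is assumed for illustration.
void add(int x, int y, int *result) { *result = x + y; }
```

The session above then corresponds to calling new_intp(), passing the result to add(3, 4, c), reading it back with intp_value(c), and releasing it with delete_intp(c).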
%pointer_class(type,name)
Wraps a pointer of type * inside a class-based interface. This interface is as follows:
struct name {
name(); // Create pointer object
~name(); // Delete pointer object
void assign(type value); // Assign value
type value(); // Get value
type *cast(); // Cast the pointer to original type
static name *frompointer(type *); // Create class wrapper from existing
// pointer
};
When using this macro, type is restricted to a simple type name like int, float, or Foo. Pointers and other complicated types are not allowed. name must be a valid identifier not already in use. When a pointer is wrapped as a class, the "class" may be transparently passed to any function that expects the pointer.
If the target language does not support proxy classes, the use of this macro will produce the exact same functions as the %pointer_functions() macro.
It should be noted that the class interface does not introduce a new object or wrap a pointer inside a special structure. Instead, the raw pointer is used directly.
Here is the same example using a class instead:
%module example
%include "cpointer.i"
/* Wrap a class interface around an "int *" */
%pointer_class(int, intp);
/* A function that uses an "int *" */
void add(int x, int y, int *result);
Now, in Python (using proxy classes)
>>> import example
>>> c = example.intp()         # Create an "int" for storing result
>>> example.add(3,4,c)         # Call function
>>> c.value()                  # Dereference
7
Of the two macros, %pointer_class is probably the most convenient when working with simple pointers. This is because the pointers are accessed like objects and they can be easily garbage collected (destruction of the pointer object destroys the underlying object).
%pointer_cast(type1, type2, name)
Creates a casting function that converts type1 to type2. The name of the function is name. For example:
%pointer_cast(int *, unsigned int *, int_to_uint);
In this example, the function int_to_uint() would be used to cast types in the target language.
Note: None of these macros can be used to safely work with strings (char * or char **).
Note: When working with simple pointers, typemaps can often be used to provide more seamless operation.
This module defines macros that assist in wrapping ordinary C pointers as arrays. The module does not provide any safety or an extra layer of wrapping--it merely provides functionality for creating, destroying, and modifying the contents of raw C array data.
%array_functions(type,name)
Creates four functions.
type *new_name(int nelements)
Creates a new array of objects of type type. In C, the array is allocated using calloc(). In C++, new [] is used.
void delete_name(type *ary)
Deletes an array. In C, free() is used. In C++, delete [] is used.
type name_getitem(type *ary, int index)
Returns the value ary[index].
void name_setitem(type *ary, int index, type value)
Assigns ary[index] = value.
When using this macro, type may be any type and name must be a legal identifier in the target language. name should not correspond to any other name used in the interface file.
Here is an example of %array_functions(). Suppose you had a function like this:
void print_array(double x[10]) {
int i;
for (i = 0; i < 10; i++) {
printf("[%d] = %g\n", i, x[i]);
}
}
To wrap it, you might write this:
%module example
%include "carrays.i"
%array_functions(double, doubleArray);
void print_array(double x[10]);
Now, in a scripting language, you might write this:
a = new_doubleArray(10)           # Create an array
for i in range(0,10):
    doubleArray_setitem(a,i,2*i)  # Set a value
print_array(a)                    # Pass to C
delete_doubleArray(a)             # Destroy array
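The four helpers used in this script can be pictured as the following C++ functions. This is a hand-written sketch based on the descriptions above, not the library source. The function sum_array() is a hypothetical stand-in for print_array() that returns a value instead of printing.

```cpp
// Helpers corresponding to %array_functions(double, doubleArray), using
// new [] and delete [] as described for C++. There is no bounds checking.
double *new_doubleArray(int nelements) { return new double[nelements](); }
void delete_doubleArray(double *ary) { delete[] ary; }
double doubleArray_getitem(double *ary, int index) { return ary[index]; }
void doubleArray_setitem(double *ary, int index, double value) { ary[index] = value; }

// Hypothetical consumer with the same parameter type as print_array().
double sum_array(double x[10]) {
    double total = 0.0;
    for (int i = 0; i < 10; i++) {
        total += x[i];
    }
    return total;
}
```

Because the helpers hand out a bare double *, the result of new_doubleArray(10) can be passed directly to any function that expects double x[10].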
%array_class(type,name)
Wraps a pointer of type * inside a class-based interface. This interface is as follows:
struct name {
name(int nelements); // Create an array
~name(); // Delete array
type getitem(int index); // Return item
void setitem(int index, type value); // Set item
type *cast(); // Cast to original type
static name *frompointer(type *); // Create class wrapper from
// existing pointer
};
When using this macro, type is restricted to a simple type name like int or float. Pointers and other complicated types are not allowed. name must be a valid identifier not already in use. When a pointer is wrapped as a class, it can be transparently passed to any function that expects the pointer.
When combined with proxy classes, the %array_class() macro can be especially useful. For example:
%module example
%include "carrays.i"
%array_class(double, doubleArray);
void print_array(double x[10]);
Allows you to do this:
import example
c = example.doubleArray(10)  # Create double[10]
for i in range(0,10):
    c[i] = 2*i               # Assign values
example.print_array(c)       # Pass to C
Note: These macros do not encapsulate C arrays inside a special data structure or proxy. There is no bounds checking or safety of any kind. If you want this, you should consider using a special array object rather than a bare pointer.
Note: %array_functions() and %array_class() should not be used with types of char or char *.
This module defines macros for wrapping the low-level C memory allocation functions malloc(), calloc(), realloc(), and free().
%malloc(type [,name=type])
Creates a wrapper around malloc() with the following prototype:
type *malloc_name(int nbytes = sizeof(type));
If type is void, then the size parameter nbytes is required. The name parameter only needs to be specified when wrapping a type that is not a valid identifier (e.g., "int *", "double **", etc.).
%calloc(type [,name=type])
Creates a wrapper around calloc() with the following prototype:
type *calloc_name(int nobj =1, int sz = sizeof(type));
If type is void, then the size parameter sz is required.
%realloc(type [,name=type])
Creates a wrapper around realloc() with the following prototype:
type *realloc_name(type *ptr, int nitems);
Note: unlike the C realloc(), the wrapper generated by this macro implicitly includes the size of the corresponding type. For example, realloc_int(p, 100) reallocates p so that it holds 100 integers.
%free(type [,name=type])
Creates a wrapper around free() with the following prototype:
void free_name(type *ptr);
%sizeof(type [,name=type])
Creates the constant:
%constant int sizeof_name = sizeof(type);
%allocators(type [,name=type])
Generates wrappers for all five of the above operations.
Here is a simple example that illustrates the use of these macros:
// SWIG interface
%module example
%include "cmalloc.i"

%malloc(int);
%free(int);

%malloc(int *, intp);
%free(int *, intp);

%allocators(double);
Now, in a script:
>>> from example import *
>>> a = malloc_int()
>>> a
'_000efa70_p_int'
>>> free_int(a)
>>> b = malloc_intp()
>>> b
'_000efb20_p_p_int'
>>> free_intp(b)
>>> c = calloc_double(50)
>>> c
'_000fab98_p_double'
>>> c = realloc_double(c, 100000)
>>> free_double(c)
>>> print sizeof_double
8
>>>
The cdata.i module defines functions for converting raw C data to and from strings in the target language. The primary applications of this module would be packing/unpacking of binary data structures---for instance, if you needed to extract data from a buffer. The target language must support strings with embedded binary data in order for this to work.
char *cdata(void *ptr, int nbytes)
Converts nbytes of data at ptr into a string. ptr can be any pointer.
void memmove(void *ptr, char *s)
Copies all of the string data in s into the memory pointed to by ptr. The string may contain embedded NULL bytes. The length of the string is implicitly determined in the underlying wrapper code.
One use of these functions is packing and unpacking data from memory. Here is a short example:
// SWIG interface
%module example
%include "carrays.i"
%include "cdata.i"

%array_class(int, intArray);
Python example:
>>> a = intArray(10)
>>> for i in range(0,10):
...     a[i] = i
>>> b = cdata(a,40)
>>> b
'\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04\x00\x00\x00\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08\x00\x00\x00\t'
>>> c = intArray(10)
>>> memmove(c,b)
>>> print c[4]
4
>>>
Since the size of data is not always known, the following macro is also defined:
%cdata(type [,name=type])
Generates the following function for extracting C data for a given type.
char *cdata_name(type* ptr, int nitems)
nitems is the number of items of the given type to extract.
Note: These functions provide direct access to memory and can be used to overwrite data. Clearly they are unsafe.
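The round trip in the intArray example above can be imitated without SWIG. The following is only a sketch written for Python 3: it uses the standard ctypes and struct modules as stand-ins for intArray, cdata(), and memmove(), and the names IntArray10, raw, and values are made up for the illustration.

```python
import ctypes
import struct

# Stand-in for intArray(10): a C array of ten ints owned by ctypes.
IntArray10 = ctypes.c_int * 10
a = IntArray10(*range(10))

# Analogue of cdata(a, 40): copy the raw bytes into a Python bytes object.
nbytes = ctypes.sizeof(a)
raw = ctypes.string_at(ctypes.addressof(a), nbytes)

# Decode the bytes on the Python side; "10i" means ten native C ints.
values = struct.unpack("10i", raw)

# Analogue of memmove(c, raw): copy the bytes into a second array.
c = IntArray10()
ctypes.memmove(c, raw, len(raw))

print(values[4], c[4])
```

The byte order of raw depends on the machine, which is why the sketch decodes it with the native format code rather than comparing it against a literal string.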
A common problem when working with C programs is dealing with functions that manipulate raw character data using char *. In part, problems arise because there are different interpretations of char *---it could be a NULL-terminated string or it could point to binary data. Moreover, functions that manipulate raw strings may mutate data, perform implicit memory allocations, or utilize fixed-sized buffers.
The problems (and perils) of using char * are well-known. However, SWIG is not in the business of enforcing morality. The modules in this section provide basic functionality for manipulating raw C strings.
Suppose you have a C function with this prototype:
char *foo(char *s);
The default wrapping behavior for this function is to set s to a raw char * that refers to the internal string data in the target language. In other words, if you were using a language like Tcl, and you wrote this,
% foo Hello
then s would point to the representation of "Hello" inside the Tcl interpreter. When returning a char *, SWIG assumes that it is a NULL-terminated string and makes a copy of it. This gives the target language its own copy of the result.
There are obvious problems with the default behavior. First, since a char * argument points to data inside the target language, it is NOT safe for a function to modify this data (doing so may corrupt the interpreter and lead to a crash). Furthermore, the default behavior does not work well with binary data. Instead, strings are assumed to be NULL-terminated.
If you have a function that expects binary data,
int parity(char *str, int len, int initial);
you can wrap the parameters (char *str, int len) as a single argument using a typemap. Just do this:
%apply (char *STRING, int LENGTH) { (char *str, int len) };
...
int parity(char *str, int len, int initial);
Now, in the target language, you can use binary string data like this:
>>> s = "H\x00\x15eg\x09\x20"
>>> parity(s,0)
In the wrapper function, the passed string will be expanded to a pointer and length parameter.
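The expansion can be pictured with a small pure-Python model. This is a sketch only: c_parity is a hypothetical stand-in for the C function (the real body of parity() is not shown in this chapter), and the wrapper simply supplies the length that the typemap would compute.

```python
def c_parity(data, length, initial):
    """Hypothetical stand-in for: int parity(char *str, int len, int initial)."""
    p = initial
    for byte in data[:length]:
        # XOR in the parity of the number of set bits in each byte.
        p ^= bin(byte).count("1") & 1
    return p


def parity(data, initial):
    """Model of the wrapper: one string argument becomes (pointer, length)."""
    return c_parity(data, len(data), initial)


s = b"H\x00\x15eg\x09\x20"
print(len(s), parity(s, 0))
```

Because the length is taken from the string object rather than from a search for a terminator, the embedded zero byte in s does not cut the data short.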
If you have a function that allocates memory like this,
char *foo() {
char *result = (char *) malloc(...);
...
return result;
}
then the SWIG generated wrappers will have a memory leak--the returned data will be copied into a string object and the old contents ignored.
To fix the memory leak, use the %newobject directive.
%newobject foo;
...
char *foo();
This will release the result.
The cstring.i library file provides a collection of macros for dealing with functions that either mutate string arguments or which try to output string data through their arguments. An example of such a function might be this rather questionable implementation:
void get_path(char *s) {
    // Potential buffer overflow---uh, oh.
    sprintf(s,"%s/%s", base_directory, sub_directory);
}
...
// Somewhere else in the C program
{
    char path[1024];
    ...
    get_path(path);
    ...
}
(Off topic rant: If your program really has functions like this, you would be well-advised to replace them with safer alternatives involving bounds checking).
The macros defined in this module all expand to various combinations of typemaps. Therefore, the same pattern matching rules and ideas apply.
%cstring_bounded_output(parm, maxsize)
Turns parameter parm into an output value. The output string is assumed to be NULL-terminated and smaller than maxsize characters. Here is an example:
%cstring_bounded_output(char *path, 1024);
...
void get_path(char *path);
In the target language:
>>> get_path()
'/home/beazley/packages/Foo/Bar'
>>>
Internally, the wrapper function allocates a small buffer (on the stack) of the requested size and passes it as the pointer value. Data stored in the buffer is then returned as a function return value. If the function already returns a value, then the return value and the output string are returned together (multiple return values). If more than maxsize bytes are written, your program will crash with a buffer overflow!
%cstring_chunk_output(parm, chunksize)
Turns parameter parm into an output value. The output string is always chunksize and may contain binary data. Here is an example:
%cstring_chunk_output(char *packet, PACKETSIZE);
...
void get_packet(char *packet);
In the target language:
>>> get_packet()
'\xa9Y:\xf6\xd7\xe1\x87\xdbH;y\x97\x7f\xd3\x99\x14V\xec\x06\xea\xa2\x88'
>>>
This macro is essentially identical to %cstring_bounded_output. The only difference is that the result is always chunksize characters. Furthermore, the result can contain binary data. If more than chunksize bytes are written, your program will crash with a buffer overflow!
%cstring_bounded_mutable(parm, maxsize)
Turns parameter parm into a mutable string argument. The input string is assumed to be NULL-terminated and smaller than maxsize characters. The output string is also assumed to be NULL-terminated and less than maxsize characters.
%cstring_bounded_mutable(char *ustr, 1024);
...
void make_upper(char *ustr);
In the target language:
>>> make_upper("hello world")
'HELLO WORLD'
>>>
Internally, this macro is almost exactly the same as %cstring_bounded_output. The only difference is that the parameter accepts an input value that is used to initialize the internal buffer. It is important to emphasize that this function does not mutate the string value passed---instead it makes a copy of the input value, mutates it, and returns it as a result. If more than maxsize bytes are written, your program will crash with a buffer overflow!
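The copy, mutate, and return sequence can be modelled in a few lines of Python. This is an illustration of the behavior described above and not the code SWIG generates. The helper bounded_mutable and the Python version of make_upper are hypothetical, and the length check on the input is a choice made for this sketch.

```python
def bounded_mutable(c_func, text, maxsize):
    """Model of %cstring_bounded_mutable: copy in, mutate the copy, copy out."""
    data = text.encode("ascii")
    if len(data) >= maxsize:
        # Assumption of this sketch: refuse input that cannot fit with its NUL.
        raise ValueError("input does not fit in the buffer")
    buf = bytearray(maxsize)       # fixed-size scratch buffer, zero-filled
    buf[:len(data)] = data         # initialise it from the input string
    c_func(buf)                    # the "C function" mutates the buffer
    end = buf.index(0)             # the result ends at the first NUL byte
    return buf[:end].decode("ascii")


def make_upper(buf):
    """Stand-in for the C function: void make_upper(char *ustr)."""
    n = buf.index(0)
    buf[:n] = bytes(buf[:n]).upper()


original = "hello world"
result = bounded_mutable(make_upper, original, 1024)
print(result, "/", original)
```

The original string is untouched after the call, which mirrors the point made above: the caller gets a new string back and the input object is never modified in place.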
%cstring_mutable(parm [, expansion])
Turns parameter parm into a mutable string argument. The input string is assumed to be NULL-terminated. An optional parameter expansion specifies the number of extra characters by which the string might grow when it is modified. The output string is assumed to be NULL-terminated and less than the size of the input string plus any expansion characters.
%cstring_mutable(char *ustr);
...
void make_upper(char *ustr);

%cstring_mutable(char *hstr, HEADER_SIZE);
...
void attach_header(char *hstr);
In the target language:
>>> make_upper("hello world")
'HELLO WORLD'
>>> attach_header("Hello world")
'header: Hello world'
>>>
This macro differs from %cstring_bounded_mutable() in that a buffer is dynamically allocated (on the heap using malloc/new). This buffer is always large enough to store a copy of the input value plus any expansion bytes that might have been requested. It is important to emphasize that this function does not directly mutate the string value passed---instead it makes a copy of the input value, mutates it, and returns it as a result. If the function expands the result by more than expansion extra bytes, then the program will crash with a buffer overflow!
%cstring_output_maxsize(parm, maxparm)
This macro is used to handle bounded character output functions where both a char * and a maximum length parameter are provided. As input, a user simply supplies the maximum length. The return value is assumed to be a NULL-terminated string.
%cstring_output_maxsize(char *path, int maxpath);
...
void get_path(char *path, int maxpath);
In the target language:
>>> get_path(1024)
'/home/beazley/Packages/Foo/Bar'
>>>
This macro provides a safer alternative for functions that need to write string data into a buffer. The user-supplied buffer size is used to dynamically allocate memory on the heap. Results are placed into that buffer and returned as a string object.
%cstring_output_withsize(parm, maxparm)
This macro is used to handle bounded character output functions where both a char * and a pointer int * are passed. Initially, the int * parameter points to a value containing the maximum size. On return, this value is assumed to contain the actual number of bytes. As input, a user simply supplies the maximum length. The output value is a string that may contain binary data.
%cstring_output_withsize(char *data, int *maxdata);
...
void get_data(char *data, int *maxdata);
In the target language:
>>> get_data(1024)
'x627388912'
>>> get_data(1024)
'xyzzy'
>>>
This macro is a somewhat more powerful version of %cstring_chunk_output(). Memory is dynamically allocated and can be arbitrarily large. Furthermore, a function can control how much data is actually returned by changing the value of the maxparm argument.
%cstring_output_allocate(parm, release)
This macro is used to return strings that are allocated within the program and returned in a parameter of type char **. For example:
void foo(char **s) {
*s = (char *) malloc(64);
sprintf(*s, "Hello world\n");
}
The returned string is assumed to be NULL-terminated. release specifies how the allocated memory is to be released (if applicable). Here is an example:
%cstring_output_allocate(char **s, free(*$1));
...
void foo(char **s);
In the target language:
>>> foo()
'Hello world\n'
>>>
%cstring_output_allocate_size(parm, szparm, release)
This macro is used to return strings that are allocated within the program and returned in two parameters of type char ** and int *. For example:
void foo(char **s, int *sz) {
*s = (char *) malloc(64);
*sz = 64;
// Write some binary data
...
}
The returned string may contain binary data. release specifies how the allocated memory is to be released (if applicable). Here is an example:
%cstring_output_allocate_size(char **s, int *slen, free(*$1));
...
void foo(char **s, int *slen);
In the target language:
>>> foo()
'\xa9Y:\xf6\xd7\xe1\x87\xdbH;y\x97\x7f\xd3\x99\x14V\xec\x06\xea\xa2\x88'
>>>
This is the safest and most reliable way to return binary string data in SWIG. If you have functions that conform to another prototype, you might consider wrapping them with a helper function. For example, if you had this:
char *get_data(int *len);
You could wrap it with a function like this:
void my_get_data(char **result, int *len) {
*result = get_data(len);
}
The library modules in this section provide access to parts of the standard C++ library including the STL. SWIG support for the STL is an ongoing effort. Support is quite comprehensive for some language modules but some of the lesser used modules do not have quite as much library code written.
The following table shows which C++ classes are supported and the equivalent SWIG interface library file for the C++ library.
| C++ class | C++ Library file | SWIG Interface library file |
|---|---|---|
| std::deque | deque | std_deque.i |
| std::list | list | std_list.i |
| std::map | map | std_map.i |
| std::pair | utility | std_pair.i |
| std::set | set | std_set.i |
| std::string | string | std_string.i |
| std::vector | vector | std_vector.i |
The list is by no means complete; some language modules support a subset of the above and some support additional STL classes. Please look for the library files in the appropriate language library directory.
The std_string.i library provides typemaps for converting C++ std::string objects to and from strings in the target scripting language. For example:
%module example
%include "std_string.i"

std::string foo();
void bar(const std::string &x);
In the target language:
x = foo(); # Returns a string object
bar("Hello World"); # Pass string as std::string
A common problem that people encounter is that of classes/structures containing a std::string. This can be overcome by defining a typemap. For example:
%module example
%include "std_string.i"
%apply const std::string& {std::string* foo};
struct my_struct
{
std::string foo;
};
In the target language:
x = my_struct();
x.foo="Hello World";      # assign with string
print x.foo;              # print as string
This module only supports types std::string and const std::string &. Pointers and non-const references are left unmodified and returned as SWIG pointers.
This library file is fully aware of C++ namespaces. If you export std::string or rename it with a typedef, make sure you include those declarations in your interface. For example:
%module example
%include "std_string.i"

using namespace std;
typedef std::string String;
...
void foo(string s, const String &t);     // std_string typemaps still applied
Note: The std_string library is incompatible with Perl on some platforms. We're looking into it.
The std_vector.i library provides support for the C++ vector class in the STL. Using this library involves the use of the %template directive. All you need to do is to instantiate different versions of vector for the types that you want to use. For example:
%module example
%include "std_vector.i"
namespace std {
%template(vectori) vector<int>;
%template(vectord) vector<double>;
};
When a template vector<X> is instantiated a number of things happen:
To illustrate the use of this library, consider the following functions:
/* File : example.h */
#include <vector>
#include <algorithm>
#include <functional>
#include <numeric>
double average(std::vector<int> v) {
    return std::accumulate(v.begin(),v.end(),0.0)/v.size();
}

std::vector<double> half(const std::vector<double>& v) {
    std::vector<double> w(v);
    for (unsigned int i=0; i<w.size(); i++)
        w[i] /= 2.0;
    return w;
}

void halve_in_place(std::vector<double>& v) {
    std::transform(v.begin(),v.end(),v.begin(),
                   std::bind2nd(std::divides<double>(),2.0));
}
To wrap with SWIG, you might write the following:
%module example
%{
#include "example.h"
%}
%include "std_vector.i"
// Instantiate templates used by example
namespace std {
%template(IntVector) vector<int>;
%template(DoubleVector) vector<double>;
}
// Include the header file with above prototypes
%include "example.h"
Now, to illustrate the behavior in the scripting interpreter, consider this Python example:
>>> from example import *
>>> iv = IntVector(4)         # Create a vector<int>
>>> for i in range(0,4):
...      iv[i] = i
>>> average(iv) # Call method
1.5
>>> average([0,1,2,3]) # Call with list
1.5
>>> half([1,2,3]) # Half a list
(0.5,1.0,1.5)
>>> halve_in_place([1,2,3]) # Oops
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: Type error. Expected _p_std__vectorTdouble_t
>>> dv = DoubleVector(4)
>>> for i in range(0,4):
...      dv[i] = i
>>> halve_in_place(dv)        # Ok
>>> for i in dv:
...      print i
...
0.0
0.5
1.0
1.5
>>> dv[20] = 4.5
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "example.py", line 81, in __setitem__
def __setitem__(*args): return apply(examplec.DoubleVector___setitem__,args)
IndexError: vector index out of range
>>>
This library module is fully aware of C++ namespaces. If you use vectors with other names, make sure you include the appropriate using or typedef directives. For example:
%include "std_vector.i"
namespace std {
%template(IntVector) vector<int>;
}
using namespace std;
typedef std::vector<int> Vector;
void foo(vector<int> *x, const Vector &y);
Note: This module makes use of several advanced SWIG features including templatized typemaps and template partial specialization. If you are trying to wrap other C++ code with templates, you might look at the code contained in std_vector.i. Alternatively, you can show them the code if you want to make their head explode.
Note: This module is defined for all SWIG target languages. However argument conversion details and the public API exposed to the interpreter vary.
Note: std_vector.i was written by Luigi "The Amazing" Ballabio.
Many of the STL wrapper functions add parameter checking and will throw a language dependent error/exception should the values not be valid. The classic example is array bounds checking. The library wrappers are written to throw a C++ exception in the case of error. The C++ exception in turn gets converted into an appropriate error/exception for the target language. By and large this handling should not need customising, however, customisation can easily be achieved by supplying appropriate "throws" typemaps. For example:
%module example
%include "std_vector.i"
%typemap(throws) std::out_of_range {
// custom exception handler
}
%template(VectInt) std::vector<int>;
The custom exception handler might, for example, log the exception then convert it into a specific error/exception for the target language.
When using the STL it is advisable to add in an exception handler to catch all STL exceptions. The %exception directive can be used by placing the following code before any other methods or libraries to be wrapped:
%include "exception.i"
%exception {
    try {
        $action
    } catch (const std::exception& e) {
        SWIG_exception(SWIG_RuntimeError, e.what());
    }
}
Any thrown STL exceptions will then be gracefully handled instead of causing a crash.
The exception.i library provides a language-independent function for raising a run-time exception in the target language. This library is largely used by the SWIG library writers. If possible, use the error handling scheme available to your target language as there is greater flexibility in what errors/exceptions can be thrown.
SWIG_exception(int code, const char *message)
Raises an exception in the target language. code is one of the following symbolic constants:
SWIG_MemoryError
SWIG_IOError
SWIG_RuntimeError
SWIG_IndexError
SWIG_TypeError
SWIG_DivisionByZero
SWIG_OverflowError
SWIG_SyntaxError
SWIG_ValueError
SWIG_SystemError
message is a string indicating more information about the problem.
The primary use of this module is in writing language-independent exception handlers. For example:
%include "exception.i"
%exception std::vector::getitem {
    try {
        $action
    } catch (std::out_of_range& e) {
        SWIG_exception(SWIG_IndexError,const_cast<char*>(e.what()));
    }
}
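From the scripting side, the effect of SWIG_exception() is that a symbolic code turns into a native exception. The Python model below is a sketch under stated assumptions: the table EXCEPTION_FOR_CODE is a plausible mapping for a Python target and is not taken from the SWIG sources, and swig_exception is a hypothetical stand-in for the C function. The actual mapping is defined by each language module.

```python
# Assumed mapping from SWIG's symbolic codes to Python exception classes.
EXCEPTION_FOR_CODE = {
    "SWIG_MemoryError": MemoryError,
    "SWIG_IOError": IOError,
    "SWIG_RuntimeError": RuntimeError,
    "SWIG_IndexError": IndexError,
    "SWIG_TypeError": TypeError,
    "SWIG_DivisionByZero": ZeroDivisionError,
    "SWIG_OverflowError": OverflowError,
    "SWIG_SyntaxError": SyntaxError,
    "SWIG_ValueError": ValueError,
    "SWIG_SystemError": SystemError,
}


def swig_exception(code, message):
    """Model of SWIG_exception(code, message): raise the matching exception."""
    raise EXCEPTION_FOR_CODE.get(code, RuntimeError)(message)


try:
    swig_exception("SWIG_IndexError", "vector index out of range")
except IndexError as err:
    caught = str(err)

print(caught)
```

A handler written this way stays language-independent on the C++ side, since only the symbolic code and the message appear in the interface file.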
In Chapter 3, SWIG's treatment of basic datatypes and pointers was described. In particular, primitive types such as int and double are mapped to corresponding types in the target language. For everything else, pointers are used to refer to structures, classes, arrays, and other user-defined datatypes. However, in certain applications it is desirable to change SWIG's handling of a specific datatype. For example, you might want to return multiple values through the arguments of a function. This chapter describes some of the techniques for doing this.
This section describes the typemaps.i library file--commonly used to change certain properties of argument conversion.
Suppose you had a C function like this:
void add(double a, double b, double *result) {
*result = a + b;
}
From reading the source code, it is clear that the function is storing a value in the double *result parameter. However, since SWIG does not examine function bodies, it has no way to know that this is the underlying behavior.
One way to deal with this is to use the typemaps.i library file and write interface code like this:
// Simple example using typemaps
%module example
%include "typemaps.i"
%apply double *OUTPUT { double *result };
%inline %{
extern void add(double a, double b, double *result);
%}
The %apply directive tells SWIG that you are going to apply a special type handling rule to a type. The "double *OUTPUT" specification is the name of a rule that defines how to return an output value from an argument of type double *. This rule gets applied to all of the datatypes listed in curly braces, in this case "double *result".
When the resulting module is created, you can now use the function like this (shown for Python):
>>> a = add(3,4)
>>> print a
7.0
>>>
In this case, you can see how the output value normally returned in the third argument has magically been transformed into a function return value. Clearly this makes the function much easier to use since it is no longer necessary to manufacture a special double * object and pass it to the function somehow.
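What the OUTPUT rule does inside the wrapper can be modelled in plain Python. This is a sketch of the behavior and not generated code: c_add is a hypothetical stand-in for the C function, and a one-element list plays the role of the double * argument.

```python
def c_add(a, b, result):
    """Stand-in for: void add(double a, double b, double *result)."""
    result[0] = a + b              # *result = a + b


def add(a, b):
    """Model of the wrapper produced under the double *OUTPUT rule."""
    temp = [0.0]                   # temporary storage owned by the wrapper
    c_add(float(a), float(b), temp)
    return temp[0]                 # the output argument becomes the return value


print(add(3, 4))
```

The caller never sees the temporary, so the third parameter disappears from the scripting-language signature.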
Once a typemap has been applied to a type, it stays in effect for all future occurrences of the type and name. For example, you could write the following:
%module example
%include "typemaps.i"
%apply double *OUTPUT { double *result };
%inline %{
extern void add(double a, double b, double *result);
extern void sub(double a, double b, double *result);
extern void mul(double a, double b, double *result);
extern void div(double a, double b, double *result);
%}
...
In this case, the double *OUTPUT rule is applied to all of the functions that follow.
Typemap transformations can even be extended to multiple return values. For example, consider this code:
%include "typemaps.i"
%apply int *OUTPUT { int *width, int *height };
// Returns a pair (width,height)
void getwinsize(int winid, int *width, int *height);
In this case, the function returns multiple values, allowing it to be used like this:
>>> w,h = getwinsize(wid)
>>> print w
400
>>> print h
300
>>>
It should also be noted that although the %apply directive is used to associate typemap rules to datatypes, you can also use the rule names directly in arguments. For example, you could write this:
// Simple example using typemaps
%module example
%include "typemaps.i"
%{
extern void add(double a, double b, double *OUTPUT);
%}
extern void add(double a, double b, double *OUTPUT);
Typemaps stay in effect until they are explicitly deleted or redefined to something else. To clear a typemap, the %clear directive should be used. For example:
%clear double *result; // Remove all typemaps for double *result
The following typemaps instruct SWIG that a pointer really only holds a single input value:
int *INPUT
short *INPUT
long *INPUT
unsigned int *INPUT
unsigned short *INPUT
unsigned long *INPUT
double *INPUT
float *INPUT
When used, it allows values to be passed instead of pointers. For example, consider this function:
double add(double *a, double *b) {
return *a+*b;
}
Now, consider this SWIG interface:
%module example
%include "typemaps.i"
...
%{
extern double add(double *, double *);
%}
extern double add(double *INPUT, double *INPUT);
When the function is used in the scripting language interpreter, it will work like this:
result = add(3,4)
The following typemap rules tell SWIG that a pointer is the output value of a function. When used, you do not need to supply the argument when calling the function. Instead, one or more output values are returned.
int *OUTPUT
short *OUTPUT
long *OUTPUT
unsigned int *OUTPUT
unsigned short *OUTPUT
unsigned long *OUTPUT
double *OUTPUT
float *OUTPUT
These methods can be used as shown in an earlier example. For example, if you have this C function :
void add(double a, double b, double *c) {
*c = a+b;
}
A SWIG interface file might look like this :
%module example
%include "typemaps.i"
...
%inline %{
extern void add(double a, double b, double *OUTPUT);
%}
In this case, only a single output value is returned, but this is not a restriction. An arbitrary number of output values can be returned by applying the output rules to more than one argument (as shown previously).
If the function also returns a value, it is returned along with the argument. For example, if you had this:
extern int foo(double a, double b, double *OUTPUT);
The function will return two values like this:
iresult, dresult = foo(3.5, 2)
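The way a return value and any number of OUTPUT arguments are combined can be sketched as a small Python helper. Everything here is illustrative: wrap_with_outputs is a hypothetical name, one-element lists stand in for the pointer arguments, and the bodies of c_foo and c_getwinsize are invented because the real functions are not shown.

```python
def wrap_with_outputs(c_func, n_outputs):
    """Model of a wrapper whose last n_outputs parameters use OUTPUT rules."""
    def wrapper(*args):
        temps = [[None] for _ in range(n_outputs)]   # one cell per pointer
        ret = c_func(*args, *temps)
        results = [] if ret is None else [ret]       # a void return adds nothing
        results.extend(cell[0] for cell in temps)
        return results[0] if len(results) == 1 else tuple(results)
    return wrapper


def c_foo(a, b, out):
    """Hypothetical body for: int foo(double a, double b, double *OUTPUT)."""
    out[0] = a * b
    return int(a + b)


def c_getwinsize(winid, width, height):
    """Hypothetical body for: void getwinsize(int, int *width, int *height)."""
    width[0], height[0] = 400, 300


foo = wrap_with_outputs(c_foo, 1)
getwinsize = wrap_with_outputs(c_getwinsize, 2)

iresult, dresult = foo(3.5, 2)
w, h = getwinsize(1)
print(iresult, dresult, w, h)
```

The ordering used here, function result first and output arguments after it, follows the iresult, dresult example above.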
When a pointer serves as both an input and output value you can use the following typemaps :
int *INOUT
short *INOUT
long *INOUT
unsigned int *INOUT
unsigned short *INOUT
unsigned long *INOUT
double *INOUT
float *INOUT
A C function that uses this might be something like this:
void negate(double *x) {
*x = -(*x);
}
To make x function as both an input and output value, declare the function like this in an interface file:
%module example
%include "typemaps.i"
...
%{
extern void negate(double *);
%}
extern void negate(double *INOUT);
Now within a script, you can simply call the function normally :
a = negate(3); # a = -3 after calling this
One subtle point of the INOUT rule is that many scripting languages enforce mutability constraints on primitive objects (meaning that simple objects like integers and strings aren't supposed to change). Because of this, you can't just modify the object's value in place as the underlying C function does in this example. Therefore, the INOUT rule returns the modified value as a new object rather than directly overwriting the value of the original input object.
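A short Python model makes the point concrete. It is a sketch only: c_negate is a hypothetical stand-in for the C function, and a one-element list plays the role of the double * it receives.

```python
def c_negate(x):
    """Stand-in for: void negate(double *x)."""
    x[0] = -x[0]                   # *x = -(*x)


def negate(value):
    """Model of the wrapper produced under the double *INOUT rule."""
    temp = [float(value)]          # copy the input into temporary storage
    c_negate(temp)                 # the C function modifies the copy in place
    return temp[0]                 # hand the new value back as a fresh object


a = 3
b = negate(a)
print(a, b)
```

After the call, a still refers to the original object and b holds the modified value, which is why the result has to be assigned, as in a = negate(3).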
Compatibility note : The INOUT rule used to be known as BOTH in earlier versions of SWIG. Backwards compatibility is preserved, but deprecated.
As previously shown, the %apply directive can be used to apply the INPUT, OUTPUT, and INOUT typemaps to different argument names. For example:
// Make double *result an output value
%apply double *OUTPUT { double *result };
// Make Int32 *in an input value
%apply int *INPUT { Int32 *in };
// Make long *x inout
%apply long *INOUT {long *x};
To clear a rule, the %clear directive is used:
%clear double *result;
%clear Int32 *in, long *x;
Typemap declarations are lexically scoped so a typemap takes effect from the point of definition to the end of the file or a matching %clear declaration.
In addition to changing the handling of various input values, it is also possible to use typemaps to apply constraints. For example, maybe you want to ensure that a value is positive, or that a pointer is non-NULL. This can be accomplished by including the constraints.i library file.
The constraints library is best illustrated by the following interface file :
// Interface file with constraints
%module example
%include "constraints.i"

double exp(double x);
double log(double POSITIVE);         // Allow only positive values
double sqrt(double NONNEGATIVE);     // Non-negative values only
double inv(double NONZERO);          // Non-zero values
void   free(void *NONNULL);          // Non-NULL pointers only
The behavior of this file is exactly as you would expect. If any of the arguments violate the constraint condition, a scripting language exception will be raised. As a result, it is possible to catch bad values, prevent mysterious program crashes and so on.
The following constraints are currently available:
POSITIVE                     Any number > 0 (not zero)
NEGATIVE                     Any number < 0 (not zero)
NONNEGATIVE                  Any number >= 0
NONPOSITIVE                  Any number <= 0
NONZERO                      Nonzero number
NONNULL                      Non-NULL pointer (pointers only).
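The checks themselves are simple predicates applied before the wrapped function runs. The Python decorator below is a model of that idea and not the library's implementation. The names CONSTRAINTS and constrained are hypothetical, None stands in for a NULL pointer, and raising ValueError is a choice made for this sketch.

```python
import functools
import inspect
import math

# One predicate per constraint name from the list above.
CONSTRAINTS = {
    "POSITIVE": lambda v: v > 0,
    "NEGATIVE": lambda v: v < 0,
    "NONNEGATIVE": lambda v: v >= 0,
    "NONPOSITIVE": lambda v: v <= 0,
    "NONZERO": lambda v: v != 0,
    "NONNULL": lambda v: v is not None,
}


def constrained(**rules):
    """Check the named arguments against CONSTRAINTS before calling func."""
    def decorate(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            for name, rule in rules.items():
                if not CONSTRAINTS[rule](bound.arguments[name]):
                    raise ValueError(f"Expected a {rule} value for {name!r}")
            return func(*args, **kwargs)
        return wrapper
    return decorate


@constrained(x="POSITIVE")
def log(x):
    return math.log(x)


@constrained(x="NONZERO")
def inv(x):
    return 1.0 / x


print(log(1.0), inv(4))
```

With the check in front, inv(0) fails with a clear constraint error before the division is attempted, which is the kind of early failure the library is meant to provide.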
The constraints library only supports the primitive C datatypes, but it is easy to apply it to new datatypes using %apply. For example :
// Apply a constraint to a Real variable
%apply Number POSITIVE { Real in };
// Apply a constraint to a pointer type
%apply Pointer NONNULL { Vector * };
The special types of "Number" and "Pointer" can be applied to any numeric and pointer variable type respectively. To later remove a constraint, the %clear directive can be used :
%clear Real in;
%clear Vector *;
Disclaimer: This chapter is under construction!
Chances are, you are reading this chapter for one of two reasons; you either want to customize SWIG's behavior or you overheard someone mumbling some incomprehensible drivel about "typemaps" and you asked yourself "typemaps, what are those?" That said, let's start with a short disclaimer that "typemaps" are an advanced customization feature that provide direct access to SWIG's low-level code generator. Not only that, they are an integral part of the SWIG C++ type system (a non-trivial topic of its own). Typemaps are generally not a required part of using SWIG. Therefore, you might want to re-read the earlier chapters if you have found your way to this chapter with only a vague idea of what SWIG already does by default.
One of the most important problems in wrapper code generation is the conversion of datatypes between programming languages. Specifically, for every C/C++ declaration, SWIG must somehow generate wrapper code that allows values to be passed back and forth between languages. Since every programming language represents data differently, this is not a simple matter of linking code together with the C linker. Instead, SWIG has to know something about how data is represented in each language and how it can be manipulated.
To illustrate, suppose you had a simple C function like this:
int factorial(int n);
To access this function from Python, a pair of Python API functions are used to convert integer values. For example:
long PyInt_AsLong(PyObject *obj);      /* Python --> C */
PyObject *PyInt_FromLong(long x);      /* C --> Python */
The first function is used to convert the input argument from a Python integer object to C long. The second function is used to convert a value from C back into a Python integer object.
Inside the wrapper function, you might see these functions used like this:
PyObject *wrap_factorial(PyObject *self, PyObject *args) {
    int       arg1;
    int       result;
    PyObject *obj1;
    PyObject *resultobj;

    if (!PyArg_ParseTuple(args, "O:factorial", &obj1)) return NULL;
    arg1 = PyInt_AsLong(obj1);
    result = factorial(arg1);
    resultobj = PyInt_FromLong(result);
    return resultobj;
}
Every target language supported by SWIG has functions that work in a similar manner. For example, in Perl, the following functions are used:
IV SvIV(SV *sv);                     /* Perl --> C */
void sv_setiv(SV *sv, IV val);       /* C --> Perl */
In Tcl:
int Tcl_GetLongFromObj(Tcl_Interp *interp, Tcl_Obj *obj, long *value);
Tcl_Obj *Tcl_NewLongObj(long value);
The precise details are not so important. What is important is that all of the underlying type conversion is handled by collections of utility functions and short bits of C code like this---you simply have to read the extension documentation for your favorite language to know how it works (an exercise left to the reader).
Since type handling is so central to wrapper code generation, SWIG allows it to be completely defined (or redefined) by the user. To do this, a special %typemap directive is used. For example:
/* Convert from Python --> C */
%typemap(in) int {
$1 = PyInt_AsLong($input);
}
/* Convert from C --> Python */
%typemap(out) int {
$result = PyInt_FromLong($1);
}
At first glance, this code will look a little confusing. However, there is really not much to it. The first typemap (the "in" typemap) is used to convert a value from the target language to C. The second typemap (the "out" typemap) is used to convert in the other direction. The content of each typemap is a small fragment of C code that is inserted directly into the SWIG generated wrapper functions. Within this code, a number of special variables prefixed with a $ are expanded. These are really just placeholders for C variables that are generated in the course of creating the wrapper function. In this case, $input refers to an input object that needs to be converted to C and $result refers to an object that is going to be returned by a wrapper function. $1 refers to a C variable that has the same type as specified in the typemap declaration (an int in this example).
A short example might make this a little more clear. If you were wrapping a function like this:
int gcd(int x, int y);
A wrapper function would look approximately like this:
PyObject *wrap_gcd(PyObject *self, PyObject *args) {
int arg1;
int arg2;
int result;
PyObject *obj1;
PyObject *obj2;
PyObject *resultobj;
if (!PyArg_ParseTuple(args, "OO:gcd", &obj1, &obj2)) return NULL;
/* "in" typemap, argument 1 */
{
arg1 = PyInt_AsLong(obj1);
}
/* "in" typemap, argument 2 */
{
arg2 = PyInt_AsLong(obj2);
}
result = gcd(arg1,arg2);
/* "out" typemap, return value */
{
resultobj = PyInt_FromLong(result);
}
return resultobj;
}
In this code, you can see how the typemap code has been inserted into the function. You can also see how the special $ variables have been expanded to match certain variable names inside the wrapper function. This is really the whole idea behind typemaps: they simply let you insert arbitrary code into different parts of the generated wrapper functions. Because arbitrary code can be inserted, it is possible to completely change the way in which values are converted.
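As a rough illustration of the expansion step, the following C sketch performs the same textual substitution by hand. It is not SWIG's implementation; the helper names substitute and expand_in_typemap, and the objN/argN naming scheme, are assumptions chosen to mirror the wrapper shown above.

```c
#include <stdio.h>
#include <string.h>

/* Replace every occurrence of `key` in `src` with `val`, writing the
 * result into `dst` (capacity `cap`). Returns 0 on success, -1 if the
 * result would not fit. `key` must be non-empty. */
static int substitute(const char *src, const char *key, const char *val,
                      char *dst, size_t cap)
{
    size_t klen = strlen(key), vlen = strlen(val), used = 0;

    while (*src) {
        if (strncmp(src, key, klen) == 0) {
            if (used + vlen >= cap) return -1;
            memcpy(dst + used, val, vlen);
            used += vlen;
            src += klen;
        } else {
            if (used + 1 >= cap) return -1;
            dst[used++] = *src++;
        }
    }
    dst[used] = '\0';
    return 0;
}

/* Expand an "in" typemap body for argument number `argnum`:
 * $input becomes obj<argnum> and $1 becomes arg<argnum>. */
int expand_in_typemap(const char *body, int argnum, char *out, size_t cap)
{
    char tmp[256], name[32];

    snprintf(name, sizeof name, "obj%d", argnum);
    if (substitute(body, "$input", name, tmp, sizeof tmp) != 0) return -1;
    snprintf(name, sizeof name, "arg%d", argnum);
    return substitute(tmp, "$1", name, out, cap);
}
```

Calling expand_in_typemap with the body of the "in" typemap and an argument number of 2 yields the statement that appears in the second block of wrap_gcd.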
As the name implies, the purpose of a typemap is to "map" C datatypes to types in the target language. Once a typemap is defined for a C datatype, it is applied to all future occurrences of that type in the input file. For example:
/* Convert from Perl --> C */
%typemap(in) int {
$1 = SvIV($input);
}
...
int factorial(int n);
int gcd(int x, int y);
int count(char *s, char *t, int max);
The matching of typemaps to C datatypes is more than a simple textual match. In fact, typemaps are fully built into the underlying type system. Therefore, typemaps are unaffected by typedef, namespaces, and other declarations that might hide the underlying type. For example, you could have code like this:
/* Convert from Ruby--> C */
%typemap(in) int {
$1 = NUM2INT($input);
}
...
typedef int Integer;
namespace foo {
typedef Integer Number;
};
int foo(int x);
int bar(Integer y);
int spam(foo::Number a, foo::Number b);
In this case, the typemap is still applied to the proper arguments even though typenames don't always match the text "int". This ability to track types is a critical part of SWIG; in fact, all of the target language modules work merely by defining a set of typemaps for the basic types. Yet, it is never necessary to write new typemaps for typenames introduced by typedef.
In addition to tracking typenames, typemaps may also be specialized to match against a specific argument name. For example, you could write a typemap like this:
%typemap(in) double nonnegative {
$1 = PyFloat_AsDouble($input);
if ($1 < 0) {
PyErr_SetString(PyExc_ValueError,"argument must be nonnegative.");
return NULL;
}
}
...
double sin(double x);
double cos(double x);
double sqrt(double nonnegative);
typedef double Real;
double log(Real nonnegative);
...
For certain tasks such as input argument conversion, typemaps can be defined for sequences of consecutive arguments. For example:
%typemap(in) (char *str, int len) {
$1 = PyString_AsString($input); /* char *str */
$2 = PyString_Size($input); /* int len */
}
...
int count(char *str, int len, char c);
In this case, a single input object is expanded into a pair of C arguments. This example also provides a hint to the unusual variable naming scheme involving $1, $2, and so forth.
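To make the "one object, two C arguments" idea concrete without depending on any scripting language headers, here is a plain C sketch of what such a wrapper does. The StringObject struct is a hypothetical stand-in for a target-language string, and the body of count is an assumed implementation added only so the example is complete.

```c
#include <stddef.h>

/* Hypothetical stand-in for a scripting-language string object. */
typedef struct {
    const char *data;
    size_t      size;
} StringObject;

/* The C function being wrapped: count occurrences of c in str[0..len). */
int count(char *str, int len, char c)
{
    int n = 0;
    for (int i = 0; i < len; i++) {
        if (str[i] == c) n++;
    }
    return n;
}

/* Hand-written analogue of the wrapper: two inputs, three C arguments. */
int wrap_count(const StringObject *obj1, char obj2)
{
    char *arg1;
    int   arg2;
    char  arg3;

    /* multi-argument "in" typemap: one object fills both $1 and $2 */
    {
        arg1 = (char *) obj1->data;   /* char *str */
        arg2 = (int) obj1->size;      /* int len   */
    }
    arg3 = obj2;
    return count(arg1, arg2, arg3);
}
```

The caller supplies only the string and the character; the length never appears as a separate argument on the scripting side.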
Typemaps are normally defined for specific type and argument name patterns. However, typemaps can also be copied and reused. One way to do this is to use assignment like this:
%typemap(in) Integer = int;
%typemap(in) (char *buffer, int size) = (char *str, int len);
A more general form of copying is found in the %apply directive like this:
%typemap(in) int {
/* Convert an integer argument */
...
}
%typemap(out) int {
/* Return an integer value */
...
}
/* Apply all of the integer typemaps to size_t */
%apply int { size_t };
%apply merely takes all of the typemaps that are defined for one type and applies them to other types. Note: you can include a comma separated set of types in the { ... } part of %apply.
It should be noted that it is not necessary to copy typemaps for types that are related by typedef. For example, if you have this,
typedef int size_t;
then SWIG already knows that the int typemaps apply. You don't have to do anything.
The primary use of typemaps is for defining wrapper generation behavior at the level of individual C/C++ datatypes. There are currently six general categories of problems that typemaps address:
Argument handling
int foo(int x, double y, char *s);
Return value handling
int foo(int x, double y, char *s);
Exception handling
int foo(int x, double y, char *s) throw(MemoryError, IndexError);
Global variables
int foo;
Member variables
struct Foo {
int x[20];
};
Constant creation
#define FOO 3
%constant int BAR = 42;
enum { ALE, LAGER, STOUT };
Details of each of these typemaps will be covered shortly. Also, certain language modules may define additional typemaps that expand upon this list. For example, the Java module defines a variety of typemaps for controlling additional aspects of the Java bindings. Consult language specific documentation for further details.
Typemaps can't be used to define properties that apply to C/C++ declarations as a whole. For example, suppose you had a declaration like this,
Foo *make_Foo();
and you wanted to tell SWIG that make_Foo() returned a newly allocated object (for the purposes of providing better memory management). Clearly, this property of make_Foo() is not a property that would be associated with the datatype Foo * by itself. Therefore, a completely different SWIG customization mechanism (%feature) is used for this purpose. Consult the Customization Features chapter for more information about that.
Typemaps also can't be used to rearrange or transform the order of arguments. For example, if you had a function like this:
void foo(int, char *);
you can't use typemaps to interchange the arguments, allowing you to call the function like this:
foo("hello",3) # Reversed arguments
If you want to change the calling conventions of a function, write a helper function instead. For example:
%rename(foo) wrap_foo;
%inline %{
void wrap_foo(char *s, int x) {
foo(x,s);
}
%}
The rest of this chapter provides detailed information for people who want to write new typemaps. This information is of particular importance to anyone who intends to write a new SWIG target language module. Power users can also use this information to write application specific type conversion rules.
Since typemaps are strongly tied to the underlying C++ type system, subsequent sections assume that you are reasonably familiar with the basic details of values, pointers, references, arrays, type qualifiers (e.g., const), structures, namespaces, templates, and memory management in C/C++. If not, you would be well-advised to consult a copy of "The C Programming Language" by Kernighan and Ritchie or "The C++ Programming Language" by Stroustrup before going any further.
This section describes the behavior of the %typemap directive itself.
New typemaps are defined using the %typemap declaration. The general form of this declaration is as follows (parts enclosed in [ ... ] are optional):
%typemap(method [, modifiers]) typelist code ;
method is simply a name that specifies what kind of typemap is being defined. It is usually a name like "in", "out", or "argout". The purpose of these methods is described later.
modifiers is an optional comma separated list of name="value" values. These are sometimes used to attach extra information to a typemap and are often target-language dependent.
typelist is a list of the C++ type patterns that the typemap will match. The general form of this list is as follows:
typelist : typepattern [, typepattern, typepattern, ... ] ;
typepattern : type [ (parms) ]
| type name [ (parms) ]
| ( typelist ) [ (parms) ]
Each type pattern is either a simple type, a simple type and argument name, or a list of types in the case of multi-argument typemaps. In addition, each type pattern can be parameterized with a list of temporary variables (parms). The purpose of these variables will be explained shortly.
code specifies the code used in the typemap. Usually this is C/C++ code, but in the statically typed target languages, such as Java and C#, this can contain target language code for certain typemaps. It can take any one of the following forms:
code : { ... }
| " ... "
| %{ ... %}
Note that the preprocessor will expand code within the {} delimiters, but not in the last two styles of delimiters; see the Preprocessor and Typemaps section. Here are some examples of valid typemap specifications:
/* Simple typemap declarations */
%typemap(in) int {
$1 = PyInt_AsLong($input);
}
%typemap(in) int "$1 = PyInt_AsLong($input);";
%typemap(in) int %{
$1 = PyInt_AsLong($input);
%}
/* Typemap with extra argument name */
%typemap(in) int nonnegative {
...
}
/* Multiple types in one typemap */
%typemap(in) int, short, long {
$1 = SvIV($input);
}
/* Typemap with modifiers */
%typemap(in,doc="integer") int "$1 = gh_scm2int($input);";
/* Typemap applied to patterns of multiple arguments */
%typemap(in) (char *str, int len),
(char *buffer, int size)
{
$1 = PyString_AsString($input);
$2 = PyString_Size($input);
}
/* Typemap with extra pattern parameters */
%typemap(in, numinputs=0) int *output (int temp),
long *output (long temp)
{
$1 = &temp;
}
Admittedly, it's not the most readable syntax at first glance. However, the purpose of the individual pieces will become clear.
Once defined, a typemap remains in effect for all of the declarations that follow. A typemap may be redefined for different sections of an input file. For example:
// typemap1
%typemap(in) int {
...
}
int fact(int); // typemap1
int gcd(int x, int y); // typemap1
// typemap2
%typemap(in) int {
...
}
int isprime(int); // typemap2
One exception to the typemap scoping rules pertains to the %extend declaration. %extend is used to attach new declarations to a class or structure definition. Because of this, all of the declarations in an %extend block are subject to the typemap rules that are in effect at the point where the class itself is defined. For example:
class Foo {
...
};
%typemap(in) int {
...
}
%extend Foo {
int blah(int x); // typemap has no effect. Declaration is attached to Foo which
// appears before the %typemap declaration.
};
A typemap is copied by using assignment. For example:
%typemap(in) Integer = int;
or this:
%typemap(in) Integer, Number, int32_t = int;
Types are often managed by a collection of different typemaps. For example:
%typemap(in) int { ... }
%typemap(out) int { ... }
%typemap(varin) int { ... }
%typemap(varout) int { ... }
To copy all of these typemaps to a new type, use %apply. For example:
%apply int { Integer }; // Copy all int typemaps to Integer
%apply int { Integer, Number }; // Copy all int typemaps to both Integer and Number
The patterns for %apply follow the same rules as for %typemap. For example:
%apply int *output { Integer *output }; // Typemap with name
%apply (char *buf, int len) { (char *buffer, int size) }; // Multiple arguments
A typemap can be deleted by simply defining no code. For example:
%typemap(in) int;               // Clears typemap for int
%typemap(in) int, long, short;  // Clears typemap for int, long, short
%typemap(in) int *output;
The %clear directive clears all typemaps for a given type. For example:
%clear int;                     // Removes all typemaps for int
%clear int *output, long *output;
Note: Since SWIG's default behavior is defined by typemaps, clearing a fundamental type like int will make that type unusable unless you also define a new set of typemaps immediately after the clear operation.
Typemaps can be declared in the global scope, within a C++ namespace, and within a C++ class. For example:
%typemap(in) int {
...
}
namespace std {
class string;
%typemap(in) string {
...
}
}
class Bar {
public:
typedef const int & const_reference;
%typemap(out) const_reference {
...
}
};
When a typemap appears inside a namespace or class, it stays in effect until the end of the SWIG input (just like before). However, the typemap takes the local scope into account. Therefore, this code
namespace std {
class string;
%typemap(in) string {
...
}
}
is really defining a typemap for the type std::string. You could have code like this:
namespace std {
class string;
%typemap(in) string { /* std::string */
...
}
}
namespace Foo {
class string;
%typemap(in) string { /* Foo::string */
...
}
}
In this case, there are two completely distinct typemaps that apply to two completely different types (std::string and Foo::string).
It should be noted that for scoping to work, SWIG has to know that string is a typename defined within a particular namespace. In this example, this is done using the class declaration class string.
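This section describes the pattern matching rules by which C datatypes are associated with typemaps.
Typemaps are matched using both a type and a name (typically the name of an argument). For a given TYPE NAME pair, the following rules are applied, in order, to find a match. The first typemap found is used.
Typemaps that exactly match TYPE and NAME.
Typemaps that exactly match TYPE only.
If TYPE includes qualifiers (const, volatile, etc.), they are stripped and the following checks are made:
Typemaps that match the stripped TYPE and NAME.
Typemaps that match the stripped TYPE only.
If TYPE is an array, the following transformation is made:
Replace all dimensions with [ANY] and look for a generic array typemap.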
To illustrate, suppose that you had a function like this:
int foo(const char *s);
To find a typemap for the argument const char *s, SWIG will search for the following typemaps:
const char *s     Exact type and name match
const char *      Exact type match
char *s           Type and name match (stripped qualifiers)
char *            Type match (stripped qualifiers)
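The search order listed above can be reproduced with a few lines of C. This is only a sketch of the ordering, not SWIG's matcher: it handles just a leading "const " qualifier, ignores arrays, and the names join and search_patterns are invented for the example.

```c
#include <stdio.h>
#include <string.h>

#define MAX_PATTERN 64

/* Join a type and an argument name; no space is added after '*' so that
 * "char *" plus "s" gives "char *s". */
static void join(char *dst, const char *type, const char *name)
{
    size_t n = strlen(type);
    const char *sep = (n > 0 && type[n - 1] == '*') ? "" : " ";
    snprintf(dst, MAX_PATTERN, "%s%s%s", type, sep, name);
}

/* Fill `out` with the patterns tried for (type, name), in search order.
 * Returns the number of patterns written: 4 if a leading "const " was
 * stripped, otherwise 2. */
int search_patterns(const char *type, const char *name,
                    char out[4][MAX_PATTERN])
{
    int n = 0;

    join(out[n++], type, name);                       /* exact type and name */
    snprintf(out[n++], MAX_PATTERN, "%s", type);      /* exact type only     */

    if (strncmp(type, "const ", 6) == 0) {
        const char *stripped = type + 6;
        join(out[n++], stripped, name);               /* stripped type and name */
        snprintf(out[n++], MAX_PATTERN, "%s", stripped); /* stripped type only  */
    }
    return n;
}
```

For the type "const char *" and the name "s", the function produces the four patterns in the same order as the table.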
When more than one typemap rule might be defined, only the first match found is actually used. Here is an example that shows how some of the basic rules are applied:
%typemap(in) int *x {
... typemap 1
}
%typemap(in) int * {
... typemap 2
}
%typemap(in) const int *z {
... typemap 3
}
%typemap(in) int [4] {
... typemap 4
}
%typemap(in) int [ANY] {
... typemap 5
}
void A(int *x); // int *x rule (typemap 1)
void B(int *y); // int * rule (typemap 2)
void C(const int *x); // int *x rule (typemap 1)
void D(const int *z); // const int *z rule (typemap 3)
void E(int x[4]); // int [4] rule (typemap 4)
void F(int x[1000]); // int [ANY] rule (typemap 5)
If no match is found using the rules in the previous section, SWIG applies a typedef reduction to the type and repeats the typemap search for the reduced type. To illustrate, suppose you had code like this:
%typemap(in) int {
... typemap 1
}
typedef int Integer;
void blah(Integer x);
To find the typemap for Integer x, SWIG will first search for the following typemaps:
Integer x
Integer
Finding no match, it then applies a reduction Integer -> int to the type and repeats the search.
int x
int      --> match: typemap 1
Even though two types might be the same via typedef, SWIG allows typemaps to be defined for each typename independently. This allows for interesting customization possibilities based solely on the typename itself. For example, you could write code like this:
typedef double pdouble; // Positive double
// typemap 1
%typemap(in) double {
... get a double ...
}
// typemap 2
%typemap(in) pdouble {
... get a positive double ...
}
double sin(double x); // typemap 1
pdouble sqrt(pdouble x); // typemap 2
When reducing the type, only one typedef reduction is applied at a time. The search process continues to apply reductions until a match is found or until no more reductions can be made.
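The reduce-and-retry loop can be sketched in C as follows. The two tables are hypothetical and hard-coded (a real type system is far richer), the sketch matches on the type alone and ignores argument names, and the function names are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

typedef struct { const char *name; const char *target; } TypedefEntry;
typedef struct { const char *type; const char *label;  } TypemapEntry;

/* Hypothetical typedef table: name -> type it was declared as. */
static const TypedefEntry typedefs[] = {
    { "Integer", "int"     },
    { "Number",  "Integer" },
    { "pdouble", "double"  },
};

/* Hypothetical set of defined "in" typemaps. */
static const TypemapEntry typemaps[] = {
    { "int",     "int typemap" },
    { "double",  "typemap 1"   },
    { "pdouble", "typemap 2"   },
};

static const char *lookup_typemap(const char *type)
{
    for (size_t i = 0; i < sizeof typemaps / sizeof typemaps[0]; i++)
        if (strcmp(typemaps[i].type, type) == 0) return typemaps[i].label;
    return NULL;
}

/* Apply exactly one typedef reduction, or return NULL if none applies. */
static const char *reduce_once(const char *type)
{
    for (size_t i = 0; i < sizeof typedefs / sizeof typedefs[0]; i++)
        if (strcmp(typedefs[i].name, type) == 0) return typedefs[i].target;
    return NULL;
}

/* Search, reduce by one typedef, and search again until a match is
 * found or no further reduction is possible. */
const char *find_typemap(const char *type)
{
    while (type != NULL) {
        const char *label = lookup_typemap(type);
        if (label != NULL) return label;
        type = reduce_once(type);
    }
    return NULL;
}
```

Because the lookup happens before each reduction, pdouble finds its own typemap and never falls through to the one for double, while Number needs two reductions before it reaches int.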
For complicated types, the reduction process can generate a long list of patterns. Consider the following:
typedef int Integer;
typedef Integer Row4[4];
void foo(Row4 rows[10]);
To find a match for the Row4 rows[10] argument, SWIG would check the following patterns, stopping only when it found a match:
Row4 rows[10]
Row4 [10]
Row4 rows[ANY]
Row4 [ANY]
# Reduce Row4 --> Integer[4]
Integer rows[10][4]
Integer [10][4]
Integer rows[ANY][ANY]
Integer [ANY][ANY]
# Reduce Integer --> int
int rows[10][4]
int [10][4]
int rows[ANY][ANY]
int [ANY][ANY]
For parameterized types like templates, the situation is even more complicated. Suppose you had some declarations like this:
typedef int Integer;
typedef foo<Integer,Integer> fooii;
void blah(fooii *x);
In this case, the following typemap patterns are searched for the argument fooii *x:
fooii *x
fooii *
# Reduce fooii --> foo<Integer,Integer>
foo<Integer,Integer> *x
foo<Integer,Integer> *
# Reduce Integer -> int
foo<int, Integer> *x
foo<int, Integer> *
# Reduce Integer -> int
foo<int, int> *x
foo<int, int> *
Typemap reductions are always applied to the left-most type that appears. Only when no reductions can be made to the left-most type are reductions made to other parts of the type. This behavior means that you could define a typemap for foo<int,Integer>, but a typemap for foo<Integer,int> would never be matched. Admittedly, this is rather esoteric--there's little practical reason to write a typemap quite like that. Of course, you could rely on this to confuse your coworkers even more.
Most SWIG language modules use typemaps to define the default behavior of the C primitive types. This is entirely straightforward. For example, a set of typemaps are written like this:
%typemap(in) int   "convert an int";
%typemap(in) short "convert a short";
%typemap(in) float "convert a float";
...
Since typemap matching follows all typedef declarations, any sort of type that is mapped to a primitive type through typedef will be picked up by one of these primitive typemaps.
The default behavior for pointers, arrays, references, and other kinds of types is handled by specifying rules for variations of the reserved SWIGTYPE type. For example:
%typemap(in) SWIGTYPE * { ... default pointer handling ... }
%typemap(in) SWIGTYPE & { ... default reference handling ... }
%typemap(in) SWIGTYPE [] { ... default array handling ... }
%typemap(in) enum SWIGTYPE { ... default handling for enum values ... }
%typemap(in) SWIGTYPE (CLASS::*) { ... default pointer member handling ... }
These rules match any kind of pointer, reference, or array--even when multiple levels of indirection or multiple array dimensions are used. Therefore, if you wanted to change SWIG's default handling for all types of pointers, you would simply redefine the rule for SWIGTYPE *.
Finally, the following typemap rule is used to match against simple types that don't match any other rules:
%typemap(in) SWIGTYPE { ... handle an unknown type ... }
This typemap is important because it is the rule that gets triggered when call or return by value is used. For instance, if you have a declaration like this:
double dot_product(Vector a, Vector b);
The Vector type will usually just get matched against SWIGTYPE. The default implementation of SWIGTYPE is to convert the value into pointers (as described in chapter 3).
By redefining SWIGTYPE it may be possible to implement other behavior. For example, if you cleared all typemaps for SWIGTYPE, SWIG simply won't wrap any unknown datatype (which might be useful for debugging). Alternatively, you might modify SWIGTYPE to marshal objects into strings instead of converting them to pointers.
The best way to explore the default typemaps is to look at the ones already defined for a particular language module. Typemap definitions are usually found in the SWIG library in a file such as python.swg, tcl8.swg, etc.
The default typemaps described above can be mixed with const and with each other. For example the SWIGTYPE * typemap is for default pointer handling, but if a const SWIGTYPE * typemap is defined it will be used instead for constant pointers. Some further examples follow:
%typemap(in) enum SWIGTYPE & { ... enum references ... }
%typemap(in) const enum SWIGTYPE & { ... const enum references ... }
%typemap(in) SWIGTYPE *& { ... pointers passed by reference ... }
%typemap(in) SWIGTYPE * const & { ... constant pointers passed by reference ... }
%typemap(in) SWIGTYPE[ANY][ANY] { ... 2D arrays ... }
Note that the typedef reduction described earlier is also used with these mixed default typemaps. For example, say the following typemaps are defined and SWIG is looking for the best match for the enum shown below:
%typemap(in) const Hello & { ... }
%typemap(in) const enum SWIGTYPE & { ... }
%typemap(in) enum SWIGTYPE & { ... }
%typemap(in) SWIGTYPE & { ... }
%typemap(in) SWIGTYPE { ... }
enum Hello {};
const Hello &hi;
The typemap at the top of the list will be chosen, not because it is defined first, but because it is the closest match for the type being wrapped. If any of the typemaps in the above list were not defined, then the next one on the list would have precedence. In other words the typemap chosen is the closest explicit match.
Compatibility note: The mixed default typemaps were introduced in SWIG-1.3.23, but were not used much in this version. Expect to see them being used more and more within the various libraries in later versions of SWIG.
When multi-argument typemaps are specified, they take precedence over any typemaps specified for a single type. For example:
%typemap(in) (char *buffer, int len) {
// typemap 1
}
%typemap(in) char *buffer {
// typemap 2
}
void foo(char *buffer, int len, int count); // (char *buffer, int len)
void bar(char *buffer, int blah); // char *buffer
Multi-argument typemaps are also more restrictive in the way that they are matched. Currently, the first argument follows the matching rules described in the previous section, but all subsequent arguments must match exactly.
This section describes rules by which typemap code is inserted into the generated wrapper code.
When a typemap is defined like this:
%typemap(in) int {
$1 = PyInt_AsLong($input);
}
the typemap code is inserted into the wrapper function using a new block scope. In other words, the wrapper code will look like this:
wrap_whatever() {
...
// Typemap code
{
arg1 = PyInt_AsLong(obj1);
}
...
}
Because the typemap code is enclosed in its own block, it is legal to declare temporary variables for use during typemap execution. For example:
%typemap(in) short {
long temp; /* Temporary value */
if (Tcl_GetLongFromObj(interp, $input, &temp) != TCL_OK) {
return TCL_ERROR;
}
$1 = (short) temp;
}
Of course, any variables that you declare inside a typemap are destroyed as soon as the typemap code has executed (they are not visible to other parts of the wrapper function or other typemaps that might use the same variable names).
Occasionally, typemap code will be specified using a few alternative forms. For example:
%typemap(in) int "$1 = PyInt_AsLong($input);";
%typemap(in) int %{
$1 = PyInt_AsLong($input);
%}
These two forms are mainly used for cosmetics--the specified code is not enclosed inside a block scope when it is emitted. This sometimes results in a less complicated looking wrapper function.
Sometimes it is useful to declare a new local variable that exists within the scope of the entire wrapper function. A good example of this might be an application in which you wanted to marshal strings. Suppose you had a C++ function like this
int foo(std::string *s);
and you wanted to pass a native string in the target language as an argument. For instance, in Perl, you wanted the function to work like this:
$x = foo("Hello World");
To do this, you can't just pass a raw Perl string as the std::string * argument. Instead, you have to create a temporary std::string object, copy the Perl string data into it, and then pass a pointer to the object. To do this, simply specify the typemap with an extra parameter like this:
%typemap(in) std::string * (std::string temp) {
unsigned int len;
char *s;
s = SvPV($input,len); /* Extract string data */
temp.assign(s,len); /* Assign to temp */
$1 = &temp; /* Set argument to point to temp */
}
In this case, temp becomes a local variable in the scope of the entire wrapper function. For example:
wrap_foo() {
std::string temp; <--- Declaration of temp goes here
...
/* Typemap code */
{
...
temp.assign(s,len);
...
}
...
}
When you set temp to a value, it persists for the duration of the wrapper function and gets cleaned up automatically on exit.
It is perfectly safe to use more than one typemap involving local variables in the same declaration. For example, you could declare a function as:
void foo(std::string *x, std::string *y, std::string *z);
This is safely handled because SWIG actually renames all local variable references by appending an argument number suffix. Therefore, the generated code would actually look like this:
wrap_foo() {
std::string *arg1; /* Actual arguments */
std::string *arg2;
std::string *arg3;
std::string temp1; /* Locals declared in the typemap */
std::string temp2;
std::string temp3;
...
{
char *s;
unsigned int len;
...
temp1.assign(s,len);
arg1 = &temp1;
}
{
char *s;
unsigned int len;
...
temp2.assign(s,len);
arg2 = &temp2;
}
{
char *s;
unsigned int len;
...
temp3.assign(s,len);
arg3 = &temp3;
}
...
}
Some typemaps do not recognize local variables (or they may simply not apply). At this time, only typemaps that apply to argument conversion support this.
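The difference between the two kinds of temporaries can be seen in a plain C analogue. In this sketch, which uses fixed-size char buffers in place of std::string and an invented function total_length, the buffers temp1 and temp2 live for the whole wrapper, while each len exists only inside its own block.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical function being wrapped. */
size_t total_length(const char *x, const char *y)
{
    return strlen(x) + strlen(y);
}

size_t wrap_total_length(const char *obj1, const char *obj2)
{
    const char *arg1;
    const char *arg2;
    char temp1[32];   /* typemap local "temp", suffixed for argument 1 */
    char temp2[32];   /* typemap local "temp", suffixed for argument 2 */

    {
        size_t len = strlen(obj1);       /* block-local temporary */
        if (len >= sizeof temp1) len = sizeof temp1 - 1;
        memcpy(temp1, obj1, len);
        temp1[len] = '\0';
        arg1 = temp1;
    }
    {
        size_t len = strlen(obj2);       /* same name, separate block */
        if (len >= sizeof temp2) len = sizeof temp2 - 1;
        memcpy(temp2, obj2, len);
        temp2[len] = '\0';
        arg2 = temp2;
    }

    /* temp1 and temp2 are still valid here; both len variables are gone. */
    return total_length(arg1, arg2);
}
```

Without the numeric suffix the two buffers would collide at function scope, which is the problem the renaming avoids.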
Note:
When declaring a typemap for multiple types, each type must have its own local variable declaration.
%typemap(in) const std::string *, std::string * (std::string temp) // NO!
// only std::string * has a local variable
// const std::string * does not (oops)
....
%typemap(in) const std::string * (std::string temp), std::string * (std::string temp) // Correct
....
Within all typemaps, the following special variables are expanded.
| Variable | Meaning |
|---|---|
| $n | A C local variable corresponding to type n in the typemap pattern. |
| $argnum | Argument number. Only available in typemaps related to argument conversion |
| $n_name | Argument name |
| $n_type | Real C datatype of type n. |
| $n_ltype | ltype of type n |
| $n_mangle | Mangled form of type n. For example _p_Foo |
| $n_descriptor | Type descriptor structure for type n. For example SWIGTYPE_p_Foo. This is primarily used when interacting with the run-time type checker (described later). |
| $*n_type | Real C datatype of type n with one pointer removed. |
| $*n_ltype | ltype of type n with one pointer removed. |
| $*n_mangle | Mangled form of type n with one pointer removed. |
| $*n_descriptor | Type descriptor structure for type n with one pointer removed. |
| $&n_type | Real C datatype of type n with one pointer added. |
| $&n_ltype | ltype of type n with one pointer added. |
| $&n_mangle | Mangled form of type n with one pointer added. |
| $&n_descriptor | Type descriptor structure for type n with one pointer added. |
| $n_basetype | Base typename with all pointers and qualifiers stripped. |
Within the table, $n refers to a specific type within the typemap specification. For example, if you write this
%typemap(in) int *INPUT {
}
then $1 refers to int *INPUT. If you have a typemap like this,
%typemap(in) (int argc, char *argv[]) {
...
}
then $1 refers to int argc and $2 refers to char *argv[].
Substitutions related to types and names always fill in values from the actual code that was matched. This is useful when a typemap might match multiple C datatypes. For example:
%typemap(in) int, short, long {
$1 = ($1_ltype) PyInt_AsLong($input);
}
In this case, $1_ltype is replaced with the datatype that is actually matched.
When typemap code is emitted, the C/C++ datatype of the special variables $1 and $2 is always an "ltype." An "ltype" is simply a type that can legally appear on the left-hand side of a C assignment operation. Here are a few examples of types and ltypes:
type              ltype
------            ----------------
int               int
const int         int
const int *       int *
int [4]           int *
int [4][5]        int (*)[5]
In most cases an ltype is simply the C datatype with qualifiers stripped off. In addition, arrays are converted into pointers.
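The rows of the table correspond to ordinary C rules, which the following sketch exercises directly. The function names sum_table and ltype_demo and the sample values are made up for the example.

```c
/* Sum the first `rows` rows of a table whose rows hold 5 ints each.
 * The parameter type int (*)[5] is the ltype listed for int [4][5]. */
int sum_table(int (*table)[5], int rows)
{
    int total = 0;
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < 5; j++)
            total += table[i][j];
    return total;
}

int ltype_demo(void)
{
    const int limit = 3;          /* type:  const int           */
    int       arg1;               /* ltype: int (assignable)    */
    int       grid[4][5] = {{0}}; /* type:  int [4][5]          */
    int     (*arg2)[5];           /* ltype: int (*)[5]          */

    arg1 = limit;        /* legal: the const qualifier is gone from the ltype */
    grid[1][2] = 10;
    grid[3][4] = arg1;
    arg2 = grid;         /* legal: the array decays to a pointer to its first row */
    return sum_table(arg2, 4);
}
```

Neither `limit = ...` nor `grid = ...` would compile, which is why the wrapper's local variables are declared with the ltype rather than the original type.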
Variables such as $&1_type and $*1_type are used to safely modify the type by removing or adding pointers. Although not needed in most typemaps, these substitutions are sometimes needed to properly work with typemaps that convert values between pointers and values.
If necessary, type related substitutions can also be used when declaring locals. For example:
%typemap(in) int * ($*1_type temp) {
temp = PyInt_AsLong($input);
$1 = &temp;
}
There is one word of caution about declaring local variables in this manner. If you declare a local variable using a type substitution such as $1_ltype temp, it won't work like you expect for arrays and certain kinds of pointers. For example, if you wrote this,
%typemap(in) int [10][20] {
$1_ltype temp;
}
then the declaration of temp will be expanded as
int (*)[20] temp;
This is illegal C syntax and won't compile. There is currently no straightforward way to work around this problem in SWIG due to the way that typemap code is expanded and processed. However, one possible workaround is to simply pick an alternative type such as void * and use casts to get the correct type when needed. For example:
%typemap(in) int [10][20] {
void *temp;
...
(($1_ltype) temp)[i][j] = x; /* set a value */
...
}
Another approach, which only works for arrays, is to use the $1_basetype substitution. For example:
%typemap(in) int [10][20] {
$1_basetype temp[10][20];
...
temp[i][j] = x; /* set a value */
...
}
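Both workarounds reduce to ordinary C once the substitutions are expanded. The sketch below assumes the int [10][20] case, so ($1_ltype) becomes (int (*)[20]) and $1_basetype becomes int; the function names are invented for the example.

```c
/* Store x at [i][j] of a 10x20 int array reached through a void pointer.
 * The cast plays the role of ($1_ltype) in the first workaround. */
void set_cell(void *temp, int i, int j, int x)
{
    ((int (*)[20]) temp)[i][j] = x;
}

int workaround_demo(void)
{
    /* Second workaround: "$1_basetype temp[10][20];" expands to this. */
    int storage[10][20] = {{0}};
    void *temp = storage;

    set_cell(temp, 9, 19, 42);
    set_cell(temp, 0, 0, 1);
    return storage[9][19] + storage[0][0];
}
```

The cast form keeps the declaration legal at the cost of repeating the cast at every use; the basetype form reads more naturally but applies only to arrays.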
The set of typemaps recognized by a language module may vary. However, the following typemap methods are nearly universal:
The "in" typemap is used to convert function arguments from the target language to C. For example:
%typemap(in) int {
$1 = PyInt_AsLong($input);
}
The following special variables are available:
$input - Input object holding value to be converted.
$symname - Name of function/method being wrapped
This is probably the most commonly redefined typemap because it can be used to implement customized conversions.
In addition, the "in" typemap allows the number of converted arguments to be specified. The numinputs attribute facilitates this. For example:
// Ignored argument.
%typemap(in, numinputs=0) int *out (int temp) {
$1 = &temp;
}
At this time, only zero or one arguments may be converted. When numinputs is set to 0, the argument is effectively ignored and cannot be supplied from the target language. The argument is still required when making the C/C++ call, and the above typemap shows that the value used is instead obtained from a locally declared variable called temp. Usually numinputs is not specified, in which case the default value is 1; that is, there is a one-to-one mapping between the arguments used from the target language and those passed to the C/C++ call. Multi-argument typemaps provide a similar concept where the number of arguments mapped from the target language to C/C++ can be changed for multiple adjacent C/C++ arguments.
Compatibility note: Specifying numinputs=0 is the same as the old "ignore" typemap.
The "typecheck" typemap is used to support overloaded functions and methods. It merely checks an argument to see whether or not it matches a specific type. For example:
%typemap(typecheck,precedence=SWIG_TYPECHECK_INTEGER) int {
$1 = PyInt_Check($input) ? 1 : 0;
}
For typechecking, the $1 variable is always a simple integer that is set to 1 or 0 depending on whether or not the input argument is the correct type.
If you define new "in" typemaps and your program uses overloaded methods, you should also define a collection of "typecheck" typemaps. More details about this follow in a later section on "Typemaps and Overloading."
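The role of the 1-or-0 check in choosing an overload can be pictured with a small tagged value type. This is a simplified sketch, not the dispatch code SWIG generates: the Value struct, the check functions, and the returned overload names are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical dynamically typed value from a scripting language. */
typedef enum { VAL_INT, VAL_STRING } ValueKind;
typedef struct {
    ValueKind   kind;
    long        i;
    const char *s;
} Value;

/* Analogues of "typecheck" typemaps: report 1 or 0, convert nothing. */
static int check_int(const Value *v)    { return v->kind == VAL_INT    ? 1 : 0; }
static int check_string(const Value *v) { return v->kind == VAL_STRING ? 1 : 0; }

/* Pick the overload whose check accepts the argument; the checks are
 * tried in a fixed order, standing in for typecheck precedence. */
const char *dispatch_show(const Value *v)
{
    if (check_int(v))    return "show(int)";
    if (check_string(v)) return "show(char *)";
    return NULL;   /* no overload matches */
}
```

If a custom "in" typemap accepts objects that the corresponding check rejects, the dispatcher never reaches that overload, which is why the two kinds of typemap need to be kept consistent.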
The "out" typemap is used to convert function/method return values from C into the target language. For example:
%typemap(out) int {
$result = PyInt_FromLong($1);
}
The following special variables are available.
$result - Result object returned to target language.
$symname - Name of function/method being wrapped
The "out" typemap supports an optional attribute flag called "optimal". This is for code optimisation and is detailed in the Optimal code generation when returning by value section.
The "arginit" typemap is used to set the initial value of a function argument--before any conversion has occurred. This is not normally necessary, but might be useful in highly specialized applications. For example:
// Set argument to NULL before any conversion occurs
%typemap(arginit) int *data {
$1 = NULL;
}
The "default" typemap is used to turn an argument into a default argument. For example:
%typemap(default) int flags {
$1 = DEFAULT_FLAGS;
}
...
int foo(int x, int y, int flags);
The primary use of this typemap is to either change the wrapping of default arguments or specify a default argument in a language where they aren't supported (like C). Target languages that do not support optional arguments, such as Java and C#, effectively ignore the value specified by this typemap as all arguments must be given.
Once a default typemap has been applied to an argument, all arguments that follow must have default values. See the Default/optional arguments section for further information on default argument wrapping.
The "check" typemap is used to supply value checking code during argument conversion. The typemap is applied after arguments have been converted. For example:
%typemap(check) int positive {
if ($1 <= 0) {
SWIG_exception(SWIG_ValueError,"Expected positive value.");
}
}
The "argout" typemap is used to return values from arguments. This is most commonly used to write wrappers for C/C++ functions that need to return multiple values. The "argout" typemap is almost always combined with an "in" typemap---possibly to ignore the input value. For example:
/* Set the input argument to point to a temporary variable */
%typemap(in, numinputs=0) int *out (int temp) {
$1 = &temp;
}
%typemap(argout) int *out {
// Append output value $1 to $result
...
}
The following special variables are available.
$result - Result object returned to target language.
$input - The original input object passed.
$symname - Name of function/method being wrapped
The code supplied to the "argout" typemap is always placed after the "out" typemap. If multiple return values are used, the extra return values are often appended to the return value of the function.
See the typemaps.i library for examples.
The "freearg" typemap is used to clean up argument data. It is only used when an argument might have allocated resources that need to be cleaned up when the wrapper function exits. The "freearg" typemap usually cleans up argument resources allocated by the "in" typemap. For example:
// Get a list of integers
%typemap(in) int *items {
int nitems = Length($input);
$1 = (int *) malloc(sizeof(int)*nitems);
}
// Free the list
%typemap(freearg) int *items {
free($1);
}
The "freearg" typemap is inserted at the end of the wrapper function, just before control is returned back to the target language. This code is also placed into a special variable $cleanup that may be used in other typemaps whenever a wrapper function needs to abort prematurely.
The "newfree" typemap is used in conjunction with the %newobject directive and is used to deallocate memory used by the return result of a function. For example:
%typemap(newfree) string * {
delete $1;
}
%typemap(out) string * {
$result = PyString_FromString($1->c_str());
}
...
%newobject foo;
...
string *foo();
See Object ownership and %newobject for further details.
The "memberin" typemap is used to copy data from an already converted input value into a structure member. It is typically used to handle array members and other special cases. For example:
%typemap(memberin) int [4] {
memmove($1, $input, 4*sizeof(int));
}
It is rarely necessary to write "memberin" typemaps---SWIG already provides a default implementation for arrays, strings, and other objects.
The "varin" typemap is used to convert objects in the target language to C for the purposes of assigning to a C/C++ global variable. This is implementation specific.
The "varout" typemap is used to convert a C/C++ object to an object in the target language when reading a C/C++ global variable. This is implementation specific.
The "throws" typemap is only used when SWIG parses a C++ method with an exception specification or has the %catches feature attached to the method. It provides a default mechanism for handling C++ methods that have declared the exceptions they will throw. The purpose of this typemap is to convert a C++ exception into an error or exception in the target language. It is slightly different to the other typemaps as it is based around the exception type rather than the type of a parameter or variable. For example:
%typemap(throws) const char * %{
PyErr_SetString(PyExc_RuntimeError, $1);
SWIG_fail;
%}
void bar() throw (const char *);
As can be seen from the generated code below, SWIG generates an exception handler with the catch block comprising the "throws" typemap content.
...
try {
bar();
}
catch(char const *_e) {
PyErr_SetString(PyExc_RuntimeError, _e);
SWIG_fail;
}
...
Note that if your methods do not have an exception specification yet they do throw exceptions, SWIG cannot know how to deal with them. For a neat way to handle these, see the Exception handling with %exception section.
This section contains a few examples. Consult language module documentation for more examples.
A common use of typemaps is to provide support for C arrays appearing both as arguments to functions and as structure members.
For example, suppose you had a function like this:
void set_vector(int type, float value[4]);
If you wanted to handle float value[4] as a list of floats, you might write a typemap similar to this:
%typemap(in) float value[4] (float temp[4]) {
int i;
if (!PySequence_Check($input)) {
PyErr_SetString(PyExc_ValueError,"Expected a sequence");
return NULL;
}
if (PySequence_Length($input) != 4) {
PyErr_SetString(PyExc_ValueError,"Size mismatch. Expected 4 elements");
return NULL;
}
for (i = 0; i < 4; i++) {
PyObject *o = PySequence_GetItem($input,i);
if (PyNumber_Check(o)) {
temp[i] = (float) PyFloat_AsDouble(o);
Py_DECREF(o); /* PySequence_GetItem returns a new reference */
} else {
Py_XDECREF(o);
PyErr_SetString(PyExc_ValueError,"Sequence elements must be numbers");
return NULL;
}
}
$1 = temp;
}
In this example, the variable temp allocates a small array on the C stack. The typemap then populates this array and passes it to the underlying C function.
When used from Python, the typemap allows the following type of function call:
>>> set_vector(type, [ 1, 2.5, 5, 20 ])
If you wanted to generalize the typemap to apply to arrays of all dimensions you might write this:
%typemap(in) float value[ANY] (float temp[$1_dim0]) {
int i;
if (!PySequence_Check($input)) {
PyErr_SetString(PyExc_ValueError,"Expected a sequence");
return NULL;
}
if (PySequence_Length($input) != $1_dim0) {
PyErr_SetString(PyExc_ValueError,"Size mismatch. Expected $1_dim0 elements");
return NULL;
}
for (i = 0; i < $1_dim0; i++) {
PyObject *o = PySequence_GetItem($input,i);
if (PyNumber_Check(o)) {
temp[i] = (float) PyFloat_AsDouble(o);
Py_DECREF(o); /* PySequence_GetItem returns a new reference */
} else {
Py_XDECREF(o);
PyErr_SetString(PyExc_ValueError,"Sequence elements must be numbers");
return NULL;
}
}
$1 = temp;
}
In this example, the special variable $1_dim0 is expanded with the actual array dimensions. Multidimensional arrays can be matched in a similar manner. For example:
%typemap(in) float matrix[ANY][ANY] (float temp[$1_dim0][$1_dim1]) {
... convert a 2d array ...
}
For large arrays, it may be impractical to allocate storage on the stack using a temporary variable as shown. To work with heap allocated data, the following technique can be used.
%typemap(in) float value[ANY] {
int i;
if (!PySequence_Check($input)) {
PyErr_SetString(PyExc_ValueError,"Expected a sequence");
return NULL;
}
if (PySequence_Length($input) != $1_dim0) {
PyErr_SetString(PyExc_ValueError,"Size mismatch. Expected $1_dim0 elements");
return NULL;
}
$1 = (float *) malloc($1_dim0*sizeof(float));
for (i = 0; i < $1_dim0; i++) {
PyObject *o = PySequence_GetItem($input,i);
if (PyNumber_Check(o)) {
$1[i] = (float) PyFloat_AsDouble(o);
Py_DECREF(o); /* PySequence_GetItem returns a new reference */
} else {
Py_XDECREF(o);
PyErr_SetString(PyExc_ValueError,"Sequence elements must be numbers");
free($1);
return NULL;
}
}
}
%typemap(freearg) float value[ANY] {
if ($1) free($1);
}
In this case, an array is allocated using malloc. The freearg typemap is then used to release the argument after the function has been called.
Another common use of array typemaps is to provide support for array structure members. Due to subtle differences between pointers and arrays in C, you can't just "assign" to an array structure member. Instead, you have to explicitly copy elements into the array. For example, suppose you had a structure like this:
struct SomeObject {
float value[4];
...
};
When SWIG runs, it won't produce any code to set the value member. You may even get a warning message like this:
swig -python example.i
Generating wrappers for Python
example.i:10. Warning. Array member value will be read-only.
These warning messages indicate that SWIG does not know how you want to set the value field.
To fix this, you can supply a special "memberin" typemap like this:
%typemap(memberin) float [ANY] {
int i;
for (i = 0; i < $1_dim0; i++) {
$1[i] = $input[i];
}
}
The memberin typemap is used to set a structure member from data that has already been converted from the target language to C. In this case, $input is the local variable in which converted input data is stored. This typemap then copies this data into the structure.
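The underlying C restriction can be shown in plain C, outside of SWIG. The following is a minimal sketch; SomeObject_set_value is a hypothetical helper that stands in for a generated setter and is not something SWIG emits under that name.

```c
#include <string.h>

struct SomeObject {
    float value[4];
};

/* Hypothetical setter standing in for what a "memberin" typemap produces.
   Writing `obj->value = src;` would not compile, because an array member
   is not an assignable lvalue in C; the elements must be copied instead. */
static void SomeObject_set_value(struct SomeObject *obj, const float src[4]) {
    memcpy(obj->value, src, sizeof obj->value);
}
```

After the call, the structure holds its own copy of the four elements, so later changes to the source array do not affect the member.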
When combined with the earlier typemaps for arrays, the combination of the "in" and "memberin" typemap allows the following usage:
>>> s = SomeObject()
>>> s.x = [1, 2.5, 5, 10]
Related to structure member input, it may be desirable to return structure members as a new kind of object. In the example above, you will get very odd program behavior where the structure member can be set nicely, but reading the member simply returns a pointer:
>>> s = SomeObject()
>>> s.x = [1, 2.5, 5, 10]
>>> print s.x
_1008fea8_p_float
>>>
To fix this, you can write an "out" typemap. For example:
%typemap(out) float [ANY] {
int i;
$result = PyList_New($1_dim0);
for (i = 0; i < $1_dim0; i++) {
PyObject *o = PyFloat_FromDouble((double) $1[i]);
PyList_SetItem($result,i,o);
}
}
Now, you will find that member access is quite nice:
>>> s = SomeObject()
>>> s.x = [1, 2.5, 5, 10]
>>> print s.x
[ 1, 2.5, 5, 10]
Compatibility Note: SWIG1.1 used to provide a special "memberout" typemap. However, it was mostly useless and has since been eliminated. To return structure members, simply use the "out" typemap.
One particularly interesting application of typemaps is the implementation of argument constraints. This can be done with the "check" typemap, which lets you provide code for checking the values of function arguments. For example:
%module math
%typemap(check) double posdouble {
if ($1 < 0) {
croak("Expecting a positive number");
}
}
...
double sqrt(double posdouble);
This provides a sanity check to your wrapper function. If a negative number is passed to this function, a Perl exception will be raised and your program terminated with an error message.
This kind of checking can be particularly useful when working with pointers. For example:
%typemap(check) Vector * {
if ($1 == 0) {
PyErr_SetString(PyExc_TypeError,"NULL Pointer not allowed");
return NULL;
}
}
This will prevent any function involving a Vector * from accepting a NULL pointer. As a result, SWIG can often prevent potential segmentation faults or other run-time problems by raising an exception rather than blindly passing values to the underlying C/C++ program.
Note: A more advanced constraint checking system is in development. Stay tuned.
The code within typemaps is usually language dependent. However, many languages support the same typemaps. In order to distinguish typemaps across different languages, the preprocessor should be used. For example, the "in" typemap for Perl and Ruby could be written as:
#if defined(SWIGPERL)
%typemap(in) int "$1 = ($1_ltype) SvIV($input);"
#elif defined(SWIGRUBY)
%typemap(in) int "$1 = NUM2INT($input);"
#else
#warning no "in" typemap defined
#endif
The full set of language specific macros is defined in the Conditional Compilation section. The example above also shows a common approach of issuing a warning for an as yet unsupported language.
Compatibility note: In SWIG-1.1 different languages could be distinguished with the language name being put within the %typemap directive, for example,
%typemap(ruby,in) int "$1 = NUM2INT($input);".
The "out" typemap is the main typemap for return types. This typemap supports an optional attribute flag called "optimal", which is for reducing temporary variables and the amount of generated code. It only really makes a difference when returning objects by value and it cannot always be used, as explained later on.
When a function returns an object by value, SWIG generates code that instantiates the default type on the stack then assigns the value returned by the function call to it. A copy of this object is then made on the heap and this is what is ultimately stored and used from the target language. This will be clearer considering an example. Consider running the following code through SWIG:
%typemap(out) SWIGTYPE %{
$result = new $1_ltype((const $1_ltype &)$1);
%}
%inline %{
#include <iostream>
using namespace std;
struct XX {
XX() { cout << "XX()" << endl; }
XX(int i) { cout << "XX(" << i << ")" << endl; }
XX(const XX &other) { cout << "XX(const XX &)" << endl; }
XX & operator =(const XX &other) { cout << "operator=(const XX &)" << endl; return *this; }
~XX() { cout << "~XX()" << endl; }
static XX create() {
return XX(0);
}
};
%}
The "out" typemap shown is the default typemap for C# when returning objects by value. When making a call to XX::create() from C#, the output is as follows:
XX()
XX(0)
operator=(const XX &)
~XX()
XX(const XX &)
~XX()
~XX()
Note that three objects are being created as well as an assignment. Wouldn't it be great if the XX::create() method was the only time a constructor was called? As the method returns by value, this is asking a lot and the code that SWIG generates by default makes it impossible for the compiler to make this type of optimisation. However, this is where the "optimal" attribute in the "out" typemap can help out. If the typemap code is kept the same and just the "optimal" attribute specified like this:
%typemap(out, optimal="1") SWIGTYPE %{
$result = new $1_ltype((const $1_ltype &)$1);
%}
then when the code is run again, the output is simply:
XX(0)
~XX()
How the "optimal" attribute works is best explained using the generated code. Without "optimal", the generated code is:
SWIGEXPORT void * SWIGSTDCALL CSharp_XX_create() {
void * jresult ;
XX result;
result = XX::create();
jresult = new XX((const XX &)result);
return jresult;
}
With the "optimal" attribute, the code is:
SWIGEXPORT void * SWIGSTDCALL CSharp_XX_create() {
void * jresult ;
jresult = new XX((const XX &)XX::create());
return jresult;
}
The major difference is that the result temporary variable holding the value returned from XX::create() is no longer generated; instead, the copy constructor call is made directly from the value returned by XX::create(). With modern compiler optimisations turned on, the copy is not actually done. In fact, the object is never created on the stack in XX::create() at all; it is simply created directly on the heap. In the first instance, the $1 special variable in the typemap is expanded into result. In the second instance, $1 is expanded into XX::create(), and this is essentially what the "optimal" attribute is telling SWIG to do.
This kind of optimisation is not turned on by default as it has a number of restrictions. Firstly, some code cannot be condensed into a simple call for passing into the copy constructor. One common occurrence is when %exception is used. Consider adding the following %exception to the example:
%exception XX::create() %{
try {
$action
} catch(const std::exception &e) {
cout << e.what() << endl;
}
%}
SWIG can detect when the "optimal" attribute cannot be used. In this case it ignores the attribute and issues the following warning:
example.i:28: Warning(474): Method XX::create() usage of the optimal attribute in the out
typemap at example.i:14 ignored as the following cannot be used to generate optimal code:
try {
result = XX::create();
} catch(const std::exception &e) {
cout << e.what() << endl;
}
It should be clear that the above code cannot be used as the argument to the copy constructor call, i.e. for the $1 substitution.
Secondly, if the typemap uses $1 more than once, then multiple calls to the wrapped function will be made. Obviously that is not very optimal. In fact SWIG attempts to detect this and will issue a warning something like:
example.i:21: Warning(475): Multiple calls to XX::create() might be generated due to optimal attribute usage in the out typemap at example.i:7.
However, it doesn't always get it right, for example when $1 is within some commented out code.
So far, the typemaps presented have focused on the problem of dealing with single values. For example, converting a single input object to a single argument in a function call. However, certain conversion problems are difficult to handle in this manner. As an example, consider the example at the very beginning of this chapter:
int foo(int argc, char *argv[]);
Suppose that you wanted to wrap this function so that it accepted a single list of strings like this:
>>> foo(["ale","lager","stout"])
To do this, you not only need to map a list of strings to char *argv[], but the value of int argc is implicitly determined by the length of the list. Using only simple typemaps, this type of conversion is possible, but extremely painful. Therefore, SWIG1.3 introduces the notion of multi-argument typemaps.
A multi-argument typemap is a conversion rule that specifies how to convert a single object in the target language to a set of consecutive function arguments in C/C++. For example, the following multi-argument maps perform the conversion described for the above example:
%typemap(in) (int argc, char *argv[]) {
int i;
if (!PyList_Check($input)) {
PyErr_SetString(PyExc_ValueError, "Expecting a list");
return NULL;
}
$1 = PyList_Size($input);
$2 = (char **) malloc(($1+1)*sizeof(char *));
for (i = 0; i < $1; i++) {
PyObject *s = PyList_GetItem($input,i);
if (!PyString_Check(s)) {
free($2);
PyErr_SetString(PyExc_ValueError, "List items must be strings");
return NULL;
}
$2[i] = PyString_AsString(s);
}
$2[i] = 0;
}
%typemap(freearg) (int argc, char *argv[]) {
if ($2) free($2);
}
A multi-argument map is always specified by surrounding the arguments with parentheses as shown. For example:
%typemap(in) (int argc, char *argv[]) { ... }
Within the typemap code, the variables $1, $2, and so forth refer to each type in the map. All of the usual substitutions apply--just use the appropriate $1 or $2 prefix on the variable name (e.g., $2_type, $1_ltype, etc.).
Multi-argument typemaps always have precedence over simple typemaps and SWIG always performs longest-match searching. Therefore, you will get the following behavior:
%typemap(in) int argc { ... typemap 1 ... }
%typemap(in) (int argc, char *argv[]) { ... typemap 2 ... }
%typemap(in) (int argc, char *argv[], char *env[]) { ... typemap 3 ... }
int foo(int argc, char *argv[]); // Uses typemap 2
int bar(int argc, int x); // Uses typemap 1
int spam(int argc, char *argv[], char *env[]); // Uses typemap 3
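The longest-match rule can be pictured with a small C sketch. This is only an illustration of the selection rule described above, not SWIG's actual implementation: the typemap_rule structure and the find_longest_match function are hypothetical, and parameters are compared as plain "type name" strings for simplicity.

```c
#include <stddef.h>
#include <string.h>

/* A hypothetical typemap rule: a pattern of consecutive parameters. */
typedef struct {
    const char *const *params;  /* e.g. {"int argc", "char *argv[]"} */
    size_t nparams;
} typemap_rule;

/* Returns the index of the rule that matches the most consecutive
   parameters starting at args[pos], or -1 if no rule matches. */
static int find_longest_match(const typemap_rule *rules, size_t nrules,
                              const char *const *args, size_t nargs,
                              size_t pos) {
    int best = -1;
    size_t best_len = 0;
    size_t r, k;
    for (r = 0; r < nrules; r++) {
        size_t n = rules[r].nparams;
        if (n == 0 || pos + n > nargs) continue;   /* pattern too long */
        for (k = 0; k < n; k++) {
            if (strcmp(rules[r].params[k], args[pos + k]) != 0) break;
        }
        if (k == n && n > best_len) {              /* full, longer match */
            best = (int) r;
            best_len = n;
        }
    }
    return best;
}
```

With the three rules from the listing above stored at indices 0, 1 and 2, the parameters of foo select index 1, those of bar select index 0, and those of spam select index 2, which mirrors the "Uses typemap" comments.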
It should be stressed that multi-argument typemaps can appear anywhere in a function declaration and can appear more than once. For example, you could write this:
%typemap(in) (int scount, char *swords[]) { ... }
%typemap(in) (int wcount, char *words[]) { ... }
void search_words(int scount, char *swords[], int wcount, char *words[], int maxcount);
Other directives such as %apply and %clear also work with multi-argument maps. For example:
%apply (int argc, char *argv[]) {
(int scount, char *swords[]),
(int wcount, char *words[])
};
...
%clear (int scount, char *swords[]), (int wcount, char *words[]);
...
Although multi-argument typemaps may seem like an exotic, little used feature, there are several situations where they make sense. First, suppose you wanted to wrap functions similar to the low-level read() and write() system calls. For example:
typedef unsigned int size_t;
int read(int fd, void *rbuffer, size_t len);
int write(int fd, void *wbuffer, size_t len);
As is, the only way to use the functions would be to allocate memory and pass some kind of pointer as the second argument---a process that might require the use of a helper function. However, using multi-argument maps, the functions can be transformed into something more natural. For example, you might write typemaps like this:
// typemap for an outgoing buffer
%typemap(in) (void *wbuffer, size_t len) {
if (!PyString_Check($input)) {
PyErr_SetString(PyExc_ValueError, "Expecting a string");
return NULL;
}
$1 = (void *) PyString_AsString($input);
$2 = PyString_Size($input);
}
// typemap for an incoming buffer
%typemap(in) (void *rbuffer, size_t len) {
long nbytes; /* signed, so that a negative request can be detected */
if (!PyInt_Check($input)) {
PyErr_SetString(PyExc_ValueError, "Expecting an integer");
return NULL;
}
nbytes = PyInt_AsLong($input);
if (nbytes < 0) {
PyErr_SetString(PyExc_ValueError, "Positive integer expected");
return NULL;
}
$2 = (size_t) nbytes;
$1 = (void *) malloc($2);
}
// Return the buffer. Discarding any previous return result
%typemap(argout) (void *rbuffer, size_t len) {
Py_XDECREF($result); /* Blow away any previous result */
if (result < 0) { /* Check for I/O error */
free($1);
PyErr_SetFromErrno(PyExc_IOError);
return NULL;
}
$result = PyString_FromStringAndSize($1,result);
free($1);
}
Note: In the above example, $result and result are two different variables. result is the real C datatype that was returned by the function. $result is the scripting language object being returned to the interpreter.
Now, in a script, you can write code that simply passes buffers as strings like this:
>>> f = example.open("Makefile")
>>> example.read(f,40)
'TOP = ../..\nSWIG = $(TOP)/.'
>>> example.read(f,40)
'./swig\nSRCS = example.c\nTARGET '
>>> example.close(f)
0
>>> g = example.open("foo", example.O_WRONLY | example.O_CREAT, 0644)
>>> example.write(g,"Hello world\n")
12
>>> example.write(g,"This is a test\n")
15
>>> example.close(g)
0
>>>
A number of multi-argument typemap problems also arise in libraries that perform matrix-calculations--especially if they are mapped onto low-level Fortran or C code. For example, you might have a function like this:
int is_symmetric(double *mat, int rows, int columns);
In this case, you might want to pass some kind of higher-level object as a matrix. To do this, you could write a multi-argument typemap like this:
%typemap(in) (double *mat, int rows, int columns) {
MatrixObject *a;
a = GetMatrixFromObject($input); /* Get matrix somehow */
/* Get matrix properties */
$1 = GetPointer(a);
$2 = GetRows(a);
$3 = GetColumns(a);
}
This kind of technique can be used to hook into scripting-language matrix packages such as Numeric Python. However, it should also be stressed that some care is in order. For example, when crossing languages you may need to worry about issues such as row-major vs. column-major ordering (and perform conversions if needed).
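As an illustration of the kind of conversion mentioned above, the following C sketch copies a matrix from row-major (C) order into column-major (Fortran) order. The function name row_to_column_major is hypothetical and is not provided by SWIG or by any matrix package; a real typemap would call something similar before handing the data to the wrapped function.

```c
#include <stddef.h>

/* Copies a rows x cols matrix stored in row-major (C) order in `src`
   into `dst` in column-major (Fortran) order. Both buffers hold
   rows*cols doubles and must not overlap. */
static void row_to_column_major(const double *src, double *dst,
                                size_t rows, size_t cols) {
    size_t r, c;
    for (r = 0; r < rows; r++) {
        for (c = 0; c < cols; c++) {
            dst[c * rows + r] = src[r * cols + c];
        }
    }
}
```

For a 2 x 3 matrix stored as 1, 2, 3, 4, 5, 6 in row-major order, the destination buffer ends up holding 1, 4, 2, 5, 3, 6.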
Most scripting languages need type information at run-time. This type information can include how to construct types, how to garbage collect types, and the inheritance relationships between types. If the language interface does not provide its own type information storage, the generated SWIG code needs to provide it.
Requirements for the type system:
The run-time type checker is used by many, but not all, of SWIG's supported target languages. The run-time type checker features are not required and are thus not used for strongly typed languages such as Java and C#. The scripting and scheme based languages rely on it and it forms a critical part of SWIG's operation for these languages.
When pointers, arrays, and objects are wrapped by SWIG, they are normally converted into typed pointer objects. For example, an instance of Foo * might be a string encoded like this:
_108e688_p_Foo
At a basic level, the type checker simply restores some type-safety to extension modules. However, the type checker is also responsible for making sure that wrapped C++ classes are handled correctly---especially when inheritance is used. This is especially important when an extension module makes use of multiple inheritance. For example:
class Foo {
int x;
};
class Bar {
int y;
};
class FooBar : public Foo, public Bar {
int z;
};
When the class FooBar is organized in memory, it contains the contents of the classes Foo and Bar as well as its own data members. For example:
FooBar --> | -----------| <-- Foo
| int x |
|------------| <-- Bar
| int y |
|------------|
| int z |
|------------|
Because of the way that base class data is stacked together, the casting of a FooBar * to either of the base classes may change the actual value of the pointer. This means that it is generally not safe to represent pointers using a simple integer or a bare void * ---type tags are needed to implement correct handling of pointer values (and to make adjustments when needed).
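The pointer adjustment can be imitated in plain C by embedding the base parts as members. This is only an emulation under the assumption that the layout matches the picture above; the real C++ object layout is decided by the compiler, and the FooBar_as_Foo and FooBar_as_Bar helpers are hypothetical names used for illustration.

```c
#include <stddef.h>

struct Foo { int x; };
struct Bar { int y; };

/* C emulation of `class FooBar : public Foo, public Bar`: the base
   parts are laid out one after another, followed by the derived data. */
struct FooBar {
    struct Foo foo;
    struct Bar bar;
    int z;
};

/* "Upcasts": the Foo part shares the object's address, whereas the
   Bar part lives at a non-zero offset, so the pointer value changes. */
static struct Foo *FooBar_as_Foo(struct FooBar *p) { return &p->foo; }
static struct Bar *FooBar_as_Bar(struct FooBar *p) { return &p->bar; }
```

Reinterpreting the address of the whole object as a Bar * without the adjustment would read the x member instead of y, which is exactly the mistake that type tags are meant to prevent.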
In the wrapper code generated for each language, pointers are handled through the use of special type descriptors and conversion functions. For example, if you look at the wrapper code for Python, you will see code like this:
if ((SWIG_ConvertPtr(obj0,(void **) &arg1, SWIGTYPE_p_Foo,1)) == -1) return NULL;
In this code, SWIGTYPE_p_Foo is the type descriptor that describes Foo *. The type descriptor is actually a pointer to a structure that contains information about the type name to use in the target language, a list of equivalent typenames (via typedef or inheritance), and pointer value handling information (if applicable). The SWIG_ConvertPtr() function is simply a utility function that takes a pointer object in the target language and a type-descriptor object and uses this information to generate a C++ pointer. However, the exact name and calling conventions of the conversion function depend on the target language (see language specific chapters for details).
The actual type code is in swigrun.swg, and gets inserted near the top of the generated swig wrapper file. The phrase "a type X that can cast into a type Y" means that given a type X, it can be converted into a type Y. In other words, X is a derived class of Y or X is a typedef of Y. The structure to store type information looks like this:
/* Structure to store information on one type */
typedef struct swig_type_info {
const char *name; /* mangled name of this type */
const char *str; /* human readable name for this type */
swig_dycast_func dcast; /* dynamic cast function down a hierarchy */
struct swig_cast_info *cast; /* Linked list of types that can cast into this type */
void *clientdata; /* Language specific type data */
} swig_type_info;
/* Structure to store a type and conversion function used for casting */
typedef struct swig_cast_info {
swig_type_info *type; /* pointer to type that is equivalent to this type */
swig_converter_func converter; /* function to cast the void pointers */
struct swig_cast_info *next; /* pointer to next cast in linked list */
struct swig_cast_info *prev; /* pointer to the previous cast */
} swig_cast_info;
Each swig_type_info stores a linked list of types that it is equivalent to. Each entry in this doubly linked list stores a pointer back to another swig_type_info structure, along with a pointer to a conversion function. This conversion function is used to solve the above problem of the FooBar class, correctly returning a pointer to the type we want.
The basic problem we need to solve is verifying and building arguments passed to functions. So going back to the SWIG_ConvertPtr() function example from above, we are expecting a Foo * and need to check if obj0 is in fact a Foo *. From before, SWIGTYPE_p_Foo is just a pointer to the swig_type_info structure describing Foo *. So we loop through the linked list of swig_cast_info structures attached to SWIGTYPE_p_Foo. If we see that the type of obj0 is in the linked list, we pass the object through the associated conversion function and then return a positive result. If we reach the end of the linked list without a match, then obj0 cannot be converted to a Foo * and an error is generated.
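The lookup just described can be sketched in a few lines of C. The structures below are simplified, hypothetical versions of swig_type_info and swig_cast_info (singly linked, with fewer fields), and convert_ptr and skip_leading_int are names invented for this illustration; the real code lives in swigrun.swg and differs in detail.

```c
#include <stddef.h>

/* Simplified, hypothetical versions of the structures shown above. */
struct type_info;
typedef void *(*converter_func)(void *);

typedef struct cast_info {
    const struct type_info *type;  /* a type that can cast into the owner */
    converter_func converter;      /* pointer adjustment, or NULL if none */
    struct cast_info *next;
} cast_info;

typedef struct type_info {
    const char *name;
    cast_info *cast;               /* list of types accepted as this type */
} type_info;

/* Example converter: skips a leading int, as if the wanted base class
   part were stored after one int belonging to another base class. */
static void *skip_leading_int(void *p) { return (char *) p + sizeof(int); }

/* Tries to view `ptr`, whose actual type is `from`, as type `wanted`.
   Returns 0 and stores the adjusted pointer on success, -1 otherwise. */
static int convert_ptr(void *ptr, const type_info *from,
                       const type_info *wanted, void **out) {
    const cast_info *c;
    for (c = wanted->cast; c != NULL; c = c->next) {
        if (c->type == from) {
            *out = c->converter ? c->converter(ptr) : ptr;
            return 0;
        }
    }
    return -1;  /* reached the end of the list without a match */
}
```

In this sketch a type descriptor for Bar would list Bar itself (no converter) and FooBar (with a converter that adjusts the pointer), so a FooBar pointer is accepted and adjusted while an unrelated type is rejected.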
Another issue needing to be addressed is sharing type information between multiple modules. More explicitly, we need to have ONE swig_type_info for each type. If two modules both use the type, the second module loaded must look up and use the swig_type_info structure from the module already loaded. Because no dynamic memory is used and because the casting information has circular dependencies, loading the type information is somewhat tricky, and not explained here. A complete description is in the Lib/swiginit.swg file (and near the top of any generated file).
Each module has one swig_module_info structure which looks like this:
/* Structure used to store module information
* Each module generates one structure like this, and the runtime collects
* all of these structures and stores them in a circularly linked list.*/
typedef struct swig_module_info {
swig_type_info **types; /* Array of pointers to swig_type_info structs in this module */
int size; /* Number of types in this module */
struct swig_module_info *next; /* Pointer to next element in circularly linked list */
swig_type_info **type_initial; /* Array of initially generated type structures */
swig_cast_info **cast_initial; /* Array of initially generated casting structures */
void *clientdata; /* Language specific module data */
} swig_module_info;
Each module stores an array of pointers to swig_type_info structures and the number of types in this module. So when a second module is loaded, it finds the swig_module_info structure for the first module and searches the array of types. If any of its own types are in the first module and have already been loaded, it uses those swig_type_info structures rather than creating new ones. These swig_module_info structures are chained together in a circularly linked list.
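The search through already loaded modules can be sketched as follows. This is a simplified illustration and not the loader in Lib/swiginit.swg: the type_info and module_info structures are cut-down, hypothetical versions of the ones above, and find_loaded_type is an invented name.

```c
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical versions of the structures shown above. */
typedef struct type_info { const char *name; } type_info;

typedef struct module_info {
    type_info **types;          /* types registered by this module */
    size_t size;                /* number of entries in `types` */
    struct module_info *next;   /* next module in the circular list */
} module_info;

/* Walks the circular list once, starting at `start`, and returns the
   already registered type_info called `name`, or NULL if there is none. */
static type_info *find_loaded_type(module_info *start, const char *name) {
    module_info *m = start;
    size_t i;
    if (m == NULL) return NULL;
    do {
        for (i = 0; i < m->size; i++) {
            if (strcmp(m->types[i]->name, name) == 0) return m->types[i];
        }
        m = m->next;
    } while (m != start);       /* back at the start: every module seen */
    return NULL;
}
```

A newly loaded module would call something like this for each of its types and reuse the returned structure when it is not NULL, creating its own entry only when the search fails.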
This section covers how to use these functions from typemaps. To learn how to call these functions from external files (not the generated _wrap.c file), see the External access to the run-time system section.
When pointers are converted in a typemap, the typemap code often looks similar to this:
%typemap(in) Foo * {
if ((SWIG_ConvertPtr($input, (void **) &$1, $1_descriptor)) == -1) return NULL;
}
The most critical part of the typemap is the use of the $1_descriptor special variable. When placed in a typemap, this is expanded into the SWIGTYPE_* type descriptor object above. As a general rule, you should always use $1_descriptor instead of trying to hard-code the type descriptor name directly.
There is another reason why you should always use the $1_descriptor variable. When this special variable is expanded, SWIG marks the corresponding type as "in use." When type-tables and type information are emitted in the wrapper file, descriptor information is only generated for those datatypes that were actually used in the interface. This greatly reduces the size of the type tables and improves efficiency.
Occasionally, you might need to write a typemap that needs to convert pointers of other types. To handle this, a special macro substitution $descriptor(type) can be used to generate the SWIG type descriptor name for any C datatype. For example:
%typemap(in) Foo * {
if ((SWIG_ConvertPtr($input, (void **) &$1, $1_descriptor)) == -1) {
Bar *temp;
if ((SWIG_ConvertPtr($input, (void **) &temp, $descriptor(Bar *))) == -1) {
return NULL;
}
$1 = (Foo *) temp;
}
}
The primary use of $descriptor(type) is when writing typemaps for container objects and other complex data structures. There are some restrictions on the argument---namely it must be a fully defined C datatype. It cannot be any of the special typemap variables.
In certain cases, SWIG may not generate type-descriptors like you expect. For example, if you are converting pointers in some non-standard way or working with an unusual combination of interface files and modules, you may find that SWIG omits information for a specific type descriptor. To fix this, you may need to use the %types directive. For example:
%types(int *, short *, long *, float *, double *);
When %types is used, SWIG generates type-descriptor information even if those datatypes never appear elsewhere in the interface file.
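The bookkeeping described above can be pictured with a small model. The following is only a sketch in Python, not SWIG code: the TypeTable class, its methods, and the simplified pointer-only name mangling are all assumptions made for illustration. It shows descriptors being recorded when $1_descriptor or $descriptor(type) is expanded, and %types forcing extra entries.

```python
# Illustrative model only: none of these names are part of SWIG.

def mangle(ctype):
    """Simplified stand-in for descriptor naming (pointer types only)."""
    base = ctype.replace("*", "").strip().replace(" ", "_")
    return "SWIGTYPE" + "_p" * ctype.count("*") + "_" + base


class TypeTable:
    def __init__(self):
        self.used = []  # descriptors to emit, in first-use order

    def descriptor(self, ctype):
        """Models expanding $1_descriptor or $descriptor(type)."""
        name = mangle(ctype)
        if name not in self.used:
            self.used.append(name)  # mark the type as "in use"
        return name

    def types(self, *ctypes):
        """Models %types(...): force descriptors for the listed types."""
        for ctype in ctypes:
            self.descriptor(ctype)

    def emit(self):
        """Only descriptors that were used or forced are emitted."""
        return list(self.used)


table = TypeTable()
table.descriptor("Foo *")          # a typemap expands $1_descriptor
table.descriptor("Bar *")          # a typemap expands $descriptor(Bar *)
table.types("int *", "double *")   # forced by %types
```

In this model a type that is never passed to descriptor() or types() simply never appears in emit(), which mirrors why a hard-coded descriptor name can end up referring to information that was not generated.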
Further details about the run-time type checking can be found in the documentation for individual language modules. Reading the source code may also help. The file Lib/swigrun.swg in the SWIG library contains all of the source code for type-checking. This code is also included in every generated wrapper file, so you can probably just look at the output of SWIG to get a better sense of how types are managed.
In many target languages, SWIG fully supports C++ overloaded methods and functions. For example, if you have a collection of functions like this:
int foo(int x);
int foo(double x);
int foo(char *s, int y);
You can access the functions in a normal way from the scripting interpreter:
# Python
foo(3) # foo(int)
foo(3.5) # foo(double)
foo("hello",5) # foo(char *, int)
# Tcl
foo 3 # foo(int)
foo 3.5 # foo(double)
foo hello 5 # foo(char *, int)
To implement overloading, SWIG generates a separate wrapper function for each overloaded method. For example, the above functions would produce something roughly like this:
// wrapper pseudocode
_wrap_foo_0(argc, args[]) { // foo(int)
int arg1;
int result;
...
arg1 = FromInteger(args[0]);
result = foo(arg1);
return ToInteger(result);
}
_wrap_foo_1(argc, args[]) { // foo(double)
double arg1;
int result;
...
arg1 = FromDouble(args[0]);
result = foo(arg1);
return ToInteger(result);
}
_wrap_foo_2(argc, args[]) { // foo(char *, int)
char *arg1;
int arg2;
int result;
...
arg1 = FromString(args[0]);
arg2 = FromInteger(args[1]);
result = foo(arg1,arg2);
return ToInteger(result);
}
Next, a dynamic dispatch function is generated:
_wrap_foo(argc, args[]) {
if (argc == 1) {
if (IsInteger(args[0])) {
return _wrap_foo_0(argc,args);
}
if (IsDouble(args[0])) {
return _wrap_foo_1(argc,args);
}
}
if (argc == 2) {
if (IsString(args[0]) && IsInteger(args[1])) {
return _wrap_foo_2(argc,args);
}
}
error("No matching function!\n");
}
The purpose of the dynamic dispatch function is to select the appropriate C++ function based on argument types---a task that must be performed at runtime in most of SWIG's target languages.
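To make the control flow concrete, here is a runnable Python model of the dispatch pseudocode above. It is a sketch only: the helper names (is_integer, foo_int, and so on) are hypothetical stand-ins for the generated wrappers and type checks, and each "wrapper" just reports which overload it represents.

```python
# Illustrative Python model of the dispatch pseudocode; not SWIG output.

def foo_int(x):
    return "foo(int)"

def foo_double(x):
    return "foo(double)"

def foo_str_int(s, y):
    return "foo(char *, int)"

def is_integer(obj):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(obj, int) and not isinstance(obj, bool)

def is_double(obj):
    return isinstance(obj, float)

def is_string(obj):
    return isinstance(obj, str)

def foo(*args):
    """Pick an overload from the number and runtime types of the arguments."""
    if len(args) == 1:
        if is_integer(args[0]):
            return foo_int(*args)
        if is_double(args[0]):
            return foo_double(*args)
    if len(args) == 2:
        if is_string(args[0]) and is_integer(args[1]):
            return foo_str_int(*args)
    raise TypeError("No matching function!")
```

Calling foo(3), foo(3.5), and foo("hello", 5) selects the int, double, and (char *, int) versions respectively. Any other combination raises an error, just as the pseudocode falls through to its error call.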
The generation of the dynamic dispatch function is a relatively tricky affair. Not only must input typemaps be taken into account (these typemaps can radically change the types of arguments accepted), but overloaded methods must also be sorted and checked in a very specific order to resolve potential ambiguity. A high-level overview of this ranking process is found in the "SWIG and C++" chapter. What isn't mentioned in that chapter is the mechanism by which it is implemented---as a collection of typemaps.
To support dynamic dispatch, SWIG first defines a general purpose type hierarchy as follows:
Symbolic Name                   Precedence Value
------------------------------  ------------------
SWIG_TYPECHECK_POINTER           0
SWIG_TYPECHECK_VOIDPTR           10
SWIG_TYPECHECK_BOOL              15
SWIG_TYPECHECK_UINT8             20
SWIG_TYPECHECK_INT8              25
SWIG_TYPECHECK_UINT16            30
SWIG_TYPECHECK_INT16             35
SWIG_TYPECHECK_UINT32            40
SWIG_TYPECHECK_INT32             45
SWIG_TYPECHECK_UINT64            50
SWIG_TYPECHECK_INT64             55
SWIG_TYPECHECK_UINT128           60
SWIG_TYPECHECK_INT128            65
SWIG_TYPECHECK_INTEGER           70
SWIG_TYPECHECK_FLOAT             80
SWIG_TYPECHECK_DOUBLE            90
SWIG_TYPECHECK_COMPLEX           100
SWIG_TYPECHECK_UNICHAR           110
SWIG_TYPECHECK_UNISTRING         120
SWIG_TYPECHECK_CHAR              130
SWIG_TYPECHECK_STRING            140
SWIG_TYPECHECK_BOOL_ARRAY        1015
SWIG_TYPECHECK_INT8_ARRAY        1025
SWIG_TYPECHECK_INT16_ARRAY       1035
SWIG_TYPECHECK_INT32_ARRAY       1045
SWIG_TYPECHECK_INT64_ARRAY       1055
SWIG_TYPECHECK_INT128_ARRAY      1065
SWIG_TYPECHECK_FLOAT_ARRAY       1080
SWIG_TYPECHECK_DOUBLE_ARRAY      1090
SWIG_TYPECHECK_CHAR_ARRAY        1130
SWIG_TYPECHECK_STRING_ARRAY      1140
(These precedence levels are defined in swig.swg, a library file that's included by all target language modules.)
In this table, the precedence-level determines the order in which types are going to be checked. Low values are always checked before higher values. For example, integers are checked before floats, single values are checked before arrays, and so forth.
Using the above table as a guide, each target language defines a collection of "typecheck" typemaps. The following excerpt from the Python module illustrates this:
/* Python type checking rules */
/* Note: %typecheck(X) is a macro for %typemap(typecheck,precedence=X) */
%typecheck(SWIG_TYPECHECK_INTEGER)
int, short, long,
unsigned int, unsigned short, unsigned long,
signed char, unsigned char,
long long, unsigned long long,
const int &, const short &, const long &,
const unsigned int &, const unsigned short &, const unsigned long &,
const long long &, const unsigned long long &,
enum SWIGTYPE,
bool, const bool &
{
$1 = (PyInt_Check($input) || PyLong_Check($input)) ? 1 : 0;
}
%typecheck(SWIG_TYPECHECK_DOUBLE)
float, double,
const float &, const double &
{
$1 = (PyFloat_Check($input) || PyInt_Check($input) || PyLong_Check($input)) ? 1 : 0;
}
%typecheck(SWIG_TYPECHECK_CHAR) char {
$1 = (PyString_Check($input) && (PyString_Size($input) == 1)) ? 1 : 0;
}
%typecheck(SWIG_TYPECHECK_STRING) char * {
$1 = PyString_Check($input) ? 1 : 0;
}
%typecheck(SWIG_TYPECHECK_POINTER) SWIGTYPE *, SWIGTYPE &, SWIGTYPE [] {
void *ptr;
if (SWIG_ConvertPtr($input, (void **) &ptr, $1_descriptor, 0) == -1) {
$1 = 0;
PyErr_Clear();
} else {
$1 = 1;
}
}
%typecheck(SWIG_TYPECHECK_POINTER) SWIGTYPE {
void *ptr;
if (SWIG_ConvertPtr($input, (void **) &ptr, $&1_descriptor, 0) == -1) {
$1 = 0;
PyErr_Clear();
} else {
$1 = 1;
}
}
%typecheck(SWIG_TYPECHECK_VOIDPTR) void * {
void *ptr;
if (SWIG_ConvertPtr($input, (void **) &ptr, 0, 0) == -1) {
$1 = 0;
PyErr_Clear();
} else {
$1 = 1;
}
}
%typecheck(SWIG_TYPECHECK_POINTER) PyObject *
{
$1 = ($input != 0);
}
It might take a bit of contemplation, but this code has merely organized all of the basic C++ types, provided some simple type-checking code, and assigned each type a precedence value.
Finally, to generate the dynamic dispatch function, SWIG uses the following algorithm:
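As a rough illustration, the sketch below shows one way such an ordering could be realized. It is inferred from the precedence table and the dispatch pseudocode above and is not taken from SWIG's implementation: the function make_dispatcher, the CHECKS table, and the exact sort key (argument count first, then precedence values) are all assumptions. The checks mirror the Python typecheck excerpt, where the rule for double also accepts integers.

```python
# Hypothetical sketch: the names and the exact sort key are assumptions.

# A few precedence values copied from the table above.
INTEGER, DOUBLE, STRING = 70, 90, 140

def is_int(obj):
    return isinstance(obj, int) and not isinstance(obj, bool)

# Like the Python excerpt, the DOUBLE rule also accepts integers.
CHECKS = {
    INTEGER: is_int,
    DOUBLE: lambda obj: isinstance(obj, float) or is_int(obj),
    STRING: lambda obj: isinstance(obj, str),
}

def make_dispatcher(overloads):
    """overloads: list of (name, [precedence value per argument])."""
    # Fewer arguments first, then lower precedence values first.
    ranked = sorted(overloads, key=lambda o: (len(o[1]), o[1]))

    def dispatch(*args):
        for name, precedences in ranked:
            if len(args) != len(precedences):
                continue
            if all(CHECKS[p](a) for p, a in zip(precedences, args)):
                return name
        raise TypeError("No matching function!")

    return dispatch

# Declared in a deliberately awkward order: double before int.
dispatch_foo = make_dispatcher([
    ("foo(double)", [DOUBLE]),
    ("foo(char *, int)", [STRING, INTEGER]),
    ("foo(int)", [INTEGER]),
])
```

Because the DOUBLE check accepts integers, testing foo(double) first would capture the call foo(3). Sorting by precedence puts the INTEGER check ahead of it, so dispatch_foo(3) resolves to foo(int) while dispatch_foo(3.5) still reaches foo(double).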
If you haven't written any typemaps of your own, you do not need to worry about the typechecking rules. However, if you have written new input typemaps, you might have to supply a typechecking rule as well. An easy way to do this is to simply copy one of the existing typechecking rules. Here is an example:
// Typemap for a C++ string
%typemap(in) std::string {
if (PyString_Check($input)) {
$1 = std::string(PyString_AsString($input));
} else {
SWIG_exception(SWIG_TypeError, "string expected");
}
}
// Copy the typecheck code for "char *".
%typemap(typecheck) std::string = char *;
The bottom line: If you are writing new typemaps and you are using overloaded methods, you will probably have to write typecheck code or copy existing code. Since this is a relatively new SWIG feature, there are few examples to work with. However, you might look at some of the existing library files like 'typemaps.i' for a guide.
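The effect of copying a typecheck rule can be pictured with a small Python model. This is a sketch only: the typechecks registry and both helper functions are hypothetical, and it assumes that a copied rule carries its precedence value along with its checking code.

```python
# Hypothetical registry; SWIG stores typemaps internally in its own form.
typechecks = {}

def typecheck(ctype, precedence, check):
    """Models %typecheck(precedence) ctype { ... }."""
    typechecks[ctype] = (precedence, check)

def copy_typecheck(new_type, existing_type):
    """Models %typemap(typecheck) new_type = existing_type;"""
    typechecks[new_type] = typechecks[existing_type]

STRING = 140  # SWIG_TYPECHECK_STRING from the precedence table

typecheck("char *", STRING, lambda obj: isinstance(obj, str))
copy_typecheck("std::string", "char *")
```

After the copy, std::string arguments are ranked and tested exactly like char * arguments, which is why the copied rule is enough for the std::string input typemap shown above.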
Notes:
In order to implement certain kinds of program behavior, it is sometimes necessary to write sets of typemaps. For example, t