Wrap Your Code in Boilerplate

The entire implementation of an extension primarily revolves around the “wrapping” concept that we introduced earlier in Section 13.15.1. You should design your code in such a way that there is a smooth transition between the world of Python and your implementing language. This interfacing code is commonly called “boilerplate” code because it is a necessity if your code is to talk to the Python interpreter.

There are four main pieces to the boilerplate software:

1. Include Python header file

2. Add PyObject* Module_func() Python wrappers for each module function

3. Add PyMethodDef ModuleMethods[] array/table for each module function

4. Add void initModule() module initializer function

Include Python Header File

The first thing you should do is to find your Python include files and make sure your compiler has access to that directory. On most Unix-based systems, this would be either /usr/local/include/python2.x or /usr/include/python2.x, where the “2.x” is your version of Python. If you compiled and installed your Python interpreter, you should not have a problem because the system generally knows where your files are installed.

Add the inclusion of the Python.h header file to your source. The line will look something like:

#include "Python.h"

That is the easy part. Now you have to add the rest of the boilerplate software.

ptg 22.2 Extending Python by Writing Extensions 969

Add PyObject* Module_func() Python Wrappers for Each Function

This part is the trickiest. For each function you want accessible to the Python environment, you will create a static PyObject* function whose name is the function name with the module name and an underscore ( _ ) prepended to it.

For example, we want fac() to be one of the functions available for import from Python and we will use Extest as the name of our final module, so we create a “wrapper” called Extest_fac(). In the client Python script, there will be an “import Extest” and an “Extest.fac()” call somewhere (or just “fac()” for “from Extest import fac”).

The job of the wrapper is to take Python values, convert them to C, then make a call to the appropriate function with what we want. When our function has completed, and it is time to return to the world of Python, it is also the job of this wrapper to take whatever return values we designate, convert them to Python, and then perform the return, passing back any values as necessary.

In the case of fac(), when the client program invokes Extest.fac(), our wrapper will be called. We will accept a Python integer, convert it to a C integer, call our C function fac() and obtain another integer result. We then have to take that return value, convert it back to a Python integer, then return from the call. (In your head, try to keep in mind that you are writing the code that will proxy for a “def fac(n)” declaration. When you are returning, it is as if that imaginary Python fac() function is completing.)
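For reference, the C function being wrapped might look like the sketch below. The original fac() from Extest1.c is not reproduced in this section, so treat this version (including its debug-only assert) as an assumption rather than the exact code.

```c
#include <assert.h>

/* Hypothetical stand-in for the fac() defined earlier in Extest1.c.
 * Computes n! recursively; 0! and 1! are both 1.
 * Note: the result overflows a 32-bit int for n > 12. */
int fac(int n)
{
    /* Debug-only sanity check (compiled out under NDEBUG). A real
     * extension would reject bad input in the wrapper instead. */
    assert(n >= 0);
    if (n < 2)
        return 1;
    return n * fac(n - 1);
}
```

The wrapper never changes this function; it only converts the Python argument to an int before the call and converts the int result back afterward.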

So, you’re asking, how does this conversion take place? The answer is with the PyArg_Parse*() functions when going from Python to C, and Py_BuildValue() when returning from C to Python.

The PyArg_Parse*() functions are similar to the C sscanf() function. sscanf() takes a stream of bytes and, according to some format string, parcels them off to corresponding container variables, which, as expected, are passed by address; the PyArg_Parse*() functions do the same with the arguments passed from Python. Both PyArg_Parse*() functions return 1 on successful parsing and 0 otherwise.

Py_BuildValue() works like sprintf(), taking a format string and converting all arguments to a single returned object containing those values in the formats that you requested.
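To make the analogy concrete, here is a small plain-C sketch that uses sscanf() and snprintf() themselves. The helpers parse_int() and build_int() are hypothetical names invented for this illustration; they are not part of the Python API and only mimic the shape of the real calls.

```c
#include <assert.h>
#include <stdio.h>

/* Parse one integer from text, in the spirit of
 * PyArg_ParseTuple(args, "i", &num). Returns 1 on success and 0 on
 * failure, the same convention the PyArg_Parse*() functions use.
 * Hypothetical helper, not a Python API call. */
int parse_int(const char *text, int *out)
{
    assert(text != NULL && out != NULL);
    /* sscanf() returns the number of items converted; we asked for one. */
    return sscanf(text, "%d", out) == 1;
}

/* Format an integer into buf, in the spirit of Py_BuildValue("i", value).
 * Returns what snprintf() returns: the length of the formatted text,
 * not counting the terminating NUL. Hypothetical helper. */
int build_int(char *buf, size_t size, int value)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%d", value);
}
```

In both directions a format string drives the conversion, and the parsing side reports failure through its return value instead of through the output variable.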

You will find a summary of these functions in Table 22.1.

A set of conversion codes is used to convert data objects between C and Python; they are given in Table 22.2.

These conversion codes are the ones given in the respective format strings that dictate how the values should be converted when moving between both languages. Note: The conversion types are different for Java since all data types are classes. Consult the Jython documentation to obtain the corresponding Java types for Python objects. The same applies for C# and VB.NET.

ptg 970 Chapter 22 Extending Python

Table 22.1 Converting Data Between Python and C/C++

Function                             Description

Python to C
int PyArg_ParseTuple()               Converts (a tuple of) arguments passed from Python to C
int PyArg_ParseTupleAndKeywords()    Same as PyArg_ParseTuple() but also parses keyword arguments

C to Python
PyObject* Py_BuildValue()            Converts C data values into a Python return object, either a single object or a single tuple of objects

Table 22.2 Common Codes to Convert Data Between Python and C/C++

Format Code   Python Type   C/C++ Type

s             str           char*
z             str/None      char*/NULL
i             int           int
l             long          long
c             str           char
d             float         double
D             complex       Py_Complex*
O             (any)         PyObject*
S             str           PyStringObject*


Here we show you our completed Extest_fac() wrapper function:

static PyObject *
Extest_fac(PyObject *self, PyObject *args) {
    int res;            // parse result
    int num;            // arg for fac()
    PyObject* retval;   // return value

    res = PyArg_ParseTuple(args, "i", &num);
    if (!res) {         // TypeError
        return NULL;
    }
    res = fac(num);
    retval = (PyObject*)Py_BuildValue("i", res);
    return retval;
}

The first step is to parse the data received from Python. It should be a regular integer, so we use the “i” conversion code to indicate as such. If the value was indeed an integer, then it gets stored in the num variable. Otherwise, PyArg_ParseTuple() returns 0 (false) and sets a TypeError exception, in which case we return NULL. Returning NULL signals the failure to the interpreter, which raises the TypeError and tells the client user that we are expecting an integer.

We then call fac() with the value stored in num and put the result in res, reusing that variable. Now we build our return object, a Python integer, again using a conversion code of “i.” Py_BuildValue() creates an integer Python object which we then return. That’s all there is to it!

In fact, once you have created wrapper after wrapper, you tend to shorten your code somewhat to avoid extraneous use of variables. Try to keep your code legible, though. We take our Extest_fac() function and reduce it to its smaller version given here, using only one variable, num:

static PyObject *
Extest_fac(PyObject *self, PyObject *args) {
    int num;
    if (!PyArg_ParseTuple(args, "i", &num)) return NULL;
    return (PyObject*)Py_BuildValue("i", fac(num));
}

What about reverse()? Well, since you already know how to return a single value, we are going to change our reverse() example somewhat, returning two values instead of one. We will return a pair of strings as a tuple, the first element being the string as passed in to us, and the second being the newly reversed string.


To show you that there is some flexibility, we will call this function Extest.doppel() to indicate that its behavior differs from reverse(). Wrapping our code into an Extest_doppel() function, we get:

static PyObject *
Extest_doppel(PyObject *self, PyObject *args) {
    char *orig_str;
    if (!PyArg_ParseTuple(args, "s", &orig_str)) return NULL;
    return (PyObject*)Py_BuildValue("ss", orig_str,
        reverse(strdup(orig_str)));
}

As in Extest_fac(), we take a single input value, this time a string, and store it into orig_str. Notice that we use the “s” conversion code now. We then call strdup() to create a copy of the string. (Since we want to return the original one as well, we need a string to reverse, so the best candidate is just a copy of the string.) strdup() creates and returns a copy, which we immediately dispatch to reverse(). We get back a reversed string.
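The need for a copy is easier to see with reverse() in front of you. The original reverse() from Extest1.c is not reproduced in this section, so the version below is an assumed sketch of an in-place reversal that returns its argument. Because it overwrites the buffer it is given, handing it orig_str directly would scribble over the string that the “s” conversion code exposes, which still belongs to the Python string object.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for reverse() from Extest1.c: reverses s in
 * place and returns the same pointer. No memory is allocated here. */
char *reverse(char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    if (len < 2)
        return s;               /* empty or one character: nothing to swap */
    char *p = s;                /* walks forward from the first character */
    char *q = s + len - 1;      /* walks backward from the last character */
    while (p < q) {
        char t = *p;
        *p++ = *q;
        *q-- = t;
    }
    return s;
}
```

Since the function returns the pointer it was passed, the result of reverse(strdup(orig_str)) is the very block that strdup() allocated.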

As you can see, Py_BuildValue() puts together both strings using a conversion string of “ss.” This creates a tuple of two strings, the original string and the reversed one. End of story, right? Unfortunately, no.

We got caught by one of the perils of C programming: the memory leak, that is, when memory is allocated but not freed. Memory leaks are analogous to borrowing books from the library but not returning them. You should always release resources that you have acquired when you no longer require them. How did we commit such a crime with our code (which looks innocent enough)?

When Py_BuildValue() puts together the Python object to return, it makes copies of the data it has been passed. In our case here, that would be a pair of strings. The problem is that we allocated the memory for the second string, but we did not release that memory when we finished, leaking it.

What we really want to do is to build the return object and then free the memory that we allocated in our wrapper. We have no choice but to lengthen our code to:

static PyObject *
Extest_doppel(PyObject *self, PyObject *args) {
    char *orig_str;     // original string
    char *dupe_str;     // reversed string
    PyObject* retval;

    if (!PyArg_ParseTuple(args, "s", &orig_str)) return NULL;
    retval = (PyObject*)Py_BuildValue("ss", orig_str,
        dupe_str=reverse(strdup(orig_str)));
    free(dupe_str);
    return retval;
}


We introduce the dupe_str variable to point to the newly allocated string and build the return object. Then we free() the memory allocated and finally return to the caller. Now we are done.
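The same allocate, copy out, then free pattern can be tried without the interpreter. In the plain-C sketch below, snprintf() into a caller-supplied buffer plays the role of Py_BuildValue() making its own copies. All three function names are hypothetical, and copy_string() is a portable stand-in for strdup().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: heap-allocates a copy of s, like strdup(). */
static char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Hypothetical helper: reverses s in place and returns it. */
static char *reverse_in_place(char *s)
{
    size_t i = 0, j = strlen(s);
    while (j > 0 && i < --j) {
        char t = s[i];
        s[i++] = s[j];
        s[j] = t;
    }
    return s;
}

/* Writes "orig reversed" into out, then releases the temporary copy.
 * Returns 1 on success, 0 if allocation failed or out was too small. */
int build_pair(const char *orig, char *out, size_t size)
{
    assert(orig != NULL && out != NULL);
    char *dupe = copy_string(orig);
    if (dupe == NULL)
        return 0;
    reverse_in_place(dupe);
    int written = snprintf(out, size, "%s %s", orig, dupe);
    free(dupe);     /* out holds its own copy now, so dupe can go */
    return written >= 0 && (size_t)written < size;
}
```

The free() comes after the data has been copied into its final home and before the function returns, on every path that allocated memory.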

Add PyMethodDef ModuleMethods[] Array/Table for Each Module Function

Now that both of our wrappers are complete, we want to list them somewhere so that the Python interpreter knows how to import and access them. This is the job of the ModuleMethods[] array.

It is an array of PyMethodDef entries, each containing information about one function, terminated by an entry of NULLs marking the end of the list. For our Extest module, we create the following ExtestMethods[] array:

static PyMethodDef ExtestMethods[] = {
    { "fac", Extest_fac, METH_VARARGS },
    { "doppel", Extest_doppel, METH_VARARGS },
    { NULL, NULL },
};

The Python-accessible names are given, followed by the corresponding wrapping functions. The constant METH_VARARGS is given, indicating a set of arguments in the form of a tuple. If we are using PyArg_ParseTupleAndKeywords() with keyworded arguments, we would logically OR this flag with the METH_KEYWORDS constant. Finally, a pair of NULLs properly terminates our list of two functions.
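Why the terminating NULLs matter is easy to demonstrate in plain C. The table carries no length, so whoever walks it has to stop at the sentinel entry. The sketch below uses a simplified, hypothetical struct method_entry in place of the real PyMethodDef (which also carries flags and a docstring), and find_method() is an invented lookup, not how the interpreter is actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for PyMethodDef: a name paired with a function. */
typedef int (*int_func)(int);

struct method_entry {
    const char *name;
    int_func    func;
};

static int twice(int n)  { return 2 * n; }
static int square(int n) { return n * n; }

static const struct method_entry demo_methods[] = {
    { "twice",  twice  },
    { "square", square },
    { NULL,     NULL   },   /* sentinel: marks the end of the table */
};

/* Walks the table until it reaches the sentinel entry.
 * Returns the matching function, or NULL if name is not listed. */
int_func find_method(const struct method_entry *table, const char *name)
{
    assert(table != NULL && name != NULL);
    for (; table->name != NULL; table++) {
        if (strcmp(table->name, name) == 0)
            return table->func;
    }
    return NULL;
}
```

Leave out the sentinel and the loop runs past the end of the array, which is why forgetting the final pair of NULLs in a method table tends to end in a crash at import time.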

Add void initModule() Module Initializer Function

The final piece to our puzzle is the module initializer function. This code is called when our module is imported for use by the interpreter. In this code, we make one call to Py_InitModule() along with the module name and the name of the ModuleMethods[] array so that the interpreter can access our module functions. For our Extest module, our initExtest() procedure looks like this:

void initExtest() {
    Py_InitModule("Extest", ExtestMethods);
}


We are now done with all our wrapping. We add all this code to our original code from Extest1.c and merge the results into a new file called Extest2.c, concluding the development phase of our example.

Another approach to creating an extension would be to make your wrapping code first, using “stubs” or test or dummy functions which will, during the course of development, be replaced by the fully functional pieces of implemented code. That way you can ensure that your interface between Python and C is correct, and then use Python to test your C code.
