Interfacing Python with C using ctypes

Python is a wonderful “very high level” language with an elegant design. It is an ultimate tool to rapidly develop applications. However, when it comes to performance (speed and memory), Python sucks. It is not meant for performance. So what do you do after building a quick prototype in python if you want it to be lean and mean?

From the beginning, Python’s designers understood this use-case and exposed a “C” API. Using the Python C-API, one can write “modules” in C which can then be imported into Python. This solution is not bad and provided you have enough patience, works great. There is a tool called SWIG which can generate the “glue” code around C code. It automates writing of code using C-API and makes it easier for one to maintain the C “module”. However, since SWIG generates code, when some problem occurs, it is quite painful to debug through the wrapper code. For the lazy developers out there (like me :) ), this can be quite a deterrent.

I’ve been working on a project for using Wikipedia data to assign “topics” to arbitrary pieces of English text. The code written in pure Python takes about 3 minutes to run. When I profiled this code, I found that most of the time was being spent in disk reads (through the db). I decided, after examining the data I had, to load all of it into memory. I tried doing this in pure Python and saw the memory usage creeping up beyond the 4GB capacity when I decided to save my poor machine from thrashing by killing the hog!

On doing some tests, I found that Python consumes about 32 bytes to store an integer! Hmmm…. time to move the data structures to C for more efficient memory usage. I started looking around for tools to interface C code with Python when I came across “ctypes” and immediately fell in love with it.

ctypes lets you load any “shared object” or “dynamic link library” in a Python program and unobtrusively call functions. You can construct C datatypes required by C functions (pointers, longs, ints, chars, arrays, structs) and even specify callback functions to be passed to C code! I will give you a basic introduction to ctypes to get you started with Python – C interfacing.

Code

test.c

#include 

// you can initialize stuff in this function
// it is called when the so is loaded
void _init()
{
    printf("Initialization of shared object\n");
}

// you can do final clean-up in this function
// it is called when the so is getting unloaded
void _fini()
{
    printf("Clean-up of shared object\n");
}

int add(int a, int b)
{
    return(a+b);
}

int sum_values(int *values, int n_values)
{
    int i;
    int sum = 0;

    for (i=0; i

We have to compile test.c to create the shared object.

gcc -fPIC -c test.c
ld -shared -soname libtest.so.1 -o libtest.so.1.0 -lc test.o

test.py

from ctypes import *

# load the shared object
libtest = cdll.LoadLibrary('./libtest.so.1.0')

# call the function, yes it is as simple as that!
print libtest.add(10, 20)

# call the sum_values() function
# we have to create a c int array for this
array_of_5_ints = c_int * 5
nums = array_of_5_ints()

# fill up array with values
for i in xrange(5): nums[i] = i

# since the function expects an array pointer, we pass is byref (provided by ctypes)
print libtest.sum_values(byref(nums), 5)


python test.py

Output

Initialization of shared object
30
10
Clean-up of shared object



How much simpler can Python - C interfacing become? ctypes is a standard module from Python 2.5.

Resources

3 Trackbacks

  1. [...] not just a C++ book). It’s not like there aren’t other options, however, as SWIG and ctypes (latter is for Python only which we’ll talk more about on Friday) are specifically designed [...]

  2. [...] the instance of “c-code.A.dylib” with “libtest.so.1.0″. So just like this tutorial showed me, we will call “LoadLibrary” (done later) to load your C functions which [...]

  3. [...] need to replace the instance of “c-code.A.dylib” with “libtest.so.1.0″. So just like this tutorial showed me, we will call “LoadLibrary” (done later) to load your C functions which returns a [...]

7 Comments

  1. I’m unfamiliar with Python’s implementation so this may seem naive. Why is it bad to use 32-bits to store a 32-bit integer? What am I missing here?

    Posted January 7, 2008 at 11:34 pm | Permalink
  2. Python takes 32 bytes!

    Posted January 8, 2008 at 6:59 am | Permalink
  3. Oops! Gotcha.

    Posted January 8, 2008 at 7:23 am | Permalink
  4. Here is good slide presentation I found about ctypes.

    Posted January 8, 2008 at 12:22 pm | Permalink
  5. wonderful. wonderful. thank you so much.

    Posted January 8, 2008 at 2:55 pm | Permalink
  6. Sharmila

    Superb! Will check this out.

    Posted January 19, 2008 at 11:18 pm | Permalink
  7. If you need to return a char* from a c function and use that as a python string from python, you should do something like this,

    char* get_string()
    {
        char* string = (char*) malloc(10);
        strcpy(string, "bingo");
        return string
    }
    

    from python:

    >>> libtest.get_string.restype = c_char_p
    >>> libtest.get_string()
    'bingo'
    
    Posted August 10, 2009 at 3:29 pm | Permalink