Compiling Python Code with Cython

If you have been developing in Python for a while, you have probably heard of Cython and how it speeds things up. Cython is an optimizing static compiler for both the Python programming language and the Cython programming language, which is a superset of Python. Cython converts your Python code to C and then compiles it with a C compiler of your choice. In the Python world, this is commonly called Cythonizing. The speed gain can be significant, but it still depends on how well optimized your Python code is.

How to Cythonize Python code?

The first step is to have a C compiler available for the platform and Python version that we are working with. If we are developing on Linux, we usually do not need to install anything, since most Linux boxes come with the GCC compiler installed. On Windows, there is a recommended set of compilers for specific Python versions available here.
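Before going further, it can help to check whether a compiler is already reachable. The following is a minimal sketch using only the standard library; the function name find_c_compilers and the list of executable names are assumptions for illustration, not something Cython or setuptools requires.

```python
import shutil

# Common C compiler executable names; this list is an assumption, not exhaustive.
CANDIDATES = ("gcc", "cc", "clang", "cl")


def find_c_compilers(names=CANDIDATES):
    """Return a dict mapping each compiler name found on PATH to its full path."""
    found = {}
    for name in names:
        path = shutil.which(name)
        if path is not None:
            found[name] = path
    return found


if __name__ == "__main__":
    compilers = find_c_compilers()
    if compilers:
        for name, path in compilers.items():
            print(f"{name}: {path}")
    else:
        print("No C compiler found on PATH")
```

An empty result on Windows does not necessarily mean the build will fail: cl is normally only on PATH inside a Visual Studio developer prompt, and setuptools looks up the Visual Studio installation by other means.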

In this guide, we will be using Python 3.7 on Windows 10. The easiest and fastest route for us is to download and install Visual Studio Community 2019. During installation, choose Desktop development with C++, click Install, and that's it! This downloads the tools and SDKs for C and C++ development.

The next step is to install Cython using pip.

pip install cython
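To confirm that the install landed in the interpreter we are about to build with, a small check like the one below may be useful. It is only a sketch; cython_available is a hypothetical helper name.

```python
import importlib.util


def cython_available():
    """Return True if the Cython package can be imported in this environment."""
    return importlib.util.find_spec("Cython") is not None


if __name__ == "__main__":
    if cython_available():
        import Cython
        print("Cython", Cython.__version__)
    else:
        print("Cython is not installed; run: pip install cython")
```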

Now we can start working on our Python module. Let us say we have a Python file named module.py containing the function hello() and we want to Cythonize it.

#!/usr/bin/env python


def hello():
    print("Hello world!")

The first step in Cythonizing is to write a standard setuptools setup.py containing the definition for ext_modules. We will simply pass our module file name to the cythonize() function. In setuptools, our Cythonized module is called an extension.

#!/usr/bin/env python
from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules=cythonize('module.py')
)

The last step is to build our extension by executing setup.py. The --inplace argument places the built extension in the same location as module.py.

python setup.py build_ext --inplace

We will end up with the following files and directories. The build directory contains all the files and objects used by the C compiler. The files that matter to us are module.c, which is the C equivalent of our Python code, and module.cp37-win_amd64.pyd, which is our compiled extension.

build/
module.c
module.cp37-win_amd64.pyd
module.py
setup.py
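The cp37-win_amd64 part of the file name comes from the interpreter that ran the build. As a rough illustration, the sketch below asks the running interpreter which extension suffix it expects; expected_extension_filename is a hypothetical helper, and it assumes the first entry of importlib.machinery.EXTENSION_SUFFIXES is the most specific one, which holds on the CPython builds I am aware of.

```python
import importlib.machinery


def expected_extension_filename(module_name):
    """Return the file name a compiled extension would have on this interpreter."""
    # The first entry is assumed to be the most specific suffix,
    # for example ".cp37-win_amd64.pyd" on 64-bit Windows with Python 3.7.
    suffix = importlib.machinery.EXTENSION_SUFFIXES[0]
    return module_name + suffix


if __name__ == "__main__":
    print(expected_extension_filename("module"))
```

On Linux the result ends in .so instead of .pyd, which is why the listing above will look different there.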

To use our compiled module, we simply import it like a normal Python module, here from a script named example.py.

#!/usr/bin/env python
from module import hello

if __name__ == '__main__':
    hello()

Output:

$ python example.py
Hello world!
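To see what the compiled build actually buys you, one option is to time the same call before and after building. This is a minimal sketch with the standard timeit module; best_time and work are hypothetical names, and work is only a stand-in for a function you would import from your own module.

```python
import timeit


def work(n=10_000):
    """Stand-in workload: sum of squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def best_time(func, repeat=5, number=100):
    """Return the best per-call time in seconds over several timing runs."""
    timer = timeit.Timer(func)
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    print(f"{best_time(work):.6f} s per call")
```

Run it once against the plain .py file and once after building the extension, then compare the two numbers.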

How to Cythonize large Python packages?

For this example, we will be using the amortization module that we used in our previous blog posts. Most guides on the internet suggest the following approach, which is wrong and will not compile our code:

#!/usr/bin/env python
from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules=cythonize('amortization/*.py')
)

The reason is that a package's __init__.py cannot be compiled, at least not with the normal methods. There is a somewhat hacky way to do it, but I will not discuss that here. The build fails at the link step with:

LINK : error LNK2001: unresolved external symbol PyInit___init__
build\temp.win-amd64-3.7\Release\amortization\__init__.cp37-win_amd64.lib : fatal error LNK1120: 1 unresolved externals
error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.20.27508\\bin\\HostX86\\x64\\link.exe' failed with exit status 1120

To solve this, we need to refactor our code and move all code out of __init__.py. We keep this file empty in the package and do not compile it.

#!/usr/bin/env python
from setuptools import setup, Extension

from Cython.Build import cythonize

ext_modules = cythonize([
    Extension("amortization.amount", ["amortization/amount.py"]),
    Extension("amortization.schedule", ["amortization/schedule.py"]),
    Extension("amortization.amortize", ["amortization/amortize.py"]),
])

setup(
    ext_modules=ext_modules
)
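Listing every module by hand gets tedious as a package grows. One possible way around that, sketched below with only the standard library, is to collect the module names and paths automatically while skipping __init__.py. The helper name discover_modules is hypothetical, and the sketch only looks at the top level of the package, not at subpackages.

```python
from pathlib import Path


def discover_modules(package_dir):
    """Return (dotted_name, source_path) pairs for a package, skipping __init__.py."""
    package = Path(package_dir)
    pairs = []
    for source in sorted(package.glob("*.py")):
        if source.name == "__init__.py":
            continue  # left as plain Python, not compiled
        dotted_name = f"{package.name}.{source.stem}"
        pairs.append((dotted_name, source.as_posix()))
    return pairs
```

In setup.py, each pair could then be turned into Extension(name, [path]) and the resulting list passed to cythonize(), in place of the three hand-written entries.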

After running python setup.py build_ext --inplace, we will end up with the following files.

__init__.py
amortize.c
amortize.cp37-win_amd64.pyd
amortize.py
amount.c
amount.cp37-win_amd64.pyd
amount.py
schedule.c
schedule.cp37-win_amd64.pyd
schedule.py

I temporarily moved the .py files (except __init__.py) away and ran pytest -v to verify that the code works. There is no need to do this, though, since Python imports the compiled modules (.so on Unix and .pyd on Windows) if they are available.

tests/test_amortization.py::test_amortization_amount PASSED              [ 50%]
tests/test_amortization.py::test_amortization_schedule PASSED            [100%]

========================== 2 passed in 0.05 seconds ===========================
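Instead of moving files around, it is also possible to ask an imported module where it was loaded from. The sketch below is one way to do that; is_compiled is a hypothetical helper that compares the module's __file__ against the extension suffixes known to the running interpreter.

```python
import importlib.machinery


def is_compiled(module):
    """Return True if the module was loaded from a compiled extension file."""
    path = getattr(module, "__file__", None)
    if not path:
        return False  # built-in modules have no __file__
    return path.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES))


if __name__ == "__main__":
    import json
    print(json.__name__, is_compiled(json))
```

Calling is_compiled() on amortization.amount after the build should tell us whether the .pyd or the .py file was picked up.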

How to distribute packages with Cython support on PyPI?

Simply running python setup.py bdist_wheel gives you a binary wheel that can only be used on the same platform and Python version that you built it with. Note that you should install the wheel package before executing the command. There are two ways to support all platforms and versions:

  1. Build binary wheels on all target platforms and versions and upload to PyPI
  2. Upload only the source to PyPI and let the user build it
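To get a feel for why a binary wheel is tied to one interpreter and platform, the sketch below approximates the tags that end up in a wheel's file name. It is an illustration only: build_target is a hypothetical helper, it assumes CPython, and real wheel names also carry an ABI tag and may use different platform tags on Linux.

```python
import sys
import sysconfig


def build_target():
    """Return (python_tag, platform_tag) approximating this interpreter's wheel tags."""
    # "cp" is the CPython prefix; other implementations use different prefixes.
    python_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
    # sysconfig.get_platform() gives e.g. "win-amd64"; wheel tags use underscores.
    platform_tag = sysconfig.get_platform().replace("-", "_").replace(".", "_")
    return python_tag, platform_tag


if __name__ == "__main__":
    print("-".join(build_target()))
```

Every combination of these two values that we want to support needs its own build under the first option.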

The first option takes a lot of effort, but we can automate it in a CI/CD pipeline. The fastest route is the second option, as you only need to make minor tweaks to setup.py.

#!/usr/bin/env python
from setuptools import setup, Extension

try:
    from Cython.Build import cythonize

    ext_modules = cythonize([
        Extension("amortization.amount", ["amortization/amount.py"]),
        Extension("amortization.schedule", ["amortization/schedule.py"]),
        Extension("amortization.amortize", ["amortization/amortize.py"]),
    ])
except ImportError:
    ext_modules = None

setup(
    ext_modules=ext_modules
)

To build the source-only package, uninstall Cython first and make sure to remove all *.c and *.pyd files in the amortization package, then run python setup.py sdist. The only disadvantage of this option is that the end user has to install Cython and a C compiler to get the compiled modules; without Cython, the ImportError branch in setup.py sets ext_modules to None and nothing is compiled. Try this by doing the following steps:

# install a C compiler first
pip install cython
pip install amortization -v  # Add -v to see what is happening behind the scenes
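The cleanup that has to happen before python setup.py sdist can be scripted. The following is a minimal sketch, not part of the original workflow: find_generated_files and clean_generated_files are hypothetical helpers, and they assume that every .c, .pyd, or .so file inside the package was generated by the build. Do not use it as-is on a package that contains hand-written C sources.

```python
from pathlib import Path

# Assumption: every file with these suffixes in the package was generated by the build.
GENERATED_SUFFIXES = (".c", ".pyd", ".so")


def find_generated_files(package_dir):
    """Return build artifacts under package_dir, sorted by path."""
    package = Path(package_dir)
    return sorted(
        path for path in package.rglob("*")
        if path.is_file() and path.suffix in GENERATED_SUFFIXES
    )


def clean_generated_files(package_dir, dry_run=True):
    """Delete build artifacts; with dry_run=True only report what would be removed."""
    targets = find_generated_files(package_dir)
    if not dry_run:
        for path in targets:
            path.unlink()
    return targets
```

Calling clean_generated_files("amortization") first with the default dry_run=True shows what would be deleted before anything is removed.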

To wrap up

Cython increases the speed of a Python module by compiling Python code to C. Although speed is the most common reason developers use Cython, we can also use it for code obfuscation. If we want to protect our code from other people's eyes, we can build it using Cython and distribute it without the source code.
