By Serdar Yegulalp
Senior Writer, InfoWorld
As Python’s popularity rises, its limitations are becoming clearer. For one thing, it can be very hard to write a Python application and distribute it to people who don’t have Python installed.
The most common way to solve this issue is to package the program together with all its supporting libraries and files and the Python runtime. There are tools for doing this, like PyInstaller, but they require a lot of coaxing to work correctly. What’s more, it’s often possible to extract the source code for the Python program from the resulting package. For some scenarios, that’s a deal breaker.
Nuitka, a third-party project, offers a radical solution. It compiles a Python program to a C binary—not by packaging the CPython runtime with the program bytecode, but by translating Python instructions into C. The results can be distributed in a zipped bundle or packaged into an installer with another third-party product.
Nuitka tries to maintain maximum compatibility with the Python ecosystem, so third-party libraries like NumPy work reliably. It also tries to improve the performance of compiled Python programs whenever possible, but without sacrificing that compatibility. Speedups aren’t guaranteed: they vary tremendously between workloads, and some programs might not see any significant improvement. As a general rule, it’s best to treat Nuitka as a bundling solution rather than rely on it to improve performance.
Nuitka works with Python 2.6 through 2.7 and Python 3.3 through 3.10. It can compile binaries for Microsoft Windows, macOS, Linux, and FreeBSD/NetBSD. Note that you must build the binaries on the target platform; you cannot cross-compile.
For every platform, aside from needing the Python runtime, you’ll also need a C compiler. On Microsoft Windows, Visual Studio 2022 or higher is recommended, but it is also possible to use MinGW-w64 C11 (gcc 11.2 or higher) or clang-cl under Visual Studio. On other platforms, you can use gcc 5.1 or higher or g++ 4.4 or higher.
Note that if you use Python 3.3 or Python 3.4 (both long deprecated), you’ll also need Python 2.7 installed because of tool dependencies. All of this is a good argument for using the most recent version of Python if you can.
It’s best to install Nuitka in a virtual environment along with your project as a development dependency rather than a distribution dependency. Nuitka itself isn’t bundled with or used by your project; it performs the bundling.
Once you have Nuitka installed, use python -m nuitka to invoke it.
The first thing you’ll want to do with Nuitka is to verify the entire toolchain works, including your C compiler. To test this, you can compile a simple “Hello world” Python program. Call it main.py.
When you compile a Python program with Nuitka, you pass the name of the entry-point module as a parameter to Nuitka, for example, nuitka main.py. When invoked like this, Nuitka will take in main.py and build a single executable from it.
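A minimal main.py for this toolchain check might look like the sketch below. The main() function and the greeting text are illustrative choices, not something Nuitka requires.

```python
# main.py: a minimal program for checking that Nuitka and the C compiler work.


def main():
    # Printing a fixed string makes it easy to confirm that the compiled
    # executable behaves the same as the interpreted script.
    message = "Hello world"
    print(message)
    return message


if __name__ == "__main__":
    main()
```

Running python main.py first confirms the script itself works before any compilation is attempted.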
Note that because we’re just testing out Nuitka’s functionality, it will only compile that one Python file to an executable. It will not compile anything else, nor will it bundle anything for redistribution. But compiling one file should be enough to determine if Nuitka’s toolchain is set up correctly.
Once the compilation finishes, you should see a binary executable file placed in the same directory as the Python program. Run the executable to ensure it works.
You can also automatically run your Nuitka-compiled app by passing --run as a command-line flag.
If your “Hello world” test executable works, you can try packaging it as a redistributable. I’ll explain that process shortly.
Note that when you run your first test compilation with Nuitka, it will probably complete in a matter of seconds. Don’t be fooled by this! Right now, you are only compiling a single module, not your entire program. Compiling a full program with Nuitka can take many minutes or longer, depending on how many modules the program uses.
By default, Nuitka only compiles the module you specify. If your module has imports—whether from elsewhere in your program, from the standard library, or from third-party packages—you’ll need to specify that those should be compiled, too.
Consider a modified version of the “Hello world” program, where you have an adjacent module and a modified main.py that imports it.
To have both modules compiled, you can use the --follow-imports switch.
The switch ensures that all the imports required throughout the program are traced from the import statements and compiled together.
Another option, --nofollow-import-to, lets you exclude specific subdirectories from the import process. This option is useful for screening out test suites, or modules that you know are never used. It also lets you supply a wildcard as an argument.
Figure 1. Compiling a large, complex program with Nuitka. This example involves compiling the Pyglet module along with many modules in the standard library, which takes several minutes.
Now comes one of the wrinkles Python users often confront when trying to package a Python application for distribution. The --follow-imports option only follows imports explicitly declared in code by way of an import statement. It doesn’t handle dynamic imports.
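To make the problem concrete, here is a small sketch of a dynamic import. The load_plugin helper and the "exporter" plugin name in the comment are hypothetical, and the standard library's json module stands in for a real plugin so the example runs anywhere.

```python
import importlib


def load_plugin(name, package=None):
    """Import a module whose name is only known at runtime."""
    # The module name is a string assembled while the program runs, so
    # there is no import statement for a static import tracer to find.
    full_name = f"{package}.{name}" if package else name
    return importlib.import_module(full_name)


# Stand-in for a plugin name read from a config file or user input.
# A real program might call load_plugin("exporter", package="mods").
plugin = load_plugin("json")
print(plugin.dumps({"loaded": True}))
```

Nothing in this file names the imported module in an import statement, which is why such modules have to be pointed out to the compiler explicitly.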
To get around this, you can use the --include-plugin-directory switch to provide one or more paths to modules that are dynamically imported. For instance, for a directory named mods that contains dynamically imported code, you would use --include-plugin-directory=mods.
If your Python program uses data files loaded at runtime, Nuitka can’t automatically detect those, either. To include individual files and directories with a Nuitka-packaged program, you’d use --include-data-files and --include-data-dir.
--include-data-files lets you specify a wildcard for the files to copy and where you want them copied to. For instance, --include-data-files=/data/*=data/ takes all the files that match the wildcard /data/* and copies them to data/ in your distribution directory.
--include-data-dir works roughly the same way, except that it uses no wildcard; it just lets you pass a path to copy and a destination in the distribution folder to copy it to. As an example, --include-data-dir=/path/to/data=data would copy everything in /path/to/data to the matching directory data in your distribution directory.
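On the program side, the bundled files still have to be found at runtime. The sketch below shows one way to do that, assuming the files end up in a data folder next to the program and that the compiled program reports a module location inside the distribution directory. The data_path and read_data helpers are hypothetical names.

```python
from pathlib import Path


def data_path(filename, base_dir=None):
    """Return the path of a bundled data file under the 'data' folder."""
    # Assumption: the data folder sits next to this module, which matches
    # a destination of "data" or "data/" in the include options above.
    if base_dir is None:
        base_dir = Path(__file__).resolve().parent
    return Path(base_dir) / "data" / filename


def read_data(filename, base_dir=None):
    """Read a bundled text file, with a clear error if it is missing."""
    path = data_path(filename, base_dir)
    if not path.is_file():
        raise FileNotFoundError(f"Data file not bundled or not found: {path}")
    return path.read_text(encoding="utf-8")
```

Resolving paths relative to the module rather than the current working directory keeps the lookup stable no matter where the user launches the program from.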
Another way to specify imports is by using a Python-style package namespace rather than a file path, using the --include-package option. For instance, --include-package=mypackage would include mypackage, wherever it is on disk (assuming Python could locate it), and everything below it.
If packages require their own data files, you can include those with the --include-package-data option. This option tells Nuitka to pick up any files in the package directory that aren’t actually code.
If you only want to include a single module, you can use --include-module. For instance, --include-module=mypackage.mymodule tells Nuitka to include only mypackage.mymodule, but nothing else.
When you want to compile a Python program with Nuitka for redistribution, you can use a command-line switch, --standalone, that handles most of the work. This switch automatically follows all imports and generates a dist folder that includes the compiled executable and any support files needed. To redistribute the program, you only need to copy this directory.
Don’t expect a --standalone-compiled program to work the first time you run it. The general dynamism of Python programs all but guarantees you’ll need to use some of the other above-described options to ensure the compiled program runs properly. For instance, if you have a GUI app that requires specific fonts, you may have to copy them into your distribution with --include-data-files or --include-data-dir.
Also, as noted above, the compilation time for a --standalone application may be significantly longer than for a test compilation. Once you have some idea of how long a build takes, budget that time into your testing of the standalone-built application.
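Because a standalone build usually ends up needing several of these options at once, it can help to keep the command in a small build script. The following is a sketch only: the standalone_command helper is a hypothetical name, and the package and data paths are the placeholder values used earlier in this article.

```python
import sys


def standalone_command(entry_point, packages=(), data_dirs=(), data_files=()):
    """Build the argument list for a standalone Nuitka compile."""
    cmd = [sys.executable, "-m", "nuitka", "--standalone"]
    # Packages that static import tracing would otherwise miss.
    cmd += [f"--include-package={name}" for name in packages]
    # Each data entry is a (source, destination) pair.
    cmd += [f"--include-data-dir={src}={dst}" for src, dst in data_dirs]
    cmd += [f"--include-data-files={src}={dst}" for src, dst in data_files]
    cmd.append(entry_point)
    return cmd


command = standalone_command(
    "main.py",
    packages=["mypackage"],
    data_dirs=[("/path/to/data", "data")],
)
print(" ".join(command))
# To run the build: subprocess.run(command, check=True)
```

Keeping the options in one place makes it easier to add a missing package or data folder after a failed test run and then rebuild with the same settings.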
Finally, Nuitka offers another build option, --onefile. For those familiar with PyInstaller, --onefile works like the option of the same name in that program: it compresses your entire application, including all its dependent files, into a single executable with no other files needed for redistribution. However, it is important to know that --onefile works differently on Linux and Microsoft Windows. On Linux, it mounts a virtual filesystem with the contents of the archive. On Windows, it unpacks the files into a temporary directory and runs them from there, and it has to do this for each run of the program. Using --onefile on Windows may significantly increase the time it takes to start the program.
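One way to see how much startup time a given build costs is to time it directly. The sketch below measures how long a command takes to run to completion; the startup_seconds helper is a hypothetical name, and the Python interpreter stands in for a compiled executable so the example runs anywhere.

```python
import subprocess
import sys
import time


def startup_seconds(command, runs=3):
    """Return the fastest wall-clock time, in seconds, to run a command."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the program exits with a non-zero status.
        subprocess.run(command, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return min(timings)


# Stand-in command so the sketch runs anywhere; replace it with the path
# to a --standalone or --onefile executable to compare the two builds.
baseline = startup_seconds([sys.executable, "-c", "pass"])
print(f"Fastest run: {baseline:.3f} s")
```

This measures the full run, so it is most meaningful for a program that exits right after starting, such as the “Hello world” test.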
Serdar Yegulalp is a senior writer at InfoWorld, focused on machine learning, containerization, devops, the Python ecosystem, and periodic reviews.
Copyright © 2022 IDG Communications, Inc.