omniverse theirix's Thoughts About Research and Development

uv for fast wheels

To build or not to build?

You have a Python package. It solves a technical or business problem. You have decided to share it with the open-source community. Okay, just set up GitHub Actions and publish it to PyPI. Periodically, you carve out a new version and publish it. Everybody is happy – developers, consumers, and the community.

Then you suddenly discover that Python is slow. It is perfectly fine for many use cases, but yours is doing CPU-intensive work. You think it’s worth shifting that work from the interpreted language to a native extension written in C, C++ or Rust. There are a few approaches to this – Cython, CFFI, pure low-level CPython extensions. CFFI is the most widely used. You need to write native code in C, build it into a library, load the library from Python and call it while adhering to calling conventions and ownership semantics.
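Before reaching for CFFI specifically, the mechanics are worth seeing once. Here is a minimal sketch using only the standard-library ctypes module (assuming the C math library is loadable on your system) of what every native-extension approach ultimately does: load a shared library, declare the calling convention, and call into C.

```python
# Not the package's code – a standard-library illustration of the glue that
# CFFI generates for you: load a shared library, declare argument and return
# types (the calling convention), then call the C function.
import ctypes
import ctypes.util

libm = ctypes.CDLL(ctypes.util.find_library("m"))  # the C math library
libm.sqrt.argtypes = [ctypes.c_double]  # one C double goes in...
libm.sqrt.restype = ctypes.c_double     # ...and one C double comes out
print(libm.sqrt(9.0))  # 3.0
```

Without the argtypes/restype declarations, ctypes would guess the types and could silently corrupt values – exactly the ownership and convention pitfalls mentioned above.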

The problem is how to build and package a binary distribution. Different projects go in different directions. Here is my experience of trying and using them.

Poetry

Poetry prefers having explicit build scripts, and the Poetry build backend is aware of them.

[build-system]
requires = ["poetry-core", "setuptools"]
build-backend = "poetry.core.masonry.api"

[tool.poetry.build]
script = "scripts/build.py"

The build script is a simple Python module invoking setuptools.command.build_ext and cffi or cython. The downside of this approach is that it is still unofficial and unstable – the behaviour changed with Poetry 2.
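For illustration, a hypothetical scripts/build.py could look like the sketch below. The build() hook follows the unofficial Poetry 1.x convention: Poetry calls it with the keyword arguments it is about to pass to setuptools' setup(), and the hook may mutate them. Treat it as a sketch, not a stable API.

```python
# Hypothetical scripts/build.py for the unofficial Poetry 1.x build hook.
# Poetry invokes build() with the setup() kwargs it has assembled; the hook
# mutates them – here by registering a CFFI module for compilation.
def build(setup_kwargs: dict) -> None:
    setup_kwargs.update(
        {
            "cffi_modules": ["scripts/build.py:ffibuilder"],
            "setup_requires": ["cffi>=1.0.0"],
        }
    )
```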

Setuptools

With good old setuptools, it is easy to use its official and supported integrations – ext_modules with cythonize, or ffibuilder – via the following setup.py:

setup(
    setup_requires=["cffi>=1.0.0"],
    cffi_modules=["scripts/build.py:ffibuilder"],
    install_requires=["cffi>=1.0.0"],
)

It worked like a charm for decades, but now it’s outdated. Even with dependency control via pip freeze, you will run into many well-known Python packaging problems.

uv

What if you want to combine a modern packaging tool like uv and build binary distributions?

uv uses pyproject.toml for project configuration. The main benefit of the pyproject format is the standardisation of storing metadata for a project, build backend and external tools. It’s a big step forward from the imperative setup.py. It allows not only storing metadata, but also simplifying tools and significantly improving performance, because dependency resolution no longer requires evaluating Python code in setup.py.

Of course, all the dependencies – build or runtime – can be specified in the same file. uv has a great build backend, uv_build (used by default) – extremely fast (because, you know, it’s in Rust ⚡) and reliable. It fully supports the pyproject.toml standard, including the standard [project] table.

Binary distributions and uv

Unfortunately, the uv_build backend only supports pure Python packages. For binary extensions, the easiest way to overcome this limitation is to use the setuptools build backend instead.

[build-system]
requires = ["setuptools>=61", "cffi"]
build-backend = "setuptools.build_meta"

setuptools itself supports multiple configuration formats, including declarative TOML:

[project]
dependencies = [ "numpy", "h3" ]

or setup.py:

setup(install_requires=[ "numpy", "h3" ])

But not all tools follow suit. Each integrated tool decides for itself where to get its data from. For example, CFFI doesn’t support declarative pyproject.toml as a data source for cffi_modules, which is sad because you have to keep a setup.py file, albeit a simple one:

from setuptools import setup
setup(cffi_modules=["scripts/build.py:ffibuilder"])
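For completeness, a hypothetical scripts/build.py defining that ffibuilder object might look like the sketch below; the module name, function signature and C body are illustrative, not timezonefinder’s real sources.

```python
# Hypothetical scripts/build.py – the object that cffi_modules points at.
from cffi import FFI

ffibuilder = FFI()
# Declarations that become visible on the Python side
ffibuilder.cdef("int add(int a, int b);")
# The real C implementation compiled into the extension module "_native"
ffibuilder.set_source(
    "_native",
    "int add(int a, int b) { return a + b; }",
)

if __name__ == "__main__":
    # Compile only when run directly; setuptools drives this during a build
    ffibuilder.compile(verbose=True)
```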

Okay, now we’re back to setup.py. Poetry 1.x was able to generate setup.py with build instructions under the hood, so this approach is more or less familiar.

The most challenging part is to configure package discovery and file packaging properly. Here’s how we did it in the open-source timezonefinder package with @jannikm. It has a flat layout. Compared to the src layout, a flat layout can bring unnecessary files into the package; that can be avoided by switching to the src layout with the where = ["src"] directive. If you keep a flat layout but still follow standard naming conventions, automatic package discovery can work without any additional tweaks – it excludes typical directories like tests, examples, etc. To include extra files, like binaries built in the module timezonefinder/inside_poly_extension, you have to list them in the packages directive explicitly:

[tool.setuptools]
packages = ["tests", "timezonefinder", "timezonefinder.inside_poly_extension"]

[tool.setuptools.package-data]
timezonefinder = ["**/*.json", "**/*.c", "**/*.h", ]

Since automatic package discovery is used, the build.py script cannot be stored at the root of the repository (it is in the exclusion list). Placing the script in the timezonefinder package eliminates this problem. Also, in this example, the tests are packaged too, because we need them in the distribution. The build script itself is simple and compiles a dynamic library from sources.

Essentially, the whole process of building the package is either

uv build --sdist

or

uv build --wheel

uv and CI

uv also simplifies CI setup for many projects.

If a project is pure Python, nothing special is required. Run tests, run a linter, make an sdist and upload it to PyPI. uv can streamline commands and simplify tool installation.

Binary packaging complicates the CI build. Now the project has multiple binary targets. Possible dimensions are cp39, cp310, cp311 for Python versions and musllinux or manylinux for ABI targets. There are two approaches to this.
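To see how quickly these dimensions multiply, here is a sketch with illustrative version and ABI lists:

```python
# Every (Python version, ABI target) pair becomes a separate wheel to build.
from itertools import product

pythons = ["cp39", "cp310", "cp311"]
abis = ["manylinux", "musllinux"]
targets = [f"{py}-{abi}" for py, abi in product(pythons, abis)]
print(len(targets))  # 6 wheels from just two dimensions
```

Add architectures (x86_64, aarch64) and the matrix doubles again.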

First, the cibuildwheel tool. It manages the whole process automatically with a single GitHub Actions step by creating Docker containers for all permutations of specified Python versions and architectures. Inside each container, it runs a specified command (setuptools by default) and exports the produced wheels.

- name: "Build wheels"
  uses: pypa/[email protected]
  with:
    output-dir: dist
  env:
    CIBW_MANYLINUX_X86_64_IMAGE: quay.io/pypa/manylinux2014_x86_64:2025.03.09-1
    CIBW_MUSLLINUX_X86_64_IMAGE: quay.io/pypa/musllinux_1_1_x86_64:2024.10.26-1
    CIBW_BUILD: "cp39-* cp310-* cp311-* cp312-* cp313-*"
    CIBW_BUILD_FRONTEND: pip
    CIBW_BEFORE_ALL_LINUX_MANYLINUX2014: yum install -y libffi-devel clang make
    CIBW_BEFORE_ALL_LINUX_MUSLLINUX_1_1: apk add --no-cache libffi-dev clang make

The second approach is to utilise a matrix build. uv can manage Python versions itself, downloading prebuilt CPython binaries and executing scripts in their context. So the GitHub Actions build becomes even simpler:

strategy:
  matrix:
    python-version:
      - "cp310"
      - "cp311"
      - "cp312"
steps:
  - uses: actions/checkout@v4

  - name: Make wheel
    run: uv build --python ${{ matrix.python-version }} --wheel

Of course, if you want to achieve manylinux or abi3 compatibility, you’ll need to use cibuildwheel in combination with uv.

You could also use the GitHub action astral-sh/setup-uv@v6, but the example cited above is the shortest version to demonstrate the flow.

Conclusion

uv provides an excellent experience when working with modern Python packaging. Lightning-fast dependency resolution and tool ergonomics are the main benefits.

Binary distribution is not a first-class citizen in uv and requires falling back to the setuptools build backend, while still using uv as a frontend and keeping its ergonomics.

The problem of standardised binary distribution building is not yet solved by any tool. Approaching this could be the next big step for the Python ecosystem, especially when Python is increasingly accelerated with native code.


How Jenkins Age Becomes Tech Debt

Jenkins is a very popular and highly customisable CI system. It is so customisable that it could even become a problem, especially if you use plugins.

I maintain a really old and weird CI system. The master ran version 2.289, which dates back to 2021. What is even worse, the agent machines were running openSUSE 12 and Debian 7 – operating systems from 2011 and 2013. Those were glorious times — the Higgs boson was discovered, athletes ran at the London Olympics, and Snowden leaked tons of classified data. Anyway, I didn’t think it was such an ancient system – the master was only five years old. How naive I was…

The Plan

The usual upgrade plan for Jenkins is:

  1. Upgrade plugins
  2. Restart
  3. Upgrade Jenkins WAR to the next version
  4. Restart
  5. Upgrade plugins again because Jenkins is newer now
  6. Enjoy

And it usually works – but not this time.

Plugins disaster

Jenkins Fire Image by Jenkins Project CC BY-SA 3.0

Of course, I didn’t upgrade to versions past 2.479, which run on Java 17, because Debian 10 only supports OpenJDK 11 out of the box. So that was the easy part.

Then an innocent plugin upgrade led to unforeseen circumstances. It turns out the update centre doesn’t provide data for plugin versions older than one year. This is documented with obscure wording: “Do not use versions no longer supported by the update centre”. I think this is crucial information, and serving factually incorrect compatibility data is just wrong. In practice, it means you can be offered an ill-defined set of versions without a proper check against your baseline Jenkins version.

That’s exactly what happened in my case. I found Jenkins in a state where almost all plugins complained about incompatibility with the Jenkins version. I started downgrading them one by one. It’s quite difficult, since the Jenkins update centre doesn’t provide a way to downgrade a plugin, let alone to a specific version.

This means that for each plugin you must navigate to the plugin’s website, bisect versions to find one targeting Jenkins 2.479 at most, download the HPI from the releases page, upload it to the Jenkins Plugin Manager, and restart. And of course, you must construct the dependency graph of failed plugins yourself! The easiest way is to find the most used plugin – the node with the highest out-degree in the graph – and try to upgrade it first. Then move on to its dependencies. Hopefully, you won’t run into plugins conflicting with each other (I did).
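The heuristic above can be sketched with a hypothetical dependency graph; edges point from a plugin to the plugins that depend on it, so the highest out-degree marks the most depended-upon node – the one worth fixing first.

```python
# Hypothetical plugin dependency graph: plugin -> plugins that depend on it.
# The plugin names are illustrative, not a real Jenkins installation.
deps_on = {
    "workflow-step-api": {"git", "credentials", "workflow-cps"},
    "credentials": {"git", "ssh-slaves"},
    "git": set(),
}

# Highest out-degree = most dependents = fix this plugin first
most_used = max(deps_on, key=lambda plugin: len(deps_on[plugin]))
print(most_used)  # workflow-step-api
```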

After a few hours, the plugin hell finally froze over, and Jenkins and all its plugins were in a consistent state.

In search of OpenJDK

Jenkins agents must run the same JDK version as the master, or a newer one. This made life a little more complicated, since Debian 7 only supported Java 7, and openSUSE was stuck at Java 6. Previously, I had put OpenJDK 8 in /opt/jdk8 and specified the path to the JDK binary for the Jenkins agent in “Launch Method -> Advanced -> JavaPath”, without exposing this Java to the system or applications via the environment. It worked well.

JDK Jenkins Agent

Unfortunately, JDK 11 builds are not widely available for these operating systems. Ten years ago, we could navigate to the Sun Microsystems (later Oracle) website and download JDK builds for a variety of Unix systems – sometimes behind a simple registration wall, but it was at least possible, although the options were limited.

Nowadays, Oracle doesn’t provide free JDK builds. The stewardship of Java went to OpenJDK, which is the official reference Java implementation. It produces source code, but not distributions. Surely, you can compile it on your own, but then testing, tracking security issues, and ensuring correct platform support are up to you. So plenty of JDK distributions have emerged (alphabetically):

  • AdoptOpenJDK
  • Amazon Corretto
  • Azul Zulu
  • Eclipse Temurin
  • Liberica OpenJDK
  • Microsoft OpenJDK
  • Oracle Java
  • Red Hat OpenJDK
To be precise, Oracle provides support only for the latest JDK version, so there is no such thing as Oracle LTS.

So the plan is to go and grab any OpenJDK 11 distribution, right? Only if it supports an old enough libc. If a distribution is built on a newer build machine, its binaries reference symbols from the latest system glibc and hence cannot run on an older machine. So vendors should either keep a dedicated old build machine, which sets a baseline glibc version, or bump their glibc requirements.

I found a few distributions that were built a long time ago and are still available on the vendors’ websites. For example, for Liberica jdk11.0.27+9:

% nm /opt/jdk11/bin/java | grep '@GLIBC'
  U getenv@@GLIBC_2.2.5
% /lib/x86_64-linux-gnu/libc.so.6
  GNU C Library (Debian EGLIBC 2.13-38+deb7u11) stable release version 2.13, by Roland McGrath et al.

It works because the system glibc version is newer than any symbol referenced from the Java ELF binary. Of course, it is even easier just to launch it and see whether it immediately emits any errors.
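Mechanically, the nm check boils down to extracting every GLIBC_x.y symbol version and comparing the maximum against the system glibc; a small sketch:

```python
# Extract all GLIBC_x.y versioned symbols from `nm` output and return the
# highest version required; the binary runs if the system glibc is at least
# that version (version tuples compare element-wise).
import re

def max_glibc(nm_output: str) -> tuple[int, ...]:
    versions = re.findall(r"@GLIBC_([\d.]+)", nm_output)
    return max(tuple(int(part) for part in v.split(".")) for v in versions)

sample = "U getenv@@GLIBC_2.2.5\nU memcpy@@GLIBC_2.14\n"
print(max_glibc(sample))  # (2, 14)
```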

Sometimes vendors are kind enough to list system requirements in terms of operating system versions (like Debian 9, RedHat 7 and so on). Only a few of them specify the kernel and glibc versions; otherwise, one can consult DistroWatch or check the glibc version on a real system. Finding a matching, working version was a trial-and-error process.

Things became even more complicated for the non-Intel build farm. If you’re thinking of a modern AArch64 ARM – no way, it is a 32-bit ARMv7 hard-float machine. So there is an even smaller chance of finding a binary distribution for this old, obscure architecture. The only distribution providing 32-bit ARMv7 HF builds is Azul Zulu. However, all of their recent binaries target glibc 2.15, and there is only one old build, 11.1.8, targeting the older glibc 2.13, which is able to run on that machine:

Azul ARM7

The fun doesn’t stop with Linux. It turns out an old but stable OS X 10.7 machine cannot run any builds from OpenJDK, Azul, or Corretto. It is not only a matter of newer operating system APIs, but also of the newer kernels these builds expect. All of them crashed with SIGSEGV upon launch. Then I discovered a custom OpenJDK build from the open-source enthusiast Jazzzny, who had kindly built JDK 11 for old OS X. The binaries are available on the GitHub releases page, and the binary jdk-11-snowleopard-r1.pkg (sha256 7c51f13993a7f38575d82aa2a7691aace6170c59074847e779a3d9f903f044fb) built for Snow Leopard worked perfectly for me. Minecraft players used these builds to cope with Java version requirements on old Macs, which proves once again that gaming is the driving force of the computer industry.

Outcome

Ten hours into the upgrade, I had successfully fixed core Jenkins, all the plugins, and the agents, and was able to run a few test builds.

What went well? The diversity of OpenJDK distributions and the openness of the community greatly enhance your chances of finding a proper distribution suited to your needs.

What could have gone better? The Jenkins plugin management system. The older your system is, the less likely you are to be able to upgrade it. The cut-off of plugin compatibility information after a year or two, the lack of built-in downgrade functionality, and the absence of clear machine-readable requirement annotations drag you into a manual dependency-resolution nightmare. Making a copy of the Jenkins installation, upgrading it, and switching over later could be a good solution.

What have I learned? The age of your CI infrastructure is tech debt. Contrary to popular opinion, please do upgrade your Jenkins along the LTS track, and the plugins, at least yearly. It makes the next upgrade much smoother.