CourtBouillon

Authentic people growing open source code with taste

The Python Packaging Hell: The Toolbox (5 / 7)

Python packaging can sometimes be a nightmare. To create, share and install packages, there are a lot of tools, sometimes doing the same thing, but not the same way.

💕💕💕

This article is part of a series of tearful articles about Python packaging:

  1. The Can of Worms
  2. The Roots of Evil
  3. Delusions of Formats
  4. Files Everywhere
  5. The Toolbox
  6. The Expression of Needs (forthcoming)
  7. The Minimal Solution (forthcoming)

Before starting, we would like to send a lot of love to the members of the PyPA team. We complain a lot in this series, but we have a lot of respect for the sisyphean work already done.

That being said, let’s start (again) the whining 😭.

💕💕💕

Tools To Do Everything

Libraries, scripts, executable scripts… Even as simple users, we have to know and try a lot of tools before we can use a Python program. You’ll have to create a virtual environment to install packages. With time, you’ll develop your own habits, which will change as tools and good practices evolve.

We won’t really talk about that. Or just a little bit.

If we focus on package management, there are three distinct steps: installing, creating, publishing to a package repository. Each step can be split into smaller steps, and it would be interesting to understand in detail how it works.

We won’t really talk about that in detail either.

What we’re trying to do is draw a partial, superficial, rough and non-exhaustive picture of what we can use for these three basic steps. It doesn’t sell dreams, but it’s already complex, even though it doesn’t seem to be.

We’ll have to look at some details of these steps, and we’ll have to look at virtual environments. But we’ll try hard not to get lost in these details, otherwise it would be very long. And you’ve got better things to do.

Let’s start with two tables. The first one lists libraries usable by other tools.

| Library | Installation | Creation | Publishing |
|---|---|---|---|
| distutils | Yes | Yes | No |
| setuptools | Yes, based on distutils | Yes, based on distutils | Yes, until version 42 |

The second one lists tools offering commands.

| Tool | Installation | Creation | Publishing |
|---|---|---|---|
| easy_install | Yes, distributed with setuptools | No | No |
| pip | Yes, includes a partial copy of setuptools | Yes, wheels with setuptools and wheel | No |
| wheel | No | Yes, wheels | No |
| twine | No | No | Yes |
| pipenv | Yes, includes a modified copy of pip | No | No |
| pipx | Yes, based on pip | No | No |
| poetry | Yes, based on pip | Yes | Yes |
| flit | Yes, based on pip | Yes | Yes |

These two tables only list the functionalities we’re interested in, but some of these tools do a lot of other things. It’s not useful to stupidly compare how many boxes they tick, especially since other parameters must be taken into consideration, like the quality and maintainability of the code.

Mandatory Related XKCD™
The mandatory XKCD. Taking only a few points of comparison has never been enough. Don’t try this at home.

Moreover, we’re not really going to compare these tools; we’re mostly going to introduce these three functionalities and describe the libraries and tools that can manage them. No need to sulk, it’ll be easier, we promise.

Installing packages

In the beginning, when the concept of packages was introduced in Python in 2000, distutils was in charge of creating and installing packages. As we already said in the previous articles (but we’ll repeat it): this was done with no dependency management and no package repository.

distutils is a library that mainly allows us to write setup.py files, and it has long been the cornerstone of Python packages. By importing distutils, these files become executable and offer two commands: install to install a package, sdist to create one. So we can share archives containing all the code, and install them after decompressing them. In other words: these archives are packages.
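To make "these archives are packages" a bit more concrete, here is a minimal sketch of what such an archive looks like. It is not what sdist actually runs: it only uses the standard tarfile module to pack a toy setup.py and a module into a compressed archive. The make_source_archive and list_archive helpers and the demo project are hypothetical names made up for the illustration, and the setup.py text is only stored as a string (distutils itself has been removed from the standard library in Python 3.12).

```python
from __future__ import annotations

import io
import tarfile
import tempfile
from pathlib import Path

# Toy setup.py in the old distutils style. It is only packed, never executed.
SETUP_PY = (
    "from distutils.core import setup\n"
    "setup(name='demo', version='0.1', py_modules=['demo'])\n"
)


def make_source_archive(name: str, version: str, files: dict[str, str], out_dir: Path) -> Path:
    """Pack files into <name>-<version>.tar.gz, under a <name>-<version>/ folder."""
    base = f"{name}-{version}"
    archive_path = out_dir / f"{base}.tar.gz"
    with tarfile.open(archive_path, "w:gz") as archive:
        for relative_path, content in files.items():
            data = content.encode("utf-8")
            info = tarfile.TarInfo(name=f"{base}/{relative_path}")
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return archive_path


def list_archive(archive_path: Path) -> list[str]:
    """Return the sorted names of the files stored in the archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return sorted(archive.getnames())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        files = {"setup.py": SETUP_PY, "demo.py": "ANSWER = 42\n"}
        path = make_source_archive("demo", "0.1", files, Path(tmp))
        print(path.name, list_archive(path))
```

Installing such a package then boils down to decompressing the archive and running its setup.py with the install command.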

The idea of having a place to store and share these packages came quickly. PyPI went online three years after the birth of distutils; it allows us to distribute and find a lot of Python packages in a public central place.

The setuptools library arrived in 2004 to bring new features, in particular dependency management. For Python packaging, it was a revolution: easy_install is included in the library and allows us to install packages from PyPI using their names.

However, setuptools and easy_install soon showed their limits. Created with no real specification and based on a flawed package format (eggs), they quickly called for a replacement.

That’s what happened to easy_install, replaced 4 years later by pip. pip’s goal is to install Python packages while correctly managing their metadata, which makes it possible, for example, to list and uninstall installed packages.
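Why does metadata matter so much? Because once it is stored on disk next to the installed code, any tool can read it back. The sketch below is not pip’s code: it uses the standard importlib.metadata module (Python 3.8+) to list what is installed in the current environment, a bit like a very poor pip list. The installed_packages function name is our own invention.

```python
from __future__ import annotations

from importlib import metadata


def installed_packages() -> dict[str, str]:
    """Map each installed distribution name to its version, read from its metadata."""
    packages = {}
    for distribution in metadata.distributions():
        name = distribution.metadata.get("Name")
        version = distribution.metadata.get("Version")
        if name and version:  # skip entries with incomplete metadata
            packages[name] = version
    return packages


if __name__ == "__main__":
    for name, version in sorted(installed_packages().items()):
        print(f"{name}=={version}")
```

Uninstalling works on the same principle: the metadata records which files belong to a package, so they can be removed cleanly.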

pip has evolved a lot, and today it’s the reference application for package installation. The tool has been able to adapt to the numerous changes and is now able to handle source packages and wheels. Its thoughtful architecture and its wide use have allowed it to integrate features step by step, and to stay alive for more than 10 years.

This tool is used or included in all recent tools for package installation. If we look at widely used tools like Pipenv, Poetry and pipx, all of them use pip to install packages.

Then, why would we use tools other than pip? Pipenv, Poetry, pipx and others allow, each in their own way, to partition installations. By default, pip installs packages in a central folder, which can be annoying when we have different projects using different versions of the same library.

Pipenv and Poetry offer roughly the same mechanism for the installation: they create a virtual environment for each project, in which they install the dependencies. For this purpose, they act like a thin wrapper around pip and venv, with commands that handle simple cases.
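The "one virtual environment per project" idea can be sketched with the standard venv module alone. This is not how Pipenv or Poetry are implemented: the create_project_environment function and the choice of a .venv folder inside the project are assumptions made for the example, and each tool decides by itself where its environments live.

```python
import tempfile
import venv
from pathlib import Path


def create_project_environment(project_dir: Path) -> Path:
    """Create an isolated environment in <project_dir>/.venv and return its path."""
    env_dir = project_dir / ".venv"
    # with_pip=False keeps the sketch fast and offline; real tools need pip inside.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = create_project_environment(Path(tmp))
        # pyvenv.cfg is the marker file that makes a folder a virtual environment.
        print((env_dir / "pyvenv.cfg").read_text())
```

Two projects with two such folders can then hold two different versions of the same library without stepping on each other’s toes.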

pipx has a different goal: it offers the same interface as pip, but to install executables. It automatically creates a virtual environment for each command and makes this command available to the user. So it’s tailored more to end users than to developers, who are the preferred targets of Pipenv and Poetry.

One last surprising thing about pip: it now creates packages in order to install them. Why? You’ll have to read more…

Creating packages

Picture of packages
Reminder: it’s not because the packaging is nice that the content is bound to please.

Yes, you read correctly: pip creates packages now. With the underlying will to get rid of setuptools and other package formats, pip is slowly becoming a simple wheel installer. When there is a source package, it now tries to transform this source package into a wheel before installing it, rather than using the installer of setuptools.

This system has a lot of pros. First of all, it means that, with time, pip could just be an installer of wheels (which are much simpler to install), paired with a source-to-wheel transformer. This kind of mechanism would greatly simplify the source code of pip, which currently does many other things, like installing packages from sources using setuptools.
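Why are wheels "much simpler to install"? Because a wheel is a zip archive whose content is already laid out the way it will sit once installed. Here is a deliberately naive sketch, under generous assumptions: build_toy_wheel and unpack_wheel are made-up helpers, and the toy archive is not a complete wheel (a real one also carries a RECORD file, and a real installer checks compatibility tags and hashes, creates scripts, and so on).

```python
from __future__ import annotations

import tempfile
import zipfile
from pathlib import Path


def build_toy_wheel(out_dir: Path) -> Path:
    """Write a tiny wheel-shaped zip: one module plus a .dist-info metadata folder."""
    wheel_path = out_dir / "demo-0.1-py3-none-any.whl"
    with zipfile.ZipFile(wheel_path, "w") as wheel:
        wheel.writestr("demo.py", "ANSWER = 42\n")
        wheel.writestr(
            "demo-0.1.dist-info/METADATA",
            "Metadata-Version: 2.1\nName: demo\nVersion: 0.1\n",
        )
        wheel.writestr(
            "demo-0.1.dist-info/WHEEL",
            "Wheel-Version: 1.0\nRoot-Is-Purelib: true\nTag: py3-none-any\n",
        )
    return wheel_path


def unpack_wheel(wheel_path: Path, site_dir: Path) -> list[str]:
    """Naive install: extract every file of the archive into the target folder."""
    with zipfile.ZipFile(wheel_path) as wheel:
        wheel.extractall(site_dir)
        return sorted(wheel.namelist())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        site_dir = Path(tmp) / "site-packages"
        print(unpack_wheel(build_toy_wheel(Path(tmp)), site_dir))
```

No setup.py to execute, no arbitrary code to run at install time: for a pure-Python wheel, the heavy lifting is mostly copying files.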

You should also note that, to create packages, it’s no longer required to use setuptools. Other tools exist to create a source package or a wheel. Which means that, from creation to installation, we start to see the end of the tunnel: it’s possible to use quite simple tools, mostly based on specifications, without dealing with the aging setuptools library.

Poetry and Flit are, on this point, rather close. These two tools are able to create packages without setuptools, and so without a setup.py file. Following PEP 517 and PEP 518, and using the pyproject.toml file as the only source of information, they offer an alternative way to create source packages and classic wheels, installable by pip.

That’s about all Flit does. It also contains stuff to install packages with pip or using symbolic links, which is useful for development. Poetry is more complete: it offers, like Pipenv, the possibility to create virtual environments.

Of course, switching from a setup.py file in Python to a pyproject.toml file limits the possibilities. Despite the effort of these tools to allow a wide flexibility, it’s not possible to do everything we were able to do with setuptools, leaving to this honorable library the responsibility to take care of complex cases, which are mostly the result of sick minds more related to psychiatry than IT.

Publishing packages

In the same way, Poetry and Flit offer the possibility to send packages to PyPI or to compatible servers. This functionality only requires following PyPI’s HTTP APIs, and can seem quite simple.

It hasn’t always been like this. For a long time, setuptools offered a command to send packages, not without trouble. To ensure compatibility between all Python versions, it had to deal with supported TLS protocol versions, security breaches, passwords… And of course, what should have been kept simple quickly became a sad nightmare of indigestible code.

To fix this issue, the Twine project was developed. The only goal of Twine is to send packages to PyPI, and to do it well. As a bonus, it offers the possibility to store the password in the system password manager, rather than in a plain text file (as setuptools does). Another detail: Twine sends files just as they are, generated by the packaging tool. It seems obvious, but you have to know that setuptools recreates the package before publishing it, making testing more difficult.
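"Sending files just as they are" means that publishing starts from files that already exist on disk. The sketch below only builds the command line and doesn’t run it, so nothing is uploaded by accident. The twine_upload_command helper and the choice of the testpypi repository as a default are assumptions made for the example; twine upload and its --repository option are the real commands.

```python
from __future__ import annotations

import sys
import tempfile
from pathlib import Path


def twine_upload_command(dist_dir: Path, repository: str = "testpypi") -> list[str]:
    """Build (without running it) a twine upload command for the files in dist_dir."""
    files = sorted(
        str(path)
        for path in dist_dir.iterdir()
        if path.name.endswith((".whl", ".tar.gz"))  # already-built packages only
    )
    return [sys.executable, "-m", "twine", "upload", "--repository", repository, *files]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dist_dir = Path(tmp)
        (dist_dir / "demo-0.1.tar.gz").touch()
        (dist_dir / "demo-0.1-py3-none-any.whl").touch()
        print(" ".join(twine_upload_command(dist_dir)))
```

The files listed in the command are the very same ones we can install and test locally beforehand, which is the whole point.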

To Sum Up (Well, We Try)

We’ve gone through a long period of dependence on setuptools, its unintelligible configuration files, its authoritative implementation, its aging architecture and its questionable commands. But those days are almost over, and other solutions already exist to create and send packages.

We’re in a period of uncertainty. It’s difficult, almost impossible, to know which tools will be used tomorrow. It’s hard to build packages with an architecture that will stand the test of time. But one thing is clear: we have more freedom and possibilities than ever.

After all, having a lot of tools to create packages isn’t a bad thing, as long as they all create interoperable packages, installable by the same tools. We don’t have the same needs when we create a small pure-Python package as when we create a package containing C code intended for all platforms.

About that, Flit proposes an interesting view:

Make the easy things easy and the hard things possible is an old motto from the Perl community. Flit is entirely focused on the easy things part of that, and leaves the hard things up to other tools.

(When we look for inspiration from the Perl community, everything is possible…)

It depends on what we want to do. What good timing: that’s what we’ll talk about in the next article!