CourtBouillon

Authentic people growing open source code with taste

The Python Packaging Hell: the Roots of Evil (2 / 7)

Python packaging can sometimes be a nightmare. Because of the stupidity of Python maintainers? What if it were more complicated than that?

💕💕💕

This article is part of a series of tearful articles about Python packaging:

  1. The Can of Worms
  2. The Roots of Evil
  3. Delusions of Formats
  4. Files Everywhere (forthcoming)
  5. The Toolbox (forthcoming)
  6. The Expression of Needs (forthcoming)
  7. The Minimal Solution (forthcoming)

Before starting, we would like to send a lot of love to the members of the PyPA team. We complain a lot in this series, but we have a lot of respect for the Sisyphean work already done.

That being said, let’s start (again) the whining 😭.

💕💕💕

It’s Not the Right Technology

Last time, we saw that it’s possible to manage package creation and distribution nicely. We just need to look at what Rust developers do, and it doesn’t seem to be so complicated.

That’s not totally wrong, but let’s be honest, that’s not totally true either. First of all, Rust has indisputable technical advantages over Python: for example, the ability to share compiled libraries and executable files. Since Python is interpreted, we have to think about installing the interpreter and about other related questions (which interpreter, which version…). Tools like Nuitka offer emerging solutions, but they are doomed to live outside the standard library without being widely used.

Beyond technical aspects, the main advantage of Rust is its age. Created in 2010, it’s a young kid compared to the antique Python, born 20 years earlier. Between 1990 and 2010, CVS, Subversion and Git appeared; Netscape, Internet Explorer, Firefox and Chrome too; and so did HTML, CSS and JavaScript. It’s unbelievable, but it’s true, and it puts the relative situations of Python and Rust in perspective.

It’s Not the Right Moment

It’s hard to remember or to imagine what IT was like when Python started to germinate at the end of the 80s, but we can easily understand why package creation and distribution weren’t the top issues to deal with.

Starting screen of Netscape 6
The cute splash screen of Netscape 6, launched in 2000, the same year as distutils.

Of course, Python didn’t include tools to distribute code when it was first released. The Python Packaging Authority (PyPA) maintains a very instructive history of the evolution of packaging, where we learn among other things that:

  • distutils was integrated in the standard library in 2000, in Python 1.6;
  • PyPI was first deployed online in 2003;
  • the PyPA was created in 2011 to manage pip (born in 2008) and virtualenv (born in 2007);
  • Python 3.3 almost integrated the successor of distutils and setuptools, but the project was abandoned because of a lack of investment;
  • a huge number of PEPs about the evolution of those tools have been proposed and accepted.

You’re right: it’s a mess. Whoever has used easy_install knows that the situation back then was extremely painful, and that installing a package required a lot of perseverance, knowledge and, of course, luck. The accumulation of names, tools, and internal and external libraries shows that everyone blindly created partial, risky, even shaky solutions.

As we can hear in this excellent episode of Podcast.__init__, these tools were for a long time created without any specification. The main reason is a lack of time, but also a lack of means: there’s no company behind PyPI or PyPA, unlike npm for example. Development is mostly done by volunteers, who wouldn’t say "no" to a helping hand ✋.

Admittedly, this inventory doesn’t explain everything. Other big changes have been integrated into the language with far more tact, like coroutines recently. That feature required a long and stormy discussion before being integrated in a way that was rather well received. Before these syntax changes were etched into the language, less intrusive proposals were tested, which helped the final solution gain approval.

It’s easy to point the finger at the people who created those tools. It’s easy to scoff at the lack of vision of the Python team, who integrated some doubtful solutions in a rush and left some others outside the standard library. But if we think that distutils was integrated at a time when the top browser was called Internet Explorer 5.5…

And moreover, it’s not too late to change everything, is it?

It’s Not the Right Standard

In a way, unfortunately, yes, it’s too late.

Mandatory Related XKCD™
The mandatory XKCD

The issue is that, since the beginning of the 2000s, packages have been released into the wild, on PyPI, in public and private repositories. These packages are used by a lot of people, and sometimes by obsolete systems (we’re still looking askance at you, Python 2). And Python, which has painfully suffered from rough and incompatible changes (we’re really thinking about you, Python 3), won’t make these loads of packages uninstallable.

Otherwise, pitchforks and guillotines will be out and about.

All those proposals, about files (requirements.txt, setup.py, Pipfile, setup.cfg, pyproject.toml…) or about tools (easy_install, pip, pipenv, poetry, setuptools, distutils…), have just been added one after the other, with no perfect replacement and without getting rid of the flaws of their ancestors.

But of course, not everything is lost. The most important changes all point towards systematic specification, rationalization and simplification. A lot of PEPs are proposed to describe and discuss before coding, so that implementation details don’t prevail over ideas born out of consensus.

But the path is long.

It’s Not the Right Solution

If you think that dependency management has been solved for a long time, and that the people behind pip seriously lack important skills, don’t forget that the name of a Python package is often given in the setup.py file, and can thus theoretically depend on various information like the OS, the presence of external libraries, or even the current time. Listing the libraries needed by a package sometimes requires launching an interpreter, and so the dependency tree can be different for each installation.
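
To make this more concrete, here’s a minimal sketch of the kind of logic a setup.py can contain. Everything in it is made up for the example: the function name compute_install_requires, the package name example-package and the choice of dependencies are assumptions, not something taken from a real project.

```python
# Hypothetical fragment of a setup.py: all names below are illustrative.
import sys


def compute_install_requires(platform=sys.platform, version_info=sys.version_info):
    """Return a dependency list that depends on where the code is executed."""
    requires = ["requests"]
    # Extra dependency on Windows only.
    if platform.startswith("win"):
        requires.append("pywin32")
    # Backport needed only on old interpreters.
    if tuple(version_info[:2]) < (3, 8):
        requires.append("importlib-metadata")
    return requires


# In a real setup.py, the result would then be handed to setuptools:
#     from setuptools import setup
#     setup(
#         name="example-package",
#         version="1.0",
#         install_requires=compute_install_requires(),
#     )
```

The same source archive would thus declare different dependencies on Linux and on Windows, and nobody can know which ones without running the code.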

In practice, solutions have been found to make up for this flexibility, which is sometimes a real nightmare. Fortunately, it’s not mandatory to download and launch all the different versions of a package to determine which ones are required for the installation. But everything is done with new formats and new metadata that are, by definition, not part of older packages.
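
As an illustration of this newer metadata, here’s a small sketch that reads dependencies from a static metadata file, the kind stored in wheels, without executing any code from the package. The file content and the read_requirements helper are invented for the example; the only assumption is that the metadata uses the email-like header format, which the standard email module can parse.

```python
from email import message_from_string

# Hypothetical content of a METADATA file, written by hand for this example.
METADATA = """\
Metadata-Version: 2.1
Name: example-package
Version: 1.0
Requires-Dist: requests
Requires-Dist: pywin32; sys_platform == "win32"
"""


def read_requirements(metadata_text):
    """Return the declared requirements without running any package code."""
    message = message_from_string(metadata_text)
    # A header may appear several times: get_all() returns every value.
    return message.get_all("Requires-Dist") or []


requirements = read_requirements(METADATA)
```

Here the Windows-only dependency is described as data, with a condition that an installer can evaluate by itself, instead of being hidden in an if statement of a script.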

In concrete terms, managing obsolete installation tools will be required for a long and painful time. The wild rhythm of new versions of setuptools and pip gives an idea of the constant improvements brewing without us realizing; but we’re not ready to discard the fiendish history we’ll have to cart around for a long time.

However, for packagers, the situation is getting better. The topic isn’t really appealing and it’s hard to find reference documentation (even from PyPA, especially from PyPA). It’s a pity, because it’s becoming almost easy and neat to create Python packages.

With no setup.py, for example.

You’d like to learn more? There are five articles left to take a look at existing tools and to precisely define what we want to do, before finally creating our own beautiful package.

To be continued…