Add Python toolchain post, update index

Rob Watson 2020-06-13 00:28:00 +02:00
parent 5587a86494
commit 30e5d9acf7
2 changed files with 44 additions and 0 deletions


@@ -1,4 +1,5 @@
---
type: post
title: Posts
url: /
---


@@ -0,0 +1,43 @@
---
title: "Diving into the Python toolchain"
date: 2020-06-13T00:12:15+02:00
tags: ["python", "audio", "hacking"]
draft: false
---
I've been inspired to build some audio software for a while and have been hacking a lot with WebAssembly, because the potential for doing exciting or even incredible things in the browser with it is, in principle at least, very clear (and it's also a great excuse to play with Rust and/or Emscripten). But I've also found myself wanting to build software for which I can't find the supporting libraries in Rust or even C++. This is especially the case for audio analysis, a field in which many of the most interesting libraries are written in Python - a language which doesn't compile to wasm and seems unlikely ever to do so.
But inspired by the cool [audino](https://github.com/midas-research/audino/) project, I decided to consider embracing Python on the backend - after all, in these days of containers and Kubernetes, it's trivial to deploy a backend in whatever language is the most appropriate one, even if it means we have to jump out of the browser's JavaScript environment to actually do the work.
So off I go to play with the first interesting-looking library that I find, which is [librosa](https://librosa.github.io/librosa/).
Getting an isolated environment for an application up-and-running in Python doesn't seem as easy as it could be. First I try [pipenv](https://github.com/pypa/pipenv), which seems popular and on the surface promises to do exactly what I want: manage a simple bundle of isolated dependencies for a small throwaway project. But sadly it also seems inexplicably slow to use - spending minutes stuck on a _Locking_ message when trying to install a couple of simple dependencies, for example. A little web searching [suggests](https://github.com/pypa/pipenv/issues/1914) that it's possible to avoid these delays by passing `--skip-lock` to the `pipenv install` command, but I don't have any idea what this implies exactly, and it's not an interface that inspires confidence.
So next I spend half an hour experimenting with [Anaconda](https://docs.conda.io), which also looks promising on the surface but has unsettling aspects too - not least its threat of installing 3GB of scientific packages as a default, while also claiming to be able to create virtual environments and be a package repository for several languages(!). There is a [minimal variant](https://docs.conda.io/en/latest/miniconda.html) which claims to avoid the worst of this behaviour, but even this raises some questions, like how it interacts with the Python binary that I installed using [asdf](https://github.com/asdf-vm/asdf) earlier today. So sighing slightly I make a note to return to this tool and investigate further later.
Then I try the Python stdlib `venv` tool, which does seem lightweight and efficient but also seems to do a bit more than I want, including adding Python binaries and various directories to the source tree. Other languages seem to have this nailed, and have made it trivial to set up a bundle of dependencies for a given project with a couple of simple commands. Hopefully I'm missing something obvious.
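For what it's worth, a virtual environment doesn't have to live inside the source tree: `venv` will create one at any path it is given. The following is a minimal sketch using the stdlib `venv` module directly; the `create_env` helper and the `audio-hacking` directory name are made up for illustration, and a temporary directory stands in for wherever the environment would really be kept.

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir: Path) -> Path:
    """Create a virtual environment at env_dir and return the path to its config file."""
    # with_pip=False keeps creation fast and offline; pass True to bootstrap pip as well.
    builder = venv.EnvBuilder(with_pip=False, clear=True)
    builder.create(env_dir)
    return env_dir / "pyvenv.cfg"


if __name__ == "__main__":
    # Hypothetical location: any directory outside the project would do.
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_env(Path(tmp) / "audio-hacking")
        print(cfg.read_text())
```

The binaries and `lib` directories still get created, but they end up wherever `env_dir` points rather than next to the source files.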
So I end up deciding to leave the dependency management fun for another day, and fall back to installing my libraries globally instead. `librosa` installed perfectly fine like this, and I swiped a simple example script from their tutorials:
```python
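import librosa

# Path to the example audio clip that ships with librosa
fname = librosa.util.example_audio_file()
# Decode to a mono float array; librosa resamples to 22050 Hz by default
pcm, sr = librosa.load(fname)
print("Got PCM data {} and samplerate {}".format(pcm, sr))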
```
but when I run it, an unpleasant backtrace is spat out ending with `ModuleNotFoundError: No module named 'numba.decorators'`. The plot thickens once again. I spend some time getting confused about the librosa [installation instructions](https://librosa.github.io/librosa/install.html) which mention the `numba` library specifically in a way that suggests to me a possible linking or build issue, but a few attempts to uninstall and reinstall the two libraries in differing orders fail to make a difference. Eventually I discover a mention of the `numba` API being updated so that `numba.decorators` changes to `numba.core.decorators` which would neatly explain the error, so I check the librosa GitHub repo and find a mention of `numba` at version `0.43.0`.[^1]
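A quicker way to narrow down this kind of error might be to ask Python directly which module paths exist and which versions are installed. This is only a sketch using the stdlib `importlib` machinery (Python 3.8+); the helper names `module_available` and `installed_version` are my own, not part of any library.

```python
import importlib.util
from importlib import metadata


def module_available(name: str) -> bool:
    """Return True if the dotted module name can be located."""
    try:
        # Note: for a dotted name this imports the parent package as a side effect.
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when the parent package is missing or itself fails to import.
        return False


def installed_version(distribution: str):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for mod in ("numba.decorators", "numba.core.decorators"):
        print(mod, module_available(mod))
    for dist in ("numba", "librosa"):
        print(dist, installed_version(dist))
```

If `numba.decorators` reports `False` while `numba.core.decorators` reports `True`, the installed `numba` is newer than the API that the calling code expects.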
This sounds like a good path to follow but in turn sparks a battle with `pip` to install a specific version of a library. Apparently, despite the version [existing on the PyPI website](https://pypi.org/project/numba/0.43.0/), this isn't enough for pip to be able to install it. This, however, is quickly forgiven when I discover that pip does allow an arbitrary archive URL to be passed to it (so `pip install https://github.com/numba/numba/archive/0.43.0.tar.gz` works exactly as you'd expect). This is a nice touch and something that other dependency management tools would do well to support.
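For reference, the two forms of requirement mentioned here can be sketched as below. The `pip_install_command` helper is hypothetical and only builds the command line without running it; `name==version` is the standard way to ask pip for an exact release, though whether a given version actually resolves and builds depends on the interpreter and platform.

```python
import shlex
import sys


def pip_install_command(requirement: str) -> list:
    """Build a pip invocation tied to the interpreter running this script."""
    # Using "python -m pip" makes sure pip installs into this interpreter's environment.
    return [sys.executable, "-m", "pip", "install", requirement]


if __name__ == "__main__":
    # An exact version specifier, resolved against the package index.
    print(shlex.join(pip_install_command("numba==0.43.0")))
    # An archive URL, fetched and built from source.
    print(shlex.join(pip_install_command(
        "https://github.com/numba/numba/archive/0.43.0.tar.gz")))
```

Passing the resulting list to `subprocess.run(..., check=True)` would perform the install.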
Now, I try to run my Python script again and it appears to find `numba.decorators` successfully! However, it then immediately falls over with a new and even less comprehensible error: `numba/_dynfunc.cpython-38-x86_64-linux-gnu.so: undefined symbol: _PyObject_GC_UNTRACK`. At this point it is almost time to give up and collapse into bed, but something that I've forgotten gives me the inspiration to bump `numba` up a couple of versions - this time to `0.48.0`. I re-run the script and this time see the magic output:
```
Got PCM data [0. 0. 0. ... 0. 0. 0.] and samplerate 22050
```
Most of an evening has been spent on this and it's taken me back to the days of ad-hoc dependency management in Ruby's pre-Bundler era. But I've learned a lot about the Python toolchain in the process, and am now ready to start some audio hacking.
[^1]: The issue was already fixed in [this PR](https://github.com/librosa/librosa/pull/1107).