Make your research reproducible effortless.
This is an interface to pyenv and pdm to allow fully-isolated python environments specifically designed with scientific research and re-usability.
You will get projects that are:
- installable by newbies with a single command
- installable via
pip git+https://etc.etc.
(thus indipendently from the package manager) - completely relocatable (including the python binary)
- easy to maintain even with complex dependencies
- accelerated by the INTEL python libraries (as anaconda)
- People interested in your research may not be able to code
- Young users often don't know how to set up the dependencies to run your code
- You want to use specific versions that are not available on the platform you're using (e.g. a server, Google Colab, Paperspace Gradient, etc.)
- You want your specific setupalways ready at each Colab/Gradient startup
- You want that other researchers can easily use your code
./install
: install Python 3.9.16 in a subdirectory of your project./dust add numpy scipy ipython
: install numpy, scipy, and ipython as dependencies from the INTEL repositories on anaconda./dust shell
: run a shell with the python version set up./dust exec vscode
: run a command (nvim
) with the python version set up- another user
./install
: install Python 3.8.13, numpy, scipy, and ipython all in your project directory - another user
pip install git+https://your-repo-here.com/name
: install this package in its dependencies ./dust pdm publish
: publish to PyPI
It should work on any Unix-like system (Linux, Mac, etc.)
- Option 1 (only Github): select the
use this template
above - Option 2: clone and deinit (or set your custom git url)
git clone https://github.com/00sapo/pyenv <your-dir> rm -rf <your-dir>/.git*
Now, let us install Python inside the pyenv
directory. Simply run:
./install.sh 3.8.13
This will install python 3.8.13. Any version available from pyenv can be used. Use
./install.sh list
to list them out.
During the installation, pdm
will guide you through the setup of the project
metadata.
Moreover, to allow users to install the package, we will answer "yes" when
pdm
ask us if we want to publish the project as a pip package. You if you
answer "no", you can still manually add your project name and version in the
pyproject.toml
file.
After the installation, a check is performed. It should say "OK!".
Now, let's add some dependency:
./dust add numpy
Finally, execute some python code:
./dust -c "import numpy as np; print(np.random.rand())"
It should print a random number.
In general, ./dust
replaces your python
command. However, when the arguments are
add
, remove
, update
, or pdm
, it behaves differently and resorts to use pdm
.
Other special arguments are exec
, that is used to launch external commands with the
python version set-up, and shell
, which spawns a shell with your python environment
set up.
Assume that now you distribute your code and want that other researchers use it. You can just create a readme saying the following:
- Install Linux, WSL, or Mac OS
- Simply run:
./install.sh
from the repo root directory# with pip pip install git+ssh://git@github.com/<user>/<project>.git
The ./install.sh
will install all the dependencies needed, including the
correct python version.
The same will work when moving your code to Google Colab or Paperspace Gradient, that offer you some persistent directory but don't allow you to edit your home file persistently (they will erase them when your environment is killed).
Your interface to the Python installation is ./dust
. Internally, dust
setups pyenv
and runs pyenv exec python $@
. In POSIX-like shells, $@
means
"all the received arguments". So, you can give any command to python
.
python-dust
uses pdm
for interacting with the pyproject.toml
file.
pyproject.toml
is a new standard that allows to maintain project more easily.
pip
supports pyprojects.toml
when installing packages, but it doesn't help you to
edit it. pdm
, instead, allows you to automatically set the version contraints on your
dependencies, easing the work for you and for other developers that want to use your
code.
For this reason, there are 4 special commands to interact with pdm
:
./dust add
: this adds packages topyproject.toml
and accepts all options aspdm add
./dust remove
: this removes packages topyproject.toml
and behaves likepdm remove
./dust update
: this updates the dependencies aspdm update
./dust pdm
: this allows to interact with pdm and give it any command
Note that the above 3 commands add
, remove
, and update
will only change the
pyproject.toml
file using pdm. However, dependencies are then exported to
requirements.txt
file and installed using pip
. This is done to avoid the overhead of
virtualenvs, that are not so much useful now, since we are using a fully isolated python
installation.
As a consequence, you should never use ./dust pdm sync
, because it would install
dependencies that will never be used.
You can also use pip
directly, but it is discouraged. The reason is that if you use
pip
, we can easily update the requirements.txt
file, but not the pyproject.toml
file. Specifically, the pyproject.toml
file will become a long list of dependencies
(including the dependences of your dependencies) that is basically meaningless and is
the exact reason why people use pdm
, poetry
, and similar.
To use pip directly, just run something like:
./dust -m pip install numpy
This may come useful for you if you want to provide special arguments for a specific
package, e.g. ./dust -m pip install --force-reinstall git+https://your-get-repo.com/@commithash
.
Imagine you have edited the pyproject.toml
file by hand. Now you need to install the
new dependencies. You can use ./dust sync
for this.
Note that ./dust sync
is different from pdm sync
, because pdm
tries to install
packages in a virtualenv or in a __pypackages__
directory, while dust
install them
in the root of the python installation via pip
. Specifically, ./dust sync
will
export the pyproject.toml
to requirements.txt
and then it will install it via
pip
.
You can give any option that will then be used when installing via pip
, e.g. ./dust sync --force-reinstall
will reinstall all the dependencies.
You can also run any command in the pyenv
by using the special command exec
.
Using it, the arguments will be executed exactly like they are but inside a
shell where the project's pyenv
is correctly set-up. For instance, at a
certain point of the development, you may want to have two python versions
installed and switch from one to the other:
# install a new python version
./dust exec pyenv install 3.9.1
# switch to the new python version
./dust exec pyenv local 3.9.1
# install dependencies for this new verions (in pdm mode)
./dust update
# switch back to the previous one
./dust exec pyenv local 3.8.3
./dust update
To spawn a shell, you should use the provided command: ./dust shell
, which spawns a new
instance of your current shell. Note, that if you install packages inside a spawn shell
without using dust
, the requirements.txt
won't be automatically updated soon, but
only when you run the exit
command.
Only fish and bash supported for now.
As in the example above, you could spawn a shell which uses your project's python installation:
./dust exec bash
However, if your shell has some init file that conflicts with pyenv
(e.g. if you have
pyenv
installed in your system like me), the above method doesn't work to spawn a
shell.
Sometimes it is useful to just install a package, without resolution dependency.
In pdm
, read the related documentation.
In short, you should put something like this in your pyproject.toml
file.
[tool.pdm.overrides]
asgiref = "3.2.10" # exact version
urllib3 = ">=1.26.2" # version range
pytz = "file:///${PROJECT_ROOT}/pytz-2020.9-py3-none-any.whl" # absolute URL
fairseq = "git+https://github.com/pytorch/fairseq@336942734c85791a90baa373c212d27e7c722662" # git url
You don't need virtualenvs at all, because all the python installation is relative
to this directory. However, to keep compatibility with other softwares, python-dust
creates a symbolic link to the internal pyenv
virtualenv.
To uninstall all the directories related to your project's Python installation, run:
sh ./pyenv/uninstall.sh
This script is not tested and is not ensured that it really deletes all the
directories created by ./install.sh
.
If your dependencies include software not installable via pip, python-dust
offers the option of running shell scripts before or after the installation
process. Specifically, you should add scripts to the folders
./dustenv/pre_install.d/
and ./dustenv/post_install.d/
. The scripts will be
executed in alphabetical order, so you can name them like
00_myscript.sh
.
Arguments | Effect |
---|---|
No arguments | Install python version from .python-version file or the default one (3.9.16), and support for scientific Intel packages from Anaconda servers (numpy, scipy, numba) |
list |
List available versions from pyenv |
<any-version> |
Install version <any-version> , e.g. 3.8.9 |
Arguments | Effect |
---|---|
add <package> |
add a package (with possible version contraint, see pdm docs) |
remove <package> |
remove a package |
update |
update dependencies (see pdm docs) |
pdm <command> |
execute any pdm command |
sync |
export pyproject.toml into a requirements.txt and then install using pip |
exec <command> |
exec any command using the virtualenv set up by pyenv |
shell |
return a shell (bash or fish , depending on your current one) inside the virtualenv |
setup_intel |
set up the INTEL mirrors from anaconda so that |
numpy, scipy, and numba benefit from accelerated libraries | |
anything else | passed as argument to python , e.g. ./dust -m http.server 8000 or ./dust myscript.py |
In general, you can point your editor to the .venv
in your project directory.
Otherwise, ./dust exec <editor-command>
should always work out-of-the box.
- While
pip
is improving and is now able to do backtracking, it doesn't ensure a lock file (requirements.txt) always up-to-date, nor the ability to convert your code to a standalone pypi module uploadable to pypi.org effortless, nor the ability to automatically compute version constraints for the dependencies. - Conda is slow, not standard, and convicts your users to use conda as well (lock-in)
- poetry and Pipenv are slow and are not fully standard
- pip-tools reinvent the standard
To test this project for development purpose, use the branch develop
and test
using the command ./dustenv/test-dust.sh ./install.sh
Federico Simonetta