Conda project #363

Draft
wants to merge 18 commits into
base: master

AlbertDeFusco
Collaborator

@AlbertDeFusco AlbertDeFusco commented Feb 5, 2022

  • README update for future of this tool (rip and replace as Conda API is available and conda-lock goes 1.0)
  • conda-* cli entry points (initialize, add/remove-*, dockerize, run)
  • Getting started and other docs converted to new CLI (update style to match conda-docs)
  • submission for forking to conda-incubator
  • CEP for locking (implicit?) to provide a standard in the community as a dependency for project and other things (co-write with conda-lock authors)
  • merge and rebase against Issue 337: Added pkg_key to ensure 'packages' == 'dependencies' #359
  • fix add/remove packages behavior (it will now create a separate project.yml file)
  • ensure backward compatibility using the legacy anaconda-project cli or a conversion step
  • motivating use cases (alternative to pyenv+poetry)
  • commands: {<name>: {command: <str>}} with unix-command/win-command when needed
  • drop supports_http_options in favor of explicit jinja
  • drop notebook, bokeh commands in favor of docs with jinja examples

This PR combines #284 and #275 along with rebasing against latest commits to rename anaconda-project to conda-project with no other change in the commands (at this time).

conda install -c defusco/label/dev conda-project

The default project file is now project.yml, but may also be called conda-project.yml or anaconda-project.yml.

The largest change is that environment.yml or requirements.txt files can be used directly, without the need to create a project.yml (nor will the file be created for you).
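The file discovery described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the helper name `find_project_spec` and the search order among the project file names are assumptions. The file names themselves, and the rule that requirements.txt is only used when no environment.yml exists, are taken from this description.

```python
from pathlib import Path
from typing import Optional, Tuple

# File names come from the PR description; the search ORDER is an assumption.
PROJECT_FILES = ("project.yml", "conda-project.yml", "anaconda-project.yml")
ENV_FILES = ("environment.yml", "environment.yaml")


def find_project_spec(project_dir: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (project_file, env_source) found in project_dir.

    Either element may be None. Hypothetical helper, for illustration only.
    """
    root = Path(project_dir)
    project_file = next((n for n in PROJECT_FILES if (root / n).is_file()), None)
    env_source = next((n for n in ENV_FILES if (root / n).is_file()), None)
    # requirements.txt is only considered when no environment file exists.
    if env_source is None and (root / "requirements.txt").is_file():
        env_source = "requirements.txt"
    return project_file, env_source
```

A directory holding only requirements.txt would yield `(None, "requirements.txt")`, meaning no project file is needed to prepare or run.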

Enabled use cases

  • conda-project prepare
  • conda-project run <executable> [<arg1>, <arg2>, ...]

Note: The run command can execute any executable in the environment and pass arguments to it. Commands need not be specified in a project.yml file to be runnable.
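One way to picture what run does is: put the environment's executable directories at the front of PATH, resolve the executable there, and pass the remaining arguments through. The sketch below is an assumption about the general approach, not the implementation in this PR. The names `env_path_entries` and `run_in_env` are hypothetical, and the Windows directory list is a simplified subset.

```python
import os
import shutil
import subprocess
import sys
from pathlib import Path
from typing import Dict, List, Optional


def env_path_entries(prefix: str, platform: str = sys.platform) -> List[str]:
    """Directories of a conda environment that hold executables (simplified)."""
    root = Path(prefix)
    if platform.startswith("win"):
        return [str(root), str(root / "Scripts"), str(root / "Library" / "bin")]
    return [str(root / "bin")]


def run_in_env(prefix: str, argv: List[str],
               base_env: Optional[Dict[str, str]] = None) -> int:
    """Run argv with the environment's directories first on PATH."""
    env = dict(os.environ if base_env is None else base_env)
    search_path = os.pathsep.join(env_path_entries(prefix) + [env.get("PATH", "")])
    env["PATH"] = search_path
    # Resolve the executable against the modified PATH, not the parent's PATH.
    executable = shutil.which(argv[0], path=search_path)
    if executable is None:
        raise FileNotFoundError(f"{argv[0]!r} not found in environment {prefix}")
    return subprocess.call([executable, *argv[1:]], env=env)
```

Under these assumptions, `run_in_env("envs/envYaml", ["python", "--version"])` would correspond to `conda-project run python --version`.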

In the two use cases shown below there is no project.yml file, and it will not be created by the commands shown. In both cases you can use conda-project prepare to create the environment and install the required packages from either environment.yml or requirements.txt.

environment.yml

Here's a typical environment specification file.

name: envYaml

channels:
  - defaults
  - conda-forge

dependencies:
  - python=3.8
  - tranquilizer>=0.7
  - pip:
    - requests

Create the environment within the envs directory of the project. The name of the env_spec will match the name: key in the environment.yml.

> conda-project prepare
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: /Users/adefusco/Desktop/envYaml/envs/envYaml

  added / updated specs:
    - python=3.8
    - tranquilizer[version='>=0.7']


The following NEW packages will be INSTALLED:

  aniso8601          pkgs/main/noarch::aniso8601-9.0.1-pyhd3eb1b0_0
  attrs              pkgs/main/noarch::attrs-21.4.0-pyhd3eb1b0_0
  ca-certificates    pkgs/main/osx-arm64::ca-certificates-2021.10.26-hca03da5_2
  certifi            pkgs/main/osx-arm64::certifi-2021.10.8-py38hca03da5_2
  click              pkgs/main/noarch::click-8.0.3-pyhd3eb1b0_0
  flask              pkgs/main/noarch::flask-1.1.2-pyhd3eb1b0_0
  flask-cors         pkgs/main/noarch::flask-cors-3.0.10-pyhd3eb1b0_0
  flask-jwt-extended conda-forge/noarch::flask-jwt-extended-4.3.1-pyhd8ed1ab_0
  flask-restx        pkgs/main/noarch::flask-restx-0.5.1-pyhd3eb1b0_0
  importlib-metadata pkgs/main/osx-arm64::importlib-metadata-4.8.2-py38hca03da5_0
  importlib_metadata pkgs/main/noarch::importlib_metadata-4.8.2-hd3eb1b0_0
  itsdangerous       pkgs/main/noarch::itsdangerous-2.0.1-pyhd3eb1b0_0
  jinja2             pkgs/main/noarch::jinja2-3.0.2-pyhd3eb1b0_0
  jsonschema         pkgs/main/noarch::jsonschema-3.2.0-pyhd3eb1b0_2
  libcxx             pkgs/main/osx-arm64::libcxx-12.0.0-hf6beb65_1
  libffi             pkgs/main/osx-arm64::libffi-3.4.2-hc377ac9_2
  markupsafe         pkgs/main/osx-arm64::markupsafe-2.0.1-py38h1a28f6b_0
  ncurses            pkgs/main/osx-arm64::ncurses-6.3-h1a28f6b_2
  openssl            pkgs/main/osx-arm64::openssl-1.1.1m-h1a28f6b_0
  pip                pkgs/main/osx-arm64::pip-21.2.4-py38hca03da5_0
  pyjwt              pkgs/main/osx-arm64::pyjwt-2.1.0-py38hca03da5_0
  pyrsistent         pkgs/main/osx-arm64::pyrsistent-0.18.0-py38h1a28f6b_0
  python             pkgs/main/osx-arm64::python-3.8.11-hbdb9e5c_5
  python-dateutil    pkgs/main/noarch::python-dateutil-2.8.2-pyhd3eb1b0_0
  pytz               pkgs/main/noarch::pytz-2021.3-pyhd3eb1b0_0
  readline           pkgs/main/osx-arm64::readline-8.1.2-h1a28f6b_1
  setuptools         pkgs/main/osx-arm64::setuptools-58.0.4-py38hca03da5_1
  six                pkgs/main/noarch::six-1.16.0-pyhd3eb1b0_0
  sqlite             pkgs/main/osx-arm64::sqlite-3.37.2-h1058600_0
  tk                 pkgs/main/osx-arm64::tk-8.6.11-hb8d0fd4_0
  tranquilizer       conda-forge/noarch::tranquilizer-0.7.0-pyhd8ed1ab_0
  werkzeug           pkgs/main/noarch::werkzeug-1.0.1-pyhd3eb1b0_0
  wheel              pkgs/main/noarch::wheel-0.37.1-pyhd3eb1b0_0
  xz                 pkgs/main/osx-arm64::xz-5.2.5-h1a28f6b_0
  zipp               pkgs/main/noarch::zipp-3.7.0-pyhd3eb1b0_0
  zlib               pkgs/main/osx-arm64::zlib-1.2.11-h5a0b063_4


Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
#
# To activate this environment, use
#
#     $ conda activate /Users/adefusco/Desktop/envYaml/envs/envYaml
#
# To deactivate an active environment, use
#
#     $ conda deactivate

Collecting requests
  Using cached requests-2.27.1-py2.py3-none-any.whl (63 kB)
Requirement already satisfied: certifi>=2017.4.17 in ./envs/envYaml/lib/python3.8/site-packages (from requests) (2021.10.8)
Collecting urllib3<1.27,>=1.21.1
  Using cached urllib3-1.26.8-py2.py3-none-any.whl (138 kB)
Collecting idna<4,>=2.5
  Using cached idna-3.3-py3-none-any.whl (61 kB)
Collecting charset-normalizer~=2.0.0
  Downloading charset_normalizer-2.0.12-py3-none-any.whl (39 kB)
Installing collected packages: urllib3, idna, charset-normalizer, requests
Successfully installed charset-normalizer-2.0.12 idna-3.3 requests-2.27.1 urllib3-1.26.8
The project is ready to run commands.
Use `anaconda-project list-commands` to see what's available.
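The environment location in the output above (envs/envYaml) follows from the name: key. A minimal sketch of that mapping is shown here. The function names are hypothetical, the fallback to "default" when name: is missing is taken from the commit notes in this PR, and the line scan is a deliberate simplification that stands in for a real YAML parser (it ignores comments and nested keys).

```python
import re
from pathlib import Path


def env_spec_name(environment_yml_text: str) -> str:
    """Return the top-level name: value, or "default" when it is absent."""
    for line in environment_yml_text.splitlines():
        match = re.match(r"^name:\s*(\S.*?)\s*$", line)
        if match:
            return match.group(1).strip("'\"")
    return "default"


def env_prefix(project_dir: str, environment_yml_text: str) -> Path:
    """Environments live under <project_dir>/envs/<env spec name>."""
    return Path(project_dir) / "envs" / env_spec_name(environment_yml_text)
```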

Now we can run a command available in the PATH for the conda environment.

> conda-project run python --version
Python 3.8.11

>  conda-project run tranquilizer cheese_shop.py --port 8080
 * Serving Flask app "tranquilizer.application" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://0.0.0.0:8080/ (Press CTRL+C to quit) 

Finally, after adding packages to the environment.yml file, they can be installed by running prepare again. Note that prepare will not remove packages; use the --refresh flag to completely rebuild the env.

# add numpy to the dependencies section using an editor

> conda-project prepare
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: /Users/adefusco/Desktop/envYaml/envs/envYaml

  added / updated specs:
    - numpy


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    numpy-1.21.2               |   py38hb38b75b_0           9 KB
    numpy-base-1.21.2          |   py38h6269429_0         4.2 MB
    ------------------------------------------------------------
                                           Total:         4.2 MB

The following NEW packages will be INSTALLED:

  blas               conda-forge/osx-arm64::blas-2.113-openblas
  blas-devel         conda-forge/osx-arm64::blas-devel-3.9.0-13_osxarm64_openblas
  libblas            conda-forge/osx-arm64::libblas-3.9.0-13_osxarm64_openblas
  libcblas           conda-forge/osx-arm64::libcblas-3.9.0-13_osxarm64_openblas
  libgfortran        pkgs/main/osx-arm64::libgfortran-5.0.0-11_1_0_h6a59814_26
  libgfortran5       pkgs/main/osx-arm64::libgfortran5-11.1.0-h6a59814_26
  liblapack          conda-forge/osx-arm64::liblapack-3.9.0-13_osxarm64_openblas
  liblapacke         conda-forge/osx-arm64::liblapacke-3.9.0-13_osxarm64_openblas
  libopenblas        conda-forge/osx-arm64::libopenblas-0.3.18-openmp_h5dd58f0_0
  llvm-openmp        pkgs/main/osx-arm64::llvm-openmp-12.0.0-haf9daa7_1
  numpy              pkgs/main/osx-arm64::numpy-1.21.2-py38hb38b75b_0
  numpy-base         pkgs/main/osx-arm64::numpy-base-1.21.2-py38h6269429_0
  openblas           conda-forge/osx-arm64::openblas-0.3.18-openmp_h3b88efd_0



Downloading and Extracting Packages
numpy-base-1.21.2    | 4.2 MB    | ########## | 100% 
numpy-1.21.2         | 9 KB      | ########## | 100% 
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
The project is ready to run commands.
Use `anaconda-project list-commands` to see what's available.

Locking dependencies

To write the lock file project-lock.yml, which holds fully specified cross-platform environments, run conda-project lock:

> conda-project lock
Updating locked dependencies for env spec envYaml...
Resolving conda packages for osx-arm64
Resolving conda packages for linux-64
Resolving conda packages for osx-64
Resolving conda packages for win-64
Changes to locked dependencies for envYaml:
  platforms:
+   linux-64
+   osx-64
+   osx-arm64
+   win-64

The lock file also includes all pip packages (i.e., the output of pip freeze).

> tail project-lock.yml
      - setuptools=58.0.4=py38haa95532_0
      - sqlite=3.37.2=h2bbff1b_0
      - vc=14.2=h21ff451_1
      - vs2015_runtime=14.27.29016=h5e58377_2
      - wincertstore=0.2=py38haa95532_2
      pip:
      - charset-normalizer==2.0.12
      - idna==3.3
      - requests==2.27.1
      - urllib3==1.26.8

If the environment.yml file differs from its version when the project-lock.yml file was created then conda-project commands will print a warning.

> conda-project run python --version
Potential issues with this project:
  * project-lock.yml: Env spec 'envYaml' has changed since the lock file was last updated (env spec hash has changed from a8520e7c59d22b0f271b4098e0041ca14a7404a6 to 7aa631535b843f3889a7fcec42886043d5c8d12a)
  * project-lock.yml: Lock file is missing 1 packages for env spec envYaml on linux-64 (pandas)
  * project-lock.yml: Lock file is missing 1 packages for env spec envYaml on osx-64 (pandas)
  * project-lock.yml: Lock file is missing 1 packages for env spec envYaml on osx-arm64 (pandas)
  * project-lock.yml: Lock file is missing 1 packages for env spec envYaml on win-64 (pandas)
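The 40-character hashes in the warning suggest a SHA-1 style digest of the env spec. The exact input the tool hashes is not shown in this PR, so the sketch below is only an assumption about how such staleness detection can work: hash a canonical serialization of the spec and compare it with the hash stored at lock time. Both function names are hypothetical.

```python
import hashlib
import json
from typing import Iterable


def env_spec_hash(channels: Iterable[str], dependencies: Iterable[str],
                  pip_packages: Iterable[str] = ()) -> str:
    """SHA-1 over a canonical JSON form of the env spec (assumed scheme)."""
    payload = json.dumps(
        {
            "channels": list(channels),
            "dependencies": list(dependencies),
            "pip": list(pip_packages),
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def lock_is_stale(locked_hash: str, channels: Iterable[str],
                  dependencies: Iterable[str],
                  pip_packages: Iterable[str] = ()) -> bool:
    """True when the current spec no longer matches the hash in the lock file."""
    return locked_hash != env_spec_hash(channels, dependencies, pip_packages)
```

Adding pandas to the dependencies changes the digest, which is the condition reported in the first warning line above.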

To remove the warning, run update, which will re-lock the packages and install the missing packages.

> conda-project update
Updating locked dependencies for env spec envYaml...
Resolving conda packages for osx-arm64
Resolving conda packages for linux-64
Resolving conda packages for osx-64
Resolving conda packages for win-64
Changes to locked dependencies for envYaml:
  packages:
    all:
+     packaging=21.3=pyhd3eb1b0_0
+     pyparsing=3.0.4=pyhd3eb1b0_0
    linux-64:
+     bottleneck=1.3.2=py38heb32a55_1
+     numexpr=2.8.1=py38h6abb31d_0
+     pandas=1.3.5=py38h8c16a72_0
    osx-64:
+     bottleneck=1.3.2=py38hf1fa96c_1
+     numexpr=2.8.1=py38h2e5f0a9_0
+     pandas=1.3.5=py38h743cdd8_0
    osx-arm64:
+     bottleneck=1.3.2=py38heec5a64_1
+     numexpr=2.8.1=py38h144ceef_0
+     pandas=1.3.5=py38h9197a36_0
    win-64:
+     bottleneck=1.3.2=py38h2a96729_1
+     numexpr=2.8.1=py38hb80d3ca_0
+     pandas=1.3.5=py38h6214cd6_0
-   pip:
-     charset-normalizer==2.0.12
-     idna==3.3
-     requests==2.27.1
-     urllib3==1.26.8
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: /Users/adefusco/Desktop/envYaml/envs/envYaml

  added / updated specs:
    - bottleneck==1.3.2=py38heec5a64_1
    - numexpr==2.8.1=py38h144ceef_0
    - packaging==21.3=pyhd3eb1b0_0
    - pandas==1.3.5=py38h9197a36_0
    - pyparsing==3.0.4=pyhd3eb1b0_0


The following NEW packages will be INSTALLED:

  bottleneck         pkgs/main/osx-arm64::bottleneck-1.3.2-py38heec5a64_1
  numexpr            pkgs/main/osx-arm64::numexpr-2.8.1-py38h144ceef_0
  packaging          pkgs/main/noarch::packaging-21.3-pyhd3eb1b0_0
  pandas             pkgs/main/osx-arm64::pandas-1.3.5-py38h9197a36_0
  pyparsing          pkgs/main/noarch::pyparsing-3.0.4-pyhd3eb1b0_0


Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
Updated locked dependencies for env spec envYaml in project-lock.yml.
Update complete.

environment.yml and project.yml

A project.yml file can extend the environment.yml file with project-specific features such as environment variables, supported platforms, commands, and data sets. For example:

name: envYaml

variables:
  PROJECT_VAR: 'value set in project.yml'

downloads:
  MPG_CSV: https://bit.ly/autompg-csv

commands:
  default:
    unix: env | grep PROJECT_VAR

platforms:
  - osx-64
  - osx-arm64

Running prepare first, we see that the environment is created and the data set is downloaded.

# To activate this environment, use
#
#     $ conda activate /Users/adefusco/Desktop/envYaml/envs/envYaml
#
# To deactivate an active environment, use
#
#     $ conda deactivate

Collecting requests
  Using cached requests-2.27.1-py2.py3-none-any.whl (63 kB)
Requirement already satisfied: certifi>=2017.4.17 in ./envs/envYaml/lib/python3.8/site-packages (from requests) (2021.10.8)
Collecting urllib3<1.27,>=1.21.1
  Using cached urllib3-1.26.8-py2.py3-none-any.whl (138 kB)
Collecting charset-normalizer~=2.0.0
  Using cached charset_normalizer-2.0.12-py3-none-any.whl (39 kB)
Collecting idna<4,>=2.5
  Using cached idna-3.3-py3-none-any.whl (61 kB)
Installing collected packages: urllib3, idna, charset-normalizer, requests
Successfully installed charset-normalizer-2.0.12 idna-3.3 requests-2.27.1 urllib3-1.26.8
exoplanets.csv: 100%|████████████████████████████████████████████████████████████████████| 0.28/0.28 [00:00<00:00, 3.33s/MiB]
The project is ready to run commands.
Use `anaconda-project list-commands` to see what's available.

Now running the default command verifies that the env variable is set

> conda-project run
Previously downloaded file located at /Users/adefusco/Desktop/envYaml/exoplanets.csv
PROJECT_VAR=value set in project.yml
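How a bare run could pick its command can be sketched as below. This is an illustration under assumptions, not the tool's code: the helper name `resolve_command` is hypothetical, `unix` is the key used in the project.yml above, `windows` is an assumed counterpart key, and falling back to the first listed command when none is named default is also an assumption.

```python
import sys
from typing import Dict, Optional


def resolve_command(commands: Dict[str, Dict[str, str]],
                    name: Optional[str] = None,
                    platform: str = sys.platform) -> str:
    """Return the command string to execute for this platform."""
    if not commands:
        raise KeyError("no commands defined")
    if name is None:
        # Assumed rule: prefer a command called "default", else the first one.
        name = "default" if "default" in commands else next(iter(commands))
    spec = commands[name]
    key = "windows" if platform.startswith("win") else "unix"
    if key not in spec:
        raise ValueError(f"command {name!r} has no {key!r} entry")
    return spec[key]
```

With the project.yml above, `resolve_command({"default": {"unix": "env | grep PROJECT_VAR"}}, platform="darwin")` returns the `env | grep PROJECT_VAR` string.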

And finally, running lock will only lock for the provided platforms

> conda-project lock
Updating locked dependencies for env spec envYaml...
Resolving conda packages for osx-arm64
Resolving conda packages for osx-64
Changes to locked dependencies for envYaml:
  platforms:
+   osx-64
+   osx-arm64
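The platform selection seen in the two lock runs can be summarized in a few lines. The default set below is simply the four platforms that were resolved when no platforms: key was present. Whether the tool hard-codes exactly this set is an assumption, and `platforms_to_lock` is a hypothetical name.

```python
from typing import List

# The four platforms resolved in the earlier lock output (assumed default set).
DEFAULT_PLATFORMS = ("linux-64", "osx-64", "osx-arm64", "win-64")


def platforms_to_lock(project_spec: dict) -> List[str]:
    """Use the declared platforms when present, else the default set."""
    declared = project_spec.get("platforms") or DEFAULT_PLATFORMS
    return sorted(set(declared))
```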

requirements.txt

If there is a requirements.txt in the project directory (and no environment.yml) all packages listed will be installed as pip packages.

requests
tranquilizer==0.4.2

Running prepare first creates a conda environment with the most recent version of Python (3.8 in this example) and pip, and then installs the packages listed in the requirements.txt file.

> conda-project prepare
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: /Users/adefusco/Development/AnacondaPlatform/anaconda-project/examples/reqs/envs/default

  added / updated specs:
    - python


The following NEW packages will be INSTALLED:

  ca-certificates    pkgs/main/osx-64::ca-certificates-2020.1.1-0
  certifi            pkgs/main/osx-64::certifi-2020.4.5.2-py38_0
  libcxx             pkgs/main/osx-64::libcxx-10.0.0-1
  libedit            pkgs/main/osx-64::libedit-3.1.20181209-hb402a30_0
  libffi             pkgs/main/osx-64::libffi-3.3-h0a44026_1
  ncurses            pkgs/main/osx-64::ncurses-6.2-h0a44026_1
  openssl            pkgs/main/osx-64::openssl-1.1.1g-h1de35cc_0
  pip                pkgs/main/osx-64::pip-20.0.2-py38_3
  python             pkgs/main/osx-64::python-3.8.3-h26836e1_1

And confirm the pip packages were installed

>conda list -p envs/default
# packages in environment at /Users/adefusco/Development/AnacondaPlatform/anaconda-project/examples/reqs/envs/default:
#
# Name                    Version                   Build  Channel
aniso8601                 8.0.0                    pypi_0    pypi
attrs                     19.3.0                   pypi_0    pypi
ca-certificates           2020.1.1                      0  
certifi                   2020.4.5.2               py38_0  
chardet                   3.0.4                    pypi_0    pypi
click                     7.1.2                    pypi_0    pypi
flask                     1.1.2                    pypi_0    pypi
flask-restplus            0.13.0                   pypi_0    pypi

If you require a different version of Python it can be supplied during prepare

>conda-project prepare --python=3.6
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: /Users/adefusco/Development/AnacondaPlatform/anaconda-project/examples/reqs/envs/default

  added / updated specs:
    - python=3.6


The following NEW packages will be INSTALLED:

  ca-certificates    pkgs/main/osx-64::ca-certificates-2020.1.1-0
  certifi            pkgs/main/osx-64::certifi-2020.4.5.2-py36_0
  libcxx             pkgs/main/osx-64::libcxx-10.0.0-1
  libedit            pkgs/main/osx-64::libedit-3.1.20181209-hb402a30_0
  libffi             pkgs/main/osx-64::libffi-3.3-h0a44026_1
  ncurses            pkgs/main/osx-64::ncurses-6.2-h0a44026_1
  openssl            pkgs/main/osx-64::openssl-1.1.1g-h1de35cc_0
  pip                pkgs/main/osx-64::pip-20.0.2-py36_3
  python             pkgs/main/osx-64::python-3.6.10-hf48f09d_2
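The two-step flow for requirements.txt (create a conda environment with Python and pip, then pip-install the requirements) can be written out as plain conda and pip invocations. The sketch below only builds the argument lists and does not run anything. The function name is hypothetical, and the exact commands the tool issues internally are an assumption; `conda create --prefix`, `conda run --prefix` and `pip install -r` are standard options of those tools.

```python
from typing import List, Optional


def requirements_prepare_commands(prefix: str,
                                  python_version: Optional[str] = None,
                                  requirements: str = "requirements.txt"
                                  ) -> List[List[str]]:
    """Build the conda and pip commands for a requirements.txt project."""
    # Without an explicit version, conda picks the most recent Python.
    python_spec = "python" if python_version is None else f"python={python_version}"
    create = ["conda", "create", "--yes", "--prefix", prefix, python_spec, "pip"]
    install = ["conda", "run", "--prefix", prefix,
               "python", "-m", "pip", "install", "-r", requirements]
    return [create, install]
```

Calling `requirements_prepare_commands("envs/default", "3.6")` mirrors the `--python=3.6` example: the first command pins python=3.6 and the second installs from requirements.txt into the same prefix.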

Again, you can run any executable in the environment

>conda-project run tranquilizer --version
tranquilizer 0.4.2

Again, you can add packages to requirements.txt and install them with prepare

# edit requirements.txt to add pytest

> conda-project prepare
The project is ready to run commands.
Use `anaconda-project list-commands` to see what's available.




> conda list -p envs/default py
# packages in environment at /Users/adefusco/Development/AnacondaPlatform/anaconda-project/examples/reqs/envs/default:
#
# Name                    Version                   Build  Channel
py                        1.8.2                    pypi_0    pypi
pyparsing                 2.4.7                    pypi_0    pypi
pyrsistent                0.16.0                   pypi_0    pypi
pytest                    5.4.3                    pypi_0    pypi
python                    3.6.10               hf48f09d_2  
python-dateutil           2.8.1                    pypi_0    pypi
pytz                      2020.1                   pypi_0    pypi

AlbertDeFusco and others added 17 commits February 13, 2022 21:30
An anaconda-project.yml file is no longer required.

* If environment.yml or environment.yaml file exists
   Create the env as envs/{name} where name comes from the file.
   This works for pip entries, too.

* If requirements.txt file exists
   Create the env as envs/default with the latest version of python.
   Add packages from requirements.txt as pip installs
the --python=x.x flag will ensure that the base conda
environment has your requested version of python before running
pip installs

* also stop anaconda-project run from writing the file to disk
* also list pip packages with list-packages command
without a name field the env-spec
becomes "default"
This reverts commit fdd723b7e8e68671217f42dc4b866ecf0905fb6c.

Need to think about this more carefully
* anaconda-project entry point renamed to conda-project
    * conda searches for conda-* executables in the PATH
      to enable "conda project"

* conda-project.yml is preferred but anaconda-project.yml
  is still supported

* cli help updated to reflect conda-project

* conda recipe updated to build conda-project
(and project-lock.yml)
@AlbertDeFusco
Collaborator Author

It will be good to rebase against #359 once it's merged. It should help with adding env_specs to the project.yml file that can extend the default env from environment.yml.

Also we could use env_specs to install non-pypi packages before applying the supplied requirements.txt.

@jlstevens
Collaborator

jlstevens commented Feb 14, 2022

This looks great!

Here are some general questions/thoughts/comments:

  1. Although requirements.txt files are common (which means it makes sense to support them), I do worry that pip isn't as reliable for general package reproducibility as conda. This has been historically the case, but maybe things have improved now that pip has a solver...

  2. People like to treat environment.yml files as if they are locks when they are not. The workflow here is that you could have an env spec coming from an environment.yml which you should then lock with conda-project lock (I do believe this should have always been a core conda feature and not something that should have been defined only at the project level). Given the history of things, I wonder if there could be some way to encourage locking: for instance, maybe the conda-project archive command could put up an interactive prompt by default offering to generate a lock if one is missing?

  3. Given that there is some equivalence between a project.yml with an env spec and a project.yml with an environment.yml, I wonder if it would be worth offering a tool to convert between these formats. The reason not to do this (even if it could be useful!) is that we are trying to reduce the CLI surface of conda project...

  4. Just spitballing here, but maybe there could be a 'core' conda project with the most essential features and an optional, additional package to add extra commands for the people who want them. For instance, the commands that manipulate the yaml files by adding/removing/updating package specs, or this conversion tool I just suggested?

    Edit: I see in [PROPOSAL] conda project cli #284 that @AlbertDeFusco suggests we can drop these commands due to the environment.yml support added here. I think I agree, but there are still the following commands that we are unsure of: archive/unarchive, upload, download, dock. Of these, I would most want to keep archive/unarchive...

    If this were done (and I'm still trying to decide if I think this is a good idea or not!) then the supported project.yml spec would stay the same, all that would be different would be the set of commands/tools offered.

  5. Although it doesn't really relate to the CLI, we have discussed that in conda-project the yaml format can be simplified by dropping the notebook, notebooks and bokeh commands. We only really need the unix and win commands and some docs showing how to use the jinja2 templating to achieve the equivalent functionality.

@AlbertDeFusco
Collaborator Author

Some great things in there for us to dig into. Here's another suggestion from Matt

#362

@jbednar
Collaborator

jbednar commented Feb 14, 2022

@jlstevens, what's the difference between the two files you're suggesting in 3. to convert? I can't think of any required difference, and would argue for a single type of file, with commands ignored unless one does a project run invocation...
