Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v0.10.1 & turn into multi-output recipe to account for optional installs #14

Merged
merged 16 commits into from
Jul 19, 2021

Conversation

h-vetinari
Copy link
Member

@h-vetinari h-vetinari commented Mar 5, 2021

While updating the dependencies for #13, I noticed that there are new optional installs (and that this recipe hasn't done a good job in keeping in sync with the dependency changes...)

In order to make the core install less heavy, I propose the following mapping of installs

pip install conda install Comment
modin[all] modin-all
modin[dask] modin-dask
modin[ray] modin-ray
modin[remote] modin-remote Currently disabled, rpyc not in conda-forge yet
modin[spreadsheet] modin-spreadsheet Currently disabled, modin-spreadsheet not in conda-forge yet; Note: this name would currently conflict between the modin-optional and the separate package!
modin modin Needed for compatibility with previous installs. Should probably replace what I've currently named modin-core, but could also point to modin-all

Finally, there's a problem with the re-rendering, that conda-smithy reintroduces the 3.6 job, but that's unrelated to this discussion.

Builds on #13 (which should be merged first) and incorporates those commits

Closes #17
Closes #18

CC @devin-petersohn

@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@devin-petersohn
Copy link
Contributor

@h-vetinari Your plan looks great. I agree that conda install -c conda-forge modin should probably point to modin-core. Ideally conda install would only add the minimum requirements.

We can skip the spreadsheet and remote as those are very early releases and probably not ready for mass adoption.

@h-vetinari
Copy link
Member Author

h-vetinari commented Mar 5, 2021

Your plan looks great.

Happy to hear it! :)
Over several packages with many optionals, I've found this to be an approach that works well. I'd need your help a bit how to write meaningful tests for the various subpackages; feel free to commit in this PR if you want.

I agree that conda install -c conda-forge modin should probably point to modin-core. Ideally conda install would only add the minimum requirements.

I've changed my mind on this slightly, because conda install modin and then python -c "import modin" fails, since there's no engine. In the meantime, I'm pointing modin to require modin-dask (since ray is a more heavy-weight dependency and not available on osx yet).

We can skip the spreadsheet and remote as those are very early releases and probably not ready for mass adoption.

That's fine, they're currently skipped in the recipe. The only thing I'd like to catch early is if we can avoid the naming clash between modin-spreadsheet (the PyPI package) and modin-spreadsheet==modin[spreadsheet] (the modin-optional). One solution would be to not package the PyPI package for conda-forge independently, but only package it here in this feedstock for the output modin-spreadsheet (which then still is the functional equivalent of modin[spreadsheet]). WDYT?

@devin-petersohn
Copy link
Contributor

One solution would be to not package the PyPI package for conda-forge independently, but only package it here in this feedstock for the output modin-spreadsheet (which then still is the functional equivalent of modin[spreadsheet]). WDYT?

I think that makes sense, we could also have modin-spreadsheet under a different name (modin-spreadsheet-core) in conda forge. We aren't anticipating anyone directly installing it without Modin (it wont work without Modin anyway), but want it to make sure that it can evolve quickly and independently since it's quite early.

@devin-petersohn
Copy link
Contributor

@h-vetinari What do you need from me on this?

Copy link
Member Author

@h-vetinari h-vetinari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devin-petersohn: What do you need from me on this?

I left a couple of comments where I could use some inputs. :)

recipe/meta.yaml Outdated
Comment on lines 62 to 66
test:
commands:
# cannot import modin because it needs an engine (dask or ray)
# TODO: add core-specific tests!
- exit 0
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use some advice for core-specific tests here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For core specific, we could just import modin. We could also add a pytest test file with a with pytest.raises block in it to ensure it does raise an error.

Related question: do we need modin-core on conda forge? We have it in pypi to be as lightweight as possible because often a user will have one of ray/dask installed already. Do you think it will help to have it here too (even just to mirror pypi)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing is, import modin doesn't work for core, because there's no engine, and the import fails on that.

modin-core makes a lot of sense IMO, even if it's not intended for general usage, because it allows us to have a common base for the people to either choose dask and/or ray as an engine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

import modin should work on the most recent releases.

import modin  # doesn't touch engine code
import modin.pandas

This will fail:

import modin
from modin.config import Engine

Engine.get()  # triggers check for engine

recipe/meta.yaml Outdated
Comment on lines 78 to 86
test:
imports:
# are there more specific imports?!
- modin
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there modin[dask]-specific things to test?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing the imports should be sufficient.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

recipe/meta.yaml Outdated
Comment on lines 95 to 105
test:
imports:
# are there more specific imports?!
- modin
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there modin[ray]-specific things to test?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, testing import should be sufficient

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

recipe/meta.yaml Outdated
Comment on lines 118 to 148
# TODO: resolve conflict modin[spreadsheet] vs. modin-spreadsheet
- name: modin-spreadsheet
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on your comment, my understanding now would be:

  • leave this as modin-spreadsheet in this package
  • when bringing in the modin-spreadsheet from PyPI to conda-forge, name it something else, like modin-spreadsheet-core

Does this match your view?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this matches my view

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, let's remember that whenever we create that staged-recipes PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little late to the party, but I don't think a separate package is needed - all modin-spreadsheet would do would be add a dependency to modin-spreadsheet-core, what's the point of having two separate packages? I think the other solution (make modin-spreadsheet subpackage include modin-spreadsheet from PyPI) makes more sense to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Come to think of it more, I think we definitely need a separate feedstock, because modin-spreadsheet (separate package from PyPI) has its own separate versioning.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we just need to make sure that the separate feedstock is named modin-spreadsheet-core...

@devin-petersohn
Copy link
Contributor

@h-vetinari Sorry it's been a hectic couple of weeks

I left a couple of comments where I could use some inputs. :)

I missed them before :/. I will respond

Copy link
Member Author

@h-vetinari h-vetinari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry it's been a hectic couple of weeks

No need to apologise. Here are my answers.

recipe/meta.yaml Outdated
Comment on lines 62 to 66
test:
commands:
# cannot import modin because it needs an engine (dask or ray)
# TODO: add core-specific tests!
- exit 0
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing is, import modin doesn't work for core, because there's no engine, and the import fails on that.

modin-core makes a lot of sense IMO, even if it's not intended for general usage, because it allows us to have a common base for the people to either choose dask and/or ray as an engine.

recipe/meta.yaml Outdated
Comment on lines 78 to 86
test:
imports:
# are there more specific imports?!
- modin
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

recipe/meta.yaml Outdated
Comment on lines 95 to 105
test:
imports:
# are there more specific imports?!
- modin
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

recipe/meta.yaml Outdated
Comment on lines 118 to 148
# TODO: resolve conflict modin[spreadsheet] vs. modin-spreadsheet
- name: modin-spreadsheet
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, let's remember that whenever we create that staged-recipes PR.

@h-vetinari h-vetinari mentioned this pull request Jul 13, 2021
3 tasks
recipe/meta.yaml Outdated

- name: modin-remote
build:
# dependency rpyc is not in conda-forge yet
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do note that we also require ray-autoscaler for modin-remote, and autoscaler is also not on conda-forge now anymore because a lot of its dependencies are way out of date...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you open an issue on the ray-feedstock and maybe on the feedstocks of the missing dependencies?

@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

I do have some suggestions for making it better though...

For recipe:

  • It looks like the 'modin-omnisci' output doesn't have any tests.

Signed-off-by: Vasily Litvinov <vasilij.n.litvinov@intel.com>
Also add placeholder for modin-sql

Signed-off-by: Vasily Litvinov <vasilij.n.litvinov@intel.com>
Signed-off-by: Vasily Litvinov <vasilij.n.litvinov@intel.com>
@h-vetinari
Copy link
Member Author

h-vetinari commented Jul 13, 2021

Disabling python 3.9 should not be necessary. Something in the dependencies of pyomniscidbe depends on glibc 2.17, so we just need to follow suit there (potentially need to add - sysroot_linux-64 2.17 # [linux64] as a dependency)

@vnlitvinov
Copy link
Contributor

I have only disabled modin-omnisci on Python 3.9 because there are no packages for pyomniscidbe for 3.9 🙄 refs: https://anaconda.org/conda-forge/pyomniscidbe/files and https://github.com/conda-forge/omniscidb-feedstock/blob/master/recipe/meta.yaml#L360

@h-vetinari
Copy link
Member Author

Aha, I had looked at https://anaconda.org/conda-forge/pyomniscidb/files, and there are some py39 builds. What's the difference between having the "e" at the end or not?

@h-vetinari
Copy link
Member Author

Nevermind, I can see there are different outputs in the recipe. In any case, that comment you linked is not current anymore, there are builds for tbb4py: https://anaconda.org/conda-forge/tbb4py/files

If you want you can force-push your changes again, but I'd prefer building pyomniscidbe for py39 before merging this.

@h-vetinari
Copy link
Member Author

Waiting for the builds from conda-forge/omniscidb-feedstock#46 (should be ~3 more hours), thanks @vnlitvinov!

@h-vetinari
Copy link
Member Author

PS. Sorry for messing up your commits, I had forgotten to update before I force-pushed. I managed to recover them using https://objectpartners.com/2014/02/11/recovering-a-commit-from-githubs-reflog/ 🙃

@h-vetinari h-vetinari changed the title RFC: turn into multi-output recipe to account for optional installs v0.10.1 & turn into multi-output recipe to account for optional installs Jul 13, 2021
@h-vetinari
Copy link
Member Author

@devin-petersohn
I cannot approve my own PR, but after the help hand of @vnlitvinov, I'd be happy to merge this now. Any final comments/concerns?

@vnlitvinov
Copy link
Contributor

Aha, I had looked at https://anaconda.org/conda-forge/pyomniscidb/files, and there are some py39 builds. What's the difference between having the "e" at the end or not?

All difference in the world, or really bad naming (IMHO). The version with "e" is "Python bindings for SQL being run as library" (somewhat similar to sqlite but much more powerful), while without "e" is the Python interface to a SQL server run as a separate application. They serve different purposes.

P.S. I've written that yesterday but for some reason didn't publish, so publishing now.

Copy link
Contributor

@vnlitvinov vnlitvinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (not that it counts now, but still)

version: {{ version }}

source:
url: https://pypi.io/packages/source/{{ name[0] }}/{{ name }}/{{ name }}-{{ version }}.tar.gz
sha256: 88ce4def4a1145d31d207be55b9c3ba914e084ccfebace1d52b03507df258fd2
url: https://github.com/modin-project/modin/archive/refs/tags/{{ version }}.tar.gz
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be worth adding a comment explaining why we take github instead of pypi

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can if you want. It might just be this release where the sources are different - for the feedstocks I maintain, I consider PyPI & GH sources to be completely equivalent (actually with a slight preference to GH), and not really worth commenting about. But YMMV.

recipe/meta.yaml Outdated
Comment on lines 118 to 148
# TODO: resolve conflict modin[spreadsheet] vs. modin-spreadsheet
- name: modin-spreadsheet
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Come to think of it more, I think we definitely need a separate feedstock, because modin-spreadsheet (separate package from PyPI) has its own separate versioning.

Comment on lines +148 to +149
# Also requires ray-autoscaler which is now absent because
# its dependencies on conda-forge are too ancient.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vnlitvinov, I don't see the dependency on ray[autoscaler] in setup.py for modin[remote]. Where does this come from? Again a conda-only dependency?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, just knowledge of how things are implemented inside (as I was the one who implemented them 😄).
That said, ray does not have a ray[autoscale] target yet, ref: ray-project/ray#15725

Copy link
Contributor

@vnlitvinov vnlitvinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@h-vetinari
Copy link
Member Author

@devin-petersohn
I cannot approve my own PR, but after the help hand of @vnlitvinov, I'd be happy to merge this now. Any final comments/concerns?

Last call (not that I think it's critical...) - will merge in about 24h unless there are objections.

@vnlitvinov
Copy link
Contributor

I think Devin is out this week.

@h-vetinari
Copy link
Member Author

I think Devin is out this week.

OK. Do you think we should wait or merge?

@vnlitvinov
Copy link
Contributor

IIRC he should be back on Monday or Tuesday, we could wait for him, no big deal.

Copy link
Contributor

@devin-petersohn devin-petersohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for letting me take a look before merging. LGTM!

@devin-petersohn devin-petersohn merged commit e19affc into conda-forge:master Jul 19, 2021
@h-vetinari h-vetinari deleted the multi branch July 20, 2021 04:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants