Slim down dependencies #1313
Comments
I just noticed that there is a discussion of this in #1261. I'll leave this open for visibility unless people would prefer to close.
Thanks for opening the issue (one item I can strike off my to-do list :-)). There is also some discussion about this happening in #1261.
For conda, I think we can have a geopandas-base or geopandas-core that depends on: pandas, shapely, pyproj, rtree. Strictly speaking rtree is also optional, but since spatial join, overlay and clip are rather essential operations, I would prefer to keep it even in the core installation. But dropping it can certainly be considered as well. (In theory even pyproj could be made optional, since a GeoDataFrame does not require to have a For pip I am less sure. I would find it slightly annoying that
It seems I'm in the minority here, but IMHO dropping these dependencies would be counterproductive for the vast majority of users. As @jorisvandenbossche describes above, I'm already frustrated that
What use case do users have for geopandas in that setting? Why not just use pandas/shapely directly, or spatialpandas? I get that, as a general principle, fewer dependencies are preferable. But in the case of packages like
Note that if you are using conda and if you are used to do Also note that I only proposed to not include fiona in the core dependencies, and only mentioned rtree and pyproj as theoretical options (I am myself also not in favor of dropping those, as mentioned above, unless someone comes up with good reasons).
That's fair, of course. Also, just to be clear, I didn't mean to come in here and start an argument, just to raise an alternative view, and I was curious about the counterpoints :)
One alternative would be to define the geopandas package on PyPI/conda to grab all the dependencies and make geopandas-base a separate package.
With this, neither the PyPI nor conda target would change, and restricted users could use conda/pip install geopandas-base and program on top of geopandas-base? Is the issue that we want to support programming on top of geopandas-base as if it were geopandas?
I vote for @ljwolf's proposal, assuming he means GDAL, not GEOS. I would like to see Then we should have geopandas-base coming with the bare minimum, probably even without rtree and pyproj. If advanced users have an issue with some of the C deps, they can just install geopandas-base and only those parts they require. It would require just a small change to the codebase; the rest is a question of packaging, i.e. a new recipe on conda-forge. I am not sure about PyPI and how that would work, because we would have to alter setup.py and requirements.
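The "small change to the codebase" mentioned above could look something like the following sketch, in which the IO dependency is imported only when an IO function is actually called. The names `import_optional` and `read_file` are hypothetical and chosen for illustration; this is not geopandas' actual code. Only `importlib.import_module` and `fiona.open` are existing calls.

```python
import importlib


def import_optional(name, hint):
    """Return the named module, or raise an ImportError that includes a hint.

    Hypothetical helper for illustration only.
    """
    try:
        return importlib.import_module(name)
    except ImportError as err:
        raise ImportError(
            f"'{name}' is required for this operation but is not installed. {hint}"
        ) from err


def read_file(path):
    # Hypothetical IO entry point: fiona is imported lazily, so a minimal
    # install without fiona can still import and use the rest of the package.
    fiona = import_optional(
        "fiona", "Install the full 'geopandas' package to enable file IO."
    )
    return fiona.open(path)
```

With a guard like this, a geopandas-base install would only fail at the point where file IO is requested, and the error message would tell the user which package to add.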
Yes, that is what I was intending for conda, and we have examples of matplotlib or dask doing this from a single feedstock (https://github.com/conda-forge/matplotlib-feedstock/blob/master/recipe/meta.yaml) or from multiple feedstocks (https://github.com/conda-forge/dask-core-feedstock/blob/master/recipe/meta.yaml). For pip I am less sure (e.g. matplotlib and dask don't do something similar on PyPI).
Shall we start with conda and then see if there is a need to try the same on PyPI later based on the response?
I guess this is why I'm confused... If you have conda at your disposal, then installing gdal is trivial, no? The trouble comes from installing so if the root of the problem is that gdal/fiona are difficult to install with pip, why is it useful to create another conda package without gdal?
I vote for calling it
An update on this: the conda recipe currently offers a minimal We could still do the same for pip in some way.
To still answer this (very belatedly), I see two main reasons: 1) even with conda, gdal/fiona is still the package that can give problems from time to time (given the many C dependencies, it most easily gives channel conflicts, or some temporary error if one of the packages gets updated, ...), 2) more importantly, it gives a large install size, and if you don't need gdal/fiona, the geopandas-base package gives you the option to get a lighter env (which can be useful in cases where size matters, e.g. in containers, AWS Lambda, ...).
I was chatting with @jorisvandenbossche at the dask developer meeting last week and he mentioned that gdal is only required for fiona which handles the IO parts of geopandas.
Since gdal is known to be a pain to install, it'd be nice if geopandas were split into two conda packages: geopandas-core and geopandas. geopandas-core would include all the dependencies except fiona, and geopandas would include fiona, geopandas-core and all the current dependencies.
For pip installs there could be more subsets of dependencies, but the full install would be:
pip install geoviews[complete]
This pattern has been established in dask and other projects (such as geoviews).
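As a rough sketch of how such pip subsets could be declared, setuptools accepts an `extras_require` mapping alongside `install_requires`. The group names (`io`, `crs`, `sindex`) and the way packages are assigned to them below are assumptions made for illustration, not an agreed split; only the `complete` name follows the pattern referred to above.

```python
# Hypothetical dependency groups; names and contents are illustrative only.
CORE_REQUIRES = ["pandas", "shapely"]

EXTRAS_REQUIRE = {
    "io": ["fiona"],      # file IO, which pulls in GDAL
    "crs": ["pyproj"],    # coordinate reference system handling
    "sindex": ["rtree"],  # spatial index for joins, overlay and clip
}

# "complete" is the union of all optional groups defined so far.
EXTRAS_REQUIRE["complete"] = sorted(
    {pkg for group in EXTRAS_REQUIRE.values() for pkg in group}
)

# In a setup.py these would be passed to setuptools as:
#   setup(..., install_requires=CORE_REQUIRES, extras_require=EXTRAS_REQUIRE)
```

A user would then pick a subset with the bracket syntax, for example a `[complete]` extra for everything or a single group for a lighter install.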