Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pin upper pandas version requirement #1182

Merged
merged 3 commits into from
Jan 31, 2022
Merged

Conversation

antonymilne
Copy link
Contributor

@antonymilne antonymilne commented Jan 27, 2022

Description

Fixes #1179. pandas have released 1.4.0, which requires xlrd>=2.0.1 (and is only compatible with Python >=3.8). This is incompatible with our xlrd~=1.0.

Development notes

Note that pandas was unpinned in August 2020 in https://github.com/quantumblacklabs/private-kedro/pull/772. In hindsight we probably should have reinstated an upper bound some time after that. I've done so now to restrict to <1.4.

The alternative to this was to allow pandas 1.4 and relax our requirement for xlrd to include 2.0.1. I opted not to do this since xlrd 2.0 introduced breaking changes.

Questions

  1. Does this seem like the right thing to do? It stops people from using pandas>=1.4 which is maybe annoying for users who want to use any new features in those releases. We could do something to make the version of xlrd installed depend on the version of pandas so that not everyone gets xlrd>=2.0, but that seemed like overkill to me.
  2. Looking at pandas release notes, every 1.x release (like 1.1, 1.2, 1.3, etc.) appears to break backwards compatibility. So I wonder whether we should change the specification in develop also (currently pandas~=1.3)?

Checklist

  • Read the contributing guidelines
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes

Signed-off-by: Antony Milne <antony.milne@quantumblack.com>
Signed-off-by: Antony Milne <antony.milne@quantumblack.com>
Copy link
Contributor

@lorenabalan lorenabalan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tbh I think this is perfectly fine. At the end of the day, users can still install newer pandas themselves with pip install pandas rather than through pip install kedro[pandas], and they would have to be careful to use openpyxl instead of xlrd. This won't be as much of a problem in 0.18 cause we default to openpyxl then.

Copy link
Member

@merelcht merelcht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Signed-off-by: Antony Milne <antony.milne@quantumblack.com>
@lorenabalan lorenabalan merged commit f59739c into main Jan 31, 2022
@lorenabalan lorenabalan deleted the bugfix/pandas-version branch January 31, 2022 09:41
Galileo-Galilei pushed a commit to Galileo-Galilei/kedro that referenced this pull request Feb 19, 2022
* Add reference to layers medium article

* Update docs/source/12_faq/01_faq.md

Co-authored-by: Merel Theisen <49397448+MerelTheisenQB@users.noreply.github.com>

* Update 01_faq.md

Fix spacing

* Update docs/source/12_faq/01_faq.md

Co-authored-by: Jo Stichbury <stichbury@users.noreply.github.com>

Co-authored-by: Merel Theisen <49397448+MerelTheisenQB@users.noreply.github.com>
Co-authored-by: Jo Stichbury <stichbury@users.noreply.github.com>
lvijnck pushed a commit to lvijnck/kedro that referenced this pull request Apr 7, 2022
Signed-off-by: Laurens Vijnck <laurens_vijnck@mckinsey.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ExcelDataSet not working after pandas 1.4.0 release (python 3.8)
5 participants