BUG: Resampling with non nano returns an error #57427

Open
3 tasks done
khider opened this issue Feb 14, 2024 · 2 comments
Labels
Bug; Non-Nano (datetime64/timedelta64 with non-nanosecond resolution); Regression (functionality that used to work in a prior pandas version); Resample (resample method)

khider commented Feb 14, 2024

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import operator

import numpy as np
import pandas as pd

# Load the data
url = 'https://raw.githubusercontent.com/LinkedEarth/Pyleoclim_util/master/pyleoclim/data/LR04.csv'
df = pd.read_csv(url, skiprows=4, header=0)

multiplier = 1
rule = f'{1_000 * multiplier}YS'  # resampling rule: '1000YS'

op = operator.sub

SECONDS_PER_YEAR = 31556925.974592

# Build a second-resolution DatetimeIndex: 1950 minus the first column,
# scaled by 10**3 years and converted to seconds
index = pd.DatetimeIndex(
    op(
        np.datetime64(str(1950), 's'),
        (df.iloc[:, 0] * SECONDS_PER_YEAR * 10**3).astype('timedelta64[s]'),
    ),
    name='datetime',
)

value = df.iloc[:, 1].to_numpy()

ser = pd.Series(value, index=index)
ser2 = ser.resample(rule)  # this call raises the ValueError shown below

Issue Description

Produces the following error:

Traceback (most recent call last):

  Cell In[13], line 1
    ser2 = ser.resample(rule)

  File ~/opt/anaconda3/envs/paleopandas/lib/python3.10/site-packages/pandas/core/generic.py:9765 in resample
    return get_resampler(

  File ~/opt/anaconda3/envs/paleopandas/lib/python3.10/site-packages/pandas/core/resample.py:2044 in get_resampler
    return tg._get_resampler(obj, kind=kind)

  File ~/opt/anaconda3/envs/paleopandas/lib/python3.10/site-packages/pandas/core/resample.py:2225 in _get_resampler
    return DatetimeIndexResampler(

  File ~/opt/anaconda3/envs/paleopandas/lib/python3.10/site-packages/pandas/core/resample.py:187 in __init__
    self.binner, self._grouper = self._get_binner()

  File ~/opt/anaconda3/envs/paleopandas/lib/python3.10/site-packages/pandas/core/resample.py:252 in _get_binner
    binner, bins, binlabels = self._get_binner_for_time()

  File ~/opt/anaconda3/envs/paleopandas/lib/python3.10/site-packages/pandas/core/resample.py:1735 in _get_binner_for_time
    return self._timegrouper._get_time_bins(self.ax)

  File ~/opt/anaconda3/envs/paleopandas/lib/python3.10/site-packages/pandas/core/resample.py:2323 in _get_time_bins
    bins = lib.generate_bins_dt64(

  File lib.pyx:891 in pandas._libs.lib.generate_bins_dt64

ValueError: Values falls before first bin

Note that this issue did not exist in pandas 2.1.4.
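For readers unfamiliar with the message: the traceback ends in the step that turns bin edges into positions in the sorted index, and the error means the earliest timestamp in the index is smaller than the first bin edge built for the resampling rule. The following is a simplified NumPy sketch of that check, not the Cython code in `lib.pyx`; the name `generate_bins_sketch` and the integer sample data are made up for illustration.

```python
import numpy as np


def generate_bins_sketch(values, edges, closed="left"):
    """Return, for each bin, the position in sorted `values` where the bin ends.

    Illustrative stand-in only; `values` must be sorted in ascending order.
    """
    values = np.asarray(values)
    edges = np.asarray(edges)
    if len(values) and values[0] < edges[0]:
        # Same situation the report runs into: data starts before the first edge.
        raise ValueError("Values falls before first bin")
    if len(values) and values[-1] > edges[-1]:
        raise ValueError("Values falls after last bin")
    # closed="left": bin i holds edges[i] <= v < edges[i + 1]
    # closed="right": bin i holds edges[i] < v <= edges[i + 1]
    side = "left" if closed == "left" else "right"
    return np.searchsorted(values, edges[1:], side=side)


print(generate_bins_sketch([1, 2, 5, 7], [0, 4, 8]))  # -> [2 4]

try:
    generate_bins_sketch([1, 2, 5, 7], [2, 4, 8])  # first edge is after the first value
except ValueError as exc:
    print(exc)  # -> Values falls before first bin
```

In the report the edges are not supplied by the user; they are derived from the index and the `'1000YS'` rule, so a failure of this check points at how the first edge is computed for this index.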

Expected Behavior

A resampled time series, as returned by pandas 2.1.4.

Installed Versions

pd.show_versions()
/Users/deborahkhider/opt/anaconda3/envs/paleopandas/lib/python3.10/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")

INSTALLED VERSIONS

commit : fd3f571
python : 3.10.8.final.0
python-bits : 64
OS : Darwin
OS-release : 22.6.0
Version : Darwin Kernel Version 22.6.0: Wed Jul 5 22:21:56 PDT 2023; root:xnu-8796.141.3~6/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : en_US.UTF-8
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 2.2.0
numpy : 1.24.3
pytz : 2023.3
dateutil : 2.8.2
setuptools : 67.7.2
pip : 23.1.2
Cython : 0.29.33
pytest : None
hypothesis : None
sphinx : 5.0.2
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.9.2
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.14.0
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.2
bottleneck : 1.3.6
dataframe-api-compat : None
fastparquet : None
fsspec : 2023.6.0
gcsfs : None
matplotlib : 3.6.3
numba : 0.57.0
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.12.0
sqlalchemy : 2.0.23
tables : None
tabulate : 0.9.0
xarray : 2023.1.0
xlrd : 2.0.1
zstandard : None
tzdata : 2023.3
qtpy : 2.3.1
pyqt5 : None

khider added the Bug and Needs Triage labels on Feb 14, 2024
rhshadrach (Member) commented Feb 16, 2024

Thanks for the report. I am seeing this also fail on 2.1.4; can you check that again and post the pd.show_versions() output from the 2.1.4 environment where you see it succeed?

Also - I'm seeing this:

print(ser.index)
# DatetimeIndex(['-5298046-05-16 11:11:03', '-5303046-05-17 11:19:50',
#                '-5308046-05-19 11:28:37', '-5313046-05-20 11:37:24',
#                '-5318046-05-22 11:46:11'],
#               dtype='datetime64[s]', name='datetime', freq=None)

are those the desired values?
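One way to sanity-check these index values is to redo the conversion by hand with NumPy alone. The sketch below assumes the first column of LR04.csv is an age in thousands of years before 1950 (inferred from the `10**3` factor in the example and the 5000-year spacing of the printed index, not stated in the report); the helper `age_kyr_to_datetime64` and the two sample ages are illustrative.

```python
import numpy as np

SECONDS_PER_YEAR = 31556925.974592  # constant taken from the reproducible example


def age_kyr_to_datetime64(age_kyr):
    """Convert ages in kyr before 1950 to datetime64[s], mirroring the example."""
    seconds = np.asarray(age_kyr, dtype="float64") * SECONDS_PER_YEAR * 10**3
    return np.datetime64("1950", "s") - seconds.astype("timedelta64[s]")


ages = np.array([5300.0, 5320.0])  # assumed ages near the end of the record
stamps = age_kyr_to_datetime64(ages)

# datetime64[Y] stores years since 1970, so add 1970 to get calendar years.
years = stamps.astype("datetime64[Y]").astype("int64") + 1970

print(stamps.dtype)
print(years)
print(1950 - ages * 1000)  # naive expectation, for comparison
```

Under that assumption the calendar years land a few years away from `1950 - age * 1000`: `SECONDS_PER_YEAR` is a tropical year (about 365.2422 days), while the calendar averages 365.2425 days per year, and over 5.3 million years the difference adds up to roughly four years. That is consistent with the year -5298046 in the printed index.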

rhshadrach added the Needs Info, Resample, and Non-Nano labels and removed the Needs Triage label on Feb 16, 2024
khider (Author) commented Feb 16, 2024

Confirming that it works for me with 2.1.4 on two different Mac machines.

This is from the second machine:

pd.show_versions()
/Users/deborahkhider/anaconda3/envs/paleodev/lib/python3.11/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")

INSTALLED VERSIONS
------------------
commit              : a671b5a8bf5dd13fb19f0e88edc679bc9e15c673
python              : 3.11.7.final.0
python-bits         : 64
OS                  : Darwin
OS-release          : 23.3.0
Version             : Darwin Kernel Version 23.3.0: Wed Dec 20 21:30:59 PST 2023; root:xnu-10002.81.5~7/RELEASE_ARM64_T6030
machine             : arm64
processor           : arm
byteorder           : little
LC_ALL              : en_US.UTF-8
LANG                : en_US.UTF-8
LOCALE              : en_US.UTF-8

pandas              : 2.1.4
numpy               : 1.26.3
pytz                : 2023.3.post1
dateutil            : 2.8.2
setuptools          : 68.2.2
pip                 : 23.3.1
Cython              : None
pytest              : 8.0.0
hypothesis          : None
sphinx              : 5.0.2
blosc               : None
feather             : None
xlsxwriter          : None
lxml.etree          : None
html5lib            : None
pymysql             : None
psycopg2            : None
jinja2              : 3.1.2
IPython             : 8.20.0
pandas_datareader   : None
bs4                 : 4.12.2
bottleneck          : None
dataframe-api-compat: None
fastparquet         : None
fsspec              : None
gcsfs               : None
matplotlib          : 3.8.2
numba               : 0.58.1
numexpr             : None
odfpy               : None
openpyxl            : None
pandas_gbq          : None
pyarrow             : 15.0.0
pyreadstat          : None
pyxlsb              : None
s3fs                : None
scipy               : 1.12.0
sqlalchemy          : None
tables              : None
tabulate            : 0.9.0
xarray              : None
xlrd                : 2.0.1
zstandard           : None
tzdata              : 2023.4
qtpy                : 2.4.1
pyqt5               : None

Also, I recreated this function from the documentation of another code base that is now compiling fine on readthedocs.

Yes, these are the expected values.

rhshadrach added the Needs Triage label and removed the Needs Info label on Mar 2, 2024
lithomas1 added the Regression label and removed the Needs Triage label on Mar 22, 2024
lithomas1 added this to the 2.2.2 milestone on Mar 22, 2024
lithomas1 modified the milestones: 2.2.2, 2.2.3 on Apr 10, 2024
lithomas1 modified the milestones: 2.2.3, 2.3 on Sep 21, 2024