Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: read_stata return non-nano #55642

Merged
merged 11 commits into from
Feb 2, 2024

Conversation

jbrockmendel
Copy link
Member

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Pre-emptively addresses issues that come up when implementing #55564

cc @bashtage

@mroeschke mroeschke added IO Stata read_stata, to_stata Non-Nano datetime64/timedelta64 with non-nanosecond resolution labels Oct 23, 2023
@bashtage
Copy link
Contributor

Does the writer code automatically handle non-ns timestamps? Or does it need some work? I give some time if needed.

@jbrockmendel
Copy link
Member Author

Does the writer code automatically handle non-ns timestamps? Or does it need some work? I give some time if needed.

IIUC _datetime_to_stata_elapsed_vec handles this correctly, but another pair of eyeballs would be a good idea

if fmt.startswith(("%tc", "tc")):
# Delta ms relative to base
td = np.timedelta64(stata_epoch - unix_epoch, "ms")
res = np.array(dates._values, dtype="M8[ms]") + td
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not very familiar with this part of the code base, but this looks like it would overflow when dates._values are close to the minimum storage value. Does numpy catch that for us or return junk values?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch. numpy does not catch this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bashtage am i right in thinking stata doesn't support dates millions of years in the past/future so we don't need to worry about these overflows?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just did a check of type tc and found that the maximum date it would represent is 31Dec9999 21:41:39

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This value is stored as 253717911699456

@jbrockmendel
Copy link
Member Author

@bashtage gentle ping

@bashtage
Copy link
Contributor

Will take a look later today.

@jbrockmendel jbrockmendel added this to the 3.0 milestone Dec 1, 2023
Copy link
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Jan 13, 2024
@mroeschke mroeschke removed the Stale label Feb 2, 2024
@mroeschke mroeschke merged commit 4663edd into pandas-dev:main Feb 2, 2024
51 checks passed
@mroeschke
Copy link
Member

Thanks @jbrockmendel

@jbrockmendel jbrockmendel deleted the enh-stata-non-nano branch February 2, 2024 20:19
pmhatre1 pushed a commit to pmhatre1/pandas-pmhatre1 that referenced this pull request May 7, 2024
* ENH: read_stata return non-nano

* GH ref

* mypy fixup

* update doctest

* simplify

* avoid Series.view

* dont go through Series

* move whatsnew

* remove outdated whatsnew
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO Stata read_stata, to_stata Non-Nano datetime64/timedelta64 with non-nanosecond resolution
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants