Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add a function to parse a string without timezone to local timestamp #3787

Conversation

MichaelLeeHZ
Copy link
Contributor

@MichaelLeeHZ MichaelLeeHZ commented Mar 2, 2023

Which issue does this PR close?

None

Closes #.

Rationale for this change

Recently in my system, I update arrow from 23 to 32, and I found return from function string_to_timestamp_nanos is changed.

Before #2814, function string_to_timestamp_nanos return a local timestamp when parsing a string without an explicit time zone.

// my time zone is UTC+8
string_to_timestamp_nanos("2022-12-30 09:26:56.123")  => 1672363616123

After #2814, I found the function string_to_timestamp_nanos return a UTC timestamp when parsing a string without an explicit time zone.

// my time zone is UTC+8
string_to_timestamp_nanos("2022-12-30 09:26:56.123")  => 1672392416123

In my scene, I need the former.

What changes are included in this PR?

Add a function string_to_local_timestamp_nanos which interprets strings without an explicit time zone as
timestamps with offsets of the local time on the machine.

For example,2022-12-30T09:26:56.123 is interpreted as a local timestamp in the timezone of the machine. For example, if the system timezone is set to Asia/Shanghai (UTC+8) the timestamp will be interpreted as though it were
2022-12-30T09:26:56.123+08:00

Are there any user-facing changes?

@github-actions github-actions bot added the arrow Changes to the arrow crate label Mar 2, 2023
@tustvold
Copy link
Contributor

tustvold commented Mar 2, 2023

Thank you for this, I would like to take some time to think about this API. I think we might want a way to specify the destination timezone for the function, perhaps adding the ability to pass an Option<Tz>. This would not only cover the case of parsing interpreting as the local timezone, but also parsing into a TimestampArray with a non-UTC timezone set which I think is currently handled incorrectly...

@MichaelLeeHZ MichaelLeeHZ marked this pull request as ready for review March 2, 2023 14:09
@tustvold
Copy link
Contributor

tustvold commented Mar 2, 2023

I've added an alternative proposal in #3795 PTAL 🙏

I think you should be able to just provide it with Local as the timezone parameter

@MichaelLeeHZ
Copy link
Contributor Author

MichaelLeeHZ commented Mar 3, 2023

I think you should be able to just provide it with Local as the timezone parameter

I have read your issue and PR, and there is a question:
if I want to parse a string with time zone, what should I do with string_to_datetime? Do I have to use string_to_timestamp_nanos ?
The precondition is I do not know the format of timestamp-like string.

@tustvold
Copy link
Contributor

tustvold commented Mar 3, 2023

string_to_datetime contains all the current logic, including handling strings with an embedded timezone. It will convert them appropriately

See https://github.com/apache/arrow-rs/pull/3795/files#diff-cd568dd408cf11db340aa0259545de48fca7a0c0ac680bb9ff9a34956bab52b1R632

@tustvold tustvold closed this Mar 4, 2023
@tustvold
Copy link
Contributor

tustvold commented Mar 4, 2023

Closing as I believe this functionality is provided by #3795. Feel free to reopen if I am mistaken

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants