Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add allowOverflow flag to Timestamp::toTimezone #9836

Closed

Conversation

NEUpanning
Copy link
Contributor

@NEUpanning NEUpanning commented May 16, 2024

Allow integer overflow when converting timestamp to TimePoint.

Use in Spark function from_unixtime that allows overflow.

Fixes #9778

@facebook-github-bot
Copy link
Contributor

Hi @NEUpanning!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

Copy link

netlify bot commented May 16, 2024

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 5f712d3
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/664da6b3690ab600084f7453

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 16, 2024
@facebook-github-bot
Copy link
Contributor

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@NEUpanning
Copy link
Contributor Author

@PHILO-HE @rui-mo Could you please help to take a look? Thanks.

Copy link
Collaborator

@rui-mo rui-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix.

@NEUpanning NEUpanning requested a review from rui-mo May 17, 2024 06:10
Copy link
Collaborator

@rui-mo rui-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Looks good overall.

velox/type/Timestamp.h Show resolved Hide resolved
@NEUpanning NEUpanning requested a review from rui-mo May 17, 2024 06:39
velox/type/Timestamp.h Outdated Show resolved Hide resolved
velox/type/Timestamp.h Outdated Show resolved Hide resolved
Copy link
Collaborator

@rui-mo rui-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Looks good % this question #9836 (comment).

@rui-mo
Copy link
Collaborator

rui-mo commented May 17, 2024

@mbasmanova Would you like to take a review? Thanks.

Copy link
Contributor

@PHILO-HE PHILO-HE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please revise the PR title. Assume this fix is not only required by from_unixtime.

My suggestion:
Fix time conversion overflow for `Timestamp::toTimezone`

@mbasmanova
Copy link
Contributor

@rui-mo @PHILO-HE Rui, I'll take a look once your's and @PHILO-HE's comments are resolved, but would it be possible to update PR description to provide more context for this change?

@NEUpanning NEUpanning changed the title fix from_unixtime function throw exception with overflowed argument and timezone Introduce overflow argument for Timestamp::toTimezone to indicates overflow behavior is allowed May 20, 2024
@NEUpanning NEUpanning changed the title Introduce overflow argument for Timestamp::toTimezone to indicates overflow behavior is allowed Introduce allowOverflow argument for Timestamp::toTimezone to indicates overflow behavior is allowed May 20, 2024
@NEUpanning
Copy link
Contributor Author

@rui-mo @PHILO-HE I've revised the PR title and added more context in PR description. Please take a look and provide your feedback.

Copy link
Collaborator

@rui-mo rui-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thanks.

@mbasmanova mbasmanova changed the title Introduce allowOverflow argument for Timestamp::toTimezone to indicates overflow behavior is allowed Add allowOverflow flag to Timestamp::toTimezone May 20, 2024
Copy link
Contributor

@PHILO-HE PHILO-HE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! @mbasmanova, please take a look. Thanks!

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NEUpanning Some questions.

#ifdef NDEBUG
// Integer overflow in the internal conversion from seconds to milliseconds.
EXPECT_EQ(
fromUnixTime(std::numeric_limits<int64_t>::max(), "yyyy-MM-dd HH:mm:ss"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example in the documentation shows a different result: https://facebookincubator.github.io/velox/functions/spark/datetime.html#from_unixtime

SELECT from_unixtime(9223372036854775807, "yyyy-MM-dd HH:mm:ss");  -- '1969-12-31 23:59:59'

Which one is the "right" one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both are right. The example in the documentation doesn't set query time zone configuration, but this test sets time zone to "Asia/Shanghai" , so the results are different.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should add another example in document for query with time zone.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added example in document : 59968ef

@@ -898,6 +898,15 @@ TEST_F(DateTimeFunctionsTest, fromUnixtime) {

// 8 hours ahead UTC.
setQueryTimeZone("Asia/Shanghai");
// In debug mode, Timestamp constructor will throw exception if range check
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a deeper problem here that Timestamp cannot represent values representable in Spark? If not, then, perhaps there is some issue with the implementation where it attempts to create an "invalid" Timestamp.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presto's Timestamp is stored in one 64-bit signed integer for milliseconds, but Spark's Timestamp is stored in 64-bit signed integer for seconds, so this test attempts to create an "invalid" Timestamp. Maybe we should remove the range check for Timestamp since it's also used by Spark now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to follow-up on that. CC: @rui-mo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will work on it when i have time.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NEUpanning Do we need to create an issue to track this follow-up?

Copy link
Contributor Author

@NEUpanning NEUpanning May 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rui-mo Yes, i opened a issue : #9904

@@ -93,6 +93,10 @@ These functions support TIMESTAMP and DATE input types.
SELECT from_unixtime(3600, 'yyyy'); -- '1970'
SELECT from_unixtime(9223372036854775807, "yyyy-MM-dd HH:mm:ss"); -- '1969-12-31 23:59:59'

If we run the following query in the `Asia/Shanghai` time zone: ::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to include all 3 queries above, but in this time zone?

The same queries evaluated with `Asia/Shanghai` session time zone return timestamps shifted by 8 hours.

SELECT ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. See 15e85ed

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

@mbasmanova mbasmanova added the ready-to-merge PR that have been reviewed and are ready for merging. PRs with this tag notify the Velox Meta oncall label May 22, 2024
@facebook-github-bot
Copy link
Contributor

@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@mbasmanova
Copy link
Contributor

@NEUpanning Would you rebase?

@NEUpanning
Copy link
Contributor Author

@mbasmanova I've rebased to main branch and squashed commits.

@facebook-github-bot
Copy link
Contributor

@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@mbasmanova merged this pull request in 47fc3dd.

Copy link

Conbench analyzed the 1 benchmark run on commit 47fc3dda.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

Joe-Abraham pushed a commit to Joe-Abraham/velox that referenced this pull request Jun 7, 2024
Summary:
Allow integer overflow when converting timestamp to TimePoint.

Use in Spark function from_unixtime that allows overflow.

Fixes facebookincubator#9778

Pull Request resolved: facebookincubator#9836

Reviewed By: amitkdutta

Differential Revision: D57659010

Pulled By: mbasmanova

fbshipit-source-id: d60fd686c19dff02eebba1a0c0410056d189c6f5
Joe-Abraham pushed a commit to Joe-Abraham/velox that referenced this pull request Jun 7, 2024
Summary:
Allow integer overflow when converting timestamp to TimePoint.

Use in Spark function from_unixtime that allows overflow.

Fixes facebookincubator#9778

Pull Request resolved: facebookincubator#9836

Reviewed By: amitkdutta

Differential Revision: D57659010

Pulled By: mbasmanova

fbshipit-source-id: d60fd686c19dff02eebba1a0c0410056d189c6f5
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged ready-to-merge PR that have been reviewed and are ready for merging. PRs with this tag notify the Velox Meta oncall
Projects
None yet
5 participants