Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extract from interval type failed #6327

Closed
comphead opened this issue May 10, 2023 · 2 comments · Fixed by #12514
Closed

Extract from interval type failed #6327

comphead opened this issue May 10, 2023 · 2 comments · Fixed by #12514
Labels
bug Something isn't working

Comments

@comphead
Copy link
Contributor

Describe the bug

Extract from interval type failed in DF, but works in Postgres

To Reproduce

select extract( year from interval '5' year);
Error during planning: Coercion from [Utf8, Interval(MonthDayNano)] to the signature OneOf([Exact([Utf8, Date32]), Exact([Utf8, Date64]), Exact([Utf8, Timestamp(Second, None)]), Exact([Utf8, Timestamp(Microsecond, None)]), Exact([Utf8, Timestamp(Millisecond, None)]), Exact([Utf8, Timestamp(Nanosecond, None)]), Exact([Utf8, Timestamp(Nanosecond, Some("+00:00"))])]) failed.

Expected behavior

Query should work

Additional context

Found when reviewing #6312
Possibly related to #5806

@comphead comphead added the bug Something isn't working label May 10, 2023
@Jefffrey
Copy link
Contributor

Jefffrey commented Dec 31, 2023

Here's some example output from postgres:

postgres=# select extract(year from '111 year 111 month 111 day 111 hour 111 minute 111 second'::interval);
 extract
---------
     120
(1 row)

postgres=# select extract(month from '111 year 111 month 111 day 111 hour 111 minute 111 second'::interval);
 extract
---------
       3
(1 row)

postgres=# select extract(day from '111 year 111 month 111 day 111 hour 111 minute 111 second'::interval);
 extract
---------
     111
(1 row)

postgres=# select extract(hour from '111 year 111 month 111 day 111 hour 111 minute 111 second'::interval);
 extract
---------
     112
(1 row)

postgres=# select extract(minute from '111 year 111 month 111 day 111 hour 111 minute 111 second'::interval);
 extract
---------
      52
(1 row)

postgres=# select extract(second from '111 year 111 month 111 day 111 hour 111 minute 111 second'::interval);
  extract
-----------
 51.000000
(1 row)

Note how month contributes to year when it overflows, but day doesn't do the same for month

And how second and minute can overflow as well, but hour doesn't overflow into day.

Duckdb shows the same results:

D select extract(year from '111 year 111 month 111 day 111 hour 111 minute 111 second'::interval);
┌───────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ main.date_part('year', CAST('111 year 111 month 111 day 111 hour 111 minute 111 second' AS INTERVAL)) │
│                                                 int64                                                 │
├───────────────────────────────────────────────────────────────────────────────────────────────────────┤
│                                                                                                   120 │
└───────────────────────────────────────────────────────────────────────────────────────────────────────┘
D select extract(month from '111 year 111 month 111 day 111 hour 111 minute 111 second'::interval);
┌────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ main.date_part('month', CAST('111 year 111 month 111 day 111 hour 111 minute 111 second' AS INTERVAL)) │
│                                                 int64                                                  │
├────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│                                                                                                      3 │
└────────────────────────────────────────────────────────────────────────────────────────────────────────┘
D select extract(day from '111 year 111 month 111 day 111 hour 111 minute 111 second'::interval);
┌──────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ main.date_part('day', CAST('111 year 111 month 111 day 111 hour 111 minute 111 second' AS INTERVAL)) │
│                                                int64                                                 │
├──────────────────────────────────────────────────────────────────────────────────────────────────────┤
│                                                                                                  111 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────┘
D select extract(hour from '111 year 111 month 111 day 111 hour 111 minute 111 second'::interval);
┌───────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ main.date_part('hour', CAST('111 year 111 month 111 day 111 hour 111 minute 111 second' AS INTERVAL)) │
│                                                 int64                                                 │
├───────────────────────────────────────────────────────────────────────────────────────────────────────┤
│                                                                                                   112 │
└───────────────────────────────────────────────────────────────────────────────────────────────────────┘
D select extract(minute from '111 year 111 month 111 day 111 hour 111 minute 111 second'::interval);
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ main.date_part('minute', CAST('111 year 111 month 111 day 111 hour 111 minute 111 second' AS INTERVAL)) │
│                                                  int64                                                  │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│                                                                                                      52 │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────┘
D select extract(second from '111 year 111 month 111 day 111 hour 111 minute 111 second'::interval);
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ main.date_part('second', CAST('111 year 111 month 111 day 111 hour 111 minute 111 second' AS INTERVAL)) │
│                                                  int64                                                  │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│                                                                                                      51 │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Can use these as guidance for implementation for Datafusion, but keeping in mind there are three different interval types to support:

  • YearMonth - stores number of months
    • probably should only be able to extract century, decade, year, month where month overflows into year as necessary
  • DayTime - stores number of days and number of milliseconds
  • MonthDayNano - stores number of months, number of days, and number of nanoseconds
    • probably will be the closest to behaviour with postgres/duckdb as it stores the most base units

References

https://duckdb.org/docs/sql/functions/datepart

https://duckdb.org/docs/sql/data_types/interval.html

https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT

@nrc
Copy link
Contributor

nrc commented Aug 1, 2024

The work in Arrow has landed and should be in 53, then we can progress #11501 to close this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
3 participants