-
Notifications
You must be signed in to change notification settings - Fork 1
Drill Date Time Types
Drill provides a number of Date/Time types, each in various stages of completion.
The date/time types include:
Drill Type | Minor Type | SQL Type | Value Vector |
---|---|---|---|
DATE | DATE | DATE | DateVector |
TIME | TIME | TIME | TimeVector |
N/A | TIMETZ | N/A | |
N/A | TIMESTAMPTZ | N/A | |
N/A | TIMESTAMP | TIMESTAMP | TimeStampVector |
INTERVAL FROM ... TO ... | INTERVAL | N/A | IntervalVector |
N/A | INTERVALYEAR | N/A | IntervalYearVector |
N/A | INTERVALDAY | N/A | IntervalDayVector |
Documentation information:
Documentation: Drill stores values in Coordinated Universal Time (UTC). Drill supports time functions in the range 1971 to 2037.
Minor Type: DATE
Vector class: DateVector
Width: 8 bytes
Java type: long
Drill SQL constant: date '2017-03-14'
Encoding: Number of ms since 1970-01-01T00:00:00 in the server time zone
Dates are assumed to be in the server time zone. For example CURRENT_DATE
returns today, midnight as the number of ms since 1970-01-01T00:00:00 in the server time zone. Presumably, if time conversion functions convert date strings to DATE columns, the values are assumed to be in server local time.
As long as code does not compare values to CURRENT_DATE
, the values can be assumed to be in any time zone, allowing Drill to represent relative dates: 2017-03-17 as a concept rather than as a point in time.
Although the documentation says that DATE is stored in UTC, that is only true if the server time zone is UTC, else DATE is stored in the server local time zone. Also, as with TIMESTAMP, the actual range is 1903-2037.
Documentation: 24-hour based time before or after January 1, 2001 in hours, minutes, seconds. Format: HH:mm:ss Drill stores values in Coordinated Universal Time (UTC). Drill supports time functions in the range 1971 to 2037.
Minor Type: TIME
Vector class: TimeVector
Width: 4 bytes
Java type: long
Drill time is an absolute time: it is measured in UTC. This means Drill cannot represent the idea of "let's meet at 2 PM" independent of "2 PM, PST" or "2 PM PDT". Further, since Time include seconds since 2001, it is also a date.
Minor Type: TIMETZ
Defined in MinorTypes
but never implemented.
Documentation: JDBC timestamp in year, month, date hour, minute, second, and optional milliseconds format: yyyy-MM-dd HH:mm:ss.SSS Drill stores values in Coordinated Universal Time (UTC). Drill supports time functions in the range 1971 to 2037. Drill does not support TIMESTAMP with time zone
Minor Type: TIMESTAMP
Vector class: TimeStampVector
Width: 8 bytes
Java type: long
Representation: Number of milliseconds after the Unix epoch, 1970-01-01T00:00:00 server local time.
JDBC: getTimestamp()
with a time zone that depends on the difference between the client and server time zones.
Although the documentation implies that TIMESTAMP is UTC, it is, in fact, in server local time.
Although the documentation says the range is 1971-2037, the range is actually 1903-2037 (the full range of a signed long.)
The NOW()
function returns the current time in server local time,
The JDBC driver tries to convert the value to UTC but it does so by converting the server local time to UTC using the client time zone. This works only if the client and server are in the same time zone. Otherwise the returned time is not UTC, but UTC + (client offset - server offset)
. This is deemed to be a feature.
It seems that JDBC users can learn the server time zone (via a means other than Drill), then pass that to an alternative JDBC method:
Timestamp realUtc = resultSet.getTimestamp(serverTimeZone);
Common advice is to run both the client and servers in the UTC time zone. Then, a TIMESTAMP is UTC (because the server time zone is UTC), and the client will convert from server time to UTC correctly.
See [this note|http://drill.apache.org/docs/data-type-conversion/#time-zone-limitation] and [this blog entry|http://www.openkb.info/2015/05/understanding-drills-timestamp-and.html#.VUzhotpVhHw] explain the time zone issues, and their workarounds, in detail.
Minor Type: TIMESTAMPTZ
Defined in MinorTypes
but never implemented.
Documentation: A day-time or year-month interval and Internally, INTERVAL is represented as INTERVALDAY or INTERVALYEAR. Also: The INTERVALYEAR and INTERVALDAY internal types represent a period of time. The INTERVALYEAR type specifies values from a year to a month. The INTERVALDAY type specifies values from a day to seconds.
Minor Type: INTERVAL
Vector class: IntervalVector
Width: 16 bytes (Int32 + Int32 + Int32 : Month + Days + Milliseconds)
Java Type: Joda Period
SQL Syntax: INTERVAL '1' YEAR, INTERVAL '2' DAY
Minor Type: INTERVALYEAR
Vector class: IntervalYearVector
Width: 4 bytes
Minor Type: INTERVALDAY
Vector class: IntervalDayVector
Width: 12 bytes
From this source:
Type | SQL | MySQL | SQL Server |
---|---|---|---|
DATE | Stores year, month, and day values | Format: YYYY-MM-DD. Supported range is from '1000-01-01' to '9999-12-31' | From January 1, 0001 to December 31, 9999 |
TIME | Stores hour, minute, and second values | A time. Format: HH:MI:SS | Time only |
DATETIME | A date and time combination. Format: YYYY-MM-DD HH:MI:SS | From January 1, 1753 to December 31, 9999 | |
DATETIMEOFFSET | The same as DATETIME with the addition of a time zone offset | ||
TIMESTAMP | Stores year, month, day, hour, minute, and second values | Stored as the number of seconds since the Unix epoch ('1970-01-01 00:00:00' UTC). | Internal version. |
INTERVAL | Composed of a number of integer fields, representing a period of time, depending on the type of interval |
The historical format for dates and times was a simple value: 1/2/1950 or 3 PM. These are "implied" times: values that can be understood only to a local (or implied) time zone. Knowing if "3 PM" is one time zone or another is specified outside of the date value itself: "3 PM my time" or "3 PM in New York."
Later, with increasing international communication, it became important to establish an absolute time, given by UTC: 2001-03-04T10:20:30Z is a date and time relative to Zulu (UTC). In Unix, all times are milliseconds since a specific point in UTC, 1970-01-01T00:00:00. This is called a timestamp and is identical across all Unix/Linux systems regardless of their physical location.
UTC can then be converted to a local time: a date and time with a numeric or named time-zone offset: 2017-03-15T16:47:30PDT.
Local and UTC times can be readily converted, often with the use of a time library that is aware of the politically-defined time zones, daylight savings times and so on.
On the other hand, implied dates and times cannot be converted: we don't know if "3 PM" is in one time zone or another. Further, we don't know if it was when daylight savings time was in effect or not. Similarly, "3/4/15" means a particular date wherever you happen to be: it is not tied to a time zone.