Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Map MySQL TIMESTAMP types to TIMESTAMP WITH TIME ZONE #18470

Merged
merged 2 commits into from
Sep 13, 2023

Conversation

adamjshook
Copy link
Member

@adamjshook adamjshook commented Jul 31, 2023

Description

This PR changes the MySQL type mappings for MySQL's TIMESTAMP type to Trino's TIMESTAMP WITH TIME ZONE type. Note that MySQL DATETIME types are read as TIMESTAMP types and Trino TIMESTAMP types are inserted into MySQL as DATETIME types -- this behavior is unchanged.

In other words, for reads and writes:

MySQL Type Trino Type
DATETIME TIMESTAMP
TIMESTAMP TIMESTAMP WITH TIME ZONE

Additional context and related issues

Fixes #15668

  • Changed read and write mappings for MySQL TIMESTAMP type to map to Trino TIMESTAMP WITH TIME ZONE
    • The implementations were taken from the PostgreSQL connector -- I am open to any suggestions on implementation details here
  • Added override of getColumnDefinitionSql to explicitly declare a column as NULL to satisfy an Invalid default value failure in the data mapping smoke tests when creating a table using a MySQL TIMESTAMP type
  • Added a separate test class for the time/date/timestamp/timestamp with time zone types where the MySQL container's time zone is set to Pacific/Apia (and therefore the time zone of the MySQL server)

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(X) Release notes are required, with the following suggested text:

# MySQL Connector
* Changes MySQL TIMESTAMP types to map to TIMESTAMP WITH TIME ZONE

@cla-bot cla-bot bot added the cla-signed label Jul 31, 2023
@adamjshook adamjshook requested a review from hashhar July 31, 2023 22:45
@adamjshook adamjshook force-pushed the adamjshook/mysql-timestamps branch from 411146f to 062a6ed Compare August 1, 2023 18:15
Comment on lines 653 to 654
return new LongWriteFunction()
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unrelated?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, formatter did its thing and I missed it on a rebase. I'll revert.

/**
* Default MySQL time zone is set to UTC. This is to test the date and time type mappings when the server has a different time zone.
*/
public class TestMySqlTimeMappingsWithServerTimeZone
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a way to combine this with existing test class anyway since the server zone needs to be changed and if done on the fly it might lead to test isolation issues.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wendigo @dominikzalewski look at this specifically - I couldn't think of solutions but maybe there are.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hashhar, is this a problem? Usually in other projects I was participating in, we tended to keep line numbers on 3 digits. It wasn't an error to have more than one test class for a feature.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The concern is that if you add additional cases to the other class how do you ensure it gets tested in this class as well?

A simple solution is adding a comment on the original class.

I'm just wondering if someone has other ideas. If no ideas then what we have right now is fine.

Comment on lines +875 to +936
// after epoch (MySQL's timestamp type doesn't support values <= epoch)
.addRoundTrip("timestamp(3) WITH TIME ZONE", "TIMESTAMP '2019-03-18 10:01:17.987 %s'".formatted(sessionZone), createTimestampWithTimeZoneType(3), "TIMESTAMP '2019-03-18 10:01:17.987 UTC'")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a separate test method verifying that the failure happens as expected (e..g it's a good user facing error).

.addRoundTrip("timestamp(1) WITH TIME ZONE", "TIMESTAMP '2020-09-27 12:34:56.9 %s'".formatted(sessionZone), createTimestampWithTimeZoneType(1), "TIMESTAMP '2020-09-27 12:34:56.9 UTC'")
.addRoundTrip("timestamp(3) WITH TIME ZONE", "TIMESTAMP '2020-09-27 12:34:56.123 %s'".formatted(sessionZone), createTimestampWithTimeZoneType(3), "TIMESTAMP '2020-09-27 12:34:56.123 UTC'")
.addRoundTrip("timestamp(3) WITH TIME ZONE", "TIMESTAMP '2020-09-27 12:34:56.999 %s'".formatted(sessionZone), createTimestampWithTimeZoneType(3), "TIMESTAMP '2020-09-27 12:34:56.999 UTC'")
.addRoundTrip("timestamp(6) WITH TIME ZONE", "TIMESTAMP '2020-09-27 12:34:56.123456 %s'".formatted(sessionZone), createTimestampWithTimeZoneType(6), "TIMESTAMP '2020-09-27 12:34:56.123456 UTC'")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also test values where rounding needs to be performed - important to test since write path is different for CTAS and INSERT.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added testTimestampWithTimeZoneCoercion, a copy of testTimestampCoercion but slightly modified to to use values that are within range of MySQL TIMESTAMP

@adamjshook adamjshook force-pushed the adamjshook/mysql-timestamps branch from 062a6ed to 90fe963 Compare August 3, 2023 13:57
@adamjshook
Copy link
Member Author

I added a test case to check the unsupported timestamptz values. It passes just fine when the server time zone is set to UTC, but when I change it to Pacific/Apia I don't get an error inserting 1970-01-01 00:00:00 UTC into MySQL, which makes me think something is wrong (and is why the CI is failing). Looking in that.

@adamjshook
Copy link
Member Author

Looking into the failing test case more.

I ran the test case with the MySQL TZ set to Pacific/Apia and set some breakpoints.

Trino executes:

INSERT INTO tpch.test_unsupported_timestamptr94g5tht8 VALUES (TIMESTAMP '1970-01-01 00:00:00 UTC')

When sent to the JDBC client, the new Timestamp value is set to 0. This query succeeds when I expect it to fail.

On the server side, running the mysql CLI inside the Docker container:

bash-4.4# echo $TZ     
Pacific/Apia

mysql> SELECT @@global.time_zone, @@session.time_zone;
+--------------------+---------------------+
| @@global.time_zone | @@session.time_zone |
+--------------------+---------------------+
| SYSTEM             | SYSTEM              |
+--------------------+---------------------+
mysql> select * from test_unsupported_timestamptr94g5tht8;
+---------------------+
| data                |
+---------------------+
| 1970-01-01 00:00:00 |
+---------------------+
ysql> insert into test_unsupported_timestamptr94g5tht8 values ('1970-01-01 00:00:00');
Query OK, 1 row affected (0.03 sec)
mysql> insert into test_unsupported_timestamptr94g5tht8 values ('1969-12-31 13:00:00');
ERROR 1292 (22007): Incorrect datetime value: '1969-12-31 13:00:00' for column 'data' at row 1
mysql> insert into test_unsupported_timestamptr94g5tht8 values ('1969-12-31 13:00:01');
Query OK, 1 row affected (0.03 sec)

MySQL seems to be acting appropriately and rejecting the values after converting the given values from the session time zone to UTC. This seems like possibly a JDBC driver issue (or how we are using the JDBC driver).

@hashhar
Copy link
Member

hashhar commented Aug 23, 2023

well, let's add validation like the ClickHouse connector does within the connector code itself. We can follow-up to see if we're using the driver wrong or it's just a driver bug.

@adamjshook adamjshook force-pushed the adamjshook/mysql-timestamps branch from 90fe963 to 1996d18 Compare August 23, 2023 18:15
@hashhar hashhar requested review from wendigo and ebyhr August 24, 2023 04:53
@hashhar
Copy link
Member

hashhar commented Aug 24, 2023

FYI @findepi @anusudarsan since you were interested in this.

public class TrinoToMySqlWriteChecker<T>
{
// The range for TIMESTAMP values is '1970-01-01 00:00:01.000000' to '2038-01-19 03:14:07.499999'
public static final TrinoToMySqlWriteChecker<Instant> TIMESTAMP = new TrinoToMySqlWriteChecker<>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the name 'TIMESTAMP' something that has been overlooked, or a thought through variable name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also seem to be using guice modules. Wouldn't it be cleaner if we configured a binding 'toInstance' in guice and injected it, rather than rely on statically initialized fields?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is thought through - see the point of usage - TIMESTAMP.validate(value).

Note that this class is "derived" from https://github.com/trinodb/trino/blob/5d8824edf9101eab451128bbad53d8c6ce930ad4/plugin/trino-clickhouse/src/main/java/io/trino/plugin/clickhouse/TrinoToClickHouseWriteChecker.java.

And this class should go away if/when the MySQL JDBC driver bug gets fixed.

Copy link
Member

@hashhar hashhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adamjshook Does mysql.md (the docs) need to be updated as well to mention the new mapping?

{
// The range for TIMESTAMP values is '1970-01-01 00:00:01.000000' to '2038-01-19 03:14:07.499999'
public static final TrinoToMySqlWriteChecker<Instant> TIMESTAMP = new TrinoToMySqlWriteChecker<>(
new TimestampWriteValueChecker(new Range<>(Instant.parse("1970-01-01T00:00:01.000000Z"), Instant.parse("2038-01-19T03:14:07.499999Z"))));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not 1970-01-01T00:00:00.000000Z?

https://en.wikipedia.org/wiki/Unix_time

Unix time is currently defined as the number of seconds which have passed since 00:00:00 UTC on Thursday, 1 January 1970

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, this should proably be 1970-01-01T00:00:00.000001Z - i.e. epoch + min delta

Copy link
Member Author

@adamjshook adamjshook Aug 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the MySQL documentation:

The TIMESTAMP data type is used for values that contain both date and time parts. TIMESTAMP has a range of '1970-01-01 00:00:01' UTC to '2038-01-19 03:14:07' UTC.

mysql> insert into test_unsupported_timestamp values ('1970-01-01 00:00:00');
ERROR 1292 (22007): Incorrect datetime value: '1970-01-01 00:00:00' for column 'data' at row 1

mysql> insert into test_unsupported_timestamp values ('1970-01-01 00:00:01');
Query OK, 1 row affected (0.01 sec)

mysql> select * from test_unsupported_timestamp;
+---------------------+
| data                |
+---------------------+
| 1970-01-01 00:00:01 |
+---------------------+

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether this is a problem or not, but I'll mention it 'just in case'.
Back a few years ago I was working with a JSON interface of an HTTP endpoint. I was surprised to see that I could convert a negative long passed in JSON into Javascript date to represent dates before epoch. We used that for birth dates just like described here: https://www.epochconverter.com/programming/mysql (paragraph 'Negative epochs').

Now, as the article reads, mysql has a problem representing timestamps before epoch. However, we have a 'converter' at hand that is translating trino datatype to particular database datatype. And that trino datatype just might support negative values. Currently the test throws an exception if the timestamp is out of range, for instance if it's before epoch. I'm just wondering whether that's the behaviour we want to have. Maybe we somehow want to emulate negatives (ie. before epoch) timestamps.

Treat it as a general comment that can be ignored.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. That makes such "emulated" values only readable by Trino. It's a form of vendor lock in.

All connectors have tests (Test*TypeMapping) which ensure that values written by trino round trip when read directly from the db and the other way around as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adamjshook please add a code comment about the min value to make it more obvious

@Override
public void validate(Instant value)
{
if (value.isBefore(range.getMin()) || value.isAfter(range.getMax())) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other places in the project (like io.trino.plugin.jdbc.JdbcMetadataConfig), I see we seem to be using a validation framework, for instance jakarta.validation.constraints.Min. Is there a reason (performance?) we cannot use it here as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

performance, jakarta validation does too much work. It's more useful for user-facing validation than runtime checks.

}
}

private static class Range<T>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a range class in apache commons:

https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/Range.html

That already has most of the logic introduced here. I see that we have commons in the master pom.xml:

        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-lang3</artifactId>
            <version>3.13.0</version>
        </dependency>

Is there a reason why we cannot use it in plugins?

import static java.lang.String.format;
import static java.util.Objects.requireNonNull;

public class TrinoToMySqlWriteChecker<T>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I understand the complexity introduced with these abstractions. The only two production usages of the validate() method I can see is that we pass an Instance and the code checks under the covers if it's within epoch's min and max values. Do we really need to introduce private interface, private class, another private class for Range in order to do that?

It seems that the epoch min and epoch max are hardcoded anyway. Isn't it the case that the only thing we need is a

private static boolean validate(Instant i) {
   return i >= Instant.parse("1970-01-01T00:00:00.000000Z") && Instant.parse("2038-01-19T03:14:07.499999Z")
}

(or something like above that compiles or throws an exception instead of returning a boolean) in MySqlClient class?

I would understand wrapping into classes, if the code was unit-tested. But it does not look like there's a unit test for this class?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, we only have the TIMESTAMP validator here. ClickHouse on the other hand (where this class originates from) has 6 validators, each of them having different "rules" depending on the ClickHouse version we are connecting to.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed this could be simplified into a static validateMySqlTimestampValue function in MySqlClient.

throw new TrinoException(NOT_SUPPORTED, "This connector does not support creating tables with column comment");
}

return "%s %s %s" .formatted(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the space after the string here is intentional?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, for some reason my IntelliJ formatter put that there. I'll fix it.

@@ -469,6 +501,13 @@ public Optional<ColumnMapping> toColumnMapping(ConnectorSession session, Connect
return Optional.of(jsonColumnMapping());
case "enum":
return Optional.of(defaultVarcharColumnMapping(typeHandle.getRequiredColumnSize(), false));
case "datetime":
TimestampType timestampType = createTimestampType(getTimestampPrecision(typeHandle.getRequiredColumnSize()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

closing this logic into a separate private method that would be named like 'mysqlDateTimeToTrinoTimestamp' would make it more explicit for the reader that this is what's going on here.

timestampType,
mySqlTimestampReadFunction(timestampType),
timestampWriteFunction(timestampType)));
TimestampWithTimeZoneType trinoType = createTimestampWithTimeZoneType(getTimestampPrecision(typeHandle.getRequiredColumnSize()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly here 'mysqlTimestampToTrinoTimestampWithTZ'

OffsetDateTime offsetDateTime = resultSet.getObject(columnIndex, OffsetDateTime.class);
return LongTimestampWithTimeZone.fromEpochSecondsAndFraction(
offsetDateTime.toEpochSecond(),
(long) offsetDateTime.getNano() * PICOSECONDS_PER_NANOSECOND,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for super-easy reading maybe wrap this computation into a private method called 'nanos2pico'?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The read and write functions are modeled after a few existing functions. They could be put in one place and shared but that feels out of scope of this change and could be resolved as a separate issue.

@@ -592,6 +636,50 @@ private static ColumnMapping mySqlCharColumnMapping(CharType charType, Optional<
return ColumnMapping.sliceMapping(charType, charReadFunction(charType), charWriteFunction(), pushdownController);
}

private static LongReadFunction shortTimestampWithTimeZoneReadFunction()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether the 'short' and 'long' prefix in the following 4 methods is not misleading. The reason being that the returned type has sometimes 'Long' in the name, so I had to read it 4 times and browse deep into the code to understand why is that. Maybe there is a better wording like for instance 'simpleTimestamp'/'complexTimestamp' or 'regularTimestamp'/'extendedTimestamp'?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #18470 (comment) -- similar answer for why they are named the way they are.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason is even more than that. The trino classes which implement the type are called ShortTimestampType and LongTimestampType and that's where the name comes from.

return ObjectWriteFunction.of(
LongTimestampWithTimeZone.class,
(statement, index, value) -> {
long epochSeconds = floorDiv(value.getEpochMillis(), MILLISECONDS_PER_SECOND);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would wrap it into a private method called 'epochMillis2Seconds'
and the below into 'timestamp2nanosDelta'.

Then I'd inline it:

Instant instantValue = Instant.ofEpochSecond(epochMillis2Seconds(value), timestamp2nanosDelta(value))

@adamjshook
Copy link
Member Author

@adamjshook Does mysql.md (the docs) need to be updated as well to mention the new mapping?

Yes! I'll add that as a separate commit when making other updates to this PR and we'll need a note in the release notes as well regarding the change.

@Test(dataProvider = "sessionZonesDataProvider")
public void testDate(ZoneId sessionZone)
{
Session session = Session.builder(getSession())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this snippet is repeated in pratically every test. Do you think there's value in promoting it to an instance variable and initializing in a setup() method?

@dominikzalewski
Copy link
Member

In the PR description it reads:

Note that MySQL DATETIME types are read as TIMESTAMP types and Trino TIMESTAMP types are inserted into MySQL as DATETIME types -- this behavior is unchanged.

Due to the size of the change (on test side), it's hard for me to verify whether this condition is met, or whether we need the backward compatibility test for that, etc. Having written that I just want to make sure that this has been thought through, even if my ask is 'doubled effort' since this has already been addressed.

@adamjshook
Copy link
Member Author

I'm trying to dig in to why we need this checker in the first place. I find it a bit unsettling that you insert a java.sql.Timestamp initialized with UTC milliseconds and the MySQL server feels like it needs to convert it to the server time zone. Particularly when we explicitly set connectionTimeZone=UTC in the MySqlClientModule.

// Trino
TIMESTAMP '2019-03-18 10:01:17.987 UTC

// MySQL
mysql> SET time_zone = 'UTC';
mysql> select * from test_timestamp_with_time_zone613fa1muom;
+-------------------------+
| col_0                   |
+-------------------------+
| 2019-03-17 20:01:17.987 |
+-------------------------+


SqlDataTypeTest.create()
.addRoundTrip("date", "DATE '0001-01-01'", DATE, "DATE '0001-01-01'")
.addRoundTrip("date", "DATE '1582-10-04'", DATE, "DATE '1582-10-04'") // before julian->gregorian switch
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it would be beneficial to have methods named 'beforeJulianGregorianSwitch' in SqlDataTypeTest class. This was the code would document itself, would be more readable and maintainable (as comment would be unnecessary)

}

@Test(dataProvider = "sessionZonesDataProvider")
public void testTimeFromMySql(ZoneId sessionZone)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feels like this test should be named 'testPrecision'?


SqlDataTypeTest.create()
// default precision in MySQL is 0
.addRoundTrip("TIME", "TIME '00:00:00'", createTimeType(0), "TIME '00:00:00'")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These literals seem to be repeated throughout the test. Do you think there's value extracting them to a constant and reusing?

@ebyhr ebyhr removed their request for review August 30, 2023 02:10
@adamjshook
Copy link
Member Author

From here solution 2b seems to be the one that is working best. Setting the below properties results in the value stored in MySQL to accurately reflect the value inserted via Trino. I've removed code that validates the timestamptz values since MySQL is returning an expected error now.

connectionProperties.setProperty("connectionTimeZone", "LOCAL");
connectionProperties.setProperty("forceConnectionTimeZoneToSession", "true");

// Trino
TIMESTAMP '2019-03-18 10:01:17.987 UTC

mysql> set time_zone = utc;
mysql> select * from test_timestamp_with_time_zone;
+-------------------------+
| col_0                   |
+-------------------------+
| 2019-03-18 10:01:17.987 |
+-------------------------+

One thing to note is the TestMySqlTimeMappingsWithServerTimeZone#testTimestampFromMySql test case where MySQL TIMESTAMP values are inserted with the MySQL JDBC driver. The time zones of these are set to the server time zone, Pacific/Apia, and then read back in UTC as expected. I think this behavior is fine and expected, but I wanted to call it out.

@adamjshook adamjshook force-pushed the adamjshook/mysql-timestamps branch from 1996d18 to 8cd1e31 Compare September 2, 2023 17:43
@adamjshook adamjshook requested a review from hashhar September 2, 2023 17:43
@github-actions github-actions bot added the docs label Sep 2, 2023
Copy link
Member

@hashhar hashhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ebyhr / @findepi / @anusudarsan can one of you also please take a look?

@@ -208,6 +225,10 @@ public class MySqlClient
// MySQL driver returns width of time types instead of precision, same as the above timestamp type.
private static final int ZERO_PRECISION_TIME_COLUMN_SIZE = 8;

// MySQL TIMESTAMP has a range of '1970-01-01 00:00:01' UTC to '2038-01-19 03:14:07' UTC
private static final Instant MYSQL_TIMESTAMP_MIN_VALUE = Instant.parse("1970-01-01T00:00:01.000000Z");
private static final Instant MYSQL_TIMESTAMP_MAX_VALUE = Instant.parse("2038-01-19T03:14:07.499999Z");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that .49999 is confusing me, can you explain me why? It's > 03:14:07 UTC.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following paragraph from here (emphasis mine) refers to the fractional seconds. Kind of misleading as the previous paragraph says the maximum is '2038-01-19 03:14:07' UTC

A DATETIME or TIMESTAMP value can include a trailing fractional seconds part in up to microseconds (6 digits) precision. In particular, any fractional part in a value inserted into a DATETIME or TIMESTAMP column is stored rather than discarded. With the fractional part included, the format for these values is 'YYYY-MM-DD hh:mm:ss[.fraction]', the range for DATETIME values is '1000-01-01 00:00:00.000000' to '9999-12-31 23:59:59.499999', and the range for TIMESTAMP values is '1970-01-01 00:00:01.000000' to '2038-01-19 03:14:07.499999'. The fractional part should always be separated from the rest of the time by a decimal point; no other fractional seconds delimiter is recognized.

// TODO (https://github.com/trinodb/trino/issues/15668) rethink how timestamps are mapped. Also, probably worth adding tests
// with MySQL server with a non-UTC system zone.
connectionProperties.setProperty("connectionTimeZone", "UTC");
// Force the connection time zone of that to the server to preserve time zones
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds opposite of what MySQL docs say:

according to docs: connectionTimeZone = LOCAL means that JDBC driver uses the JVM zone as the session zone.
forceConnectionTimeZoneToSession = true means that the server side connection zone is changed to match local JVM zone.

Now in addition to this I started thinking about whether forceConnectionTimeZoneToSession = true can lead to failure if the JVM zone doesn't exist on the server. This probably explains the existing comment about without relying on server time zone (which may be configured to be totally unusable)..

Copy link
Member

@hashhar hashhar Sep 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see #15670 as an example of an issue I'm thinking about but in reverse direction.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. Comment is now backwards; I didn't update it from a previous attempt to get this to work. I'll update it to reflect what is in the documentation.

@adamjshook adamjshook force-pushed the adamjshook/mysql-timestamps branch from 8cd1e31 to 41cef73 Compare September 12, 2023 13:32
Copy link
Member

@hashhar hashhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

We can address any post-merge comments from @anusudarsan/@findepi if they take a look.

@hashhar hashhar merged commit fd5b247 into trinodb:master Sep 13, 2023
@github-actions github-actions bot added this to the 427 milestone Sep 13, 2023
@adamjshook adamjshook deleted the adamjshook/mysql-timestamps branch October 2, 2023 13:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

Potential incorrect timestamp mapping in MySQL connector
3 participants