Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump Coral dependency version to 2.0.55 #11159

Closed
wants to merge 1 commit into from

Conversation

findinpath
Copy link
Contributor

Description

This is a slightly modified copy of the PR #10226 which is supposed to fix #6450

Is this change a fix, improvement, new feature, refactoring, or other?
This change is a fix.

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

This change affects solely the Hive connector.

How would you describe this change to a non-technical end user or system administrator?

This is an update of the Coral LinkedIn library used for translating views.
Check out https://trino.io/episodes/18.html to get started.

Enable the translation of the views in hive.properties by adding:

hive.translate-hive-views=true

Related issues, pull requests, and links

Documentation

(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

(x) No release notes entries required.
( ) Release notes entries required with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@@ -251,6 +256,12 @@
</dependency>

<!-- used by tests but also needed transitively -->
<dependency>
<groupId>io.trino</groupId>
<artifactId>trino-parser</artifactId>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the trino-parser was supposed to be shaded, per @shenodaguirguis's #10226 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then we're back at square one, per #10226 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://mvnrepository.com/artifact/com.linkedin.coral/coral-trino-parser/2.0.53

There is a shaded version of the coral-trino-parser library.
However, this is unfortunately empty - containing no files at all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@findinpath findinpath Feb 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked up what trino-parser is being used for within coral and actually this dependency is (at the moment) irrelevant for Trino in the context of coral library usage.

The Trino dependency is apparently used in the package com.linkedin.coral.trino.trino2rel https://github.com/linkedin/coral/tree/master/coral-trino/src/main/java/com/linkedin/coral/trino/trino2rel which uses Trino statements as input, but is irrelevant in the context of doing Hive translations.

Ideally there would be a split on the coral side:

  • one library containing rel2trino package (the original coral-presto/coral-trino functionality)
  • one library containing trino2rel package

Currently the coral team opted to package those two functionalities together under coral-trino.

I suggest we exclude coral-trino-parser dependency to be on the safe-side and go further with the update.

@findepi, @losipiuk , @martint WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

provided means it's on the user of library to provide the dependency,
doesn't mean the dependency is not needed
and doesn't mean the dependency can have breaking API changes without breaking the library itself
(see also #10226 (comment))

i think we're making circles. the current state doesn't seem any better than were we started.

because

  1. we believe trino-parser is not needed at all
  2. we also believe the coral-trino module will be fixed eventually, probably sharing trino-parser correctly

i am fine with merging an imperfect solution for now.
i think the best what we can do is exclude trino-parser dependency from the coral dependency:

  • trino-parser remains test-only dependency of trino-hive, and is not part of the plugin
  • if our assumptions are wrong, and our use of coral indeed requires (or starts to require) trino-parser, we will know about this in a very clear NoClassDefFound message, instead of some less predictable failure (including potential silent correctness issues)

Copy link

@wmoustafa wmoustafa Mar 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still not following why we cannot just remove the dependency on trino-parser (or keep it to test as it was the case previously).
coral-trino-parser-2.0.53.jar is empty -- we can look into not publishing this but this should be irrelevant.
coral-trino-parser-2.0.53-all.jar has the shaded jar and is not empty.
Coral internally uses coral-trino-parser-2.0.53-all.jar, so this should not conflict with anything on Trino DB side (hence this file should not be changed).

If you revert this change, and still run to the issue on local builds, most likely you need to clean up your local maven caches. We ran to this issue and cleaning local maven caches helps.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From mvn dependency:tree:

[INFO] +- com.linkedin.coral:coral-trino:jar:2.0.53:compile
[INFO] |  +- com.google.code.gson:gson:jar:2.8.1:compile
[INFO] |  +- com.linkedin.coral:coral-trino-parser:jar:2.0.53:compile

It very much looks like coral-trino.jar depends on coral-trino-parser-2.0.53.jar. Also this is what was pulled to my .m2/repository when I built trino from this PR. And the jar is empty.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ljfgem do you have the corresponding tree from our internal integration?

Copy link

@ljfgem ljfgem Mar 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@losipiuk @findinpath @findepi @wmoustafa
Yes, we have the similar corresponding tree from our internal integration.
But I tried 2 builds:

  1. Build without any config changes for trino-parser, just upgrade Coral to v2 and modify corresponding Java code
  2. Build with the trino-parser config changes (just applied this PR)

and both builds succeeded.

I am wondering which building issue we are facing, could you provide the stacktrace?

@findinpath
Copy link
Contributor Author

@losipiuk / @findepi

The library org.apache.calcite:calcite-core:jar:1.19.0 is a dependency of org.apache.pinot:pinot-common:jar:0.8.0

The library com.linkedin.calcite:calcite-core:jar:shaded:1.21.0.151 is a dependency of io.trino:trino-hive

Both of the libraries use the same base package name org.apache.calcite which may lead to eventual errors.
Is this a known situation/issue?

@losipiuk
Copy link
Member

@losipiuk / @findepi

The library org.apache.calcite:calcite-core:jar:1.19.0 is a dependency of org.apache.pinot:pinot-common:jar:0.8.0

The library com.linkedin.calcite:calcite-core:jar:shaded:1.21.0.151 is a dependency of io.trino:trino-hive

Both of the libraries use the same base package name org.apache.calcite which may lead to eventual errors. Is this a known situation/issue?

Do the clashing dependencies meet in one of the connectors? I would expect not as otherwise maven build would fail. We have a check for class clashes.

pom.xml Outdated
Comment on lines 1152 to 1158
<exclusions>
<!--
The trino-parser library is shaded and used internally within coral-trino library for
parsing Trino statements. However, this functionality is irrelevant in the context of
using coral within Trino and can be safely excluded from the dependencies of this library.
-->
<exclusion>
<groupId>com.linkedin.coral</groupId>
<artifactId>coral-trino-parser</artifactId>
</exclusion>
</exclusions>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to do this exclusion given trino-parser used in coral has been shaded.

@@ -251,6 +256,12 @@
</dependency>

<!-- used by tests but also needed transitively -->
<dependency>
<groupId>io.trino</groupId>
<artifactId>trino-parser</artifactId>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this upgrade, given trino-parser used in Coral-Trino has been shaded, I don't think we need to change any trino-parser configs in this file, plugin/trino-iceberg/pom.xml or pom.xml

@findinpath findinpath requested review from findepi and losipiuk March 1, 2022 05:02
@findinpath findinpath force-pushed the bump-coral-version branch 2 times, most recently from beb066e to a68f188 Compare March 1, 2022 12:20
Comment on lines 261 to 262
<artifactId>trino-parser</artifactId>
<scope>runtime</scope>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exclude this dependency. see #11159 (comment) for discussion

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[WARNING] Used undeclared dependencies found:
[WARNING]    io.trino:trino-parser:jar:372-SNAPSHOT:test
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  25.802 s
[INFO] Finished at: 2022-03-02T06:49:48+01:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-dependency-plugin:3.1.2:analyze-only (default) on project trino-hive: Dependency problems found -> [Help 1]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bummer. If we cannot solve this within trino-hive module, maybe we need an intermediate module to "strip" trino-parser dependency from coral. I am not maven expert, so i don't know how to best do this, but i can try.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should we exclude it? Before this patch, it was test scope, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was test scope, so not part of production build. Per #11159 (comment) it should remain as not part of build, because we cannot guarantee API compatibility of the trino-parser between versions.

@findinpath findinpath force-pushed the bump-coral-version branch from a68f188 to 770c430 Compare March 2, 2022 05:46
@findinpath findinpath force-pushed the bump-coral-version branch from 770c430 to 7e24729 Compare March 4, 2022 19:49
@findinpath findinpath changed the title Bump Coral dependency version to 2.0.53 Bump Coral dependency version to 2.0.55 Mar 4, 2022
@findinpath
Copy link
Contributor Author

Bump up coral to 2.0.55 to make use of linkedin/coral#234

<dependency>
<groupId>io.trino</groupId>
<artifactId>trino-parser</artifactId>
<scope>runtime</scope>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exclude instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The enforcer plugin would fail if I'd do the exclusion.

I've rebased on top of master and did an undo for the changes from this PR of adding trino-parser as runtime dependency. Note that trino-parser is on master marked at the moment as a test dependency for trino-iceberg project.

@findinpath findinpath force-pushed the bump-coral-version branch from 7e24729 to 50bdec2 Compare March 7, 2022 06:58
@findinpath
Copy link
Contributor Author

Rebased on top of master to make use of the latest changes in the pom.xml of the projects affected by this PR.

@findinpath findinpath requested a review from findepi March 7, 2022 07:00
pom.xml Outdated
Comment on lines 1143 to 1153
<!--
The trino-parser library is marked as provided dependency for
coral-trino-parser.
The trino-parser library is being used internally within
coral-trino library for parsing Trino statements.
However, this functionality is irrelevant in the context of
using coral for Hive view translations within Trino and can be
therefore safely excluded from the dependencies of this library.
-->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You explain why it feels safe-ish to do it, but we should also explain why we do it.

Dependency is excluded because trino-parser does not guarantee API or behavioral
backwards compatibility. We cannot provide version depended by Coral here, since
the current version is used in tests. To avoid breaking in weird way, or silent correctness
issues, the dependency is excluded, so that it is clear when someone tries to use the
non-backwards compatible classes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think the exclusion is needed anyways, but I will you decide if we should keep it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Co-authored-by: Marius Grama <findinpath@gmail.com>
@findinpath findinpath force-pushed the bump-coral-version branch from 50bdec2 to 443a33b Compare March 8, 2022 11:43
@findinpath findinpath requested a review from findepi March 8, 2022 11:43
@findepi
Copy link
Member

findepi commented Mar 8, 2022

Merged as d8f7ac8, fixed a typo.

@findepi findepi closed this Mar 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

Support Hive VIEW with CAST to decimal
6 participants