HIVE-28028: Remove duplicated proto reader/writer classes introduced in HIVE-19288 #5079
Thanks for the review @okumin.
@abstractdog, can you please help review this?
@@ -116,6 +116,32 @@
</exclusion>
</exclusions>
</dependency>
<dependency>
this dependency seems to be used in itests/hive-unit and ql too, questions:
- would it make sense to define this in root pom's dependencyManagement with exclusions and just refer to that here?
- the exclusions are different in the two places, is that intentional?
1.) I followed the same pattern used for other Tez dependencies like tez-dag: the parent pom.xml only declares the dependency, and ql/pom.xml adds the exclusions. The same goes for the itests module and its submodules.
2.) Every Tez dependency excludes some Hadoop dependencies, but the exclusions are not the same across submodules. For example, https://github.com/apache/hive/blob/master/itests/qtest/pom.xml#L392 and https://github.com/apache/hive/blob/master/itests/hive-minikdc/pom.xml#L150 both declare the tez-dag dependency but with different exclusions. So I don't know which Hadoop dependencies to exclude or what the history behind them is. That's why I kept the exclusions the same as for the other Tez dependencies in the same pom.xml.
If there are any suggestions, I am happy to re-evaluate and rework the exclusions.
@abstractdog, apologies for the delay. Can you provide your guidance on this?
@Aggarwal-Raghav, can you pull this into the parent pom as Laszlo suggested? For the exclusions, combine all of them together, and check whether each one is actually being pulled in.
If this approach creates an issue, we can consider excluding dependencies individually.
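As a rough illustration of the suggestion above, here is a minimal sketch of declaring the plugin once in the root pom's `dependencyManagement` and referencing it from the modules. The `tez.version` property and the single exclusion shown are assumptions for illustration; the actual exclusion list is whatever the PR settles on.

```xml
<!-- Root pom.xml: declare the version and exclusions once (sketch only). -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.tez</groupId>
      <artifactId>tez-protobuf-history-plugin</artifactId>
      <!-- Assumption: a tez.version property is defined in the root pom. -->
      <version>${tez.version}</version>
      <exclusions>
        <!-- Illustrative exclusion only; not the final list from this PR. -->
        <exclusion>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-mapreduce-client-core</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
  </dependencies>
</dependencyManagement>

<!-- ql/pom.xml and itests/hive-unit/pom.xml: reference it without
     repeating the version or the exclusions. -->
<dependency>
  <groupId>org.apache.tez</groupId>
  <artifactId>tez-protobuf-history-plugin</artifactId>
</dependency>
```

Exclusions declared under `dependencyManagement` apply to every module that references the managed dependency, so the two places that use it can no longer drift apart. Running `mvn dependency:tree` in each module is one way to check which transitive dependencies are actually pulled in.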
@ayushtkn, if I remove all the exclusions from tez-protobuf-history-plugin, no log4j or Hadoop dependency is pulled in. Attaching the dependency tree for reference.
I have updated the PR following Laszlo's suggestions, and have excluded the Hadoop dependencies that are used in https://github.com/apache/tez/blob/rel/release-0.10.3/tez-plugins/tez-protobuf-history-plugin/pom.xml
I think you haven't carried over all the excluded dependencies. Earlier you had hadoop-mapreduce-client-core and others excluded in ql/pom.xml, but now in the parent pom they aren't excluded. Why?
Will add the rest.
dcb95bd to 3c52bd0
Moving all dependencies to the Hive project-level common pom.xml might require more work. The root pom.xml already declares many Tez-related dependencies with no exclusions; the exclusions live in the ql module pom.xml, the one place I saw them. Moving them up might change the state of other places where these dependencies are included. I think that should be a larger exercise, tracked in a separate JIRA and driven by people with more runtime expertise. To be optimistic, @Aggarwal-Raghav, let's give it a shot: if moving them up is easy and the build runs, good; otherwise we can create another JIRA for it. @abstractdog / @ayushtkn, I suppose the goal here is to remove the common transitive dependencies that come from Tez in Hive as well, and to prevent further classpath inconsistencies?
Moving all the dependencies will be a big refactor; we should definitely do it as part of a new JIRA, not this one. The only dependency we are adding here is "tez-protobuf-history-plugin". I think @abstractdog meant that we move that one to the main pom and define its exclusions there, not the existing ones; for the existing ones we can create a follow-up if needed. If defining the new dependency and its exclusions in the main pom doesn't work, I am OK with following the old approach of defining it independently in ql and itests, with the exclusions there.
I believe we're quite close. I also admit that we don't have a clear policy regarding the Tez dependencies and where to exclude Hadoop stuff from them, so without refactoring the whole pom chaos, we should still do here what makes the most sense, which is what I tried to suggest, so:
+1 ignore detailed history now, just exclude the typical ones:
Does this make sense?
3c52bd0 to 5534441
latest patch LGTM +1
…in HIVE-19288 (apache#5079) (Raghav Aggarwal reviewed by Laszlo Bodor, Ayush Saxena)
What changes were proposed in this pull request?
Please check JIRA comments for discussion HIVE-28028
Why are the changes needed?
There are duplicate Java files in Hive that are already present in Apache Tez. I checked the diff between these files in Hive and Tez and found a few improvements on the Tez side (like TEZ-4296, TEZ-4105, TEZ-4305).
Does this PR introduce any user-facing change?
NO
Is the change a dependency upgrade?
NO
How was this patch tested?
On local mac