Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up the Maven build #262

Merged
merged 5 commits into from
Oct 21, 2019
Merged

Clean up the Maven build #262

merged 5 commits into from
Oct 21, 2019

Conversation

ches
Copy link
Member

@ches ches commented Oct 19, 2019

Hi folks!

There are three main themes here:

  1. The pebble that got me rolling, this was giving us a headache while trying to work:

    Fix the core module dependency on ingestion so that Maven resolves it as an inter-project dep, without requiring it to be installed to (local) repository and mucking with the version as you work.

  2. Use <dependencyManagement> in parent POM for common dependencies and families—version drift between modules on things like gRPC and Google Cloud deps will bring more pain in the long run than any short-term convenience of letting them differ between modules I think. This makes it easy to control that consistently.

  3. Avoid (re)defining <properties> in submodule POMs—similar reasoning as dependencies: drift leads to suffering. If you build only a subset of modules at a time, effective values may differ based on how Maven orders the build.

    Also avoid repeating things that inherit from the parent POM.

The trick to working with #1 in case you build from the command line is Maven's --also-make flag, which tells it to build project dependencies when building a dependent is requested. So to build core and have it resolve ingestion from source, it is:

$ mvn --also-make --projects=core test
$ Or for short:
$ mvn -am -pl core test

I got a little carried away and enabled -Xlint. That's opinionated, so it's a separate commit I can remove if you're opposed to it.

I needed to massage dependency versions slightly, for instance grpcVersion was set to a few different values in different modules and I picked what appeared to be most common for current Beam / Google Cloud deps in the tree which finally got Maven's resolver happy and no bincompat issues from tests.

This passes mvn verify, let's see how it fares in CI…

Oh, finally, I did not yet touch the sdk/java module that was added in the last day or two, because it is not currently a child of the root POM. I really believe that it should be for the same reasoning of version consistency pain above, so I may push another commit making it so. This is done. Let me know if you don't want that.

/cc @smadarasmi

@feast-ci-bot
Copy link
Collaborator

@ches: GitHub didn't allow me to request PR reviews from the following users: smadarasmi.

Note that only gojek members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

Hi folks!

There are three main themes here:

  1. The pebble that got me rolling, this was giving us a headache while trying to work:

Fix the core module dependency on ingestion so that Maven resolves it as an inter-project dep, without requiring it to be installed to (local) repository and mucking with the version as you work.

  1. Use <dependencyManagement> in parent POM for common dependencies and families—version drift between modules on things like gRPC and Google Cloud deps will bring more pain in the long run than any short-term convenience of letting them differ between modules I think. This makes it easy to control that consistently.
  2. Avoid (re)defining <properties> in submodule POMs—similar reasoning as dependencies: drift leads to suffering. If you build only a subset of modules at a time, effective values may differ based on how Maven orders the build.

Also avoid repeating things that inherit from the parent POM.

I got a little carried away and enabled -Xlint. That's opinionated, so it's a separate commit I can remove if you're opposed to it.

I needed to massage dependency versions slightly, for instance grpcVersion was set to a few different values in different modules and I picked what appeared to be most common for current Beam / Google Cloud deps in the tree which finally got Maven's resolver happy and no bincompat issues from tests.

This passes mvn verify, let's see how it fares in CI…

Oh, finally, I did not yet touch the sdk/java module that was added in the last day or two, because it is not currently a child of the root POM. I really believe that it should be for the same reasoning of version consistency pain above, so I may push another commit making it so. Let me know if you don't want that.

/cc @smadarasmi

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@feast-ci-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ches
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: zhilingc

If they are not already assigned, you can assign the PR to them by writing /assign @zhilingc in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@feast-ci-bot
Copy link
Collaborator

Hi @ches. Thanks for your PR.

I'm waiting for a gojek member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Comment on lines -94 to +86
<groupId>com.github.gojek.feast</groupId>
<groupId>feast</groupId>
<artifactId>feast-ingestion</artifactId>
<version>9428b230a5</version>
<version>${project.version}</version>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theme #1

<!--compile 'com.github.spullara.mustache.java:compiler:0.9.5'-->
<dependency>
<groupId>com.github.spullara.mustache.java</groupId>
<artifactId>compiler</artifactId>
<version>0.9.5</version>
</dependency>
<dependency>
<groupId>com.hubspot.jinjava</groupId>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I could tell, this was unused. Unless it was here to override a version coming in transitively. If anyone knows otherwise, let me know. Nothing fails from its removal, but if it was fixing something transitive then it might need integration test coverage to surface a runtime issue.

Actually now that I look at it, the Mustache dep above might be unused too. Were these template languages used for some functionality that's been removed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They could have been used previously for the UI. That has been cut out for now. I think it's fine to just delete these for now. If something breaks we can always add them back

</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-clients_2.11</artifactId>
<version>1.5.5</version>
</dependency>

<!-- Jackson due to jinjava dependency problems -->
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This suggests these were declared here only for the sake of jinjava removed below, so with jinjava removed I tried removing these too and nothing broke. The Jackson versions are set in <dependencyManagement> so if they're brought in transitively and the versions are important, that should take care of it already.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidheryanto @zhilingc do you have context on these?

I feel like we can just go ahead and remove these.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a bit of extra confidence, I've run mvn dependency:analyze -DignoreNonCompile and indeed Jackson is not used in core even transitively, nor are Jinjava or Mustache. The dependency plugin's analyzer isn't perfect but it's pretty good outside of annotations.

Pushing another commit that removes more of the reported unused deps with no apparent breakage.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the jinjava library was used as a templating engine for our SQL query in Core to retrieve batch features. But in 0.3, batch feature retrieval responsibility is moved to Serving, so we don't need need jinjava anymore in Core

serving/pom.xml Outdated
@@ -292,7 +240,7 @@
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter</artifactId>
<version>RELEASE</version>
<version>5.5.2</version>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This quiets a Maven warning because the RELEASE meta version is deprecated. I'm not sure if this should be a 4.x version for JUnit 4, but this is the latest release version. Actually this dependency appears unused except in the Java SDK where it is re-declared since it isn't currently a child module.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we're trying to see if we can use JUnit 5 test framework because testing for exceptions and parameterizing test cases is cleaner in JUnit 5. But since in Core, Serving and Ingestion there are some dependencies that may not be as quick to migrate the test to JUnit 5, currently we still use JUnit 4 for those.

@ches
Copy link
Member Author

ches commented Oct 19, 2019

I've made sdk/java a child module of the main build now.

Copy link
Member

@woop woop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the contribution @ches!

These types of contributions add a lot of value. The changes from 0.1 to 0.3 have left us with quite a few dangling or misconfigured dependencies as I am sure you have discovered.

Overall it looks good, but I would like @zhilingc or @davidheryanto to have a quick look to make sure I didn't miss anything.

One thing that I need to mention is that all contributors need to sign the Google CLA in order to have code merged in. We actually need to have a bot that prompts folks for this.

Thanks again for the submission!

<!--compile 'com.github.spullara.mustache.java:compiler:0.9.5'-->
<dependency>
<groupId>com.github.spullara.mustache.java</groupId>
<artifactId>compiler</artifactId>
<version>0.9.5</version>
</dependency>
<dependency>
<groupId>com.hubspot.jinjava</groupId>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They could have been used previously for the UI. That has been cut out for now. I think it's fine to just delete these for now. If something breaks we can always add them back

</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-clients_2.11</artifactId>
<version>1.5.5</version>
</dependency>

<!-- Jackson due to jinjava dependency problems -->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidheryanto @zhilingc do you have context on these?

I feel like we can just go ahead and remove these.

@ches
Copy link
Member Author

ches commented Oct 20, 2019

Thanks for the review!

I've signed the CLA individually, I'll see about the steps for us to have a corporate signatory this week.

It might be good to put setup of a CLA bot in the backlog.

@woop
Copy link
Member

woop commented Oct 20, 2019

Thanks for the review!

I've signed the CLA individually, I'll see about the steps for us to have a corporate signatory this week.

It might be good to put setup of a CLA bot in the backlog.

Great thanks. I have added it as a note in our project backlog.

Three main themes:

- Fix core module dependency on ingestion, so that Maven resolves it as
  an inter-project dep without requiring it to be installed to (local)
  repository and mucking with the version.

- Use dependencyManagement in parent POM for common dependencies and
  families -- version drift between modules on things like gRPC and
  Google Cloud deps will bring more pain in the long run than any short-
  term convenience of letting them differ between modules. This makes it
  easy to control that consistently.

- Avoid (re)defining properties in submodule POMs -- similar reasoning
  as dependencies: drift leads to suffering. If you build only a subset
  of modules at a time, resolved values may differ based on how Maven
  orders the build.

  Also avoid repeating things that inherit from the parent POM, like
  module groupId.
Updating the protobuf/protoc version everywhere to what the SDK used,
because it requires new APIs.
Checks mtime versus target/ directory. Saves considerable build time.
@ches
Copy link
Member Author

ches commented Oct 21, 2019

Rebased to fix conflicts. Needed to trivially touch source code this time, for the rename of ImportJobPipelineOptions to ImportOptions which must have happened since the version in the binary build of ingestion last published to Jitpack. Hopefully that workflow is no longer needed ☺️

According to

    mvn dependency:analyze -DignoreNonCompile

and running tests.

Also added some undeclared direct dependencies, though not covering them
comprehensively yet.
<dependency>
<groupId>io.opencensus</groupId>
<artifactId>opencensus-contrib-http-util</artifactId>
</dependency>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't catch it until now but dependency analyzer says this is unused in Ingestion since @zhilingc removed the 0.1 legacy PathUtil earlier today. Also direct use of kafka-clients is gone since today, only Beam APIs for Kafka. Updating the last commit with removals of these, sorry for one more force-push.

woop pushed a commit that referenced this pull request Oct 24, 2019
Urge the use of <dependencyManagement> by disallowing duplicate
dependency declarations. See #262.

Making final releases with SNAPSHOT dependencies in use is a really bad
idea that can happen by accidental oversight. Enforcer helps ensure it
doesn't.

For build reproducibility it's good to specify what Maven versions are
supported for the project's build to work. 3.0.5 is the minimum required
version by all our plugins currently, according to:

    $ mvn versions:display-plugin-updates
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants