We're excited you're interested in contributing to Marquez! We'd love your help, and there are plenty of ways to contribute:
- Give the repo a star
- Join our Slack channel and leave us feedback or help answer questions from the community
- Fix or report a bug
- Fix or improve documentation
- For newcomers, pick up a "good first issue", then send a pull request our way (see the resources section below for helpful links to get started)
We feel that a welcoming community is important and we ask that you follow the Contributor Covenant Code of Conduct in all interactions with the community.
If you’re interested in using or learning more about Marquez, reach out to us on our Slack channel and follow @MarquezProject for updates. We also encourage newcomers to join our monthly community meeting!
Your pull request must be approved and merged by a committer.
To run the entire test suite:
$ ./gradlew test
You can also run individual tests for a submodule using the --tests flag:
$ ./gradlew :api:test --tests marquez.api.OpenLineageResourceTest
$ ./gradlew :api:test --tests marquez.service.OpenLineageServiceIntegrationTest
$ ./gradlew :api:test --tests marquez.db.OpenLineageDaoTest
Or run tests by category:
$ ./gradlew :api:testUnit # run only unit tests
$ ./gradlew :api:testIntegration # run only integration tests
$ ./gradlew :api:testDataAccess # run only data access tests
We use spotless to format our code. This ensures .java files are formatted to comply with Google Java Style. Make sure your code is formatted before pushing any changes; otherwise, CI will fail:
$ ./gradlew spotlessApply
Note: To make formatting code simple, we recommend installing a plugin for your favorite IDE. We also use Lombok. Though not required, you might want to install its plugin as well.
We use pre-commit to manage git hooks:
$ brew install pre-commit
To set up the git hook scripts, run:
$ pre-commit install
Each Pull Request executes a series of quality checks, mostly relying upon CircleCI for validation. However, there are certain validation checks that execute via GitHub Actions and can be run locally using the steps below.
Install act and run the following command, which will evaluate the GitHub Actions checks that apply to each Pull Request. The first time you run act you will be asked to choose a runner.
Alternatively, you can store your preferred runner within a local user profile named .actrc.
# .actrc file example (https://github.com/nektos/act#configuration)
-P ubuntu-latest=ghcr.io/catthehacker/ubuntu:act-latest
Once you have configured a runner, use act to invoke GitHub Actions and evaluate the workflow.
act pull_request
You can also enable verbose logging and image caching via act flags.
act pull_request --reuse --verbose
Note: Docker must be running in order to utilize act.
A known issue in the act tool prevents the kind cluster from being deleted after the action finishes. When that happens, the next run fails with the error below.
| Creating kind cluster...
| ERROR: failed to create cluster: node(s) already exist for a cluster with the name "chart-testing"
[Lint and Test Chart/lint-test] ❌ Failure - Create kind cluster
Execute the command below to manually clean up the kind cluster and resolve the problem.
kind delete clusters chart-testing
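To avoid running the cleanup when nothing is left over, the check can be made conditional. The sketch below is a minimal example, not part of the Marquez repository: cluster_exists is a hypothetical helper that reads a list of cluster names (one per line, as printed by kind get clusters) from standard input and succeeds only when the given name is present.

```shell
# Hypothetical helper (an assumption, not shipped with Marquez or kind):
# succeeds when the cluster name passed as $1 appears as a whole line
# in the list of cluster names read from stdin.
cluster_exists() {
  # -F: treat the name as a fixed string, -x: match the whole line, -q: no output
  grep -Fqx -- "$1"
}
```

One possible usage, assuming kind is installed: kind get clusters | cluster_exists chart-testing && kind delete clusters chart-testing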
Use publishToMavenLocal to publish artifacts to your local maven repository:
$ ./gradlew publishToMavenLocal
Submitting a Pull Request
- Fork and clone the repository
- Make sure all tests pass locally:
./gradlew :api:test
- Create a new branch:
git checkout -b feature/my-cool-new-feature
- Make a change on your cool new branch
- Write a test for your change
- Make sure .java files are formatted:
./gradlew spotlessJavaCheck
- Make sure .java files contain a copyright and license header
- Make sure to sign your work
- Push the change to your fork and submit a pull request
- Work with project maintainers to get your change reviewed and merged into the main branch
- Delete your branch
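The two local checks named in the steps above (the formatting check and the api tests) can be chained so that the tests only run once formatting passes. This is a minimal sketch, not a script from the repository: the function name pre_submit_checks and the GRADLEW override variable are assumptions introduced here so the sequence can be dry-run.

```shell
# Hypothetical helper (not part of the Marquez repo): runs the formatting
# check first, then the api tests, stopping at the first failure.
# GRADLEW is an assumed override; it defaults to the wrapper in the repo root.
pre_submit_checks() {
  gradlew="${GRADLEW:-./gradlew}"
  "$gradlew" spotlessJavaCheck && "$gradlew" :api:test
}
```

Calling pre_submit_checks from the repository root before pushing runs both tasks in order; setting GRADLEW=echo prints the task names instead of running them.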
To ensure your pull request is accepted, follow these guidelines:
- All changes should be accompanied by tests
- Do your best to have a well-formed commit message for your change
- Keep diffs small and self-contained
- If your change fixes a bug, please link the issue in your pull request description
- Any changes to the API reference require regenerating the static openapi.html file.
Note: A pull request should generally contain only one commit (use git commit --amend and git push --force or squash existing commits into one).
- Use a group at the beginning of your branch names:
  feature   Add or expand a feature
  bug       Fix a bug
  proposal  Propose a change
  For example:
  feature/my-cool-new-feature
  bug/my-bug-fix
  bug/my-other-bug-fix
  proposal/my-proposal
- Choose short and descriptive branch names
- Use dashes (-) to separate words in branch names
- Use lowercase in branch names
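The naming conventions above can be checked mechanically. The following is a minimal sketch under stated assumptions: is_valid_branch_name is a hypothetical helper (it does not exist in the repository), and allowing digits in branch names is an assumption, since the guidelines only mention lowercase words and dashes.

```shell
# Hypothetical helper (not part of the Marquez repo): succeeds when the
# branch name in $1 starts with a known group (feature, bug, proposal),
# followed by a slash and lowercase words separated by single dashes.
# Assumption: digits are allowed inside words.
is_valid_branch_name() {
  # LC_ALL=C keeps the [a-z] range strictly lowercase ASCII
  printf '%s\n' "$1" |
    LC_ALL=C grep -Eq '^(feature|bug|proposal)/[a-z0-9]+(-[a-z0-9]+)*$'
}
```

For instance, is_valid_branch_name feature/my-cool-new-feature succeeds, while names with uppercase letters, underscores, or no group prefix are rejected.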
We use renovate to manage dependencies for most of our project modules, with a couple of exceptions. Renovate automatically detects new dependency versions and opens pull requests to upgrade dependencies in accordance with the configured rules.
The following dependencies are managed manually:
- Web code - it is challenging to programmatically validate web content
- Spark versions - the internal query plans parsed by the Spark OpenLineage integration are not stable across Spark versions
- Gradle - this tool orchestrates the entire build pipeline and was excluded to ensure stability
The sign-off is a simple line at the end of the message for a commit. All commits need to be signed. Your signature certifies that you wrote the patch or otherwise have the right to contribute the material (see Developer Certificate of Origin):
This is my commit message
Signed-off-by: Remedios Moscote <remedios.moscote@buendía.com>
Git has a -s command line option to append this automatically to your commit message:
$ git commit -s -m "This is my commit message"
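To confirm that a commit message carries a sign-off before pushing, a small check like the one below may help. It is a sketch only: has_sign_off is a hypothetical helper, and the pattern it uses (a name followed by an address in angle brackets) is an assumption about what counts as a well-formed sign-off line, not the check performed by the project's CI.

```shell
# Hypothetical helper (not part of the Marquez repo): succeeds when the
# commit message passed as $1 contains a line of the form
# "Signed-off-by: Name <user@host>".
has_sign_off() {
  printf '%s\n' "$1" | grep -Eq '^Signed-off-by: .+ <[^<>]+@[^<>]+>$'
}
```

One way to apply it to the most recent commit: has_sign_off "$(git log -1 --format=%B)" || echo "missing sign-off"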
API Docs
To bundle:
$ redoc-cli bundle spec/openapi.yml -o docs/openapi.html --title "Marquez API Reference"
To serve:
$ redoc-cli serve spec/openapi.yml
Then browse to: http://localhost:8080
Note: To bundle or serve the API docs, please install redoc-cli.
We use SPDX for copyright and license information. The following license header must be included in all java, bash, and py source files:
java
/*
* Copyright 2018-2022 contributors to the Marquez project
* SPDX-License-Identifier: Apache-2.0
*/
bash
#!/bin/bash
#
# Copyright 2018-2022 contributors to the Marquez project
# SPDX-License-Identifier: Apache-2.0
py
# Copyright 2018-2022 contributors to the Marquez project
# SPDX-License-Identifier: Apache-2.0
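A quick local check for the header can be sketched as follows. has_license_header is a hypothetical helper, not a script from the repository, and looking only at the first five lines of a file is an assumption chosen to cover the shebang line in bash files.

```shell
# Hypothetical helper (not part of the Marquez repo): succeeds when the
# first five lines of the file in $1 contain both the copyright line and
# the SPDX license identifier. The year range is not checked.
has_license_header() {
  head -n 5 "$1" | grep -q 'Copyright .* contributors to the Marquez project' &&
    head -n 5 "$1" | grep -q 'SPDX-License-Identifier: Apache-2.0'
}
```

For example, has_license_header path/to/File.java || echo "missing header", where the path is a placeholder for a file in your change.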
- How to Contribute to Open Source
- Using the Fork-and-Branch Git Workflow
- Understanding the GitHub flow
- Keeping a Changelog
- Code Review Developer Guide
- Signing Commits
SPDX-License-Identifier: Apache-2.0 Copyright 2018-2023 contributors to the Marquez project.