Commit

Merge branch 'main' into feature/update-lineage-events-paging
phixMe authored Aug 10, 2023
2 parents 5779262 + a62fc04 commit 779e2b1
Showing 148 changed files with 470 additions and 456 deletions.
2 changes: 1 addition & 1 deletion .circleci/api-load-test.sh
@@ -14,7 +14,7 @@
set -e

# Build version of Marquez
-readonly MARQUEZ_VERSION=0.39.0-SNAPSHOT
+readonly MARQUEZ_VERSION=0.40.0-SNAPSHOT
# Fully qualified path to marquez.jar
readonly MARQUEZ_JAR="api/build/libs/marquez-api-${MARQUEZ_VERSION}.jar"

2 changes: 1 addition & 1 deletion .circleci/db-migration.sh
@@ -13,7 +13,7 @@
# Version of PostgreSQL
readonly POSTGRES_VERSION="12.1"
# Version of Marquez
-readonly MARQUEZ_VERSION=0.38.0
+readonly MARQUEZ_VERSION=0.39.0
# Build version of Marquez
readonly MARQUEZ_BUILD_VERSION="$(git log --pretty=format:'%h' -n 1)" # SHA1

2 changes: 1 addition & 1 deletion .env.example
@@ -1,4 +1,4 @@
API_PORT=5000
API_ADMIN_PORT=5001
WEB_PORT=3000
-TAG=0.38.0
+TAG=0.39.0
2 changes: 1 addition & 1 deletion .github/boring-cyborg.yml
@@ -55,4 +55,4 @@ firstPRMergeComment: >
# Comment to be posted to on first time issues
firstIssueWelcomeComment: >
-Thanks for opening your first issue in the Marquez project! Please be sure to follow the issue template!
+Thanks for opening your first issue in the Marquez project! Please be sure to follow the issue template!
2 changes: 1 addition & 1 deletion .github/pull_request_template.md
@@ -20,4 +20,4 @@ One-line summary:
 - [ ] You've updated any relevant documentation (_if relevant_)
 - [ ] You've included a one-line summary of your change for the [`CHANGELOG.md`](https://github.com/MarquezProject/marquez/blob/main/CHANGELOG.md#unreleased) (_Depending on the change, this may not be necessary_).
 - [ ] You've versioned your `.sql` database schema migration according to [Flyway's naming convention](https://flywaydb.org/documentation/concepts/migrations#naming) (_if relevant_)
-- [ ] You've included a [header](https://github.com/MarquezProject/marquez/blob/main/CONTRIBUTING.md#copyright--license) in any source code files (_if relevant_)
+- [ ] You've included a [header](https://github.com/MarquezProject/marquez/blob/main/CONTRIBUTING.md#copyright--license) in any source code files (_if relevant_)
4 changes: 2 additions & 2 deletions .github/workflows/headerchecker.yml
@@ -32,7 +32,7 @@ jobs:
run: |
ok=1
readarray -t files <<<"$(jq -r '.[]' <<<'${{ steps.files.outputs.added_modified }}')"
-for file in ${files[@]}; do
+for file in ${files[@]}; do
if [[ ($file == *".java") ]]; then
if ! grep -q Copyright "$file"; then
ok=0
@@ -45,4 +45,4 @@ jobs:
else
GREEN="\e[32m"
echo -e "${GREEN}All changed & added files have been scanned. Result: no headers are missing.${ENDCOLOR}"
-fi
+fi
1 change: 0 additions & 1 deletion .gitpod.yml
@@ -14,4 +14,3 @@ ports:
 vscode:
 extensions:
 - ms-azuretools.vscode-docker
-
1 change: 1 addition & 0 deletions .pre-commit-config.yaml
@@ -5,6 +5,7 @@ repos:
 - id: trailing-whitespace
 - id: end-of-file-fixer
 - id: check-yaml
+exclude: ^chart/
 - id: check-added-large-files

 - repo: https://github.com/jguttman94/pre-commit-gradle
269 changes: 144 additions & 125 deletions CHANGELOG.md

Large diffs are not rendered by default.

6 changes: 3 additions & 3 deletions CODE_OF_CONDUCT.md
@@ -117,11 +117,11 @@ This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.0, available at
[https://www.contributor-covenant.org/version/2/0/code_of_conduct.html][v2.0].

-Community Impact Guidelines were inspired by
+Community Impact Guidelines were inspired by
[Mozilla's code of conduct enforcement ladder][Mozilla CoC].

For answers to common questions about this code of conduct, see the FAQ at
-[https://www.contributor-covenant.org/faq][FAQ]. Translations are available
+[https://www.contributor-covenant.org/faq][FAQ]. Translations are available
at [https://www.contributor-covenant.org/translations][translations].

[homepage]: https://www.contributor-covenant.org
@@ -131,5 +131,5 @@ at [https://www.contributor-covenant.org/translations][translations].
[translations]: https://www.contributor-covenant.org/translations

 ----
-SPDX-License-Identifier: Apache-2.0
+SPDX-License-Identifier: Apache-2.0
Copyright 2018-2023 contributors to the Marquez project.
8 changes: 4 additions & 4 deletions CODE_QUALITY_AND_SECURITY.md
@@ -2,12 +2,12 @@

The authors of Marquez are committed to providing secure software of the highest quality possible. To this end, we employ a number of tools and methodologies to ensure that our design, build, maintenance and testing practices maximize efficiency and minimize risk.

-The specific security and analysis methodologies that we employ include but are not limited to:
+The specific security and analysis methodologies that we employ include but are not limited to:

## Security

 - Participation in the [OpenSSF Best Practices Badge Program](https://bestpractices.coreinfrastructure.org/en/projects/5106) for Free/Libre and FLOSS projects to ensure that we follow current best practices for quality and security
-- Use of [HTTPS](https://en.wikipedia.org/wiki/HTTPS) for network communication
+- Use of [HTTPS](https://en.wikipedia.org/wiki/HTTPS) for network communication
 - Support for multiple cryptographic algorithms (through the use of HTTPS)
 - Separate storage of authentication credentials according to best practices
 - Use of secure protocols for network communication (through the use of HTTPS)
@@ -30,5 +30,5 @@ For more information about our approach to quality and security, feel free to re
 - Twitter: [@MarquezProject](https://twitter.com/MarquezProject)

 ----
-SPDX-License-Identifier: Apache-2.0
-Copyright 2018-2023 contributors to the Marquez project.
+SPDX-License-Identifier: Apache-2.0
+Copyright 2018-2023 contributors to the Marquez project.
6 changes: 3 additions & 3 deletions COMMITTERS.md
@@ -25,7 +25,7 @@ They take responsibility for guiding new pull requests into the main branch.
| Ross Turk | [@rossturk](https://github.com/rossturk) |
| Minkyu Park | [@fm100](https://github.com/fm100) |
| Paweł Leszczyński | [@pawel-big-lebowski](https://github.com/pawel-big-lebowski) |

## Emeritus

The following people are no longer working on the Marquez project.
@@ -41,5 +41,5 @@ A Contributor may become a Committer by the approval of a majority of the
existing Committers (as per the project [charter](https://wiki.lfaidata.foundation/download/attachments/18481434/Marquez%20Project%20Technical%20Charter%20Final_Adopted%2005.21.20.pdf?version=1&modificationDate=1591718661000&api=v2)).

 ----
-SPDX-License-Identifier: Apache-2.0
-Copyright 2018-2023 contributors to the Marquez project.
+SPDX-License-Identifier: Apache-2.0
+Copyright 2018-2023 contributors to the Marquez project.
4 changes: 2 additions & 2 deletions CONTRIBUTING.md
@@ -249,5 +249,5 @@ We use [SPDX](https://spdx.dev) for copyright and license information. The follo
* [Signing Commits](https://docs.github.com/en/github/authenticating-to-github/signing-commits)

 ----
-SPDX-License-Identifier: Apache-2.0
-Copyright 2018-2023 contributors to the Marquez project.
+SPDX-License-Identifier: Apache-2.0
+Copyright 2018-2023 contributors to the Marquez project.
6 changes: 3 additions & 3 deletions GOVERNANCE.md
@@ -89,7 +89,7 @@ Marquez is currently led by Willy Lulciuc.

## Marquez project meetings

-Some meetings are face-to-face, but most are conference calls.
+Some meetings are face-to-face, but most are conference calls.
Attendance at meetings is open to all. Conference calls can be joined without an explicit invitation.
However, due to physical security requirements at some of the venues we use,
it is necessary to ensure you are added to the invitee list of any face-to-face meetings
@@ -106,7 +106,7 @@ This creates a recorded discussion of design decisions and discussions that comp
Follow the link above and register with the Slack service using your email address.
Once signed in you can see all of the active Slack channels.

-Additional channels are added from time to time as new workgroups and discussion topics are established.
+Additional channels are added from time to time as new workgroups and discussion topics are established.

## Marquez email

@@ -142,5 +142,5 @@ be resolved by voting. The voting process is a simple majority in which each com


 ----
-SPDX-License-Identifier: Apache-2.0
+SPDX-License-Identifier: Apache-2.0
Copyright 2018-2023 contributors to the Marquez project.
4 changes: 2 additions & 2 deletions METRICS.md
@@ -12,5 +12,5 @@
| `marquez_job_runs_completed` | _gauge_ | | Total number of completed job runs. |

 ----
-SPDX-License-Identifier: Apache-2.0
-Copyright 2018-2023 contributors to the Marquez project.
+SPDX-License-Identifier: Apache-2.0
+Copyright 2018-2023 contributors to the Marquez project.
6 changes: 3 additions & 3 deletions api/build.gradle
@@ -24,7 +24,7 @@ ext {
jdbi3Version = '3.37.1'
prometheusVersion = '0.16.0'
testcontainersVersion = '1.17.6'
-sentryVersion = '6.21.0'
+sentryVersion = '6.28.0'
}

dependencies {
@@ -44,11 +44,11 @@ dependencies {
implementation "org.jdbi:jdbi3-postgres:${jdbi3Version}"
implementation "org.jdbi:jdbi3-sqlobject:${jdbi3Version}"
implementation 'com.google.guava:guava:31.1-jre'
-implementation 'org.dhatim:dropwizard-sentry:2.1.2-4'
+implementation 'org.dhatim:dropwizard-sentry:2.1.6'
implementation "io.sentry:sentry:${sentryVersion}"
implementation 'org.flywaydb:flyway-core:8.5.13'
implementation "org.postgresql:postgresql:${postgresqlVersion}"
-implementation 'com.graphql-java:graphql-java:20.0'
+implementation 'com.graphql-java:graphql-java:20.4'
implementation 'com.graphql-java-kickstart:graphql-java-servlet:12.0.0'

testImplementation "io.dropwizard:dropwizard-testing:${dropwizardVersion}"
2 changes: 2 additions & 0 deletions api/src/main/java/marquez/service/RunTransitionListener.java
@@ -87,8 +87,10 @@ class RunOutput {
class RunTransition {
/** the unique ID of the run. */
@NonNull RunId runId;
+
/** The old state of the run. */
@Nullable RunState oldState;
+
/** The new state of the run. */
@NonNull RunState newState;

4 changes: 2 additions & 2 deletions api/src/main/resources/assets/graphql-playground/index.htm
@@ -1,6 +1,6 @@
<!--
Copyright 2018-2023 contributors to the Marquez project
-SPDX-License-Identifier: Apache-2.0
+SPDX-License-Identifier: Apache-2.0
 -->

<!DOCTYPE html>
@@ -542,4 +542,4 @@
})
</script>
</body>
-</html>
+</html>
5 changes: 2 additions & 3 deletions api/src/main/resources/banner.txt
@@ -1,7 +1,6 @@
-__ ___
+__ ___
/ |/ /___ __________ ___ _____ ____
/ /|_/ / __ `/ ___/ __ `/ / / / _ \/_ /
/ / / / /_/ / / / /_/ / /_/ / __/ / /_
/_/ /_/\__,_/_/ \__, /\__,_/\___/ /___/
/_/

/_/
@@ -20,4 +20,4 @@ FROM datasets d
JOIN dataset_symlinks symlinks ON d.uuid = symlinks.dataset_uuid
INNER JOIN namespaces ON symlinks.namespace_uuid = namespaces.uuid
WHERE d.is_hidden IS FALSE
-GROUP BY d.uuid;
+GROUP BY d.uuid;
@@ -1,4 +1,4 @@
/* SPDX-License-Identifier: Apache-2.0 */

ALTER TABLE runs ADD start_run_state_uuid UUID REFERENCES run_states(uuid);
-ALTER TABLE runs ADD end_run_state_uuid UUID REFERENCES run_states(uuid);
+ALTER TABLE runs ADD end_run_state_uuid UUID REFERENCES run_states(uuid);
@@ -1,3 +1,3 @@
/* SPDX-License-Identifier: Apache-2.0 */

-CREATE INDEX datasetversion_datasetid_idx ON dataset_versions (dataset_uuid);
+CREATE INDEX datasetversion_datasetid_idx ON dataset_versions (dataset_uuid);
@@ -1,4 +1,4 @@
/* SPDX-License-Identifier: Apache-2.0 */

ALTER TABLE dataset_versions ADD CONSTRAINT dataset_versions_version UNIQUE(version);
-ALTER TABLE job_versions ADD CONSTRAINT job_versions_version UNIQUE(version);
+ALTER TABLE job_versions ADD CONSTRAINT job_versions_version UNIQUE(version);
@@ -1,3 +1,3 @@
/* SPDX-License-Identifier: Apache-2.0 */

-alter table datasets drop constraint datasets_source_uuid_physical_name_key;
+alter table datasets drop constraint datasets_source_uuid_physical_name_key;
@@ -76,7 +76,7 @@ CREATE TABLE job_versions_io_mapping (
job_version_uuid UUID REFERENCES job_versions(uuid),
dataset_uuid UUID REFERENCES datasets(uuid),
io_type VARCHAR(64) NOT NULL,
-PRIMARY KEY (job_version_uuid, dataset_uuid, io_type)
+PRIMARY KEY (job_version_uuid, dataset_uuid, io_type)
);

CREATE TABLE run_args (
@@ -121,5 +121,5 @@ CREATE TABLE stream_versions (
CREATE TABLE runs_input_mapping (
run_uuid UUID REFERENCES runs(uuid),
dataset_version_uuid UUID REFERENCES dataset_versions(uuid),
-PRIMARY KEY (run_uuid, dataset_version_uuid)
+PRIMARY KEY (run_uuid, dataset_version_uuid)
);
@@ -1,3 +1,3 @@
/* SPDX-License-Identifier: Apache-2.0 */

-ALTER TABLE lineage_events DROP CONSTRAINT lineage_event_pk;
+ALTER TABLE lineage_events DROP CONSTRAINT lineage_event_pk;
@@ -1,4 +1,4 @@
/* SPDX-License-Identifier: Apache-2.0 */

ALTER TABLE jobs ADD namespace_name VARCHAR(255);
-UPDATE jobs SET namespace_name = namespaces.name FROM namespaces WHERE jobs.namespace_uuid = namespaces.uuid;
+UPDATE jobs SET namespace_name = namespaces.name FROM namespaces WHERE jobs.namespace_uuid = namespaces.uuid;
@@ -12,4 +12,4 @@ FROM jobs
WHERE job_versions.job_uuid = jobs.uuid;

CREATE INDEX job_versions_selector
-ON job_versions (job_name, namespace_name);
+ON job_versions (job_name, namespace_name);
@@ -19,4 +19,4 @@ UPDATE jobs SET (current_inputs) = (select jsonb_agg(query) from
UPDATE jobs SET (current_outputs) = (select jsonb_agg(query) from
(select jv.namespace_name as namespaceName, ds.name as datasetName
from job_versions_io_mapping m inner join job_versions jv on m.job_version_uuid = jv.uuid inner join datasets ds on m.dataset_uuid = ds.uuid
-where m.io_type = 'OUTPUT' and jobs.uuid = jv.job_uuid) query);
+where m.io_type = 'OUTPUT' and jobs.uuid = jv.job_uuid) query);
@@ -12,4 +12,3 @@ UPDATE datasets SET
source_name = s.name
FROM sources s
WHERE s.uuid = datasets.source_uuid;

@@ -7,4 +7,4 @@ ALTER TABLE jobs
ALTER TABLE job_versions
ALTER COLUMN namespace_name TYPE VARCHAR;
ALTER TABLE job_versions
-ALTER COLUMN job_name TYPE VARCHAR;
+ALTER COLUMN job_name TYPE VARCHAR;
@@ -5,4 +5,4 @@ ALTER TABLE runs ADD job_context_uuid uuid;
UPDATE runs SET
job_context_uuid = jv.job_context_uuid
FROM job_versions jv
-WHERE jv.uuid = runs.job_version_uuid;
+WHERE jv.uuid = runs.job_version_uuid;
@@ -5,4 +5,4 @@ UPDATE jobs SET (current_inputs) = (select jsonb_agg(query) from
from job_versions_io_mapping m inner join job_versions jv on m.job_version_uuid = jv.uuid inner join datasets ds on m.dataset_uuid = ds.uuid
where m.io_type = 'INPUT' and jobs.uuid = jv.job_uuid) query);

-alter table jobs drop column current_outputs;
+alter table jobs drop column current_outputs;
@@ -1,4 +1,4 @@
/* SPDX-License-Identifier: Apache-2.0 */

alter table runs alter column transitioned_at type timestamp without time zone
-USING transitioned_at::timestamp without time zone;
+USING transitioned_at::timestamp without time zone;
@@ -2,4 +2,4 @@

create index runs_created_at_by_name_index
on runs(job_name, namespace_name, created_at DESC)
-include (uuid, created_at, updated_at, nominal_start_time, nominal_end_time, current_run_state, started_at, ended_at, namespace_name, job_name, location);
+include (uuid, created_at, updated_at, nominal_start_time, nominal_end_time, current_run_state, started_at, ended_at, namespace_name, job_name, location);
@@ -7,4 +7,4 @@ ALTER TABLE namespaces ALTER COLUMN name TYPE VARCHAR;
ALTER TABLE runs ALTER COLUMN external_id TYPE VARCHAR;
ALTER TABLE sources ALTER COLUMN name TYPE VARCHAR;
ALTER TABLE sources ALTER COLUMN connection_url TYPE VARCHAR;
-ALTER TABLE tags ALTER COLUMN name TYPE VARCHAR;
+ALTER TABLE tags ALTER COLUMN name TYPE VARCHAR;
@@ -1,2 +1,2 @@
alter table dataset_versions add column lifecycle_state VARCHAR(63);
-alter table datasets add column is_deleted BOOLEAN DEFAULT FALSE;
+alter table datasets add column is_deleted BOOLEAN DEFAULT FALSE;
@@ -1,4 +1,4 @@
ALTER TABLE jobs ADD COLUMN symlink_target_uuid uuid REFERENCES jobs (uuid);
CREATE INDEX jobs_symlinks ON jobs (symlink_target_uuid)
INCLUDE (uuid, namespace_name, name)
-WHERE symlink_target_uuid IS NOT NULL;
+WHERE symlink_target_uuid IS NOT NULL;
@@ -75,4 +75,4 @@ FROM jobs j
INNER JOIN fqn jf ON jf.uuid = COALESCE(js.link_target_uuid, j.uuid)
ON CONFLICT (uuid) DO UPDATE
SET job_fqn=EXCLUDED.job_fqn,
-aliases = jobs_fqn.aliases || EXCLUDED.aliases;
+aliases = jobs_fqn.aliases || EXCLUDED.aliases;
@@ -1,3 +1,3 @@
ALTER TABLE datasets ADD COLUMN is_hidden BOOLEAN DEFAULT FALSE;

-ALTER TABLE jobs ADD COLUMN is_hidden BOOLEAN DEFAULT FALSE;
+ALTER TABLE jobs ADD COLUMN is_hidden BOOLEAN DEFAULT FALSE;
@@ -21,4 +21,3 @@ CREATE TYPE DATASET_NAME AS (
namespace VARCHAR(255),
name VARCHAR(255)
);

@@ -10,4 +10,4 @@ CREATE TABLE column_lineage (
created_at TIMESTAMP NOT NULL,
updated_at TIMESTAMP NOT NULL,
UNIQUE (output_dataset_version_uuid, output_dataset_field_uuid, input_dataset_version_uuid, input_dataset_field_uuid)
-);
+);
@@ -1,4 +1,4 @@
/* SPDX-License-Identifier: Apache-2.0 */

create index dataset_fields_dataset_uuid
-on dataset_fields (dataset_uuid);
+on dataset_fields (dataset_uuid);