- Column-lineage endpoints support point-in-time requests #2265 @pawel-big-lebowski
  Enables requesting the `column-lineage` endpoint by a dataset version, a job version, or a dataset field of a specific dataset version.
- Allow null column type in column-lineage #2272 @pawel-big-lebowski
- Include error message for JSON processing exception #2271 @pawel-big-lebowski
  In case of a JSON processing exception, the Marquez API should return the exception message to the client.
- Fix column lineage when multiple jobs write to same dataset #2289 @pawel-big-lebowski
  The fix deprecates the way the fields `transformationDescription` and `transformationType` are returned. The deprecated way of returning those fields will be removed in 0.30.0.
0.28.0 - 2022-11-21
- Optimize current runs query for lineage API #2211 @prachim-collab
  Adds a simpler, alternate `getCurrentRuns` query that gets only simple runs from the database without the additional data from tables such as `run_args`, `job_context`, `facets`, etc., which required extra table joins.
- Add Code Quality, DCO and Governance docs to project #2237 #2241 @merobi-hub
  Adds a number of standard governance and procedure docs to the project.
- Add possibility to soft-delete namespaces #2244 @mobuchowski
  Adds the ability to "hide" inactive namespaces. The namespaces are undeleted when a relevant OL event is received.
- Add search service proposal #2203 @pawel-big-lebowski
  Proposes using ElasticSearch as a pluggable search service to enhance the search feature in Marquez, along with the ability to turn it off. Includes ideas about what should be indexed and the requirements for the interface.
- Show facets even when dataset has no fields #2214 @JDarDagran
  Changes the logic in the `DatasetInfo` component to always show facets so that dataset facets are visible in the UI even if no dataset fields have been set.
- Respect column prefix when given for `ended_at` #2231 @fm100
  The `ended_at` column was always null when querying if `columnPrefix` was given for the mapper. Now, `columnPrefix` is included when checking for column existence.
- Fix bug keeping jobs from being properly deleted #2244 @mobuchowski
  It wasn't possible to delete jobs created from events that had a `ParentRunFacet`. Now it's possible.
- Fix symlink table column length #2217 @pawel-big-lebowski
  The dataset's name column in the `dataset_symlinks` table was shorter than the column in the datasets table. Changes the existing V48 migration script to allow proper migration for users who did not upgrade yet, and adds an extra migration script to extend the column length for users who did upgrade but did not experience the issues.
0.27.0 - 2022-10-24
- Implement dataset symlink feature #2066 @pawel-big-lebowski
  Adds support for multiple dataset names and adds edges to the lineage graph based on symlinks.
- Store column lineage facets in separate table #2096 @mzareba382 @pawel-big-lebowski
  Adds a column-level lineage representation and API endpoint to retrieve column-level lineage data from the Marquez database.
- Add a lineage graph endpoint for column lineage #2124 @pawel-big-lebowski
  Allows for the storing of column-lineage information from events in the Marquez database and exposes column lineage through a graph endpoint.
- Enrich returned dataset resource with column lineage information #2113 @pawel-big-lebowski
  Extends the `/api/v1/namespaces/{namespace}/datasets` endpoint to return the `columnLineage` facet.
- Add downstream column lineage #2159 @pawel-big-lebowski
  Extends the recursive query that returns column lineage nodes to traverse the graph for downstream nodes.
- Implement column lineage within Marquez Java client #2163 @pawel-big-lebowski
  Adds Marquez API client methods for column lineage.
- Provide `dataset_symlinks` table for `SymlinkDatasetFacet` #2087 @pawel-big-lebowski
  Modifies Marquez to handle the new `SymlinkDatasetFacet` in the OpenLineage spec.
- Display current run state for job node in lineage graph #2146 @wslulciuc
  Fills job nodes in the lineage graph with the latest run state and makes some minor changes to column names used to display dataset and job metadata.
- Include column lineage in dataset resource #2148 @pawel-big-lebowski
  Creates a method in `ColumnLineageService` to enrich `Dataset` with column lineage information and uses the method in `DatasetResource`.
- Add indices on the job table #2161 @phixMe
  Adds indices to the fields we join on inside the lineage query to speed up the join operation in the `/lineage` query.
- Add endpoint to get column lineage by a job #2204 @pawel-big-lebowski
  Changes the API to make column lineage available for jobs.
- Add column lineage methods to Python client #2209 @pawel-big-lebowski
  Implements methods for column lineage in the Python client.
- Update insert job function to avoid joining on symlinks for jobs with no symlinks #2144 @collado-mike
  Radically reduces the database compute load in Marquez installations that frequently create a large number of new jobs.
- Increase size of `column-lineage.description` column #2205 @pawel-big-lebowski
  `VARCHAR(255)` was too small for some users.
- Add support for `parentRun` facet as reported by older Airflow OpenLineage versions #2130 @collado-mike
  Adds a `parentRun` alias to the `LineageEvent` `RunFacet`.
- Add fix and tests for handling Airflow DAGs with dots and task groups #2126 @collado-mike @wslulciuc
  Fixes a recent change that broke how Marquez handles DAGs with dots and tasks within task groups and adds test cases to validate.
- Fix version bump in `docker/up.sh` #2129 @wslulciuc
  Defines a `VERSION` variable to bump on a release.
- Use `clean` when running `shadowJar` in Dockerfile #2145 @wslulciuc
  Ensures the directory `api/build/libs/` is cleaned before building the JAR again and updates `.dockerignore` to ignore `api/build/*`.
- Fix bug that caused a single run event to create multiple jobs #2162 @collado-mike
  Checks to see if a run with the given ID already exists and uses the pre-associated job if so.
- Fix column lineage returning multiple entries for job run multiple times #2176 @pawel-big-lebowski
  Makes column lineage return a column dependency only once if a job has been run several times.
- Fix API spec issues #2178 @phixMe
  Fixes issues with type generators in the `putDataset` API.
- Fix downstream recursion #2181 @pawel-big-lebowski
  Fixes an issue causing the same node to be added to the recursive table multiple times.
- Update `jobs_current_version_uuid_index` and `jobs_symlink_target_uuid_index` to ignore `NULL` values #2186 @collado-mike
  Avoids writing to the indices when the indexed values added by #2161 are null.
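One common way to keep `NULL` values out of an index, as described for #2186 above, is a partial index. The following is a minimal sketch only: it uses SQLite instead of the PostgreSQL database Marquez runs on, the `jobs` table is reduced to a few columns, and the helper names `create_jobs_table` and `index_definition` are hypothetical.

```python
import sqlite3


def create_jobs_table(conn: sqlite3.Connection) -> None:
    """Create a simplified, hypothetical jobs table with a NULL-skipping index."""
    conn.execute(
        "CREATE TABLE jobs ("
        "uuid TEXT PRIMARY KEY, name TEXT, symlink_target_uuid TEXT)"
    )
    # The WHERE clause makes this a partial index: rows whose
    # symlink_target_uuid is NULL are never written to the index.
    conn.execute(
        "CREATE INDEX jobs_symlink_target_uuid_index "
        "ON jobs (symlink_target_uuid) "
        "WHERE symlink_target_uuid IS NOT NULL"
    )


def index_definition(conn: sqlite3.Connection, name: str):
    """Return the stored CREATE INDEX statement for the named index, if any."""
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'index' AND name = ?",
        (name,),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
create_jobs_table(conn)
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?)",
    [("j1", "etl.load", None), ("j2", "etl.load_old", "j1")],
)
```

Most jobs have no symlink target, so skipping `NULL` rows keeps the index small and avoids an index write on every insert of such a job.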
0.26.0 - 2022-09-15
- Update FlywayFactory to support an argument to customize the schema programmatically #2055 @collado-mike
  Note: this change does not aim to support custom schemas from configuration.
- Add steps on proposing changes to Marquez #2065 @wslulciuc
  Adds steps on how to submit a proposal for review along with a design doc template.
- Add `--metadata` option to seed backend with OL events #2082 @wslulciuc
  Updates the `seed` command to load metadata from a file containing an array of OpenLineage events via the `--metadata` option. (Metadata used in the command was not being defined using the OpenLineage standard.)
- Improve documentation on `nodeId` in the spec #2084 @howardyoo
  Adds complete examples of `nodeId` to the spec.
- Add `metadata` cmd #2091 @wslulciuc
  Adds cmd `metadata` to generate OpenLineage events; generated events will be saved to a file called `metadata.json` that can be used to seed Marquez via the `seed` cmd. (We lacked a way to performance test the data model of Marquez with significantly large OL events.)
- Add possibility to soft-delete datasets and jobs #2032 #2099 #2101 @mobuchowski
  Adds the ability to "hide" inactive datasets and jobs through the UI. (This PR does not include the UI part.) The feature works by adding an `is_hidden` flag to both datasets and jobs tables. Then, it changes `jobs_view` and adds `datasets_view`, which hides rows where the `is_hidden` flag is set to True. This makes writing proper queries easier since there is no need to do this filtering manually. The soft-delete is reversed if the job or dataset is updated again because the new version reverts the flag.
- Add raw OpenLineage events API #2070 @mobuchowski
  Adds an API that returns raw OpenLineage events sorted by time and optionally filtered by namespace. Filtering by namespace takes into account both job and dataset namespaces.
- Create column lineage endpoint proposal #2077 @julienledem @pawel-big-lebowski
  Adds a proposal to implement a column-level lineage endpoint in Marquez to leverage the column-level lineage facet in OpenLineage.
- Update lineage query to only look at jobs with inputs or outputs #2068 @collado-mike
  Changes the lineage query to query the `job_versions_io_mapping` table and INNER join with the `jobs_view` so that only jobs that have inputs or outputs are present in the `jobs_io` CTE. Hence, the table becomes very small and the recursive join in the lineage CTE very fast. (In many environments, a large number of jobs reporting events have no inputs or outputs - e.g., PythonOperators in an Airflow deployment. If a Marquez installation has many of these, the lineage query spends much of its time searching for overlaps with jobs that have no inputs or outputs.)
- Persist OpenLineage event before updating Marquez model #2069 @fm100
  Switches the order of the code in order to persist the OpenLineage event first and then update the Marquez model. (When the `RunTransitionListener` was invoked, the OpenLineage event was not persisted to the database. Because the OpenLineage event is the source of truth for all Marquez run transitions, it should be available from `RunTransitionListener`.)
- Drop requirement to provide marquez.yml for `seed` cmd #2094 @wslulciuc
  Uses `io.dropwizard.cli.Command` instead of `io.dropwizard.cli.ConfiguredCommand` to no longer require passing marquez.yml as an argument to the `seed` cmd. (The marquez.yml argument is not used in the `seed` cmd.)
- Fix/rewrite jobs fqn locks #2067 @collado-mike
  Updates the function to only update the table if the job is a new record or if the `symlink_target_uuid` is distinct from the previous value. (The `rewrite_jobs_fqn_table` function was inadvertently updating jobs even when no metadata about the job had changed. Under load, this caused significant locking issues, as the `jobs_fqn` table must be locked for every job update.)
- Fix `enum` string types in the OpenAPI spec #2086 @studiosciences
  Changes the type to `string`. (`type: enum` was not valid in OpenAPI spec.)
- Fix incorrect PostgreSQL version #2089 @jabbera
  Corrects the tag for PostgreSQL.
- Update `OpenLineageDao` to handle Airflow run UUID conflicts #2097 @collado-mike
  Alleviates the problem for Airflow installations that will continue to publish events with the older OpenLineage library. This checks the namespace of the parent run and verifies that it matches the namespace in the `ParentRunFacet`. If not, it generates a new parent run ID that will be written with the correct namespace. (The Airflow integration was generating conflicting UUIDs based on the DAG name and the DagRun ID without accounting for different namespaces. In Marquez installations that have multiple Airflow deployments with duplicated DAG names, we generated jobs whose parents have the wrong namespace.)
0.25.0 - 2022-08-08
- Fix py module release #2057 @wslulciuc
- Use `/bin/sh` in `web/docker/entrypoint.sh` #2059 @wslulciuc
0.24.0 - 2022-08-02
- Add copyright lines to all source files #1996 @merobi-hub
- Add copyright and license guidelines in `CONTRIBUTING.md` @wslulciuc
- Add `@FlywayTarget` annotation to migration tests to control flyway upgrades #2035 @collado-mike
- Updated `jobs_view` to stop computing FQN on reads and to compute on writes instead #2036 @collado-mike
- Runs row reduction #2041 @collado-mike
- Update `Run` in the openapi spec to include a `context` field #2020 @esaych
- Fix dataset openapi model #2038 @esaych
- Fix casing on `lastLifecycleState` #2039 @esaych
- Fix V45 migration to include initial population of jobs_fqn table #2051 @collado-mike
- Fix symlinked jobs in queries #2053 @collado-mike
0.23.0 - 2022-06-16
- Update docker-compose.yml: Randomly map postgres db port #2000 @RNHTTR
- Job parent hierarchy #1935 #1980 #1992 @collado-mike
- Set default limit for listing datasets and jobs in UI from `2000` to `25` #2018 @wslulciuc
- Update OpenLineage write API to be non-transactional and avoid unnecessary locks on records under heavy contention @collado-mike
0.22.0 - 2022-05-16
- Add support for `LifecycleStateChangeFacet` with an ability to softly delete datasets #1847 @pawel-big-lebowski
- Enable pod specific annotations in Marquez Helm Chart via `marquez.podAnnotations` #1945 @wslulciuc
- Add support for job renaming/redirection via symlink #1947 @collado-mike
- Add `Created by` view for dataset versions along with SQL syntax highlighting in web UI #1929 @phixMe
- Add `operationId` to openapi spec #1978 @phixMe
- Upgrade Flyway to v7.6.0 #1974 @dakshin-k
- Remove size limits on namespaces, dataset names, and source connection urls #1925 @collado-mike
- Update namespace names to allow `=`, `@`, and `;` #1936 @mobuchowski
- Time duration display in web UI #1950 @phixMe
- Enable web UI to access API via Helm Chart @GZack2000
0.21.0 - 2022-03-03
- Add MDC to the `LoggingMdcFilter` to include API method, path, and request ID @fm100
- Add Postgres sub-chart to Helm deployment for easier installation option @KevinMellott91
- GitHub Action workflow to validate changes to Helm chart @KevinMellott91
- Upgrade from `Java11` to `Java17` @ucg8j
- Switch JDK image from `alpine` to `temurin` enabling Marquez to run on multiple CPU architectures @ucg8j
- Error when running Marquez on Apple M1 @ucg8j
- The `/api/v1-beta/lineage` endpoint @wslulciuc
- The `marquez-airflow` lib. has been removed. Please use the `openlineage-airflow` library instead. To migrate to using `openlineage-airflow`, make the following changes @wslulciuc:

      # Update the import in your DAG definitions
      -from marquez_airflow import DAG
      +from openlineage.airflow import DAG

      # Update the following environment variables in your Airflow instance
      -MARQUEZ_URL
      +OPENLINEAGE_URL
      -MARQUEZ_NAMESPACE
      +OPENLINEAGE_NAMESPACE

- The `marquez-spark` lib. has been removed. Please use the `openlineage-spark` library instead. To migrate to using `openlineage-spark`, make the following changes @wslulciuc:

      SparkSession.builder()
      - .config("spark.jars.packages", "io.github.marquezproject:marquez-spark:0.20.+")
      + .config("spark.jars.packages", "io.openlineage:openlineage-spark:0.2.+")
      - .config("spark.extraListeners", "marquez.spark.agent.SparkListener")
      + .config("spark.extraListeners", "io.openlineage.spark.agent.OpenLineageSparkListener")
        .config("spark.openlineage.host", "https://api.demo.datakin.com")
        .config("spark.openlineage.apiKey", "your datakin api key")
        .config("spark.openlineage.namespace", "<NAMESPACE_NAME>")
        .getOrCreate()

0.20.0 - 2021-12-13
- Add deploy docs for running Marquez on AWS @wslulciuc @merobi-hub
- Clarify docs on using OpenLineage for metadata collection @fm100
- Upgrade to gradle `7.x` @wslulciuc
- Use `eclipse-temurin` for Marquez API base docker image @fm100
- The following endpoints have been deprecated and are scheduled to be removed in `0.25.0`. Please use the `/lineage` endpoint when collecting source, dataset, and job metadata @wslulciuc:
- Validation of OpenLineage events on write @collado-mike
- Increase `name` column size for tables `namespaces` and `sources` @mmeasic
0.19.1 - 2021-11-05
- URI and URL DB mapper should handle empty string as null @OleksandrDvornik
- Fix NodeId parsing when dataset name contains `struct<>` @fm100
- Add encoding for dataset names in URL construction @collado-mike
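Dataset names and namespaces often contain characters such as `/`, `:`, `<`, and `>`, which is why they need encoding before being placed in a URL path. The sketch below shows the general technique with Python's standard library. It is not the Marquez client code: the function name `dataset_url`, the base URL, and the trailing `/{dataset}` path segment are assumptions made for illustration.

```python
from urllib.parse import quote


def dataset_url(base_url: str, namespace: str, dataset: str) -> str:
    """Build a dataset URL with each path segment percent-encoded.

    safe="" ensures that "/" and ":" inside a name are encoded as well,
    so a name can never be mistaken for extra path segments.
    """
    encoded_namespace = quote(namespace, safe="")
    encoded_dataset = quote(dataset, safe="")
    return f"{base_url}/namespaces/{encoded_namespace}/datasets/{encoded_dataset}"


# Hypothetical values, for illustration only.
url = dataset_url("http://localhost:5000/api/v1", "s3://bucket", "events/2021")
```

Encoding each segment separately, rather than the whole URL at once, keeps the separators between segments intact while protecting the content of each name.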
0.19.0 - 2021-10-21
- Add simple python client example @wslulciuc
- Display dataset versions in web UI 🎉 @phixMe
- Display runs and run facets in web UI 🎉 @phixMe
- Facet formatting and highlighting as JSON in web UI @phixMe
- Add option for `docker/up.sh` to run in the background @rossturk
- Return `totalCount` in lists of jobs and datasets @phixMe
- Change type column in `dataset_fields` table to `TEXT` @wslulciuc
- Set `ZonedDateTime` parsing to support optional offsets and default to server timezone @collado-mike
- `Job.location` and `Source.connectionUrl` should be in URI format on write @OleksandrDvornik
- Z-Index fix for nodes and edges in lineage graph @phixMe
- Format of the index files for web UI @phixMe
- Fix OpenLineage API to return correct response codes for exceptions propagated from async calls @collado-mike
- Stopped overwriting nominal time information with nulls @mobuchowski
- `WriteOnly` clients for `java` and `python`. Before OpenLineage, we added a `WriteOnly` implementation to our clients to emit calls to a backend. A `backend` enabled collecting raw HTTP requests to an HTTP endpoint, console, or file. This was our way of capturing lineage events that could then be used to automatically create resources on the Marquez backend. We soon worked on a standard that eventually became OpenLineage. That is, OpenLineage removed the need to make individual calls to create a namespace, a source, a dataset, etc., and instead accepts an event with metadata that the backend can process. @wslulciuc
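The `ZonedDateTime` change in 0.19.0 (optional offsets, defaulting to the server timezone) can be illustrated outside of Java. The sketch below uses Python's standard library to show the same idea; it is not the Marquez implementation, and the function name `parse_timestamp` is hypothetical.

```python
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp whose UTC offset is optional.

    If the input carries no offset, the wall-clock time is interpreted in
    the local timezone of the machine running this code (the "server").
    """
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # astimezone() on a naive datetime assumes local time and attaches
        # the local timezone without changing the wall-clock fields.
        parsed = parsed.astimezone()
    return parsed


with_offset = parse_timestamp("2021-10-21T10:00:00+02:00")
without_offset = parse_timestamp("2021-10-21T10:00:00")
```

Both results are timezone-aware, so they can be compared or stored consistently; only the second one depends on where the server runs.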
0.18.0 - 2021-09-14
- New Add Search API 🎉 @wslulciuc
- Add `.env.example` to override variables defined in docker-compose files @wslulciuc
- Add openlineage-java as dependency @OleksandrDvornik
- Move class SentryConfig from `marquez` to `marquez.tracing` pkg
- Major UI improvements; the UI now uses the Search and Lineage APIs 🎉 @phixMe
- Set default API port to `8080` when running the Marquez shadow `jar` @wslulciuc
- Update `examples/airflow` to use `openlineage-airflow` and fix the SQL in DAG troubleshooting step @wslulciuc
- Drop `job_versions_io_mapping_inputs` and `job_versions_io_mapping_outputs` tables @OleksandrDvornik
0.17.0 - 2021-08-20
- Update Lineage runs query to improve performance, added tests @collado-mike
- Add POST `/api/v1/lineage` endpoint to docs and deprecate run endpoints @wslulciuc
- Drop `FieldType` enum @wslulciuc
- Run API endpoints that create or modify a job run (scheduled to be removed in `0.19.0`). Please use the POST `/api/v1/lineage` endpoint when collecting job run metadata. @wslulciuc
- Airflow integration, please use the `openlineage-airflow` library instead. @wslulciuc
- Spark integration, please use the `openlineage-spark` library instead. @wslulciuc
- Write only clients for `java` and `python` (scheduled to be removed in `0.19.0`) @wslulciuc
- Dbt integration lib. @wslulciuc
- Common integration lib. @wslulciuc
0.16.1 - 2021-07-13
- dbt packages should look for namespace packages @mobuchowski
- Add common integration dependency to dbt plugins @mobuchowski
- `DatasetVersionDao` queries missing input and output facets @dominiquetipton
- (De)serialization issue for `Run` and `JobData` models @collado-mike
- Prefix spark `openlineage.*` configuration parameters with `spark.*` @collado-mike
- Parse multi-statement sql in class `SqlParser` used in Airflow integration @wslulciuc
- URL-encode namespace on calls to API backend @phixMe
0.16.0 - 2021-07-01
- New Add JobVersion API 🎉 @collado-mike
- New Add DBT integrations for BigQuery and Snowflake 🎉 @mobuchowski
- Reverted delete of BigQueryNodeVisitor to work with vanilla SparkListener @collado-mike
- Promote Lineage API out of beta @OleksandrDvornik
- Display job SQL in UI @phixMe
- Allow upsert of tags @hanbei
- Allow potentially ambiguous URIs with encoded path segments @mobuchowski
- Use source naming convention defined by OpenLineage @mobuchowski
- Return dataset facets @collado-mike
- BigQuery source naming in integrations @mobuchowski
0.15.2 - 2021-06-17
- Add endpoint to create tags @hanbei
- Fixed build & release process for python marquez-integration-common package @collado-mike
- Fixed snowflake and bigquery errors when connector libraries not loaded @collado-mike
- Fixed OpenLineage API not setting Dataset current_version_uuid #1361 @collado-mike
0.15.1 - 2021-06-11
- Factored out common functionality in Python airflow integration @mobuchowski
- Added Airflow task run macro to expose task run id @collado-mike
- Refactored ValuesAverageExpectationParser to ValuesSumExpectationParser and ValuesCountExpectationParser @collado-mike
- Updated SparkListener to extend Spark's SparkListener abstract class @collado-mike
- Use current project version in spark openlineage client @mobuchowski
- Rewrote LineageDao queries and LineageService for performance @collado-mike
- Updated lineage query to include new jobs that have no job version yet @collado-mike
0.15.0 - 2021-05-24
- Add tracing visibility @julienledem
- New Add snowflake extractor 🎉 @mobuchowski
- Add SSLContext to MarquezClient @lewiesnyder
- Add support for LogicalRDDs in spark plan visitors @collado-mike
- New Add Great Expectations based data quality facet support 🎉 @mobuchowski
- Augment tutorial instructions & screenshots for Airflow example @rossturk
- Rewrite correlated subqueries when querying the lineage_events table @collado-mike
- Web time formatting display fix @kachontep
0.14.2 - 2021-05-06
- Unpin `requests` dep in `marquez-airflow` integration @wslulciuc
- Unpin `attrs` dep in `marquez-airflow` integration @wslulciuc
0.14.1 - 2021-05-05
- Updated dataset lineage query to find most recent job that wrote to it @collado-mike
- Pin http-proxy-middleware to 0.20.0 @wslulciuc
0.14.0 - 2021-05-03
- GA tag for website tracking @rossturk
- Basic CTE support in `marquez-airflow` @mobuchowski
- Airflow custom facets, bigquery statistics facets @mobuchowski
- Unit tests for class `JobVersionDao` @wslulciuc
- Sentry tracing support @julienledem
- OpenLineage facets support to API response models 🎉 @wslulciuc
- `BigQueryRelationTransformer` and deleted `BigQueryNodeVisitor` @collado-mike
- Bump postgres to `12.1.0` @wslulciuc
- Update spark job name to reflect spark application name and execution node @collado-mike
- Update `marquez-airflow` integration to use OpenLineage 🎉 @mobuchowski
- Migrate tests to junit 5 @mobuchowski
- Rewrite lineage IO sql queries to avoid job_versions_io_mapping_* tables @collado-mike
- Updated OpenLineage impl to only update dataset version on run completion @collado-mike
0.13.1 - 2021-04-01
- Remove unused implementation of SQL parser in `marquez-airflow` @mobuchowski
- Add inputs and outputs to lineage graph @henneberger
- Updated `NodeId` regex to support URIs with scheme and ports @collado-mike
0.13.0 - 2021-03-30
- Secret support for helm chart @KevinMellott91
- New `seed` cmd to populate `marquez` database with source, dataset, and job metadata allowing users to try out features of Marquez (data lineage, view job run history, etc) 🎉
- Docs on applying db migrations manually
- New Lineage API to support data lineage queries 🎉
- Support for logging errors via sentry
- New Airflow example with Marquez 🎉
- Update OpenLineageDao to stop converting URI structures to contain underscores instead of colons and slashes @collado-mike
- Bump testcontainers dependency to `v1.15.2` @ShakirzyanovArsen
- Register output datasets for a run lazily @henneberger
- Refactor spark plan traversal to find input/output datasets from datasources @collado-mike
- Web UI project settings and default marquez port @phixMe
- Associate dataset inputs on run start @henneberger
- Dataset description is not overwritten on update @henneberger
- Latest tags are returned from dataset @henneberger
- Airflow integration tests on forked PRs @mobuchowski
- Empty nominal end time support @henneberger
- Ensure valid dataset fields for OpenLineage @henneberger
- Ingress context templating for helm chart @KulykDmytro
0.12.2 - 2021-03-16
- Use alpine image for `marquez` reducing image size by `+50%` @KevinMellott91
- Use alpine image for `marquez-web` reducing image size by `+50%` @KevinMellott91
- Ensure `marquez.DAG` is (de)serializable
0.12.0 - 2021-02-08
- Modules: `api`, `web`, `clients`, `chart`, and `integrations`
- Working airflow example
- `runs` table indices for columns: `created_at` and `current_run_state` @phixMe
- New `/lineage` endpoint for OpenLineage support @henneberger
- New graphql endpoint @henneberger
- New spark integration @henneberger
- New API to list versions for a dataset
- Drop `Source.type` enum (now a string type)
- Replace `jdbi.getHandle()` with `jdbi.withHandle()` to free DB connections from pool @henneberger
- Fix `RunListener` when registering outside of the `MarquezContext` builder @henneberger
0.11.3 - 2020-11-02
- Add support for external ID on run creation @julienledem
- Throw `RunAlreadyExistsException` when run ID already exists
- Add BigQuery, Pulsar, and Oracle source types @sreev
- Add run ID support in job meta; the optional run ID will be used to link a newly created job version to an existing job run, while supporting updating the run state and avoiding having to create another run
- Use `postgres` instead of `db` in `marquez.dev.yml`
- Allow multiple postgres containers in test suite @phixMe
0.11.2 - 2020-08-21
- Always migrate db schema on app start in development config
- Update default db username / password
- Use `marquez.dev.yml` on docker compose `up`
0.11.1 - 2020-08-19
- Use shortened name for namespaces in version IDs
- Add namespace to Dataset and Job models
- Add ability to deserialize `int` type to columns @phixMe
- Add `SqlLogger` for SQL profiling
- Add `DatasetVersionId.asDatasetId()` and `JobVersionId.asJobId()`
- Add `DatasetService.getBy(DatasetVersionId): Dataset`
- Add `JobService.getBy(JobVersionId): Job`
- Allow for run transition override via `at=<TIMESTAMP>`, where `TIMESTAMP` is an ISO 8601 timestamp representing the date/time of the state transition. For example: `POST /jobs/runs/{id}/start?at=<TIMESTAMP>`
- `config.yml` -> `marquez.yml`
- Fix dataset version column mappings
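The run transition override in 0.11.1 takes an ISO 8601 timestamp in the `at` query parameter. A minimal sketch of building such a request URL follows. The path shape comes from the example in the entry above, while the function name `run_transition_url`, the base URL, and the run ID are hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def run_transition_url(base_url: str, run_id: str, transition: str, at: datetime) -> str:
    """Build the URL for a run state transition with an explicit timestamp.

    The timestamp is serialized as ISO 8601 and percent-encoded, since
    characters such as ":" and "+" are not safe in a query string.
    """
    query = urlencode({"at": at.isoformat()})
    return f"{base_url}/jobs/runs/{run_id}/{transition}?{query}"


# Hypothetical values, for illustration only.
url = run_transition_url(
    "http://localhost:5000/api/v1",
    "run-123",
    "start",
    datetime(2020, 8, 19, 12, 0, 0, tzinfo=timezone.utc),
)
```

Using a timezone-aware `datetime` keeps the offset in the serialized value, so the backend does not have to guess which timezone the transition time refers to.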
0.11.0 - 2020-05-27
- `Run.startedAt`, `Run.endedAt`, `Run.duration` @julienledem
- class `MarquezContext` @julienledem
- class `RunTransitionListener` @julienledem
- Unique identifier class `DatasetId` for datasets @julienledem
- Unique identifier class `JobId` for jobs @julienledem
- class `RunId` @ravikamaraj
- enum `RunState` @ravikamaraj
- class `Version` @ravikamaraj
- Job inputs / outputs are defined as `DatasetId`
- Bump to JDK 11
- Use of API models under `marquez.api.models` pkg
- API docs example to show correct `SQL` key in job context @frankcash
0.10.4 - 2020-01-17
- Fix `RunState.isComplete()`
0.10.3 - 2020-01-17
- Add new logo
- Add `JobResource.locationFor()`
- Fix dataset field versioning
- Fix list job runs
0.10.2 - 2020-01-16
- Added Location header to run creation @nkijak
0.10.1 - 2020-01-11
- Rename `datasets.last_modified`
0.10.0 - 2020-01-08
- Rename table `dataset_tag_mapping`
0.9.2 - 2020-01-07
- Add `Flyway.baselineOnMigrate` flag
0.9.1 - 2020-01-06
- Add redshift data types
- Add links to dropwizard overrides in `config.yml`
0.9.0 - 2020-01-05
- Validate `runID` when linked to dataset change
- Add `Utils.toUuid()`
- Add tests for class `TagDao`
- Add default tags to config
- Add tagging support for dataset fields
- Add `docker/config.dev.yml`
- Add flyway config support
- Replace deprecated `App.onFatalError()`
- Fix error on tag exists
- Fix malformed sql in `RunDao.findAll()`
0.8.0 - 2019-12-12
- Add `Dataset.lastModified`
- Add `tags` table schema
- Add `GET` `/tags`
- Use new Flyway version to fix migration with custom roles
- Modify `args` column in table `run_args`
0.7.0 - 2019-12-05
- Link dataset versions with run inputs
- Add schema required by tagging
- More tests for class `common.Utils`
- Add `ColumnsTest`
- Add `RunDao.insert()`
- Add `RunStateDao.insert()`
- Add `METRICS.md`
- Add prometheus dep and expose `GET` `/metrics`
- Fix dataset field serialization
0.6.0 - 2019-11-29
- Add `Job.latestRun`
- Add debug logging
- Adjust class RunResponse property ordering on serialization
- Update logging on default namespace creation
0.5.1 - 2019-11-20
- Add dataset field versioning support
- Add link to web UI
- Add `Job.context`
- Update semver regex in build-and-push.sh
- Minor updates to job and dataset versioning functions
- Make `Job.location` optional
0.5.0 - 2019-11-04
- Add `lombok.config`
- Add code review guidelines
- Add `JobType`
- Add limit and offset support to NamespaceAPI
- Add Development section to `CONTRIBUTING.md`
- Add class `DatasetMeta`
- Add class `MorePreconditions`
- Added install instructions for docker
- Rename guid column to uuid
- Use admin ping and health
- Update `owner` to `ownerName`
- Remove experimental db table versioning code
- Fix `marquez.jar` rename on `COPY`
0.4.0 - 2019-06-04
- Add quickstart
- Add `GET` `/namespaces/{namespace}/jobs/{job}/runs`
0.3.4 - 2019-05-17
- Change `Datasetdao.findAll()` to order by `Dataset.name`
0.3.3 - 2019-05-14
- Set timestamps to `CURRENT_TIMESTAMP`
0.3.2 - 2019-05-14
- Set `job_versions.updated_at` to `CURRENT_TIMESTAMP`
0.3.1 - 2019-05-14
- Handle `Flyway.repair()` error
0.3.0 - 2019-05-14
- Add `JobResponse.updatedAt`
- Return timestamp strings as ISO format
- Remove unused tables in db schema
0.2.1 - 2019-04-22
- Support dashes (`-`) in namespace
0.2.0 - 2019-04-15
- Add `@NoArgsConstructor` to exceptions
- Add license to `*.java`
- Add column constants
- Add response/error metrics to API endpoints
- Add build info to jar manifest
- Add release steps and plugin
- Add `/jobs/runs/{id}/run`
- Add jdbi metrics
- Add gitter link
- Add column constants
- Add `MarquezServiceException`
- Add `-parameters` compiler flag
- Add JSON logging support
- Minor pkg restructuring
- Throw `NamespaceNotFoundException` on `NamespaceResource.get()`
- Fix dataset list error
0.1.0 - 2018-12-18
- Marquez initial public release.