diff --git a/README.md b/README.md
index 88f2cb076e..d03293ff61 100644
--- a/README.md
+++ b/README.md
@@ -40,10 +40,28 @@ Want to be added? Send a pull request our way!

 Marquez provides a simple way to collect and view _dataset_, _job_, and _run_ metadata using [OpenLineage](https://openlineage.io). The easiest way to get up and running is with Docker. From the base of the Marquez repository, run:

-```
+### MacOS and Linux users:
+
+```bash
 $ ./docker/up.sh
 ```

+### Windows users:
+
+Before cloning Marquez, configure Git to check out files with Unix-style file endings:
+
+```bash
+$ git config --global core.autocrlf false
+```
+
+Verify that Bash and PostgreSQL have been installed and added to the PATH variable (Git Bash is recommended).
+
+Start all services:
+
+```bash
+$ sh ./docker/up.sh
+```
+
 > **Tip:** Use the `--build` flag to build images from source, and/or `--seed` to start Marquez with sample lineage metadata. For a more complete example using the sample metadata, please follow our [quickstart](https://marquezproject.github.io/marquez/quickstart.html) guide.

 > **Note:** Port 5000 is now reserved for MacOS. If running locally on MacOS, you can run `./docker/up.sh --api-port 9000` to configure the API to listen on port 9000 instead. Keep in mind that you will need to update the URLs below with the appropriate port number.
@@ -79,8 +97,8 @@ Versions of Marquez are compatible with OpenLineage unless noted otherwise. We e

 | **Marquez** | **OpenLineage** | **Status** |
 |--------------------------------------------------------------------------------------------------|---------------------------------------------------------------|---------------|
 | [`UNRELEASED`](https://github.com/MarquezProject/marquez/blob/main/CHANGELOG.md#unreleased) | [`1-0-5`](https://openlineage.io/spec/1-0-5/OpenLineage.json) | `CURRENT` |
-| [`0.44.0`](https://github.com/MarquezProject/marquez/blob/0.43.0/CHANGELOG.md#0430---2023-12-15) | [`1-0-5`](https://openlineage.io/spec/1-0-5/OpenLineage.json) | `RECOMMENDED` |
-| [`0.43.1`](https://github.com/MarquezProject/marquez/blob/0.42.0/CHANGELOG.md#0420---2023-10-17) | [`1-0-5`](https://openlineage.io/spec/1-0-0/OpenLineage.json) | `MAINTENANCE` |
+| [`0.46.0`](https://github.com/MarquezProject/marquez/blob/0.46.0/CHANGELOG.md#0460---2024-03-15) | [`1-0-5`](https://openlineage.io/spec/1-0-5/OpenLineage.json) | `RECOMMENDED` |
+| [`0.45.0`](https://github.com/MarquezProject/marquez/blob/0.45.0/CHANGELOG.md#0450---2024-03-07) | [`1-0-5`](https://openlineage.io/spec/1-0-0/OpenLineage.json) | `MAINTENANCE` |

 > **Note:** The [`openlineage-python`](https://pypi.org/project/openlineage-python) and [`openlineage-java`](https://central.sonatype.com/artifact/io.openlineage/openlineage-java) libraries will a higher version than the OpenLineage [specification](https://github.com/OpenLineage/OpenLineage/tree/main/spec) as they have different version requirements.
@@ -130,7 +148,7 @@ $ createdb marquez

 With your database created, you can now copy [`marquez.example.yml`](https://github.com/MarquezProject/marquez/blob/main/marquez.example.yml):

-```
+```bash
 $ cp marquez.example.yml marquez.yml
 ```

diff --git a/docs/v2/README.md b/docs/v2/README.md
index 561cc2e89e..bba7406b4b 100644
--- a/docs/v2/README.md
+++ b/docs/v2/README.md
@@ -4,7 +4,7 @@ This website is built using [Docusaurus 2](https://docusaurus.io/), a modern sta

 ### Installation

-```
+```bash
 $ yarn
 ```

@@ -19,7 +19,7 @@ yarn docusaurus gen-api-docs all

 ### Local Development

-```
+```bash
 $ yarn start
 ```

@@ -27,7 +27,7 @@ This command starts a local development server and opens up a browser window. Mo

 ### Build

-```
+```bash
 $ yarn build
 ```

@@ -37,13 +37,13 @@ This command generates static content into the `build` directory and can be serv

 Using SSH:

-```
+```bash
 $ USE_SSH=true yarn deploy
 ```

 Not using SSH:

-```
+```bash
 $ GIT_USER= yarn deploy
 ```

diff --git a/docs/v2/docs/quickstart/index.mdx b/docs/v2/docs/quickstart/index.mdx
index 61f066675b..648dae9cdc 100644
--- a/docs/v2/docs/quickstart/index.mdx
+++ b/docs/v2/docs/quickstart/index.mdx
@@ -4,6 +4,8 @@ template: basepage
 sidebar_position: 1
 ---

+import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem';
+
 ## Table of Contents

 1. [Prerequisites](#prerequisites)
@@ -21,19 +23,47 @@ This guide covers how you can quickly get started collecting _dataset_, _job_, a

 Before you begin, make sure you have installed:

+
+
+
+* [Docker 17.05](https://docs.docker.com/install)+
+* [Docker Compose](https://docs.docker.com/compose/install)
+
+
+
+
+* [Git Bash](https://gitforwindows.org/)
+* [PostgreSQL 14](https://www.postgresql.org/)
 * [Docker 17.05](https://docs.docker.com/install)+
 * [Docker Compose](https://docs.docker.com/compose/install)
+
+
+

 > Note: In this guide, we'll be running the Marquez HTTP server via Docker.

 ## Get Marquez

 To checkout the Marquez source code, run:

+
+
+
+```bash
+$ git clone https://github.com/MarquezProject/marquez && cd marquez
 ```
+
+
+
+
+```bash
+$ git config --global core.autocrlf false
 $ git clone https://github.com/MarquezProject/marquez && cd marquez
 ```

+
+
+
 ## Marquez Data Model {#marquez-data-model}

 ### Metadata Storage
@@ -60,10 +90,25 @@ In this example, we'll be using sample dataset, job, and run metadata for a hypo

 To start Marquez with sample metadata that will be used and referenced in later sections, run the following script from the base of the Marquez repository (the `--seed` flag will execute the `marquez seed` [command](https://github.com/MarquezProject/marquez/blob/main/api/src/main/java/marquez/cli/SeedCommand.java)):

-```
+
+
+
+```bash
 $ ./docker/up.sh --seed
 ```

+
+
+
+Verify that Postgres and Bash are in your `PATH`, then run:
+
+```bash
+$ sh ./docker/up.sh --seed
+```
+
+
+
+
 > **Tip:** Use the `--build` flag to build images from source, or `--tag X.Y.Z` to use a tagged image.

 To view the Marquez UI and verify it's running, open [http://localhost:3000](http://localhost:3000). The UI enables you to discover dependencies between jobs and the datasets they produce and consume via the lineage graph, view run-level metadata of current and previous job runs, and much more!
@@ -78,7 +123,7 @@ To view lineage metadata collected by Marquez, browse to the UI by visiting [htt

 ### View Job Metadata

-You should see the job `namespace`, `name`, the query, and a tab containing its run history:
+You should see the job `namespace`, `name`, the query, and a tab containing its run history:

 ![](tab-view-job-metadata.png)

@@ -105,4 +150,4 @@ In this simple example, we showed you how to write sample lineage metadata to a

 ## Feedback {#feedback}

-What did you think of this guide? You can reach out to us on [slack](https://join.slack.com/t/marquezproject/shared_invite/zt-29w4n8y45-Re3B1KTlZU5wO6X6JRzGmA) and leave us feedback, or [open a pull request](https://github.com/MarquezProject/marquez/blob/main/CONTRIBUTING.md#submitting-a-pull-request) with your suggestions!
+What did you think of this guide? You can reach out to us on [slack](https://join.slack.com/t/marquezproject/shared_invite/zt-29w4n8y45-Re3B1KTlZU5wO6X6JRzGmA) and leave us feedback, or [open a pull request](https://github.com/MarquezProject/marquez/blob/main/CONTRIBUTING.md#submitting-a-pull-request) with your suggestions!
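
The Windows steps in this change ask readers to verify that Bash and PostgreSQL are on the `PATH` before running `sh ./docker/up.sh`. A minimal sketch of such a check follows. The `check_tools` helper is hypothetical (it is not part of the Marquez repository), and the usage line assumes the PostgreSQL client is exposed as `psql`:

```bash
# Hypothetical pre-flight helper; not part of the Marquez repository.
# Prints "found: <tool>" or "missing: <tool>" for each argument and
# returns non-zero if at least one tool cannot be resolved on the PATH.
check_tools() {
  local missing=0
  local tool
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Usage (assumes the PostgreSQL client binary is named `psql`):
#   check_tools bash psql && sh ./docker/up.sh
```

Chaining the check with `&&` means the services are only started when every listed tool resolves, which may surface a missing `PATH` entry earlier than a failure inside the startup script would.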