Use placeholder runs to show pipeline runs in the dashboard without delay #2048

schustmi · 2023-11-14T16:48:48Z

Describe changes

This PR allows us to create a PipelineRun in the database before actually executing the pipeline using the orchestrator.
With this in place, we can now return a reference to this pipeline run when someone is running a pipeline, and can also show the pipeline run in the dashboard immediately.
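A minimal sketch of the idea, purely for illustration: the run record is written first, the work is handed to the orchestrator afterwards, and the placeholder is removed again if the hand-over fails. `RunStore`, `run_with_placeholder`, the `submit` callback, and the `"initializing"` status string are hypothetical names for this example and not ZenML's actual API.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class PipelineRun:
    """Hypothetical stand-in for a row of the pipeline_run table."""

    deployment_id: str
    status: str = "initializing"
    orchestrator_run_id: Optional[str] = None
    id: str = field(default_factory=lambda: uuid.uuid4().hex)


class RunStore:
    """Hypothetical in-memory replacement for the database."""

    def __init__(self) -> None:
        self.runs: Dict[str, PipelineRun] = {}

    def create_placeholder(self, deployment_id: str) -> PipelineRun:
        run = PipelineRun(deployment_id=deployment_id)
        self.runs[run.id] = run
        return run

    def delete(self, run_id: str) -> None:
        self.runs.pop(run_id, None)


def run_with_placeholder(
    store: RunStore, deployment_id: str, submit: Callable[[str], None]
) -> PipelineRun:
    """Create the run record first, then hand over to the orchestrator."""
    run = store.create_placeholder(deployment_id)  # visible right away
    try:
        submit(deployment_id)
    except Exception:
        # Remove the placeholder again if the hand-over itself fails.
        store.delete(run.id)
        raise
    return run
```

The caller gets a run reference back as soon as the submission succeeds, which is what lets the dashboard show the run before the first step starts.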

Notes regarding the migration

This PR adds a unique constraint on the combination of the deployment_id and orchestrator_run_id columns of the pipeline_run table. These columns were introduced (and have been set to non-null values since) in the following releases:

release 0.21.0 for the orchestrator_run_id
release 0.34.0 for the deployment_id

For this unique constraint to work, we have to consider these scenarios:

pipeline runs that happened before release 0.21.0: For these, both columns are NULL. We solve this by writing a unique dummy value into the orchestrator_run_id column.
pipeline runs that happened between releases 0.21.0 and 0.34.0: In this case, only the orchestrator_run_id is set. This is only a problem if we assume people run with custom orchestrators that do not generate a globally unique orchestrator_run_id.
pipeline runs that happened after release 0.34.0: For these, both deployment_id and orchestrator_run_id are set and the combination of the two is unique; otherwise it would have failed earlier when trying to run those pipelines.
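The first scenario can be sketched as follows. This is an illustration using the standard-library `sqlite3` module on a reduced table, not the Alembic migration from this PR; the index name, the `dummy_` prefix, and the function name are assumptions made for the example.

```python
import sqlite3
import uuid


def backfill_and_constrain(conn: sqlite3.Connection) -> int:
    """Give legacy rows a unique dummy ID, then add the unique index.

    Returns the number of rows that were backfilled.
    """
    rows = conn.execute(
        "SELECT id FROM pipeline_run WHERE orchestrator_run_id IS NULL"
    ).fetchall()
    for (run_id,) in rows:
        conn.execute(
            "UPDATE pipeline_run SET orchestrator_run_id = ? WHERE id = ?",
            (f"dummy_{uuid.uuid4().hex}", run_id),
        )
    # Hypothetical index name, chosen for this sketch only.
    conn.execute(
        "CREATE UNIQUE INDEX unique_orchestrator_run_id_for_deployment_id "
        "ON pipeline_run (deployment_id, orchestrator_run_id)"
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pipeline_run ("
    "id INTEGER PRIMARY KEY, deployment_id TEXT, orchestrator_run_id TEXT)"
)
conn.executemany(
    "INSERT INTO pipeline_run (deployment_id, orchestrator_run_id) "
    "VALUES (?, ?)",
    [(None, None), (None, None), (None, "orch-1"), ("dep-1", "orch-2")],
)
backfilled = backfill_and_constrain(conn)
```

Here the two rows without any ID stand in for runs from before 0.21.0; after the backfill, inserting a second row with the same (deployment_id, orchestrator_run_id) pair is rejected by the index.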

What the migration currently does not account for:

If users manually modified their database to set/delete values of the orchestrator_run_id column.
If users deleted deployments in their database, which sets the deployment_id column to NULL. As above, this is only a problem if we assume people run with custom orchestrators that do not generate a globally unique orchestrator_run_id.
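For the cases the migration does not cover, one option would be to look for repeated pairs before upgrading. The sketch below is an assumption about how such a check could look, again using `sqlite3` and a reduced table; `find_conflicting_runs` is a hypothetical helper and not part of this PR. Note that `GROUP BY` puts all NULL deployment IDs into one group, so the check is deliberately conservative; whether the database itself treats such rows as violating a unique constraint depends on how the backend handles NULLs.

```python
import sqlite3
from typing import List, Optional, Tuple


def find_conflicting_runs(
    conn: sqlite3.Connection,
) -> List[Tuple[Optional[str], str, int]]:
    """Return (deployment_id, orchestrator_run_id, count) for repeated pairs."""
    return conn.execute(
        "SELECT deployment_id, orchestrator_run_id, COUNT(*) "
        "FROM pipeline_run "
        "WHERE orchestrator_run_id IS NOT NULL "
        "GROUP BY deployment_id, orchestrator_run_id "
        "HAVING COUNT(*) > 1 "
        "ORDER BY orchestrator_run_id"
    ).fetchall()


check_conn = sqlite3.connect(":memory:")
check_conn.execute(
    "CREATE TABLE pipeline_run ("
    "id INTEGER PRIMARY KEY, deployment_id TEXT, orchestrator_run_id TEXT)"
)
check_conn.executemany(
    "INSERT INTO pipeline_run (deployment_id, orchestrator_run_id) "
    "VALUES (?, ?)",
    [
        (None, "orch-1"),  # deployment deleted, ID reused by the orchestrator
        (None, "orch-1"),
        ("dep-1", "orch-2"),  # same ID, different deployments: no conflict
        ("dep-2", "orch-2"),
    ],
)
conflicts = find_conflicting_runs(check_conn)
```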

TODO

Add a new icon for the initializing state of a pipeline run in the dashboard.
- This currently shows the same icon that is displayed when the pipeline is running. IMO, we should still update it but it's not blocking any release. (-> https://zenml.atlassian.net/browse/PROD-160)
~~Make sure the dashboard can handle an empty orchestrator_environment in a PipelineRun response. It is optional on the model but might not be handled correctly in the dashboard.~~
- This works and just displays nothing.

Pre-requisites

Please ensure you have done the following:

I have read the CONTRIBUTING.md document.
If my change requires a change to docs, I have updated the documentation accordingly.
If I have added an integration, I have updated the integrations table and the corresponding website section.
I have added tests to cover my changes.
I have based my new branch on develop and the open PR is targeting develop. If your branch wasn't based on develop read Contribution guide on rebasing branch to develop.

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Other (add details above)

schustmi · 2023-11-14T16:50:11Z

@fa9r Very early/rough draft of an idea I had to show pipeline runs in the dashboard immediately. Let me know if you see any immediate concerns with this, especially when it comes to concurrency. The with_for_update() clause on the select statement IMO should cover this, but maybe I'm missing something.
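To make the concurrency concern concrete: in SQLAlchemy, `with_for_update()` on a select statement emits `SELECT ... FOR UPDATE` on backends that support it, so the selected rows stay locked until the transaction ends. The sketch below shows the same select-then-update pattern with the standard-library `sqlite3` module. SQLite has no `FOR UPDATE`, so `BEGIN IMMEDIATE` is swapped in to take the write lock up front. The table layout, the status strings, and `claim_placeholder` are assumptions for this example, not the code from this PR.

```python
import sqlite3
from typing import Optional


def claim_placeholder(
    conn: sqlite3.Connection, deployment_id: str, orchestrator_run_id: str
) -> Optional[int]:
    """Turn one placeholder run of the deployment into the real run.

    Returns the claimed run ID, or None if no placeholder was left.
    """
    # Take the write lock before reading, so that the select and the
    # update cannot interleave with another writer.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id FROM pipeline_run "
            "WHERE deployment_id = ? AND orchestrator_run_id IS NULL "
            "ORDER BY id LIMIT 1",
            (deployment_id,),
        ).fetchone()
        if row is None:
            conn.execute("COMMIT")
            return None
        conn.execute(
            "UPDATE pipeline_run "
            "SET orchestrator_run_id = ?, status = 'running' WHERE id = ?",
            (orchestrator_run_id, row[0]),
        )
        conn.execute("COMMIT")
        return row[0]
    except Exception:
        conn.execute("ROLLBACK")
        raise


# isolation_level=None disables implicit transactions, so BEGIN is explicit.
lock_conn = sqlite3.connect(":memory:", isolation_level=None)
lock_conn.execute(
    "CREATE TABLE pipeline_run (id INTEGER PRIMARY KEY, deployment_id TEXT, "
    "orchestrator_run_id TEXT, status TEXT)"
)
lock_conn.execute(
    "INSERT INTO pipeline_run (deployment_id, status) "
    "VALUES ('dep-1', 'initializing')"
)
first = claim_placeholder(lock_conn, "dep-1", "orch-1")
second = claim_placeholder(lock_conn, "dep-1", "orch-1")
```

The second call finds no placeholder left, which is the situation where a step would instead look up the existing run by its orchestrator run ID.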

src/zenml/models/pipeline_run_models.py

fa9r

Fundamentally I don't see a reason why this shouldn't work, but let me summarize to check whether I understand it right:

If we run a pipeline with a schedule, we'll do the same as before: create a schedule in the DB, hand over to the orchestrator, create the run when the orchestrator starts the first step and link it to the schedule, then link the other steps to the same run by searching for runs by orchestrator ID.
If we run a pipeline without a schedule, we directly write a run to the DB with an empty orchestrator ID, hand over to the orchestrator, set the orchestrator ID when the orchestrator starts the first step, then link the other steps to the same run by searching for the same orchestrator ID.

Logically this makes sense to me and I think it should work 👍

The only possible failure case I can see is if we try to create a placeholder run for a deployment that already has a placeholder run. But I'm not sure whether that scenario can be reached or not, @schustmi you should know this best 😁

src/zenml/models/pipeline_run_models.py

src/zenml/new/pipelines/pipeline.py

src/zenml/zen_stores/schemas/pipeline_run_schemas.py

src/zenml/zen_stores/sql_zen_store.py

src/zenml/zen_stores/schemas/pipeline_run_schemas.py

github-actions · 2023-12-12T10:54:35Z

E2E template updates in examples/e2e have been pushed.

coderabbitai · 2023-12-20T08:47:19Z

Important

Auto Review Skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository.

To trigger a single review, invoke the @coderabbitai review command.

schustmi added 4 commits November 14, 2023 11:28

POC

a6a3b4d

Maybe solve concurrency issue

a3de167

Readd log

0a170eb

Some comments and small fixes

3a303a0

schustmi requested a review from fa9r November 14, 2023 16:48

github-actions bot added the internal label (To filter out internal PRs and issues) Nov 14, 2023

htahir1 reviewed Nov 14, 2023

src/zenml/models/pipeline_run_models.py (outdated)

schustmi added 3 commits November 15, 2023 10:32

Small cleanup

80f6bb4

Docstrings

34e014c

rename constraint

8516b7c

fa9r reviewed Nov 15, 2023

schustmi added 8 commits November 15, 2023 13:28

Add new execution status

9e2f635

Finish docstring sentence

b18d64e

Delete run if failed during initialization

1373822

Remove wait method

073d1fb

cleanup

dbfa695

Replace emoji

e154fac

Prevent empty run names

d1b9101

Fix return annotation

2e0f005

schustmi changed the title ~~Use placeholder pipeline run to show it in the dashboard immediately~~ Use placeholder runs to show pipeline runs in the dashboard immediately Nov 15, 2023

schustmi changed the title ~~Use placeholder runs to show pipeline runs in the dashboard immediately~~ Use placeholder runs to show pipeline runs in the dashboard without delay Nov 15, 2023

schustmi added 8 commits November 15, 2023 15:28

Remove unique constraint

fb719cb

Merge branch 'develop' into placeholder-pipeline-run-poc

038f587

Add index for pipeline run table

4514403

Better explanation

7408fec

Switch back to unique constraint

9e070cd

Manual migration to add dummy orchestrator run id for old runs

def7193

Remove useless check that is handled by unique constraint

6ce39e9

Add additional check to make sure we're only replacing placeholder runs

5cfe74f

Fix alembic order

b09053a

schustmi force-pushed the placeholder-pipeline-run-poc branch 4 times, most recently from a7aed3b to b09053a Compare November 25, 2023 16:19

schustmi added 2 commits November 28, 2023 17:08

Merge branch 'develop' into placeholder-pipeline-run-poc

cebff65

Fix alembic order

e4eda80

schustmi force-pushed the placeholder-pipeline-run-poc branch from dfe971e to e4eda80 Compare November 28, 2023 16:10

schustmi and others added 11 commits November 29, 2023 15:26

Fix ruff ignore

071fd50

Fix typo

25324aa

Merge branch 'develop' into placeholder-pipeline-run-poc

6aab27b

Merge branch 'develop' into placeholder-pipeline-run-poc

5f36fe6

Import cleanup

c00bf82

Merge branch 'develop' into placeholder-pipeline-run-poc

15cff29

Merge branch 'develop' into placeholder-pipeline-run-poc

df337de

Formatting

c9641eb

Fix alembic order

8e1b0c3

Merge branch 'develop' into placeholder-pipeline-run-poc

cc0f184

Auto-update of E2E template

73e6035

Merge branch 'develop' into placeholder-pipeline-run-poc

8005560

schustmi and others added 6 commits January 2, 2024 09:32

Merge branch 'develop' into placeholder-pipeline-run-poc

347f32c

Fix alembic order

a1961bd

Auto-update of Starter template

0e4f076

Auto-update of E2E template

f2f8fcb

Auto-update of NLP template

d3d72c4

Merge branch 'develop' into placeholder-pipeline-run-poc

da1065c

schustmi merged commit 79d967e into develop Jan 3, 2024
31 of 33 checks passed

schustmi deleted the placeholder-pipeline-run-poc branch January 3, 2024 08:44
