Implement caching workflow #325

PhilippeMoussalli · 2023-07-31T09:23:32Z

Related to #317
First Review PRs #318 #320
Design discussions #292

PR that implements the general component caching mechanism described in

A few notes:

There is no need to specify a previous run ID when caching, since we check for existing executions with a matching definition based on the manifest and the cache key.
Created a Metadata class, since the manifest metadata had a different schema in the local and remote runners.
If a component is cached, it skips execution entirely and just fetches the existing output manifest (based on the matching cache key). For Kubeflow, we still have to write that manifest as an output artifact that will be read by the next component (artifact), but we don't write anything to the base path (custom_artifact).
Currently implemented only for the remote runner (kfp) and enabled by default; it can be disabled by setting `Pipeline(cache_disabled=True)`. Disabled for the local runner since there is no straightforward method of estimating digests in that workflow (see Add method to estimate caching key #318).
If one component is not cached, then all subsequent components must be re-executed.
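The lookup and the "one miss invalidates everything downstream" rule described in the notes can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `ComponentDef`, `cache_key`, and `plan_execution` are hypothetical names, and the assumption that the key is a hash of the component name, image digest, and arguments is mine.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ComponentDef:
    # Hypothetical stand-in for a component definition (not the project's class).
    name: str
    image_digest: str
    arguments: Dict[str, object] = field(default_factory=dict)


def cache_key(component: ComponentDef) -> str:
    # Assumption: the key hashes everything that can change the component's output.
    payload = json.dumps(
        {
            "name": component.name,
            "image_digest": component.image_digest,
            "arguments": component.arguments,
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def plan_execution(
    components: List[ComponentDef], cached_manifests: Dict[str, str]
) -> List[Tuple[str, Optional[str]]]:
    # For each component, in pipeline order, return (name, manifest path or None).
    # None means "execute". After the first miss, all later components execute too.
    plan = []
    upstream_miss = False
    for component in components:
        manifest = None if upstream_miss else cached_manifests.get(cache_key(component))
        if manifest is None:
            upstream_miss = True
        plan.append((component.name, manifest))
    return plan


load = ComponentDef("load", "sha256:aaa", {"n_rows": 10})
filt = ComponentDef("filter", "sha256:bbb", {"threshold": 0.5})
embed = ComponentDef("embed", "sha256:ccc")

# Only "load" and "embed" have matching manifests from earlier runs.
cached = {
    cache_key(load): "manifests/load.json",
    cache_key(embed): "manifests/embed.json",
}
plan = plan_execution([load, filt, embed], cached)
```

Here `embed` is re-executed even though a manifest with a matching key exists, because `filter` upstream of it missed the cache.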

Things to improve in later PRs:

Decide whether we should estimate the cache key during compilation or at execution time (right now it's done at compilation time to enable defining new arguments, but we might miss the fact that some image digests have been updated):

A) Compilation

Right now, the component has to run and produce the output manifest, even though we already know beforehand whether it should be executed. Ideally we would omit execution of the component altogether to make pipelines run faster and avoid the waiting time for pod initialization. This can be done by either:
- Implementing a conditional statement in KFP (link).
- Assigning the default node pool to the cached component if a different node pool was specified (this avoids initializing expensive nodes, since the component only has to write the manifest).

B) Runtime

Find a method to estimate the digest from within the component itself (see discussion at Add method to estimate caching key #318). This solution will require all components to be run, since each component has to decide at runtime whether it can be cached.
Alternatively, we can have one custom component at the beginning of each pipeline run (an initializer component) that decides which components should be run. The advantage here is that we only need to include the packages required for estimating digests in one Docker image rather than in all of them. See this for more details on how image digests can be retrieved.
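The first runtime variant, where each component decides for itself whether it can be cached, might look like the sketch below. It is only an illustration under assumptions: `run_or_reuse` is a hypothetical helper, and a plain dictionary stands in for wherever manifests are actually stored.

```python
from typing import Callable, Dict, Tuple

Manifest = Dict[str, str]


def run_or_reuse(
    key: str,
    store: Dict[str, Manifest],
    execute: Callable[[], Manifest],
) -> Tuple[Manifest, bool]:
    # Return (manifest, was_cached). The component always starts, but only
    # calls execute() when no manifest is stored under its cache key.
    if key in store:
        return store[key], True
    manifest = execute()
    store[key] = manifest
    return manifest, False


calls = []


def expensive_step() -> Manifest:
    # Placeholder for the component's real work.
    calls.append("ran")
    return {"location": "base_path/component/manifest.json"}


store: Dict[str, Manifest] = {}
first, first_cached = run_or_reuse("key-1", store, expensive_step)
second, second_cached = run_or_reuse("key-1", store, expensive_step)
```

The pod still has to start in both calls, which is the cost this option pays compared with skipping the component at compilation time.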

Implement the caching for the local runner


PR that creates a metadata class, this will make it easier to implement #368 (was originally part of #325 but decided to break it down to make it easier to review).

Few other notable changes:
- The `run_id` between both runners has now an identical format (name_timestamp), we no longer need the uid of kfp since it's just used to store the native output artifacts
- The `safe_component_name` has been moved from the local runner to the component spec to avoid having to plug it everywhere

---------

Co-authored-by: Georges Lorré <35808396+GeorgesLorre@users.noreply.github.com>
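As a rough illustration of the two changes mentioned above, the helpers below build a `run_id` in a name_timestamp shape and normalize a component name. Both bodies are assumptions for illustration: the timestamp layout and the normalization rule (lowercase, spaces to underscores) are not confirmed by this PR, and `make_run_id` is a hypothetical name.

```python
from datetime import datetime


def safe_component_name(name: str) -> str:
    # Assumed rule: lowercase the name and replace spaces with underscores.
    return name.replace(" ", "_").lower()


def make_run_id(pipeline_name: str, now: datetime) -> str:
    # Assumed layout: <pipeline name>_<YYYYMMDDHHMMSS>.
    return f"{pipeline_name}_{now.strftime('%Y%m%d%H%M%S')}"


run_id = make_run_id("my_pipeline", datetime(2023, 8, 18, 10, 15, 0))
component = safe_component_name("Load From Hub")
```

Because the identifier depends only on the pipeline name and a timestamp, the local and remote runners can produce it the same way without relying on a runner-specific uid.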

PhilippeMoussalli · 2023-08-25T09:31:23Z

Closed in favor of #387


PhilippeMoussalli added 13 commits July 26, 2023 10:49

add method to estimate caching key

1f27bcd

add argument to skip component run

b085166

Merge branch 'main' into skip-component-run-argument

5b7b85f

Merge branch 'main' into skip-component-run-argument

671ee31

add argument to skip component run

52865db

add method to estimate caching key

de2624a

Align metadata

bdbcd5c

add change to source code

8451f95

modify and add tests

af4b9f2

modify fsspec

ffb3c31

move safe component name to pipeline level

779a6b9

modify test path

b32c49e

add tests for manifest save file

6ed124d

PhilippeMoussalli force-pushed the implement-caching-workflow branch from 8d69fa3 to 6ed124d Compare July 31, 2023 12:31

PhilippeMoussalli added 3 commits July 31, 2023 15:58

add tests for parsing manifest response

e68fc27

bugfix

7ed0e7e

Merge branch 'skip-component-run-argument' into implement-caching-workflow

78c6574

PhilippeMoussalli force-pushed the implement-caching-workflow branch from 718b7d6 to 78c6574 Compare July 31, 2023 15:18

Change manifest checking method

e514e15

PhilippeMoussalli force-pushed the implement-caching-workflow branch from 99ab8cb to a846678 Compare August 1, 2023 07:26

Omit writing output manifest on cached components

48dba2d

PhilippeMoussalli force-pushed the implement-caching-workflow branch from a846678 to 48dba2d Compare August 1, 2023 07:58

PhilippeMoussalli requested a review from GeorgesLorre August 1, 2023 08:26

PhilippeMoussalli self-assigned this Aug 1, 2023

PhilippeMoussalli linked an issue Aug 1, 2023 that may be closed by this pull request

Integrate cache key with remote runner workflow #317

Closed

Merge branch 'main' into implement-caching-workflow

613fa3e

PhilippeMoussalli mentioned this pull request Aug 17, 2023

Redesign base path file structure for caching and data exploration #368

Closed

PhilippeMoussalli added 2 commits August 18, 2023 10:15

Merge branch 'main' into estimate-cache-key

3a06712

Merge branch 'estimate-cache-key' into skip-component-run-argument

21fa864

PhilippeMoussalli changed the base branch from main to skip-component-run-argument August 18, 2023 08:56

Merge branch 'skip-component-run-argument' into implement-caching-workflow

03cb144

PhilippeMoussalli force-pushed the implement-caching-workflow branch from abb1ec2 to 03cb144 Compare August 18, 2023 11:40

PhilippeMoussalli mentioned this pull request Aug 21, 2023

Create separate class for metadata #372

Merged

Base automatically changed from skip-component-run-argument to main August 24, 2023 08:05

PhilippeMoussalli closed this Aug 25, 2023

RobbeSneyders deleted the implement-caching-workflow branch January 11, 2024 09:08
