Allow for customization of pipeline templates #2701

Merged
merged 71 commits into from
Sep 12, 2023
71 commits
d56a396
Remove non-user-facing release notes about pip pin (#2854)
deepyaman Jul 28, 2023
d87bd72
Allow for customization of pipeline templates
jasonmhite Jul 28, 2023
20f7812
Update lookup based on feedback
jasonmhite Jul 28, 2023
c27eb11
Add CLI flag for pipeline creation template -t/--template
jasonmhite Jul 28, 2023
b3bdb46
First basic unit test
jasonmhite Jul 28, 2023
1ddb98d
Add test to check overriding template path on the CLI
jasonmhite Jul 28, 2023
9ec4a53
Linter formatting fixes
jasonmhite Jul 28, 2023
b43dbc2
Release 0.18.12 (#2871)
SajidAlamQB Aug 1, 2023
51a022e
Fix broken link to Prefect website in deploy guide (#2885)
deepyaman Aug 2, 2023
141d502
Clarify the <micropkg_name> argument to kedro micropkg package (#2835)
JonathanDCohen Aug 2, 2023
c373b67
Allow registering of custom resolvers to `OmegaConfigLoader` (#2869)
ankatiyar Aug 2, 2023
d20af76
Document the use of custom resolvers with `OmegaConfigLoader` (#2896)
ankatiyar Aug 7, 2023
307c186
Update kedro pipeline create to use new /conf file structure (#2856)
DimedS Aug 7, 2023
81dd36f
Update kedro catalog create to use new /conf structure (#2884)
DimedS Aug 8, 2023
5d11f7f
Add migration steps for `ConfigLoader` to `OmegaConfigLoader` (#2887)
merelcht Aug 8, 2023
65c7f33
Fix #2498 Adding logging issue 2498 (#2842)
laizaparizotto Aug 8, 2023
480b320
Try only trigger docs build when release notes updated (#2907)
merelcht Aug 8, 2023
51a0745
Add Python 3.11 support to Kedro (#2851)
SajidAlamQB Aug 8, 2023
80516dd
Revise FAQs and README (#2909)
stichbury Aug 8, 2023
b4468de
Update Generator example (#2858)
noklam Aug 8, 2023
c4b0256
update docs to reflect change to /conf file structure (#2913)
DimedS Aug 9, 2023
f753762
Change CONTRIBUTING.md file based on PR #2894 (#2914)
lrcouto Aug 9, 2023
35335cb
Move contribution guidelines from CONTRIBUTING.md to the Wiki (#2894)
lrcouto Aug 10, 2023
d6e8454
Remove redundant pages and direct users to wiki (#2917)
lrcouto Aug 14, 2023
75a78a2
Deprecate abstract "DataSet" in favor of "Dataset" (#2746)
deepyaman Aug 14, 2023
9bc7a46
Reorganise and improve the data catalog documentation (#2888)
stichbury Aug 18, 2023
9e2095b
Add line about Viz to PR template (#2929)
tynandebold Aug 18, 2023
1e55b01
Add architecture graphic back to docs with revisions (#2916)
stichbury Aug 18, 2023
9c7121b
Add kedro catalog resolve command (#2891)
AhdraMeraliQB Aug 18, 2023
ffb46b5
Replace "DataSet" with "Dataset" in Markdown files (#2735)
deepyaman Aug 18, 2023
0111685
Update h1,h2,h3 font sizes in the docs pages (#2938)
tynandebold Aug 18, 2023
53cc179
Automatically trigger `kedro-starters` release on the release of `ked…
ankatiyar Aug 18, 2023
4559ddf
Create issues_metrics.yml (#2814)
noklam Aug 18, 2023
6b12c69
Clean up setuptools and wheel requirements to align with PEP-518 (#2927)
lrcouto Aug 18, 2023
fa97e35
Clean up `kedro pipeline create` outdated docs (#2945)
noklam Aug 21, 2023
786c7d6
Add globals feature for `OmegaConfigLoader` using a globals resolver …
ankatiyar Aug 21, 2023
87b568f
Consolidate two `ruff-pre-commit` entries into one (#2881)
deepyaman Aug 22, 2023
2676c8c
Fix typos across the documentation (#2956)
ankatiyar Aug 22, 2023
53cc5d7
Setup Vale linter as a GHA workflow (#2953)
ankatiyar Aug 22, 2023
476a183
Fix README to show graphics on PyPI (#2961)
stichbury Aug 23, 2023
9426a1b
Add some Vale styles (#2963)
stichbury Aug 23, 2023
230751b
Minor changes to test + release notes + docs
ankatiyar Aug 24, 2023
4c79063
Update anyconfig requirement from ~=0.10.0 to >=0.10,<0.14 (#2876)
dependabot[bot] Aug 24, 2023
55bfbe7
Move default template to static `pyproject.toml`, take 2 (#2853)
astrojuanlu Aug 25, 2023
500a10c
Add deprecation warnings to top-level use of layer in catalog definit…
SajidAlamQB Aug 25, 2023
5a310ae
Update on credentials.md (#2787)
jmnunezd Aug 26, 2023
6e2f457
Cap pluggy 1.3 release (#2981)
DimedS Aug 29, 2023
ed5ed36
Make vale linter only run when PR opened or reopened (#2982)
ankatiyar Aug 29, 2023
53b152a
Update merge-gatekeeper.yml (#2960)
noklam Aug 29, 2023
c9841e2
Configure starters to use OmegaConfigLoader (#2974)
lrcouto Aug 29, 2023
ca5108b
Stop OmegaConfigLoader from reading config from hidden directory like…
noklam Aug 30, 2023
71c1118
Add migration steps for `TemplatedConfigLoader` to `OmegaConfigLoader…
merelcht Aug 30, 2023
9896691
Introduce a sentinel value _NO_VALUE to improve Global resolvers to s…
ankatiyar Aug 30, 2023
8c6c371
Release 0.18.13 (#2988)
ankatiyar Aug 31, 2023
cb9a0cd
Fix docstrings on kedro/extras/datasets (#2995)
lrcouto Sep 1, 2023
ee2f5a2
Minor docs changes on data section to create a PR and test Vale style…
stichbury Sep 1, 2023
be0c34a
PiP/pyproject.toml and Conda/Meta.yaml Sync (#2922)
rxm7706 Sep 1, 2023
7ebbc12
fix typo for build (#3001)
rxm7706 Sep 4, 2023
f9485f0
Add hook example to access `metadata` (#2998)
noklam Sep 5, 2023
2578560
Bump release notes version
jasonmhite Sep 6, 2023
0d77938
Expand docs for customized pipeline templates.
jasonmhite Sep 6, 2023
c12a5de
Update release notes and docs
ankatiyar Sep 7, 2023
c710d38
Update language linter to also run when PR converted ready for review
ankatiyar Sep 7, 2023
8b7cb9a
Fix ci - Use `--resolver=backtracking` with `kedro build-reqs` in e2e…
ankatiyar Sep 6, 2023
9620f80
Update docs
ankatiyar Sep 7, 2023
88971f7
Update style to catch some more US spellings
stichbury Sep 8, 2023
94c769f
Make dataset factory resolve nested dict properly (#2993)
ankatiyar Sep 7, 2023
c6786be
Apply suggestions from code review
jasonmhite Sep 8, 2023
3742bed
Resolve suggestions from code review
jasonmhite Sep 8, 2023
9bd9212
Merge branch 'main' into feature/pipeline-templates
ankatiyar Sep 12, 2023
30e838b
Revert changes ukspelling.yml
ankatiyar Sep 12, 2023
7 changes: 7 additions & 0 deletions RELEASE.md
Expand Up @@ -10,12 +10,19 @@
# Upcoming Release 0.18.14

## Major features and improvements
* Allowed the use of custom Cookiecutter templates for creating pipelines, either with the `--template` flag for `kedro pipeline create` or via a `templates/pipeline` folder in the project.

## Bug fixes and other changes
* Updated dataset factories to resolve nested catalog config properly.

## Documentation changes
## Breaking changes to the API
## Upcoming deprecations for Kedro 0.19.0
## Community contributions
Many thanks to the following Kedroids for contributing PRs to this release:

* [Jason Hite](https://github.com/jasonmhite)


# Release 0.18.13

Expand Down
31 changes: 30 additions & 1 deletion docs/source/nodes_and_pipelines/modular_pipelines.md
Expand Up @@ -59,7 +59,6 @@ Running the `kedro pipeline create` command adds boilerplate folders and files f
│ └── pipelines
│ ├── __init__.py
│ └── {{pipeline_name}} <-- This folder defines the modular pipeline
│ ├── README.md <-- Pipeline-specific documentation
│ ├── __init__.py <-- So that Python treats this pipeline as a module
│ ├── nodes.py <-- To declare your nodes
│ └── pipeline.py <-- To structure the pipeline itself
Expand All @@ -77,6 +76,36 @@ Running the `kedro pipeline create` command adds boilerplate folders and files f

If you want to do the reverse and remove a modular pipeline, you can use ``kedro pipeline delete <pipeline_name>`` to do so.

### Custom templates

If you want to generate a pipeline with a custom Cookiecutter template, you can save it in `<project_root>/templates/pipeline`.
The `kedro pipeline create` command will pick up the custom template in your project as the default. You can also specify the path to your custom
Cookiecutter pipeline template with the `--template` flag like this:
```bash
kedro pipeline create <pipeline_name> --template <path_to_template>
```
A template folder passed to `kedro pipeline create` using the `--template` argument will take precedence over any local templates.
Kedro supports having a single pipeline template in your project. If you need to have multiple pipeline templates, consider saving them in a
separate folder and pointing to them with the `--template` flag.
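
The lookup order described above can be sketched as a small standalone function. This is an illustrative sketch only, not Kedro's implementation: the function name `resolve_template_path` and its parameters are hypothetical, and it assumes the order is "path given on the command line, then `<project_root>/templates/pipeline`, then the built-in default".

```python
from pathlib import Path
from typing import Optional


def resolve_template_path(
    cli_template: Optional[Path],
    project_path: Path,
    default_template: Path,
) -> Path:
    """Pick a template directory: CLI path > project template > built-in default.

    Hypothetical helper for illustration; the names are not part of Kedro's API.
    """
    # A path given explicitly on the command line always wins.
    if cli_template is not None:
        return cli_template

    # Otherwise prefer a template stored inside the project, if one exists.
    project_template = project_path / "templates" / "pipeline"
    if project_template.exists():
        return project_template

    # Fall back to the template shipped with the framework.
    return default_template
```

Because a project holds only one template at `templates/pipeline`, any additional templates have to be passed explicitly, which corresponds to the first branch.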

#### Creating custom templates

It is your responsibility to create functional Cookiecutter templates for custom modular pipelines. Please ensure you understand the
basic structure of a modular pipeline. Your template should render to a valid, importable Python module containing a
`create_pipeline` function at the top level that returns a `Pipeline` object. You will also need appropriate
`config` and `tests` subdirectories, which are copied to the project's `conf` and `tests` directories when the pipeline is created.
The `config` and `tests` subdirectories must follow the same layout as in the default template and cannot
be customised, although the contents of the parameters file and the test file can be changed. Beyond that, file and folder names and structure
do not matter and can be customised according to your needs. You can use [the
default template that Kedro uses](https://github.com/kedro-org/kedro/tree/main/kedro/templates/pipeline) as a starting point.
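
As a rough illustration of the requirements above, the following sketch writes a bare-bones template to disk. The helper name `scaffold_pipeline_template` is hypothetical, and the `config` and `tests` layout simply mirrors the test fixture added in this PR; check the default template for the layout your Kedro version expects before relying on it.

```python
import json
from pathlib import Path

# Minimal module body: a top-level create_pipeline() returning a Pipeline.
PIPELINE_PY = '''\
from kedro.pipeline import Pipeline, pipeline


def create_pipeline(**kwargs) -> Pipeline:
    return pipeline([])
'''


def scaffold_pipeline_template(project_root: Path) -> Path:
    """Write a minimal pipeline template under <project_root>/templates/pipeline.

    Hypothetical helper for illustration; not part of Kedro.
    """
    template_root = project_root / "templates" / "pipeline"
    # Cookiecutter substitutes the pipeline name into this directory name.
    rendered_dir = template_root / "{{ cookiecutter.pipeline_name }}"

    files = {
        template_root / "cookiecutter.json": json.dumps(
            {"pipeline_name": "default"}, indent=4
        ),
        rendered_dir / "__init__.py": "from .pipeline import create_pipeline\n",
        rendered_dir / "pipeline.py": PIPELINE_PY,
        # Layout assumed from the PR's test fixture, not from the default template.
        rendered_dir / "config" / "parameters" / "{{ cookiecutter.pipeline_name }}.yml": "",
        rendered_dir / "tests" / "test_{{ cookiecutter.pipeline_name }}.py": "",
    }
    for path, content in files.items():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content)
    return template_root
```

Calling `scaffold_pipeline_template(Path("."))` from a project root would create `templates/pipeline`, which `kedro pipeline create` should then pick up as the project default.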

Pipeline templates are rendered using [Cookiecutter](https://cookiecutter.readthedocs.io/), and must also contain a `cookiecutter.json` file.
See the [`cookiecutter.json` file in the Kedro default template](https://github.com/kedro-org/kedro/tree/main/kedro/templates/pipeline/cookiecutter.json) for an example.
If you are embedding your custom pipeline template within a
Kedro starter template, you must tell Cookiecutter not to render the pipeline template when creating a new project from the starter. To do this,
add [`_copy_without_render: ["templates"]`](https://cookiecutter.readthedocs.io/en/latest/advanced/copy_without_render.html) to the `cookiecutter.json` file for the starter,
not to the `cookiecutter.json` for the pipeline template.
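
A small sketch of that last step, assuming the starter's `cookiecutter.json` is plain JSON on disk. The function name `exclude_templates_from_render` is hypothetical, and the `"templates"` pattern is taken directly from the text above.

```python
import json
from pathlib import Path


def exclude_templates_from_render(starter_cookiecutter_json: Path) -> dict:
    """Add "templates" to `_copy_without_render` in a starter's cookiecutter.json.

    Hypothetical helper for illustration; edits the starter's file in place.
    """
    config = json.loads(starter_cookiecutter_json.read_text())

    # Keep any patterns already listed and avoid adding a duplicate entry.
    patterns = config.setdefault("_copy_without_render", [])
    if "templates" not in patterns:
        patterns.append("templates")

    starter_cookiecutter_json.write_text(json.dumps(config, indent=4))
    return config
```

The pipeline template's own `cookiecutter.json` is left untouched, since it still needs to be rendered when `kedro pipeline create` runs.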

### Ensuring portability

Modular pipelines are shareable between Kedro codebases via [micro-packaging](micro_packaging.md), but you must follow a couple of rules to ensure portability:
Expand Down
26 changes: 22 additions & 4 deletions kedro/framework/cli/pipeline.py
Expand Up @@ -90,10 +90,17 @@ def pipeline():
is_flag=True,
help="Skip creation of config files for the new pipeline(s).",
)
@click.option(
"template_path",
"-t",
"--template",
type=click.Path(file_okay=False, dir_okay=True, exists=True, path_type=Path),
help="Path to cookiecutter template to use for pipeline(s). Will override any local templates.",
)
@env_option(help="Environment to create pipeline configuration in. Defaults to `base`.")
@click.pass_obj # this will pass the metadata as first argument
def create_pipeline(
metadata: ProjectMetadata, name, skip_config, env, **kwargs
metadata: ProjectMetadata, name, template_path, skip_config, env, **kwargs
): # noqa: unused-argument
"""Create a new modular pipeline by providing a name."""
package_dir = metadata.source_dir / metadata.package_name
Expand All @@ -107,7 +114,19 @@ def create_pipeline(
f"Make sure it exists in the project configuration."
)

result_path = _create_pipeline(name, package_dir / "pipelines")
# Precedence for template_path is: command line > project templates/pipeline dir > global default
# If passed on the CLI, click will verify that the path exists so no need to check again
if template_path is None:
# No path provided on the CLI, try `PROJECT_PATH/templates/pipeline`
template_path = Path(metadata.project_path / "templates" / "pipeline")

if not template_path.exists():
# and if that folder doesn't exist fall back to the global default
template_path = Path(kedro.__file__).parent / "templates" / "pipeline"

click.secho(f"Using pipeline template at: '{template_path}'")

result_path = _create_pipeline(name, template_path, package_dir / "pipelines")
_copy_pipeline_tests(name, result_path, package_dir)
_copy_pipeline_configs(result_path, project_conf_path, skip_config, env=env)
click.secho(f"\nPipeline '{name}' was successfully created.\n", fg="green")
Expand Down Expand Up @@ -191,12 +210,11 @@ def _echo_deletion_warning(message: str, **paths: list[Path]):
click.echo(indent(paths_str, " " * 2))


def _create_pipeline(name: str, output_dir: Path) -> Path:
def _create_pipeline(name: str, template_path: Path, output_dir: Path) -> Path:
with _filter_deprecation_warnings():
# noqa: import-outside-toplevel
from cookiecutter.main import cookiecutter

template_path = Path(kedro.__file__).parent / "templates" / "pipeline"
cookie_context = {"pipeline_name": name, "kedro_version": kedro.__version__}

click.echo(f"Creating the pipeline '{name}': ", nl=False)
Expand Down
50 changes: 50 additions & 0 deletions tests/framework/cli/pipeline/conftest.py
@@ -1,10 +1,24 @@
import json
import shutil
from pathlib import Path

import pytest

from kedro.framework.project import settings


def _write_json(filepath: Path, content: dict):
filepath.parent.mkdir(parents=True, exist_ok=True)
json_str = json.dumps(content, indent=4)
filepath.write_text(json_str)


def _write_dummy_file(filepath: Path, content: str = ""):
filepath.parent.mkdir(parents=True, exist_ok=True)
with filepath.open("w") as f:
f.write(content)


@pytest.fixture(autouse=True)
def cleanup_micropackages(fake_repo_path, fake_package_path):
packages = {p.name for p in fake_package_path.iterdir() if p.is_dir()}
Expand Down Expand Up @@ -82,3 +96,39 @@ def cleanup_pyproject_toml(fake_repo_path):
yield

pyproject_toml.write_text(existing_toml)


@pytest.fixture()
def fake_local_template_dir(fake_repo_path):
"""Set up a local template directory. This won't be functional; we're just testing that the layout works.

Note that this is not scoped to module because we don't want to have this folder present in most of the tests,
so we will tear it down every time.
"""
template_path = fake_repo_path / Path("templates")
pipeline_template_path = template_path / Path("pipeline")
cookiecutter_template_path = (
pipeline_template_path / "{{ cookiecutter.pipeline_name }}"
)

cookiecutter_template_path.mkdir(parents=True)

# Create the absolute bare minimum files
cookiecutter_json = {
"pipeline_name": "default",
}
_write_json(pipeline_template_path / "cookiecutter.json", cookiecutter_json)
_write_dummy_file(
cookiecutter_template_path / "pipeline_{{ cookiecutter.pipeline_name }}.py",
)
_write_dummy_file(cookiecutter_template_path / "__init__.py", "")
_write_dummy_file(
cookiecutter_template_path
/ r"config/parameters/{{ cookiecutter.pipeline_name }}.yml",
)
_write_dummy_file(
cookiecutter_template_path / r"tests/test_{{ cookiecutter.pipeline_name }}.py",
)
yield template_path.resolve()

shutil.rmtree(template_path)
72 changes: 72 additions & 0 deletions tests/framework/cli/pipeline/test_pipeline.py
Expand Up @@ -79,6 +79,78 @@ def test_create_pipeline( # pylint: disable=too-many-locals
actual_files = {f.name for f in test_dir.iterdir()}
assert actual_files == expected_files

@pytest.mark.parametrize("env", [None, "local"])
def test_create_pipeline_template( # pylint: disable=too-many-locals
self,
fake_repo_path,
fake_project_cli,
fake_metadata,
env,
fake_package_path,
fake_local_template_dir,
):
pipelines_dir = fake_package_path / "pipelines"
assert pipelines_dir.is_dir()

assert not (pipelines_dir / PIPELINE_NAME).exists()

cmd = ["pipeline", "create", PIPELINE_NAME]
cmd += ["-e", env] if env else []
result = CliRunner().invoke(fake_project_cli, cmd, obj=fake_metadata)

assert (
f"Using pipeline template at: '{fake_repo_path / 'templates'}"
in result.output
)
assert f"Creating the pipeline '{PIPELINE_NAME}': OK" in result.output
assert f"Location: '{pipelines_dir / PIPELINE_NAME}'" in result.output
assert f"Pipeline '{PIPELINE_NAME}' was successfully created." in result.output

# Dummy pipeline rendered correctly
assert (pipelines_dir / PIPELINE_NAME / f"pipeline_{PIPELINE_NAME}.py").exists()

assert result.exit_code == 0

@pytest.mark.parametrize("env", [None, "local"])
def test_create_pipeline_template_command_line_override( # pylint: disable=too-many-locals
self,
fake_repo_path,
fake_project_cli,
fake_metadata,
env,
fake_package_path,
fake_local_template_dir,
):
pipelines_dir = fake_package_path / "pipelines"
assert pipelines_dir.is_dir()

assert not (pipelines_dir / PIPELINE_NAME).exists()

# Copy the local template dir elsewhere so we know the command line flag is taking precedence
try:
# Can skip if already there; copytree's dirs_exist_ok flag only exists in Python 3.8+
shutil.copytree(fake_local_template_dir, fake_repo_path / "local_templates")
except FileExistsError:
pass

cmd = ["pipeline", "create", PIPELINE_NAME]
cmd += ["-t", str(fake_repo_path / "local_templates/pipeline")]
cmd += ["-e", env] if env else []
result = CliRunner().invoke(fake_project_cli, cmd, obj=fake_metadata)

assert (
f"Using pipeline template at: '{fake_repo_path / 'local_templates'}"
in result.output
)
assert f"Creating the pipeline '{PIPELINE_NAME}': OK" in result.output
assert f"Location: '{pipelines_dir / PIPELINE_NAME}'" in result.output
assert f"Pipeline '{PIPELINE_NAME}' was successfully created." in result.output

# Dummy pipeline rendered correctly
assert (pipelines_dir / PIPELINE_NAME / f"pipeline_{PIPELINE_NAME}.py").exists()

assert result.exit_code == 0

@pytest.mark.parametrize("env", [None, "local"])
def test_create_pipeline_skip_config(
self, fake_repo_path, fake_project_cli, fake_metadata, env
Expand Down