This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Installable Kedro catalog (Mini-Kedro) #2741

Closed
WaylonWalker opened this issue Jan 9, 2021 · 11 comments
Labels
Issue: Feature Request New feature or improvement to existing feature

Comments

@WaylonWalker
Contributor

Would it make sense to make mini-kedro installable? My use case for projects like that is users who are doing EDA and just want easy access to the data with no fuss.

If this makes sense, I propose adding a setup.py to make it installable, plus a single module that sets the catalog up for them. They can then access the project's data as follows.

import my_mini_kedro as mmk
mmk.catalog.load('my_dataset')

Alternatively

This could also be a separate starter that is a kedro-catalog starter.

@JavierHernandezMontes

@WaylonWalker do you know a way to load a specific dataset from the catalog with the current version of Kedro (0.17.0), as you propose?

@WaylonWalker
Contributor Author

I just added a very basic setup.py and an __init__.py with a catalog loader to make it work.

@WaylonWalker
Contributor Author

@JavierHernandezMontes is your data local or remote? Local data that needs to be packaged in might be a bit trickier.

@noklam
Contributor

noklam commented Jul 12, 2022

@WaylonWalker Is this still relevant with the IPython extension?

@astrojuanlu
Member

At the moment, using Kedro without the project template is entirely possible: run pip install kedro and then instantiate the catalog directly:

from kedro.config import OmegaConfigLoader
from kedro.io import DataCatalog

conf_loader = OmegaConfigLoader("conf")
conf_catalog = conf_loader.get("catalog")
catalog = DataCatalog.from_config(conf_catalog)

catalog.load(...)

On IPython & Jupyter, this is one line:

%load_ext kedro.ipython

catalog.load(...)

I agree it would be nice to make the boilerplate above go away, so I'm moving this issue and renaming it for our consideration. It will have more visibility on the framework repo.

@astrojuanlu astrojuanlu changed the title Installable Mini-Kedro Installable Kedro catalog (Mini-Kedro) Jun 28, 2023
@astrojuanlu astrojuanlu transferred this issue from kedro-org/kedro-starters Jun 28, 2023
@astrojuanlu astrojuanlu added Issue: Feature Request New feature or improvement to existing feature Community Issue/PR opened by the open-source community labels Sep 27, 2023
@astrojuanlu
Member

More evidence of users using the Kedro catalog without the pipelines: #2898 (comment)

@merelcht merelcht removed the Community Issue/PR opened by the open-source community label Jul 8, 2024
@astrojuanlu
Member

Interesting realisation today. Users have been able to use the DataCatalog as a standalone component since basically forever. The mini-kedro starter was created in Kedro 0.17.0:

https://github.com/kedro-org/kedro-starters/blob/0.17.0/mini-kedro/%7B%7B%20cookiecutter.repo_name%20%7D%7D/Example%20Notebook.ipynb

Then in 2023 we made a big push to show that you can use Kedro's components standalone (#2855, #3128).

And yet, even some of our power users had no idea it's possible to do this.

In #3659 I gave a detailed rationale of the technical reasons why it's just better to have fewer dependencies and enable users to install only the parts they want. But I was missing the marketing reasons: old users have preconceived notions about what Kedro can and cannot do, and new users find a wall of documentation explaining the framework way, so most don't realise that there's a library way as well.

@astrojuanlu
Member

This is still important for us.

As part of an issue cleanup, and in line with a new scheme in which we want to use Discussions for feature requests & enhancement proposals (#3767), I'm moving this to a discussion. Let's continue there.

@kedro-org kedro-org locked and limited conversation to collaborators Oct 31, 2024
@astrojuanlu astrojuanlu converted this issue into discussion #4273 Oct 31, 2024
