This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
The development of custom templated kedro starters is impractical #1961
Comments
@foxale Thank you for your thoughtful and well-written issue. Is the goal to develop a template for individual use cases? The template is meant to be a stable structure for projects. If you are developing a new template, I think making the template should be the final step rather than the first. You would start by developing a new project and iterate on it until you find a reusable structure. Once you have this structure, you create a template out of it and share it across different projects.
Does "local starter" mean your custom template? I think there are 2 levels of code sharing, and starters shouldn't be used as a replacement for a Python Library.
This is indeed the purpose of a starter, and it shouldn't change rapidly during project development. What kind of code changes are you introducing? IMO, starters are good for the code/files that you need to copy-paste over and over again, but they are not modules/functions (source code).
Hi @foxale, do you have any more thoughts or ideas to add here, also following on from @noklam's comment? We really appreciate the feedback and want to improve your experience using Kedro. However, starter template improvements aren't a huge priority at the moment, so understanding your pain points completely will help us give this issue the right attention.
(1) could be solved by https://github.com/copier-org/copier update capabilities, (5) could be solved by copier ability to apply a template to the current directory (which |
We haven't heard from the original author for a while and I'm wondering if this is something we want to tackle any time soon? Replacing |
I agree we have other areas to focus on currently, but I think we should keep this issue open. There's a long list of problems our current There's enough evidence that we have to do something, but whatever we do must not be part of the |
Apologies, shortly after writing the post I moved on to a different engagement and eventually to a new job.
Right, but in order to have it you first need to create it. And even after, let's say, completing the template, there is the usual maintenance effort: version bumps, bugfixes, adding new extensions, etc. At that time our templates were quite sophisticated, and we wanted our end users (other data scientists) to use them while we were still working on them, so there was no way to make the "project to template" conversion the final step. I remember that around the time I wrote the original post, I came to the realization that it would probably be easier to maintain a codebase composed of a single project instead of a template; with the addition of proper config it would have the exact same capabilities but be far easier to maintain.
I still think there are some very valid concerns raised here, but I'm turning this into a discussion to continue the conversation there.
Description and context
So to give you some background, we work with a bunch of business use cases involving data processing and machine learning (customer segmentation, next best offer, etc.) and for each business problem we want to have a generalized but customizable solution.
When I first read about kedro starters, it sounded like a perfect match: you can create a custom prompt asking users (data scientists) for the name of the project they're working on, the data source where their data is stored, and the credentials for that data source, and... boom! Automagically the project is all set up and ready to run.
But after a few sprints of starter development we started realizing how much harder and slower the development and maintenance of a starter is compared to a regular, non-templated project. The cons we came across:
The flow goes like this:
Which is five painful steps, as opposed to just creating a new branch + (ii) + (v).
Should testing and linting dependencies and configuration be dropped from starters and / or templates? #1849
kedro-dbt plugin #1813
kedro new
Poetry Support for Kedro Projects #1722
And the pros? So far it looks like the whole idea of (at least custom templated) starters could be reduced to a global config file and git operations.
I wouldn't be surprised if it turned out I'm missing one or more crucial details that make starters valuable, so if that's the case, please just fill me in :)
Possible Implementation
I can see how the templated starters look cool, but what’s the real benefit here?
Possible Alternatives
Project config + git CLI
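To make the alternative a bit more concrete, here is a minimal sketch of what "project config + git CLI" could look like. Everything in it is an assumption for illustration: the function names, the `project_config.json` file name, the `project/<name>` branch naming and the JSON keys are hypothetical, and none of it is part of Kedro. The idea is that the answers a starter prompt would collect live in a plain config file, and a new project is just a clone of a regular (non-templated) base repository plus a branch.

```python
import json
import subprocess
from pathlib import Path


def load_project_config(path):
    """Read the settings a starter prompt would otherwise ask for (hypothetical JSON layout)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def git_commands(base_repo, config):
    """Build, without running, the git commands that derive a project from a base repo."""
    target = config["project_name"]
    return [
        # Clone the regular, non-templated base project into a new directory.
        ["git", "clone", base_repo, target],
        # Create a dedicated branch for this use case (branch naming is an assumption).
        ["git", "-C", target, "switch", "-c", f"project/{target}"],
    ]


def create_project(base_repo, config_path):
    """Run the git commands, then store the config inside the new project directory."""
    config = load_project_config(config_path)
    for cmd in git_commands(base_repo, config):
        subprocess.run(cmd, check=True)
    out = Path(config["project_name"]) / "project_config.json"
    out.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return out
```

With a config such as `{"project_name": "segmentation", "data_source": "..."}`, calling `create_project(<base repo URL>, "config.json")` would clone the base project and switch to a `project/segmentation` branch. Credentials are deliberately left out of the sketch, since they should not be committed alongside the project config.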