ENH Add slides to introduce cross validation #492
LGTM but can you please insert the slides at the expected location of the video in the jupyter book as done for the other videos?
I had to forge the URL to be able to preview the slides and check that there were no missing figure files:
I think they should be inserted after https://2425-246063957-gh.circle-artifacts.com/0/jupyter-book/python_scripts/02_numerical_pipeline_scaling.html but before the module quiz.
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
The idea is to split the numerical pipeline scaling notebook in two, creating a new notebook that covers only cross validation, and then insert the video between them. I am already working on such a PR, but I need #477 to be reviewed first to avoid conflicts.
This is done in #498. We can either merge that PR first and modify this PR to modify the |
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org> 3479007
Partially addresses #445.
The goal of these slides is to dynamically show the splitting into folds using the `KFold` and `ShuffleSplit` strategies.
They also serve as a motivation for the concept of "variability of an estimated generalization performance".
Finally, they encourage "data scientists in the wild" to follow good practices for scoring.
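For readers of this PR who want a concrete picture of the two splitting strategies the slides animate, here is a minimal sketch in plain Python. It is not the code used in the slides and does not call scikit-learn: the helper names `k_fold_indices` and `shuffle_split_indices`, the `n_test` and `seed` parameters, and the toy size of 10 samples are all assumptions made for illustration. The real `KFold` and `ShuffleSplit` classes in scikit-learn offer more options.

```python
import random


def k_fold_indices(n_samples, n_splits):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [
        n_samples // n_splits + (1 if i < n_samples % n_splits else 0)
        for i in range(n_splits)
    ]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def shuffle_split_indices(n_samples, n_splits, n_test, seed=0):
    """Yield (train, test) index lists drawn from independent random permutations.

    Unlike k-fold, a sample may land in several test sets, or in none.
    """
    rng = random.Random(seed)  # hypothetical seed, only for reproducibility
    for _ in range(n_splits):
        permuted = list(range(n_samples))
        rng.shuffle(permuted)
        yield sorted(permuted[n_test:]), sorted(permuted[:n_test])


# Toy setup (assumed values): 10 samples, 5 splits, 2 test samples per split.
kfold_splits = list(k_fold_indices(10, 5))
shuffle_splits = list(shuffle_split_indices(10, 5, n_test=2, seed=0))

for name, splits in [("k-fold", kfold_splits), ("shuffle-split", shuffle_splits)]:
    print(name)
    for train, test in splits:
        print(f"  train={train} test={test}")
```

Scoring a model once per split yields one score per train/test pair, and the spread of those scores is what the slides refer to as the variability of the estimated generalization performance.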