238 giving a fraction of samples instead of a number of samples in the subsample class #464
Thank you. There are a few issues to clear up, but otherwise this looks like a good PR. Please also respond to issue #231!
Thank you for your PR. I suggest you write separate unit tests for better code maintenance.
Hey @BaptisteCalot,
Thank you for this PR! I think there are still some adjustments to make! As mentioned in issue #231, I think it's important to have a default value.
Thank you!
Hey @BaptisteCalot,
Perfect, thank you! Please briefly mention the error that is raised when replace=False. Otherwise, I think this can be validated today!
Co-authored-by: Thibault Cordier <124613154+thibaultcordier@users.noreply.github.com>
Description
Fixes issue 238. This PR lets the split method of the Subsample class build a training set from a fraction between 0 and 1 that gives the proportion of samples to draw. Specifying an integer number of training samples is still supported.
Type of change
When the attribute self.n_samples is a float, the n_samples value used in the split method of the Subsample class becomes the floor of self.n_samples * X.shape[0].
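The conversion described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper name `resolve_n_samples` is hypothetical, and rejecting floats outside the open interval (0, 1) is an assumption about how invalid fractions might be handled.

```python
import math
from typing import Union


def resolve_n_samples(n_samples: Union[int, float], n_total: int) -> int:
    """Hypothetical helper: turn an int or a fraction into a sample count.

    n_total plays the role of X.shape[0] in the split method.
    """
    if isinstance(n_samples, float):
        # Assumption: a float must be a proportion strictly between 0 and 1.
        if not 0 < n_samples < 1:
            raise ValueError("A float n_samples must lie strictly between 0 and 1.")
        # Floor of the product, as stated in the PR description.
        return int(math.floor(n_samples * n_total))
    # An integer is used as-is: it already is a number of samples.
    return int(n_samples)
```

For example, `resolve_n_samples(0.25, 10)` floors 2.5 down to 2, while `resolve_n_samples(4, 10)` returns 4 unchanged.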
How Has This Been Tested?
We check that the training and test sets built by the split method are as expected for a given seed. We use two instances of Subsample: one with an integer n_samples and the other with a fractional n_samples less than 1.
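The shape of such a check can be sketched with NumPy alone. The function `split_once` below is a hypothetical stand-in for a single resampling, not the Subsample implementation. It assumes sampling with replacement and assumes the test set is made of the indices that were not drawn for training.

```python
import numpy as np


def split_once(n_total, n_samples, seed):
    """Hypothetical stand-in for one resampling of a subsample splitter."""
    rng = np.random.RandomState(seed)
    if isinstance(n_samples, float):
        # Fraction of the data: floor of n_samples * n_total.
        n_samples = int(np.floor(n_samples * n_total))
    indices = np.arange(n_total)
    # Assumption: training indices are drawn with replacement.
    train = rng.choice(indices, size=n_samples, replace=True)
    # Assumption: the test set is every index not drawn for training.
    test = np.setdiff1d(indices, train)
    return train, test


# Same seed, same effective sample count: 5 samples versus 50% of 10 rows.
train_int, test_int = split_once(10, 5, seed=0)
train_frac, test_frac = split_once(10, 0.5, seed=0)

# With a fixed seed, both parametrizations should give identical splits.
assert np.array_equal(train_int, train_frac)
assert np.array_equal(test_int, test_frac)
```

Comparing an integer and a fractional parametrization that resolve to the same count, under the same seed, tests the conversion without hard-coding index values.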
Checklist
make lint
make type-check
make tests
make coverage
make doc