Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

238 giving a fraction of samples instead of a number of samples in the subsample class #464

Conversation

BaptisteCalot
Copy link
Collaborator

@BaptisteCalot BaptisteCalot commented Jun 12, 2024

Description

Fix addressing issue 238. We introduce the ability to create a training set using the split method of the subsample class with a fraction between 0 and 1 representing the proportion of the training set. The option to specify an integer representing the number of elements in the training set is still retained

Type of change

  • Bug fix

In the case where the attribute self.n_samples is a float, the feature n_samples used in the split method of Subsample class becomes self.n_samples * X.shape[0], taking its floor integer part

How Has This Been Tested?

  • Test test_split_SubSample_n_samples

We check that the training and test sets build using the split method are as expected with the given seed. We use two instances of Subsample, one with an integer n_samples and the other with an n_samples less than 1

Checklist

  • I have read the contributing guidelines
  • I have updated the HISTORY.rst and AUTHORS.rst files
  • Linting passes successfully : make lint
  • Typing passes successfully : make type-check
  • Unit tests pass successfully : make tests
  • Coverage is 100% : make coverage
  • Documentation builds successfully : make doc

Copy link
Collaborator

@LacombeLouis LacombeLouis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. I think we have some issues we want to clear up! But otherwise looks like a good PR. Note to also give a response to the issue #231!

HISTORY.rst Outdated Show resolved Hide resolved
mapie/subsample.py Outdated Show resolved Hide resolved
mapie/tests/test_subsample.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@thibaultcordier thibaultcordier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your PR. I suggest you write separate unit tests for better code maintenance.

mapie/subsample.py Outdated Show resolved Hide resolved
mapie/subsample.py Outdated Show resolved Hide resolved
mapie/tests/test_subsample.py Outdated Show resolved Hide resolved
mapie/subsample.py Outdated Show resolved Hide resolved
mapie/subsample.py Outdated Show resolved Hide resolved
mapie/tests/test_utils.py Outdated Show resolved Hide resolved
mapie/utils.py Outdated Show resolved Hide resolved
mapie/utils.py Show resolved Hide resolved
mapie/utils.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@LacombeLouis LacombeLouis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @BaptisteCalot,
Thank you for this PR! I think there are still some adjustments to make! As mentioned in issue #231, I think it's important to have a default value.
Thank you!

mapie/subsample.py Outdated Show resolved Hide resolved
mapie/tests/test_utils.py Outdated Show resolved Hide resolved
mapie/utils.py Show resolved Hide resolved
mapie/utils.py Show resolved Hide resolved
mapie/tests/test_utils.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@LacombeLouis LacombeLouis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @BaptisteCalot,
Perfect, thank you! Please quickly mention the error that will be obtained when replace=False. Otherwise, I think this can be validated today!

LacombeLouis and others added 2 commits June 19, 2024 17:56
Co-authored-by: Thibault Cordier <124613154+thibaultcordier@users.noreply.github.com>
mapie/utils.py Outdated Show resolved Hide resolved
@BaptisteCalot BaptisteCalot merged commit 4c25001 into master Jun 20, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Giving a fraction of samples instead of a number of samples in the Subsample class
3 participants