Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT, API Change trunk-classification into three separate functions for generating trunk, trunk-mix, trunk-overlap and marron-wand #227

Merged
merged 8 commits into from
Feb 23, 2024

Conversation

adam2392
Copy link
Collaborator

@adam2392 adam2392 commented Feb 21, 2024

Changes proposed in this pull request:

  • separates make_trunk_classification into three functions: make_trunk_class*, make_trunk_mixture_class*, and make_marron_wand_class* so that we can separate logic for simulating trunk, trunk-overlap, trunk-mix and marron-wand simulations

Before submitting

  • I've read and followed all steps in the Making a pull request
    section of the CONTRIBUTING docs.
  • I've updated or added any relevant docstrings following the syntax described in the
    Writing docstrings section of the CONTRIBUTING docs.
  • If this PR fixes a bug, I've added a test that will fail without my fix.
  • If this PR adds a new feature, I've added tests that sufficiently cover my new functionality.

After submitting

  • All GitHub Actions jobs for my pull request have passed.

Signed-off-by: Adam Li <adam2392@gmail.com>
@@ -205,7 +213,7 @@ def make_trunk_classification(
)
)
elif simulation == "trunk_mix":
mixture_idx = rng.choice(2, n_samples // 2, replace=True, shuffle=True, p=[mix, 1 - mix])
mixture_idx = rng.choice(2, n_samples // 2, replace=True, shuffle=True, p=[mix, 1 - mix]) # type: ignore
norm_params = [[mu_0, cov * (2 / 3) ** 2], [mu_1, cov * (2 / 3) ** 2]]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sampan501 can you explain why we have a (2/3)**2 here and only in trunk-mix?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When variance is 1, trunk-mix does not look bimodal at low dimensions. I set it to (2/3)**2 since that is consistent with Marron and Wand bimodal

@@ -205,7 +213,7 @@ def make_trunk_classification(
)
)
elif simulation == "trunk_mix":
mixture_idx = rng.choice(2, n_samples // 2, replace=True, shuffle=True, p=[mix, 1 - mix])
mixture_idx = rng.choice(2, n_samples // 2, replace=True, shuffle=True, p=[mix, 1 - mix]) # type: ignore
norm_params = [[mu_0, cov * (2 / 3) ** 2], [mu_1, cov * (2 / 3) ** 2]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When variance is 1, trunk-mix does not look bimodal at low dimensions. I set it to (2/3)**2 since that is consistent with Marron and Wand bimodal

Signed-off-by: Adam Li <adam2392@gmail.com>
Copy link

codecov bot commented Feb 22, 2024

Codecov Report

Attention: Patch coverage is 86.95652% with 12 lines in your changes are missing coverage. Please review.

Project coverage is 89.67%. Comparing base (95e2597) to head (e106f85).
Report is 1 commits behind head on main.

❗ Current head e106f85 differs from pull request most recent head 0706fe6. Consider uploading reports for the commit 0706fe6 to get more accurate results

Files Patch % Lines
sktree/datasets/hyppo.py 81.25% 12 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #227      +/-   ##
==========================================
- Coverage   89.68%   89.67%   -0.01%     
==========================================
  Files          54       54              
  Lines        4963     4978      +15     
==========================================
+ Hits         4451     4464      +13     
- Misses        512      514       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@sampan501
Copy link
Member

@adam2392 Good to merge after changelog update

Signed-off-by: Adam Li <adam2392@gmail.com>
sktree/datasets/hyppo.py Outdated Show resolved Hide resolved
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
@adam2392 adam2392 changed the title MAINT Patch up trunk-classification and its usage with respect to parameters MAINT, API Patch up trunk-classification and its usage with respect to parameters Feb 23, 2024
@adam2392 adam2392 changed the title MAINT, API Patch up trunk-classification and its usage with respect to parameters MAINT, API Change trunk-classification into three separate functions for generating trunk, trunk-mix, trunk-overlap and marron-wand Feb 23, 2024
Copy link
Member

@sampan501 sampan501 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! For simplicity, there is a lot of error checks that are common for all 3 functions. Maybe we can wrap those up into a private method?

@sampan501
Copy link
Member

Other than that, pending CI, lgtm

Signed-off-by: Adam Li <adam2392@gmail.com>
@adam2392
Copy link
Collaborator Author

adam2392 commented Feb 23, 2024

Looks good! For simplicity, there is a lot of error checks that are common for all 3 functions. Maybe we can wrap those up into a private method?

Good idea. Rn tho there's only a few. We could migrate the covariance matrix stuff to a _default_covariance(...) which handles the check and setting of covariance possibly. But maybe we can leave for later depending on any future added complexity. Otw over-engineering maybe :p

@sampan501
Copy link
Member

Fair, I'll make an issue for it

@sampan501 sampan501 self-requested a review February 23, 2024 17:50
Signed-off-by: Adam Li <adam2392@gmail.com>
@adam2392 adam2392 merged commit 928a855 into main Feb 23, 2024
27 checks passed
@adam2392 adam2392 deleted the trunk branch February 23, 2024 19:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants