Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introducing FormulaicTransformer and deprecating PatsyTransformer #593

Merged
merged 5 commits into from
Oct 27, 2023

Conversation

FBruzzesi
Copy link
Collaborator

Description

Addresses #559 by deprecating PatsyTransformer and introducing a FormulaicTransformer.

In writing of FormulaicTranformer I followed what is described in the formulaic documentation for integrating with sklearn. The main (tiny) differences with such implementation can be summarized as:

  • keeping the return_type parameter available (similar to what PatsyTransformer does)
  • avoiding parsing of parameters in __init__ method
  • how fit check is performed in .transform()

Additionally the PR makes both dependencies optional.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

  • My code follows the style guidelines (flake8)
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (also to the readme.md)
  • I have added tests that prove my fix is effective or that my feature works
  • I have added tests to check whether the new feature adheres to the sklearn convention
  • New and existing unit tests pass locally with my changes

@deprecated(
version="0.6.17",
reason="Please use `sklego.preprocessing.FormulaicTransformer` instead. "
"This object will be removed from the preprocessing submodule in version 0.9.0.",
Copy link
Collaborator Author

@FBruzzesi FBruzzesi Oct 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally made up version number in which to remove😂

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries :) That said ... I'd be fine with 0.8.0.

assert pipe.fit(X, y).predict(X).shape[0] == X.shape[0]


def test_unseen_categories(df):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Differently from patsy, formulaic raises a warning but doesn't break with unseen categories

@FBruzzesi FBruzzesi changed the title Feature/patsy formulaic Introducing FormulaicTransformer and deprecating PatsyTransformer Oct 27, 2023
assert isinstance(df_fit_transformed, expected_type)



Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One extra test for this one, could we confirm that we're able to save/load this component? This was an issue with patsy because it wouldn't pickle.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just added pickling test.
Also found in the formulaic changelogs that from 0.3.0:

Pickleability is now guaranteed and tested via unit tests. Failure to pickle any formulaic metadata object (such as formulas, model specs, etc) is considered a bug.

Copy link
Owner

@koaning koaning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've only been able to glance it but it looks good! It'd be nice to add one extra test though, just to be sure.

explosion-bot

This comment was marked as off-topic.

Copy link
Owner

@koaning koaning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty neat :) thanks!

@koaning koaning merged commit ecc11aa into koaning:main Oct 27, 2023
7 checks passed
@FBruzzesi FBruzzesi deleted the feature/patsy-formulaic branch October 27, 2023 12:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants