Example for `preprocessing.dictmapper.DictMapper` and `meta.outlier_classifier.OutlierClassifier` #646

anopsy · 2024-03-24T16:36:57Z

Example added to preprocessing.dictmapper.DictMapper is the correct pull request.

Hi folks,
Today I wrote the example for DictMapper; this one felt a bit tricky.

Since DictMapper uses a dictionary to map values to ints, I tried to think of a useful example that does not mimic LabelEncoder but maps to meaningful values (the population of a city, the ranking of a university). If I went too far and should go back to a more abstract example, let me know.

What caught my attention is that if you want to cover multiple columns in a df you need to create a dict that includes all of the values just as I did in my example. Is my intuition that this seems a bit clumsy correct?

I also had my doubts about the proper style for spacing between comments in the dict; this version got all the green checks from the Codespace tools, so I went with it. Btw, this time I checked how the docstrings render via mkdocs on the docs site ;)

I'm very curious what the purpose of DictMapper was. :D

Before working on a large PR, please check with @FBruzzesi or @koaning to confirm that they agree with the direction of the PR. This discussion should take place in a Github issue before working on the PR, unless it's a minor change like spelling in the docs.

Description

Example of how to use DictMapper added to docstrings

Partially Fixes #596

Type of change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

My code follows the style guidelines (ruff)
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation (also to the readme.md)
I have added tests that prove my fix is effective or that my feature works
I have added tests to check whether the new feature adheres to the sklearn convention
New and existing unit tests pass locally with my changes

If you feel your PR is ready for a review, ping @FBruzzesi or @koaning.

FBruzzesi · 2024-03-24T17:24:34Z

Hey @anopsy, thanks for the PR.

One suggestion I have (and I should add it to the issue referencing these examples) is to combine the examples into a single one in the top-level docstring of the class. That way users will see the following sections: Parameters, Attributes, Example, and then the list of methods, rather than separate examples in each method.

Example in documentation and its code.

What caught my attention is that if you want to cover multiple columns in a df you need to create a dict that includes all of the values just as I did in my example. Is my intuition that this seems a bit clumsy correct?

This feels buggy, as different columns could share values that one might want to map to different outputs. It may be useful to either:

report it as an issue and think if there is a way to fix it
at least document how to work around it (e.g. using sklego ColumnSelector/skrub SelectCols in combination with scikit-learn FeatureUnion)
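To make the concern concrete, here is a minimal plain-Python sketch of the collision and of the per-column idea behind the workaround. It does not use sklego or scikit-learn at all: `map_table` and `map_columns` are hypothetical helpers, and the numbers are invented for illustration.

```python
# Plain-Python sketch of the shared-dictionary problem; not sklego code.
# Both helpers are hypothetical and only mimic a dict lookup with a default.


def map_table(rows, mapper, default):
    """Map every cell of every row through one shared dictionary."""
    return [[mapper.get(cell, default) for cell in row] for row in rows]


def map_columns(rows, mappers, default):
    """Map each column through its own dictionary (one dict per column index)."""
    return [
        [mappers[col].get(cell, default) for col, cell in enumerate(row)]
        for row in rows
    ]


# Column 0 holds a city, column 1 a university; "Leiden" appears in both.
rows = [["Amsterdam", "Leiden"], ["Leiden", "Utrecht"]]

# One shared dict: "Leiden" can only map to a single value (invented numbers).
shared = {"Amsterdam": 900, "Leiden": 120, "Utrecht": 360}
print(map_table(rows, shared, 0))  # [[900, 120], [120, 360]]

# One dict per column: the same key can mean different things per column.
per_column = [
    {"Amsterdam": 900, "Leiden": 120},  # column 0: made-up population (thousands)
    {"Leiden": 1, "Utrecht": 2},        # column 1: made-up ranking
]
print(map_columns(rows, per_column, 0))  # [[900, 1], [120, 2]]
```

With the shared dictionary, "Leiden" in the university column receives the city value; routing each column to its own dictionary avoids that, which is what selecting columns before mapping would achieve.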

anopsy · 2024-03-24T18:02:00Z

Thank you @FBruzzesi,
sure, I will combine the examples into one next time.
I also thought that maybe the author meant it to be used with SelectCols. If that's the case, I can change the example to a single-column df, or combine it with SelectCols if it can be used in a pipeline.

anopsy · 2024-03-25T14:00:15Z

Hi, I have a question about committing from a Codespace. I opened the Codespace from my fork, ran git pull origin examples (examples is the name of my branch), then added and committed the changes, and then ran git push origin examples. The problem is that I can't see the Compare & pull request button anywhere. I'm completely new to Codespaces; I tried to google this but couldn't find a specific answer. Should I be opening the Codespace from the original scikit-lego repository for git push origin to work? I hope it's okay to ask this here. If not, let me know, and sorry for the inconvenience.

koaning · 2024-03-25T14:09:51Z

You probably pushed to your own repository on Github. If you look at your own branches do you see the branch that you've just worked on?

FBruzzesi · 2024-03-25T14:10:46Z

Hey there! I have never worked with Codespaces, but in general, once a PR is open, every pushed commit will automatically appear in the PR without the need to Compare & pull request.

If the commits in the screenshots are what you refer to, then you can see them in the PR changes already.

anopsy · 2024-03-25T14:29:56Z

You probably pushed to your own repository on Github. If you look at your own branches do you see the branch that you've just worked on?

Yes and it says: This branch is 5 commits ahead of koaning/scikit-lego:main.

once a PR is open, every pushed commit will automatically appear in the PR without the need to Compare & pull request.

That's convenient to know, thank you Francesco!

If the commits in the screenshots are what you refer to, then you can see them in the PR changes already.

Yes those are mine! Cool, so it worked after all, thank you folks!

FBruzzesi · 2024-03-27T08:14:30Z

sklego/meta/outlier_classifier.py

@@ -52,6 +52,33 @@ def fit(self, X, y=None):
        ValueError
            - If the underlying model is not an outlier detection model.
            - If the underlying model does not have a `decision_function` method.
+
+        Example


Hey @anopsy, I have the same feedback as for DictMapper: if you could move the example up in the docstring, I think it would be easier and faster for folks to find when scrolling through the API documentation, without the need to step down into the .fit(..) method.

I think this example is ready to merge after that change

I will do that!
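For reference, the layout suggested in this thread can be sketched roughly as follows. `ToyMapper` is a hypothetical class written only for this illustration (it is not the sklego implementation); the point is that the Example section sits in the class docstring next to Parameters, while the method docstrings stay short.

```python
import doctest


class ToyMapper:
    """Map values through a dictionary (hypothetical class, illustration only).

    Parameters
    ----------
    mapper : dict
        Lookup table from input values to output values.
    default : object
        Value returned for inputs that are missing from `mapper`.

    Example
    -------
    >>> tm = ToyMapper({"a": 1, "b": 2}, default=0)
    >>> tm.transform(["a", "b", "c"])
    [1, 2, 0]
    """

    def __init__(self, mapper, default):
        self.mapper = mapper
        self.default = default

    def transform(self, values):
        """Return the mapped values; the usage example lives on the class."""
        return [self.mapper.get(v, self.default) for v in values]


if __name__ == "__main__":
    # Run the class-level docstring example as a doctest.
    doctest.testmod()
```

Keeping the example at class level also means a single doctest covers the whole fit/transform flow, instead of one fragment per method.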

FBruzzesi · 2024-03-27T08:20:40Z

sklego/preprocessing/dictmapper.py

@@ -43,6 +43,41 @@ def fit(self, X, y=None):
        -------
        self : DictMapper
            The fitted transformer.
+
+        Example


If you manage to add how to make it interact with either sklego.preprocessing.ColumnSelector or sklearn.compose.ColumnTransformer, I believe it would be of great help, and the PR would be ready to merge.

Yes, this double dict was making me really uncomfortable 😅 I'm on it

I did the fixes, but I'm unable to push them. I need to figure out what's going on and will be back 😅

Yes, I was also confused. However, I finally figured out what was happening and was able to push this time. The problem was that one of the files, sklego.model_selection.py, which I wasn’t even working on, failed the ruff-format check and was reformatted by ruff. Since I hadn’t worked on it, I decided to revert that change (git restore) and attempted to commit only the two files I had been working on. But after pushing, I received the message “Everything up-to-date” and didn’t see any changes on my branch. Today, I decided to accept the formatting changes made by ruff, and I was finally able to push. I hope the formatter didn’t break anything in model_selection.py. What’s the proper way to handle this kind of problem in the future?

sklego/preprocessing/dictmapper.py

FBruzzesi

Looks good 🎉 Thanks for the changes!

anopsy added 2 commits March 24, 2024 15:58

Added an example to preprocessing.dictmapper.DictMapper

e237b68

Example added to preprocessing.dictmapper.DictMapper

7dbc8d2

anopsy added 3 commits March 25, 2024 13:20

Example added to meta.outlierclassifier.OutlierClassifier

5097b30

Merge branch 'koaning:main' into examples

a6cec13

Added example to meta.outlier_classifier.Outlier.Classifier and fixed little mistake I made in outlier_remover

e22eeb2

FBruzzesi reviewed Mar 27, 2024

moved examples to class docstring, used compose.ColumnTransformer in dictmapper

bf4a91c

FBruzzesi reviewed Mar 29, 2024

Update sklego/preprocessing/dictmapper.py

d9667ce

FBruzzesi approved these changes Mar 29, 2024

FBruzzesi changed the title ~~Example of preprocessing.dictmapper.DictMapper added~~ Example for preprocessing.dictmapper.DictMapper and meta.outlier_classifier.OutlierClassifier Mar 29, 2024

FBruzzesi merged commit 14ef241 into koaning:main Mar 29, 2024
17 checks passed

FBruzzesi mentioned this pull request Apr 8, 2024

[DOCS] Example usage in docstring #596

Open

33 tasks

This pull request was closed.
