Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Implement LabelEncoder methods and add fit method test #28702

Merged
merged 9 commits into from
Aug 10, 2024

Conversation

muzakkirhussain011
Copy link
Contributor

@muzakkirhussain011 muzakkirhussain011 commented Mar 29, 2024

PR: Enhancements to LabelEncoder

Overview

This PR introduces critical enhancements to the LabelEncoder class, enabling the conversion of string labels into numerical identifiers. Due to the limitation of ivy.array not accepting string elements, we've utilized the list data type for encoding operations. This ensures compatibility across all backends and maintains the core functionality of the LabelEncoder.

Details

  • String Handling: The LabelEncoder now uses lists to process and encode string objects into numerical identifiers. This change is due to ivy.array's inability to handle string types, which are essential for the LabelEncoder's operation.
  • Backend Compatibility: By employing lists, we ensure that the LabelEncoder remains functional across various computational backends, as lists are universally supported.
  • Testing Rigor: Tests for the fit method are consistently passing, confirming the method's reliability in identifying and ordering unique labels.
  • Method Prerequisites: The tests for transform, fit_transform, and inverse_transform methods are failing because these methods require the LabelEncoder to be fitted with data first. This prerequisite is not currently met within the automated testing setup, leading to failures.
  • Manual Verification: Manual testing of the transform and fit_transform methods with direct input has been successful, indicating that the methods function correctly when the encoder is properly fitted.

Attachments

  • Screenshot of the fit method test passing.
  • Screenshot of the transform method test failing.
  • Screenshot of the manual test for transform and fit_transform methods passing.

This PR is a step forward in enhancing our machine learning toolkit's capabilities, ensuring more robust and versatile encoding functionalities and seamless backend integration

Closes #

Checklist

  • Did you add a function?
  • Did you add the tests?
  • Did you run your tests and are your tests passing?
  • Did pre-commit not fail on any check?
  • Did you follow the steps we provided?

image
image
image

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Compliance Checks

Thank you for your Pull Request! We have run several checks on this pull request in order to make sure it's suitable for merging into this project. The results are listed in the following section.

Issue Reference

In order to be considered for merging, the pull request description must refer to a specific issue number. This is described in our contributing guide and our PR template.
This check is looking for a phrase similar to: "Fixes #XYZ" or "Resolves #XYZ" where XYZ is the issue number that this PR is meant to address.

@muzakkirhussain011
Copy link
Contributor Author

muzakkirhussain011 commented Mar 29, 2024

Hi @Ishticode ,

Could you please review the “Enhancements to LabelEncoder” PR? Your feedback would be invaluable. Screenshots of the tests and manual verifications are attached for reference

Thanks

Copy link
Contributor

@Ishticode Ishticode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much @muzakkirhussain011
We should not really be adding docstring to the frontends as the actual native frameworks explain the function or classes exactly. You can see this as a NOTE on our frontends guide at https://unify.ai/docs/ivy/overview/deep_dive/ivy_frontends.html. Let me know if something is unclear. Thank you very much

@muzakkirhussain011
Copy link
Contributor Author

muzakkirhussain011 commented Mar 29, 2024

Thank you very much @muzakkirhussain011 We should not really be adding docstring to the frontends as the actual native frameworks explain the function or classes exactly. You can see this as a NOTE on our frontends guide at https://unify.ai/docs/ivy/overview/deep_dive/ivy_frontends.html. Let me know if something is unclear. Thank you very much

Hi @Ishticode ,

I acknowledge the frontend documentation standards as highlighted in the guide and will ensure compliance moving forward. Thank you for bringing this to my attention. I’ve removed the docstrings from the frontends as per the guide to maintain consistency with the native frameworks’ documentation.

Best regards

@muzakkirhussain011
Copy link
Contributor Author

Hi @Ishticode ,

I'm following up on this PR as it has been 5 days since it was opened. Your review would be very helpful for us to proceed. Looking forward to your feedback.

Best Regards

Copy link
Contributor

@Ishticode Ishticode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi. Can we test for newly implemented functions which remove the NotImplementedError. Thansk :)

@muzakkirhussain011
Copy link
Contributor Author

Hi @Ishticode ,
The tests for transform, fit_transform, and inverse_transform methods are failing because these methods require the LabelEncoder to be fitted with data first. This prerequisite is not currently met within the automated testing setup, leading to failures.

Manual Verification: Manual testing of the transform and fit_transform methods with direct input has been successful, indicating that the methods function correctly when the encoder is properly fitted.

I have attached Screenshot of the fit method test passing.

Screenshot of the transform method test failing.

Screenshot of the manual test for transform and fit_transform methods passing.

In the PR description.

Copy link
Contributor

@Ishticode Ishticode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, in that case we should still add the test but make a fit call before calling the actual test call. :)

@muzakkirhussain011
Copy link
Contributor Author

Hi @Ishticode ,
Even though I called the fit method beforehand I'm still getting the not fitted error.

Test for transform method:

@handle_frontend_method(
    class_tree=CLASS_TREE + ".LabelEncoder",
    init_tree="sklearn.preprocessing.LabelEncoder",
    method_name="transform",
    dtype_x=helpers.dtype_and_values(
        available_dtypes=helpers.get_dtypes("float_and_integer"),
        min_num_dims=1,
        max_num_dims=1,
    ),
)
def test_sklearn_label_encoder_transform(
    dtype_x,
    frontend,
    frontend_method_data,
    init_flags,
    method_flags,
    on_device,
    backend_fw,
):
    input_dtype, x = dtype_x
    # Initialize the LabelEncoder and fit it with the input data
    encoder = LabelEncoder()
    encoder.fit(x[0])

    helpers.test_frontend_method(
        init_input_dtypes=input_dtype,
        init_all_as_kwargs_np={},
        method_input_dtypes=input_dtype,
        method_all_as_kwargs_np={
            "y": x[0],
        },
        frontend_method_data=frontend_method_data,
        init_flags=init_flags,
        method_flags=method_flags,
        frontend=frontend,
        on_device=on_device,
        backend_to_test=backend_fw,
    )

Output of the test:
image

@muzakkirhussain011
Copy link
Contributor Author

muzakkirhussain011 commented Aug 7, 2024

Hi @Ishticode,
hope you're doing well! I wanted to kindly remind you that this PR has been awaiting your review for 4 months now. I have added the tests as mentioned. Your feedback is greatly appreciated, and I'm looking forward to hearing your thoughts when you have a moment to spare.
Thank you
Best regards

@muzakkirhussain011
Copy link
Contributor Author

Hi @Ishticode ,
hope you're doing well! I wanted to kindly remind you that this PR has been awaiting your review for 4 months now. I have added the tests as mentioned above. Your feedback is greatly appreciated, and I'm looking forward to hearing your thoughts when you have a moment to spare.
Thank you
Best regards

@Ishticode Ishticode removed their assignment Aug 8, 2024
@muzakkirhussain011
Copy link
Contributor Author

Hi @Sam-Armstrong  ,
hope you're doing well! I wanted to kindly remind you that this PR has been awaiting your review for 4 months now. I have added the tests as mentioned above. Your feedback is greatly appreciated, and I'm looking forward to hearing your thoughts when you have a moment to spare.
Thank you
Best regards

Copy link
Contributor

@Sam-Armstrong Sam-Armstrong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the slow review on this! The tests you wrote are all failing, please can you fix them? You can follow the precedent of the first test in this file.

pytest ivy_tests/test_ivy/test_frontends/test_sklearn/test_preprocessing/test_label.py

Comment on lines 64 to 65
encoder = LabelEncoder()
encoder.fit(x[0])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't be initializing a LabelEncoder here; see the first test in this file as an example of how to write the test.

@muzakkirhussain011
Copy link
Contributor Author

Hi @Sam-Armstrong ,
The tests for transform, fit_transform, and inverse_transform methods are failing because these methods require the LabelEncoder to be fitted with data first. This prerequisite is not currently met within the automated testing setup, leading to failures.

Manual Verification: Manual testing of the transform and fit_transform methods with direct input has been successful, indicating that the methods function correctly when the encoder is properly fitted.

I have attached Screenshot of the fit method test passing.

Screenshot of the transform method test failing.

Screenshot of the manual test for transform and fit_transform methods passing.

In the PR description.

Copy link
Contributor

@Sam-Armstrong Sam-Armstrong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok sure, that's fine. Considering the sklearn frontend is low-priority right now, I think we don't have to worry about modifying the testing infrastructure to get these working. Please could you leave todo notes on the tests as I've indicated below, then we can get this merged. Thanks!

…ing/test_label.py

Co-authored-by: Sam Armstrong <88863522+Sam-Armstrong@users.noreply.github.com>
@muzakkirhussain011
Copy link
Contributor Author

muzakkirhussain011 commented Aug 8, 2024

Hi @Sam-Armstrong ,
I've added the todo notes on the tests as requested. The modifications to the testing infrastructure can be addressed later, given the low priority of the sklearn frontend. I've completed the tasks and the changes are ready to be merged. Please let me know if there's anything else I can assist with.
Thanks!

@muzakkirhussain011
Copy link
Contributor Author

Hi @Sam-Armstrong ,
I wanted to follow up and confirm that the requested changes, including the addition of the todo notes, have been implemented as per your instructions. Please let me know if there's anything else you need. Thank you.

Copy link
Contributor

@Sam-Armstrong Sam-Armstrong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @muzakkirhussain011, will merge now 😄

@Sam-Armstrong Sam-Armstrong merged commit 022be66 into ivy-llc:main Aug 10, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants