Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Implement LabelEncoder methods and add fit method test #28702

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

muzakkirhussain011
Copy link
Contributor

@muzakkirhussain011 muzakkirhussain011 commented Mar 29, 2024

PR: Enhancements to LabelEncoder

Overview

This PR introduces critical enhancements to the LabelEncoder class, enabling the conversion of string labels into numerical identifiers. Due to the limitation of ivy.array not accepting string elements, we've utilized the list data type for encoding operations. This ensures compatibility across all backends and maintains the core functionality of the LabelEncoder.

Details

  • String Handling: The LabelEncoder now uses lists to process and encode string objects into numerical identifiers. This change is due to ivy.array's inability to handle string types, which are essential for the LabelEncoder's operation.
  • Backend Compatibility: By employing lists, we ensure that the LabelEncoder remains functional across various computational backends, as lists are universally supported.
  • Testing Rigor: Tests for the fit method are consistently passing, confirming the method's reliability in identifying and ordering unique labels.
  • Method Prerequisites: The tests for transform, fit_transform, and inverse_transform methods are failing because these methods require the LabelEncoder to be fitted with data first. This prerequisite is not currently met within the automated testing setup, leading to failures.
  • Manual Verification: Manual testing of the transform and fit_transform methods with direct input has been successful, indicating that the methods function correctly when the encoder is properly fitted.

Attachments

  • Screenshot of the fit method test passing.
  • Screenshot of the transform method test failing.
  • Screenshot of the manual test for transform and fit_transform methods passing.

This PR is a step forward in enhancing our machine learning toolkit's capabilities, ensuring more robust and versatile encoding functionalities and seamless backend integration

Closes #

Checklist

  • Did you add a function?
  • Did you add the tests?
  • Did you run your tests and are your tests passing?
  • Did pre-commit not fail on any check?
  • Did you follow the steps we provided?

image
image
image

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Compliance Checks

Thank you for your Pull Request! We have run several checks on this pull request in order to make sure it's suitable for merging into this project. The results are listed in the following section.

Issue Reference

In order to be considered for merging, the pull request description must refer to a specific issue number. This is described in our contributing guide and our PR template.
This check is looking for a phrase similar to: "Fixes #XYZ" or "Resolves #XYZ" where XYZ is the issue number that this PR is meant to address.

@muzakkirhussain011
Copy link
Contributor Author

muzakkirhussain011 commented Mar 29, 2024

Hi @Ishticode ,

Could you please review the “Enhancements to LabelEncoder” PR? Your feedback would be invaluable. Screenshots of the tests and manual verifications are attached for reference

Thanks

Copy link
Contributor

@Ishticode Ishticode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much @muzakkirhussain011
We should not really be adding docstring to the frontends as the actual native frameworks explain the function or classes exactly. You can see this as a NOTE on our frontends guide at https://unify.ai/docs/ivy/overview/deep_dive/ivy_frontends.html. Let me know if something is unclear. Thank you very much

@muzakkirhussain011
Copy link
Contributor Author

muzakkirhussain011 commented Mar 29, 2024

Thank you very much @muzakkirhussain011 We should not really be adding docstring to the frontends as the actual native frameworks explain the function or classes exactly. You can see this as a NOTE on our frontends guide at https://unify.ai/docs/ivy/overview/deep_dive/ivy_frontends.html. Let me know if something is unclear. Thank you very much

Hi @Ishticode ,

I acknowledge the frontend documentation standards as highlighted in the guide and will ensure compliance moving forward. Thank you for bringing this to my attention. I’ve removed the docstrings from the frontends as per the guide to maintain consistency with the native frameworks’ documentation.

Best regards

@muzakkirhussain011
Copy link
Contributor Author

Hi @Ishticode ,

I'm following up on this PR as it has been 5 days since it was opened. Your review would be very helpful for us to proceed. Looking forward to your feedback.

Best Regards

Copy link
Contributor

@Ishticode Ishticode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi. Can we test for newly implemented functions which remove the NotImplementedError. Thansk :)

@muzakkirhussain011
Copy link
Contributor Author

Hi @Ishticode ,
The tests for transform, fit_transform, and inverse_transform methods are failing because these methods require the LabelEncoder to be fitted with data first. This prerequisite is not currently met within the automated testing setup, leading to failures.

Manual Verification: Manual testing of the transform and fit_transform methods with direct input has been successful, indicating that the methods function correctly when the encoder is properly fitted.

I have attached Screenshot of the fit method test passing.

Screenshot of the transform method test failing.

Screenshot of the manual test for transform and fit_transform methods passing.

In the PR description.

Copy link
Contributor

@Ishticode Ishticode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, in that case we should still add the test but make a fit call before calling the actual test call. :)

@muzakkirhussain011
Copy link
Contributor Author

Hi @Ishticode ,
Even though I called the fit method beforehand I'm still getting the not fitted error.

Test for transform method:

@handle_frontend_method(
    class_tree=CLASS_TREE + ".LabelEncoder",
    init_tree="sklearn.preprocessing.LabelEncoder",
    method_name="transform",
    dtype_x=helpers.dtype_and_values(
        available_dtypes=helpers.get_dtypes("float_and_integer"),
        min_num_dims=1,
        max_num_dims=1,
    ),
)
def test_sklearn_label_encoder_transform(
    dtype_x,
    frontend,
    frontend_method_data,
    init_flags,
    method_flags,
    on_device,
    backend_fw,
):
    input_dtype, x = dtype_x
    # Initialize the LabelEncoder and fit it with the input data
    encoder = LabelEncoder()
    encoder.fit(x[0])

    helpers.test_frontend_method(
        init_input_dtypes=input_dtype,
        init_all_as_kwargs_np={},
        method_input_dtypes=input_dtype,
        method_all_as_kwargs_np={
            "y": x[0],
        },
        frontend_method_data=frontend_method_data,
        init_flags=init_flags,
        method_flags=method_flags,
        frontend=frontend,
        on_device=on_device,
        backend_to_test=backend_fw,
    )

Output of the test:
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants