Add identifier issue_manager #947

Open
wants to merge 3 commits into base: master

01PrathamS (Contributor) commented Jan 7, 2024

Summary

🎯 Purpose: Add a new issue manager, identifier_column, that checks whether the features contain a column that consists entirely of sequential integers.

✏️ This issue manager checks whether the data has an identifier column. If it does, it returns 0; otherwise it returns None.

import pandas as pd
from cleanlab import Datalab

# Column 'A' holds 1..5, a run of sequential integers, so it looks like an identifier.
data = {
    'A': [1, 2, 3, 4, 5],
    'B': [6, 5, 8, 9, 10],
    'C': [2, 12, 13, 14, 15],
    'D': [4, 17, 18, 19, 20],
    'E': [21, 22, 23, 24, 4]
}
df = pd.DataFrame(data)
lab = Datalab(data=df, label_name="D", task="classification")
# Run only the new identifier_column check on the raw feature matrix.
lab.find_issues(features=df.values, issue_types={"identifier_column": {}})

Output:

Finding identifier_column issues ...

Audit complete. 1 issues found in the dataset.

lab.issue_summary
          issue_type  score  num_issues
0  identifier_column    0.0           1
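
For readers who want to see the idea outside of Datalab, here is a minimal sketch of the kind of check described in the summary. It is not the code from this PR: the function name `find_sequential_integer_columns` is hypothetical, and it assumes "sequential integers" means that the sorted values of a column increase by exactly 1 from one row to the next.

```python
from typing import List

import numpy as np


def find_sequential_integer_columns(features: np.ndarray) -> List[int]:
    """Return indices of columns whose sorted values are consecutive integers.

    Hypothetical helper for illustration only; the definition of
    "sequential" used here (sorted values differ by exactly 1) is an assumption.
    """
    features = np.asarray(features)
    # A single row (or a non-2D array) cannot meaningfully be "sequential".
    if features.ndim != 2 or features.shape[0] < 2:
        return []

    flagged = []
    for idx in range(features.shape[1]):
        try:
            values = features[:, idx].astype(float)
        except (TypeError, ValueError):
            continue  # non-numeric column, cannot be an integer sequence
        if not np.all(np.isfinite(values)):
            continue  # NaN or infinite entries rule the column out
        if not np.all(values == np.round(values)):
            continue  # contains non-integer values
        if np.all(np.diff(np.sort(values)) == 1):
            flagged.append(idx)
    return flagged


# Same data as the example above, one row per sample.
features = np.array([
    [1, 6, 2, 4, 21],
    [2, 5, 12, 17, 22],
    [3, 8, 13, 18, 23],
    [4, 9, 14, 19, 24],
    [5, 10, 15, 20, 4],
])
print(find_sequential_integer_columns(features))  # [0]
```

Only column 0 (column 'A') is flagged, which matches the single issue reported in the summary above. Sorting before differencing means a shuffled identifier column is still detected; a stricter variant could require the values to be sequential in row order.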

Reviewer Notes

💡 Test cases have been added.

solves: #923

codecov bot commented Jan 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (594307e) 96.86% compared to head (49213e7) 94.89%.
Report is 11 commits behind head on master.

❗ Current head 49213e7 differs from pull request most recent head ce2fb90. Consider uploading reports for the commit ce2fb90 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #947      +/-   ##
==========================================
- Coverage   96.86%   94.89%   -1.98%     
==========================================
  Files          71       71              
  Lines        5708     5716       +8     
  Branches      972      972              
==========================================
- Hits         5529     5424     -105     
- Misses         93      215     +122     
+ Partials       86       77       -9     

