-
Hi Cleanlab Team, how can cleanlab be used for NER problems? Is there any documentation or a tutorial to follow?
Replies: 2 comments 3 replies
-
Apologies for the delayed response, just seeing this now. FYI, you'll get faster answers if you ask questions in our Slack channel.
-
Here's an outline of how the existing Cleanlab v2.0.0 repo can be used for token classification tasks.
Basic idea: Treat the labels and the model's predictions at each token as if they were labels and predictions for independent training examples (ignoring which document each token/label comes from). Then just run regular cleanlab as if this were a multiclass classification task (with each document broken up into many separate examples, one for each token). So when running cleanlab's `find_label_issues(labels, pred_probs)` and `get_label_quality_scores(labels, pred_probs)`, the labels should be for each token in your entire corpus, and the pred_probs should be the corresponding class-probabilities estimat…
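To make the flattening step concrete, here is a minimal sketch in Python. The helper names `flatten_token_data` and `find_token_label_issues`, the toy tag set, and the toy probabilities are all hypothetical and only for illustration; the two cleanlab calls are the ones named above, imported from `cleanlab.filter` and `cleanlab.rank`. It assumes each document comes with a list of integer token labels and an array of per-token class probabilities (ideally out-of-sample, which is what cleanlab generally expects).

```python
import numpy as np


def flatten_token_data(doc_labels, doc_pred_probs):
    """Concatenate per-document token data into corpus-level arrays.

    doc_labels: list of 1D integer sequences, one per document.
    doc_pred_probs: list of (n_tokens, n_classes) arrays, one per document.
    Returns flat labels, flat pred_probs, and a map from each flat token
    position back to (document index, token index within that document).
    """
    labels = np.concatenate([np.asarray(l, dtype=int) for l in doc_labels])
    pred_probs = np.concatenate(
        [np.asarray(p, dtype=float) for p in doc_pred_probs], axis=0
    )
    token_index = [
        (d, t) for d, l in enumerate(doc_labels) for t in range(len(l))
    ]
    return labels, pred_probs, token_index


def find_token_label_issues(doc_labels, doc_pred_probs):
    """Run cleanlab on the flattened tokens (hypothetical wrapper)."""
    # Imported here so the flattening helper works without cleanlab installed.
    from cleanlab.filter import find_label_issues
    from cleanlab.rank import get_label_quality_scores

    labels, pred_probs, token_index = flatten_token_data(doc_labels, doc_pred_probs)
    # Indices of likely label errors, most severe first.
    issue_idx = find_label_issues(
        labels, pred_probs, return_indices_ranked_by="self_confidence"
    )
    # One quality score per token; lower means more likely mislabeled.
    scores = get_label_quality_scores(labels, pred_probs)
    return [(token_index[i], float(scores[i])) for i in issue_idx]


# Toy corpus: 2 documents, 3 assumed classes (0 = O, 1 = PER, 2 = LOC).
doc_labels = [[0, 1, 0], [2, 0]]
doc_pred_probs = [
    np.array([[0.9, 0.05, 0.05], [0.1, 0.8, 0.1], [0.2, 0.7, 0.1]]),
    np.array([[0.1, 0.1, 0.8], [0.85, 0.1, 0.05]]),
]

labels, pred_probs, token_index = flatten_token_data(doc_labels, doc_pred_probs)
```

Calling `find_token_label_issues(doc_labels, doc_pred_probs)` on a real corpus would return `((document index, token index), score)` pairs, so each flagged token can be traced back to the document it came from. The toy corpus here is far too small for meaningful results and only demonstrates the expected shapes.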