CANINE for Medical Natural Language Inference on MedNLI data

We are interested in Natural Language Inference (NLI) on medical data using CANINE, a pre-trained, tokenization-free encoder that operates directly on character sequences, without explicit tokenization or a fixed vocabulary. It is available in this repo. We want to predict the relation between a premise and a hypothesis as Entailment, Contradiction or Neutral, using MedNLI, a medical dataset annotated by doctors for NLI. We also fine-tune BERT on the same task for comparison.
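To make "operates directly on character sequences" concrete, here is a minimal sketch in plain Python of the idea behind CANINE's input: each character is mapped to its Unicode code point, so no vocabulary lookup is needed. The helper names `to_code_points` and `from_code_points` are hypothetical and not part of this repo; special tokens and padding are left out.

```python
from typing import List


def to_code_points(text: str) -> List[int]:
    """Map every character to its Unicode code point (no vocabulary needed)."""
    return [ord(ch) for ch in text]


def from_code_points(ids: List[int]) -> str:
    """Invert the mapping: code points back to the original string."""
    return "".join(chr(i) for i in ids)


# A misspelled word is still encoded, one id per character,
# instead of being mapped to an out-of-vocabulary token.
clean_ids = to_code_points("tachycardia")
noisy_ids = to_code_points("tachycrdia")
```

Because the mapping is defined for any character, a typo only changes a few input ids rather than the whole token.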

This work is part of a project for the course Algorithms for speech and natural language processing in the MVA master's program. The repo for the project, with more experiments using CANINE on different NLP tasks, can be found here.

Setup

# Clone this repository
git clone https://github.com/loubnabnl/canine-mednli.git
cd canine-mednli/
# Install packages
pip install -r requirements.txt

Data

Access to the data can be requested here. The dataset contains training, validation and test sets of sentence pairs, each with the label of their relation. The data must be placed in the folder data/.
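As a rough illustration of what a sentence pair with its label looks like once loaded, the sketch below parses JSON Lines records into premise/hypothesis/label dictionaries. This is not the loading code of this repo: the function `parse_jsonl` is hypothetical, and the field names `sentence1`, `sentence2` and `gold_label` are an assumption about the file format, so they are exposed as parameters. The sample record is invented for the example and is not taken from MedNLI.

```python
import json
from typing import Dict, Iterable, List


def parse_jsonl(
    lines: Iterable[str],
    premise_key: str = "sentence1",      # assumed field name
    hypothesis_key: str = "sentence2",   # assumed field name
    label_key: str = "gold_label",       # assumed field name
) -> List[Dict[str, str]]:
    """Turn JSON Lines records into premise/hypothesis/label dicts."""
    examples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        examples.append(
            {
                "premise": record[premise_key],
                "hypothesis": record[hypothesis_key],
                "label": record[label_key],
            }
        )
    return examples


# Invented record, only to show the expected shape.
sample_lines = [
    '{"sentence1": "The patient has a fever.", '
    '"sentence2": "The patient is febrile.", "gold_label": "entailment"}',
    "",
]
examples = parse_jsonl(sample_lines)
```

For a real file, the same function can be called on an open file handle, since iterating a file yields its lines.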

NLI

To use our fine-tuned BERT and CANINE models on MedNLI, download the weights from this link and place them in the folder trained-models/. To train a new model on MedNLI, run the following command:

python main.py --model canine --noisy False

Noise robustness

Since CANINE doesn't use a fixed vocabulary, it is interesting to use it on noisy data, where there are many out-of-vocabulary words, misspellings and errors. We provide code to generate noisy versions of MedNLI for a given noise level by adding, deleting, replacing and swapping letters in the words. You can run the following commands:

cd ./utils
python noisy_data.py --noise_level 0.4
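The four letter-level operations described above can be sketched as follows. This is an illustration only, not the contents of noisy_data.py: the function names `perturb_word` and `add_noise` are hypothetical, and treating the noise level as the probability of perturbing each word is an assumption.

```python
import random
import string


def perturb_word(word: str, rng: random.Random) -> str:
    """Apply one random edit: add, delete, replace or swap a letter."""
    op = rng.choice(["add", "delete", "replace", "swap"])
    i = rng.randrange(len(word))
    if op == "add":
        return word[:i] + rng.choice(string.ascii_lowercase) + word[i:]
    if op == "replace":
        return word[:i] + rng.choice(string.ascii_lowercase) + word[i + 1:]
    if op == "delete" and len(word) > 1:
        return word[:i] + word[i + 1:]
    if op == "swap" and len(word) > 1:
        j = rng.randrange(len(word) - 1)  # swap letters j and j + 1
        return word[:j] + word[j + 1] + word[j] + word[j + 2:]
    return word  # one-letter words are left intact for delete/swap


def add_noise(sentence: str, noise_level: float, seed=None) -> str:
    """Perturb each word with probability noise_level (assumed meaning)."""
    rng = random.Random(seed)
    words = sentence.split()
    noisy = [perturb_word(w, rng) if rng.random() < noise_level else w for w in words]
    return " ".join(noisy)


sentence = "patient denies chest pain or shortness of breath"
noisy = add_noise(sentence, noise_level=0.4, seed=0)
```

Passing a seed makes the corruption reproducible, which helps when the same noisy test set must be shared across models.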

To train and evaluate CANINE on noisy data, you can run:

python main.py --model canine --noisy True

Results

Results on clean data:

Model	Test accuracy (%)
BERT	77.6 ± 0.6
CANINE-C	73.07 ± 0.3

Results of noise robustness experiments: the left plot corresponds to training on clean data and testing on noisy data, and the right plot corresponds to training on noisy data as well.

For the NLI task on clean MedNLI, we get an accuracy of 77.6% using BERT and 73.07% using CANINE. However, when we add noise with probability 0.4 to the test data, the accuracy of BERT drops to 59.92%, while the accuracy of CANINE drops only to 65.75%. Training the models on noisy data improves both models, but CANINE is still ahead of BERT, with a 1.4% difference in accuracy. This suggests that CANINE can be more suitable than BERT for noisy text, but on clean data we didn't see an advantage for CANINE in this task.
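The size of each drop follows directly from the accuracies reported above. The short calculation below uses only those four numbers; the variable names are illustrative.

```python
# Test accuracies (%) reported above: clean test set vs. test set
# corrupted with noise probability 0.4, models trained on clean data.
accuracy = {
    "BERT": {"clean": 77.6, "noisy_test": 59.92},
    "CANINE": {"clean": 73.07, "noisy_test": 65.75},
}

# Absolute drop in accuracy points for each model.
drops = {
    model: scores["clean"] - scores["noisy_test"]
    for model, scores in accuracy.items()
}

for model, drop in drops.items():
    print(f"{model}: {drop:.2f} points lost under noise")
```

BERT loses 17.68 points while CANINE loses 7.32, so although BERT starts 4.53 points ahead on clean data, CANINE ends 5.83 points ahead on the noisy test set.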
