This repository has been archived by the owner on Nov 21, 2022. It is now read-only.

Ignore padding tokens when using Translation Task + padding='max_length' #266

Open
SeanNaren opened this issue Jun 23, 2022 · 6 comments
Labels: bug / fix (Something isn't working), help wanted (Extra attention is needed)

Comments

@SeanNaren
Contributor

🐛 Bug

When using the Translation Task, we need to ensure that padding tokens are skipped in the loss calculation. Currently we do not replace the padding tokens with -100, which could be detrimental.
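For illustration, a minimal sketch of what "replace the padding with -100" means here (the pad token id of 0 is an assumption about the tokenizer; PyTorch's `CrossEntropyLoss` ignores targets equal to -100 by default):

```python
import torch

pad_token_id = 0  # assumption: the tokenizer pads with token id 0
labels = torch.tensor([[17, 42, 5, pad_token_id, pad_token_id],
                       [ 8,  3, pad_token_id, pad_token_id, pad_token_id]])

# CrossEntropyLoss defaults to ignore_index=-100, so once the padded
# positions are overwritten they no longer contribute to the loss.
labels[labels == pad_token_id] = -100
```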

@SeanNaren added the bug / fix (Something isn't working) and help wanted (Extra attention is needed) labels on Jun 23, 2022
@spranjal25

Hi @SeanNaren, I'm looking to contribute to ML/software projects and have been using pytorch-lightning myself (read: I'm a fan!). Can you tell me where to get started on this issue? I'd like to scope whether I can devote some of my time to fixing it.

@Borda
Member

Borda commented Sep 14, 2022

@SeanNaren would you have some pointers on how/where to start? 🐰

@uakarsh

uakarsh commented Oct 7, 2022

Hi @SeanNaren, @Borda, I think this is what is being asked to be modified.

I referred to this example.
There, TranslationTransformer is used for training, and it inherits from Seq2SeqTransformer. If we look at this line, we see that the output is loss, logits; however, the loss is calculated with the padding tokens taken into account.

I found an answer on how to solve it; it is described by the Hugging Face community here.

So, I guess the change to be made (in simple language), at that same line, i.e. here, is the following (a rough sketch follows at the end of this comment):

  1. Obtain the loss and logits from the common step.
  2. Initialize the cross-entropy loss with ignore_index=-100.
  3. Set the target token indices that equal the padding token ID (0) to -100.
  4. Calculate the final loss and then perform the remaining steps as usual.

Hope this helps in solving the issue.
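
A rough sketch of the four steps above, assuming a Hugging Face style model output and batch keys (`input_ids`, `attention_mask`, `labels`); the method name, pad token id, and exact batch layout are illustrative assumptions, not the actual Flash API:

```python
import torch
from torch.nn import CrossEntropyLoss

PAD_TOKEN_ID = 0  # assumption: padding token id used by the tokenizer


def common_step(self, batch):
    # 1. Run the model and grab the logits (the existing step already returns loss, logits).
    outputs = self.model(
        input_ids=batch["input_ids"],
        attention_mask=batch["attention_mask"],
        labels=batch["labels"],
    )
    logits = outputs.logits

    # 2. Cross-entropy that skips every target position labelled -100.
    loss_fct = CrossEntropyLoss(ignore_index=-100)

    # 3. Replace padding token ids with -100 in a copy of the targets.
    labels = batch["labels"].clone()
    labels[labels == PAD_TOKEN_ID] = -100

    # 4. Recompute the loss on the flattened logits/labels and carry on as usual.
    loss = loss_fct(logits.view(-1, logits.size(-1)), labels.view(-1))
    return loss, logits
```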

@Borda
Member

Borda commented Nov 7, 2022

@spranjal25 are you fine with @uakarsh's suggestion?

@spranjal25

> @spranjal25 are you fine with @uakarsh's suggestion?

Yes, that's very helpful. I think I can get started on it. I'll pick this up as soon as I get some free time. Are we looking at a timeline here, though, @Borda?

@Borda
Member

Borda commented Nov 7, 2022

no special rush :)
