I'm using the tokenizer to encode sentence pairs with TemplateProcessing through encode_batch.
One part is confusing: the guide says the method takes two lists, one for sentence A and one for sentence B.
According to the guide documentation: "To process a batch of sentences pairs, pass two lists to the Tokenizer.encode_batch method: the list of sentences A and the list of sentences B."
Since it says to pass two lists, the expected input seems to be [[A1, A2], [B1, B2]], which would encode as {A1, B1}, {A2, B2}.
However, the method actually expects a batch of individual pairs, not separate lists for the A sentences and the B sentences.
So the input should be [[A1, B1], [A2, B2]] to encode as {A1, B1}, {A2, B2}.
I've also confirmed that the length of the list passed to encode_batch grows with the number of pairs in the batch.
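To make the two shapes concrete, here is a minimal sketch in plain Python. The helper name `to_pairs` is made up for illustration and is not part of the tokenizers library. It converts the two-list shape that the guide's wording suggests into the per-pair shape that, based on the behavior described above, encode_batch appears to accept.

```python
def to_pairs(sentences_a, sentences_b):
    """Zip a list of sentences A and a list of sentences B into (A, B) pairs.

    Hypothetical helper for illustration only.
    """
    if len(sentences_a) != len(sentences_b):
        raise ValueError("both lists must have the same length")
    return list(zip(sentences_a, sentences_b))


# Shape suggested by the guide's wording: one list per side.
sentences_a = ["A1", "A2"]
sentences_b = ["B1", "B2"]

# Shape observed to work: one entry per pair, so the outer list
# has as many elements as there are pairs in the batch.
pairs = to_pairs(sentences_a, sentences_b)
print(pairs)  # [('A1', 'B1'), ('A2', 'B2')]

# Assumption drawn from the observation above, not from the guide text:
# this per-pair list is what gets passed in, e.g. tokenizer.encode_batch(pairs)
```

With this shape, `len(pairs)` equals the number of sentence pairs, which matches the observation that the input list grows with the batch.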
Because the guide describes the input as a list of sentences A and a list of sentences B, this is where the confusion arises.
If I've misunderstood something, could you clarify this point?