Exporting Llama3's tokenizer #3555

vifi2021 · 2024-05-08T23:18:58Z

Hello,

I am following https://github.com/pytorch/executorch/blob/main/examples/models/llama2/README.md#option-c-download-and-export-llama3-8b-model to get Llama3-8B-instruct running on an S21 Ultra, but it seems that examples.models.llama2.tokenizer.tokenizer cannot process Llama3's tokenizer.model.

Has anyone run into this issue?


iseeyuan · 2024-05-09T00:25:23Z

@larryliu0820, could you help look at this?

larryliu0820 · 2024-05-09T00:28:33Z

Hello, you don't need to process tokenizer.model. You can feed it directly into this step: https://github.com/pytorch/executorch/blob/main/examples/models/llama2/README.md#step-4-run-on-your-computer-to-validate

vifi2021 · 2024-05-09T01:02:12Z

Thanks for the fast response.
But I am not running it on my computer. I am running it on an Android phone.
In https://github.com/pytorch/executorch/blob/main/examples/models/llama2/README.md#step-5-run-benchmark-on-android-phone, step 2.2 requires uploading the model and tokenizer to the phone:

adb push <model.pte> /data/local/tmp/llama/
adb push <tokenizer.bin> /data/local/tmp/llama/

Looks like we still need tokenizer.bin?

jonatananselmo · 2024-05-09T02:07:55Z

Just do:

adb push <tokenizer.model> /data/local/tmp/llama/

And use <tokenizer.model> wherever you need to specify the tokenizer.
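
For anyone scripting this, the two pushes can be wrapped as below. This is only a rough sketch: the file name llama3.pte is a hypothetical placeholder for your exported model, the helper names run and push_llama3_files are made up for illustration, and DRY_RUN is a convenience added here so the commands can be printed without a device attached. Only the adb push commands and the /data/local/tmp/llama/ directory come from the README steps discussed in this thread.

```shell
# Placeholder paths (assumptions): point these at your exported model
# and at the original Llama 3 tokenizer.model (no .bin conversion).
MODEL="${MODEL:-llama3.pte}"
TOKENIZER="${TOKENIZER:-tokenizer.model}"
DEVICE_DIR="/data/local/tmp/llama/"

# Run a command, or just print it when DRY_RUN=1 (hypothetical helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Push the model and the unconverted tokenizer to the device.
push_llama3_files() {
  run adb push "$MODEL" "$DEVICE_DIR"
  run adb push "$TOKENIZER" "$DEVICE_DIR"
}
```

Calling DRY_RUN=1 push_llama3_files prints the two adb commands for review, and calling push_llama3_files on its own executes them against the connected device.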

JacobSzwejbka · 2024-05-09T20:16:33Z

@kirklandsign Since there's an Android question here, can you take a look?

mergennachin · 2024-05-10T15:03:56Z

@vifi2021

Step 4 of the README says: "For Llama3, you can pass the original tokenizer.model (without converting to .bin file)."

It also applies to subsequent steps.

@kirklandsign

In our readme file, for Step 5, 2.2 and 2.3, we make that clear.

JacobSzwejbka assigned larryliu0820 May 9, 2024

JacobSzwejbka added the bug label May 9, 2024

mergennachin added the triaged and module: doc labels May 10, 2024

mergennachin assigned kirklandsign May 10, 2024
