Dear author,
I am pleased to see that you have provided pre-trained tokenizer save files in this repository. However, the code used to train the DNATokenizer does not appear to be included, and I am very interested in how you trained it.
Specifically, I would like to know the following:
What specific method or algorithm did you use to train the Tokenizer? Did you use any existing libraries or frameworks?
What hardware did you train on? For example, CPU or GPU, and which model and how many devices?
Which version of Python did you use for training?
Approximately how much time did it take to train this Tokenizer on the complete dataset?
Providing this information would be very helpful for me to understand your workflow and how to retrain the Tokenizer in my environment. Thank you for your time and assistance!
I would like to understand how the DNATokenizer works and how to run tokenization_dna.py as a standalone script, since I am only interested in how tokenization is applied to a sequence.