
question about details and hyperparameter settings #23

Open
unclestrong opened this issue Mar 3, 2022 · 2 comments

Comments

@unclestrong

Hi shion_honda,

I'm a student studying bioinformatics, and I find your research on SMILES Transformer very powerful!
Your paper suggests that I could use your pretrained model, or create my own pretrained model for my field.

I can already use the pretrained model you provide on Google Drive to reach impressive accuracy, and now I want to create my own pretrained model with your project on another large-scale dataset.

However, I am stuck reproducing your pretrained model. I can run your whole project without errors, but I can't reach the accuracy level of the model you provide.

So I am writing to ask about some details and hyperparameter settings. If it is convenient for you, would you please answer a few questions?

1. Did you use half of the Chembl24 data? You mention in your paper that you sampled 861,000 molecules.
2. How many epochs did you train for? You wrote 5 in your GitHub project, but the model you provide is named trfm_12_23000, which suggests at least 12.
3. What batch size did you use? You wrote 8 in your GitHub project, but that seems much too small for 861,000 molecules.
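
(For reference, with a batch size of 8, one epoch over 861,000 molecules would take 861,000 / 8 = 107,625 optimizer steps.)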

In short, I can't train a model as powerful as yours, even when I use the same data. There must be some detail I have overlooked.

I really like your paper, your project, and your open-source spirit, but I am quite confused. If it is convenient for you, could you please give me a clue or an answer? It would be really helpful.

Thank you sincerely!

@shionhonda
Contributor

Thanks for your question.
If you ran all the provided scripts and still didn't reach sufficient accuracy, then some details must be accidentally wrong or missing in the paper.
For Q1, I did use Chembl24, as written in the README:

Canonical SMILES of 1.7 million molecules that have no more than 100 characters from Chembl24 dataset were used.
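
In case it helps, here is a minimal sketch of that filtering step. The dump file name and column names below are assumptions about a ChEMBL24 chemical representations export, not the exact preprocessing script used for the paper:

```python
# Minimal sketch: keep canonical SMILES of <= 100 characters from a ChEMBL24 dump.
# The file name and column layout are assumptions; adjust them to your local export.
import pandas as pd

df = pd.read_csv('chembl_24_chemreps.txt', sep='\t')   # assumed tab-separated dump
smiles = df['canonical_smiles'].dropna()
smiles = smiles[smiles.str.len() <= 100]               # length filter from the README
smiles.to_csv('chembl24_corpus.csv', index=False)
print(len(smiles), 'molecules kept')                   # roughly 1.7 million expected
```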

As for Q2 and Q3, I regret to say that I couldn't find the exact details.

I'm sorry for the trouble, but if you need to retrain the model, you will have to search for the hyperparameters yourself. Use as large a batch size as your memory allows, and train for more epochs with early stopping.
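
As a rough illustration of that advice, here is a generic early-stopping skeleton in PyTorch style; the train_epoch and evaluate callables and the patience value are placeholders, not code from this repository:

```python
# Generic early-stopping loop (illustration only, not from this repo):
# stop when validation loss hasn't improved for `patience` consecutive
# epochs, and keep the best checkpoint seen so far.
import copy

def fit(model, train_epoch, evaluate, max_epochs=100, patience=5):
    best_loss = float('inf')
    best_state = None
    bad_epochs = 0
    for epoch in range(max_epochs):
        train_epoch(model)                 # one pass over the training set
        val_loss = evaluate(model)         # loss on a held-out validation set
        if val_loss < best_loss:
            best_loss = val_loss
            best_state = copy.deepcopy(model.state_dict())
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:     # patience exhausted: stop early
                break
    model.load_state_dict(best_state)      # restore the best weights
    return model, best_loss
```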

@unclestrong
Author


Wow, thank you for your reply! I will try a few more times following your advice.
