
question about details and hyperparameter settings #23

Open
unclestrong opened this issue Mar 3, 2022 · 2 comments

Comments

@unclestrong

Hi shion_honda,

I'm a student studying bioinformatics, and I find your research on SMILES Transformer very powerful!
Your paper suggests that I could use your pretrained model, or create my own pretrained model for my field.

I can already use the pretrained model you provide on Google Drive to reach impressive accuracy, and now I want to create my own pretrained model with your project on another large-scale dataset.

However, I am stuck reproducing your pretrained model. I can run your whole project without errors, but I can't reach the accuracy level of the model you provide.

So I am writing to ask about some details and hyperparameter settings. If it is convenient for you, would you please answer a few questions?

1. Did you use half of the Chembl24 data? You mention in your paper that you sampled 861,000 molecules.
2. How many epochs did you train for? You wrote 5 in your GitHub project, but the model you provide is named trfm_12_23000, which suggests at least 12.
3. What batch size did you use? You wrote 8 in your GitHub project, but that seems much too small for 861,000 molecules.
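
(For reference, with a batch size of 8, one epoch over 861,000 molecules would take 861,000 / 8 = 107,625 optimizer steps.)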

In short, I can't train a model as powerful as yours, even when I use the same data. There must be some detail I have overlooked.

I really like your paper, your project, and your open-source spirit, but I am quite confused. If it is convenient for you, could you please give me a clue or an answer? It would be really helpful.

Thank you sincerely!

@shionhonda
Contributor

Thanks for your question.
If you ran all the provided scripts and still didn't reach sufficient accuracy, then some details must be accidentally wrong or missing in the paper.
For Q1, I did use Chembl24, as written in the README:

Canonical SMILES of 1.7 million molecules that have no more than 100 characters from Chembl24 dataset were used.
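
In case it helps, here is a minimal sketch of that filtering step. The dump file name and column names below are assumptions about a ChEMBL24 chemical representations export, not the exact preprocessing script used for the paper:

```python
# Minimal sketch: keep canonical SMILES of <= 100 characters from a ChEMBL24 dump.
# The file name and column layout are assumptions; adjust them to your local export.
import pandas as pd

df = pd.read_csv('chembl_24_chemreps.txt', sep='\t')   # assumed tab-separated dump
smiles = df['canonical_smiles'].dropna()
smiles = smiles[smiles.str.len() <= 100]               # length filter from the README
smiles.to_csv('chembl24_corpus.csv', index=False)
print(len(smiles), 'molecules kept')                   # roughly 1.7 million expected
```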

As for Q2 and Q3, I regret to say that I couldn't find the exact details.

I'm sorry for the trouble, but if you need to retrain the model, you will have to search for the hyperparameters yourself. Use as large a batch size as your memory allows, and train for more epochs with early stopping.
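
As a rough illustration of that advice, here is a generic early-stopping skeleton in PyTorch style; the train_epoch and evaluate callables and the patience value are placeholders, not code from this repository:

```python
# Generic early-stopping loop (illustration only, not from this repo):
# stop when validation loss hasn't improved for `patience` consecutive
# epochs, and keep the best checkpoint seen so far.
import copy

def fit(model, train_epoch, evaluate, max_epochs=100, patience=5):
    best_loss = float('inf')
    best_state = None
    bad_epochs = 0
    for epoch in range(max_epochs):
        train_epoch(model)                 # one pass over the training set
        val_loss = evaluate(model)         # loss on a held-out validation set
        if val_loss < best_loss:
            best_loss = val_loss
            best_state = copy.deepcopy(model.state_dict())
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:     # patience exhausted: stop early
                break
    model.load_state_dict(best_state)      # restore the best weights
    return model, best_loss
```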

@unclestrong
Author


Wow, thank you for your reply! I will try a few more times following your advice.
