
Question: what datasets were pre-trained models pre-trained on? #199

Open
rhjohnstone opened this issue Oct 20, 2022 · 1 comment

@rhjohnstone

Some of the pre-trained models are just described as "pre-trained", while others are described as "pre-trained then fine-tuned on x". What data was the original pre-training performed on, and for how long?

E.g., from the docs:

'gin_supervised_contextpred': A GIN model pre-trained with supervised learning and context prediction
'gin_supervised_masking_BACE': A GIN model pre-trained with supervised learning and masking, and fine-tuned on BACE

@mufeili (Contributor) commented Oct 22, 2022

You may find the details of pre-training in https://arxiv.org/abs/1905.12265. "supervised" means supervised pre-training was performed on a ChEMBL dataset; "contextpred" means self-supervised pre-training with context prediction was performed on a ZINC15 dataset.
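
For reference, a minimal sketch of loading these models in Python, assuming the dgllife (DGL-LifeSci) package and its load_pretrained entry point, which is where the model names quoted above come from:

```python
from dgllife.model import load_pretrained

# Pre-trained only: supervised pre-training (ChEMBL) combined with
# self-supervised context prediction (ZINC15); no downstream fine-tuning.
model = load_pretrained('gin_supervised_contextpred')

# Same supervised pre-training, with masking instead of context prediction,
# then fine-tuned on the BACE dataset.
model_bace = load_pretrained('gin_supervised_masking_BACE')

print(model)
```

Per the docs quoted above, the variants without a dataset suffix are the pre-trained checkpoints, while the suffixed ones (e.g. _BACE) have additionally been fine-tuned on the named downstream dataset.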
