
Fix hyperparameter predictorbase #832

Merged
merged 8 commits into from
May 22, 2024

Conversation

c-w-feldmann
Contributor

Description

This PR fixes warnings raised by Lightning when `nn.Module` arguments such as `criterion` and `output_transform` are saved as hyperparameters. (It also removes a duplicate line.)

Example / Current workflow

```python
from chemprop.nn.predictors import BinaryClassificationFFN
from chemprop.nn.loss import BCELoss
from torch import nn

BinaryClassificationFFN(criterion=BCELoss(), output_transform=nn.Identity())
```

Running this emits:

```
/my_python_path/lib/python3.11/site-packages/lightning/pytorch/utilities/parsing.py:199: Attribute 'criterion' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['criterion'])`.
/my_python_path/lib/python3.11/site-packages/lightning/pytorch/utilities/parsing.py:199: Attribute 'output_transform' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['output_transform'])`
```
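The pattern the warning suggests (ignore the module arguments, then record them manually) can be sketched with a minimal stand-in class. `FakeModule` is hypothetical and only mimics the relevant behavior of Lightning's `save_hyperparameters`, which in reality inspects the `__init__` call frame; module values are represented as plain strings here:

```python
# Hypothetical stand-in for a LightningModule, illustrating the
# save_hyperparameters(ignore=[...]) pattern this PR adopts.
class FakeModule:
    def __init__(self, **kwargs):
        self._init_kwargs = kwargs  # what Lightning would capture from __init__
        self.hparams = {}

    def save_hyperparameters(self, ignore=()):
        # Skip the ignored arguments so nothing nn.Module-like is
        # double-saved (this is what silences the Lightning warning).
        self.hparams.update(
            {k: v for k, v in self._init_kwargs.items() if k not in ignore}
        )

m = FakeModule(hidden_dim=300, criterion="BCELoss()", output_transform="Identity()")
m.save_hyperparameters(ignore=["criterion", "output_transform"])
assert "criterion" not in m.hparams  # modules were not auto-saved

# The PR then writes the module arguments back into hparams manually,
# so the checkpoint still records them:
m.hparams["criterion"] = "BCELoss()"
m.hparams["output_transform"] = "Identity()"
```

The net effect is the same set of hparams keys as before, but without Lightning warning about `nn.Module` attributes being saved twice.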

Bugfix / Desired workflow

See PR

@JacksonBurns
Member

Thanks for the PR! @hwpang are these test failures because the saved checkpoints include the hyperparameters which are now ignored in this PR?

@c-w-feldmann
Contributor Author

c-w-feldmann commented Apr 24, 2024

> Thanks for the PR! @hwpang are these test failures because the saved checkpoints include the hyperparameters which are now ignored in this PR?

When I run pytest locally on the original code I get:

```
2 failed, 660 passed, 15 skipped, 560 warnings in 940.00s (0:15:39)
```

My code produces:

```
17 failed, 645 passed, 15 skipped, 421 warnings in 930.48s (0:15:30)
```

I suspect it's not my code, but I also don't understand why these errors have not occurred earlier.

Edit: I forgot to check out the correct environment; the results above are now corrected. Most of the errors originate from my fix. Sorry about that. I will try to come up with a better solution.

Edit 2: I think I found a solution, but it is not particularly elegant. Help would be appreciated.

@kevingreenman kevingreenman added this to the v2.0.1 milestone May 18, 2024
@JacksonBurns JacksonBurns requested a review from hwpang May 21, 2024 20:14
@JacksonBurns
Member

@hwpang could you review? Not sure which of these lines are actually required. Pretty sure we need to rebuild the checkpoint files with these hparams ignored.

@hwpang
Contributor

hwpang commented May 22, 2024

Thanks for the PR! This seems fine to me. Could you run this script? The script will regenerate all the checkpoint files we use for tests. After you run the script, please commit those new checkpoint files and push to this PR. This will run the tests with the newly generated checkpoints to ensure that they are indeed correct and compatible.

@c-w-feldmann
Contributor Author

c-w-feldmann commented May 22, 2024

@hwpang

> Could you run this script?

Done

Contributor

@hwpang hwpang left a comment


LGTM

@hwpang hwpang merged commit 86f4023 into chemprop:main May 22, 2024
13 checks passed
@c-w-feldmann c-w-feldmann deleted the fix_hyperparameter_predictorbase branch May 23, 2024 07:08
hwpang pushed a commit to hwpang/chemprop that referenced this pull request May 23, 2024
@KnathanM
Contributor

KnathanM commented Jun 3, 2024

> @hwpang could you review? Not sure which of these lines are actually required. Pretty sure we need to rebuild the checkpoint files with these hparams ignored.

Based on #898, I don't think the checkpoint files needed to be rebuilt because this PR just manually saves criterion and output_transform to hparams instead of having save_hyperparameters do it automatically. If I am wrong and the checkpoint files did need to be rebuilt, then models created in v2.0.0 might not be compatible with v2.0.1 and we should comment on that in the release notes for the next release.

@hwpang or @JacksonBurns can one of you check my thinking on this?

@JacksonBurns
Member

I'm not sure; deferring this to your judgement.

@hwpang
Contributor

hwpang commented Jun 5, 2024

@KnathanM Thanks for tagging me! I looked into this; the TL;DR is that the checkpoint file has changed slightly, but the two formats are equivalent.

Using the previous method, `self.save_hyperparameters()`, to save `criterion` and `output_transform` automatically, `checkpoint_auto['hyper_parameters']['predictor']` has:

```
"activation":       RELU
"cls":              <class 'chemprop.nn.predictors.RegressionFFN'>
"criterion":        None
"dropout":          0.0
"hidden_dim":       300
"input_dim":        300
"n_layers":         1
"n_tasks":          1
"output_transform": UnscaleTransform()
"task_weights":     None
"threshold":        None
```

And `checkpoint_auto['state_dict']['predictor.criterion.task_weights']` has:

```
tensor([[1.]])
```

Using the current method, which sets `ignore` and saves `criterion` and `output_transform` manually, `checkpoint_man['hyper_parameters']['predictor']` has:

```
"activation":       RELU
"cls":              <class 'chemprop.nn.predictors.RegressionFFN'>
"criterion":        MSELoss(task_weights=[[1.0]])
"dropout":          0.0
"hidden_dim":       300
"input_dim":        300
"n_layers":         1
"n_tasks":          1
"output_transform": UnscaleTransform()
"task_weights":     None
"threshold":        None
```

And `checkpoint_man['state_dict']['predictor.criterion.task_weights']` has:

```
tensor([[1.]])
```

The only difference I have observed is that the criterion is saved explicitly with the manual method; either model file should produce the same predictions at inference.
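The comparison above can be condensed into a small sketch. The dicts below use simplified placeholder values (the real checkpoints hold tensors and module objects, not strings, and contain many more keys):

```python
# Simplified stand-ins for the two checkpoint layouts compared above.
ckpt_auto = {
    "hyper_parameters": {"predictor": {"criterion": None, "n_tasks": 1}},
    "state_dict": {"predictor.criterion.task_weights": [[1.0]]},
}
ckpt_manual = {
    "hyper_parameters": {
        "predictor": {"criterion": "MSELoss(task_weights=[[1.0]])", "n_tasks": 1}
    },
    "state_dict": {"predictor.criterion.task_weights": [[1.0]]},
}

# Find which predictor hparams differ between the two layouts.
auto_pred = ckpt_auto["hyper_parameters"]["predictor"]
man_pred = ckpt_manual["hyper_parameters"]["predictor"]
diff = {k for k in auto_pred if auto_pred[k] != man_pred[k]}
print(diff)  # only 'criterion' differs

# The state dicts match, so the inference weights are identical either way.
assert ckpt_auto["state_dict"] == ckpt_manual["state_dict"]
```

Since the learned weights live in `state_dict` and are unchanged, the two checkpoint formats load to equivalent models.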
