Add variant transfer prepper v2 #1013
base: main
# NOTE: The return_df option doesn't work, the DataFrame.pivot() function fails. Fix is pending.
scores_list = scorer.score(
    results,
    use_dask=inputs.use_dask,
    dask_client=dask_client,
    return_df=False,
    dask_failure_mode=inputs.dask_failure_mode,
)
del results
import csv
with open(data_intermediates / "docking_scores_raw.csv", 'w', newline='') as myfile:
    wr = csv.writer(myfile, quoting=csv.QUOTE_ALL)
    for item in scores_list:
        wr.writerow([item])
@mariacm12 could you add info on the error you got? Is it the target-related stuff?
I think it's because the DataFrame has duplicate entries, although I couldn't see the duplicates myself when I generated the CSV file the alternative way. From the DataFrame.pivot() docs: "Raises ValueError when there are any index, columns combinations with multiple values."
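A minimal sketch of how the duplicate keys behind this error could be located, assuming a long-format table of scores. The column names and values below are made up for illustration and are not the scorer's actual output. Rows can share the same (index, columns) key while differing in value, which makes them easy to miss when scanning a CSV by eye.

```python
import pandas as pd

# Hypothetical long-format scores; column names and values are illustrative only.
scores = pd.DataFrame(
    {
        "ligand": ["lig1", "lig1", "lig1", "lig2"],
        "score_type": ["score_a", "score_a", "score_b", "score_a"],
        "value": [-7.1, -6.9, 1.2, -5.4],
    }
)

keys = ["ligand", "score_type"]

# Every row whose (ligand, score_type) key occurs more than once.
# keep=False marks all copies, not just the later ones.
duplicate_rows = scores[scores.duplicated(subset=keys, keep=False)]

# pivot() refuses to reshape when a key maps to multiple values.
try:
    scores.pivot(index="ligand", columns="score_type", values="value")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table() aggregates the duplicates instead of raising.
# Taking the minimum is an arbitrary choice made for this sketch.
wide = scores.pivot_table(
    index="ligand", columns="score_type", values="value", aggfunc="min"
)
```

Inspecting `duplicate_rows` first is probably the safer route, since aggregating with `pivot_table` hides whatever is producing the repeated keys.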
Here (line 432) I had to use the hash of the reference complex, but the target name of the actual target. This is a quick fix because the hash and unique_name attributes depend on the ligand's InChIKey, which in this case the target doesn't have. Previously the target name of the reference was used, so all of the docking results were being overwritten under the same name (the reference's).
The hash string still needs to be fixed, though. I just don't know how to do it without breaking the other workflows, because the current fix still leads to problems when importing files from cache.
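To illustrate the overwriting problem described above, here is a rough sketch with made-up names. The function `transferred_result_name`, the target names, and the hash value are all hypothetical and do not come from the codebase. It only shows why keying every result by the reference's name collapses the results into one entry, while combining each target's own name with the reference hash keeps them apart.

```python
# Hypothetical names and hash; none of these come from the actual codebase.
reference_name = "ref_target"
reference_hash = "abc123"
targets = ["target_A", "target_B", "target_C"]

# Old behaviour: every result is keyed by the reference's name,
# so each new result replaces the previous one.
old_results = {}
for target in targets:
    old_results[f"{reference_name}-{reference_hash}"] = f"docked pose for {target}"


def transferred_result_name(target_name: str, ref_hash: str) -> str:
    """Build a key from the actual target's name plus the reference hash."""
    return f"{target_name}-{ref_hash}"


# Quick fix: the key varies with the target, so results no longer collide.
new_results = {
    transferred_result_name(target, reference_hash): f"docked pose for {target}"
    for target in targets
}
```

This does not address the cache-import problem mentioned above, since the hash component is still the reference's.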
    results,
    use_dask=inputs.use_dask,
    dask_client=dask_client,
    return_df=False,
If you set return_df=True here, the results will be a lot easier to wrangle, including writing the CSV below with df.to_csv.
@hmacdope see Maria's comments: return_df isn't working. Ultimately we'd want to use that, though.
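For reference, once return_df works, writing the scores out could look roughly like the sketch below. It assumes the scorer returns a pandas DataFrame, and the columns shown are invented for illustration. pandas writes a DataFrame to disk with `DataFrame.to_csv`. A temporary directory stands in for `data_intermediates` so the snippet runs on its own.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical scores table; the real columns depend on the scorer's output.
scores_df = pd.DataFrame(
    {
        "ligand": ["lig1", "lig2"],
        "target": ["target_A", "target_B"],
        "score": [-7.1, -5.4],
    }
)

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the workflow's data_intermediates directory.
    data_intermediates = Path(tmp)
    out_path = data_intermediates / "docking_scores_raw.csv"

    # One row per score, one column per field, with no index column.
    scores_df.to_csv(out_path, index=False)

    # Read the file back to check that the layout survived.
    round_trip = pd.read_csv(out_path)
```

Compared with writing each item through `csv.writer`, this keeps one field per column instead of a single quoted string per row.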
Description
Copied over from #972.
We want a way to transfer molecules from a crystal structure to an AlphaFold (or otherwise generated) structure and then re-dock them with OpenEye, etc.
Based on very old code from Ben.
This is also a way around #927
Todos
Notable points that this PR has either accomplished or will accomplish.
make_du_from_new_lig
Questions
Status
Developers certificate of origin