
Add variant transfer prepper v2 #1013

Open · wants to merge 11 commits into main
Conversation

@apayne97 (Contributor) commented Apr 25, 2024

Description

Copied over from #972.
We want a way to transfer molecules from a crystal structure to an AlphaFold (or otherwise generated) structure and then re-dock them with OpenEye, etc.

Based on very old code from Ben.

This is also a way around #927.
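
As a rough usage sketch of the idea: only the class name LigandTransferProteinPrepper comes from this PR; the import paths, constructor arguments, and method names below are hypothetical placeholders, not the actual API (see the diff for the real signatures).

# Hypothetical sketch of the ligand-transfer prep; import paths and argument
# names are illustrative only.
from asapdiscovery.data.schema import Complex                                 # hypothetical path
from asapdiscovery.modeling.protein_prep import LigandTransferProteinPrepper  # hypothetical path

# Reference crystal structure that already contains the ligand of interest.
ref = Complex.from_pdb_file("crystal_with_ligand.pdb")  # hypothetical constructor

# Transfer the ligand onto the apo AlphaFold (or otherwise generated) model
# and build design units ready for re-docking.
prepper = LigandTransferProteinPrepper(reference_complexes=[ref])  # hypothetical kwarg
design_units = prepper.prep(["alphafold_model.pdb"])               # hypothetical signature

# The resulting design units would then feed into
# ligand_transfer_docking_workflow (added in this PR).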

Todos

Notable points that this PR has either accomplished or will accomplish.

  • Re-created make_du_from_new_lig
  • Created LigandTransferProteinPrepper
  • Created ligand_transfer_docking_workflow
  • Moved some input options to WorkflowInputsBase so that I can re-use them without needing to use everything else
  • Created CLI
  • Incorporate Maria's changes
  • Speed up prep with parallelization: use just Spruce (i.e., don't make a DU) to prep the structures, then pass each prepped structure once with all the ligands (see the sketch after this list)
  • Make tests
  • Make how-to
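
For the parallelization item, a minimal sketch of the idea, assuming a dask.distributed cluster (the workflow already threads use_dask/dask_client through scorer.score); spruce_prep and dock_all_ligands are hypothetical stand-ins for the real prep and docking calls:

from dask.distributed import Client

client = Client()  # spins up a local cluster by default

structure_paths = ["model_1.pdb", "model_2.pdb"]  # example inputs
ligands = [...]                                   # ligands to transfer/dock

# Prep each structure once with Spruce only (no DU), in parallel.
futures = [client.submit(spruce_prep, pdb) for pdb in structure_paths]  # spruce_prep: hypothetical
prepped = client.gather(futures)

# Then dock all ligands against each prepped structure in a single pass,
# instead of rebuilding a DU per ligand.
for structure in prepped:
    dock_all_ligands(structure, ligands)  # hypothetical helper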

Questions

  • Question 1 ..

Status

  • Ready to go

Developer's Certificate of Origin

Comment on lines 301 to 314
# NOTE: The return_df option doesn't work; the DataFrame.pivot() call fails. Fix is pending.
scores_list = scorer.score(
    results,
    use_dask=inputs.use_dask,
    dask_client=dask_client,
    return_df=False,
    dask_failure_mode=inputs.dask_failure_mode,
)
del results
import csv

with open(data_intermediates / "docking_scores_raw.csv", "w", newline="") as myfile:
    wr = csv.writer(myfile, quoting=csv.QUOTE_ALL)
    for item in scores_list:
        wr.writerow([item])
@apayne97 (Contributor, Author):
@mariacm12 could you add info on the error you got? Is it the target-related stuff?

Contributor:
It's because the dataframe has duplicate entries, I think, although I couldn't see the duplicates myself when I generated the CSV file the alternative way. (From the DataFrame.pivot() docs: raises ValueError when there are any index/columns combinations with multiple values.)
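
To illustrate the failure mode, a minimal standalone pandas example (not code from this PR): two rows sharing the same (index, columns) pair make pivot raise, while pivot_table tolerates duplicates by aggregating.

import pandas as pd

# Two rows share the same (ligand, target) pair, as duplicate docking results would.
df = pd.DataFrame(
    {
        "ligand": ["lig1", "lig1", "lig2"],
        "target": ["ref", "ref", "ref"],
        "score": [-7.2, -7.3, -6.8],
    }
)

try:
    df.pivot(index="ligand", columns="target", values="score")
except ValueError as err:
    print(err)  # "Index contains duplicate entries, cannot reshape"

# pivot_table aggregates duplicates instead of raising (mean by default):
print(df.pivot_table(index="ligand", columns="target", values="score"))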

@mariacm12 (Contributor) commented Apr 25, 2024:

Here (line 432) I had to use the hash of the reference complex but the target name of the actual target. This is a quick fix, because the hash and unique_name attributes depend on the ligand's InChIKey, which the target doesn't have in this case. Previously the target name of the reference was used, so all of the docking results were being re-written with the same name (the reference's).
The hash string still needs to be fixed, though; I just don't know how to do that without breaking the other workflows, because the current fix still leads to problems when importing files from the cache.
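
A toy illustration of the overwrite problem described above (standalone, not code from this PR): if every result is keyed by the reference's target name, each write clobbers the previous one.

results = {}
for pose in ["poseA", "poseB", "poseC"]:
    results["reference_target_name"] = pose  # same key every time

print(results)  # {'reference_target_name': 'poseC'} -- only the last pose survives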

    results,
    use_dask=inputs.use_dask,
    dask_client=dask_client,
    return_df=False,
Collaborator:
If you set return_df=True here, it will be a lot easier to wrangle, including writing the CSV with df.to_csv below.
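
Once return_df works, the suggested pattern would look roughly like this, reusing the scorer.score kwargs from the diff above (to_csv is the standard pandas DataFrame writer):

scores_df = scorer.score(
    results,
    use_dask=inputs.use_dask,
    dask_client=dask_client,
    return_df=True,
    dask_failure_mode=inputs.dask_failure_mode,
)
# Write the tidy DataFrame directly instead of hand-rolling a csv.writer loop.
scores_df.to_csv(data_intermediates / "docking_scores_raw.csv", index=False)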

@apayne97 (Contributor, Author):
@hmacdope see Maria's comments: return_df isn't working. Ultimately we'd want to use that, though.
