Prevent users enabling annotations with mismatching data type (flow etc) #8788

Open
gokalpcelik opened this issue Apr 22, 2024 · 7 comments · May be fixed by #8810
@gokalpcelik
Contributor

Here is a user trying to enable flow-based annotations for non-flow data.

https://gatk.broadinstitute.org/hc/en-us/community/posts/24596063911963-Error-while-running-Mutect2-java-lang-IllegalArgumentException-the-index-points-past-the-last-element-of-the-collection-or-array-334-333

What could be done to prevent this?
Possible ideas

  • Check data and disable nonmatching annotations with a warning in the log
  • Prevent running the command at all with error messages in the log
  • Change the way annotations are named such as FB_myannotation...
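
The first two ideas can be sketched roughly as follows. This is a minimal illustration, not GATK code: `AnnotationGuard`, `RequestedAnnotation`, `requiresFlowData`, and the annotation names used with it are all hypothetical, and it assumes the tool already knows whether the input reads are flow-based.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: none of these names exist in GATK.
final class AnnotationGuard {

    /** Hypothetical stand-in for a requested annotation and the data type it needs. */
    static final class RequestedAnnotation {
        final String name;
        final boolean requiresFlowData;

        RequestedAnnotation(String name, boolean requiresFlowData) {
            this.name = name;
            this.requiresFlowData = requiresFlowData;
        }
    }

    /** Idea 1: disable annotations that do not match the data and record a warning for each. */
    static List<RequestedAnnotation> disableMismatched(
            List<RequestedAnnotation> requested, boolean inputIsFlowBased, List<String> warnings) {
        List<RequestedAnnotation> kept = new ArrayList<>();
        for (RequestedAnnotation a : requested) {
            if (a.requiresFlowData && !inputIsFlowBased) {
                warnings.add("Disabling annotation " + a.name + ": it requires flow-based reads");
            } else {
                kept.add(a);
            }
        }
        return kept;
    }

    /** Idea 2: refuse to run at all when any requested annotation does not match the data. */
    static void failOnMismatch(List<RequestedAnnotation> requested, boolean inputIsFlowBased) {
        for (RequestedAnnotation a : requested) {
            if (a.requiresFlowData && !inputIsFlowBased) {
                throw new IllegalArgumentException(
                        "Annotation " + a.name + " requires flow-based reads, but the input is not flow-based");
            }
        }
    }
}
```

The trade-off between the two is that the first lets the run finish with fewer annotations than requested, while the second stops early so the user notices the mismatch before any output is written.
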
@jamesemery
Collaborator

@ilyasoifer Looks like this user got tricked by some of the flow-based annotations that don't work on their data. I would like to cut down on the risk of this happening to users. If we had had more foresight, I would have advocated naming all of the flow-specific annotations something like "flowbased_#####". How destructive would renaming them now be for your pipelines?

We have some appropriate checks in GATK for the flow-ness of the BAM that give broader warnings about flow-based mode, but we don't currently have any safeguards in the annotations. Thoughts?

@ilyasoifer
Collaborator

@jamesemery, @gokalpcelik - this actually looks like a bug; this annotation should work on any data IMO. Would it be possible to get data to reproduce this crash? We should fix this quickly.

@gokalpcelik
Contributor Author

Looks like some of the code has changed since then, so the line numbers don't match the master branch. Should we ask the user to try running this annotation with the latest GATK?

@ilyasoifer
Collaborator

@gokalpcelik I see that the bug exists in the updated code too. We can fix it, but it would be good to have a dataset that can be used to validate the fix. Any chance of asking the user to generate a small example?

@gokalpcelik
Contributor Author

I asked the user to send us a piece of the data. On the other hand, the file name starts with SRR6127704. Is this something that we can get from SRA?

@ilyasoifer
Collaborator

ilyasoifer commented Apr 24, 2024 via email

@ilyasoifer ilyasoifer linked a pull request May 3, 2024 that will close this issue
@ilyasoifer
Collaborator

@gokalpcelik, @jamesemery - I submitted a PR (#8810) that should fix this issue. Could you please take a look?
Thanks!
