Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ApplyVQSR Exception thrown at a/some variation site(s) after INDEL VQSR #8767

Open
Dkyuan opened this issue Apr 9, 2024 · 0 comments
Open

Comments

@Dkyuan
Copy link

Dkyuan commented Apr 9, 2024

Bug Report

Affected tool(s) or class(es)

ApplyVQSR, org.broadinstitute.hellbender.exceptions.GATKException

Affected version(s)

4.5.0.0 (gatk-package-4.5.0.0-local.jar)
HTSJDK Version: 4.1.0
Picard Version: 3.1.1

Description

It Encountered input variant which isn't found in the input recal file. And suggested to make sure VariantRecalibrator and ApplyVQSR were run on the same set of input variants. ### I'm sure with this.

Steps to reproduce

  1. step1. E_J_HC.2.vcf.gz as input variation to get EJ.HC.snps.recal file, using VariantRecalibrator in SNP mode
  2. step2. E_J_HC.2.vcf.gz as input variation and recal-file EJ.HC.snps.recal to get SNP recalibrated vcf file [EJ.HC.snps.VQSR.vcf.gz], using ApplyVQSR in SNP mode
  3. step3. EJ.HC.snps.VQSR.vcf.gz as input variation to get EJ.HC.snps.indels.recal file, using VariantRecalibrator in INDEL mode, with --max-gaussians 6; Tool returned: true
  4. step4. EJ.HC.snps.VQSR.vcf.gz as input variation and recal-file EJ.HC.snps.indels.recal to get INDEL recalibrated vcf file EJ.HC.vqsr.vcf.gz, using ApplyVQSR

Expected behavior

Give a recalibrated vcf file ( both SNPs and INDELs )

Actual behavior

Stopped at step4 at position 1:21246 when writing the recalibrated vcf file EJ.HC.vqsr.vcf.gz.
org.broadinstitute.hellbender.exceptions.GATKException: Exception thrown at 1:21246 [VC EJ.HC.snps.VQSR.vcf.gz @ 1:21246 Q2489.48 of type=SNP alleles=[G*, T] attr={AC=2, AF=0.500, AN=4, DP=155, ExcessHet=0.0000, FS=0.000, MLEAC=2, MLEAF=0.500, MQ=60.00, QD=27.97, SOR=1.204, VQSLOD=-3.754e-01, culprit=QD} GT=GT:AD:DP:GQ:PL 0/0:64,0:64:99:0,120,1800 1/1:0,89:89:99:2505,267,0 filters= at org.broadinstitute.hellbender.engine.MultiVariantWalker.lambda$traverse$1(MultiVariantWalker.java:145) at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197) at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:179) at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197) at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133) at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1845) at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509) at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596) at org.broadinstitute.hellbender.engine.MultiVariantWalker.traverse(MultiVariantWalker.java:136) at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1098) at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:149) at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:198) at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:217) at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:166) at org.broadinstitute.hellbender.Main.mainEntry(Main.java:209) at org.broadinstitute.hellbender.Main.main(Main.java:306) Caused by: org.broadinstitute.hellbender.exceptions.UserException: Encountered input variant which isn't found in the input recal file. Please make sure VariantRecalibrator and ApplyVQSR were run on the same set of input variants. First seen at: [VC EJ.HC.snps.VQSR.vcf.gz @ 1:21246 Q2489.48 of type=SNP alleles=[G*, T] attr={AC=2, AF=0.500, AN=4, DP=155, ExcessHet=0.0000, FS=0.000, MLEAC=2, MLEAF=0.500, MQ=60.00, QD=27.97, SOR=1.204, VQSLOD=-3.754e-01, culprit=QD} GT=GT:AD:DP:GQ:PL 0/0:64,0:64:99:0,120,1800 1/1:0,89:89:99:2505,267,0 filters=


what I checked:

  1. I noticed there is a similar closed issue ApplyVQSR exception  #8054 , but I didn't know how it was solved

  2. it indeed has the SNP 1:21246 in EJ.HC.snps.VQSR.vcf.gz but not in the recal-file EJ.HC.snps.VQSR.vcf.gz
    1 21246 . G T 2489.48 PASS AC=2;AF=0.5;AN=4;DP=155;ExcessHet=0;FS=0;MLEAC=2;MLEAF=0.5;MQ=60;QD=27.97;SOR=1.204;VQSLOD=-0.3754;culprit=QD GT:AD:DP:GQ:PL 0/0:64,0:64:99:0,120,1800 1/1:0,89:89:99:2505,267,0

  3. I checked the code for the four steps,and I'm sure I used the same input variation file in VariantRecalibrator and ApplyVQSR. The code I used is shown below:

gatk --java-options "-Xmx800g" VariantRecalibrator \
       -R ref.fa \
       -V E_J_HC.2.vcf.gz \
       -resource:conc,known=true,training=true,truth=true,prior=10.0 conc.snp.PASS.vcf.gz \
       -an DP -an QD -an FS -an SOR -an ReadPosRankSum -an MQRankSum \
       -mode SNP \
       -tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 95.0 -tranche 90.0 \
       --rscript-file EJ.HC.snps.plots.R \
       --tranches-file EJ.HC.snps.tranches \
       -O EJ.HC.snps.recal && \
gatk --java-options "-Xmx800g" ApplyVQSR \
        -R ref.fa \
        -V E_J_HC.2.vcf.gz \
        --truth-sensitivity-filter-level 99.0 \
        --tranches-file EJ.HC.snps.tranches \
        --recal-file EJ.HC.snps.recal \
        -mode SNP \
        -O EJ.HC.snps.VQSR.vcf.gz

gatk --java-options "-Xmx800g" VariantRecalibrator \
        -R ref.fa \
        -V EJ.HC.snps.VQSR.vcf.gz \
        -resource:conc,known=true,training=true,truth=true,prior=10.0 conc.indel.PASS.vcf.gz \
        -an DP -an QD -an FS -an SOR -an ReadPosRankSum -an MQRankSum \
        -mode INDEL \
        --max-gaussians 6 \
        --rscript-file EJ.HC.snps.indels.plots.R \
        --tranches-file EJ.HC.snps.indels.tranches \
        -O EJ.HC.snps.indels.recal && \
gatk --java-options "-Xmx800g" ApplyVQSR \
        -R ref.fa \
        -V EJ.HC.snps.VQSR.vcf.gz \
        --truth-sensitivity-filter-level 99.0 \
        --tranches-file EJ.HC.snps.indels.tranches \
        --recal-file EJ.HC.snps.indels.recal \
        -O EJ.HC.vqsr.vcf.gz
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant