Variant calling with deep learning for prokaryotic genomes using TensorFlow. Adds the variant probability (0 <= P(Variant) <= 1) to the INFO field of VCF files, abbreviated as "ACC=".

tmsincomb/DeepVCF

What is DeepVCF?

A deep learning SNP variant caller aimed at prokaryotic genomes.

Prerequisites

WARNING :: TensorFlow stable does not work with Python 3.9 yet!

Python 3.8

Install

git clone git@github.com:tmsincomb/DeepVCF.git
pip install -e ./DeepVCF

Optional Install

For creating in silico datasets

conda install -y dwgsim samtools bcftools bwa

Usage: Init

from DeepVCF.core import DeepVCF

deepvcf = DeepVCF()

Simple Usage: Variant Calling

# reference_file, alignment_file, and vcf_file point to your own training data
deepvcf.train(reference_file, alignment_file, vcf_file)  # vcf treated as truth
vcf_df = deepvcf.create_vcf(
    reference_file=query_ref_file,
    alignment_file=query_align_file,
    output_folder='./',
    output_prefix='my-variants'  # auto adds .deepvcf.vcf to end of file created
)
vcf_df.head()  # shows pandas DataFrame for variant outputs
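
Because the probability is written to the INFO column as "ACC=", the output VCF can be filtered afterwards with plain Python. The following is a minimal sketch, not part of DeepVCF: the helper names `parse_acc` and `filter_by_acc`, the 0.5 threshold, and the sample records are all hypothetical, and it assumes a standard tab-separated VCF with INFO in the eighth column.

```python
from typing import Iterable, Iterator, Optional, Tuple


def parse_acc(info: str) -> Optional[float]:
    """Return the ACC value from a VCF INFO string, or None if it is absent."""
    for field in info.split(";"):
        key, _, value = field.partition("=")
        if key == "ACC" and value:
            return float(value)
    return None


def filter_by_acc(
    vcf_lines: Iterable[str], threshold: float = 0.5
) -> Iterator[Tuple[str, int, str, str, float]]:
    """Yield (chrom, pos, ref, alt, acc) for records with ACC >= threshold."""
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:
            continue  # not a complete VCF record
        acc = parse_acc(cols[7])  # INFO is the 8th VCF column
        if acc is not None and acc >= threshold:
            yield cols[0], int(cols[1]), cols[3], cols[4], acc


# Hypothetical records, for illustration only.
sample_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t101\t.\tA\tG\t.\t.\tACC=0.97",
    "chr1\t250\t.\tC\tT\t.\t.\tDP=12;ACC=0.31",
]
kept = list(filter_by_acc(sample_lines, threshold=0.5))
```

To apply this to real output, pass an open file handle instead of `sample_lines`; given the prefix above, the file would presumably be `./my-variants.deepvcf.vcf`. The threshold is a judgment call that depends on how the model was trained.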

Tutorials

Recreating Example Datasets
Usage Demo with In Silico datasets
Model Validation with human datasets from GIAB
