Toolkit for GTF format conversion and sequence extraction

PoisonAlien/gtf2fasta

gtf2fasta

A minimal tool that extracts the sequence of every transcript in a GTF file (exonic regions only) from a FASTA file.
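
To make "exonic regions only" concrete, here is a minimal Python sketch of the general idea, not the tool's actual Julia implementation: collect the `exon` records of each transcript, sort them by start position, and join the corresponding slices of the chromosome sequence. The names `parse_attributes` and `transcript_sequences` are hypothetical, the genome is a toy in-memory dictionary, and strand handling (reverse-complementing minus-strand transcripts) is deliberately left out because the README does not describe it.

```python
from collections import defaultdict


def parse_attributes(field):
    """Parse the GTF attribute column (key "value"; pairs) into a dict."""
    attrs = {}
    for part in field.strip().split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition(" ")
        attrs[key] = value.strip('"')
    return attrs


def transcript_sequences(gtf_lines, genome):
    """Concatenate exon sequences per transcript (plus strand only).

    genome is a dict mapping chromosome name to its sequence string.
    """
    exons = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        # Keep only exon records; CDS, UTR, etc. are skipped.
        if len(cols) < 9 or cols[2] != "exon":
            continue
        tid = parse_attributes(cols[8])["transcript_id"]
        # GTF coordinates are 1-based and inclusive.
        exons[tid].append((cols[0], int(cols[3]), int(cols[4])))

    seqs = {}
    for tid, parts in exons.items():
        parts.sort(key=lambda exon: exon[1])
        seqs[tid] = "".join(genome[chrom][start - 1:end]
                            for chrom, start, end in parts)
    return seqs


# Toy data, made up for illustration.
gtf = [
    'chr1\ttoy\texon\t2\t4\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\ttoy\texon\t8\t10\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\ttoy\tCDS\t2\t4\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
]
genome = {"chr1": "ACGTACGTACGT"}
result = transcript_sequences(gtf, genome)
print(result)  # {'T1': 'CGTTAC'}
```

The intronic bases between positions 5 and 7 are dropped, so the transcript sequence is the two exons joined end to end.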

It is written in Julia and has no other dependencies. See the Julia website for installation instructions.

This tool requires an indexed FASTA file for memory-efficient sequence extraction.
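
The README does not say which index format is expected. As a rough illustration of why an index saves memory, the Python sketch below assumes a `samtools faidx`-style layout (sequence name, length, byte offset of the first base, bases per line, bytes per line) and uses it to seek straight to a region instead of loading the whole genome. The helper names `build_index` and `fetch` are hypothetical, and the sketch assumes every sequence line except the last has the same length.

```python
import os
import tempfile


def build_index(path):
    """Return {name: (length, offset, line_bases, line_width)} for a FASTA file."""
    index = {}
    name = None
    with open(path, "rb") as fh:
        while True:
            line = fh.readline()
            if not line:
                break
            if line.startswith(b">"):
                name = line[1:].split()[0].decode()
                # The first base starts right after the header line.
                index[name] = [0, fh.tell(), 0, 0]
            elif name is not None:
                entry = index[name]
                bases = len(line.rstrip(b"\r\n"))
                if entry[2] == 0:
                    entry[2] = bases       # bases per line
                    entry[3] = len(line)   # bytes per line, newline included
                entry[0] += bases
    return {key: tuple(value) for key, value in index.items()}


def fetch(path, index, name, start, end):
    """Return bases start..end (1-based, inclusive) by seeking into the file."""
    _length, offset, line_bases, line_width = index[name]
    first, last = start - 1, end - 1
    begin = offset + (first // line_bases) * line_width + first % line_bases
    stop = offset + (last // line_bases) * line_width + last % line_bases + 1
    with open(path, "rb") as fh:
        fh.seek(begin)
        raw = fh.read(stop - begin)
    # Drop the newlines that fall inside the byte range.
    return raw.replace(b"\n", b"").replace(b"\r", b"").decode()


# Toy FASTA file, made up for illustration.
tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".fa")
try:
    tmp.write(b">chr1 toy\nACGTACGT\nTTGGCCAA\nAC\n>chr2\nGGGG\n")
    tmp.close()
    index = build_index(tmp.name)
    seq = fetch(tmp.name, index, "chr1", 7, 10)
    print(index["chr1"])  # (18, 10, 8, 9)
    print(seq)            # GTTT
finally:
    os.remove(tmp.name)
```

Only the requested bytes are read, so memory use depends on the region size rather than on the genome size.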

Usage

# Extract FASTA sequences from a GTF file (sequences are written to stdout)

$ gtf2fasta.jl ens82.gtf hg19.fa | head
>ENST00000456328
GTTAACTTGCCGTCAGCCTTTTCTTTGACCTCTTCTTTCTGTTCATGTGTATTTGCTGTCTCTTAGCCCA
GACTTCCCGTGTCCTTTCCACCGGGCCTTTGAGAGGTCACAGGGTCTTGATGCTGTGGTCTTCATCTGCA
GGTGTCTGACTTCCAGCAACTGCTGGCCTGTGCCAGGGTGCAAGCTGAGCACTGGAGTGGAGTTTTCCTG
...

It also generates ens82.gtf.transcript.dict.tsv, which contains transcript-to-gene mappings.
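
The exact column layout of that TSV is not documented here. As a hedged illustration, the Python sketch below shows one way such a mapping could be derived from the `transcript_id` and `gene_id` attributes of a GTF and written as two tab-separated columns. The function name `transcript_to_gene`, the column order, and the toy records are assumptions.

```python
import csv
import io
import re


def transcript_to_gene(gtf_lines):
    """Map each transcript_id to its gene_id using the GTF attribute column."""
    mapping = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue
        tid = re.search(r'transcript_id "([^"]+)"', cols[8])
        gid = re.search(r'gene_id "([^"]+)"', cols[8])
        # Gene-level records carry no transcript_id and are skipped.
        if tid and gid:
            mapping[tid.group(1)] = gid.group(1)
    return mapping


# Toy GTF records, made up for illustration.
gtf = [
    'chr1\ttoy\tgene\t1\t100\t.\t+\t.\tgene_id "G1";',
    'chr1\ttoy\texon\t1\t50\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\ttoy\texon\t60\t100\t.\t+\t.\tgene_id "G1"; transcript_id "T2";',
]
mapping = transcript_to_gene(gtf)

# Write the mapping as TSV (assumed order: transcript, then gene).
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
for transcript, gene in mapping.items():
    writer.writerow([transcript, gene])
tsv = buffer.getvalue()
print(tsv, end="")
```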
