Concurrent K-mer expansion for sequence data in Go.

lambertsbennett/GoKmer

GoKmer

When applying NLP techniques to sequence data, I often needed to expand sequences into k-mers (all overlapping substrings of length k). This is not a particularly enriching task, so I wrote a small tool that concurrently expands sequences into k-mers of a given size. It is fast: large assemblies can be processed in seconds on an 8-core laptop.
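To make the idea concrete, here is a minimal sketch of concurrent k-mer expansion using a worker pool. It is not GoKmer's actual source: the `Record` type and the `Kmers` and `ExpandAll` functions are hypothetical names chosen for illustration, and the real tool may split work differently.

```go
package main

import (
	"fmt"
	"sync"
)

// Record is a hypothetical container for one sequence; GoKmer's own types may differ.
type Record struct {
	ID  string
	Seq string
}

// Kmers returns every overlapping substring of length k in seq.
// It returns nil when k is non-positive or longer than the sequence.
func Kmers(seq string, k int) []string {
	if k <= 0 || k > len(seq) {
		return nil
	}
	out := make([]string, 0, len(seq)-k+1)
	for i := 0; i+k <= len(seq); i++ {
		out = append(out, seq[i:i+k])
	}
	return out
}

// ExpandAll fans records out to n worker goroutines and collects
// the resulting k-mers in a map keyed by sequence ID.
func ExpandAll(records []Record, k, n int) map[string][]string {
	if n < 1 {
		n = 1
	}
	jobs := make(chan Record)
	results := make(map[string][]string, len(records))
	var mu sync.Mutex
	var wg sync.WaitGroup

	for w := 0; w < n; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for r := range jobs {
				km := Kmers(r.Seq, k)
				mu.Lock() // the map is shared, so writes are serialized
				results[r.ID] = km
				mu.Unlock()
			}
		}()
	}

	for _, r := range records {
		jobs <- r
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	recs := []Record{
		{ID: "seq1", Seq: "ACGTAC"},
		{ID: "seq2", Seq: "GGC"},
	}
	// Map iteration order is not deterministic in Go.
	for id, km := range ExpandAll(recs, 3, 2) {
		fmt.Println(id, km)
	}
}
```

Each sequence is handled by a single worker, so the k-mers for one ID stay in positional order even though the sequences themselves are processed in parallel.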

Usage

gokmer -file PATH_TO_SEQS -n NUM_PROCESSORS -k-mer-size K_MER_SIZE -o OUTPUT_FILE

INPUTS:

  • Sequence file in FASTA format.

OUTPUTS:

  • Parquet file with sequence ID and k-mers.
