Pseudogenes detection #125

Answered by oschwengers
YiJessePi asked this question in Q&A

Hi @YiJessePi,
sure, and thanks for reaching out! The methodology that is currently implemented is far from comprehensive; it is meant as a solid starting point, and more cases will be covered in upcoming releases.

Currently, Bakta uses CDSs that remain annotated as hypothetical proteins as seed sequences. It then searches its PSC database for reference proteins using relaxed thresholds of 80% sequence identity and 40% subject coverage. PSC references with sufficient hits are aligned against the 6-frame translations of the CDS sequences, elongated by 300 bp in both the upstream and downstream directions. These alignments are then analyzed to detect the causes of pseudogenization, e.g. is…

Answer selected by oschwengers