Stitching fragmented ORFs #91

Open
cmorganl opened this issue Mar 26, 2022 · 0 comments
Labels
feature request: a request for a new feature unlike one that already exists

@cmorganl
Collaborator

Is your feature request related to a problem? Please describe.
A user, Aditi Nagaraj, found a series of ORFs predicted by Prodigal (within treesapp assign) that had fragmented a single RpoB protein sequence into five consecutive ORFs.

Example outputs can be generated from the following command with rpob_test.txt and the RpoB reference package from RefPkgs:

treesapp assign \
-i rpob_test.txt -o RpoB_fragment_test/ \
--refpkg_dir RefPkgs/Translation/RpoB/seed_refpkg/final_outputs/ 

Describe the solution you'd like
A single ORF should be reported in cases where the whole protein sequence has been fragmented into pieces.

The 'stitching' can happen after the profile HMM alignment results have been parsed. A new function needs to be written that compares the alignment positions of ORFs on a single contig or scaffold (i.e. parent sequence). If it finds multiple ORFs from the same parent sequence that lie on the same strand and whose profile HMM alignment positions do not overlap, those ORFs should be stitched together.
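
The comparison described above could look roughly like the following. This is only a sketch to make the idea concrete: `OrfHit` and `find_stitch_candidates` are hypothetical names, not existing TreeSAPP classes or functions, and the fields are assumed to be available once the profile HMM alignments have been parsed. For simplicity it skips any parent/strand group in which two alignments overlap.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class OrfHit:
    """Hypothetical record for one ORF's profile HMM alignment."""
    orf_id: str
    parent: str      # contig or scaffold the ORF was predicted on
    strand: str      # "+" or "-"
    hmm_start: int   # first aligned position on the profile HMM
    hmm_end: int     # last aligned position on the profile HMM


def find_stitch_candidates(hits):
    """Return groups of ORFs that share a parent and strand and whose
    profile HMM alignment positions do not overlap."""
    by_parent = defaultdict(list)
    for hit in hits:
        by_parent[(hit.parent, hit.strand)].append(hit)

    candidates = []
    for group in by_parent.values():
        if len(group) < 2:
            continue  # a single ORF has nothing to be stitched to
        group.sort(key=lambda h: h.hmm_start)
        # After sorting by start, checking neighbours is enough to
        # guarantee that no pair of alignments overlaps.
        if all(a.hmm_end < b.hmm_start for a, b in zip(group, group[1:])):
            candidates.append(group)
    return candidates
```

Each returned group is ordered by position on the profile HMM, which is the order the fragments would need for stitching.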

Stitching involves going back to the (untranslated) input sequences, finding the start and stop positions of the fragments, deducing the frame in which the ORFs were translated, and conceptually translating the whole region as a single sequence using the same translation table used by Prodigal.
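
A minimal sketch of that stitching step is below. The names are hypothetical and several assumptions are baked in: fragment coordinates are 1-based and inclusive on the forward strand of the parent sequence, all fragments are assumed to share the frame of the most upstream fragment, and the codon assignments are those of the standard genetic code (which translation table 11 shares). A real implementation would need to look up the table Prodigal actually used and handle frameshifts between fragments.

```python
from itertools import product

# Codon assignments of the standard genetic code, in NCBI "TCAG" order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def translate(nuc):
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    nuc = nuc.upper()
    usable = len(nuc) - len(nuc) % 3
    return "".join(CODON_TABLE.get(nuc[i:i + 3], "X")
                   for i in range(0, usable, 3))


def stitch_orfs(parent_seq, fragments, strand):
    """Translate the region spanned by all fragments as one sequence.

    fragments: list of (start, end) tuples, 1-based inclusive, forward strand.
    The frame is anchored at the most upstream fragment: its start on the
    "+" strand, or its end on the "-" strand.
    """
    start = min(s for s, _ in fragments)
    end = max(e for _, e in fragments)
    span = parent_seq[start - 1:end]
    if strand == "-":
        span = reverse_complement(span)
    return translate(span)
```

Stop codons that fall between fragments are kept as `*` in the output, so the caller can decide whether to mask or report them.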

@cmorganl added the feature request label on Mar 26, 2022