Ribotricer Output #143

Open
bshim181 opened this issue Aug 30, 2023 · 4 comments

Comments

@bshim181

bshim181 commented Aug 30, 2023

Hello,

I am trying to merge two lists of ORFs, one predicted by the RibORF algorithm and one by the Ribotricer algorithm. I understand that RibORF predicts ORFs at the transcript level while Ribotricer predicts ORFs at the exon level. I would like to create a BED file that encompasses the predictions from both outputs. Is there a way to merge ORFs predicted at different levels (transcript vs. exon)?

Is there a feature that could transform the exon-level predictions from Ribotricer into transcript-level predictions? Also, does it make sense to use an offset-corrected BAM file (corrected by RibORF) as Ribotricer input?
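One common way to hold multi-exon ORFs from different tools in a single BED file is the 12-column BED layout, where each ORF is one line and its exons are stored as blocks. The following is a minimal sketch, not part of either tool: `blocks_to_bed12` is a hypothetical helper, and it assumes the input exon coordinates are 1-based and inclusive, which should be checked against the actual files.

```python
def blocks_to_bed12(chrom, name, strand, blocks):
    """Format exon blocks as one BED12 line.

    Assumption: blocks are (start, end) pairs, 1-based and inclusive.
    """
    blocks = sorted(blocks)
    chrom_start = blocks[0][0] - 1   # BED starts are 0-based
    chrom_end = blocks[-1][1]        # BED ends are exclusive
    sizes = [end - start + 1 for start, end in blocks]
    starts = [start - 1 - chrom_start for start, _ in blocks]
    fields = [
        chrom, chrom_start, chrom_end, name, 0, strand,
        chrom_start, chrom_end,      # thickStart / thickEnd: whole ORF
        "0,0,0",                     # itemRgb
        len(blocks),
        ",".join(map(str, sizes)),
        ",".join(map(str, starts)),
    ]
    return "\t".join(map(str, fields))


# Made-up example ORF with two exons
line = blocks_to_bed12("chr1", "orf_example", "+", [(101, 150), (201, 240)])
print(line)
```

Once both sets of predictions are in this form, lines for the same ORF from the two tools can be compared field by field.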

@saketkc
Collaborator

saketkc commented Aug 30, 2023

My suggestion would be to use the same annotation file for both. That is, once you generate the ribotricer index, you should be able to modify it and use it as an index for RibORF. I haven't used RibORF for a while, but this is the strategy we used when we benchmarked.

@bshim181
Author

This is the ORF annotation file for Ribotricer. I was wondering what the coordinates in the last column stand for.

[Screenshot 2023-08-31 at 9 20 33 AM]

This is the ORF annotation for the same transcript_ID from RibORF. I was wondering how we would know the start and end coordinates of the parent transcript (not each of its isoforms), which is specified in the RibORF annotations.
[Screenshot 2023-08-31 at 9 22 01 AM]

@saketkc
Collaborator

saketkc commented Sep 6, 2023

The last column holds the coordinates of that ORF: the comma-separated values indicate the start and end points of each segment (these are exons and hence not contiguous).
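As a rough illustration, a coordinate string like this can be split into per-exon (start, end) pairs. This is a sketch under an assumption: each comma-separated entry is taken to have the form `start-end`, and `parse_coordinates` is a hypothetical helper, not a ribotricer function. The example string is made up.

```python
def parse_coordinates(coordinate):
    """Split a coordinate string into a list of (start, end) integer pairs.

    Assumption: entries look like "start-end" and are separated by commas.
    """
    blocks = []
    for part in coordinate.split(","):
        start, end = part.strip().split("-")
        blocks.append((int(start), int(end)))
    return blocks


blocks = parse_coordinates("101-150,201-240")
print(blocks)  # [(101, 150), (201, 240)]
```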

Hope that helps!

@bshim181
Author

Hello,

I have been working on transforming the Ribotricer index into a RibORF index. I have noticed that the exon boundary coordinates are not multiples of 3, while the RibORF index uses transcript-level annotations with transcript lengths that are multiples of 3. Would this cause issues? I have generated predictions with the RibORF algorithm using the transformed index. It is, however, difficult to verify whether it was converted accurately.

I was wondering whether you have the converted RibORF annotations, or the code used to convert one index into the other, and if so, whether you would be able to share them with us. Or would that be difficult?
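One sanity check for a converted index, assuming each entry is a complete ORF running from start codon to stop codon: individual exon segments do not need to be multiples of 3 in length, but the spliced length summed over all segments should be. The sketch below uses a hypothetical helper, `spliced_length`, and assumes (start, end) pairs that are 1-based and inclusive; the example coordinates are made up.

```python
def spliced_length(blocks):
    """Total length of an ORF summed over its exon segments.

    Assumption: blocks are (start, end) pairs, 1-based and inclusive.
    """
    return sum(end - start + 1 for start, end in blocks)


# Segment lengths 50 and 40: neither is a multiple of 3, but the sum is.
blocks = [(101, 150), (201, 240)]
total = spliced_length(blocks)
print(total, total % 3 == 0)  # 90 True
```

Entries that fail this check after conversion would be worth inspecting by hand, for example for an off-by-one shift between 0-based and 1-based coordinates.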
