roimethstat seems not be able to identify overlapping region #21

Open
kyliuae opened this issue Jan 24, 2022 · 5 comments

Comments

@kyliuae

kyliuae commented Jan 24, 2022

Hi Methpipe team,

I'm David, a PhD student working on calculating methylation in PMD regions and their flanking areas. Basically, I chop the PMDs and the extended regions into smaller bins and calculate the average methylation level of each. However, some of the regions overlap (e.g. the end coordinate of one region exceeds the start coordinate of the next). No matter how I sort (sort -k 1,1 -k 3,3n -k 2,2n -k 6,6 or sort -k1,1 -k2,2n), the program still reports that the regions-of-interest file isn't sorted.

My version of Methpipe is methpipe-5.0.0. Attached is an example screenshot (I do not think the -nan values in the PMD definition, which I also produced with Methpipe, are the reason). Thank you very much for the help.

Best Regards,
David
[Image: Screenshot 2022-01-24 at 3 45 36 PM]

@mengzhou
Member

Please try the -L option of roimethstat if you have overlapping regions. This option will make the program load all CpG sites of your input into memory first.
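
To see which regions actually overlap before reaching for -L, a short awk pass over the region file can list them. This is only a rough sketch: it assumes half-open BED intervals and a file already ordered by chromosome and start, and `list_overlaps` is a hypothetical helper name, not part of methpipe.

```shell
# Hypothetical helper: print each line whose start falls before the largest
# end coordinate seen so far on the same chromosome (i.e. it overlaps an
# earlier region). Assumes input is ordered by chromosome, then start.
list_overlaps() {
  LC_ALL=C awk '
    ($1 "") == chrom && $2 + 0 < max_end { print NR ": " $0 }
    ($1 "") != chrom { chrom = $1 ""; max_end = 0 }   # new chromosome: reset
    $3 + 0 > max_end { max_end = $3 + 0 }             # track furthest end
  ' "$1"
}
```

Each reported line is prefixed with its line number in the input file.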

@kyliuae
Author

kyliuae commented Jan 25, 2022

Thanks. Unfortunately, I had already tried the -L option, but the same error still occurs.

Do I need to send you any extra information for this? Thank you.

@guilhermesena1
Contributor

It is also possible that your chroms are not sorted alphabetically under the C locale. Have you tried sorting your BED file using the command below?

LC_ALL=C sort -k1,1 -k2,2n -k3,3n -k4,4
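
One way to confirm whether a file is already in the order that command would produce, without rewriting it, is `sort -c`, which exits non-zero and names the first out-of-order line. A minimal sketch, assuming this key order is the one roimethstat expects; `check_bed_sorted` and `regions.bed` are hypothetical names:

```shell
# Hypothetical helper: succeed only if the file is already in the order
# produced by the sort command above (C locale, chrom, start, end, column 4).
check_bed_sorted() {
  LC_ALL=C sort -c -k1,1 -k2,2n -k3,3n -k4,4 "$1"
}

# Usage (regions.bed is a placeholder for your own region file):
# check_bed_sorted regions.bed && echo "sorted" || echo "not sorted"
```

On failure, `sort` prints the offending line to stderr, which gives a starting point for inspecting the file.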

@kyliuae
Author

kyliuae commented Jan 26, 2022

Thank you for the comment. I did try your suggestion (I even tried removing the PMD identifier and the trailing column, leaving just the 4 columns needed for the sort you suggested), but the error still occurs 🥲. I'm pretty sure my xxx.meth file is fine, since roimethstat runs on regions without this overlap issue, so I guess the problem lies in the region file.

Do I need to send you any extra information for this? Thank you so much for the help again.

@guilhermesena1
Contributor

So I tried to reproduce with the following BED file:

[21:55:10] [odin : methpipe] cat test.bed 
chr1  13038878  13049837  + 0 test1
chr1  13040868  13043468  + 0 test2
chr1  13043468  13046068  + 0 test3
chr1  13046068  13048668  + 0 test4
chr1  13048668  13051268  + 0 test5
chr1  13049837  13060796  + 0 test6
chr1  13051268  13053868  + 0 test7
chr1  13053868  13056468  + 0 test8

and running

roimethstat -P -v -o test.roi test.bed in.meth 

(I also tried with the -L flag), and the program runs to completion, showing all 8 regions:

chr1	13038878	13049837	+:106:8:7:8	0.875	+
chr1	13040868	13043468	+:25:3:2:3	0.666667	+
chr1	13043468	13046068	+:29:2:2:2	1	+
chr1	13046068	13048668	+:30:2:2:2	1	+
chr1	13048668	13051268	+:18:5:24:27	0.888889	+
chr1	13049837	13060796	+:110:20:115:139	0.827338	+
chr1	13051268	13053868	+:29:0:0:0	-nan	+
chr1	13053868	13056468	+:19:0:0:0	-nan	+

I tried both with methpipe 4.1.1 and 5.0.0.
I agree that our error reports aren't very helpful, since they don't print which pair of regions is causing the not-sorted error (we'll work on more informative error messages for the next release!).

By any chance, would you be able to share the entire BED file you're using, so we can try to reproduce the error and zero in on the actual pair of regions causing the problem?
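
In the meantime, a short awk pass can point at a candidate pair. This is a rough sketch, assuming the expected order is chromosome (byte-wise), then start, then end, as in the sort command suggested earlier in this thread; `first_unsorted_pair` is a hypothetical helper name, not part of methpipe, and roimethstat's own check may differ.

```shell
# Hypothetical helper: print the first adjacent pair of lines that breaks
# chromosome/start/end order, or print nothing if the file is in order.
first_unsorted_pair() {
  LC_ALL=C awk '
    NR > 1 {
      c = $1 ""; s = $2 + 0; e = $3 + 0   # force string / numeric comparison
      if (c < pc || (c == pc && (s < ps || (s == ps && e < pe)))) {
        printf "line %d: %s\nline %d: %s\n", NR - 1, prev, NR, $0
        exit
      }
    }
    { pc = $1 ""; ps = $2 + 0; pe = $3 + 0; prev = $0 }
  ' "$1"
}
```

Running it on the region file prints the two neighbouring lines, with their line numbers, at the first place the order breaks.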

@andrewdavidsmith andrewdavidsmith transferred this issue from smithlabcode/methpipe Oct 16, 2022