MAPQ Filtering Post Alignment Using --local Parameter #646

Open
allenloong opened this issue Jan 2, 2024 · 2 comments

@allenloong

allenloong commented Jan 2, 2024

Hi,
I am currently using Bismark for my DNA methylation analysis and have a question regarding the post-alignment filtering process. Specifically, I'm using the --local parameter for alignment, which I understand allows for more flexible alignments.

My question is about the necessity and implications of filtering alignments based on their MAPQ scores post-alignment. In my case, is it advisable to filter out alignments with a MAPQ score less than 40? I am aware that such filtering can help remove low-quality or ambiguous alignments, but I am also concerned about potentially losing valuable data.

Could you provide guidance or recommendations on this? Are there any additional insights or considerations I should be aware of when deciding on MAPQ thresholds?

Thanks.
Allen

@FelixKrueger
Owner

Dear Allen,

I have to admit that I can't really offer any useful advice on filtering on MAPQ values in locally aligned data. We typically perform adapter/quality trimming, followed by no further filtering at all, as we assume that poor quality data has been removed, and Bismark does not report perfectly multi-mapping reads anyway. Here is a blog post on the rationale for global alignments.

If you want to go down a local/filtering route, I assume general rules apply; see some considerations on MAPQ implementation here.
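
For anyone following along, here is a minimal sketch of what a post-alignment MAPQ filter could look like on SAM-formatted text, assuming uncompressed SAM lines where MAPQ is the fifth tab-separated column. The function name `filter_sam_by_mapq` and the threshold used in the usage lines are illustrative assumptions, not part of Bismark. The sketch treats each record independently, so for paired-end data the two mates of a pair would still need to be kept or dropped together.

```python
def filter_sam_by_mapq(lines, min_mapq):
    """Yield SAM header lines plus alignment records with MAPQ >= min_mapq.

    `lines` is any iterable of SAM text lines (hypothetical input; e.g. an
    open file handle on an uncompressed SAM file).
    """
    for line in lines:
        # Header lines start with '@' and are always passed through.
        if line.startswith("@"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # A valid SAM alignment record has at least 11 mandatory columns.
        if len(fields) < 11:
            continue
        mapq = int(fields[4])  # column 5 (1-based) holds MAPQ
        if mapq >= min_mapq:
            yield line


# Usage with three made-up records; the threshold of 40 is only the value
# asked about in this thread, not a recommendation.
example = [
    "@HD\tVN:1.0\tSO:unsorted\n",
    "r1\t0\tchr1\t100\t42\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t0\tchr1\t200\t10\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r3\t16\tchr2\t300\t40\t4M\t*\t0\t0\tACGT\tIIII\n",
]
kept = list(filter_sam_by_mapq(example, 40))
kept_names = [l.split("\t")[0] for l in kept if not l.startswith("@")]
```

Comparing the number of records kept at a few thresholds gives a quick sense of how much data a given cutoff would discard before committing to it.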

@allenloong
Author

Dear Felix,

Thank you for your prompt response. I've carefully reviewed the two blog posts you referenced, which, actually, inspired me to implement post-alignment filtering. Utilizing SeqMonk, I noticed an increase in read counts in local mode. Interestingly, a comparison of alignments between local and global modes revealed a widespread increase in reads across the entire genome.

I’m not sure if there is any rule on defining "outliers" in 2Kb window analyses. Additionally, I'm curious about the validity of using correlation as a metric to establish MAPQ thresholds when comparing local and global alignments.

Thanks,
Allen
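
To make the correlation idea concrete, here is a rough sketch of how read counts in fixed-size windows could be compared between two alignment runs. It assumes the read start positions for a single chromosome have already been extracted into plain lists of integers; the function names and the 2000 bp window default are illustrative assumptions, and a Pearson coefficient is just one possible choice of metric.

```python
from collections import Counter
from math import sqrt


def window_counts(positions, window=2000):
    """Count reads per fixed-size window, keyed by window index."""
    return Counter(pos // window for pos in positions)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def window_correlation(positions_a, positions_b, window=2000):
    """Correlate per-window read counts of two runs on one chromosome."""
    counts_a = window_counts(positions_a, window)
    counts_b = window_counts(positions_b, window)
    # Use the union of windows; Counter returns 0 for windows missing in one run.
    keys = sorted(set(counts_a) | set(counts_b))
    return pearson([counts_a[k] for k in keys], [counts_b[k] for k in keys])


# Made-up positions: the second run has exactly twice the reads per window.
global_pos = [100, 200, 2500, 4100, 4200, 4300]
local_pos = [50, 150, 250, 350, 2100, 2200, 4000, 4001, 4002, 4003, 4004, 4005]
r = window_correlation(global_pos, local_pos)
```

A uniform genome-wide increase in reads, as described above, would leave this coefficient high, so a high correlation on its own would not show that the extra local-mode alignments are correct.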
