Difference in mapping quality and alignment for bwa mem2 compared to bwa mem #246
Comments
Hi @janrehker, what are the environment and the command lines? Does it also happen with only 1 thread? Thanks
Hi fredjarlier,

Thank you for your reply! The environment is CentOS8. After reducing the number of threads from 20 to 1, I now get zero differences:

/mnt/NFS/s-msp01/pathodata/apps/bwa-0.7.17/bin/bwa mem -t 1 /mnt/NFS/s-msp01/pathodata/DB/GRCh38.p13/GCA_000001405.15_GRCh38_full_plus_hs38d1_analysis_set.fna 270190-RunA048-2864-2023-IGVhg38-F11-66_S2_R1_001.fastq.gz 270190-RunA048-2864-2023-IGVhg38-F11-66_S2_R2_001.fastq.gz > test_bwa.sam

./bwa-mem2.avx2 mem -t 1 GRCh38_full_plus_hs38d1/GCA_000001405.15_GRCh38_full_plus_hs38d1_analysis_set.fa 270190-RunA048-2864-2023-IGVhg38-F11-66_S2_R1_001.fastq.gz 270190-RunA048-2864-2023-IGVhg38-F11-66_S2_R2_001.fastq.gz > test_bwa-mem2-avx2.sam

samtools sort -n -m 30G -@1 -o test_bwa-mem2-avx2_sorted.sam -O SAM test_bwa-mem2-avx2.sam

samtools sort -n -m 30G -@1 -o test_bwa_sorted.sam -O SAM test_bwa.sam

diff test_bwa-mem2-avx2_sorted.sam test_bwa_sorted.sam > diff_unsorted1thread.txt

(The diff output contains only the two different command lines.)

I wonder why this happens. I would not expect multithreaded and single-threaded runs to differ in the alignments themselves, only in the order in which the reads are written, and sorting by read name corrects for that. Do you have a hint for me regarding this issue?

Best regards,
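A plain `diff` on the sorted SAM files also reports header lines such as `@PG`, and it does not say which field changed. The sketch below is one possible way to classify differences per read instead. It assumes standard tab-separated SAM records and compares only primary alignments; the function names `read_sam` and `compare`, and the toy records, are made up for illustration and are not taken from the real dataset.

```python
import io
from collections import Counter

def read_sam(handle):
    """Map (read name, mate number) -> (RNAME, POS, MAPQ, CIGAR) for primary alignments."""
    records = {}
    for line in handle:
        if line.startswith("@"):        # header lines (@HD, @SQ, @PG, ...)
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:            # not a complete SAM record
            continue
        flag = int(fields[1])
        if flag & 0x900:                # skip secondary (0x100) and supplementary (0x800)
            continue
        mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
        records[(fields[0], mate)] = (fields[2], int(fields[3]), int(fields[4]), fields[5])
    return records

def compare(a, b):
    """Count reads that are identical, differ in MAPQ only, or differ in the alignment."""
    counts = Counter()
    for key in a.keys() & b.keys():
        rec_a, rec_b = a[key], b[key]
        if rec_a == rec_b:
            counts["identical"] += 1
        elif (rec_a[0], rec_a[1], rec_a[3]) == (rec_b[0], rec_b[1], rec_b[3]):
            counts["mapq_only"] += 1    # same RNAME, POS and CIGAR, different MAPQ
        else:
            counts["alignment"] += 1    # RNAME, POS or CIGAR differs
    counts["only_in_one"] = len(a.keys() ^ b.keys())
    return counts

# Toy records (hypothetical), mimicking the kinds of difference described in this issue.
SAM_A = "\n".join([
    "@PG\tID:bwa\tCL:bwa mem -t 1 ref.fa r1.fq r2.fq",
    "read1\t99\tchr1\t100\t39\t50M\t=\t300\t250\tACGT\tFFFF",
    "read1\t147\tchr1\t300\t60\t50M\t=\t100\t-250\tACGT\tFFFF",
    "read2\t0\tchr2\t500\t60\t6S44M50S\t*\t0\t0\tACGT\tFFFF",
]) + "\n"
SAM_B = "\n".join([
    "@PG\tID:bwa-mem2\tCL:bwa-mem2 mem -t 1 ref.fa r1.fq r2.fq",
    "read1\t99\tchr1\t100\t48\t50M\t=\t300\t250\tACGT\tFFFF",
    "read1\t147\tchr1\t300\t60\t50M\t=\t100\t-250\tACGT\tFFFF",
    "read2\t0\tchr2\t503\t60\t9S41M50S\t*\t0\t0\tACGT\tFFFF",
]) + "\n"

counts = compare(read_sam(io.StringIO(SAM_A)), read_sam(io.StringIO(SAM_B)))
print(dict(counts))
```

On real files one would pass `open("test_bwa.sam")` and `open("test_bwa-mem2-avx2.sam")` instead of the `io.StringIO` objects. Keying on read name plus mate number also makes the name sort unnecessary for this comparison.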
With bwa you can use the -K option to get fixed chunk sizes and reproducible results: "Use -K 100000000 to achieve deterministic alignment results (Note: this is a hidden option)"
Thanks for the info. The default chunk size depends on the number of threads, so results may vary with the degree of parallelization. But in the first test you used the same number of threads for bwa-mem2 and bwa, right? If so, the default chunks have the same size and the results should be identical. I think it is worth trying the -K option to rule out a parallelization effect. Otherwise it could be a matter of thread safety. What are the compiler and CPU? Best
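To make the chunk-size point concrete, here is a small sketch of how the batch size could scale with the thread count. It rests on two assumptions that should be checked against the source of the versions in use: that the default chunk is 10,000,000 bases per thread, and that -K replaces the product with a fixed number of bases. The 100 bp read length is also an assumption, and the helper names are hypothetical.

```python
DEFAULT_CHUNK_BASES = 10_000_000  # assumed per-thread default, not verified for every version

def effective_chunk_bases(threads, fixed_k=None):
    """Bases read per batch: the fixed -K value if given, else default times thread count."""
    if fixed_k is not None and fixed_k > 0:
        return fixed_k
    return DEFAULT_CHUNK_BASES * threads

def pairs_per_batch(threads, read_len=100, fixed_k=None):
    """Approximate number of read pairs per batch for paired reads of read_len bases."""
    return effective_chunk_bases(threads, fixed_k) // (2 * read_len)

for threads in (1, 20):
    print(threads, "thread(s), default chunk:", pairs_per_batch(threads), "pairs per batch")
    print(threads, "thread(s), -K 100000000:", pairs_per_batch(threads, fixed_k=100_000_000), "pairs per batch")
```

If batch-level statistics such as the insert-size estimate are computed per batch, then a different batch size can shift pairing decisions and mapping qualities for a small fraction of reads, while a fixed -K keeps the batches the same regardless of the thread count.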
or an alignment array issue with AVX2 |
Similar to #5: when I align my dataset with bwa 0.7.17-r1188 and compare the output with that of bwa mem2 (AVX2 version), I get slightly different results.
While I get the exact same alignment, I get a mapping quality score of 48 for bwa mem2 instead of 39 for bwa mem in this example. In other reads the switch happens in the opposite direction.
In another example...
I get different CIGARs of 9S41M50S (bwa mem2) and 6S44M50S (bwa mem). In my opinion both are wrong, and the alignment should at least start as 5SxxMyyS. Nevertheless, I still see a difference between the two programs.
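To show exactly how the two CIGAR strings differ, here is a minimal sketch that splits a CIGAR into its operations and sums the lengths. It follows the SAM convention that M, I, S, = and X consume query bases and that M, D, N, = and X consume reference bases. The helper names `parse_cigar` and `summarize` are made up for this example.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
QUERY_CONSUMING = set("MIS=X")   # operations that consume bases of the read
REF_CONSUMING = set("MDN=X")     # operations that consume bases of the reference

def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) tuples."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]

def summarize(cigar):
    """Return read length, reference span and soft-clip lengths implied by a CIGAR."""
    ops = parse_cigar(cigar)
    return {
        "query_len": sum(n for n, op in ops if op in QUERY_CONSUMING),
        "ref_len": sum(n for n, op in ops if op in REF_CONSUMING),
        "left_clip": ops[0][0] if ops and ops[0][1] == "S" else 0,
        "right_clip": ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0,
    }

mem2 = summarize("9S41M50S")   # bwa mem2
mem1 = summarize("6S44M50S")   # bwa mem
print(mem2)
print(mem1)
```

Both strings describe a 100-base read with the same 50-base clip on the right. They differ only in that bwa mem2 soft-clips three more bases on the left that bwa mem counts as aligned.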
Altogether, I do not find many differing alignments / mapping qualities:
11562 out of ~27 million reads or ~0.04%.
However, all of those changes seem to be caused by a difference between the two alignment tools. When I run bwa mem twice on my dataset, I do not get any difference between the two runs, so low-complexity regions with probabilistic mapping do not seem to be the issue here.
While bwa mem2 was ~2x as fast in my tests and its results appear mostly comparable to those of bwa mem, I cannot at the moment confirm the claim that it produces alignments identical to bwa.
Best regards,
Jan Rehker