problem of deduplication #512

Open
crazysummerW opened this issue Sep 20, 2023 · 0 comments
crazysummerW commented Sep 20, 2023

Hello,
I'm having an issue using sambamba 1.0.0 to deduplicate BAM files from paired-end NGS data.
After deduplication, the resulting BAM file still contains records that are completely identical: the same read ID and the same values in every field. This is causing problems in my structural variation (SV) analysis.

What could be the reason for this, and is there a way to resolve it? I would like the deduplicated BAM file to have unique read IDs.

Reads before deduplication:
`samtools view sample.sorted.bam chrY | grep E100074100L1C005R0181713731

E100074100L1C005R0181713731 129 chrY 21869316 60 108M34S chr1 181745266 0 CGTCGTGAGCGCATACACAGTGGACACAGGAATTTTGTGTCCCATTCCCACCAGGCTAGCAGTGGAGATGAAGTGAGACTGGGCTTTGGAGAGGTGAGGAGATGGGGCGGCCGAGGGGCCTACGCACCATGCTGCTCGGTCA DDDDDDDCCDDDDDDDDDDDDDDDDDDDDCDDCDDDDCDDCDDCDDDDDDDDDCDDDCCDDCCDCDDDDCDCDDDDCDDDDCCCDDDDCDDCCDDDCDDCDCCCCCCDDCCDB@DCCDCDDCDDCDDDCDCDCDCDDCDCDC NM:i:0 MD:Z:108 MC:Z:124M18S AS:i:108 XS:i:51 SA:Z:chr1,181745350,-,40M102S,60,0; RG:Z:DP19786-713309
E100074100L1C005R0181713731 129 chrY 21869316 60 108M34S chr1 181745266 0 CGTCGTGAGCGCATACACAGTGGACACAGGAATTTTGTGTCCCATTCCCACCAGGCTAGCAGTGGAGATGAAGTGAGACTGGGCTTTGGAGAGGTGAGGAGATGGGGCGGCCGAGGGGCCTACGCACCATGCTGCTCGGTCA DDDDDDDCCDDDDDDDDDDDDDDDDDDDDCDDCDDDDCDDCDDCDDDDDDDDDCDDDCCDDCCDCDDDDCDCDDDDCDDDDCCCDDDDCDDCCDDDCDDCDCCCCCCDDCCDB@DCCDCDDCDDCDDDCDCDCDCDDCDCDC NM:i:0 MD:Z:108 MC:Z:124M18S AS:i:108 XS:i:51 SA:Z:chr1,181745350,-,40M102S,60,0; RG:Z:DP19786-713309`

Reads after deduplication:
`samtools view sample.sorted.dedup.bam chrY | grep E100074100L1C005R0181713731

E100074100L1C005R0181713731 129 chrY 21869316 60 108M34S chr1 181745266 0 CGTCGTGAGCGCATACACAGTGGACACAGGAATTTTGTGTCCCATTCCCACCAGGCTAGCAGTGGAGATGAAGTGAGACTGGGCTTTGGAGAGGTGAGGAGATGGGGCGGCCGAGGGGCCTACGCACCATGCTGCTCGGTCA DDDDDDDCCDDDDDDDDDDDDDDDDDDDDCDDCDDDDCDDCDDCDDDDDDDDDCDDDCCDDCCDCDDDDCDCDDDDCDDDDCCCDDDDCDDCCDDDCDDCDCCCCCCDDCCDB@DCCDCDDCDDCDDDCDCDCDCDDCDCDC NM:i:0 MD:Z:108 MC:Z:124M18S AS:i:108 XS:i:51 SA:Z:chr1,181745350,-,40M102S,60,0; RG:Z:DP19786-713309
E100074100L1C005R0181713731 129 chrY 21869316 60 108M34S chr1 181745266 0 CGTCGTGAGCGCATACACAGTGGACACAGGAATTTTGTGTCCCATTCCCACCAGGCTAGCAGTGGAGATGAAGTGAGACTGGGCTTTGGAGAGGTGAGGAGATGGGGCGGCCGAGGGGCCTACGCACCATGCTGCTCGGTCA DDDDDDDCCDDDDDDDDDDDDDDDDDDDDCDDCDDDDCDDCDDCDDDDDDDDDCDDDCCDDCCDCDDDDCDCDDDDCDDDDCCCDDDDCDDCCDDDCDDCDCCCCCCDDCCDB@DCCDCDDCDDCDDDCDCDCDCDDCDCDC NM:i:0 MD:Z:108 MC:Z:124M18S AS:i:108 XS:i:51 SA:Z:chr1,181745350,-,40M102S,60,0; RG:Z:DP19786-713309`
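
To check this beyond a single read name, a minimal Python sketch along the following lines could count records that appear verbatim more than once in `samtools view` output and test whether the SAM duplicate bit (0x400) is set. It is not part of sambamba or samtools, and the function names `identical_records` and `is_marked_duplicate` are hypothetical. It assumes tab-separated SAM text as input. For the records above, FLAG 129 is 0x1 + 0x80, so the duplicate bit is not set on either copy.

```python
from collections import Counter

# SAM FLAG bit for "PCR or optical duplicate" (from the SAM specification).
DUPLICATE_FLAG = 0x400


def identical_records(sam_lines):
    """Return {record: count} for alignment lines that occur more than once verbatim.

    Header lines (starting with '@') and blank lines are ignored.
    """
    counts = Counter(
        line.rstrip("\n")
        for line in sam_lines
        if line.strip() and not line.startswith("@")
    )
    return {record: n for record, n in counts.items() if n > 1}


def is_marked_duplicate(record):
    """Return True if the FLAG field (column 2) has the duplicate bit set."""
    flag = int(record.split("\t")[1])
    return bool(flag & DUPLICATE_FLAG)
```

Passing `sys.stdin` as `sam_lines` lets it consume the output of `samtools view` through a pipe. Records that show up here with a count above 1 in both the input and the output BAM were already repeated before deduplication.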

Looking forward to your reply.
Thanks
