Single base exon in splice alignment #1099

Open
baraaorabi opened this issue Aug 8, 2023 · 1 comment

Comments

baraaorabi commented Aug 8, 2023

I have a long read (ONT) that I am aligning to the reference genome using Minimap2 in -x splice mode.
Minimap2 generates a 1 bp exon as part of the alignment of this read:

$ wget https://ftp.ensembl.org/pub/release-108/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.chromosome.10.fa.gz
$ mamba create -n minimap2-test -c bioconda minimap2==2.26
$ mamba activate minimap2-test
(minimap2-test) $ minimap2 -c --eqx -x splice -t 32 Homo_sapiens.GRCh38.dna.chromosome.10.fa.gz  \
  <(echo -e ">read1\nAATGTACTTCGGTTCAATTACGTATTGCTTCAGGTAGTCATTGATGA
CATCATCCACGGGTAATTGCTGAGGGAGAAGCTGGACGCAGTTTAAATGATCCTTATTGGCCATGTGCAATGGAGATAATCCATTGCTGATTTTGAAAGAATGGGGGTGCGAGCTCGATCAAGCAGCATTTCTACCACCTGCTCGTGGCCACATGCTCACAGTGCAGTGAGTGTCAGACCAT
CCTGGTTTTGGCATCGATTTTAGCTCCTCATCGAGCAATAGTTTTACCATATTTGCATTTATCCTTTTTGATGCAACATGTGGAGTTGATGTCATTCCTTGCAGTGAAATCACAGCAGCCATCGGTTTAACAGTCAACGTGGCTACATAGATATTTCCATAGTGAGCAGCTATGTAGGCAGG
TGAAGCCACTCTTTAGATTCCACATCTGCATTTTGTCATTCTGCAGCAGCAAGGCAGCGGCTTCGTGTCGTCCTTTTCGGGCCGCGATCATGAAGAGCTGGGAGACGCACTTTCCTTTGGTGTCATTCACTAGACAGGAGCAGAAACGACTTGGTCGTGACTTCATTGCAAAGCCACTGCCA
ATGGTGTAAGCCATCTCTGTGGCTAGGCTCTGGCTGCCCCATTATCAAGAAACTTGACAACTTCAGTGATTTTCCTGGAGCTGCCATATACAATGGCGTGAAATCATTCGGAGATTGTGCATTGGCATTGGCCTCCCATGTTAGTAACCAAGACTTTACCACCTCTGCGTGCCCAGCCAAAG
ATGCGATGTACCAATGCTGTGTCCTTATCTTTTAGCTGCATCCACATTAGCCTCTCTGCAGCAGCTCAGAAACAACCTCTATGGCCTTCTTTGGAAGAGTGGAGAGCGTTCAACCCATTCTGATTGCAAATGTTGATGTCAACTCCATTTTTTATGTAGGTCAGGAGCTTTTCAAGTTCAGC
TCGAGCTGCTCTTAAGTAGCTTGCATTGCGCATCAGACTTTTTCCTTCCTTTTCCTATGAGCGGGGTTGGATTTTCTCCTTTGGTTCTCCACATCTTTCCCTTTTAAATCCAGGTGAAGTTTACTTAAAGACAGGATCACCGCCGGTATTCCAATCATCCAAAACTGGCAGAGTCAAAGCAT
TGAGACCAAAGTCCATTTACGGGTGCTCCCAGGCAAAGCAGCCACTGCTTAAGCCAGTCCTGCTGTCAGAGAGGCGCGGCTGCTAATGTAGCCCTGTTCAGCTTTCCTCGGAAGAAACGGGGTGTTCCTCTATATCAAGCAGTGAATTATTCTTCTAAAAGAAAATCATATTTAACCTTTCA
TTTTTAACCAATCGTTACCCTGAGATAAGGTCTGACCAGACTTTGTTGCTTGTTCCTCCACGGTGGTTTGAGCCTGAAATGGTGGGAGAGAGAAGAGAGGAACACACACACACACCCTCACACACACAGAAATAAAGCTACAGACTGCAATACGTAACAAGACGAAAGT")

This generates the following PAF alignment:

read1       1490    29      1469    +       10      133797422       60208124        60572958        869     1483    60      NM:i:614        ms:i:645        AS:i:292        nn:i:0  ts:A:-  tp:A:P  cm:i:79 s1:i:433        s2:i:171        de:f:0.0833     rl:i:0  cg:Z:14=1X18=1D2=2D5=2X27=2D4=1X1=1D28=5178N1I2=2D1=1X16=2I2=1I14=1X25=3D1=1X4=1D12=1I12=1D21177N29=1D31=3X18=3D4=1I10=27072N6=1X7=1D10=1D1X12=1I14=1X26=1X1=1D2=1X1=1D11=1877N4=1I17=1D19=1X3=1X5=1D10=1I16=1I21=1D18=1X4=1I7=1I19=1X1=2X24=1D7=1D6110N20=1D3=1X5=1X2=3D17=1D2=1D13=1I24=1X4=8544N1=1X14=1X6=1I2=1I4=1I2=1I11=1D14=1X22=1I1X10=2D5=1I1=177N4=1D16=1X2=2D29=2D17=2D1=1D21=389N10=522I111186N1=182183N27=1I14=

Note the 1 bp match flanked by two splice gaps (111186N, 1=, 182183N). Why is there such a single-base-pair match? It looks like Minimap2 is forcing an extra splice into this read because the two surrounding splice gaps total 293,369 bp, which is more than 200 kbp.
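
For anyone who wants to check their own alignments for the same pattern, here is a minimal Python sketch that splits a cg:Z CIGAR string at N operations and reports very short exons together with the lengths of their flanking splice gaps. The function names (split_exons, micro_exons) and the max_len threshold are illustrative assumptions, not part of Minimap2; the input is the tail of the CIGAR string from the PAF line above.

```python
import re

# One CIGAR operation: a length followed by an operator character.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def split_exons(cigar):
    """Split a CIGAR string at N (splice gap) operations.

    Returns (exons, introns): the reference length of each exon block
    and the length of each N gap between consecutive blocks.
    """
    exons, introns = [], []
    ref_len = 0
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op == "N":
            # Close the current exon block and record the gap.
            exons.append(ref_len)
            introns.append(n)
            ref_len = 0
        elif op in "M=XD":
            # These operations consume reference bases inside an exon.
            ref_len += n
    exons.append(ref_len)
    return exons, introns


def micro_exons(cigar, max_len=3):
    """List exons of at most max_len reference bases.

    max_len is an arbitrary, hypothetical threshold. Each hit is
    (exon_index, exon_len, left_gap, right_gap); a gap is None when the
    exon is the first or last block of the alignment.
    """
    exons, introns = split_exons(cigar)
    hits = []
    for i, exon_len in enumerate(exons):
        if exon_len <= max_len:
            left = introns[i - 1] if i > 0 else None
            right = introns[i] if i < len(introns) else None
            hits.append((i, exon_len, left, right))
    return hits


# Tail of the cg:Z tag from the PAF record above.
tail = "21=389N10=522I111186N1=182183N27=1I14="
for index, exon_len, left, right in micro_exons(tail):
    print(index, exon_len, left, right, left + right)
```

On this tail the only hit is the 1 bp block, with gaps of 111186 and 182183 on either side. The same function can be run over the full cg:Z value of every PAF record to see how often such blocks occur.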

lh3 (Owner) commented Aug 12, 2023

That is probably the case, I guess.
