Hi,
I am wondering whether SAMBA outputs the changes it made to the assembly (i.e. which sequences have been inserted, changed, or removed)?
I would also be fine getting that information from the intermediate files. Is the format of these files documented somewhere?
I am guessing that the .patches.uniq.links.txt file holds the changes to the assembly, is that correct?
I also do not fully understand the format.
Every line contains the following columns: <ctg1>.<pos1> <num1> <str1> <ctg2>.<pos2> <num2> <str2> <len> <seq>
Does each line then mean that <seq> has been placed between <ctg1>.<pos1> and <ctg2>.<pos2>?
If the two strands differ, would it reverse-complement one of the contigs?
What do <num1> and <num2> stand for?
Thanks,
Markus
No, from the output files I could not figure it out.
However, I am running SAMBA in the mode where it is only allowed to fill in gaps.
So I used this information to write a script that matches the contigs (here: continuous sequences between gaps) of the input assembly against the output assembly.
Since the contigs are not changed at all, you do not even need an aligner here; an exact string match (e.g. str.index in Python) is enough.
Then, knowing where the input contigs are located in the output, you can reconstruct the size and position of the filled-in gaps.
Oh, and you have to make sure to cut away the first and last 1000 bp (the -o parameter of SAMBA) of the input contigs before matching, since SAMBA will mess with these.
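A minimal sketch of the matching approach described above, not the actual script. It assumes gaps in the input are runs of N, that each input scaffold is compared against the corresponding output sequence in the same orientation, and that both are already loaded as plain strings. The function names and the returned fields are hypothetical; `trim` stands in for the 1000 bp cut mentioned above. It uses `str.find` instead of `str.index` so that a missing contig returns -1 rather than raising.

```python
import re

TRIM = 1000  # assumed to match the 1000 bp (-o) value mentioned above


def split_contigs(seq):
    """Yield (start, contig) for each run of non-N bases in a scaffold."""
    for m in re.finditer(r"[^Nn]+", seq):
        yield m.start(), m.group()


def locate_contigs(input_seq, output_seq, trim=TRIM):
    """Find each trimmed input contig in the output by exact string match.

    Returns (in_start, in_end, out_start, out_end) for every trimmed
    contig core that was found, in input order.
    """
    hits = []
    cursor = 0  # search left to right so repeated sequences stay in order
    for start, contig in split_contigs(input_seq):
        core = contig[trim:len(contig) - trim]
        if not core:
            continue  # contig too short to survive trimming
        pos = output_seq.find(core, cursor)
        if pos == -1:
            continue  # not found; orientation or content may differ
        in_start = start + trim
        hits.append((in_start, in_start + len(core), pos, pos + len(core)))
        cursor = pos + len(core)
    return hits


def filled_gaps(input_seq, output_seq, trim=TRIM):
    """Describe the region between consecutive matched cores.

    Each span still contains the trimmed flanks (about 2 * trim bases),
    so output_span - input_span approximates the change in gap size.
    """
    hits = locate_contigs(input_seq, output_seq, trim)
    gaps = []
    for left, right in zip(hits, hits[1:]):
        gaps.append({
            "out_start": left[3],
            "out_end": right[2],
            "input_span": right[0] - left[1],
            "output_span": right[2] - left[3],
        })
    return gaps
```

If a contig is skipped (too short or not found), the reported region covers more than one input gap, so the result is worth checking against the number of N runs in the input.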