This repository has been archived by the owner on Dec 8, 2022. It is now read-only.
When loading a seq via bio.FASTA, comparisons often fail because s'a' != s'A' (and most FASTA files are soft-masked and thus contain lots of lowercase letters). One has to work around this by doing seq = seq(str(seq).upper()).
seq = str does not work
seq1 + seq1 does not work
seq1 + str1 does not work
How do you get a k-mer from a sequence? k = Kmer[20](s)?
How do you get a sequence from a k-mer? I can get string via str(k), but not a sequence (seq(k) fails).
Many slicing operators do not work on seqs and Kmers, which greatly reduces their usability.
This is because a sequence is essentially just a string internally right now. Maybe we should have stricter requirements on what can be included in a sequence (i.e. just IUPAC uppercase characters? -- that would require converting when we read sequence data from disk).
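To make the stricter option concrete, here is a rough sketch in plain Python (not Seq code, and not how Seq stores sequences) of normalizing on read. The helper name `normalize_seq` and the exact alphabet are assumptions made for this example; it uppercases soft-masked bases and rejects anything outside the IUPAC nucleotide codes.

```python
# Hypothetical helper for illustration only; not part of Seq or its bio module.
IUPAC_NUCLEOTIDES = frozenset("ACGTURYSWKMBDHVN")


def normalize_seq(raw: str) -> str:
    """Uppercase a soft-masked sequence and validate it against IUPAC codes."""
    upper = raw.upper()
    invalid = set(upper) - IUPAC_NUCLEOTIDES
    if invalid:
        raise ValueError(f"non-IUPAC characters: {sorted(invalid)}")
    return upper
```

After normalization the soft-masked and uppercase forms of the same read compare equal, at the cost of throwing away the masking information.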
TBH I don't think + and = should be overloaded for seq+str -- they are different types and they should be treated differently IMO. If this is really needed then I think an explicit seq1 + seq(str2) is better -- just my opinion. seq1 + seq2 is something we could support pretty easily.
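As a rough illustration of that stance, here is a plain-Python analogy (the toy `Seq` class below is hypothetical and is not Seq's real `seq` type): `+` is defined only between two sequences, so mixing in a `str` requires an explicit conversion.

```python
class Seq:
    """Toy stand-in for a sequence type; not Seq's real `seq`."""

    def __init__(self, bases: str):
        self.bases = bases

    def __add__(self, other):
        # Only seq + seq is defined; seq + str ends in a TypeError.
        if isinstance(other, Seq):
            return Seq(self.bases + other.bases)
        return NotImplemented

    def __eq__(self, other):
        return isinstance(other, Seq) and self.bases == other.bases

    def __str__(self):
        return self.bases
```

Here `Seq("ACG") + Seq("T")` works, `Seq("ACG") + "T"` raises a `TypeError`, and the explicit form `Seq("ACG") + Seq("T")` is the only way to append string data.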
k = Kmer[20](s) is right for that. seq(k) to get a sequence from a k-mer is something we should probably add too.
We can also support more slices on seq. On Kmer it's a lot harder since slices change the type: e.g. k[:3] is of type Kmer[3] and k[:4] is of type Kmer[4] -- not sure what the best way to handle this is. Longer-term I'd prefer to unify k-mer types into a single type and have the compiler deduce and optimize cases where the k-mer length is constant.
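To show what a unified k-mer type could look like, here is a minimal plain-Python sketch (not Seq code; the `Kmer` class, its 2-bit encoding, and its field names are assumptions for this example). The length `k` is a runtime field rather than part of the type, so every slice returns the same type. It does not attempt the compile-time optimization for constant lengths mentioned above.

```python
from dataclasses import dataclass

_ENCODE = {"A": 0, "C": 1, "G": 2, "T": 3}
_DECODE = "ACGT"


@dataclass(frozen=True)
class Kmer:
    """Toy unified k-mer: 2-bit packed bases plus a runtime length k."""

    bits: int
    k: int

    @classmethod
    def from_str(cls, s: str) -> "Kmer":
        bits = 0
        for base in s:
            bits = (bits << 2) | _ENCODE[base]
        return cls(bits, len(s))

    def __getitem__(self, idx):
        if not isinstance(idx, slice):
            raise TypeError("only slicing is sketched here")
        start, stop, step = idx.indices(self.k)
        if step != 1:
            raise ValueError("only contiguous slices are supported")
        n = max(stop - start, 0)
        # Drop the bases to the right of `stop`, then keep the lowest n bases.
        shifted = self.bits >> (2 * (self.k - stop))
        return Kmer(shifted & ((1 << (2 * n)) - 1), n)

    def __str__(self):
        return "".join(
            _DECODE[(self.bits >> (2 * (self.k - 1 - i))) & 3]
            for i in range(self.k)
        )
```

With this layout `k[:3]` and `k[:4]` are both plain `Kmer` values that differ only in their `k` field, which sidesteps the problem of slices changing the type.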