I have run kmdiff and identified kmers that are overrepresented in one of two groups. I then created a membership matrix to identify the kmers present in all my case samples, and intersected these with the overrepresented kmers reported by kmdiff. Now I would like to get the counts of these kmers in each of my case samples. I already have the count matrices produced by kmdiff. Dumping these to text and grepping them is one way of doing it, but not a very efficient one. What would you recommend here? Unfortunately my C++ is terrible.
I posted this question before I understood the merge and aggregate commands. In case someone else has the same issue, I solved it by doing the following:
The first command creates a matrix of the kmers that occur in at least as many samples as I have cases (N_CASES), and the second command dumps this matrix to a text file. I then grepped count-matrix.out for the list of kmers I had identified previously.
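For anyone who prefers a script over grep for that last step, here is a minimal Python sketch. It assumes each line of the dumped text matrix holds a kmer followed by whitespace-separated integer counts, one per sample; I have not checked this against the exact dump format, so adjust the parsing if yours differs. The function name and the demo data are made up for illustration.

```python
import io

def filter_count_matrix(matrix_lines, kmers_of_interest):
    """Yield (kmer, counts) for rows whose kmer is in the list of interest.

    Assumed line layout (not verified against the real dump):
        <kmer> <count_sample_1> <count_sample_2> ...
    """
    wanted = set(kmers_of_interest)  # set lookup keeps this a single pass
    for line in matrix_lines:
        fields = line.split()
        if fields and fields[0] in wanted:
            yield fields[0], [int(c) for c in fields[1:]]

# Tiny in-memory stand-in for count-matrix.out (made-up values).
demo_matrix = io.StringIO("AAAC 3 0 5\nACGT 2 4 1\nTTGA 0 0 7\n")
demo_hits = dict(filter_count_matrix(demo_matrix, ["ACGT", "TTGA"]))
print(demo_hits)
```

With a real file, pass the open file handle instead of the `io.StringIO` object. If the matrix stores kmers in canonical form, the kmers in the list need to be in the same form to match.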
Note: using this count matrix it should be possible to find these kmers without creating the membership matrix.
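A rough sketch of how that could look, under the same assumption about the text layout (kmer first, then one integer count per sample). Which columns belong to the case samples is something you have to supply yourself; the 0-based `case_columns` indices and the `min_count` threshold below are hypothetical parameters, not kmdiff options.

```python
def kmers_in_all_cases(matrix_lines, case_columns, min_count=1):
    """Yield (kmer, counts) for kmers seen in every case sample.

    case_columns: 0-based indices into the count columns (assumed to be
    known from the order in which the samples were given).
    """
    for line in matrix_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        counts = [int(c) for c in fields[1:]]
        if all(counts[i] >= min_count for i in case_columns):
            yield fields[0], counts

# Made-up rows: three samples, the first two treated as cases.
demo_rows = ["AAAC 3 0 5", "ACGT 2 4 1", "TTGA 0 0 7"]
demo_shared = [kmer for kmer, _ in kmers_in_all_cases(demo_rows, [0, 1])]
print(demo_shared)
```

The resulting kmers can then be intersected with the overrepresented kmers from kmdiff, which replaces the separate membership matrix step.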