I have been running projection on a reconstructed pangenome and a set of assembly FASTA files for input genomes, in order to assign each gene of each input genome to a gene family in the pangenome.
I tried consulting the documentation about the output of projection, but the link doesn't seem to go anywhere (https://github.com/labgem/PPanGGOLiN/blob/f3ba6a1f33256f19175b570c4b711bb8970d0365/docs/user/projection.md).
The documentation states that gene_to_gene_family.tsv "provides the mapping of genes to gene families of the pangenome." I was expecting one line per gene of an input genome, each line indicating the gene family in the reconstructed pangenome to which that gene is assigned. But this isn't what I got. Instead, I got files with hundreds of thousands of lines, even though an input genome contains only 2.5k to 2.9k genes.
Any clarification would be much appreciated. Thank you in advance.
Indeed, you are right that the current behavior is not the one that was intended. I see where the bug is. Currently, the "gene_to_gene_family.tsv" file contains this information for ALL given input genomes, not just the single input genome. The file is likely identical across the different "input genome" output directories. We'll get a fix for this in the upcoming version.
Thank you for the explanation. I checked, for a few input genomes, whether the file is identical across the different "input genome" output directories, but that didn't seem to be the case. I look forward to the updated version. Thank you.
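For anyone who wants to repeat the check described above, here is a minimal Python sketch that groups the gene_to_gene_family.tsv files by content hash and reports their line counts. It assumes only that each input genome has its own output directory containing a file with that name; the function name and the root path in the usage comment are hypothetical, and nothing here relies on the column layout of the file.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def summarize_mapping_files(root, filename="gene_to_gene_family.tsv"):
    """Group every copy of `filename` found under `root` by content hash.

    Returns a dict mapping a SHA-256 hex digest to a list of
    (parent directory name, number of lines) tuples.
    """
    groups = defaultdict(list)
    # Search recursively so the exact nesting of output directories does not matter.
    for path in sorted(Path(root).rglob(filename)):
        data = path.read_bytes()
        digest = hashlib.sha256(data).hexdigest()
        groups[digest].append((path.parent.name, len(data.splitlines())))
    return dict(groups)


# Hypothetical usage; replace the path with your own projection output directory:
# for digest, members in summarize_mapping_files("projection_output").items():
#     print(digest[:12], members)
```

If every directory ends up in a single group, the files are byte-for-byte identical. Several groups mean the contents differ, and the line counts show by how much.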