Make the --region parameter better match the output of PGAP #288

Open
Dx-wmc opened this issue May 15, 2024 · 1 comment
Labels
enhancement New feature or request

Comments


Dx-wmc commented May 15, 2024

While using Bakta, I noticed that a known gene was not predicted, although both Prokka and PGAP predicted it correctly. I tried to work around this with the --region option. Prokka's gbk file is read without problems, but PGAP's file causes problems. Looking at the gbk file that PGAP generates, I found that some of its gene fragments have lengths that are not multiples of 3 (these genes are labeled as pseudogenes). After I removed these genes, the run succeeded. However, editing the files by hand is too costly when applying PGAP annotations to Bakta in bulk. I would therefore like --region to automatically detect and skip these problematic genes, which would make this workflow much more efficient.
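Until something like this exists in Bakta itself, the manual cleanup described above could probably be scripted. The following is only a minimal sketch of such a pre-filter, not Bakta code: the function names `location_length` and `drop_non_triplet_cds` are hypothetical, it assumes the standard GenBank flat-file layout (feature keys indented by 5 spaces, qualifiers by 21), and it removes only the CDS feature itself, so a matching `gene` feature stays in the file.

```python
import re

# A feature line: 5 spaces, the feature key, then the location.
FEATURE_START = re.compile(r"^ {5}(\S+)\s+(\S.*)$")


def location_length(location):
    """Sum the lengths of all 'start..end' spans in a GenBank location."""
    spans = re.findall(r"<?(\d+)\.\.>?(\d+)", location)
    return sum(int(end) - int(start) + 1 for start, end in spans)


def drop_non_triplet_cds(lines):
    """Return the lines without CDS features whose length is not a multiple of 3."""
    kept, block = [], []
    in_features = False

    def flush():
        # Decide whether the feature collected in `block` is kept.
        if not block:
            return
        match = FEATURE_START.match(block[0])
        location = match.group(2)
        # A location may continue on following lines, up to the first qualifier.
        for extra in block[1:]:
            text = extra.strip()
            if text.startswith("/"):
                break
            location += text
        bad_cds = match.group(1) == "CDS" and location_length(location) % 3 != 0
        if not bad_cds:
            kept.extend(block)
        block.clear()

    for line in lines:
        if line.startswith("FEATURES"):
            in_features = True
            kept.append(line)
        elif in_features and FEATURE_START.match(line):
            flush()
            block.append(line)
        elif in_features and block and line.startswith(" " * 21):
            block.append(line)
        else:
            flush()
            if in_features and not line.startswith(" "):
                in_features = False  # e.g. the ORIGIN line ends the feature table
            kept.append(line)
    flush()
    return kept
```

It could be applied with something like `open("filtered.gbk", "w").writelines(drop_non_triplet_cds(open("pgap.gbk")))` (file names are placeholders) before passing the filtered file to --region.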

Dx-wmc added the enhancement label May 15, 2024

Dx-wmc commented May 16, 2024

Another point of confusion for me: is it not possible to use a gbk file with the --region parameter on a different genome? I have manually annotated a gbk file, but when I try to apply it to another genome, the --region parameter reports an error.
