issues with reference list file #99

Open
rosave9 opened this issue May 27, 2022 · 0 comments

Hi,
I'm having trouble using a reference list that I downloaded from enve-omics.ce.gatech.edu/data/fastani
(the D1 dataset).

I opened it, converted it to a .txt file, and saw this when I viewed it in TextEdit:

gi|918757418|ref|NZ_CP011489.1| Actinobacteria bacterium IMCC26256, complete genome
AACGTGGGGCAATATGAGTTCTCCACAGAGCGCATCAAGGGCCCCTACCATGTTTTCAGGCTGGGGACAA
CCTCGCAAACTCGCTTAGTTAGCGGCTCTCCGAGGTTATCCACCGTTCGCTATTCGGGCGCTAGTTTGCT (......)
next genome
NUCLEOTIDE-SEQUENCE (...)
etc

Then I opened it on the command line and saw:

Acetobacter_ghanensis.LargeContigs.fna
Acetobacterium_woodii_DSM_1030.LargeContigs.fna
Acetobacter_pasteurianus_IFO_3283_01.LargeContigs.fna
Acetobacter_senegalensis.LargeContigs.fna
etc

So I tried a few things:

  1. I concatenated all the .fna files into a single .txt file using
    cat *.fna > database.txt
    and passed that .txt file as my --rf, but that failed.

  2. I copied the genome file names (see below) into a single .txt file and passed that as my --rf, but it still failed.
    Acetobacter_ghanensis.LargeContigs.fna
    Acetobacterium_woodii_DSM_1030.LargeContigs.fna
    Acetobacter_pasteurianus_IFO_3283_01.LargeContigs.fna
    Acetobacter_senegalensis.LargeContigs.fna
    and more...

  3. To the file from attempt 2, I added the full path for each file, e.g. /nfs/turbo/lsi-NPDC/D1-folder/Acetobacter.. etc,
    and passed that .txt file as my --rf, but it failed again.

  4. I tried passing the original D1.tar.gz file as my --rf, but that also failed.
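
For comparison with the attempts above, here is a minimal sketch of one way to build a list file, assuming the tool is FastANI and that (as its README describes) a reference list is a plain-text file holding one genome file path per line, passed with `--rl` rather than `--rf`. The helper name `make_ref_list` and the example paths are hypothetical, not part of any tool.

```shell
# Build a reference list: one absolute path per line, one line per genome file.
# make_ref_list is a hypothetical helper name used only for illustration.
make_ref_list() {
  local genome_dir="$1" out_file="$2"
  local abs_dir f
  # Resolve the directory to an absolute path so the list works from any
  # working directory.
  abs_dir="$(cd "$genome_dir" && pwd)" || return 1
  # Write the path of every regular .fna file directly inside the directory.
  for f in "$abs_dir"/*.fna; do
    if [ -f "$f" ]; then
      printf '%s\n' "$f"
    fi
  done > "$out_file"
}

# Example (placeholder paths; the archive has to be unpacked first):
#   tar -xzf D1.tar.gz
#   make_ref_list ./D1 reference_list.txt
```

The list file then contains paths only, no sequence data, which differs from the concatenated `database.txt` in attempt 1 and from the bare file names in attempt 2.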

Could I see an example of an --rf file, or could I get guidance on how to use a downloaded NCBI prokaryote reference genome database in my own pipeline?
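
One way to sanity-check a candidate list file before handing it to the tool is sketched below. It assumes the tool expects a plain-text file with one genome file path per line (which is how the FastANI README describes its `--rl` reference list option). The helper name `check_ref_list` is hypothetical, and relative paths are resolved against the current working directory.

```shell
# check_ref_list is a hypothetical helper, not part of any tool.
# It prints each problem it finds and returns non-zero if the list looks unusable.
check_ref_list() {
  local list_file="$1" first="" line="" status=0
  # A list file should hold paths, not sequence data: a leading '>' suggests
  # that a FASTA file was passed by mistake.
  IFS= read -r first < "$list_file" || true
  case "$first" in
    '>'*)
      echo "list file looks like FASTA, not a list of paths: $list_file"
      return 1
      ;;
  esac
  # Every non-blank line must name an existing, readable regular file.
  while IFS= read -r line || [ -n "$line" ]; do
    if [ -z "$line" ]; then
      continue
    fi
    if [ ! -f "$line" ] || [ ! -r "$line" ]; then
      echo "missing or unreadable: $line"
      status=1
    fi
  done < "$list_file"
  return "$status"
}

# Example (placeholder file name):
#   check_ref_list reference_list.txt && echo "list looks usable"
```

A list of bare file names, such as `Acetobacter_ghanensis.LargeContigs.fna`, passes this check only when the command is run from the directory that holds those files, which may explain why a list without paths fails.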
Thanks

rosave9 changed the title from "example fo" to "issues with reference list file" on May 27, 2022