Aborted (core dumped) with LTR digest #995

Open
omar-almolla209 opened this issue Nov 6, 2021 · 6 comments

Comments

@omar-almolla209

Problem description

When I run LTRdigest, the following error always appears (it also appears in RStudio when running LTRdigest through the LTRpred package):

This is a bug, please report it at
https://github.com/genometools/genometools/issues
Please make sure you are running the latest release which can be found at
http://genometools.org/pub/
You can check your version number with gt -version.
Aborted (core dumped)

Exact command line call triggering the problem

#PATH:
proteins="/home/omar-almulla/Downloads/"
genome="/home/omar-almulla/Desktop/Prunus_TE_project/INPUT/genomes/"
gff3="/home/omar-almulla/Desktop/Prunus_TE_project/OUTPUT/EDTA_outputs/20-WGS-PCE.2.0/20-WGS-PCE.2.0_shortIDs.fasta.mod.EDTA.raw/LTR/"

gt ltrdigest -hmms $proteins/Pfam-A.hmm -aaout -outfileprefix ltrs_sorted -seqfile $genome/20-WGS-PCE.2.0_shortIDs.fasta -matchdescstart < $gff3/LTR/ltrs_sorted.gff3 > ltrdigest.gff3

What GenomeTools version are you reporting an issue for (as output by gt -version)?

gt (GenomeTools) 1.6.2
Copyright (c) 2003-2016 G. Gremme, S. Steinbiss, S. Kurtz, and CONTRIBUTORS
Copyright (c) 2003-2016 Center for Bioinformatics, University of Hamburg
See LICENSE file or http://genometools.org/license.html for license details.

Used compiler: cc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Compile flags: -g -Wall -Wunused-parameter -pipe -fPIC -Wpointer-arith -Wno-unknown-pragmas -O3 -Werror

What operating system (e.g. Ubuntu, Mac OS X), OS version (e.g. 15.10, 10.11) and platform (e.g. x86_64) are you using?

Ubuntu 20.04

@satta
Member

satta commented Nov 11, 2021

Are you sure there are no more lines before the "This is a bug" line? I would need those to locate the issue, as they describe the error context.
Also, would you be OK with sharing some of your input files to help reproduce the problem? Thanks!

@omar-almolla209
Author

Thanks for the reply. For now I have worked around the problem by giving LTRdigest the output obtained from EDTA version 1.6. The error above appeared only when using the EDTA version 1.9 outputs.
Unfortunately, I am not allowed to make the input files public as they are in the process of being published.
Anyway, with these changes everything works:

tRNAs="/home/omar-almulla/Desktop/Prunus_TE_project/INPUT/Hmm_trna"
proteins="/home/omar-almulla/Desktop/Prunus_TE_project/INPUT/Hmm_trna"
genome='/home/omar-almulla/Desktop/Prunus_TE_project/INPUT/genomes/Prunus_avium_NCBI'
EDTA_output_1_6_path="/home/omar-almulla/Desktop/Prunus_TE_project/OUTPUTS/EDTA_output/Prunus_avium_NCBI/EDTA_1.6_output/Prunus_avium_NCBI_genomic.fna.EDTA.raw"
output="/home/omar-almulla/Desktop/Prunus_TE_project/OUTPUTS/LTRdigest_output/Prunus_avium_NCBI"

gt -j 4 ltrdigest -outfileprefix Prunus_avium_NCBI_ltr -trnas $tRNAs/plants-tRNA_cat.fa -hmms $proteins/hmm_* -seqfile $genome/Prunus_avium_NCBI_genomic.fna -matchdescstart $EDTA_output_1_6_path/Prunus_avium_NCBI_genomic.fna.LTR.intact.fa_SORTED_.1.6.gff3 > $output/Prunus_avium_NCBI_digest.gff

@satta
Member

satta commented Nov 20, 2021

I see. I'll keep this one open, but I cannot do much without the test data. Unfortunately I am not familiar with EDTA or LTRpred, but perhaps that tool creates an unusual GFF3 structure?

Anyway, could you please still share the line you got before the "this is a bug, please report" line, if that's OK for you? It should contain something like "Assertion failed: ..." and would at least help us place the error somewhere, and also make this issue searchable for others with a similar problem.
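
The commands in this thread redirect only standard output to a file, so anything the tool prints on standard error goes to the terminal and is easy to lose, especially when the call is made from a wrapper such as an R session. A minimal sketch for keeping both streams follows. The helper name `run_and_capture`, the file names, and the stand-in command are assumptions made for illustration; none of them are part of GenomeTools.

```shell
#!/bin/sh
# Hypothetical helper (not part of GenomeTools): run a command, write its
# stdout and stderr to separate files, then show the end of the stderr log.
run_and_capture() {
  out_file=$1
  err_file=$2
  shift 2
  "$@" > "$out_file" 2> "$err_file"
  status=$?
  printf 'exit status: %s\n' "$status" >&2
  tail -n 20 "$err_file" >&2
  return "$status"
}

# Stand-in command for demonstration only: it writes one line to each stream
# and exits with status 3. Replace it with the real gt ltrdigest call.
demo_dir=$(mktemp -d)
demo_status=0
run_and_capture "$demo_dir/stdout.gff3" "$demo_dir/stderr.log" \
  sh -c 'echo "demo output"; echo "demo error context" >&2; exit 3' \
  || demo_status=$?
```

With the real call, the arguments after the two file names would be the full `gt ltrdigest ...` command line, and the resulting stderr log is the file to attach to the issue.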

@omar-almolla209
Author

My script:

gt -j 4 ltrdigest -outfileprefix Prunus_avium_ltr -trnas ./INPUT/Hmm_trna/plants-tRNA_cat.fa -hmms ./INPUT/Hmm_trna/hmm_* -seqfile ./INPUT/genomes/Prunus_avium_NCBI/Prunus_avium_NCBI.fna -matchdescstart ./OUTPUTS/EDTA_output/Prunus_avium_NCBI/EDTA_1.9_output/Prunus_avium_NCBI.fna.mod.EDTA.raw/*SORTED.gff3 > Prunus_avium_digest.gff

I could not reproduce the same error. Now this appears instead:

Segmentation fault (core dumped)

@omar-almolla209
Author

##gff-version 3
##sequence-region CM024352.1 1 62324707
##sequence-region CM024353.1 1 46928806
##sequence-region CM024354.1 1 42862123
##sequence-region CM024355.1 1 37373756
##sequence-region CM024356.1 1 41299679
##sequence-region CM024357.1 1 42624765
##sequence-region CM024358.1 1 30632009
##sequence-region CM024359.1 1 38835769
##sequence-region JAAOZG010000014 1 51232
##sequence-region JAAOZG010000020 1 36342
##sequence-region JAAOZG010000023 1 31182
##sequence-region JAAOZG010000027 1 27350
##sequence-region JAAOZG010000035 1 22413
##sequence-region JAAOZG010000061 1 97395
CM024352.1 EDTA repeat_region 191737 201094 . ? . ID=repeat_region1;name=CM024352.1:191742..201089;classification=LTR/unknown;sequence_ontology=SO:0000657;ltr_identity=0.9959;mathod=structural;motif=TGCA;tsd=TCCAT
CM024352.1 EDTA target_site_duplication 191737 191741 . ? . Parent=repeat_region1;name=CM024352.1:191742..201089;classification=LTR/unknown;sequence_ontology=SO:0000434;ltr_identity=0.9959;mathod=structural;motif=TGCA;tsd=TCCAT
CM024352.1 EDTA long_terminal_repeat 191742 193449 . ? . Parent=repeat_region1;name=CM024352.1:191742..201089;classification=LTR/unknown;sequence_ontology=SO:0000286;ltr_identity=0.9959;mathod=structural;motif=TGCA;tsd=TCCAT
CM024352.1 EDTA LTR_retrotransposon 191742 201089 . ? . Parent=repeat_region1;name=CM024352.1:191742..201089;classification=LTR/unknown;sequence_ontology=SO:0000186;ltr_identity=0.9959;mathod=structural;motif=TGCA;tsd=TCCAT
CM024352.1 EDTA long_terminal_repeat 199383 201089 . ? . Parent=repeat_region1;name=CM024352.1:191742..201089;classification=LTR/unknown;sequence_ontology=SO:0000286;ltr_identity=0.9959;mathod=structural;motif=TGCA;tsd=TCCAT
CM024352.1 EDTA target_site_duplication 201090 201094 . ? . Parent=repeat_region1;name=CM024352.1:191742..201089;classification=LTR/unknown;sequence_ontology=SO:0000434;ltr_identity=0.9959;mathod=structural;motif=TGCA;tsd=TCCAT

CM024352.1 EDTA repeat_region 1617430 1629426 . ? . ID=repeat_region2;name=CM024352.1:1617435..1629421;classification=LTR/Gypsy;sequence_ontology=SO:0000657;ltr_identity=1.0000;mathod=structural;motif=TGCA;tsd=CCAAT
CM024352.1 EDTA target_site_duplication 1617430 1617434 . ? . Parent=repeat_region2;name=CM024352.1:1617435..1629421;classification=LTR/Gypsy;sequence_ontology=SO:0000434;ltr_identity=1.0000;mathod=structural;motif=TGCA;tsd=CCAAT
CM024352.1 EDTA long_terminal_repeat 1617435 1619599 . ? . Parent=repeat_region2;name=CM024352.1:1617435..1629421;classification=LTR/Gypsy;sequence_ontology=SO:0000286;ltr_identity=1.0000;mathod=structural;motif=TGCA;tsd=CCAAT
CM024352.1 EDTA Gypsy_LTR_retrotransposon 1617435 1629421 . ? . Parent=repeat_region2;name=CM024352.1:1617435..1629421;classification=LTR/Gypsy;sequence_ontology=SO:0002265;ltr_identity=1.0000;mathod=structural;motif=TGCA;tsd=CCAAT
CM024352.1 EDTA long_terminal_repeat 1627258 1629421 . ? . Parent=repeat_region2;name=CM024352.1:1617435..1629421;classification=LTR/Gypsy;sequence_ontology=SO:0000286;ltr_identity=1.0000;mathod=structural;motif=TGCA;tsd=CCAAT
CM024352.1 EDTA target_site_duplication 1629422 1629426 . ? . Parent=repeat_region2;name=CM024352.1:1617435..1629421;classification=LTR/Gypsy;sequence_ontology=SO:0000434;ltr_identity=1.0000;mathod=structural;motif=TGCA;tsd=CCAAT

CM024352.1 EDTA repeat_region 1946186 1956558 . ? . ID=repeat_region3;name=CM024352.1:1946191..1956553;classification=LTR/unknown;sequence_ontology=SO:0000657;ltr_identity=0.9991;mathod=structural;motif=TGCA;tsd=GTAAT
CM024352.1 EDTA target_site_duplication 1946186 1946190 . ? . Parent=repeat_region3;name=CM024352.1:1946191..1956553;classification=LTR/unknown;sequence_ontology=SO:0000434;ltr_identity=0.9991;mathod=structural;motif=TGCA;tsd=GTAAT
CM024352.1 EDTA long_terminal_repeat 1946191 1948386 . ? . Parent=repeat_region3;name=CM024352.1:1946191..1956553;classification=LTR/unknown;sequence_ontology=SO:0000286;ltr_identity=0.9991;mathod=structural;motif=TGCA;tsd=GTAAT
CM024352.1 EDTA LTR_retrotransposon 1946191 1956553 . ? . Parent=repeat_region3;name=CM024352.1:1946191..1956553;classification=LTR/unknown;sequence_ontology=SO:0000186;ltr_identity=0.9991;mathod=structural;motif=TGCA;tsd=GTAAT
CM024352.1 EDTA long_terminal_repeat 1954358 1956553 . ? . Parent=repeat_region3;name=CM024352.1:1946191..1956553;classification=LTR/unknown;sequence_ontology=SO:0000286;ltr_identity=0.9991;mathod=structural;motif=TGCA;tsd=GTAAT
CM024352.1 EDTA target_site_duplication 1956554 1956558 . ? . Parent=repeat_region3;name=CM024352.1:1946191..1956553;classification=LTR/unknown;sequence_ontology=SO:0000434;ltr_identity=0.9991;mathod=structural;motif=TGCA;tsd=GTAAT

@satta
Member

satta commented Dec 20, 2021

I am afraid the GFF3 file is not enough for me to replicate the issue; I would also need the other files (sequence FASTA and tRNA files). Basically, I need a way to trigger the error on my side with your command line call. Thanks!
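
Since the input files cannot be shared, one option is to compare the structure of the EDTA 1.6 and EDTA 1.9 GFF3 files locally and post only the summaries. The sketch below counts features by type, strand, and whether the attribute column carries `ID=` or `Parent=`. The function name `summarise_gff3` is hypothetical, and the demo records are taken from the excerpt posted in this thread with their attributes trimmed to the `ID`/`Parent` entry. In that excerpt every feature has strand `?`, and `LTR_retrotransposon` and `long_terminal_repeat` are both direct children of `repeat_region`. Whether either property matters to LTRdigest is not established in this thread.

```shell
#!/bin/sh
# Hypothetical helper: count GFF3 features by type, strand, and whether the
# attribute column carries ID= or Parent=.
# Assumes columns are separated by tabs or spaces and attributes contain no spaces.
summarise_gff3() {
  awk '
    /^#/ || NF == 0 { next }   # skip directives, comments, and blank lines
    {
      kind = "neither"
      if ($9 ~ /(^|;)ID=/) kind = "ID"
      else if ($9 ~ /(^|;)Parent=/) kind = "Parent"
      count[$3 "\t" $7 "\t" kind]++
    }
    END { for (k in count) print count[k] "\t" k }
  ' "$1" | sort -k2
}

# Demo input: the first element from the posted excerpt, attributes trimmed.
demo_gff=$(mktemp)
cat > "$demo_gff" <<'EOF'
##gff-version 3
CM024352.1 EDTA repeat_region 191737 201094 . ? . ID=repeat_region1
CM024352.1 EDTA target_site_duplication 191737 191741 . ? . Parent=repeat_region1
CM024352.1 EDTA long_terminal_repeat 191742 193449 . ? . Parent=repeat_region1
CM024352.1 EDTA LTR_retrotransposon 191742 201089 . ? . Parent=repeat_region1
CM024352.1 EDTA long_terminal_repeat 199383 201089 . ? . Parent=repeat_region1
CM024352.1 EDTA target_site_duplication 201090 201094 . ? . Parent=repeat_region1
EOF
summary=$(summarise_gff3 "$demo_gff")
printf '%s\n' "$summary"
```

Running the function on both the 1.6 and the 1.9 file and diffing the two summaries would show whether the feature types, strands, or parent/child layout changed between EDTA versions.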
