Using make-species-subset command to extract plant specific terms from go-plus.owl #283

Open
wkpalan opened this issue May 13, 2019 · 9 comments


wkpalan commented May 13, 2019

I am trying to follow the steps mentioned in #165 to make a plant-specific OBO file using the following command.

 owltools go-plus.owl --reasoner hermit --make-species-subset -t NCBITaxon:33090 -o -f obo plant.obo

This works and produces a plant.obo file without memory errors, but the command prints multiple warnings. Here is an example of the warnings I see.

2019-05-13 16:44:47,724 WARN  (ChangeIndexingProcessor:66) [reasoner.indexing.axiomIgnored]ELK does not support ObjectAllValuesFrom. Axiom ignored: 
SubClassOf(<http://purl.obolibrary.org/obo/GO_0048838> 
ObjectAllValuesFrom(<http://purl.obolibrary.org/obo/RO_0002162> 
<http://purl.obolibrary.org/obo/NCBITaxon_33090>))

I was wondering what this warning means. The GO term itself is plant specific, but it does not show up in the plant.obo file produced by the command above.

id: GO:0048838
name: release of seed from dormancy
namespace: biological_process
def: "The process in which the dormant state is broken in a seed. Dormancy is characterized by a suspension of physiological activity that can be reactivated upon release." [GOC:dph, GOC:jid, GOC:tb, ISBN:9781405139830]
is_a: GO:0010162 ! seed dormancy process
is_a: GO:0097438 ! exit from dormancy
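
For anyone who wants to confirm whether a term is present in a generated file, here is a minimal Python sketch that collects the ids of all [Term] stanzas in an OBO file. It is not part of owltools; the function names `term_ids` and `has_term` are made up for this example, and it assumes each stanza lists its `id:` tag on its own line.

```python
from typing import Iterable, Set


def term_ids(lines: Iterable[str]) -> Set[str]:
    """Collect the id of every [Term] stanza in OBO-formatted text."""
    ids = set()
    in_term = False
    for raw in lines:
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are of interest here.
            in_term = line == "[Term]"
        elif in_term and line.startswith("id:"):
            ids.add(line[len("id:"):].strip())
    return ids


def has_term(path: str, term_id: str) -> bool:
    """Return True if the OBO file at `path` has a [Term] stanza for `term_id`."""
    with open(path, encoding="utf-8") as handle:
        return term_id in term_ids(handle)
```

Calling `has_term("plant.obo", "GO:0048838")` would then report whether the term made it into the subset.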

The OWL section for GO:0048838 is shown below. It has two taxon restrictions, only_in_taxon (RO_0002160) and in_taxon (RO_0002162), and the in_taxon restriction uses the allValuesFrom mentioned in the warning. Is there a workaround that I am missing?

<owl:Class rdf:about="http://purl.obolibrary.org/obo/GO_0048838">
    <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/GO_0010162"/>
    <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/GO_0097438"/>
    <rdfs:subClassOf>
        <owl:Restriction>
            <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/RO_0002160"/>
            <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/NCBITaxon_33090"/>
        </owl:Restriction>
    </rdfs:subClassOf>
    <rdfs:subClassOf>
        <owl:Restriction>
            <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/RO_0002162"/>
            <owl:allValuesFrom rdf:resource="http://purl.obolibrary.org/obo/NCBITaxon_33090"/>
        </owl:Restriction>
    </rdfs:subClassOf>
    <obo:IAO_0000115 rdf:datatype="http://www.w3.org/2001/XMLSchema#string">The process in which the dormant state is broken in a seed. Dormancy is characterized by a suspension of physiological activity that can be reactivated upon release.</obo:IAO_0000115>
    <oboInOwl:hasOBONamespace rdf:datatype="http://www.w3.org/2001/XMLSchema#string">biological_process</oboInOwl:hasOBONamespace>
    <oboInOwl:id rdf:datatype="http://www.w3.org/2001/XMLSchema#string">GO:0048838</oboInOwl:id>
    <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">release of seed from dormancy</rdfs:label>
</owl:Class>
<owl:Axiom>
    <owl:annotatedSource rdf:resource="http://purl.obolibrary.org/obo/GO_0048838"/>
    <owl:annotatedProperty rdf:resource="http://purl.obolibrary.org/obo/IAO_0000115"/>
    <owl:annotatedTarget rdf:datatype="http://www.w3.org/2001/XMLSchema#string">The process in which the dormant state is broken in a seed. Dormancy is characterized by a suspension of physiological activity that can be reactivated upon release.</owl:annotatedTarget>
    <oboInOwl:hasDbXref rdf:datatype="http://www.w3.org/2001/XMLSchema#string">GOC:dph</oboInOwl:hasDbXref>
    <oboInOwl:hasDbXref rdf:datatype="http://www.w3.org/2001/XMLSchema#string">GOC:jid</oboInOwl:hasDbXref>
    <oboInOwl:hasDbXref rdf:datatype="http://www.w3.org/2001/XMLSchema#string">GOC:tb</oboInOwl:hasDbXref>
    <oboInOwl:hasDbXref rdf:datatype="http://www.w3.org/2001/XMLSchema#string">ISBN:9781405139830</oboInOwl:hasDbXref>
</owl:Axiom>
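
To see which restrictions a class carries without reading the XML by hand, a small Python sketch using only the standard library can list them. The function name `restrictions` is hypothetical, and the sketch assumes the RDF/XML layout shown above (restrictions nested directly under `rdfs:subClassOf`, with the filler given as an `rdf:resource`).

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def restrictions(root, class_iri):
    """Return (property IRI, restriction kind, filler IRI) for each restriction on the class."""
    found = []
    for cls in root.iter(f"{{{OWL}}}Class"):
        if cls.get(f"{{{RDF}}}about") != class_iri:
            continue
        path = f"{{{RDFS}}}subClassOf/{{{OWL}}}Restriction"
        for restriction in cls.findall(path):
            on_property = restriction.find(f"{{{OWL}}}onProperty")
            if on_property is None:
                continue
            prop = on_property.get(f"{{{RDF}}}resource")
            # Only the two restriction kinds that appear in the snippet are handled.
            for kind in ("someValuesFrom", "allValuesFrom"):
                filler = restriction.find(f"{{{OWL}}}{kind}")
                if filler is not None:
                    found.append((prop, kind, filler.get(f"{{{RDF}}}resource")))
    return found


# Example use (loads the whole file into memory, which may be slow for go-plus.owl):
# root = ET.parse("go-plus.owl").getroot()
# for row in restrictions(root, "http://purl.obolibrary.org/obo/GO_0048838"):
#     print(row)
```

For the class shown above this would list one someValuesFrom restriction on RO_0002160 and one allValuesFrom restriction on RO_0002162. Only the allValuesFrom one is the construct named in the ELK warning.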
@wkpalan wkpalan changed the title Using make-species-subset command to make plant specific OBO Using make-species-subset command to extract plant specific terms from go-plus.owl May 13, 2019
Member

balhoff commented May 14, 2019

@wkpalan this is a warning from the ELK reasoner, and it is nothing to worry about. GO uses a few OWL axioms that are not part of the OWL EL profile, which describes the kind of reasoning that ELK can do. It just means that ELK will ignore these axioms for the purpose of the classification it is able to do.

I see you specified "hermit", but without looking into the owltools code, I'm not sure why it's using ELK instead of HermiT. However, if you really try to classify go-plus with HermiT, you will be waiting a VERY long time.

Author

wkpalan commented May 14, 2019

Thanks @balhoff. I will ignore the warning for now.

I'm sorry, I pasted the wrong command. I used the ELK reasoner first:

owltools go-plus.owl --reasoner elk --make-species-subset -t NCBITaxon:33090 -o -f obo plant.obo

Then I tried HermiT to see if that would work. As you mentioned, that command still has not finished and is using all of the 60GB of allocated memory.

I am still unclear why the specific GO term is not included in the plant.obo output file.

Member

balhoff commented May 14, 2019

@wkpalan do you get any other errors when making the obo file, such as "duplicate labels"? I wonder if you are experiencing something related to this issue: geneontology/go-ontology#17263

Author

wkpalan commented May 14, 2019

I am currently checking it. I'll get back to you once I have results.

Author

wkpalan commented May 14, 2019

I ran the above command with two different output formats, obo and owl.

obo

owltools go-plus.owl --reasoner hermit --make-species-subset -t NCBITaxon:33090 -o -f obo plant.obo > obo.out.txt 2>&1

obo.out.txt

owl

owltools go-plus.owl --reasoner hermit --make-species-subset -t NCBITaxon:33090 -o -f owl plant.owl > owl.out.txt 2>&1

owl.out.txt

Neither command produced errors, but neither output includes the GO term from above.
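
For checking captured output files like these, a short Python sketch can tally log lines by level and pull out lines containing a phrase of interest. The function name `summarize` is hypothetical, the timestamp pattern is inferred from the WARN line quoted earlier in this thread, and the default phrase "duplicate" is only a guess at how a duplicate-label message might be worded.

```python
import re
from collections import Counter

# Matches lines such as "2019-05-13 16:44:47,724 WARN  (...) ..." and captures the level.
LEVEL = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} (\w+)")


def summarize(lines, phrase="duplicate"):
    """Count log lines per level and collect lines containing `phrase` (case-insensitive)."""
    levels = Counter()
    matches = []
    for line in lines:
        found = LEVEL.match(line)
        if found:
            levels[found.group(1)] += 1
        if phrase.lower() in line.lower():
            matches.append(line.rstrip("\n"))
    return levels, matches


# Example use:
# with open("obo.out.txt", encoding="utf-8") as handle:
#     levels, matches = summarize(handle)
# print(levels, matches)
```

If the tally shows only WARN entries and no matching lines, the log most likely holds nothing beyond the ELK warnings.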

Author

wkpalan commented May 14, 2019

I downloaded the latest go-plus.owl from http://purl.obolibrary.org/obo/go/extensions/go-plus.owl, and now I get the same error mentioned in geneontology/go-ontology#17263.

Member

balhoff commented May 15, 2019

@wkpalan I am working to clean up the double labels in the GO imports, but in the meantime you can get an OBO file by using --no-check:

owltools go-plus.owl --reasoner hermit --make-species-subset -t NCBITaxon:33090 -o -f obo --no-check plant.obo > obo.out.txt 2>&1

Author

wkpalan commented May 24, 2019

Hi Jim, I started a new job, so I didn't have time to follow up, but I am running the command now. How much memory and how long do you think this would take? I have a server with 120GB of memory, 16 CPU cores, and a maximum walltime of 5 days. The first time around it ran for 24 hours and failed.

Member

balhoff commented May 29, 2019

@wkpalan were you using hermit? I had that in my example command because it was in yours, but I think you will have better success with elk.
