OSCAR4

OSCAR (Open Source Chemistry Analysis Routines) is an open source, extensible system for the automated annotation of chemistry in scientific articles. It can identify chemical names, reaction names, ontology terms, enzymes, and chemical prefixes and adjectives, as well as chemical data such as state, yield, IR, NMR and mass spectra, and elemental analyses. Where possible, detected chemical names are also annotated with structures derived either by lookup or by name-to-structure parsing with OPSIN, or with identifiers from the ChEBI ("Chemical Entities of Biological Interest") ontology.

OSCAR has been under development since 2002. The current version, OSCAR4, focuses on providing a core library that facilitates integration with other tools. Its simple-to-use API is modularised to promote extension into other domains and to allow its use within workflow systems such as Taverna and U-Compare.

OSCAR is developed by the Murray-Rust research group at the Unilever Centre for Molecular Science Informatics, University of Cambridge. The authors would appreciate it if the corresponding publication is cited in any work that makes use of the code.

Examples

The following code identifies chemical named entities in text and prints each one together with its Standard InChI, when available.

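import java.util.List;

import uk.ac.cam.ch.wwmm.oscar.Oscar;
import uk.ac.cam.ch.wwmm.oscar.chemnamedict.entities.ChemicalStructure;
import uk.ac.cam.ch.wwmm.oscar.chemnamedict.entities.FormatType;
import uk.ac.cam.ch.wwmm.oscar.chemnamedict.entities.ResolvedNamedEntity;

// Text to annotate
String s = "....";

// Find chemical named entities and resolve them to structures where possible
Oscar oscar = new Oscar();
List<ResolvedNamedEntity> entities = oscar.findAndResolveNamedEntities(s);
for (ResolvedNamedEntity ne : entities) {
    // The entity text as it appears in the input
    System.out.println(ne.getSurface());
    // Null when no Standard InChI is available for this entity
    ChemicalStructure stdInchi = ne.getFirstChemicalStructure(FormatType.STD_INCHI);
    if (stdInchi != null) {
        System.out.println(stdInchi);
    }
    System.out.println();
}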
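
The loop above prints one entry per mention, so a compound named several times in the text appears several times in the output. The following is a minimal sketch of collapsing mentions into one entry per surface form. So that it runs without OSCAR on the classpath, it uses a hypothetical stand-in class `Mention` (surface text plus an InChI string that may be null) in place of `ResolvedNamedEntity`. The class, the `uniqueBySurface` helper and the sample data are assumptions made for illustration and are not part of the OSCAR API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MentionDedup {

    // Hypothetical stand-in for one OSCAR result: the surface text of the
    // mention and its Standard InChI, or null when none was resolved.
    static final class Mention {
        final String surface;
        final String inchi;

        Mention(String surface, String inchi) {
            this.surface = surface;
            this.inchi = inchi;
        }
    }

    // Returns one entry per surface form, in order of first appearance.
    // The first non-null InChI seen for a surface form is the one kept.
    static Map<String, String> uniqueBySurface(List<Mention> mentions) {
        Map<String, String> bySurface = new LinkedHashMap<>();
        for (Mention m : mentions) {
            boolean unseen = !bySurface.containsKey(m.surface);
            boolean fillsGap = bySurface.get(m.surface) == null && m.inchi != null;
            if (unseen || fillsGap) {
                bySurface.put(m.surface, m.inchi);
            }
        }
        return bySurface;
    }

    public static void main(String[] args) {
        // Illustrative data only; real input would come from OSCAR.
        List<Mention> mentions = List.of(
            new Mention("benzene", "InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H"),
            new Mention("compound 3a", null),
            new Mention("benzene", "InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H"));

        uniqueBySurface(mentions).forEach((surface, inchi) ->
            System.out.println(surface + "\t" + (inchi == null ? "(no InChI)" : inchi)));
    }
}
```

With real OSCAR output, a `Mention` could be filled from `ne.getSurface()` and the structure returned by `ne.getFirstChemicalStructure(FormatType.STD_INCHI)`. Keying on the surface form is only one option: keying on the InChI instead would merge synonyms of the same compound.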

Support

Issue/Feature Request Tracker

Mailing List (Google Group)

Downloads

OSCAR4 is available for download:

OSCAR4-5.2.0 JAR with dependencies
