Michael Röder edited this page Dec 14, 2015 · 2 revisions

The Natural Language Interchange Format (NIF) is an RDF vocabulary for describing natural language resources so that they can be exchanged between different systems. GERBIL uses it to send documents and annotations to, and receive them from, the benchmarked annotators. For more general information about NIF, visit its website.

Some NIF basics

This section introduces some core concepts of NIF that help in understanding how GERBIL sends and receives documents.

How text can be annotated

In NIF, additional information about a part of a text, e.g., information about a named entity inside the text, is attached to an RDF node that points to the text's RDF node via the nif:referenceContext property. Note that the text's RDF node must have the type nif:Context.

URIs

NIF resources typically carry their character offsets at the end of their URI. Let's assume that there is a text ex:Text with 100 characters. The URI of the RDF node representing this text in NIF would be ex:Text#char=0,100. An annotation inside the text starting at character 42 with a length of 10 would have the URI ex:Text#char=42,52. Although the URIs typically already include the positions, NIF also defines the two properties nif:beginIndex and nif:endIndex, which state the begin and end positions explicitly on the RDF nodes.
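The URI scheme described above can be sketched in a few lines of Java. This is only an illustration: the class name NifUriSketch, the method charUri, and the base URI http://example.org/Text (standing in for ex:Text) are assumptions made for this example and are not part of GERBIL or NIF.

```java
final class NifUriSketch {

    /**
     * Appends the character offsets to a base URI.
     * 'begin' is inclusive, 'end' is exclusive (first position after the span).
     */
    public static String charUri(String baseUri, int begin, int end) {
        if (begin < 0 || end < begin) {
            throw new IllegalArgumentException("invalid span: " + begin + "," + end);
        }
        return baseUri + "#char=" + begin + "," + end;
    }

    public static void main(String[] args) {
        String base = "http://example.org/Text"; // hypothetical expansion of ex:Text
        int start = 42;
        int length = 10;
        // The whole text of 100 characters
        System.out.println(charUri(base, 0, 100));
        // The annotation: the end position is start + length
        System.out.println(charUri(base, start, start + length));
    }
}
```

Note that the second offset in the URI is the end position (start plus length), not the length itself.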

Positions

In NIF, positions are determined by counting Unicode code points. In simple texts there might be no difference between counting Java chars and counting code points, but it is important to be aware that the two ways of counting can differ: a character outside the Basic Multilingual Plane is a single code point but occupies two chars in a Java String. In Java, the length of a String in code points can be determined in the following way:

String text = ...;
int length = text.codePointCount(0, text.length());

Note that, as in Java, the end position of a String in NIF is exclusive: it is the first position after the String.
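To illustrate how the two ways of counting can diverge, the following sketch converts between Java char indices and code point positions. The class and method names are made up for this example; only the String methods codePointCount and offsetByCodePoints come from the Java standard library.

```java
final class CodePointOffsets {

    /** Length of the text in code points, i.e., the way NIF counts positions. */
    public static int lengthInCodePoints(String text) {
        return text.codePointCount(0, text.length());
    }

    /** Converts a Java char index into a code point position. */
    public static int toCodePointIndex(String text, int charIndex) {
        return text.codePointCount(0, charIndex);
    }

    /** Returns the span addressed by code point positions; 'end' is exclusive. */
    public static String substringByCodePoints(String text, int begin, int end) {
        int from = text.offsetByCodePoints(0, begin);
        int to = text.offsetByCodePoints(from, end - begin);
        return text.substring(from, to);
    }

    public static void main(String[] args) {
        // U+1D11E (musical symbol G clef) is one code point but two Java chars.
        String text = "\uD834\uDD1E Sydney";
        System.out.println(text.length());                                  // 9 chars
        System.out.println(lengthInCodePoints(text));                       // 8 code points
        System.out.println(toCodePointIndex(text, text.indexOf("Sydney"))); // 2
        System.out.println(substringByCodePoints(text, 2, 8));              // Sydney
    }
}
```

In this example, String.indexOf reports the char index 3 for "Sydney", while the position that belongs in nif:beginIndex is 2.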

NIF properties

The following table lists some helpful properties that can be used to express features of an annotation.

| Property | Meaning | Comment |
| --- | --- | --- |
| nif:anchorOf | Contains the String the annotation is referencing inside the referenced text. | optional |
| nif:beginIndex | Defines the start position of the String. | mandatory |
| nif:endIndex | Defines the first position after the String. | mandatory |
| nif:referenceContext | References the text to which this annotation belongs. | mandatory |
| itsrdf:taClassRef | References URIs defining the type of the String. | should be present in the result of entity typing tasks |
| itsrdf:taConfidence | Defines a confidence value for this annotation. | optional |
| itsrdf:taIdentRef | References URIs defining the meaning of the String. | should be present in the result of linking tasks |

NIF document in GERBIL

In the version supported by GERBIL, NIF does not define the type Document. However, GERBIL parses an RDF node as a document if this node has the type nif:Context and has the property nif:isString.
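The rule above can be expressed as a small check. The sketch below is not GERBIL's parser: it assumes a deliberately simplified model in which every triple is a String array of subject, predicate, and object, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class NifDocumentCheck {

    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
    static final String NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#";

    /**
     * Returns true if 'node' has the type nif:Context and a nif:isString value.
     * Each triple is a String[3]: subject, predicate, object (simplified model).
     */
    public static boolean isDocument(String node, List<String[]> triples) {
        boolean isContext = false;
        boolean hasIsString = false;
        for (String[] triple : triples) {
            if (!triple[0].equals(node)) {
                continue;
            }
            if (triple[1].equals(RDF_TYPE) && triple[2].equals(NIF + "Context")) {
                isContext = true;
            }
            if (triple[1].equals(NIF + "isString")) {
                hasIsString = true;
            }
        }
        return isContext && hasIsString;
    }

    public static void main(String[] args) {
        String node = "http://example.org/Text#char=0,11"; // hypothetical URI
        List<String[]> triples = new ArrayList<>();
        triples.add(new String[] {node, RDF_TYPE, NIF + "Context"});
        triples.add(new String[] {node, NIF + "isString", "Hello world"});
        System.out.println(isDocument(node, triples)); // true
    }
}
```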

During the communication with NIF-based web services, GERBIL sends and expects to receive single NIF documents. Using the Turtle serialization, such a document can look like this:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> .

<http://www.ontologydesignpatterns.org/data/oke-challenge/task-1/sentence-1#char=0,146>
        a                     nif:RFC5147String , nif:String , nif:Context ;
        nif:beginIndex        "0"^^xsd:nonNegativeInteger ;
        nif:endIndex          "146"^^xsd:nonNegativeInteger ;
        nif:isString          "Florence May Harding studied at a school in Sydney, and with Douglas Robert Dundas , but in effect had no formal training in either botany or art."@en .

<http://www.ontologydesignpatterns.org/data/oke-challenge/task-1/sentence-1#char=44,50>
        a                     nif:RFC5147String , nif:String ;
        nif:anchorOf          "Sydney"@en ;
        nif:beginIndex        "44"^^xsd:nonNegativeInteger ;
        nif:endIndex          "50"^^xsd:nonNegativeInteger ;
        nif:referenceContext  <http://www.ontologydesignpatterns.org/data/oke-challenge/task-1/sentence-1#char=0,146> ;
        itsrdf:taIdentRef     <http://dbpedia.org/resource/Sydney> .
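As a rough illustration of how the pieces of such a document fit together, the following sketch derives the annotation node from the document text and a pair of code point positions. It uses plain string concatenation instead of an RDF library, hard-codes the language tag @en, omits the prefix declarations, and escapes only backslashes and double quotes; the class and method names are hypothetical and do not reflect how GERBIL produces NIF.

```java
final class NifAnnotationSketch {

    /** Escapes backslashes and double quotes only; line breaks are not handled. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /**
     * Renders one linked annotation as Turtle (without prefix declarations).
     * 'begin' and 'end' are code point positions; 'end' is exclusive.
     */
    public static String annotationToTurtle(String baseUri, String text,
                                            int begin, int end, String identRef) {
        // Translate code point positions into Java char indices
        int from = text.offsetByCodePoints(0, begin);
        int to = text.offsetByCodePoints(from, end - begin);
        String anchor = text.substring(from, to);

        // The context node covers the whole text, counted in code points
        int length = text.codePointCount(0, text.length());
        String contextUri = baseUri + "#char=0," + length;
        String nodeUri = baseUri + "#char=" + begin + "," + end;

        return "<" + nodeUri + ">\n"
            + "    a nif:RFC5147String , nif:String ;\n"
            + "    nif:anchorOf \"" + escape(anchor) + "\"@en ;\n"
            + "    nif:beginIndex \"" + begin + "\"^^xsd:nonNegativeInteger ;\n"
            + "    nif:endIndex \"" + end + "\"^^xsd:nonNegativeInteger ;\n"
            + "    nif:referenceContext <" + contextUri + "> ;\n"
            + "    itsrdf:taIdentRef <" + identRef + "> .\n";
    }

    public static void main(String[] args) {
        String text = "Florence May Harding studied at a school in Sydney, and with "
            + "Douglas Robert Dundas , but in effect had no formal training in either botany or art.";
        System.out.print(annotationToTurtle(
            "http://www.ontologydesignpatterns.org/data/oke-challenge/task-1/sentence-1",
            text, 44, 50, "http://dbpedia.org/resource/Sydney"));
    }
}
```

Applied to the example text, positions 44 and 50 yield the anchor "Sydney" and a nif:referenceContext ending in #char=0,146, which matches the document shown above.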