Guidelines for developing NIF-based NLP services

This document describes best practices for implementing RESTful NLP web services that rely on the NLP Interchange Format (NIF). "NIF is an RDF/OWL-based format that aims to achieve interoperability between NLP tools, language resources and annotations." As a proof of concept, we have implemented NIF wrappers for the Stanford POS tagger and the Stanford parser. Both are licensed under Creative Commons Attribution 4.0 International.

Example Implementation

Wrapping the Stanford POS Tagger

Given the content of a file named example.ttl

@prefix nif:   <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .

<e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25>
a             nif:Context , nif:RFC5147String , nif:Sentence ;
nif:isString  "This is a sample sentence"^^xsd:string .

<e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,4>
a                     nif:RFC5147String , nif:Word ;
nif:anchorOf          "This"^^xsd:string ;
nif:beginIndex        "0"^^xsd:int ;
nif:endIndex          "4"^^xsd:int ;
nif:nextWord          <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=5,7> ;
nif:sentence	      <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
nif:referenceContext  <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

<e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=5,7>
a                     nif:RFC5147String , nif:Word ;
nif:anchorOf          "is"^^xsd:string ;
nif:beginIndex        "5"^^xsd:int ;
nif:endIndex          "7"^^xsd:int ;
nif:nextWord          <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=8,9> ;
nif:previousWord      <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,4> ;
nif:sentence	      <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
nif:referenceContext  <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

<e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=8,9>
a                     nif:RFC5147String , nif:Word ;
nif:anchorOf          "a"^^xsd:string ;
nif:beginIndex        "8"^^xsd:int ;
nif:endIndex          "9"^^xsd:int ;
nif:nextWord          <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=10,16> ;
nif:previousWord      <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=5,7> ;
nif:sentence	      <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
nif:referenceContext  <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

<e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=10,16>
a                     nif:RFC5147String , nif:Word ;
nif:anchorOf          "sample"^^xsd:string ;
nif:beginIndex        "10"^^xsd:int ;
nif:endIndex          "16"^^xsd:int ;
nif:nextWord          <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=17,25> ;
nif:previousWord      <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=8,9> ;
nif:sentence	      <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
nif:referenceContext  <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

<e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=17,25>
a                     nif:RFC5147String , nif:Word ;
nif:anchorOf          "sentence"^^xsd:string ;
nif:beginIndex        "17"^^xsd:int ;
nif:endIndex          "25"^^xsd:int ;
nif:previousWord      <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=10,16> ;
nif:sentence	      <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
nif:referenceContext  <e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

our web service wrapping the Stanford POS tagger can be invoked via curl using the following example call.

curl -G http://sc-lider.techfak.uni-bielefeld.de/NifStanfordPOSTaggerWebService/NifStanfordPOSTagger -d v=true --data-urlencode i="$(<example.ttl)"
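For clients that do not shell out to curl, the same request can be assembled programmatically. The following is a minimal sketch, not part of the service itself: the function name build_request_url is hypothetical, and the only parameters it knows about are the v and i parameters that appear in the curl call above. It only builds the URL and does not contact the server.

```python
from urllib.parse import urlencode

# Endpoint copied from the curl call above.
TAGGER_URL = ("http://sc-lider.techfak.uni-bielefeld.de/"
              "NifStanfordPOSTaggerWebService/NifStanfordPOSTagger")


def build_request_url(endpoint: str, nif_input: str, v: str = "true") -> str:
    """Build a GET URL carrying the v and i parameters (hypothetical helper)."""
    params = {"v": v, "i": nif_input}
    # urlencode percent-encodes the Turtle payload so that characters such as
    # '#', '<', '>' and newlines survive inside the query string.
    return endpoint + "?" + urlencode(params)
```

Calling build_request_url(TAGGER_URL, turtle_text) with the content of example.ttl yields a URL that can be passed to any HTTP client. Long inputs may exceed URL length limits of some servers or proxies, so this sketch is best suited to short examples like the one above.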

The input is expected to be in NIF format and to contain at least one nif:Context element as well as a set of nif:Word elements. The service reads the nif:anchorOf values of all nif:Word elements belonging to a given nif:Context found in the input and passes them to the Stanford POS tagger. Each word is then annotated by adding a nif:posTag property with the POS tag as a literal value to the nif:Word.
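The processing step described above can be sketched as follows. This is an illustration under stated assumptions, not the wrapper's actual code: the Word record, pos_tag_triples and toy_tagger are hypothetical names, the toy tagger is a lookup table standing in for the Stanford POS tagger, and feeding the tokens in nif:beginIndex order is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Word:
    uri: str        # URI of the nif:Word, e.g. "...#char=0,4"
    context: str    # value of nif:referenceContext
    begin: int      # value of nif:beginIndex
    anchor_of: str  # value of nif:anchorOf


def pos_tag_triples(words: List[Word],
                    tagger: Callable[[List[str]], List[str]]) -> List[str]:
    """Return one Turtle statement per word that adds nif:posTag as a literal."""
    # Group the words by the context they belong to.
    by_context: Dict[str, List[Word]] = {}
    for word in words:
        by_context.setdefault(word.context, []).append(word)

    statements = []
    for context_words in by_context.values():
        # Assumption: tokens are handed to the tagger in text order.
        ordered = sorted(context_words, key=lambda w: w.begin)
        tags = tagger([w.anchor_of for w in ordered])
        for word, tag in zip(ordered, tags):
            statements.append(f'<{word.uri}> nif:posTag "{tag}"^^xsd:string .')
    return statements


def toy_tagger(tokens: List[str]) -> List[str]:
    """Hypothetical stand-in for the Stanford POS tagger (lookup table only)."""
    lexicon = {"This": "DT", "is": "VBZ", "a": "DT"}
    return [lexicon.get(token, "NN") for token in tokens]
```

Passing the tagger in as a callable keeps the NIF handling separate from the NLP tool, so the same wrapper logic could be reused for a different tagger.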

For the example input above, the service produces the following output:

@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
@prefix nif:   <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .

<uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,4>
a                     nif:RFC5147String , nif:Word ;
nif:anchorOf          "This"^^xsd:string ;
nif:beginIndex        "0"^^xsd:int ;
nif:endIndex          "4"^^xsd:int ;
nif:nextWord          <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=5,7> ;
nif:posTag            "DT"^^xsd:string ;
nif:referenceContext  <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
nif:sentence          <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

<uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=5,7>
a                     nif:Word , nif:RFC5147String ;
nif:anchorOf          "is"^^xsd:string ;
nif:beginIndex        "5"^^xsd:int ;
nif:endIndex          "7"^^xsd:int ;
nif:nextWord          <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=8,9> ;
nif:posTag            "VBZ"^^xsd:string ;
nif:previousWord      <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,4> ;
nif:referenceContext  <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
nif:sentence          <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

<uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25>
a             nif:Context , nif:RFC5147String , nif:Sentence ;
nif:isString  "This is a sample sentence"^^xsd:string .

<uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=10,16>
a                     nif:RFC5147String , nif:Word ;
nif:anchorOf          "sample"^^xsd:string ;
nif:beginIndex        "10"^^xsd:int ;
nif:endIndex          "16"^^xsd:int ;
nif:nextWord          <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=17,25> ;
nif:posTag            "NN"^^xsd:string ;
nif:previousWord      <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=8,9> ;
nif:referenceContext  <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
nif:sentence          <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

<uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=8,9>
a                     nif:Word , nif:RFC5147String ;
nif:anchorOf          "a"^^xsd:string ;
nif:beginIndex        "8"^^xsd:int ;
nif:endIndex          "9"^^xsd:int ;
nif:nextWord          <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=10,16> ;
nif:posTag            "DT"^^xsd:string ;
nif:previousWord      <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=5,7> ;
nif:referenceContext  <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
nif:sentence          <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

<uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=17,25>
a                     nif:RFC5147String , nif:Word ;
nif:anchorOf          "sentence"^^xsd:string ;
nif:beginIndex        "17"^^xsd:int ;
nif:endIndex          "25"^^xsd:int ;
nif:posTag            "NN"^^xsd:string ;
nif:previousWord      <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=10,16> ;
nif:referenceContext  <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
nif:sentence          <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

Wrapping the Stanford Parser

Our web service wrapping the Stanford dependency parser can be invoked via curl using the following example call, where the input is assumed to be given in a Turtle file called input.ttl.

curl -G http://sc-lider.techfak.uni-bielefeld.de/NifStanfordParserWebService/NifStanfordParser -d v=true --data-urlencode i="$(<input.ttl)"

The service can be used to parse input that is already POS tagged. That is, it expects the input to be in NIF format and to contain

at least one nif:Context element
one nif:Word element for each word in the nif:isString property of its context containing a POS annotation in nif:posTag and the represented string in nif:anchorOf.
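A client can check these two requirements before calling the parser. The sketch below is one possible check, not something the service provides: check_parser_input is a hypothetical name, the input is assumed to have been parsed already into plain dictionaries keyed by property name, and splitting nif:isString on whitespace is an assumption about how words are delimited.

```python
from typing import Dict, List


def check_parser_input(contexts: Dict[str, str], words: List[dict]) -> List[str]:
    """Return a list of problems; an empty list means the input looks usable.

    contexts maps each nif:Context URI to its nif:isString value.
    Each word is a dict with the keys "uri", "nif:referenceContext",
    "nif:beginIndex", "nif:anchorOf" and "nif:posTag".
    """
    problems = []
    if not contexts:
        problems.append("no nif:Context found")

    # Every word needs both the surface string and a POS annotation.
    for word in words:
        for prop in ("nif:anchorOf", "nif:posTag"):
            if prop not in word:
                problems.append(f"{word['uri']} lacks {prop}")

    # Every word of the context string needs a matching nif:Word.
    for context_uri, text in contexts.items():
        in_context = [w for w in words
                      if w.get("nif:referenceContext") == context_uri]
        in_context.sort(key=lambda w: w.get("nif:beginIndex", 0))
        anchors = [w.get("nif:anchorOf") for w in in_context]
        # Assumption: words in nif:isString are separated by whitespace.
        if anchors != text.split():
            problems.append(
                f"{context_uri}: nif:Word elements do not cover nif:isString")
    return problems
```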

The words are ordered by context (using nif:referenceContext) and position (using nif:beginIndex) in order to reconstruct the original texts. The service then passes the annotated input to the Stanford parser. For each dependency relation of the parse, a nif:dependency property is added to the relation's head with the URI of the dependent word as object. As a word can only have one head, the type of the relation is annotated in the nif:dependencyRelationType property of the dependent word (as a literal). For the example sentence, the output is:

@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
@prefix nif:   <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .

<uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,4>
a                           nif:RFC5147String , nif:Word ;
nif:anchorOf                "This"^^xsd:string ;
nif:beginIndex              "0"^^xsd:int ;
nif:dependencyRelationType  "nsubj"^^xsd:string ;
nif:endIndex                "4"^^xsd:int ;
nif:nextWord                <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=5,7> ;
nif:posTag                  "DT"^^xsd:string ;
nif:referenceContext        <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
nif:sentence                <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

<uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=5,7>
a                           nif:Word , nif:RFC5147String ;
nif:anchorOf                "is"^^xsd:string ;
nif:beginIndex              "5"^^xsd:int ;
nif:dependencyRelationType  "cop"^^xsd:string ;
nif:endIndex                "7"^^xsd:int ;
nif:nextWord                <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=8,9> ;
nif:posTag                  "VBZ"^^xsd:string ;
nif:previousWord            <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,4> ;
nif:referenceContext        <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
nif:sentence                <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

<uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25>
a             nif:Context , nif:RFC5147String , nif:Sentence ;
nif:isString  "This is a sample sentence"^^xsd:string .

<uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=10,16>
a                           nif:RFC5147String , nif:Word ;
nif:anchorOf                "sample"^^xsd:string ;
nif:beginIndex              "10"^^xsd:int ;
nif:dependencyRelationType  "nn"^^xsd:string ;
nif:endIndex                "16"^^xsd:int ;
nif:nextWord                <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=17,25> ;
nif:posTag                  "NN"^^xsd:string ;
nif:previousWord            <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=8,9> ;
nif:referenceContext        <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
nif:sentence                <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

<uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=8,9>
a                           nif:Word , nif:RFC5147String ;
nif:anchorOf                "a"^^xsd:string ;
nif:beginIndex              "8"^^xsd:int ;
nif:dependencyRelationType  "det"^^xsd:string ;
nif:endIndex                "9"^^xsd:int ;
nif:nextWord                <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=10,16> ;
nif:posTag                  "DT"^^xsd:string ;
nif:previousWord            <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=5,7> ;
nif:referenceContext        <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
nif:sentence                <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .

<uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=17,25>
a                     nif:RFC5147String , nif:Word ;
nif:anchorOf          "sentence"^^xsd:string ;
nif:beginIndex        "17"^^xsd:int ;
nif:dependency        <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=10,16> ,
                      <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=8,9> ,
                      <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,4> ,
                      <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=5,7> ;
nif:endIndex          "25"^^xsd:int ;
nif:posTag            "NN"^^xsd:string ;
nif:previousWord      <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=10,16> ;
nif:referenceContext  <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> ;
nif:sentence          <uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char=0,25> .
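The way dependency relations are encoded in this output can be sketched in a few lines. This is an illustration only: Dependency and dependency_statements are hypothetical names, and the relation list is read off the example output above rather than produced by a parser.

```python
from typing import List, NamedTuple


class Dependency(NamedTuple):
    head: str       # URI of the head word
    dependent: str  # URI of the dependent word
    relation: str   # relation type, e.g. "nsubj"


def dependency_statements(dependencies: List[Dependency]) -> List[str]:
    """Return two Turtle statements per relation, following the scheme above."""
    statements = []
    for dep in dependencies:
        # The head points to its dependent ...
        statements.append(f"<{dep.head}> nif:dependency <{dep.dependent}> .")
        # ... and the dependent carries the relation type as a literal.
        statements.append(
            f'<{dep.dependent}> nif:dependencyRelationType '
            f'"{dep.relation}"^^xsd:string .')
    return statements


# Relations of the example sentence, taken from the output above.
BASE = "uuid:e899ea51-fb30-4102-8cdd-9d0ec691a0db#char="
EXAMPLE = [
    Dependency(BASE + "17,25", BASE + "0,4", "nsubj"),
    Dependency(BASE + "17,25", BASE + "5,7", "cop"),
    Dependency(BASE + "17,25", BASE + "8,9", "det"),
    Dependency(BASE + "17,25", BASE + "10,16", "nn"),
]
```

Storing the relation type on the dependent works because each word has at most one head. The head "sentence" itself has no nif:dependencyRelationType in the example output.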

Chaining

Because the tagger produces the output that the parser relies on, the two services described above can be used to demonstrate the integration of NIF-compliant NLP services.

The following nested call combines both calls from the previous two examples. It invokes the tagger, which produces the output of Example 1, and passes this POS-annotated NIF data to the parser. The output is the same as in Example 2.

curl -G http://sc-lider.techfak.uni-bielefeld.de/NifStanfordPOSTaggerWebService/NifStanfordPOSTagger -d v=true --data-urlencode i="$(<example.ttl)" \
| curl -G http://sc-lider.techfak.uni-bielefeld.de/NifStanfordParserWebService/NifStanfordParser -d v=true --data-urlencode i@-
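The same chaining can be expressed in a program. The sketch below is a hedged illustration, not an official client: call_service and chain are hypothetical names, the request mirrors the v and i parameters of the curl calls, and decoding the response as UTF-8 is an assumption.

```python
from typing import Callable, Iterable
from urllib.parse import urlencode
from urllib.request import urlopen


def call_service(endpoint: str, nif_input: str) -> str:
    """Send NIF input to one service via GET and return the response text."""
    url = endpoint + "?" + urlencode({"v": "true", "i": nif_input})
    with urlopen(url) as response:
        # Assumption: the service answers with UTF-8 encoded text.
        return response.read().decode("utf-8")


def chain(nif_input: str, endpoints: Iterable[str],
          call: Callable[[str, str], str] = call_service) -> str:
    """Feed the output of each service into the next one, in the given order."""
    data = nif_input
    for endpoint in endpoints:
        data = call(endpoint, data)
    return data
```

With the tagger endpoint first and the parser endpoint second, chain(turtle_text, [tagger_url, parser_url]) corresponds to the piped curl calls. The call argument can be replaced by a stub, which makes the chaining logic testable without network access.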
