buda-base/jena-stable-turtle

Jena Stable Turtle output plugin

This repository defines new RDF writers for Jena that produce Turtle output always sorted in the same way. It was developed to reduce diff noise when the data is stored in a git repository, but we are confident there are plenty of other use cases where it will be useful.

The repository contains two writers, for the Turtle and TriG formats.

Changes from the stock turtle output

Sorting some particular cases

Some cases always require arbitrary decisions. We made the following choices when sorting objects:

  • first URIs (sorted) then literals (sorted) then blank nodes
  • first rdf:langStrings then xsd:strings then numbers then everything else, sorted by type uri then value
  • rdf:langStrings are sorted by lang then value, using the root Unicode collator (not the collator of the locale corresponding to the language)
  • numbers are sorted first by value then by type uri ("+1"^^xsd:integer < "1"^^xsd:integer < "+1"^^xsd:nonNegativeInteger < "1.2"^^xsd:float < "2"^^xsd:integer)
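
The number rule above can be illustrated with a small, self-contained comparator. This is only a sketch and not the plugin's actual code: the `Lit` class and `NumberOrderSketch` are hypothetical stand-ins that do not use Jena, and the final tie-break on the lexical form is an assumption made to reproduce the `"+1"` before `"1"` ordering in the example.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class NumberOrderSketch {
    public static final String XSD = "http://www.w3.org/2001/XMLSchema#";

    /** Hypothetical stand-in for a typed numeric literal (not a Jena class). */
    public static final class Lit {
        private final String lexical;
        private final String typeUri;

        public Lit(String lexical, String xsdLocalName) {
            this.lexical = lexical;
            this.typeUri = XSD + xsdLocalName;
        }

        // BigDecimal accepts an optional leading sign, so "+1" parses as 1.
        public BigDecimal value() { return new BigDecimal(lexical); }
        public String typeUri() { return typeUri; }
        public String lexical() { return lexical; }

        @Override
        public String toString() {
            return "\"" + lexical + "\"^^xsd:" + typeUri.substring(XSD.length());
        }
    }

    // Numeric value first, then datatype URI. The last tie-break on the
    // lexical form is an assumption, not something the README states.
    public static final Comparator<Lit> NUMBER_ORDER =
            Comparator.comparing(Lit::value)
                    .thenComparing(Lit::typeUri)
                    .thenComparing(Lit::lexical);

    /** Returns a sorted copy, leaving the input list untouched. */
    public static List<Lit> sorted(List<Lit> in) {
        List<Lit> out = new ArrayList<>(in);
        out.sort(NUMBER_ORDER);
        return out;
    }

    public static void main(String[] args) {
        List<Lit> shuffled = Arrays.asList(
                new Lit("2", "integer"),
                new Lit("1.2", "float"),
                new Lit("+1", "nonNegativeInteger"),
                new Lit("1", "integer"),
                new Lit("+1", "integer"));
        for (Lit l : sorted(shuffled)) {
            System.out.println(l);
        }
    }
}
```

Comparing by `BigDecimal` value rather than by lexical form is what lets `"+1"` and `"1"` count as the same number, so the datatype URI only decides the order among literals of equal value.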

Installation

Using maven:

    <dependency>
      <groupId>io.bdrc</groupId>
      <artifactId>jena-stable-turtle</artifactId>
      <version>0.7.2</version>
    </dependency>

To build and deploy:

mvn clean package
mvn deploy -DperformRelease=true

Then go to https://oss.sonatype.org/ and perform the close and release steps.

Use

From Java

// register the STTL writer
Lang sttl = STTLWriter.registerWriter();
// build a map of namespace priorities
SortedMap<String, Integer> nsPrio = ComparePredicates.getDefaultNSPriorities();
nsPrio.put(SKOS.getURI(), 1);
nsPrio.put("http://purl.bdrc.io/ontology/admin/", 5);
nsPrio.put("http://purl.bdrc.io/ontology/toberemoved/", 6);
// build a list of predicates URIs to be used (in order) for blank node comparison
List<String> predicatesPrio = CompareComplex.getDefaultPropUris();
predicatesPrio.add("http://purl.bdrc.io/ontology/admin/logWhen");
predicatesPrio.add("http://purl.bdrc.io/ontology/onOrAbout");
predicatesPrio.add("http://purl.bdrc.io/ontology/noteText");
// pass the values through a Context object
Context ctx = new Context();
ctx.set(Symbol.create(STTLWriter.SYMBOLS_NS + "nsPriorities"), nsPrio);
ctx.set(Symbol.create(STTLWriter.SYMBOLS_NS + "nsDefaultPriority"), 2);
ctx.set(Symbol.create(STTLWriter.SYMBOLS_NS + "complexPredicatesPriorities"), predicatesPrio);
// the base indentation, defaults to 4
ctx.set(Symbol.create(STTLWriter.SYMBOLS_NS + "nsBaseIndent"), 4);
// the minimal predicate width, defaults to 14
ctx.set(Symbol.create(STTLWriter.SYMBOLS_NS + "predicateBaseWidth"), 14);
// longest length for subject to be on the same line with the predicate, defaults to 20
ctx.set(Symbol.create(STTLWriter.SYMBOLS_NS + "longSubject"), 20);
// put multiple objects on separate lines each, defaults to false
ctx.set(Symbol.create(STTLWriter.SYMBOLS_NS + "objectsMultiLine"), false);
// put final dot on new line for named subjects, defaults to false
ctx.set(Symbol.create(STTLWriter.SYMBOLS_NS + "namedDotNewLine"), false);
Graph g = ... ; // fetch the graph you want to write
RDFWriter w = RDFWriter.create().source(g).context(ctx).lang(sttl).build();
w.output( ... ); // write somewhere

Note that for the TriG writer, you must use the same context namespace as for Turtle: STTLWriter.SYMBOLS_NS.

Set the symbol STTLWriter.SYMBOLS_NS + "onlyWriteUsedPrefixes" to true to only write prefixes that are actually used.

Command line

Put the compiled .jar file on the Jena classpath, then run:

riot --pretty sttl yourfile.ttl

License

All the code in this repository is under the Apache 2.0 License.

The original parts are Copyright © 2017-2019 Buddhist Digital Resource Center, and the files TurtleShell.java (coming from the Jena repository) and TriGShell.java (extracted from this file) are Copyright © 2011-2017 Apache Software Foundation (ASF), see NOTICE for more information.