This repository contains an evaluation setup that compares:
- Jena's default spatial index implementation, referred to as vanilla
- Our improved implementation, referred to as geoplus
The evaluation uses our simple GridBench benchmark.
We are in the process of finalizing the RDF evaluation dataset generation and deployment pipeline. The links will be updated to final versions in the coming days. ~ 2024-03-28
Alpha versions of the datasets are deployed at:
http://maven.aksw.org/repository/snapshots/org/aksw/eval/gridbench/jena/
In our evaluation we use three sets of queries which target the same spatial regions but differ in the sets of graphs they affect.
ng-one
: Benchmark queries target a single named graph in the dataset. Uses GRAPH <CONST>.
Example query:
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX spatial: <http://jena.apache.org/spatial#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
SELECT (count(*) AS ?c)
WHERE
  { GRAPH <http://www.example.org/graph/0>
      { BIND("POLYGON((-90 -90, -90 -78.75, -78.75 -78.75, -78.75 -90, -90 -90))"^^geo:wktLiteral AS ?queryGeom)
        ?feature spatial:intersectBoxGeom ( ?queryGeom ) ;
                 geo:hasGeometry ?featureGeom .
        ?featureGeom geo:asWKT ?featureGeomWkt
        FILTER geof:sfIntersects(?featureGeomWkt, ?queryGeom)
      }
  }
ng-all
: Benchmark queries target all named graphs in the dataset. Uses GRAPH ?g.
Example query:
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX spatial: <http://jena.apache.org/spatial#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
SELECT (count(*) AS ?c)
WHERE
  { GRAPH ?g
      { BIND("POLYGON((-90 -90, -90 -78.75, -78.75 -78.75, -78.75 -90, -90 -90))"^^geo:wktLiteral AS ?queryGeom)
        ?feature spatial:intersectBoxGeom ( ?queryGeom ) ;
                 geo:hasGeometry ?featureGeom .
        ?featureGeom geo:asWKT ?featureGeomWkt
        FILTER geof:sfIntersects(?featureGeomWkt, ?queryGeom)
      }
  }
ug
: Benchmark queries target the union default graph, i.e. a view over all named graphs. Does not use GRAPH.
Example query:
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX spatial: <http://jena.apache.org/spatial#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
SELECT (count(*) AS ?c)
WHERE
  { { BIND("POLYGON((-90 -90, -90 -78.75, -78.75 -78.75, -78.75 -90, -90 -90))"^^geo:wktLiteral AS ?queryGeom)
      ?feature spatial:intersectBoxGeom ( ?queryGeom ) ;
               geo:hasGeometry ?featureGeom .
      ?featureGeom geo:asWKT ?featureGeomWkt
    }
    FILTER geof:sfIntersects(?featureGeomWkt, ?queryGeom)
  }
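The three query sets share the same spatial pattern and differ only in how that pattern is wrapped. The following Python sketch illustrates this relationship. It is not part of GridBench: the function name `build_query` and its parameters are hypothetical, and for the `ug` variant the FILTER is placed directly in the main group rather than after a nested group as in the query listed above, which should be equivalent.

```python
# Shared prefixes, copied from the benchmark queries listed above.
PREFIXES = """\
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX spatial: <http://jena.apache.org/spatial#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
"""

# Spatial pattern common to all three query sets; {wkt} is the query region.
BODY = """\
BIND("{wkt}"^^geo:wktLiteral AS ?queryGeom)
?feature spatial:intersectBoxGeom ( ?queryGeom ) ;
         geo:hasGeometry ?featureGeom .
?featureGeom geo:asWKT ?featureGeomWkt
FILTER geof:sfIntersects(?featureGeomWkt, ?queryGeom)
"""


def build_query(variant, wkt, graph_iri="http://www.example.org/graph/0"):
    """Return the query text for 'ng-one', 'ng-all' or 'ug' (hypothetical helper)."""
    body = BODY.format(wkt=wkt)
    if variant == "ng-one":
        # Constant graph: only one named graph is matched.
        pattern = "GRAPH <" + graph_iri + "> {\n" + body + "}\n"
    elif variant == "ng-all":
        # Graph variable: every named graph is matched.
        pattern = "GRAPH ?g {\n" + body + "}\n"
    elif variant == "ug":
        # No GRAPH keyword: relies on the union default graph.
        pattern = body
    else:
        raise ValueError(f"unknown variant: {variant!r}")
    return PREFIXES + "SELECT (count(*) AS ?c)\nWHERE {\n" + pattern + "}\n"


if __name__ == "__main__":
    cell = "POLYGON((-90 -90, -90 -78.75, -78.75 -78.75, -78.75 -90, -90 -90))"
    print(build_query("ng-all", cell))
```

Calling `build_query` with the same WKT polygon for each variant yields three queries over the same region that differ only in the GRAPH wrapper.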
- eval-geoplus-ng-one
- eval-geoplus-ng-all
- eval-geoplus-ug
- eval-vanilla-ng-one
- eval-vanilla-ng-all
- eval-vanilla-ug
Coming soon. ~ 2024-03-28
Query:
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX lsq: <http://lsq.aksw.org/vocab#>
PREFIX agg: <http://jena.apache.org/ARQ/function/aggregate#>
SELECT ?time ?value WHERE {
  { SELECT ?benchmarkRun (AVG(?duration) AS ?durationAvg) (agg:stdev(?duration) AS ?durationStdev) {
      GRAPH ?query {
        ?query lsq:hasLocalExec ?localExec .
        ?localExec
          lsq:benchmarkRun ?benchmarkRun ;
          lsq:hasQueryExec/lsq:evalDuration ?duration .
      }
    } GROUP BY ?benchmarkRun }
  GRAPH ?query {
    ?query
      lsq:hasLocalExec ?localExec ;
      geo:hasGeometry/geo:asWKT ?wkt .
    ?localExec
      lsq:benchmarkRun ?benchmarkRun ;
      lsq:hasQueryExec ?queryExec .
    ?queryExec
      prov:atTime ?time ;
      lsq:evalDuration ?duration .
  }
  GRAPH ?runGraph { ?benchmarkRun lsq:runId ?runId }
  FILTER(?runId = 0)
  # BIND((?duration - ?durationAvg) / ?durationStdev AS ?value) # How many sigmas a value differs from the average
  # BIND(?duration - ?durationAvg AS ?value)
  BIND(?duration AS ?value)
}
ORDER BY ASC(?time)
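The commented-out BIND expressions in the query above describe two alternative values: the difference between a duration and the run's average, and that difference expressed in standard deviations. The following Python sketch shows the same arithmetic on plain numbers. The function name and the sample durations are hypothetical, and `statistics.stdev` (the sample standard deviation) is assumed to correspond to `agg:stdev`.

```python
from statistics import mean, stdev


def sigma_deviations(durations):
    """Return, for each duration, how many standard deviations it lies from the mean.

    Mirrors (?duration - ?durationAvg) / ?durationStdev from the query.
    Requires at least two durations, because the sample standard deviation
    is undefined for a single value.
    """
    avg = mean(durations)
    sd = stdev(durations)
    if sd == 0:
        # All durations are identical. Returning 0.0 is a choice made for this
        # sketch; the SPARQL expression would not produce a value in this case.
        return [0.0 for _ in durations]
    return [(d - avg) / sd for d in durations]


if __name__ == "__main__":
    # Made-up durations in seconds, for illustration only.
    for value in sigma_deviations([1.0, 2.0, 3.0]):
        print(value)
```

Values far from zero indicate query executions whose runtime is unusual compared with the rest of the same benchmark run.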
To make the benchmark as reproducible as possible, we packaged it as an Apache Maven build.
- Apache Maven must be installed
- A Docker daemon must be running
The default configuration requires ~48GB of free RAM.
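The prerequisites above can be checked before starting a long build. The following Python sketch is one possible pre-flight check and is not shipped with this repository. It assumes the executables are named `mvn` and `docker`, treats a successful `docker info` call as evidence of a running daemon, and does not check the amount of free RAM. Both function names are hypothetical.

```python
import shutil
import subprocess


def missing_tools(tools=("mvn", "docker")):
    """Return the names from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def docker_daemon_running(timeout_seconds=30):
    """Return True if `docker info` exits successfully, False otherwise."""
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
    except (OSError, subprocess.TimeoutExpired):
        # docker is not installed, not executable, or did not answer in time.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    elif not docker_daemon_running():
        print("Docker is installed, but the daemon does not appear to be running.")
    else:
        print("Maven and Docker look ready.")
```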
- Install the eval-template
cd eval-template
mvn install
- Run the actual evaluation
cd eval-parent
mvn package
- The query runtimes are collected in a TriG dataset, with one named graph per query, under:
./eval-parent/eval-EXPERIMENT/target/bench.trig
The number of grid cells and the number of graphs can be configured in eval-template/pom.yml.