Performance comparison Oxigraph vs. QLever #841
hannahbast
-
P.S. In the meantime, I have included other engines in the comparison (it's now: Oxigraph, Apache Jena, Stardog, Blazegraph, Virtuoso, QLever). Results are reported here: https://github.com/ad-freiburg/qlever/wiki/QLever-performance-evaluation-and-comparison-to-other-SPARQL-engines . The first table gives a nice overview.
-
Dear Thomas,
I finally found the time to play around with Oxigraph a bit. Let me first say that I am super impressed with the extent and the professionalism of this project and this repository. All the more since so far this has essentially been a one-person project. It's a great idea to provide the various components (like the RDF parser or the SPARQL parser) as independent modules/crates. Compilation from source was unproblematic and the command-line interface is easy to use and self-explanatory. In particular, it is very easy to load a dataset and start a server. Everything just works. In the world of academe, this is the absolute exception.
I compared loading time, index size, and query time of Oxigraph vs. QLever for a moderately sized RDF dataset, namely https://dblp.org/rdf/dblp.ttl.gz (1.7 GB compressed, 390 M triples), and a variety of queries (see below). Everything was run on an AMD Ryzen 9 7950X 16-Core machine with 128 GB of RAM and 7.1 TB of NVMe SSD (high-quality but affordable consumer hardware, total cost around 2500 €).
Loading time was 640s for Oxigraph (0.6 M triples/sec) vs. 231s for QLever (1.7 M triples/sec) on NVMe SSD. On HDD, it was 2537s for Oxigraph vs. 270s for QLever (apparently, Oxigraph makes heavy use of random access during loading). The total size of the index files was 66.5 GB for Oxigraph vs. 7.7 GB for QLever (apparently, Oxigraph doesn't compress much yet). I am curious whether the proportions of these stats carry over to a larger dataset like Wikidata (19 B triples). For QLever, load time and index size are essentially proportional to the size of the input dataset.
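For readers who want to check the derived figures, here is a minimal sketch that recomputes them from the raw numbers quoted above. The variable and function names are made up for illustration, and it assumes "GB" means 10^9 bytes, which the post does not state.

```python
# Raw figures as quoted in the post (DBLP dataset, 390 M triples).
TRIPLES = 390e6

load_seconds_ssd = {"Oxigraph": 640, "QLever": 231}
load_seconds_hdd = {"Oxigraph": 2537, "QLever": 270}
index_gb = {"Oxigraph": 66.5, "QLever": 7.7}


def throughput_m_per_s(triples, seconds):
    """Loading throughput in millions of triples per second."""
    return triples / seconds / 1e6


for engine, secs in load_seconds_ssd.items():
    print(f"{engine}: {throughput_m_per_s(TRIPLES, secs):.1f} M triples/s on SSD")

# How much slower loading gets when moving from SSD to HDD.
hdd_slowdown = {e: load_seconds_hdd[e] / load_seconds_ssd[e] for e in load_seconds_ssd}

# Index size ratio and average index bytes per triple
# (assumption: 1 GB = 1e9 bytes).
index_ratio = index_gb["Oxigraph"] / index_gb["QLever"]
bytes_per_triple = {e: index_gb[e] * 1e9 / TRIPLES for e in index_gb}

for engine in index_gb:
    print(
        f"{engine}: HDD/SSD slowdown {hdd_slowdown[engine]:.1f}x, "
        f"about {bytes_per_triple[engine]:.0f} bytes per triple"
    )
print(f"Index size ratio Oxigraph/QLever: {index_ratio:.1f}x")
```

The per-triple figure is only an average over the whole index and says nothing about how either engine lays out its data.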
Here are the results for six queries from https://qlever.cs.uni-freiburg.de/dblp ("Examples"), selected for their variety. For QLever, the cache was cleared before each query. For Oxigraph, no special precautions were taken, except that the server was started from scratch once at the beginning. Both servers were run on SSD. For Oxigraph, it can make a huge difference when the disk cache is empty (sudo bash -c "sync; sleep 5; echo 3 > /proc/sys/vm/drop_caches"). For the queries strongly affected by this, I indicated this by writing X -> Y, where X is the query time with empty disk cache and Y is the query time when repeating the query.
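The cold/warm measurement described above could be scripted roughly as follows. This is a sketch, not the procedure actually used for the numbers in this post: `run_query` stands for any callable that sends one query to the server, and the helper names and the `factor` threshold for "strongly affected" are assumptions made for illustration.

```python
import subprocess
import time


def drop_os_page_cache():
    """Empty the Linux disk cache using the command quoted in the post (needs sudo)."""
    subprocess.run(
        ["sudo", "bash", "-c", "sync; sleep 5; echo 3 > /proc/sys/vm/drop_caches"],
        check=True,
    )


def time_call(run_query):
    """Return the wall-clock seconds taken by one call of run_query()."""
    start = time.perf_counter()
    run_query()
    return time.perf_counter() - start


def cold_and_warm(run_query, drop_caches=drop_os_page_cache):
    """Time the query once with an empty disk cache, then once more right after."""
    drop_caches()
    cold = time_call(run_query)
    warm = time_call(run_query)
    return cold, warm


def format_result(cold, warm, factor=2.0):
    """Render 'X -> Y' when the cold run is much slower, otherwise just Y.

    The threshold `factor` is an arbitrary illustrative choice.
    """
    if cold > factor * warm:
        return f"{cold:.2f}s -> {warm:.2f}s"
    return f"{warm:.2f}s"
```

Passing a no-op as `drop_caches` makes it possible to try the helpers without root access, for example `format_result(*cold_and_warm(my_query, drop_caches=lambda: None))`, where `my_query` is a placeholder for your own query function.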