Java-based Semantic Faceted Search Framework featuring a Text UI, Fluent API and Benchmark Generator

Scaseco/facete3

Facete3 Faceted Search Framework

Facete is a faceted search framework for SPARQL-accessible data. We are working on aggregations, so it is actually becoming more of a SPARQL-based business intelligence system.

A brief history

  • Facete 1 was probably the first pure JavaScript, SPARQL-based faceted search system used in production; it was deployed at the European Open Dataportal around 2012.
  • Facete 2 was a re-implementation based on Angular 1 (JavaScript), around 2015.
  • Facete 3 is the current iteration, which finally got large parts of the API just right. This time it's Java.

Facete 3 High Level Components

The project comprises the following components:

Notable Features

  • RDF through and through: Most if not all state is captured in Jena RDF Models and is accessed using Java domain interfaces which hide the RDF from the application layer; a pattern for which Jena provides native support.
  • Set-theoretic approach to faceted search: All sets and relations involved in realizing faceted search are ultimately expressed as SPARQL queries. An extensible RDF model serves as the basis for specifying these sets and relations.
  • Query Rewriting over Construct Views allows filtering and sorting by facets that are not present in the data served by the SPARQL endpoint.
  • Query Rewriting to enable access to blank nodes of remote SPARQL endpoints. See the list of supported RDF database management systems.
  • Data Query API for declaring the retrieval of hierarchical structures from remote SPARQL endpoints; somewhat akin to a programmatic GraphQL API for RDF.
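
To make the set-theoretic idea a bit more concrete, here is a minimal sketch of how a set of matching resources and its facet counts could be written as SPARQL. This is an illustration only, not Facete3's query generator: the class and method names (FacetCountSketch, facetCountQuery, facetValueCountQuery) are hypothetical, constraints are passed as plain graph patterns over ?s, and the queries Facete3 actually produces may look quite different.

```java
import java.util.List;

// Hypothetical illustration; none of these names are part of the Facete3 API.
class FacetCountSketch {

    // The set of matching resources is described by graph patterns over ?s.
    static String matchingSet(List<String> constraints) {
        return String.join(" ", constraints);
    }

    // Facets: for each property ?p, count the matching resources that have it.
    static String facetCountQuery(List<String> constraints) {
        return "SELECT ?p (COUNT(DISTINCT ?s) AS ?c) WHERE { "
                + matchingSet(constraints)
                + " ?s ?p ?o } GROUP BY ?p ORDER BY DESC(?c)";
    }

    // Facet values: for one property, count the matching resources per value ?o.
    static String facetValueCountQuery(List<String> constraints, String propertyIri) {
        return "SELECT ?o (COUNT(DISTINCT ?s) AS ?c) WHERE { "
                + matchingSet(constraints)
                + " ?s <" + propertyIri + "> ?o } GROUP BY ?o ORDER BY DESC(?c)";
    }

    public static void main(String[] args) {
        // Constraint: ?s is an owl:Class (same filter as in the API teaser of this README).
        List<String> constraints = List.of(
                "?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .");

        System.out.println(facetCountQuery(constraints));
        System.out.println(facetValueCountQuery(constraints,
                "http://www.w3.org/2000/01/rdf-schema#subClassOf"));
    }
}
```

Adding a further constraint only narrows the pattern that defines the matching set; the two counting queries keep the same shape, which is what makes the approach composable.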

Teasers

Here are a few teasers to give you an impression of the project before you read on.

A screenshot of the Facete3 terminal application on Scholarly Data's SPARQL endpoint:

Docker Images

  • Facete3 terminal app
    • amd64: docker run -it aklakan/facete3
    • arm64: docker run -it aklakan/facete3-arm64

Latest Updates

  • Upcoming
    • Integrated IRI retrieval / download feature
  • 2022-05-04
    • Docker images available on dockerhub for the terminal application
  • 2020-03-10
    • Support for reading from stdin, for example: ./script-that-outputs-rdf.sh | facete3 - (the trailing - tells facete3 to read from stdin)
    • Improved scalability by adding paginator to facet values
  • 2019-12-20
    • Facete3 UI Enhancements
      • Added HDT support
      • Ordering facets and facet values by RDF term and counts
      • Pressing 's' for '(s)how query' context-sensitively brings up a message dialog with the SPARQL query supplying the UI component's displayed data

And here is a teaser of what the Facete3 core API looks like, with reactive streams powered by RxJava2:

class TestFacetedQuery {
    @Test
    public void testComplexQuery() {
        // someDataset is a placeholder for a Jena Dataset provided by the test setup
        RDFConnection conn = RDFConnectionFactory.connect(someDataset);
        FacetedQuery fq = FacetedQueryImpl.create(conn);

        FacetValueCount fc =
                // -- Faceted Browsing API
                fq.root()
                .fwd(RDF.type).one()
                    .constraints()
                        .eq(OWL.Class).activate()
                    .end()
                .parent()
                .fwd()
                .facetValueCounts()    
                // --- DataQuery API
                .randomOrder()
                .limit(1)
                .exec()
                // --- RxJava API
                .firstElement()
                .timeout(10, TimeUnit.SECONDS)
                .blockingGet();

        System.out.println("FacetValueCount: " + fc);
    }
}
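
The last three calls of the chain (firstElement, timeout, blockingGet) turn the asynchronous result stream into a single value that the test waits for, giving up after ten seconds. For readers who do not know RxJava, the following is a rough JDK-only analogy of that "wait for one result, but not forever" step. It uses CompletableFuture instead of RxJava, so it is not the code path Facete3 takes, and the class and method names (BlockingWithTimeoutSketch, firstOrTimeout) are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// JDK-only analogy (requires Java 9+ for orTimeout); not RxJava and not Facete3 code.
class BlockingWithTimeoutSketch {

    // Blocks until the future yields a value; if the timeout elapses first,
    // join() throws a CompletionException whose cause is a TimeoutException.
    static <T> T firstOrTimeout(CompletableFuture<T> source, long timeout, TimeUnit unit) {
        return source.orTimeout(timeout, unit).join();
    }

    public static void main(String[] args) {
        // A result that arrives in time is simply returned.
        String value = firstOrTimeout(
                CompletableFuture.supplyAsync(() -> "some facet value count"),
                10, TimeUnit.SECONDS);
        System.out.println("Result: " + value);

        // A future that never completes stands in for an unresponsive endpoint.
        try {
            firstOrTimeout(new CompletableFuture<String>(), 100, TimeUnit.MILLISECONDS);
        } catch (CompletionException e) {
            boolean timedOut = e.getCause() instanceof TimeoutException;
            System.out.println("Timed out: " + timedOut);
        }
    }
}
```

One difference worth keeping in mind: an RxJava stream can also complete without emitting anything, a case this single-value analogy does not model.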

Building

This project uses Apache Maven and is thus built with:

mvn clean install

⚠️ You need to add facete3-core/facete3-impl/target/generated-sources/apt to the build path!

  • The Facete3 bundle is built under facete3-bundle/target/facete3-bundle-VERSION-jar-with-dependencies.jar with VERSION matching the project version. The bundles are also available for download from the Releases Section.
  • Debian packages are built under facete3-core-parent/facete3-debian-cli and facete3-fsbg-parent/facete3-fsbg-debian-cli. Because they share most of the code, we plan to combine them into a single package.

Running the bundle

  • The Facete3 Terminal App
# Show help
java -cp facete3-bundle-VERSION-jar-with-dependencies.jar facete3 --help

# Run against a local file, remote RDF document or SPARQL endpoint
java -cp facete3-bundle-VERSION-jar-with-dependencies.jar facete3 http://www.w3.org/1999/02/22-rdf-syntax-ns#

# For installation from the debian package, the command is
facete3 --help
  • Faceted Search Benchmark Generator (fsbg)
# Show help
java -cp facete3-bundle-VERSION-jar-with-dependencies.jar fsbg --help

# Generate a benchmark using default settings
# against a given SPARQL endpoint
java -cp facete3-bundle-VERSION-jar-with-dependencies.jar fsbg http://localhost:8890/sparql


# Generate a benchmark using alternative config (class path or file)
# against a given SPARQL endpoint
# Note: config-tiny.ttl is part of the classpath
java -cp facete3-bundle-VERSION-jar-with-dependencies.jar fsbg -c config-tiny.ttl http://localhost:8890/sparql

# For installation from the debian package, the command is
facete3-fsbg --help

Please refer to the respective Facete 3 component READMEs for details about how to use them.

License

The source code of this repo is published under the Apache License Version 2.0. Dependencies may be licensed under different terms. When in doubt, please refer to the licenses of the dependencies declared in the pom.xml files.