How to setup GERBIL

Michael Röder edited this page Aug 11, 2021 · 13 revisions

This page describes the steps needed to set up a local GERBIL system. Note that you don't have to do this if you only want to test an annotator or a dataset, since this can be done using our online system.

Downloading and running the system

Prerequisites

  • Java 1.7 or newer
  • Maven 2
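
Before continuing, it can help to confirm that both tools are available on the PATH. The following is a minimal sketch; check_tool is a hypothetical helper name and not part of GERBIL.

```shell
# Hypothetical helper: report whether a command is available on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
    return 1
  fi
}

# The two prerequisites listed above. "|| true" keeps the script going
# so that both results are always printed.
check_tool java || true
check_tool mvn || true
```

If both are reported as found, `java -version` prints the installed Java version so that it can be compared with the minimum listed above.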

Download the program

The easiest way to get GERBIL is to download it from GitHub.

git clone -b master https://github.com/AKSW/gerbil.git

Download the data

If you are using Linux, you can simply run start.sh, which downloads the data, extracts it, and starts the system listening on http://localhost:1234/gerbil .

If you are using another operating system or want to download the data to a specific folder, you can get the data from https://github.com/AKSW/gerbil/releases/download/v1.2.5/gerbil_data.zip and extract it. Then configure the system as described in the next section.
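
On a system with curl and unzip installed, the manual download could be scripted roughly as follows. This is a sketch, not the logic of start.sh: fetch_gerbil_data is a hypothetical helper, and the target folder in the example call is only an assumption.

```shell
# URL taken from the paragraph above.
GERBIL_DATA_URL="https://github.com/AKSW/gerbil/releases/download/v1.2.5/gerbil_data.zip"

# Hypothetical helper: download the archive into the given folder
# (default: current folder) and extract it there.
fetch_gerbil_data() {
  target_dir=${1:-.}
  mkdir -p "$target_dir" &&
    # -L follows the redirect used by GitHub release downloads.
    curl -L -o "$target_dir/gerbil_data.zip" "$GERBIL_DATA_URL" &&
    unzip -o "$target_dir/gerbil_data.zip" -d "$target_dir"
}

# Example call (needs network access; the folder name is an assumption):
# fetch_gerbil_data "$HOME/gerbil"
```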

Configure GERBIL

If you have used the start.sh script, the system should already be configured. Otherwise, open src/main/properties/gerbil.properties and set org.aksw.gerbil.DataPath to the folder containing the data extracted from the gerbil_data.zip file.
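
The property can of course be changed in any text editor. For scripted setups, a sketch like the following could do the same. It assumes that the file already contains a line of the form org.aksw.gerbil.DataPath=..., and that the new path contains neither "|" nor "&"; set_data_path is a hypothetical helper name.

```shell
# Hypothetical helper: rewrite the org.aksw.gerbil.DataPath line of a
# properties file in place, leaving all other lines untouched.
set_data_path() {
  props_file=$1
  data_dir=$2
  tmp_file=$(mktemp)
  sed "s|^org\.aksw\.gerbil\.DataPath=.*|org.aksw.gerbil.DataPath=$data_dir|" \
    "$props_file" > "$tmp_file" &&
    cat "$tmp_file" > "$props_file"   # keeps the original file permissions
  rm -f "$tmp_file"
}

# Example call from the root of the checked-out project
# (the data folder is an assumption):
# set_data_path src/main/properties/gerbil.properties "$HOME/gerbil/gerbil_data"
```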

Start GERBIL

GERBIL can be started by running the start.sh script or by running mvn clean tomcat:run

Note that GERBIL may show the following warning while starting: [main] WARN [org.aksw.gerbil.datasets.datahub.DatahubNIFLoader] - <Couldn't get any datasets with the gerbil tag from DataHubIO. Exception: org.springframework.web.client.HttpClientErrorException: 404 Not Found>

This warning shouldn't cause any problems and can be ignored.
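
To check from a script whether the system has come up, the address mentioned above can be polled with curl. This is only a sketch: wait_for_gerbil is a hypothetical helper, and the number of attempts and the two-second pause are arbitrary choices.

```shell
# Hypothetical helper: poll a URL until it answers without an HTTP error
# or until the given number of attempts is used up.
wait_for_gerbil() {
  url=$1
  attempts=$2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -s: quiet, -f: treat HTTP errors (>= 400) as failure.
    if curl -s -f -o /dev/null "$url"; then
      echo "GERBIL is answering at $url"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "No answer from $url after $attempts attempts" >&2
  return 1
}

# Example call in a second terminal, after starting GERBIL:
# wait_for_gerbil http://localhost:1234/gerbil 30
```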

Restrictions

Due to licensing restrictions, we are not allowed to upload the following datasets:

  • AIDA/CoNLL
  • Microposts 2013
  • Microposts 2014

Regarding the annotators, it is possible that a key or registration is needed to use them without limitations. Please take a look at this wiki page: https://github.com/AKSW/gerbil/wiki/How-to-get-API-keys

With Docker

We provide a Docker image (dicegroup/gerbil) and a Docker Compose file to run GERBIL easily. To use the Compose file, check out the git project and execute the following in the project directory:

docker-compose up

The following directories inside the container may be worth mounting to a local directory:

Directory | Description
/usr/local/tomcat/gerbil_properties | Contains all properties files used to configure GERBIL.
/usr/local/tomcat/database | Contains GERBIL's database. It should be mounted to persist experiment results.
/usr/local/tomcat/cache | Contains caches. Mounting it can speed up future experiments.
/usr/local/tomcat/datasets | Contains the datasets.
/usr/local/tomcat/indexes | The GERBIL image comes without the two indexes that the start.sh script would download. These indexes can be downloaded to a local directory (using scripts/download_indexes.sh) and mounted into the container using this directory.

The Docker container will copy the properties files and the datasets that come with the gerbil_data.zip file into the gerbil_properties and datasets directories if they are not already present there. This behavior can be deactivated by setting the environment variables GERBIL_COPY_DIRS and/or GERBIL_COPY_PROPS to false.
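
When the image is run directly instead of through the compose file, the mounts and environment variables described above could be combined as in the following sketch. It only prints the command so that it can be reviewed first. print_gerbil_run is a hypothetical helper, the host folder name is an assumption, and so is the port mapping 1234:8080 (it assumes that Tomcat listens on port 8080 inside the container). Host paths containing spaces are not handled.

```shell
# Hypothetical helper: print a docker run command that mounts each of the
# container directories listed above below a single host folder.
# Passing "false" as second argument additionally switches off the copying
# of properties files and datasets.
print_gerbil_run() {
  host_base=$1
  copy=${2:-true}
  cmd="docker run -p 1234:8080"   # port mapping is an assumption
  for name in gerbil_properties database cache datasets indexes; do
    cmd="$cmd -v $host_base/$name:/usr/local/tomcat/$name"
  done
  if [ "$copy" = "false" ]; then
    cmd="$cmd -e GERBIL_COPY_DIRS=false -e GERBIL_COPY_PROPS=false"
  fi
  echo "$cmd dicegroup/gerbil"
}

# Print the command for a host folder named gerbil_volumes (an assumption).
print_gerbil_run "$PWD/gerbil_volumes"
```

Calling print_gerbil_run with false as second argument only makes sense if the mounted gerbil_properties and datasets folders have already been filled.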