Chunliang Lyu edited this page May 10, 2017 · 9 revisions

To keep experiments as reproducible as possible, this page defines what we understand by open APIs and open datasets. The goal of GERBIL is to establish a fair test benchmark for entity annotation systems.

Licences

To the best of our knowledge, GERBIL does not violate the license terms of any dataset or annotator. All annotator implementations are instructed (1) not to log any of the input data provided during the evaluation and (2) not to log any results generated during the evaluation. Liabilities created by the tools should be raised with the tool developers. Please feel free to contact the development team if any concerns arise, and we will be happy to remove your dataset or tool from the framework.

Included APIs

Why is your API not in GERBIL? We place some restrictions on the APIs we evaluate. Anyone can use GERBIL by implementing a wrapper for their annotation service or by supporting the NIF input and output format in their web service. Public annotators in GERBIL must have:

  • An accessible API without a key, OR
  • An accessible API with a key. In that case, the way to obtain the key has to be documented and the key itself must be free for academic use (see How to get API keys).
  • The software, or the algorithms underlying the software, of publicly available annotators in GERBIL has to meet the following:
    a) It is published in a peer-reviewed way.
    b) The source code may or may not be available on shared platforms such as GitHub.
    c) If the source code is not available, a binary of the approach has to be available.

A description of each annotator is given in the publication; see also API descriptions. If your tool meets these restrictions, please feel free to (1) implement a wrapper or an interface for it and (2) send us a pull request so that it can be integrated in the next version of GERBIL.
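The inclusion criteria above can be read as a simple checklist. The following is a minimal sketch of one possible reading of them, not code from GERBIL itself: the `AnnotatorInfo` class, its field names, and the `meets_inclusion_criteria` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class AnnotatorInfo:
    """Hypothetical record describing an annotator (not a GERBIL class)."""
    api_accessible: bool
    requires_key: bool = False
    key_process_documented: bool = False
    key_free_for_academic_use: bool = False
    peer_reviewed: bool = False
    source_available: bool = False
    binary_available: bool = False


def meets_inclusion_criteria(a: AnnotatorInfo) -> bool:
    """Return True if the annotator satisfies this reading of the criteria."""
    # The API must be accessible, with or without a key.
    if not a.api_accessible:
        return False
    # If a key is needed, obtaining it must be documented and free for academia.
    if a.requires_key and not (
        a.key_process_documented and a.key_free_for_academic_use
    ):
        return False
    # The approach must be published in a peer-reviewed way.
    if not a.peer_reviewed:
        return False
    # Source code is optional, but without it a binary must be available.
    return a.source_available or a.binary_available
```

For example, a keyless, peer-reviewed annotator that ships only a binary passes this check, while one with neither source code nor a binary does not.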

Datasets

Why is your dataset not part of GERBIL? We have some restrictions here as well. Anyone can implement a wrapper for a dataset or hand over their NIF-based dataset to GERBIL via URI, file upload or Datahub upload. Datasets in GERBIL have to have i) no licence (in which case we assume they are free to use for academic purposes) or ii) a licence allowing redistribution for scientific use (since GERBIL sends the datasets to the annotators). iii) A dataset whose licence prevents redistribution can still be used in GERBIL, as the GERBIL users ensure that they will not log the benchmark data.

For each dataset, a pointer to the dataset and a short description of how to obtain it are provided; see Licenses for datasets.