# Evaluation

**Note:** At the moment, the implementation can only read evaluation files in the QALD format.

To evaluate the configured pipelines, first modify the `eval_config.json` file (a sketch of what such a config might look like is shown after the steps below). Afterwards, perform the following steps:

- Set up and start the BERT similarity computation service (see the BERT similarity service README).
- Execute `python run_test.py` to generate the gold and prediction files for all the pipelines (check the file for custom arguments).
- Execute `python eval_test.py` to evaluate each prediction file against its gold file using BENG.
- Wait for BENG to finish the evaluation.
- Execute `python gen_eval_results.py` to extract the results from BENG and write them to a TSV file named `evaluation_results.tsv`.
- Optionally, execute `python format_translated_qald.py` to format the predictions back into QALD format.
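
The actual schema of `eval_config.json` is defined by this repository, so the snippet below is only an illustrative sketch: every key in it (`pipelines`, `eval_file`, `output_dir`, `bert_service_url`) is a hypothetical placeholder, not the real configuration format.

```jsonc
{
  // hypothetical: names of the pipelines to evaluate
  "pipelines": ["pipeline_a", "pipeline_b"],
  // hypothetical: path to the evaluation file (QALD format)
  "eval_file": "data/qald_test.json",
  // hypothetical: where gold and prediction files are written
  "output_dir": "results/",
  // hypothetical: endpoint of the running BERT similarity service
  "bert_service_url": "http://localhost:8000"
}
```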
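
Assuming the BERT similarity service is already running and `eval_config.json` has been adjusted, the whole evaluation reduces to running the scripts in order (script names are taken from the steps above; any custom arguments depend on your setup):

```sh
# Generate the gold and prediction files for all configured pipelines
python run_test.py

# Evaluate each prediction file against its gold file using BENG
python eval_test.py

# After BENG has finished, collect the results into evaluation_results.tsv
python gen_eval_results.py

# Optional: convert the predictions back into QALD format
python format_translated_qald.py
```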