Skip to content

slowwavesleep/BabelNetExtractor

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

BabelNetExtractor

Scala tool to extract data from local BabelNet indices

Currently, the tool does the following:

  • reads nouns from nouns_no_header.csv
  • queries local BabelNet with these nouns (the expected results are also nouns in English)
  • stores BabelNet ID, candidate FrameNet ID, BabelNet definition, candidate FrameNet definition, and all the edges for each synset in bn_entries.csv and bn_edges.csv

And that's basically it so far.

Steps to reproduce

  • Clone the repository
git clone https://github.com/slowwavesleep/BabelNetExtractor.git
  • Install IntelliJ IDEA (Community version will suffice) if you don't have it

  • Install Scala plugin

  • Open the cloned repository with IDEA (File -> Open)

  • Add BabelNet API to the libraries:

    • File -> Project Structure -> + -> Java

    image

    • Navigate to BabelNet API and select the corresponding .jar file (such as /BabelNet-API-3.6.1/babelnet-api-3.6.1.jar)
    • Add lib directory the same way (such as /BabelNet-API-3.6.1/lib)
    • The list of libraries should look like this:

    image

  • Copy config directory from BabelNet API to the project

  • Point to local indices in config/babelnet.var.properties (for example, babelnet.dir=/home/user/BabelNet/BabelNet-3.6) and comment out the rest of parameters in this file (unless you know what you're doing)

  • Run src/main/scala/Extract.scala using IDEA (right click on it, then Run 'Extract')

  • After a few minutes bn_entries.csv and bn_edges.csv should appear in src/main/resources

About

A scala tool to extract data from local BabelNet indices

Topics

Resources

License

Stars

Watchers

Forks

Languages