The implementation of "Domain Representative Keywords Selection: A Probabilistic Approach" (Findings of ACL '22)

pritomsaha/keyword-selection

Domain Representative Keywords Selection: A Probabilistic Approach

Introduction

This is the source code for the keyword selection framework, developed for selecting a set of k keywords from a candidate set.

Usage

We provide the data preprocessing code and the Python implementation of our method and of the baselines specified in the paper.

To use our data preprocessing code, create a folder named after your dataset under the "./data/" folder and put the corpus in the "source" folder. Instructions for running the preprocessing are in the "preprocessing" folder.

Otherwise, you can directly download our preprocessed datasets and the other ground-truth data used in the experiments from Google Drive; unzip the archive and put the dataset under the "./data/" folder.
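For the custom-dataset route, the folder layout can be created with a few lines of Python. This is a minimal sketch and not part of the repository: the helper name `prepare_dataset_dirs` is hypothetical, and it assumes the "source" folder sits directly inside the dataset folder, i.e. "./data/<dataset_name>/source". Check the "preprocessing" folder for the layout the scripts actually expect.

```python
from pathlib import Path


def prepare_dataset_dirs(dataset_name: str, data_root: str = "./data") -> Path:
    """Create <data_root>/<dataset_name>/source and return that path.

    Assumption (not confirmed by the repository): the corpus belongs in a
    "source" folder nested directly under the dataset folder.
    """
    source_dir = Path(data_root) / dataset_name / "source"
    # parents=True creates ./data and the dataset folder if they are missing;
    # exist_ok=True makes the call safe to repeat.
    source_dir.mkdir(parents=True, exist_ok=True)
    return source_dir


# Example usage ("my_dataset" is a placeholder name):
# corpus_dir = prepare_dataset_dirs("my_dataset")
# Copy your corpus files into corpus_dir before running the preprocessing.
```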

To Run

cd ./src;
python3 keyword_selection.py --config config_filename;

Please see the config files in the "./src/configs/" folder, which were used for the experiments presented in the paper. Results are saved under the "results" folder of the corresponding dataset folder.
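To repeat several experiments in a row, the run command can be generated once per config file. The sketch below only builds and prints the commands; it does not run them. The helper `build_commands` is hypothetical, and it assumes that `--config` accepts the bare file name of a config in the "configs" folder. The README does not say whether a name or a path is expected, so adjust as needed.

```python
import sys
from pathlib import Path
from typing import List


def build_commands(config_dir: str = "configs") -> List[List[str]]:
    """Return one keyword_selection.py command per file in config_dir."""
    root = Path(config_dir)
    if not root.is_dir():
        return []  # nothing to run if the folder is missing
    configs = sorted(p for p in root.iterdir() if p.is_file())
    # Assumption: --config takes the config file name (not a full path).
    return [
        [sys.executable, "keyword_selection.py", "--config", p.name]
        for p in configs
    ]


if __name__ == "__main__":
    # Intended to be run from inside ./src, next to keyword_selection.py.
    for cmd in build_commands():
        print(" ".join(cmd))
```

Each printed line can be pasted into a shell, or each command list passed to `subprocess.run` once the argument format has been confirmed.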

Publications

Please cite the following paper if you are using this code. Thanks!

@inproceedings{pritom2022keyword,
  title={Domain Representative Keywords Selection: A Probabilistic Approach},
  author={Akash, Pritom Saha and Huang, Jie and Chang, Kevin Chen-Chuan and Li, Yunyao and Popa, Lucian and Zhai, ChengXiang},
  booktitle={Findings of the Association for Computational Linguistics: ACL 2022},
  year={2022}
}
