voice-grabber

This repo is a collection of scripts for downloading the dataset needed to train jibjib-model.

Repo layout

The complete list of JibJib repos is:

jibjib: Our Android app. Records sounds and looks fantastic.
deploy: Instructions to deploy the JibJib stack.
jibjib-model: Code for training the machine learning model for bird classification.
jibjib-api: Main API to receive database requests & audio files.
jibjib-data: A MongoDB instance holding information about detectable birds.
jibjib-query: A thin Python Flask API that handles communication with the TensorFlow Serving instance.
gopeana: An API client for Europeana, written in Go.
voice-grabber: A collection of scripts to construct the dataset required for model training.

Scripts

In the top level of this repo, there are several helper scripts to create/change JSON and CSV files, as well as converter.py to convert audio files from mp3 to wav.
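As a rough illustration of the mp3-to-wav step, the sketch below builds and runs one ffmpeg call per .mp3 file in a directory. It assumes ffmpeg is installed and on the PATH. The function names and directory arguments are hypothetical and are not taken from converter.py, which may work differently.

```python
import subprocess
from pathlib import Path
from typing import List


def ffmpeg_command(src: Path, dst_dir: Path) -> List[str]:
    """Build the ffmpeg call that converts one mp3 file to wav (hypothetical helper)."""
    dst = dst_dir / (src.stem + ".wav")
    # -y overwrites an existing output file; ffmpeg infers the wav format from the extension.
    return ["ffmpeg", "-y", "-i", str(src), str(dst)]


def convert_directory(src_dir: Path, dst_dir: Path) -> int:
    """Convert every .mp3 file in src_dir and return how many files were converted."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for src in sorted(src_dir.glob("*.mp3")):
        # check=True raises CalledProcessError if ffmpeg exits with an error.
        subprocess.run(ffmpeg_command(src, dst_dir), check=True)
        count += 1
    return count
```

Calling `convert_directory(Path("mp3"), Path("wav"))` would convert everything under a hypothetical `mp3/` folder into `wav/`.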

data_grabber/

This Go script uses gopeana to populate a JSON file and a CSV file with information about the bird voices published on Europeana by the Tierstimmenarchiv (an open dataset of the Museum für Naturkunde Berlin).
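The actual script is written in Go. Purely to illustrate the idea of writing the same records to both JSON and CSV, here is a minimal Python sketch. The field names (`id`, `title`, `audio_url`) are assumptions made for the example and do not reflect the real output schema.

```python
import csv
import io
import json
from typing import Dict, List


def to_json(records: List[Dict[str, str]]) -> str:
    """Serialize the records as a JSON array, keeping non-ASCII characters readable."""
    return json.dumps(records, ensure_ascii=False, indent=2)


def to_csv(records: List[Dict[str, str]], fields: List[str]) -> str:
    """Serialize the same records as CSV with a header row."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops keys that are not listed in fields.
    writer = csv.DictWriter(buffer, fieldnames=fields,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical record; the real metadata fields may differ.
example = [{"id": "1", "title": "Amsel", "audio_url": "https://example.org/1.mp3"}]
```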

file_grabber/

This Go script uses the output of data_grabber/ to follow the links provided on Europeana and download the audio files.
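The download step could look roughly like the following Python sketch (the real script is Go). It assumes the links have already been read from the data_grabber/ output into a list of URLs. The helper names and the fallback file name are hypothetical.

```python
import shutil
import urllib.request
from pathlib import Path, PurePosixPath
from typing import Iterable, List
from urllib.parse import unquote, urlparse


def target_name(url: str, fallback: str) -> str:
    """Derive a local file name from the URL path, or use the fallback if there is none."""
    name = PurePosixPath(unquote(urlparse(url).path)).name
    return name or fallback


def download_all(urls: Iterable[str], out_dir: str) -> List[Path]:
    """Download each URL into out_dir, skipping files that already exist."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for i, url in enumerate(urls):
        dest = out / target_name(url, f"recording_{i}.mp3")
        if dest.exists():
            continue  # already fetched in an earlier run
        with urllib.request.urlopen(url, timeout=30) as resp, open(dest, "wb") as fh:
            shutil.copyfileobj(resp, fh)
        saved.append(dest)
    return saved
```

Skipping existing files makes it possible to resume an interrupted run without downloading everything again.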

wiki_grabber/

This Python script takes input from a CSV file and uses the Wikipedia API to extract summaries about birds, then saves them in a separate CSV.
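A minimal sketch of fetching one summary, assuming the MediaWiki `action=query` endpoint with `prop=extracts` is used. The script in this repo may call the API differently; the function names, the default language, and the User-Agent string are assumptions for the example.

```python
import json
import urllib.request
from urllib.parse import urlencode


def summary_url(title: str, lang: str = "en") -> str:
    """Build a MediaWiki query URL that asks for the plain-text intro of one page."""
    params = {
        "action": "query",
        "prop": "extracts",
        "exintro": 1,       # only the section before the first heading
        "explaintext": 1,   # plain text instead of HTML
        "redirects": 1,
        "format": "json",
        "titles": title,
    }
    return f"https://{lang}.wikipedia.org/w/api.php?" + urlencode(params)


def extract_summary(payload: dict) -> str:
    """Return the first extract found in an API response, or "" if the page is missing."""
    pages = payload.get("query", {}).get("pages", {})
    for page in pages.values():
        if "extract" in page:
            return page["extract"]
    return ""


def fetch_summary(title: str, lang: str = "en") -> str:
    """Fetch and parse the summary for one title (performs a network request)."""
    request = urllib.request.Request(
        summary_url(title, lang),
        headers={"User-Agent": "voice-grabber-example/0.1"},  # hypothetical identifier
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return extract_summary(json.load(resp))
```

Each bird name read from the input CSV would be passed to `fetch_summary`, and the result written as one row of the output CSV.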

xeno_grabber/

This is a collection of scripts to:

clean the files directory (in our case, only birds with a German Wikipedia entry were kept, in order to reduce the total number of classes)
nicely crawl Xeno Canto for audio files of birds
download the audio files from Xeno Canto
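If "nicely" is read as crawling politely, that is, pausing between requests so the server is not flooded, the crawl loop can be sketched as below. This is an interpretation and not the repo's actual code. The `fetch` callable, the delay value, and the function name are hypothetical.

```python
import time
from typing import Callable, Dict, Iterable


def crawl_politely(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Fetch each URL in order, waiting delay_seconds between consecutive requests."""
    results = {}
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay_seconds)  # no pause is needed before the first request
        results[url] = fetch(url)
    return results
```

Passing `fetch` and `sleep` as parameters keeps the loop independent of any particular HTTP client and makes it easy to exercise without network access.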
