
Generate random species names (or any fixed-length sequence of words).


Gullumluvl/Species_markovi


Markov Chain Generator of fixed-length sequences of words.

I use it to generate random species names (Linnean binomial nomenclature).

Takes as input a text file containing one example name per line (e.g. one species name per line).

Outputs a series of randomly generated sequences of words, each word built letter by letter. There is one transition matrix for each word position in the line (e.g. one transition matrix for the genus name and one for the species name).
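To make the per-position idea concrete, here is a minimal sketch of a letter-wise Markov chain with one transition table per word position. It is not the code of species_markovi.py: the function names (`train`, `generate_word`, `generate_name`), the `^`/`$` padding symbols, and the use of an end symbol to stop a word are all assumptions made for illustration, and nested dictionaries stand in for the transition matrices.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # hypothetical padding and terminator symbols

def train(names, n_words=2, order=3):
    """Build one table of next-letter counts per word position (order >= 1)."""
    tables = [defaultdict(lambda: defaultdict(int)) for _ in range(n_words)]
    for name in names:
        words = name.split()
        if len(words) != n_words:
            continue  # skip lines without the expected number of words
        for table, word in zip(tables, words):
            padded = START * order + word.lower() + END
            for i in range(order, len(padded)):
                # context = previous `order` letters, target = next letter
                table[padded[i - order:i]][padded[i]] += 1
    return tables

def generate_word(table, order=3, rng=random, max_len=40):
    """Walk the chain letter by letter until END is drawn (or max_len is hit)."""
    state, letters = START * order, []
    while len(letters) < max_len:
        choices = table[state]
        nxt = rng.choices(list(choices), weights=list(choices.values()))[0]
        if nxt == END:
            break
        letters.append(nxt)
        state = (state + nxt)[-order:]
    return "".join(letters)

def generate_name(tables, order=3, rng=random):
    """Draw one word per position and capitalize the first word only."""
    words = [generate_word(t, order, rng) for t in tables]
    return " ".join(words).capitalize()
```

With a table per position, genus-like endings never leak into the species word and vice versa. A call such as `generate_name(train(names), order=3)` returns one two-word name.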

The first time it is run on a training dataset, it will save the transition matrix and word length distribution in .npy files in the current directory. You can delete those files to rerun the learning step.
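The "learn once, then reuse" behaviour can be sketched as follows for the word length distribution. This is an illustration under assumptions, not the script's code: the function names and the cache path are hypothetical, and JSON is used here instead of the .npy files the script writes, only to keep the example free of dependencies.

```python
import json
import os
from collections import Counter

def learn_lengths(names, n_words=2):
    """Count word lengths separately for each word position."""
    counts = [Counter() for _ in range(n_words)]
    for name in names:
        words = name.split()
        if len(words) == n_words:
            for counter, word in zip(counts, words):
                counter[len(word)] += 1
    return [dict(c) for c in counts]

def load_or_learn(names, cache_path, n_words=2):
    """Reuse the cached result if the file exists; otherwise learn and save."""
    if os.path.exists(cache_path):
        with open(cache_path) as fh:
            # JSON stores keys as strings, so convert lengths back to int
            return [{int(k): v for k, v in d.items()} for d in json.load(fh)]
    lengths = learn_lengths(names, n_words)
    with open(cache_path, "w") as fh:
        json.dump(lengths, fh)
    return lengths
```

Deleting the cache file forces the next call to relearn from the training names, which mirrors the advice above about deleting the saved files.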

Usage

# Generate random species names (2 words: Genus+species)
python3 species_markovi.py trainingsets/Opisthokonta_from_timetree.txt -o 3

# To generate random genus names (one word)
./species_markovi.py trainingsets/Opisthokonta-genera_from_timetree.txt -w 1 -o 3

Output example

An example of names generated with a chain of order 3, with a training set of 47850 Opisthokonta species from TimeTree:

Pseudosophilus furicoloroensis
Hydroas morachyuratus
Cetheria dicoloralis
Feylios placontademontalis
Melaconius thymetterina
Amphorus zymalendicus
Mokopis pyrans
Tarhombodina trix
Chus flexaspardelanevana
Panoplodon neva
Notamia zospinis
Aiti planthurenae
Napeonocohyllonotaena atrilis
Spiza virinensis
Eoscelis sphaemariatus
Chla cata
Phylamydoides elongchir
Eum tanum
Scylidura tatulater
Pana pardwickleylosus

And another example of order 3 trained on mammalian genera:

Pomys
Hericrotheirotis
Mineociurus
Braccotocomyscius
Asteaseutes
Phoecopomyomus
Equattinus
Martia
Peomyscophintodorhicomys
Stenobudellorex

Licensing: Do what you want with it.
