Using a machine learning model, classify Georgian names to their corresponding genders.

A1K28/name-to-gender-classifier

Georgian name-to-gender classifier with 94%+ accuracy

The static webpage is currently down because its Firebase support has ended.

~~See the static webapp on my GitHub Pages (use Wi-Fi: loading the page downloads the ~14 MB model).~~

See the Python code on my Google Colab here.

Demo

Demonstration of the project:

The Dataset (not public)

The model is trained on all Georgian names that occur at least 5 times (count >= 5). In total, ~14k names are used, with an approximately 45:55 female-to-male distribution. The dataset was legally acquired from the Georgian government. I am not making it public, to avoid getting into any trouble.

The Training

We use a model with two stacked LSTM layers, with MAXLEN LSTM cells per layer. Each cell accepts a vector of length VOCABLEN. In short, each name is one-hot encoded character by character, so that each name is represented by a matrix of shape (MAXLEN, VOCABLEN).

MAXLEN is a hyperparameter, while VOCABLEN is derived after reading the input data: it depends on the char_idx dictionary, which maps each character present in the data to an integer, e.g. 'a': 0, and so on.
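
To make the (MAXLEN, VOCABLEN) representation concrete, here is a minimal sketch of the encoding in plain Python. It is not the project's actual code: the helper names `build_char_idx` and `one_hot_encode`, the value `MAXLEN = 12`, the romanized placeholder names, and the handling of long names and unseen characters are all assumptions made for illustration.

```python
MAXLEN = 12  # hyperparameter; 12 is an assumed value, not the project's setting


def build_char_idx(names):
    """Map every character that occurs in `names` to an integer index."""
    chars = sorted(set("".join(names)))
    return {ch: i for i, ch in enumerate(chars)}


def one_hot_encode(name, char_idx, maxlen=MAXLEN):
    """Return a maxlen x VOCABLEN matrix (a list of rows) for one name.

    Assumptions: names longer than maxlen are truncated, shorter names are
    padded with all-zero rows, and characters missing from char_idx are
    left as all-zero rows.
    """
    vocablen = len(char_idx)
    matrix = [[0] * vocablen for _ in range(maxlen)]
    for pos, ch in enumerate(name[:maxlen]):
        idx = char_idx.get(ch)
        if idx is not None:
            matrix[pos][idx] = 1  # one row per character position
    return matrix


names = ["ana", "nino", "gio"]  # placeholder names, not from the dataset
char_idx = build_char_idx(names)
VOCABLEN = len(char_idx)  # derived from the data, not chosen by hand
encoded = one_hot_encode("nino", char_idx)
print(len(encoded), len(encoded[0]))  # MAXLEN rows, VOCABLEN columns
```

Each row of `encoded` is the vector that one LSTM cell would receive, so a batch of names stacks into an array of shape (batch, MAXLEN, VOCABLEN).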

Moreover, we reshuffle the train-test data for N iterations of M epochs each, in an attempt to reduce overfitting. I could not find much information on this approach when researching it; it was simply a choice I made because I thought it would help the process.
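
The reshuffling schedule can be sketched roughly as follows. This is a sketch under stated assumptions rather than the code from the Colab notebook: `reshuffled_training`, the `train_fn` callback, the `test_fraction` of 0.2, and the fixed seed are hypothetical, and the callback stands in for whatever would actually train the LSTM for M epochs.

```python
import random


def reshuffled_training(samples, train_fn, n_iterations, m_epochs,
                        test_fraction=0.2, seed=0):
    """Re-split `samples` into train/test sets before each of N rounds.

    `train_fn(train, test, epochs)` is a hypothetical callback that would
    continue training the model for `epochs` epochs and return a metric.
    """
    rng = random.Random(seed)
    samples = list(samples)
    n_test = int(len(samples) * test_fraction)
    history = []
    for _ in range(n_iterations):
        rng.shuffle(samples)  # new random order, hence a new split
        test, train = samples[:n_test], samples[n_test:]
        history.append(train_fn(train, test, m_epochs))
    return history


def dummy_train_fn(train, test, epochs):
    # Stand-in for real training: just report the split sizes and epochs.
    return (len(train), len(test), epochs)


history = reshuffled_training(range(10), dummy_train_fn,
                              n_iterations=3, m_epochs=2)
print(history)  # [(8, 2, 2), (8, 2, 2), (8, 2, 2)]
```

One design consequence worth keeping in mind: because the split changes between rounds, a name held out in one round can be trained on in the next, so test accuracy measured this way may be optimistic compared with a single fixed hold-out set.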

Future Improvements

  1. Use the count variable for each name to feed the LSTM cells more info about each name.
  2. Reduce overfitting further (this is not so easy to fix without trial and error).

Credits for the Model

The model architecture was constructed by prdeepakbabu; see LSTM_RNN_architecture.jpg.
