MusicRJ

A Machine Learning-Audio Signal Processing Project (Ongoing)

To see demo video, click here

Project Details

This is a machine learning and audio signal processing project in which a real-time audio signal is classified as speech or music using a deep neural network (DNN) and a convolutional neural network (CNN). The long-term goal is to create an AI personal assistant that listens to audio streams and summarizes their content for the end user.

Dataset

The project uses the GTZAN music/speech collection dataset.

All the wav audio files should be extracted to the Data/Files folder.
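As a quick sanity check after extraction, a minimal sketch like the one below lists the WAV files and reads their headers with the standard library only. The Data/Files path comes from this README; the function name summarize_wavs is hypothetical and not part of the repository, and the sketch assumes the files are uncompressed PCM, which is what the wave module can read.

```python
import wave
from pathlib import Path


def summarize_wavs(folder="Data/Files"):
    """Return (name, sample_rate, channels, duration_seconds) for each .wav file.

    Hypothetical helper for illustration; assumes uncompressed PCM files.
    """
    rows = []
    for path in sorted(Path(folder).glob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            rate = wav.getframerate()
            duration = wav.getnframes() / rate
            rows.append((path.name, rate, wav.getnchannels(), duration))
    return rows
```

Calling summarize_wavs() from the repository root and printing the rows should show whether every clip shares the same sample rate and length, which matters if the models expect fixed-size inputs.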

Python Version

Python 3.9.12

Setting up virtual environment

Installing Virtual Environment

python -m pip install --user virtualenv

Creating New Virtual Environment

python -m venv envname

Activating Virtual Environment

source envname/bin/activate

Upgrading pip

python -m pip install --upgrade pip

Installing Packages

python -m pip install -r requirements.txt
python -m pip install PyAudio

How to run

# Data preprocessing
python main.py -s p

# Model training
python main.py -s t

# Real-time demonstration
python main.py -s r
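The three commands above differ only in the value passed to -s. A minimal sketch of how such a single-letter stage flag could be parsed with argparse is shown below. This is an assumption about the general pattern, not the actual contents of main.py; the names STAGES and parse_stage are hypothetical.

```python
import argparse

# Stage codes taken from the commands above; descriptions are paraphrased.
STAGES = {
    "p": "data preprocessing",
    "t": "model training",
    "r": "real-time demonstration",
}


def parse_stage(argv=None):
    """Parse the -s flag and return one of the stage codes 'p', 't' or 'r'."""
    parser = argparse.ArgumentParser(description="Select a pipeline stage")
    parser.add_argument(
        "-s",
        dest="stage",
        choices=sorted(STAGES),
        required=True,
        help="p = preprocessing, t = training, r = real-time demo",
    )
    return parser.parse_args(argv).stage
```

With this pattern, parse_stage(["-s", "t"]) returns "t", and an unknown code such as "x" makes argparse exit with a usage message instead of silently running the wrong stage.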

Model 1 (Simple DNN) Architecture

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense (Dense)               (None, 32)                8224                                                              
 dense_1 (Dense)             (None, 64)                2112                                                             
 dense_2 (Dense)             (None, 128)               8320                                                                  
 dense_3 (Dense)             (None, 256)               33024                                                                 
 dense_4 (Dense)             (None, 512)               131584                                                                 
 dense_5 (Dense)             (None, 256)               131328                                                                
 dense_6 (Dense)             (None, 128)               32896                                                                 
 dropout (Dropout)           (None, 128)               0                                                                    
 dense_7 (Dense)             (None, 64)                8256                                                             
 dense_8 (Dense)             (None, 2)                 130                                                                    
=================================================================
Total params: 355,874
Trainable params: 355,874
Non-trainable params: 0
_________________________________________________________________
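The parameter counts in the summary can be checked by hand: a dense layer with n_in inputs and n_out units has n_in * n_out weights plus n_out biases. The sketch below repeats that arithmetic for the layer widths listed above. The input dimension of 256 is not stated in the summary; it is inferred here from the first layer's count (8224 = 256 * 32 + 32).

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


# Layer widths read from the summary above; the dropout layer adds no parameters.
WIDTHS = [32, 64, 128, 256, 512, 256, 128, 64, 2]
INPUT_DIM = 256  # inferred from 8224 = 256 * 32 + 32, not stated in the summary


def count_params(input_dim, widths):
    """Return the per-layer parameter counts for a stack of dense layers."""
    counts = []
    n_in = input_dim
    for n_out in widths:
        counts.append(dense_params(n_in, n_out))
        n_in = n_out
    return counts


counts = count_params(INPUT_DIM, WIDTHS)
total = sum(counts)
print(counts, total)
```

The per-layer values and their sum should match the Param # column and the total of 355,874 reported in the summary.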

Model 1 Train and validation loss graph

Model 2 (CNN) Architecture

Model: "sequential"
_________________________________________________________________
 Layer (type)                    Output Shape              Param #
=================================================================
 conv2d (Conv2D)                 (None, 101, 1290, 32)     320
 max_pooling2d (MaxPooling2D)    (None, 50, 645, 32)       0
 conv2d_1 (Conv2D)               (None, 48, 643, 64)       18496
 max_pooling2d_1 (MaxPooling2D)  (None, 24, 321, 64)       0
 conv2d_2 (Conv2D)               (None, 22, 319, 64)       36928
 flatten (Flatten)               (None, 449152)            0
 dense (Dense)                   (None, 64)                28745792
 dense_1 (Dense)                 (None, 2)                 130
=================================================================
Total params: 28,801,666
Trainable params: 28,801,666
Non-trainable params: 0
_________________________________________________________________
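The shapes and parameter counts in the CNN summary are consistent with 3x3 convolutions without padding, 2x2 max pooling, and a single-channel input of 103 x 1292. None of these settings are stated in the summary; they are inferred from the numbers (for example, 320 = 3 * 3 * 1 * 32 + 32). Under those assumptions, the arithmetic can be sketched as follows.

```python
def conv_out(size, kernel=3):
    """Output length of an unpadded, stride-1 convolution along one axis."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Output length of non-overlapping max pooling along one axis."""
    return size // pool


def conv_params(kernel, c_in, c_out):
    """Kernel weights plus one bias per output channel."""
    return kernel * kernel * c_in * c_out + c_out


# Assumed input: 103 x 1292 with 1 channel (inferred, not stated in the summary).
h, w, c = 103, 1292, 1
shapes, params = [], []
for kind, filters in [("conv", 32), ("pool", None), ("conv", 64),
                      ("pool", None), ("conv", 64)]:
    if kind == "conv":
        params.append(conv_params(3, c, filters))
        h, w, c = conv_out(h), conv_out(w), filters
    else:
        h, w = pool_out(h), pool_out(w)
    shapes.append((h, w, c))

flat = h * w * c                  # size of the flattened feature vector
params.append(flat * 64 + 64)     # dense layer with 64 units
params.append(64 * 2 + 2)         # output layer with 2 units
total = sum(params)
print(shapes, flat, total)
```

The arithmetic also shows where the model's size comes from: the first dense layer after the flatten step holds 28,745,792 of the 28,801,666 parameters, about 99.8% of the total.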

Testing

python -m pytest --verbose

Results

Model	Accuracy	Precision	Recall	F1-score
DNN Model	0.9812	0.9980	0.9647	0.9810
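
The F1-score is the harmonic mean of precision and recall, so the last column can be cross-checked against the two before it. The short sketch below does that with the values from the table; the function name f1_score is chosen for illustration only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall of the DNN model, as reported in the table above.
f1 = f1_score(0.9980, 0.9647)
print(round(f1, 4))
```

The computed value agrees with the reported 0.9810 to within the rounding of the table entries.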

cksajil/MusicRJ