Exploring ML for building a more robust and scalable version of Kindly #57

nathanfletcher · 2021-07-13T10:03:01Z

Looking into better ways to achieve what Kindly #41 does.
This may also result in solutions that are not vendor-locked.
Maybe @nathanbaleeta may have a few ideas.

lacabra · 2021-07-19T09:21:57Z

@nathanfletcher: can you document here some of the findings? Thanks! 🙏

nathanfletcher · 2021-07-21T22:31:21Z

I will start here with the basics from discussions with @nathanbaleeta.

A number of things I'll be looking into:

- Data sourcing. Leverage the Twitter API to get access to training data.
- Natural Language Processing/Understanding. Leverage the Natural Language Toolkit (NLTK) or spaCy (open-source, Python-based natural language processing libraries) for data preprocessing before fitting the cyberbullying model.
- Machine Learning Algorithms. Build on scikit-learn (an open-source, Python-based ML library), which ships with implementations of several ML algorithms out of the box, to build and evaluate the cyberbullying model. Explore shallow learning as a proof of concept while we collect enough data, before embarking on deep learning methods to achieve state-of-the-art results in the long run.
- AI/ML technology stack: Python, scikit-learn, pandas, NLTK, spaCy, TextBlob, NumPy, Keras, TensorFlow, Jupyter notebooks, Colab, TensorBoard & FastAPI.
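
To make the preprocessing item above a bit more concrete, here is a minimal sketch of tweet normalization using only the Python standard library. It is a stand-in for illustration, not the approach settled on in this issue: the regular expressions and the `clean_tweet` helper are hypothetical, and in practice the tokenization would more likely come from NLTK or spaCy as listed above.

```python
import re
from typing import List

# Hypothetical patterns for illustration only; a real pipeline would
# probably rely on NLTK or spaCy tokenizers instead.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def clean_tweet(text: str) -> List[str]:
    """Lowercase a tweet, drop URLs, @mentions and punctuation, return tokens."""
    text = text.lower()
    text = URL_RE.sub(" ", text)       # remove links
    text = MENTION_RE.sub(" ", text)   # remove @user handles
    # Remove everything except letters and whitespace; this drops the "#"
    # symbol but keeps the hashtag word itself.
    text = NON_ALPHA_RE.sub(" ", text)
    return text.split()


if __name__ == "__main__":
    print(clean_tweet("@user You are SO dumb!!! #loser https://t.co/abc123"))
```

The resulting token lists could then be fed to a vectorizer before fitting a model. Whether mentions and hashtags should be dropped or kept as features is a design choice that would need checking against real data.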

nathanbaleeta · 2021-07-26T21:03:47Z

PROBLEM DEFINITION
The use of Twitter and other social networking sites (SNS) such as Facebook to communicate with one another and the world has led to increased instances of cyberbullying, especially among teenagers. (Reference)

Twitter is an American microblogging and social networking service on which users post and interact with messages known as "tweets". Registered users can post, like, and retweet tweets, but unregistered users can only read them. (Wikipedia)

Cyberbullying is the use of information and communication technology to harass and harm in a deliberate, repetitive, and hostile manner.

Types of cyberbullying include bullying someone through social media, harassment, sexting, cyberstalking, deception, impersonation, and sending nasty messages via chat rooms and instant messenger. Here are more examples of cyberbullying.

According to Twitter demographics published by www.statista.com, as of April 2021 users under 24 years old made up almost 24 percent of Twitter users worldwide, as shown in the graphic below:

SOLUTION
To solve this problem, we will follow the typical machine learning pipeline. We will first import the required libraries and the dataset. We will then do exploratory data analysis to see if we can find any trends in the dataset. Next, we will perform text preprocessing to convert textual data to numeric data that can be used by a machine learning algorithm. Finally, we will use machine learning algorithms to train and test our sentiment analysis models.
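
The pipeline described above could be sketched roughly as follows with pandas and scikit-learn. This is a toy illustration under stated assumptions, not the model built for Kindly: the eight example sentences and their labels are hand-written placeholders (not real tweets), and TF-IDF with logistic regression is just one possible shallow-learning choice.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# 1. Import the dataset. These rows are made-up placeholders, NOT real data;
#    label 1 = bullying, label 0 = not bullying.
data = pd.DataFrame({
    "text": [
        "you are so stupid and ugly",
        "nobody likes you, just quit",
        "what a loser, everyone hates you",
        "shut up you worthless idiot",
        "great game last night, well played",
        "thanks for helping me with homework",
        "happy birthday, hope you have fun",
        "see you at practice tomorrow",
    ],
    "label": [1, 1, 1, 1, 0, 0, 0, 0],
})

# 2. Exploratory data analysis: check the class balance.
print(data["label"].value_counts())

# 3. Hold out part of the data for testing, keeping the class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    data["text"], data["label"],
    test_size=0.25, stratify=data["label"], random_state=42,
)

# 4. Text preprocessing (text -> numeric TF-IDF features) and a shallow
#    classifier, chained so the same transform is applied at predict time.
model = Pipeline([
    ("tfidf", TfidfVectorizer(lowercase=True)),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)

# 5. Evaluate on the held-out examples.
predictions = model.predict(X_test)
print("accuracy:", accuracy_score(y_test, predictions))
```

With so few examples the accuracy figure is meaningless; the point is only the shape of the workflow. Wrapping the vectorizer and classifier in a single `Pipeline` keeps the vocabulary fitted on training data only, which avoids leaking test data into preprocessing.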

nathanfletcher · 2021-08-02T12:47:21Z

@lacabra My files and practical learnings are in this repository: https://github.com/nathanfletcher/ml_text_classification

amreenp7 · 2021-10-18T16:11:29Z

@nathanfletcher to include this in documentation before closing it.

nathanfletcher assigned nathanfletcher and nathanbaleeta Jul 13, 2021

nathanbaleeta changed the title ~~TensoFlow and other ML methods for Kindly~~ Exploring ML for building a more robust and more scalable version of Kindly Jul 21, 2021

nathanbaleeta changed the title ~~Exploring ML for building a more robust and more scalable version of Kindly~~ Exploring ML for building a more robust and scalable version of Kindly Jul 21, 2021

nathanfletcher mentioned this issue Sep 1, 2021

Building of the Kindly ML Predictive Model Structure unicef/kindly#2

Open
