Welcome to TwitterML

Project to analyse text streams (tweets or documents) using big data and machine learning. It uses Apache Spark to build textual metrics, then passes the text through various classification models (via SciKit-Learn) to evaluate its sentiment.

Free software: GNU General Public License v3
Documentation: https://twitter-ml.readthedocs.io

Features

Classifier Builder - standalone tool to configure classifiers and train them using pre-classified samples
Text Classify - a standalone program for classifying the sentiment of text using NLTK and SciKit-Learn classifiers
Document Scanner - a program for classifying text documents on the Spark platform
Twitter-Kafka Publisher - reads tweets from Twitter and pumps them into a Kafka server (where they can be consumed by our Twitter Consumer programs).
Twitter Analyser - reads tweets from Kafka and performs analysis of the text using the Spark platform.
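
The train-then-classify flow behind the Classifier Builder and Text Classify tools can be sketched as follows. This is a minimal illustration only, not the project's code: it swaps the NLTK and SciKit-Learn classifiers for a small hand-written Naive Bayes model using only the Python standard library, and the names `tokenize`, `train`, `classify` and `SAMPLES` are hypothetical, as is the sample data.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and split it on whitespace."""
    return text.lower().split()


def train(samples):
    """Count words per label from pre-classified (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in samples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the most likely label under Naive Bayes with add-one smoothing."""
    vocab = {word for counts in word_counts.values() for word in counts}
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Start from the log prior, then add one log likelihood per word.
        score = math.log(label_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in tokenize(text):
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical pre-classified samples, invented for this sketch.
SAMPLES = [
    ("great phone love it", "pos"),
    ("what a wonderful day", "pos"),
    ("terrible service never again", "neg"),
    ("awful battery hate it", "neg"),
]

model = train(SAMPLES)
print(classify("love this wonderful phone", *model))
print(classify("hate this terrible battery", *model))
```

With real data the training set would be far larger, and a library classifier would normally replace the hand-written scoring loop.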
