wangruinju/PySpark-Machine-Learning

PySpark-Machine-Learning

A collection of machine learning examples using PySpark

To clone and run the tutorials, install Anaconda Python along with PySpark and any other packages the notebooks require.
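Before opening the tutorials, it can help to confirm that the environment can actually import what they need. The snippet below is a minimal sketch using only the standard library. The helper name `missing_packages` and the package list are assumptions made for illustration; the repository does not publish a requirements list, so adjust the names to whatever the notebooks import.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names from `names` that cannot be found.

    Only top-level names (e.g. "pyspark", not "pyspark.ml") should be passed,
    since find_spec imports parent packages when given dotted names.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed package list for illustration; not taken from the repository.
required = ["pyspark", "numpy", "pandas"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

If anything is reported missing, install it into the active Anaconda environment before launching the notebooks.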

Basic Spark

To get through the basics quickly, I recommend an online course from Udemy and this great GitHub repo.

Natural Language Processing with PySpark

Watch as John Hogue walks through a practical example of a data pipeline that feeds textual data into PySpark and ML for tagging. Learn how to leverage existing Python libraries such as NLTK in Spark and how to use some of Spark’s newer features. A GitHub repo with source code and training and test data sets will be provided for attendees to explore and play with.
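As a rough illustration of what a text-tagging pipeline in PySpark can look like, here is a minimal sketch built on the `pyspark.ml` Pipeline API. It is not the code from the talk: it swaps in Spark's built-in `Tokenizer` instead of NLTK, and the function names, column names (`text`, `label`), toy sentences, and parameter values are all assumptions chosen for illustration.

```python
def build_tagging_pipeline(num_features=1 << 12):
    """Return an unfitted Pipeline: tokenize -> hashed TF -> IDF -> logistic regression.

    Call this only after a SparkSession exists, because pyspark.ml stages
    create JVM-side objects when they are constructed.
    """
    # Imported lazily so this module can be loaded without PySpark installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import HashingTF, IDF, Tokenizer

    tokenizer = Tokenizer(inputCol="text", outputCol="words")
    tf = HashingTF(inputCol="words", outputCol="tf", numFeatures=num_features)
    idf = IDF(inputCol="tf", outputCol="features")
    clf = LogisticRegression(featuresCol="features", labelCol="label", maxIter=10)
    return Pipeline(stages=[tokenizer, tf, idf, clf])


def run_demo():
    """Fit the pipeline on a toy data set and print predictions (requires PySpark)."""
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("tagging-sketch")
        .getOrCreate()
    )
    try:
        # Hypothetical training rows: 1.0 = "about Spark", 0.0 = anything else.
        train = spark.createDataFrame(
            [
                ("spark makes big data processing simple", 1.0),
                ("pyspark pipelines chain feature stages", 1.0),
                ("the weather is nice today", 0.0),
                ("i had pasta for lunch", 0.0),
            ],
            ["text", "label"],
        )
        model = build_tagging_pipeline().fit(train)

        test = spark.createDataFrame([("spark data pipeline",)], ["text"])
        model.transform(test).select("text", "prediction").show(truncate=False)
    finally:
        spark.stop()
```

Calling `run_demo()` in an environment with PySpark installed fits the model locally and shows the predicted tag for the test sentence. To use NLTK as the talk suggests, the tokenizing stage could be replaced with a UDF that wraps an NLTK tokenizer.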
