# emr

Here are 339 public repositories matching this topic...

ETL logic written in Spark transforms a data set stored in S3, and queries on the transformed data are run using AWS Redshift. The data sets are in JSON format. All raw JSON data must first be uploaded to an S3 source bucket. Using EMR, a Spark job is executed, which would fetch the source data from S3 source bu…

  • Updated Aug 23, 2022
  • Python
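
The description above mentions running a Spark job on EMR against data in an S3 source bucket. As a rough sketch only (not taken from the repository), an EMR step that launches `spark-submit` through `command-runner.jar` could be described with a plain dictionary like the one below. The function name `build_spark_step`, the step name, and all S3 URIs are hypothetical placeholders, and passing the source and output locations as script arguments is an assumption about how such a job might be parameterized.

```python
import json


def build_spark_step(script_uri, source_uri, output_uri, name="spark-etl"):
    """Build an EMR step definition that runs a PySpark script via spark-submit.

    All arguments are hypothetical S3 URIs chosen by the caller; nothing here
    contacts AWS. The returned dict follows the shape EMR expects for a step.
    """
    return {
        "Name": name,
        # Keep the cluster running if the step fails (an assumption; other
        # valid values include "TERMINATE_CLUSTER" and "CANCEL_AND_WAIT").
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            # command-runner.jar lets a step run a command such as spark-submit.
            "Jar": "command-runner.jar",
            "Args": [
                "spark-submit",
                "--deploy-mode", "cluster",
                script_uri,   # the PySpark ETL script stored in S3
                source_uri,   # assumed first script argument: raw JSON input
                output_uri,   # assumed second script argument: transformed output
            ],
        },
    }


if __name__ == "__main__":
    # Placeholder bucket names, for illustration only.
    step = build_spark_step(
        "s3://example-code-bucket/etl.py",
        "s3://example-source-bucket/raw/",
        "s3://example-target-bucket/transformed/",
    )
    print(json.dumps(step, indent=2))
```

A dictionary of this shape can be placed in the `Steps` list passed to the boto3 EMR client's `add_job_flow_steps` call, together with the `JobFlowId` of a running cluster.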
