aws-emr-clusters

Here are 40 public repositories matching this topic...

Daily incremental-load ETL pipeline for an e-commerce company using AWS Lambda and an AWS EMR cluster, deployed using Apache Airflow in a Docker container.

  • Updated Mar 17, 2023
  • Python

Credit defaults cause large losses for banks and other credit lenders, and the success of the banking industry depends on the ability to understand risk. This project uses big data technologies such as MapReduce and HDFS, along with PySpark and AWS, to analyze credit history and predict defaults.

  • Updated May 5, 2021
  • Jupyter Notebook

A cloud-based Reddit stock sentiment analyzer that computes the overall sentiment for each stock from a configurable selection of stock subreddits. The architecture uses AWS MSK (Kafka), AWS EMR (PySpark), and AWS Lambda (Python 3) for scalability, and the OpenAI API for sentiment analysis through prompt engineering.

  • Updated Jan 30, 2024
  • Python
