
A Big Data Hadoop and Spark project for absolute beginners

Hadoop, Spark, Python, Scala, Dataproc, AWS S3 Data Lake, Glue, Athena, and Machine Learning through a real-world use case

Rating: 4.3 (31 ratings). Created by FutureX Skill


A bank is launching a new credit card and wants to identify prospects it can target in its marketing campaign.

It has received prospect data from various internal and third-party sources. The data has several issues, such as missing or unknown values in certain fields. It needs to be cleansed before any kind of analysis can be done.

Because the data volume is huge, with billions of records, the bank has asked you to use Big Data technologies, Hadoop and Spark, to cleanse, transform, and analyze it.
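
To make the cleansing step concrete, here is a minimal sketch in plain Python of what "missing or unknown values" handling might look like. The field names (age, salary, gender), the sample records, and the list of markers treated as unknown are all assumptions made for illustration; they are not taken from the course dataset. At the scale described above, the same idea would be expressed with Spark DataFrame operations rather than a Python loop.

```python
from typing import Optional

# Assumed markers that stand for "no usable value" (illustrative only).
UNKNOWN_MARKERS = {"", "unknown", "n/a", "na", "null"}


def normalize(value: Optional[str]) -> Optional[str]:
    """Return None for missing or unknown values, else the trimmed value."""
    if value is None:
        return None
    trimmed = value.strip()
    return None if trimmed.lower() in UNKNOWN_MARKERS else trimmed


def cleanse(records, required=("age", "salary")):
    """Normalize every field and drop records that lack a required field."""
    cleaned = []
    for record in records:
        row = {key: normalize(val) for key, val in record.items()}
        if all(row.get(field) is not None for field in required):
            cleaned.append(row)
    return cleaned


# Hypothetical prospect records, not real data.
prospects = [
    {"age": "34", "salary": "52000", "gender": "F"},
    {"age": "unknown", "salary": "61000", "gender": "M"},
    {"age": "45", "salary": " ", "gender": "M"},
    {"age": "29", "salary": "48000", "gender": "N/A"},
]

cleaned = cleanse(prospects)
for row in cleaned:
    print(row)
```

Records with an unusable required field are dropped, while an unknown optional field (gender here) is kept as None so that a later step can decide how to treat it.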

What you will learn:

Big Data, Hadoop concepts

How to create a free Hadoop and Spark cluster using Google Dataproc

Hadoop hands-on - HDFS, Hive

Why there was a need for Spark

Python basics

PySpark RDD - hands-on

PySpark SQL, DataFrame - hands-on

Project work using PySpark and Hive

Scala basics

Spark Scala DataFrame

Project work using Spark Scala

Google Colab environment

Bonus project - Applying Spark transformations to data stored in AWS S3 using Glue and viewing the data using Athena

Bonus project - Build your first Machine Learning model using Python and scikit-learn to predict whether a customer will buy or not

Prerequisites:

Some basic programming skills

Some knowledge of SQL queries

Who this course is for:
Beginners who want to learn Big Data or experienced people who want to transition to a Big Data role
