Spark and Python for Big Data with PySpark

About Course

Learn Big Data with Spark and Python (Free Course)

This free course teaches Spark, one of the most widely used Big Data technologies. You’ll learn how to use Spark with Python, one of the most popular programming languages. Spark is a powerful tool for analyzing huge datasets and is used by organizations such as Google, Facebook, Netflix, Airbnb, Amazon, NASA, and more.

This course is designed to bring you up to speed on Spark, which can run some in-memory workloads up to 100x faster than Hadoop MapReduce. You’ll start with a crash course in Python, then dive into Spark DataFrames using the Spark 2.0 syntax. You’ll also learn how to use the MLlib machine learning library through the DataFrame API.

This course includes exercises and mock consulting projects that will put you in a real-world situation where you’ll need to use your new skills to solve real problems. You’ll also learn about the latest Spark technologies, like Spark SQL and Spark Streaming, and advanced models like Gradient Boosted Trees. After completing this course, you will feel confident adding Spark and PySpark to your resume.

**This course is completely free and is available on Udemy.**

**Keywords:** Spark, Python, Big Data, Apache Spark, DataFrames, MLlib, Machine Learning, Spark SQL, Spark Streaming, Gradient Boosted Trees, Data Analysis, Data Science, Programming, Software Development, Analytics, Data Engineer

What Will You Learn?

  • Use Python and Spark together to analyze Big Data
  • Learn how to use the new Spark 2.0 DataFrame Syntax
  • Work on Consulting Projects that mimic real world situations!
  • Classify Customer Churn with Logistic Regression
  • Use Spark with Random Forests for Classification
  • Learn how to use Spark's Gradient Boosted Trees
  • Use Spark's MLlib to create Powerful Machine Learning Models
  • Learn about the Databricks Platform!
  • Get set up on Amazon Web Services EC2 for Big Data Analysis
  • Learn how to use AWS Elastic MapReduce Service!
  • Learn how to leverage the power of Linux with a Spark Environment!
  • Create a Spam filter using Spark and Natural Language Processing!
  • Use Spark Streaming to Analyze Tweets in Real Time!

Course Content

Introduction to Course

  • A Message from the Professor
  • Introduction
    03:09
  • Course Overview
    07:55
  • What is Spark? Why Python?
    18:57
  • Course Material Download Link

Setting up Python with Spark

Local VirtualBox Set-up

AWS EC2 PySpark Set-up

Databricks Setup

AWS EMR Cluster Setup

Python Crash Course

Spark DataFrame Basics

Spark DataFrame Project Exercise

Introduction to Machine Learning with MLlib

Linear Regression

Logistic Regression

Decision Trees and Random Forests

K-means Clustering

Collaborative Filtering for Recommender Systems

Natural Language Processing

Spark Streaming with Python

Earn a certificate

Add this certificate to your resume to demonstrate your skills & increase your chances of getting noticed.

Student Ratings & Reviews

No Review Yet