Spark Developer In Real World
Let's Get Started
Thank you and Welcome (11:35)
Tools and Setup (8:30)
Introduction To Spark
Hadoop vs. Spark - Who Wins (15:30)
Challenges Spark Tries To Address (12:24)
How Spark Is Faster Than Hadoop (8:39)
RDD - Core Of Spark
The Need For RDD (11:29)
What Is RDD (12:30)
What An RDD Is Not (7:31)
Execution In Spark (Behind the scenes)
First Program In Spark (16:04)
What are Dependencies and Why They are Important (11:11)
Program to Execution (Part 1) (13:01)
Program to Execution (Part 2) (19:10)
Caching Data In Spark (15:04)
Fault Tolerance (7:34)
Shuffle in Spark
Need for Shuffle (10:45)
Hash Shuffle Manager - Part 1 (11:44)
Hash Shuffle Manager - Part 2 (14:07)
Sort Shuffle Manager (8:15)
Spark Transformations
reduceByKey vs groupByKey (9:34)
Cogroup, Join and Avoiding Shuffle - Part 1 (14:19)
Cogroup, Join and Avoiding Shuffle - Part 2 (8:23)
Resizing Partitions (7:46)
PageRanking with RDDs
PageRanking Algorithm (7:33)
PageRank Walk-through (6:15)
Implementing PageRank with RDDs (6:31)
Beyond RDDs
What's the Problem with RDDs (11:53)
DataFrame vs DataSet vs SQL (12:25)
Simple Selects (8:26)
Filtering DataFrames (2:24)
Aggregating DataFrames (5:19)
Joining DataFrames (8:20)
PageRanking with DataFrames (16:39)
Spark with Other Datasources & File Formats
Spark & Hive (8:26)
Spark & Hive with XML, Parquet & ORC (14:23)
Spark & RDBMS (8:49)
Spark & HBase (Part 1) (18:47)
Spark & HBase (Part 2) (9:03)
Spark Optimizations
Number of Tasks (14:33)
Join Algorithms (16:57)
Picking a Join Algorithm (9:09)
Join Hints (4:13)
Spark - Under the Hood
Inside the Catalyst Optimizer (12:05)
Catalyst Optimizer - Plan Walkthrough (6:27)
Project Tungsten - Better Memory Management (13:09)
Project Tungsten - CPU Cache Aware Optimizations (11:05)
Resource Management
Spark Architecture (7:59)
Memory Layout In Executor (8:12)
Resource Management - Standalone (12:09)
Resource Management - YARN (14:07)
Dynamic Resource Allocation (7:47)
Cluster Installation
Spark Installation (5:28)
Hadoop Cluster Setup (Part 1) (23:43)
Hadoop Cluster Setup (Part 2) (25:35)
Hadoop Cluster Setup (Part 3) (18:01)
An end to end project (Spark, Elasticsearch, Kibana, REST and Angular)
End to End Project Introduction (8:09)
Elasticsearch (A quick introduction) (8:18)
Hands-on with Elasticsearch (10:45)
Stack Overflow Dataset (8:58)
Spark ETL (12:53)
Visualizations with Kibana (8:44)
REST Service with Spring framework (19:29)
Building an Angular application (12:28)
Introduction to Kafka
Kafka - The Why and the What (8:43)
Key Concepts (12:32)
Experiments with Kafka (19:18)
Machine Learning
Introduction to Machine Learning (11:38)
Machine Learning Blueprint (5:49)
Feature Engineering (10:39)
Linear Regression (8:17)
World Happiness Project (13:58)
Decision Trees (9:55)
Random Forest (3:14)
Predicting 2016 US Elections (11:46)
Predicting Yelp Ratings (+ve or -ve) (15:55)
Streaming with Spark
Why Streaming and How Spark Does Streaming (11:51)
Core Concepts in Streaming (8:36)
Output Modes With Non Aggregate Queries (13:40)
Output Modes With Aggregate Queries (8:50)
Event Time, Window and Late Events (10:39)
Handling Late Events In Streaming (10:47)
Late Events and Append Mode (8:05)
Streaming Meetup with Spark (Part 1) (5:31)
Streaming Meetup with Spark (Part 2) (8:53)
A Short Chapter On Scala
Introduction to Scala (12:05)
First Program in Scala (not HelloWorld) (11:45)
Scala Functions (11:43)
Spark ETL