Playing with Spark RDDs
Learning Objectives: Gain insight into Spark RDDs and the RDD operations used to implement business logic (transformations, actions, and functions applied to RDDs).
Topics:
- Challenges in Existing Computing Methods
- Probable Solution & How RDD Solves the Problem
- What Is an RDD, Its Operations, Transformations & Actions
- Data Loading and Saving Through RDDs
- Key-Value Pair RDDs
- Other Pair RDDs, Two Pair RDDs
- RDD Lineage
- RDD Persistence
- WordCount Program Using RDD Concepts
- RDD Partitioning & How It Helps Achieve Parallelization
- Passing Functions to Spark
Hands-on:
- Loading Data into RDDs
- Saving Data Through RDDs
- RDD Transformations
- RDD Actions and Functions
- RDD Partitions
- WordCount through RDDs