Playing with Spark RDDs

Learning Objectives: Gain insight into Spark RDDs and the RDD manipulations used to implement business logic (transformations, actions, and functions performed on RDDs).
Topics:
  • Challenges in Existing Computing Methods
  • Probable Solution & How RDD Solves the Problem
  • What Is an RDD: Its Operations, Transformations & Actions
  • Data Loading and Saving Through RDDs
  • Key-Value Pair RDDs
  • Other Pair RDDs, Two Pair RDDs
  • RDD Lineage
  • RDD Persistence
  • WordCount Program Using RDD Concepts
  • RDD Partitioning & How It Helps Achieve Parallelization
  • Passing Functions to Spark
 
Hands-on:
  • Loading data in RDDs
  • Saving data through RDDs
  • RDD Transformations
  • RDD Actions and Functions
  • RDD Partitions
  • WordCount through RDDs