allaboutspark
Week 1 · Day 2 · 80-minute reading · 5 widgets · 15-question quiz

RDDs — The Foundation

What an RDD really is, the five properties, transformations, actions, lineage, and why fault tolerance is essentially free.

Locked

Pass Day 1 to unlock this.

Each day of the study path opens after you score 80% or higher on the previous day's quiz. It's not gatekeeping — later days build directly on the ones before, and the quiz is the cheapest way to find out whether the foundation is in place.

What you'll cover on Day 2

Once live, Day 2 runs roughly 80 minutes of reading paired with 5 interactive visualizations, followed by a 15-question self-check quiz. The reading is grounded in the official Apache Spark documentation — every claim cites the docs.

  • What RDD stands for and why immutability matters
  • The five internal properties of every RDD
  • Creating RDDs: parallelize, textFile, transformations
  • The reduceByKey vs groupByKey performance trap
  • Lineage — Spark's memory of how an RDD was made
  • Fault tolerance through recomputation
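
As a preview of the reduceByKey vs groupByKey item above, here is a minimal sketch in plain Python. It is not Spark code: it only models the one difference that matters for the trap, namely that reduceByKey combines values per key inside each partition before the shuffle, while groupByKey sends every pair across it. The partition contents and the helper names (`shuffled_records_group_by_key`, `shuffled_records_reduce_by_key`, `word_counts`) are hypothetical, chosen for illustration.

```python
from collections import defaultdict

def shuffled_records_group_by_key(partitions):
    """groupByKey-style model: every (key, value) pair crosses the shuffle."""
    return sum(len(part) for part in partitions)

def shuffled_records_reduce_by_key(partitions):
    """reduceByKey-style model: values are combined per key within each
    partition first, so only one record per distinct key per partition
    crosses the shuffle."""
    return sum(len({key for key, _ in part}) for part in partitions)

def word_counts(partitions):
    """Final per-key totals; both strategies arrive at the same answer."""
    totals = defaultdict(int)
    for part in partitions:
        for key, value in part:
            totals[key] += value
    return dict(totals)

# Hypothetical input: two partitions of (word, 1) pairs.
partitions = [
    [("spark", 1), ("rdd", 1), ("spark", 1), ("spark", 1)],
    [("rdd", 1), ("rdd", 1), ("spark", 1)],
]

grouped = shuffled_records_group_by_key(partitions)
reduced = shuffled_records_reduce_by_key(partitions)
print(f"groupByKey-style shuffle: {grouped} records")
print(f"reduceByKey-style shuffle: {reduced} records")
print(word_counts(partitions))
```

In this toy model the result is identical either way, but fewer records cross the shuffle boundary when values are combined first. On real data with many repeated keys per partition the gap grows accordingly.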

Why this day matters

By the end of Day 2 you'll be able to explain RDDs confidently: not just describe them, but reason about edge cases, predict performance, and read the Spark UI for the concepts they touch. That's the bar this study path aims for: not memorization, but the kind of working understanding that lets you debug real jobs.