From the course: Azure Spark Databricks Essential Training


Understand Spark key components


- Now, as mentioned in the introduction to this course, if you're unfamiliar or new to working with Apache Spark, you may want to review some other courses in the library for foundational learning. If you have done some work with open-source Spark, you will quickly see how the integration really enhances the capability of Spark's key components, which are shown here. Just to review them: at the core of Spark is a user program. On the left side, you instantiate a SparkContext, and then you set up an RDD, a Resilient Distributed Dataset, which is the structure that holds the distributed data. Operations can then be performed on the RDDs in a distributed manner. Some of the examples shown here are reading a Cassandra table (Cassandra is a distributed wide-column database), mapping, filtering, keying by, reducing by key, caching, and so on. And you can see, once the data is loaded into the memory of the worker nodes, as RDDs or some higher level…
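The transformation chain named in the transcript (mapping, filtering, reducing by key) can be sketched in plain Python as a minimal, Spark-free analogy of the semantics. The word list and the word-count task are illustrative choices, not from the course; in real PySpark, the equivalent chain would use `SparkContext.parallelize` and the RDD methods `filter`, `map`, and `reduceByKey`, evaluated lazily and in parallel across worker nodes.

```python
from collections import defaultdict

# Plain-Python sketch of the RDD operations named in the transcript.
# In real PySpark this would be roughly:
#   sc.parallelize(words).filter(lambda w: len(w) > 3) \
#     .map(lambda w: (w, 1)).reduceByKey(lambda a, b: a + b).collect()
# Here we only mimic the semantics locally on a list.
words = ["spark", "rdd", "spark", "context", "rdd", "spark"]  # illustrative data

# filter: keep only elements matching a predicate
filtered = [w for w in words if len(w) > 3]

# map: transform each element into a (key, value) pair
keyed = [(w, 1) for w in filtered]

# reduceByKey: combine all values that share the same key
counts = defaultdict(int)
for key, value in keyed:
    counts[key] += value

print(dict(counts))  # {'spark': 3, 'context': 1}
```

Unlike this eager local version, Spark records the transformations as a lineage graph and only computes them when an action such as `collect` is called, which is also what makes caching an RDD in worker memory worthwhile.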
