From the course: Azure Spark Databricks Essential Training
Explore optimization control planes
- [Instructor] As I've previously mentioned, when sizing on Databricks there are a number of common control planes. Those include optimizations around the data, optimizations around the job and the activities associated with the job, and optimizations around the cluster. So let's take a look at these in the context of our example. In the world of data, the most common optimizations that I see in working with customers on Spark start with compressing the input data. A number of compression formats are supported. We'll be working with a compression format called bzip2 (BZ2). The next is partitioning the data: splitting input data files into smaller files so they can be distributed more quickly and easily. I've been using this strategy in combination with the next one, which is converting to a format that is more optimized for the type of compute that you're performing. In my particular use case, we started with basically CSV or text files. And we then moved to…
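The first two data optimizations described above, compressing the input and splitting it into smaller files, can be sketched outside of Spark with nothing but the Python standard library. This is a minimal illustration under stated assumptions, not the instructor's exercise code: the function names `split_and_compress` and `read_parts`, the `part-00000.csv.bz2` naming pattern, and the sample rows are all hypothetical.

```python
import bz2
import csv
import tempfile
from pathlib import Path


def split_and_compress(rows, header, out_dir, rows_per_file):
    """Write rows into several bzip2-compressed CSV part files.

    Hypothetical helper: each part repeats the header so that it can be
    read on its own, which is what lets the parts be processed in parallel.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for start in range(0, len(rows), rows_per_file):
        part = out_dir / f"part-{len(paths):05d}.csv.bz2"
        # bz2.open in text mode compresses the data as it is written.
        with bz2.open(part, "wt", newline="", encoding="utf-8") as fh:
            writer = csv.writer(fh)
            writer.writerow(header)
            writer.writerows(rows[start:start + rows_per_file])
        paths.append(part)
    return paths


def read_parts(paths):
    """Read the data rows back from every part, skipping each header."""
    rows = []
    for path in paths:
        with bz2.open(path, "rt", newline="", encoding="utf-8") as fh:
            reader = csv.reader(fh)
            next(reader)  # skip the per-part header row
            rows.extend(reader)
    return rows


# Illustrative data only: ten small rows split into parts of four rows each.
header = ["id", "value"]
rows = [[str(i), f"v{i}"] for i in range(10)]
tmp_dir = tempfile.mkdtemp()
parts = split_and_compress(rows, header, tmp_dir, rows_per_file=4)
restored = read_parts(parts)
```

Choosing `rows_per_file` is the sizing decision here: smaller parts spread across more workers, at the cost of more files to track.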