From the course: Azure Spark Databricks Essential Training

Explore optimization control planes

- [Instructor] As I've previously mentioned, when sizing on Databricks there are a number of common control planes. Those include optimizations around the data, optimizations around the job and the activities associated with the job, and optimizations around the cluster. So let's take a look at these in the context of our example. In the world of data, the most common optimizations that I see in working with customers on Spark are the following. First, compressing the input data. A number of compression formats are supported. We'll be working with a compression format called BZ2 (bzip2). Second, partitioning the data: splitting input data files into smaller files so they can be distributed more quickly and easily. I've been using this strategy in combination with the next one, which is converting to a format that is more optimized for the type of compute that you're performing. In my particular use case, we started with basically CSV or text files. And we then moved to…
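
The first two data optimizations mentioned above, compressing the input and splitting it into smaller files, can be sketched outside of Spark with nothing but the Python standard library. This is only an illustration, not the instructor's code: it assumes the compression format referred to is bzip2, and the helper names (`split_and_compress`, `read_part`), the `part-00000.csv.bz2` naming scheme, and the sample rows are all hypothetical.

```python
import bz2
import csv
import tempfile
from pathlib import Path


def split_and_compress(rows, header, out_dir, rows_per_file):
    """Write rows into several smaller bzip2-compressed CSV files.

    Hypothetical helper: each output file repeats the header so it can be
    read on its own.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for start in range(0, len(rows), rows_per_file):
        part = out_dir / f"part-{len(paths):05d}.csv.bz2"
        # bz2.open in text mode compresses the data as it is written.
        with bz2.open(part, "wt", newline="", encoding="utf-8") as fh:
            writer = csv.writer(fh)
            writer.writerow(header)
            writer.writerows(rows[start:start + rows_per_file])
        paths.append(part)
    return paths


def read_part(path):
    """Read one compressed part back as a list of rows (header first)."""
    with bz2.open(path, "rt", newline="", encoding="utf-8") as fh:
        return list(csv.reader(fh))


# Made-up sample data: ten rows split into files of at most four rows.
header = ["id", "value"]
rows = [[str(i), str(i * i)] for i in range(10)]
parts = split_and_compress(rows, header, tempfile.mkdtemp(), rows_per_file=4)
print([p.name for p in parts])
```

Files written this way keep a plain `.csv.bz2` layout, which Spark's CSV reader can generally load directly. The right number of rows per file depends on the cluster and the data, so the value used here is arbitrary.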
