From the course: Azure Spark Databricks Essential Training

Optimize data pipelines

- [Lynn] Have you been working with data that's growing in volume and complexity, and wondering how you're going to compute against it? We'll be taking a look at managed Apache Spark clusters on Azure Databricks. We'll look at cluster setup, different types of notebooks, and a number of data workflows. These notebooks will cover common data processing scenarios such as Spark SQL, visualization, machine learning with Spark ML, and third-party libraries such as TensorFlow and scikit-learn. We'll also look at data pipelining and architectural patterns. I'm Lynn Langit. We have lots to cover, so let's get started.
