From the course: Azure Spark Databricks Essential Training
Understand Databricks Delta
- [Instructor] As we look at running complex workloads, let me remind you that there are two primary methods of inputting data: batch or stream. We've mostly been working with batch; we did have one earlier video where we showed stream, but what I'm seeing as an architect in the real world is a combination of both, and that leads into where we're going next. So, batch, as a reminder, is a one-time run. You can partition the input data, you can use other optimizations such as compression, and you can partition the output data. A stream is continual. It can also be partitioned and compressed, but traditionally you have to set up those optimizations yourself, which requires more work to build the pipeline. Now, as we start thinking about taking our workflows and putting them into pipelines, we are of course working with Azure Databricks. So let's consider the Azure data ecosystem. The ingest capability for data can be handled by services such as Azure Data…
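The batch-versus-stream contrast above can be sketched in PySpark. This is a minimal illustration, not code from the course: the `/mnt/events/...` paths and the `event_date` column are hypothetical placeholders, and running it requires a Spark runtime with the Delta format available (as on Databricks).

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("batch-vs-stream").getOrCreate()

# Batch: a one-time read of the input data.
batch_df = spark.read.parquet("/mnt/events/input")  # hypothetical path

# Batch output can be partitioned and compressed, as described above.
(batch_df.write
    .mode("overwrite")
    .partitionBy("event_date")           # hypothetical partition column
    .option("compression", "snappy")
    .parquet("/mnt/events/output"))

# Stream: a continual read of the same kind of data. File-based
# streaming sources need the schema supplied up front, which is part
# of the extra setup work streams traditionally require.
stream_df = (spark.readStream
    .schema(batch_df.schema)
    .parquet("/mnt/events/input"))

# Write the stream to a Delta table, checkpointing progress so the
# pipeline can recover where it left off.
query = (stream_df.writeStream
    .format("delta")
    .outputMode("append")
    .option("checkpointLocation", "/mnt/events/_checkpoints")
    .start("/mnt/events/delta_output"))
```

Note that the streaming half differs from the batch half only in `readStream`/`writeStream` and the checkpoint option, which is what makes combining both modes over the same storage practical.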
Contents
- Use Databricks jobs and role-based control (5m 37s)
- Use Databricks Runtime ML (2m 52s)
- Understand ML Pipelines API (4m 16s)
- Use ML Pipelines API (8m 39s)
- Use distributed ML training (9m 59s)
- Understand Databricks Delta (3m 41s)
- Use Databricks Delta (5m 10s)
- Use Azure Blob storage (2m 41s)
- Understand MLflow (7m 34s)