From the course: Data Engineering Foundations
Loading data into a DB
- [Tutor] So far we have extracted and transformed the data, and it is time to load this transformed data back into the database in a new table. Before that, you're first going to wrap the extraction and transformation code snippets into separate functions, so that automation becomes easy and these functions are reusable as well. So here you can see I have first created the Spark session as we did earlier, and then wrapped the movie and user data extraction code snippets into functions. I've created this first function to extract the movies table data, which I've named extract_movies_to_df. Here I'm doing the same thing as before; all I've done is define a function and return the movies data frame. Similarly for the users table, I have named the function extract_users_to_df, and I am returning the user_dataframe that I have just read using the Spark session.…
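The wrapping step described here can be sketched roughly as follows. This is a minimal illustration, not the instructor's actual code: the JDBC URL, database name, credentials, and the `load_df_to_db` helper are hypothetical placeholders, and the table names `movies` and `users` are assumed from the narration. Passing the Spark session in as an argument (rather than relying on a global) is a choice made here so that the functions are easy to reuse and test.

```python
# Sketch only: every connection value below is a placeholder, not the course's setup.
JDBC_OPTIONS = {
    "url": "jdbc:postgresql://localhost:5432/etl_pipeline",  # hypothetical database
    "user": "<db_user>",
    "password": "<db_password>",
    "driver": "org.postgresql.Driver",
}


def extract_table_to_df(spark, table):
    """Read one PostgreSQL table into a Spark DataFrame over JDBC."""
    return (
        spark.read.format("jdbc")
        .options(dbtable=table, **JDBC_OPTIONS)
        .load()
    )


def extract_movies_to_df(spark):
    """Return the movies table as a DataFrame (table name assumed)."""
    return extract_table_to_df(spark, "movies")


def extract_users_to_df(spark):
    """Return the users table as a DataFrame (table name assumed)."""
    return extract_table_to_df(spark, "users")


def load_df_to_db(df, table, mode="overwrite"):
    """Write a DataFrame to a table in the same database (hypothetical helper)."""
    (
        df.write.format("jdbc")
        .options(dbtable=table, **JDBC_OPTIONS)
        .mode(mode)
        .save()
    )
```

With a real Spark session and the PostgreSQL JDBC driver on the classpath, the functions would be called as `extract_movies_to_df(spark)` and `extract_users_to_df(spark)`, and the transformed result would then be passed to `load_df_to_db` along with the name of the new table.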
Contents
- Sources of data extraction (4m 46s)
- Data extraction from a PostgreSQL database (4m 51s)
- Challenge: Data extraction (40s)
- Solution: Data extraction (51s)
- Transforming data (2m 3s)
- Challenge: Transforming data (42s)
- Solution: Transforming data (58s)
- Loading data into a DB (4m 11s)
- Challenge: Loading data (59s)
- Solution: Loading data (1m)
- Scheduling ETL pipeline using Airflow (9m 3s)