From the course: Data Science on Google Cloud Platform: Designing Data Warehouses
Table design considerations
- Let's look over some best practices for designing and using data warehouses with BigQuery, starting with table design. It is important to spend enough time designing tables the right way, since the design impacts both database performance and GCP usage costs. The first thing to consider is denormalization. Denormalization makes queries run faster, because they avoid joins. But it also affects the performance of inserts and updates, because duplicate data is written into the fact tables. BigQuery is a data warehouse engine, which is primarily expected to be used as a write-once, read-many database. Hence, denormalization should be used as much as possible to avoid query performance issues. You can denormalize data to include dimension data in fact tables. You can also denormalize to include child records in parent tables. This is especially helpful if parent and child records are frequently queried together. The next element to consider during table design is partitioning…
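The two forms of denormalization described above can be sketched in plain Python, without any BigQuery client. This is only an illustration under assumed data: the tables `CUSTOMERS`, `ORDERS`, and `ORDER_ITEMS`, their field names, and the helper `denormalize_orders` are all hypothetical and do not come from the course. Customer fields (dimension data) are copied into each order row (the fact), and each order's line items (child records) are nested inside the order (the parent). In BigQuery, a nested list like this would typically be modeled as a repeated RECORD column.

```python
from collections import defaultdict

# Hypothetical normalized source tables (illustrative data only).
CUSTOMERS = {
    1: {"name": "Ada", "region": "EMEA"},
    2: {"name": "Grace", "region": "AMER"},
}
ORDERS = [
    {"order_id": 100, "customer_id": 1, "total": 30.0},
    {"order_id": 101, "customer_id": 2, "total": 12.5},
]
ORDER_ITEMS = [
    {"order_id": 100, "sku": "A-1", "qty": 2},
    {"order_id": 100, "sku": "B-7", "qty": 1},
    {"order_id": 101, "sku": "A-1", "qty": 1},
]


def denormalize_orders(orders, customers, order_items):
    """Build one wide row per order so that reads need no joins."""
    # Group child records (line items) by their parent order.
    items_by_order = defaultdict(list)
    for item in order_items:
        items_by_order[item["order_id"]].append(
            {"sku": item["sku"], "qty": item["qty"]}
        )

    rows = []
    for order in orders:
        customer = customers[order["customer_id"]]
        rows.append(
            {
                "order_id": order["order_id"],
                "total": order["total"],
                # Dimension data duplicated into the fact row.
                "customer_name": customer["name"],
                "customer_region": customer["region"],
                # Child records nested inside the parent row.
                "items": list(items_by_order.get(order["order_id"], [])),
            }
        )
    return rows


if __name__ == "__main__":
    for row in denormalize_orders(ORDERS, CUSTOMERS, ORDER_ITEMS):
        print(row)
```

The trade-off mentioned in the transcript is visible here: the customer's name and region are repeated on every order row, so a change to a customer would mean rewriting many rows, while a read of an order with its customer and items touches a single row.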