From the course: Data Science on Google Cloud Platform: Designing Data Warehouses

Optimize storage

- [Instructor] While BigQuery is serverless and fully managed, it can still generate a large bill at the end of the month if data is not stored optimally. Here are some best practices for managing data storage. In data warehouses, tables are often created ad hoc and then left over forever, with no one actually using them after the initial analysis. This leads to a number of stray tables sitting idle and accruing storage charges. Decide which tables are temporary and which are permanent at the time of creation. Isolate temporary tables in separate datasets so they are easy to identify. Set expiration times for temporary tables so they expire and are pruned automatically, rather than relying on someone to manage them manually. Partitions in permanent tables can also be expired individually. Set up expiration for partitioned tables so that partitions are removed after an expected period has elapsed. One of the key headaches with storing data in GCP is accumulating monthly costs. They need to be managed before they become excessive. GCP provides multiple…
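
The expiration practices described above can be sketched as BigQuery DDL statements. The Python helpers below are a minimal illustration, not part of the course material: they only build the SQL text and do not call any Google Cloud client library. The dataset and table names (`scratch_tmp`, `analytics.events`) and the retention periods are hypothetical placeholders.

```python
from datetime import datetime, timedelta, timezone


def dataset_default_expiration_sql(dataset: str, days: int) -> str:
    """DDL giving new tables in a (temporary) dataset a default lifetime."""
    return (
        f"ALTER SCHEMA `{dataset}` "
        f"SET OPTIONS (default_table_expiration_days = {days})"
    )


def table_expiration_sql(table: str, days: int, now: datetime) -> str:
    """DDL that makes one table expire `days` after the timezone-aware `now`."""
    expires_at = now.astimezone(timezone.utc) + timedelta(days=days)
    stamp = expires_at.strftime("%Y-%m-%d %H:%M:%S")
    return (
        f"ALTER TABLE `{table}` "
        f"SET OPTIONS (expiration_timestamp = TIMESTAMP '{stamp} UTC')"
    )


def partition_expiration_sql(table: str, days: int) -> str:
    """DDL that expires each partition of a partitioned table after `days`."""
    return (
        f"ALTER TABLE `{table}` "
        f"SET OPTIONS (partition_expiration_days = {days})"
    )


if __name__ == "__main__":
    # Hypothetical names and retention periods, for illustration only.
    now = datetime.now(timezone.utc)
    print(dataset_default_expiration_sql("scratch_tmp", 7))
    print(table_expiration_sql("scratch_tmp.adhoc_results", 3, now))
    print(partition_expiration_sql("analytics.events", 90))
```

The generated statements could then be run in the BigQuery console or through whichever client is already in use. Setting the default on the temporary dataset means newly created tables there pick up an expiration without anyone having to remember to set one.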
