From the course: Advanced SQL for Data Scientists

Partitioning data

- [Instructor] One of the most effective ways to deal with large datasets is to use partitioning. Large tables can be difficult to query efficiently because they hold so much data, especially if you have to scan them or maintain very large indexes. Partitioning splits a table, by either rows or columns, into subsections that we call partitions. Horizontal partitioning splits a table by rows, so it limits the amount of data we have to scan to a subset of the table's rows. We can also have local indexes for each partition. In this example, a large table is broken down into three partitions, so we could have three distinct sets of local indexes. Each index would be smaller, scanning an index would involve less data, or if we needed to scan an entire partition, the amount of data would be…
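
To make the idea of horizontal partitioning with local indexes concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. It is an illustration under stated assumptions, not the setup used in the course: SQLite has no built-in table partitioning, so the sketch emulates it by hand with one table per year, and the table names, the sales data, and the helper functions are all hypothetical. Databases with native partitioning, such as PostgreSQL, route rows to partitions for you.

```python
import sqlite3

# Hypothetical layout: one table per year, emulating range partitioning by hand.
PARTITIONS = {2021: "sales_2021", 2022: "sales_2022", 2023: "sales_2023"}


def create_partitions(conn):
    """Create each partition table together with its own local index."""
    for table in PARTITIONS.values():
        conn.execute(
            f"CREATE TABLE {table} (sale_date TEXT NOT NULL, amount REAL NOT NULL)"
        )
        # Local index: it covers only the rows stored in this partition.
        conn.execute(f"CREATE INDEX idx_{table}_date ON {table} (sale_date)")


def partition_for(sale_date):
    """Pick the partition table from the year in an ISO date string."""
    return PARTITIONS[int(sale_date[:4])]


def insert_sale(conn, sale_date, amount):
    """Route a row to the partition that owns its date."""
    conn.execute(
        f"INSERT INTO {partition_for(sale_date)} (sale_date, amount) VALUES (?, ?)",
        (sale_date, amount),
    )


def total_for_year(conn, year):
    """Read a single partition instead of scanning every row of every year."""
    row = conn.execute(
        f"SELECT COALESCE(SUM(amount), 0) FROM {PARTITIONS[year]}"
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
create_partitions(conn)
for sale_date, amount in [
    ("2021-03-01", 10.0),
    ("2022-06-15", 20.0),
    ("2022-11-30", 5.0),
    ("2023-01-09", 7.5),
]:
    insert_sale(conn, sale_date, amount)

print(total_for_year(conn, 2022))
```

The point of the sketch is that a query for one year touches only that year's table and its small index, which is the benefit described above. The trade-off of doing this by hand is that the application has to route every insert and query itself.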