From the course: Data Science on Google Cloud Platform: Building Data Pipelines


Map

- A Map transform can be used to turn a key and an array of values into a key and a summary output. Map takes a PCollection as input and returns a PCollection of the same size. The function you pass to it can compute standard summaries like sum, length, min, and max, and you can also write custom summarization functions if needed. In this code example, we will average prices by computer product type. We will take the prodTypeGroups PCollection as input and apply an average function. The result is the key and a single value for that key. Let us look at the code now. The average code is available at line number 85. It's a simple pipeline that takes the prodTypeGroups PCollection and applies a lambda function that sums the values and divides the sum by the length of the array. This gives you the average for the entire array that belongs to each type of computer. We then print the contents of the PCollection. Let us execute the code now. The summary, by type of PC, is actually printed along with the arrays…
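The course's own code is not shown in this transcript, so the following is only a minimal sketch of the averaging step it describes. The sample data, the step labels, and the names `average`, `average_by_key`, and `run_with_beam` are assumptions made for illustration; only `prodTypeGroups` and the sum-divided-by-length logic come from the narration. The plain-Python helper mirrors what the Map step does to each element, and `run_with_beam` shows how the same function might be applied with the Apache Beam Python SDK, assuming it is installed.

```python
def average(prices):
    """Sum the values and divide by their count, as the narration describes."""
    prices = list(prices)  # grouped values may arrive as a generic iterable
    return sum(prices) / len(prices)


def average_by_key(grouped):
    """Plain-Python stand-in for the Map step: one output pair per input pair."""
    return [(key, average(values)) for key, values in grouped]


def run_with_beam(grouped):
    """Apply the same function with Apache Beam (assumes apache_beam is installed)."""
    import apache_beam as beam

    with beam.Pipeline() as pipeline:
        (
            pipeline
            # Hypothetical stand-in for the prodTypeGroups PCollection.
            | "CreateGroups" >> beam.Create(grouped)
            # Map emits exactly one (key, average) pair per (key, values) pair.
            | "AveragePrice" >> beam.Map(lambda kv: (kv[0], average(kv[1])))
            | "Print" >> beam.Map(print)
        )


# Made-up sample data shaped like the output of a group-by-key step.
sample_groups = [
    ("Desktop", [500.0, 700.0]),
    ("Laptop", [900.0, 1100.0, 1000.0]),
]

averages = average_by_key(sample_groups)
print(averages)
```

Calling `run_with_beam(sample_groups)` would print one (key, average) pair per product type. Because Map produces one output element for each input element, the resulting PCollection has the same number of elements as the grouped input.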
