From the course: Data Science on Google Cloud Platform: Building Data Pipelines
Reading text files - Google Cloud Tutorial
Reading text files
- [Voiceover] I will now start exploring the code operations that can be performed with Apache Beam and Cloud Dataflow. This and the following videos use a single example file called "Operations_Dataflow.py". It contains a complete data transformation application demonstrating various Apache Beam capabilities. We will walk through this script step by step and see how Apache Beam transforms data. We will start with reading text files. Apache Beam provides the apache_beam.io package, which contains pipeline I/O connectors for various data sources. We will use the text file reader transform called "ReadFromText". It can read files from the local file system, your server, or Cloud Storage. In this example we will use a file called "sales_transactions.csv", which has already been loaded into Cloud Storage. The file contains information about computer sales transactions. It has four fields: the ID of the transaction, the type of customer, whether…
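As a rough illustration of the reading step described in the transcript, here is a minimal sketch in Python. It is not the course's actual script: the function names split_fields and run, the step labels, and the bucket path in the usage note are hypothetical, and the naive comma split assumes the file has no quoted fields. Only ReadFromText, beam.Pipeline, beam.Map, and the skip_header_lines option come from the Apache Beam Python SDK.

```python
def split_fields(line):
    """Split one CSV line into a list of trimmed field strings.

    Assumption: fields contain no quoted commas, so a plain split is enough.
    """
    return [field.strip() for field in line.split(",")]


def run(input_path, skip_header_lines=0):
    """Read a text file with Beam, split each line, and print the rows.

    input_path may be a local path or a Cloud Storage path (gs://...).
    The function name and step labels are illustrative only.
    """
    # Imported inside the function so split_fields stays usable on its own.
    import apache_beam as beam
    from apache_beam.io import ReadFromText

    # Without explicit options this uses Beam's default local runner.
    with beam.Pipeline() as pipeline:
        (
            pipeline
            # ReadFromText emits one string element per line of the file.
            | "ReadSales" >> ReadFromText(
                input_path, skip_header_lines=skip_header_lines)
            | "SplitFields" >> beam.Map(split_fields)
            | "PrintRows" >> beam.Map(print)
        )
```

Calling run("gs://<your-bucket>/sales_transactions.csv") would read the file from Cloud Storage, where the bucket name is a placeholder to replace with your own; passing a local path should work the same way. Reading gs:// paths typically requires the GCP extras of the Beam package and valid credentials.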