From the course: Introduction to Spark SQL and DataFrames
Load data into DataFrames: CSV Files
- [Instructor] The first thing I'm going to do is load a CSV file. I have a file called location_temp, a time series file that contains the locations of sensors and the temperatures recorded at particular points in time. I'm going to create another variable called file_path, which is equal to my data_path plus the name of my file, location_temp.csv. I'm just going to hit Return, so notice that it does not execute the command yet. Now I'm going to create a DataFrame, which I'll call df1. I'm going to set df1 to the result of reading that file, using the Spark reader, spark.read. I'm going to specify the format as CSV. There are a number of different ways to express how to read from a CSV file; this is the particular form I'm using right now. I'm going to pass in an option that says the header is true, and I want to load from my file path.…
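The steps described so far can be sketched in PySpark roughly as follows. This is a minimal sketch, not the instructor's exact notebook code: the helper names `build_file_path`, `load_csv`, and `demo`, the application name, and the use of `os.path.join` in place of plain string concatenation are assumptions made for illustration, and the value of `data_path` is not given in the transcript, so it is left as a parameter.

```python
import os


def build_file_path(data_path: str, file_name: str = "location_temp.csv") -> str:
    """Combine the data directory with the CSV file name.

    The transcript adds the two strings together; os.path.join is used here
    (an assumption) so a missing trailing separator does not break the path.
    """
    return os.path.join(data_path, file_name)


def load_csv(spark, file_path: str):
    """Read a CSV file into a DataFrame, treating the first row as column names."""
    return (
        spark.read.format("csv")      # pick the CSV data source
        .option("header", "true")     # first line holds the column names
        .load(file_path)              # path to the file to read
    )


def demo(data_path: str):
    """Hypothetical end-to-end usage; needs pyspark and a local Spark runtime."""
    # Imported here so the helpers above can be used without pyspark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("load-csv-sketch").getOrCreate()
    df1 = load_csv(spark, build_file_path(data_path))
    df1.show(5)  # print the first five rows
    return df1
```

In a notebook where a `spark` session already exists, the body of `load_csv` can be typed directly, for example `df1 = spark.read.format("csv").option("header", "true").load(file_path)`. Without further options, every column is read as a string; adding `.option("inferSchema", "true")` asks Spark to guess column types at the cost of an extra pass over the data.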
Contents
- Set up a Jupyter notebook (2m 1s)
- Load data into DataFrames: CSV Files (7m 26s)
- Load data into DataFrames: JSON Files (3m 16s)
- Basic DataFrame operations (3m 26s)
- Filter data with DataFrame API (2m 13s)
- Aggregate data with DataFrame API (3m 47s)
- Sample data from DataFrames (5m 25s)
- Save data from DataFrames (3m 27s)