From the course: Introduction to Spark SQL and DataFrames


Filtering DataFrames with SQL


- [Instructor] Next, we're going to take a look at how to filter data frames using Spark SQL. So I've started a new workbook, I've imported the SQL library from PySpark, I've created a Spark session, and then loaded my data from the JSON file. Again, we're using the utilization data, which includes measurements on CPU utilization, free memory, and session count, and that's organized by time and by server ID. So the first thing I want to do, since I want to work with SQL, is to create or replace a temp view based on the data frame. So I'll specify the data frame, df, and then specify create or replace temp view, and specify the name of the table that I'd like to use. In this case we'll use utilization again. Now I can execute a Spark SQL statement, and I'm going to save the results as another data frame, and I'll call that df_sql, and to create that data frame, I will invoke Spark SQL with a SQL command, and I'm going…
