From the course: Introduction to Spark SQL and DataFrames


Aggregating Data with SQL


- [Instructor] When we work with SQL and databases, we often use SQL to perform aggregations, and the same holds true when working with SQL in Spark. So once again, I've started a new Jupyter Notebook and loaded data from our utilization file. That utilization data includes CPU utilization, free memory, and session count; those are the measures, and we organize them by time and by server ID. Because I want to work with SQL, the first thing I'm going to do is take the DataFrame that has our data and apply createOrReplaceTempView, and we'll call the view "utilization". Now let's do a very simple aggregation: let's get a count of the number of rows in the utilization table. We'll put that into a DataFrame called df_count by executing a Spark SQL statement, and that statement is simply going to be SELECT count(*) FROM utilization. Then let's show the results. OK, so we have…
