From the course: Introduction to Spark SQL and DataFrames

Apache Spark SQL and data analysis

- [Dan] Apache Spark and SQL are both widely used for data analysis and data science. In this course, we'll introduce DataFrames, the foundational data structure in Apache Spark, and see how to use SQL when working with them. We'll cover installing Spark, using Jupyter notebooks, and loading data from CSV and JSON files into Spark. You'll learn basic operations like filtering and aggregating, using both the DataFrame API and SQL. You'll also learn more advanced techniques like joining data, eliminating duplicates, and working with null values. We'll also develop techniques for exploratory data analysis, including analyzing time series data, using clustering, and applying linear regression. So join us now to learn about Apache Spark, SQL, and how to do data analysis with the two together.
