From the course: Introduction to Spark SQL and DataFrames


Using Jupyter notebooks with PySpark

- [Instructor] Now let's install Jupyter Notebook. To do that, we open a terminal and enter the command `python3 -m pip install jupyter`. Now, I already have it installed, but if you don't, this will download and install the Jupyter files for you. Okay, let's work with PySpark. I've opened a terminal window and navigated to my working directory, which in this case is in my home directory under LinkedIn Learning, and I simply call it Spark SQL. I can start PySpark by typing the `pyspark` command, and this will start Jupyter Notebook for me. You'll notice that when Jupyter Notebook opens, it lists the contents of the directory, so there are some data files and some IPYNB files. IPYNB is the suffix used for Jupyter Notebooks, and it comes from an earlier version of Jupyter Notebooks, which were called IPython Notebooks. That's why it's called IPYNB. What I would like to do is just show you what a basic Jupyter Notebook looks like. So I'm going to mouse over to the upper…
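The instructor's setup, where typing `pyspark` launches Jupyter Notebook, relies on PySpark being configured to use Jupyter as its driver. A minimal sketch of that configuration using Spark's standard `PYSPARK_DRIVER_PYTHON` environment variables (the working-directory path is illustrative, not from the course):

```shell
# Install Jupyter for the Python interpreter PySpark will use
python3 -m pip install jupyter

# Tell PySpark to launch Jupyter Notebook as its driver program
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS=notebook

# From the working directory (illustrative path), start PySpark;
# Jupyter opens in the browser and lists the directory's contents
cd ~/LinkedInLearning/SparkSQL
pyspark
```

Without those environment variables, `pyspark` starts a plain interactive shell instead of Jupyter; adding the exports to your shell profile makes the behavior persistent across sessions.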
