From the course: Python Data Analysis

Loading data sets

- [Instructor] You can download the Social Security name data set from the Social Security Administration website. But I have also included the archive names.zip in your exercise files. We need to uncompress it, which we can do in Python using the zipfile module. The interface is object-oriented. You first create a ZipFile object, then call extractall in the current directory. That's the dot. Jupyter lets us browse the contents of the current directory with ls. We see that names.zip unpacked into a directory with many text files, presumably one for every year. Let's have a look at one. We open it in read mode and print out the first few lines. It's a very simple comma-separated format: name, sex (presumably F or M), and then the number of babies born that year with that name. The pandas read_csv function shouldn't have any problems. But we did do something wrong. The first record in the file, Sofia, was used to set the names of the columns. We need…
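
The unpack-and-peek steps described above can be sketched roughly as follows. This is a minimal illustration, not the instructor's notebook: so that it runs without the exercise files, it first builds a tiny stand-in archive in a temporary directory. The file name yob2000.txt and the sample records are made up, and the archive is extracted into that temporary directory rather than into "." as in the video.

```python
import tempfile
import zipfile
from itertools import islice
from pathlib import Path

# Build a tiny stand-in for names.zip so the sketch runs anywhere.
# The file name and the records below are made up for illustration.
workdir = Path(tempfile.mkdtemp())
archive = workdir / "names.zip"
sample = "Emma,F,100\nOlivia,F,90\nNoah,M,80\nLiam,M,70\n"
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("names/yob2000.txt", sample)

# Create a ZipFile object, then extract everything into a target directory.
# The transcript passes "." (the current directory); a temp dir is used here.
with zipfile.ZipFile(archive) as zf:
    zf.extractall(workdir)

# List the unpacked files, much like running ls in a notebook.
unpacked = sorted(p.name for p in (workdir / "names").iterdir())
print(unpacked)

# Open one file in read mode and look at the first few lines.
with open(workdir / "names" / "yob2000.txt", "r") as f:
    first_lines = [line.rstrip("\n") for line in islice(f, 3)]
print(first_lines)
```

With the real archive, the same pattern would presumably be zipfile.ZipFile('names.zip').extractall('.'), followed by opening one of the extracted text files.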
