From the course: Faster pandas

What is vectorization?

- [Instructor] Vectorization, also known as array programming, is the ability to work on a set of values in parallel. Modern computers support that, and pandas uses this ability to get huge performance gains. Let's have a look. So start ipython, import pandas as pd, and let's create a series: s equals pd.Series of range of 10,000 items. And let's sum the series in the Python way. So timeit, total equals zero, and for val in s, total plus equals val. And this is 2.63 milliseconds. And now let's do the pandas version. So timeit s.sum. And we get 89.5 microseconds. So, we have 1,000 microseconds in every millisecond, so that's 2,630 divided by 89.5. And we got about 30 times faster by switching to vectorized operations. To people coming from non-scientific Python code, scientific Python code looks off. Where are the for loops? What does it mean for you? Every time you write a for loop, you should check if there's a vectorized option that does the same thing.
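
The comparison described in the transcript can be approximated outside IPython with a sketch like the one below. It uses the standard-library `timeit` module instead of the `%timeit` magic. The helper names `loop_sum` and `vectorized_sum` and the run count are illustrative choices, not taken from the course, and the measured times will differ from machine to machine.

```python
import timeit

import pandas as pd

# Same data as in the transcript: a Series holding 0..9999.
s = pd.Series(range(10_000))


def loop_sum(series):
    """Sum the values one at a time with a plain Python for loop."""
    total = 0
    for val in series:
        total += val
    return total


def vectorized_sum(series):
    """Sum the values with the built-in vectorized pandas method."""
    return series.sum()


# Number of repetitions is an arbitrary choice for this sketch.
n_runs = 20
loop_time = timeit.timeit(lambda: loop_sum(s), number=n_runs) / n_runs
vec_time = timeit.timeit(lambda: vectorized_sum(s), number=n_runs) / n_runs

print(f"for loop: {loop_time * 1e6:.1f} microseconds per run")
print(f"s.sum():  {vec_time * 1e6:.1f} microseconds per run")
print(f"speedup:  about {loop_time / vec_time:.0f}x")
```

Both functions return the same total, so the only difference being measured is how the work is carried out.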
