Introduction to NumPy
In this lesson, we will discuss the NumPy library, its significance in Python data analysis, and how to create your first NumPy arrays.
In this lesson, we'll discuss NumPy, a popular Python library for handling numerical data.
What exactly is a library, though?
Simply put, a library is a collection of code designed to perform specific tasks or solve particular problems. You can download and reuse this code in your own programs to achieve your goals more quickly and efficiently.
There are hundreds of libraries out there that make your life easier. For example, important data analysis libraries such as Pandas, SciPy, and Matplotlib.
What these 3 libraries have in common is that they all heavily rely on NumPy, another Python library.
NumPy is the go-to library for numerical computing. It's widely used in scientific computing, data analysis, machine learning, and other domains where numerical operations are prevalent.
To use NumPy in your code, ensure it is installed in your Python environment.
Then, you need to import it using an import
statement.
Now you can access all of numpy's functionality under the name numpy
.
But, you can also define a custom alias for a library using the as
keyword:
Now we can access NumPy using the alias np.
This practice is widely used because it makes your code easier to understand for anyone who works with it.
Now that we have access to NumPy, it's time to start working with it.
At the core of NumPy lies its array and matrix data structures. They are used to make an ordered collection of homogeneous elements, typically numbers.
Let's create our first NumPy array:
Here, we used the np.array()
method to create a numpy array.
The argument we passed to np.array()
was the list [1,2,3]
.
Using the type()
function, we observe that the type of the resulting array is numpy.ndarray
.
When printed, the NumPy array output resembles that of a normal list.
So, why not use a normal list in the first place?
The simple answer: Lists are slow, whereas NumPy arrays are highly efficient.
One reason for their efficiency is that all array elements generally have the same type. This not only saves memory but also enables much faster computations.
Let's create a NumPy array from a list that contains a mix of integers and strings:
The integers have been transformed to strings.
NumPy will always try to coerce elements of different types into the same type.
What will be the output?
You can use indexing to get values from a NumPy array:
And slicing to get a subset of the array:
When creating a subset using slicing, NumPy will return a view.
We have already talked about views in the context of dictionaries and methods like dict.items()
A view is not a copy of the sliced data, but a dynamic representation of the original.
This saves memory and is faster.
But, you need to be careful because if you modify the data in a view, you will also modify the original array.
For example, here changing the first element of our slice will also modify the original array:
If you have to reuse the original array again, it's better to create a real copy of the data using array.copy()
:
Here, we slice the original array and then apply the array.copy()
method to the resulting sub-array:
Now, the original array remains unchanged.
What will be the output?
What will be the output?
What will be the output?
What will be the output?