In this Python and Pandas tutorial, we provide an easy introduction to the Pandas Series data structure. The YouTube tutorial accompanying this webpage is given below.
What is Pandas Series?
A Pandas Series is a one-dimensional labeled array built on top of the NumPy “ndarray” data structure. Each element has a label, and the labels together form the index of the Series. A Series supports both position-based (integer) and label-based access. It can be created from a NumPy array, from a dictionary, or by specifying the labels and values directly.
Construct Pandas Series in Python and Basic Operations
In this tutorial, you will need the Pandas and NumPy libraries. To install them, open a terminal and type:
pip install pandas
pip install numpy
First, let us explain how to construct a Pandas Series from a NumPy array:
import numpy as np
import pandas as pd
np.random.seed(1)
randomArray=np.random.randn(15)
series1=pd.Series(randomArray)
The object called series1 looks like this in the Python workspace:
0 1.624345
1 -0.611756
2 -0.528172
3 -1.072969
4 0.865408
5 -2.301539
6 1.744812
7 -0.761207
8 0.319039
9 -0.249370
10 1.462108
11 -2.060141
12 -0.322417
13 -0.384054
14 1.133769
dtype: float64
To access a particular element of a series, we index it by its label. With the default integer index used here, the label 5 is also position 5:
series1[5]
To access several values at once, we can pass a list of labels or use slicing:
series1[[1,6,9]]
series1[0:5]
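With the default integer index, labels and positions coincide, so plain square brackets are unambiguous. As a minimal sketch, assuming the same series1 as above, the explicit .loc (label-based) and .iloc (position-based) indexers make the intent clear and show one difference between them: a label-based slice includes its end label, while a position-based slice does not. The variable names are only illustrative.

```python
import numpy as np
import pandas as pd

np.random.seed(1)
series1 = pd.Series(np.random.randn(15))

# .loc selects by label, .iloc selects by integer position.
by_label = series1.loc[5]
by_position = series1.iloc[5]

# A label-based slice includes the end label; a position-based slice excludes it.
label_slice = series1.loc[0:5]      # labels 0, 1, 2, 3, 4, 5
position_slice = series1.iloc[0:5]  # positions 0, 1, 2, 3, 4

print(by_label == by_position)
print(len(label_slice), len(position_slice))
```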
To print the first and last several entries of a Pandas Series object, we can use the head() and tail() methods:
series1.head()
series1.tail()
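By default, head() and tail() return five entries. Both accept an optional number of entries, as in the short sketch below, which rebuilds series1 so that it can be run on its own.

```python
import numpy as np
import pandas as pd

np.random.seed(1)
series1 = pd.Series(np.random.randn(15))

default_head = series1.head()  # first 5 entries
first_three = series1.head(3)  # first 3 entries
last_two = series1.tail(2)     # last 2 entries

print(first_three)
print(last_two)
```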
To print the index and the values that define the Pandas Series, we can type
series1.index
series1.values
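The following sketch inspects what these two attributes return for series1. A series built from a NumPy array without explicit labels gets a default integer index, and the values come back as a NumPy array. The to_numpy() method is shown as an alternative way to obtain the values; it is not used elsewhere in this tutorial.

```python
import numpy as np
import pandas as pd

np.random.seed(1)
series1 = pd.Series(np.random.randn(15))

index1 = series1.index        # default integer index: 0, 1, ..., 14
values1 = series1.values      # the underlying data as a NumPy array
values2 = series1.to_numpy()  # alternative accessor for the same data

print(type(index1).__name__)
print(type(values1).__name__, values1.shape)
```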
Next, we can construct a Pandas Series by directly specifying the index and values. To do that, we call the Pandas constructor “Series”:
series2=pd.Series([2,3,2,4,7],index=['a','b','c','d','e'])
Then, we can index the entries and access them by using the assigned labels
series2['a']
series2[['a','c','e']]
We can also access entries by integer position, regardless of the labels, by using the iloc[] indexer:
series2.iloc[1]
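To make the difference between labels and positions concrete, here is a small sketch on the same series2. Position 1 corresponds to label 'b', so both forms of access return the same value. The variable names are only illustrative.

```python
import pandas as pd

series2 = pd.Series([2, 3, 2, 4, 7], index=['a', 'b', 'c', 'd', 'e'])

second_by_position = series2.iloc[1]  # position 1
second_by_label = series2.loc['b']    # label 'b'

# iloc also accepts lists of positions and slices (end position excluded).
picked = series2.iloc[[0, 2, 4]]
middle = series2.iloc[1:3]

print(second_by_position, second_by_label)
print(list(middle.index))
```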
To get the length and shape of the series, we can use the built-in function len() and the .shape attribute:
len(series2)
series2.shape
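For a one-dimensional object such as a Series, the shape is a tuple with a single entry equal to the length. A quick check on series2:

```python
import pandas as pd

series2 = pd.Series([2, 3, 2, 4, 7], index=['a', 'b', 'c', 'd', 'e'])

length = len(series2)  # number of entries
shape = series2.shape  # one-element tuple for a one-dimensional object

print(length, shape)
```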
To get the unique values of a Pandas Series and the number of times each value occurs, we can use the methods unique() and value_counts():
series2.unique()
series2.value_counts()
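As a sketch of what these two calls return for series2: unique() gives a NumPy array of the distinct values in order of first appearance, and value_counts() gives a new Series whose index holds the distinct values and whose values are the counts, most frequent first.

```python
import pandas as pd

series2 = pd.Series([2, 3, 2, 4, 7], index=['a', 'b', 'c', 'd', 'e'])

unique_values = series2.unique()  # distinct values, in order of first appearance
counts = series2.value_counts()   # Series mapping each value to its count

print(unique_values)
print(counts)
```

Here the value 2 occurs twice and every other value occurs once.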
To construct Pandas Series from Python dictionaries, we use the approach presented below. First, we create a dictionary, and then we pass that dictionary to the Pandas Series constructor:
dict1={"John": 81, "Michael": 100, "Anna": 60}
series3=pd.Series(dict1)
Then we can access the stored values by using label-based indexing:
series3['John']
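When a series is built from a dictionary, the keys become the index labels and the dictionary values become the series values. A short sketch that confirms this for dict1:

```python
import pandas as pd

dict1 = {"John": 81, "Michael": 100, "Anna": 60}
series3 = pd.Series(dict1)

labels = list(series3.index)  # the dictionary keys, in insertion order
values = series3.tolist()     # the dictionary values

print(labels)
print(values)
print(series3['John'])
```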
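We can redefine the Pandas Series by specifying a new set of index labels. We pass the labels as a list, since a set has no defined order and the order of the resulting entries would be arbitrary:
newNames=["John","Michael","Anna","Alex","Emily"]
series4=pd.Series(dict1,index=newNames)
The labels "Alex" and "Emily" are not keys of dict1, so their values are missing (NaN). We can check for missing values in a Pandas Series like this:
series4.isnull()
John False
Michael False
Anna False
Alex True
Emily True
dtype: bool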
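Once missing values are found, they usually need to be counted, dropped, or replaced. The sketch below shows one way to do each, using the methods dropna() and fillna(), which this tutorial does not otherwise cover. The fill value 0 is an arbitrary choice for illustration. Note also that the integer scores become floats, because NaN is a floating-point value.

```python
import pandas as pd

dict1 = {"John": 81, "Michael": 100, "Anna": 60}
new_names = ["John", "Michael", "Anna", "Alex", "Emily"]
series4 = pd.Series(dict1, index=new_names)

n_missing = series4.isnull().sum()  # number of missing entries
present = series4.dropna()          # copy without the missing entries
filled = series4.fillna(0)          # copy with missing entries replaced by 0

print(series4.dtype)
print(n_missing)
print(list(present.index))
```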
We can check whether certain labels are in the index of the Pandas Series like this:
"Patrick" in series4
"Alex" in series4
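The in operator tests the index labels of a series, not its values. The sketch below illustrates this. Testing against .values is shown as one way to check for a stored value instead.

```python
import pandas as pd

dict1 = {"John": 81, "Michael": 100, "Anna": 60}
series3 = pd.Series(dict1)
series4 = pd.Series(dict1, index=["John", "Michael", "Anna", "Alex", "Emily"])

has_patrick = "Patrick" in series4  # not a label
has_alex = "Alex" in series4        # a label, even though its value is missing

# 81 is a stored value, not a label, so `in` on the series does not find it.
value_as_label = 81 in series3
value_in_values = 81 in series3.values

print(has_patrick, has_alex)
print(value_as_label, value_in_values)
```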
We can iterate over a Pandas Series, for example in a Python list comprehension. Iteration yields the values, not the labels:
list2=[a for a in series4]
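As a sketch of how iteration behaves, the example below uses series3, which has no missing values. It shows a list comprehension over the values, the items() method for iterating over label and value pairs, and the equivalent element-wise arithmetic on the whole series, which is usually preferred over an explicit loop. The doubling operation is only an illustration.

```python
import pandas as pd

dict1 = {"John": 81, "Michael": 100, "Anna": 60}
series3 = pd.Series(dict1)

# Iterating over a series yields its values.
doubled_list = [v * 2 for v in series3]

# items() yields (label, value) pairs.
pairs = [(label, value) for label, value in series3.items()]

# The same doubling, applied to the whole series at once; labels are kept.
doubled_series = series3 * 2

print(doubled_list)
print(pairs)
print(doubled_series)
```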