September 19, 2024

Easy Introduction to Pandas Series in Python

In this Python and Pandas tutorial, we provide an easy introduction to the Pandas Series data structure. The YouTube tutorial accompanying this webpage is given below.

What is Pandas Series?

A Pandas Series is a one-dimensional labeled array built on top of the NumPy “ndarray” data structure. Every element has a label, and the labels together form the index of the Series. A Series supports both position-based (integer) and label-based access. It can be created from a NumPy array, from a dictionary, or by manually specifying the labels and values.

Construct Pandas Series in Python and Basic Operations

In this tutorial, you will need the Pandas and NumPy libraries. To install them, open a terminal and type

pip install pandas
pip install numpy 

First, let us explain how to construct Pandas Series from a NumPy array. We do it like this


import numpy as np

import pandas as pd 

np.random.seed(1)

randomArray=np.random.randn(15)

series1=pd.Series(randomArray)

The object called series1 looks like this in the Python workspace:

0     1.624345
1    -0.611756
2    -0.528172
3    -1.072969
4     0.865408
5    -2.301539
6     1.744812
7    -0.761207
8     0.319039
9    -0.249370
10    1.462108
11   -2.060141
12   -0.322417
13   -0.384054
14    1.133769
dtype: float64

To access a particular element of a series, we index it with the element's label. With the default integer index, the label 5 refers to the sixth element:

series1[5]

To access several values at once, we can pass a list of labels or use slicing

series1[[1,6,9]]

series1[0:5]
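
The explicit indexers .iloc and .loc treat the end point of a slice differently, which is easy to miss with a default integer index. The following is a minimal sketch, assuming a small hypothetical series called demo that is not part of the tutorial's data:

```python
import pandas as pd

# Hypothetical series with the default integer index 0..4
demo = pd.Series([10, 20, 30, 40, 50])

# Position-based slice: positions 0, 1, 2 (end point excluded)
by_position = demo.iloc[0:3]

# Label-based slice: labels 0 through 3 (end point included)
by_label = demo.loc[0:3]

# Selecting several labels at once with a list
picked = demo.loc[[1, 4]]

print(by_position.tolist())
print(by_label.tolist())
print(picked.tolist())
```

Writing .iloc or .loc explicitly makes it clear whether a number is meant as a position or as a label.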

To print the first and last several entries of a Pandas Series object, we can use the Pandas head() and tail() methods

series1.head()

series1.tail()
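
Both methods return five entries by default and accept an optional number of entries. A short sketch, assuming a hypothetical series s that holds the integers 0 to 9:

```python
import pandas as pd

# Hypothetical series holding 0, 1, ..., 9
s = pd.Series(range(10))

first_three = s.head(3)   # first 3 entries
last_two = s.tail(2)      # last 2 entries
default_head = s.head()   # 5 entries when no argument is given

print(first_three.tolist())
print(last_two.tolist())
print(len(default_head))
```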

To print the index and the values that define the Pandas Series, we can type

series1.index
series1.values
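
To see what these two attributes hold, the sketch below inspects their types on a small hypothetical series s (the name and the numbers are only for illustration). It also uses to_numpy(), which is another way to obtain the underlying NumPy array:

```python
import numpy as np
import pandas as pd

# Hypothetical series built from a NumPy array, with the default index
s = pd.Series(np.array([1.5, 2.5, 3.5]))

# The default index is a RangeIndex starting at 0
print(type(s.index).__name__)
print(list(s.index))

# The values are stored as a NumPy ndarray
print(type(s.values).__name__)

# to_numpy() also returns the data as a NumPy array
arr = s.to_numpy()
print(arr)
```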

Next, we can construct a Pandas Series by directly specifying the index and values. To do that, we call the Pandas constructor “Series” and pass the labels through its index argument:

series2=pd.Series([2,3,2,4,7],index=['a','b','c','d','e'])

Then, we can index the entries and access them by using the assigned labels

series2['a']

series2[['a','c','e']]

We can also access entries by their integer position, regardless of the labels, by using the iloc[] indexer:

series2.iloc[1]
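
The difference between label-based and position-based access can be sketched as follows. The series is the same as series2 above; the variable names on the left are chosen only for this illustration:

```python
import pandas as pd

series2 = pd.Series([2, 3, 2, 4, 7], index=['a', 'b', 'c', 'd', 'e'])

by_label = series2.loc['b']          # entry with label 'b'
by_position = series2.iloc[1]        # second entry, whatever its label is
last_entry = series2.iloc[-1]        # negative positions count from the end
label_slice = series2.loc['b':'d']   # label slices include both end points

print(by_label, by_position, last_entry)
print(label_slice.tolist())
```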

To get the length and shape of the series, we can use the built-in len() function and the .shape attribute

len(series2)

series2.shape
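
Since a Series is one-dimensional, its shape is a tuple with a single entry. A short sketch of what to expect, using the same series2 (the .size and .ndim attributes are shown as additional ways to query the dimensions):

```python
import pandas as pd

series2 = pd.Series([2, 3, 2, 4, 7], index=['a', 'b', 'c', 'd', 'e'])

n = len(series2)        # number of entries
shape = series2.shape   # one-element tuple
size = series2.size     # total number of entries
ndim = series2.ndim     # number of dimensions

print(n, shape, size, ndim)
```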

To get the unique values of a Pandas Series and the number of times each value occurs, we can use the unique() and value_counts() methods:

series2.unique()

series2.value_counts()
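
The two methods return different kinds of objects, which the sketch below makes visible. It reuses series2, and nunique() is added as a quick way to count the distinct values:

```python
import pandas as pd

series2 = pd.Series([2, 3, 2, 4, 7], index=['a', 'b', 'c', 'd', 'e'])

# unique() returns a NumPy array, in order of first appearance
uniques = series2.unique()

# value_counts() returns a Series: the index holds the values,
# the data holds the counts, most frequent value first
counts = series2.value_counts()

# nunique() returns the number of distinct values
n_unique = series2.nunique()

print(uniques)
print(counts)
print(n_unique)
```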

To construct Pandas Series from Python dictionaries, we use the approach presented below. First, we create a dictionary, and then we pass that dictionary to the Pandas Series constructor:

dict1={"John": 81, "Michael": 100, "Anna": 60}
series3=pd.Series(dict1)

Then we can access the stored values by using label-based indexing

series3['John']
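
The dictionary keys become the index labels and the dictionary values become the data. The sketch below illustrates this, and also shows get(), which returns a default instead of raising an error for a missing label. The label "Patrick" and the default of -1 are arbitrary choices for this illustration:

```python
import pandas as pd

dict1 = {"John": 81, "Michael": 100, "Anna": 60}
series3 = pd.Series(dict1)

# Keys become index labels, in the dictionary's insertion order
labels = list(series3.index)

# get() returns the default when the label is absent
score_john = series3.get("John", -1)
score_patrick = series3.get("Patrick", -1)

# to_dict() converts the Series back to a dictionary
round_trip = series3.to_dict()

print(labels)
print(score_john, score_patrick)
print(round_trip)
```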

We can redefine a Pandas Series by specifying a new set of index labels:


newNames=["John","Michael","Anna","Alex","Emily"]

series4=pd.Series(dict1,index=newNames)

This will create missing values (NaN) for the labels “Alex” and “Emily”, which do not appear in the dictionary. We can check for the missing values in Pandas Series like this:

series4.isnull()
John       False
Michael    False
Anna       False
Alex        True
Emily       True
dtype: bool
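
Once the missing values are located, they can be counted, dropped, or replaced. The following is a minimal sketch, assuming the labels are passed as a list and that 0 is an acceptable replacement value (both are choices made for this illustration):

```python
import pandas as pd

dict1 = {"John": 81, "Michael": 100, "Anna": 60}
names = ["John", "Michael", "Anna", "Alex", "Emily"]
series4 = pd.Series(dict1, index=names)

# The integer scores are stored as floats because NaN is a float
print(series4.dtype)

# Count the missing entries
missing_count = int(series4.isnull().sum())

# notnull() is the opposite of isnull()
present = series4.notnull()

# Drop the missing entries, or replace them with a chosen value
dropped = series4.dropna()
filled = series4.fillna(0)

print(missing_count)
print(dropped)
print(filled)
```

Both dropna() and fillna() return a new Series here and leave series4 unchanged.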

We can check if certain labels are in the index of the Pandas Series like this

"Patrick" in series4
"Alex" in series4

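Note that the in operator looks at the index labels and not at the stored values. A short sketch of the difference, using a series built from the same dictionary as above (the variable names on the left are chosen only for this illustration):

```python
import pandas as pd

series3 = pd.Series({"John": 81, "Michael": 100, "Anna": 60})

label_found = "John" in series3        # "John" is a label
value_as_label = 81 in series3         # 81 is a value, not a label
value_found = 81 in series3.values     # search the values explicitly

# isin() tests every value against a list of candidates
mask = series3.isin([81, 60])

print(label_found, value_as_label, value_found)
print(mask.tolist())
```
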
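We can use Python list comprehensions on a Pandas Series. Iterating over a Series yields its values (not its labels), so the comprehension below collects the values into a list: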

list2=[a for a in series4]
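
When the labels are needed as well, the items() method yields (label, value) pairs. A minimal sketch, using a series without missing values and an arbitrary threshold of 70 chosen for this illustration:

```python
import pandas as pd

series3 = pd.Series({"John": 81, "Michael": 100, "Anna": 60})

# Iterating over the Series yields the values only
values = [v for v in series3]

# items() yields (label, value) pairs
pairs = [(label, value) for label, value in series3.items()]

# Labels of the entries whose value exceeds the threshold
above_70 = [label for label, value in series3.items() if value > 70]

print(values)
print(pairs)
print(above_70)
```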