May 8, 2024

8. Download S&P 500 Stock Data and Make a Stock Screener in Python – Find Stocks with Minimum RSI Values

In this Python time-series tutorial, we show how to build a stock screener in Python that computes the Relative Strength Index (RSI) of the stocks in the S&P 500 index. The RSI is commonly used to judge whether a stock is oversold or overbought. You can modify the code to compute other indicators from the downloaded stock data. Before reading this post, we strongly advise you to read our post on the computation of the RSI, as well as our other posts on time series analysis and modeling in Python. A YouTube video accompanies this post.

We first download the price data for the stocks in the S&P 500 index. The following code imports the necessary libraries, defines the start and end dates of the download window, obtains the S&P 500 ticker symbols through the yahoo_fin package, downloads the data, and saves it to the local disk.

# -*- coding: utf-8 -*-
"""
Simple stock screener that downloads the stock data and computes the RSI parameters

Author:
    Aleksandar Haber

Date: March 07, 2021
"""

from pandas_datareader import data as pdr
from yahoo_fin import stock_info as si
#import yfinance as yf
import pandas as pd
#import datetime
#import time
import numpy as np

# variables
tickers = si.tickers_sp500()
tickers = [item.replace(".", "-") for item in tickers] # Yahoo Finance uses dashes instead of dots

start_date='2020-10-01'
#datetime.datetime(2020,5,5)   # year, month, day
end_date='2021-03-05'
#datetime.datetime(2021,2,22)

# function that downloads the data
def download_all_stock_data(all_stock_symbols, start_date, end_date):
    def download_stock_data(single_symbol):
        print(' Downloading '+single_symbol+' data ')
#       try:
        tmp1=pdr.get_data_yahoo(single_symbol,start=start_date,end=end_date)
#       except KeyError:
#            pass 
        return(tmp1)
    downloaded_data=map(download_stock_data,all_stock_symbols)
    return(pd.concat(downloaded_data,keys=all_stock_symbols, names=['Ticker','Date']))        

stock_data=download_all_stock_data(tickers, start_date, end_date)
# save the data to the local disk
fileName = 'downloadedData.pkl'
stock_data.to_pickle(fileName)

stock_data.loc['A']

Most of this code is self-explanatory (if anything is unclear, please go over our other tutorials), except perhaps code line 37, which might look strange to readers with a strong background in other programming languages, such as C++ or MATLAB. This line calls the built-in function map (see the Python documentation for details). The function map applies “download_stock_data()” to every value stored in “all_stock_symbols” and returns a map object, which is a lazy iterator: the downloads actually run only when the iterator is consumed. That happens on code line 38, where pd.concat() consumes the iterator, stacks the per-symbol DataFrames into a single Pandas DataFrame indexed by ticker and date, and returns the result.
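The interplay of map and pd.concat() can be tried without any network access. The following is a minimal sketch, not the code of this post: the function fake_download and the prices in FAKE_PRICES are hypothetical stand-ins for the real download, introduced only for illustration.

```python
import pandas as pd

# Made-up closing prices, for illustration only
FAKE_PRICES = {"A": [10.0, 11.0, 12.0], "B": [20.0, 19.0, 21.0]}

def fake_download(single_symbol):
    # Hypothetical stand-in for download_stock_data(): builds a tiny
    # DataFrame locally instead of contacting a data provider
    dates = pd.date_range("2021-03-01", periods=3, freq="D")
    return pd.DataFrame({"Close": FAKE_PRICES[single_symbol]}, index=dates)

symbols = ["A", "B"]

# map() does not call fake_download yet; it returns a lazy iterator
frames = map(fake_download, symbols)

# pd.concat() consumes the iterator and stacks the frames; keys and names
# create a two-level (Ticker, Date) index
combined = pd.concat(frames, keys=symbols, names=["Ticker", "Date"])

print(combined)
print(combined.loc["A"])  # rows of a single ticker, indexed by date only
```

Because the map object is an iterator, it can be consumed only once. If the individual DataFrames are needed again later, wrap the call as list(map(...)) before passing it on.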

Now that we have all the stock data, we can proceed with the computation of the RSI values. The following code defines a function that computes the RSI values of a single stock, applies it to every ticker, and finds the 80 stocks with the smallest most recent RSI values. The code for computing the RSI values is based on the code explained in our previous post.
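Before reading the listing, it may help to see the recursion it implements written as formulas. The notation is ours and is not taken from the post: C_k is the closing price on day k, N is the value of period_RSI, and the bars denote the running averages of the upward and downward price moves.

```latex
\begin{align}
d_k &= C_k - C_{k-1}, \qquad u_k = \max(d_k, 0), \qquad v_k = \max(-d_k, 0), \\
\bar{u}_N &= \frac{1}{N}\sum_{k=1}^{N} u_k, \qquad \bar{v}_N = \frac{1}{N}\sum_{k=1}^{N} v_k, \\
\bar{u}_k &= \frac{(N-1)\,\bar{u}_{k-1} + u_k}{N}, \qquad
\bar{v}_k = \frac{(N-1)\,\bar{v}_{k-1} + v_k}{N}, \qquad k > N, \\
\mathrm{RSI}_k &=
\begin{cases}
100 - \dfrac{100}{1 + \bar{u}_k / \bar{v}_k}, & \bar{v}_k \neq 0, \\
100, & \bar{v}_k = 0,
\end{cases}
\qquad k \ge N.
\end{align}
```

The first N differences only initialize the two averages, which is why the first RSI value appears at day N and the returned series is N entries shorter than the price series.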

###############################################################################
# definition of the function for computing the RSI parameter
###############################################################################

def compute_RSI(data,period_RSI):
    
    differencePrice = data['Close'].diff()
    differencePriceValues=differencePrice.values

    positive_difference=0
    negative_difference=0
    current_average_positive=0
    current_average_negative=0
    price_index=0
    RSI=[]

    
    for difference in differencePriceValues[1:]:
    
        if difference>0:
            positive_difference=difference
            negative_difference=0                
        if difference<0:
            negative_difference=np.abs(difference)
            positive_difference=0
        if difference==0:
            negative_difference=0
            positive_difference=0
    
        # this if block is used to initialize the averages
        if (price_index<period_RSI):
        
            current_average_positive=current_average_positive+(1/period_RSI)*positive_difference
            current_average_negative=current_average_negative+(1/period_RSI)*negative_difference
              
            if(price_index==(period_RSI-1)):
                #safeguard against current_average_negative=0
                if current_average_negative!=0:
                    RSI.append(100 - 100/(1+(current_average_positive/current_average_negative)))           
                else:
                    RSI.append(100)
        # the else branch is executed for price_index>=period_RSI, the averages are updated recursively
        else:
        
            current_average_positive=((period_RSI-1)*current_average_positive+positive_difference)/(period_RSI)
            current_average_negative=((period_RSI-1)*current_average_negative+negative_difference)/(period_RSI)
        
            #safeguard against current_average_negative=0
            if current_average_negative!=0:
                RSI.append(100 - 100/(1+(current_average_positive/current_average_negative)))   
            else:
                RSI.append(100)
            
        price_index=price_index+1    

    
    RSI_series=pd.Series(data=RSI,index=differencePrice.index[period_RSI:])
    return(RSI_series)   
    
###############################################################################
#               end of function definition
###############################################################################



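RSI_all_ticker=pd.Series(index=tickers, dtype=float)  # explicit dtype for the empty Series

for stock_symbol in tickers:
    test1=compute_RSI(stock_data.loc[stock_symbol],28)
    RSI_all_ticker.loc[stock_symbol]=test1.iloc[-1]  # most recent RSI value, selected by position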
    
RSI_all_ticker.plot(figsize=(12,12))


RSI_all_ticker.idxmin()

RSI_all_ticker.nsmallest(80)

RSI_all_ticker['LMT']

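A few comments are in order. The loop stores the most recent RSI value of every ticker in “RSI_all_ticker”, and “RSI_all_ticker.idxmin()” returns the ticker with the smallest of these values. The code line 77 lists the 80 stocks with the smallest RSI values. We can use this list as a starting point for identifying stocks that may be oversold.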
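Instead of taking a fixed number of stocks, the same Series can also be filtered by a threshold. The following is a minimal sketch under stated assumptions: the RSI values and ticker names are made up for illustration, and the thresholds 30 and 70 are commonly quoted conventions rather than values used in this post.

```python
import pandas as pd

# Made-up RSI values, for illustration only; in the post these values
# are produced by compute_RSI() for every ticker
rsi_latest = pd.Series(
    {"AAA": 25.0, "BBB": 48.5, "CCC": 71.2, "DDD": 29.9, "EEE": 55.0}
)

# Conventional thresholds (an assumption, not taken from the post)
OVERSOLD = 30
OVERBOUGHT = 70

# Fixed-size screen, as in the post: the n smallest RSI values
lowest = rsi_latest.nsmallest(2)

# Threshold-based screens: boolean masks select the matching tickers
oversold = rsi_latest[rsi_latest < OVERSOLD].sort_values()
overbought = rsi_latest[rsi_latest > OVERBOUGHT].sort_values(ascending=False)

print(lowest)
print(oversold)
print(overbought)
```

A threshold-based screen can return an empty Series when no stock qualifies, whereas nsmallest always returns the requested number of stocks, regardless of how low their RSI values actually are.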