Howdy, this is the next installment of my articles on the CAPE <> F-Score Strategy. If you haven't already, be sure to go back and read the two previous posts.

As a quick recap, last time we created a list called tickers for all equities that make up the S&P 500. We then scraped financial data from Yahoo Finance for all of the tickers, and created a few pandas dataframes, one for each year of financial data that we scraped.

In this post we're going to parse through the financial data that we scraped and select the relevant fundamental data to compute an F-Score for each item in our list of tickers.

To start, let's create two lists, one called stats and one called indx. The stats list names the line items of financial performance/health we need in order to compute an F-Score for each equity, and indx gives each of those line items a short label, in the same order.

# selecting relevant financial information for each stock using fundamental data
stats = ["Net income available to common shareholders",
         "Total assets",
         "Net cash provided by operating activities",
         "Long-term debt",
         "Other long-term liabilities",
         "Total current assets",
         "Total current liabilities",
         "Common stock",
         "Total revenue",
         "Gross profit"] 

indx = ["NetIncome","TotAssets","CashFlowOps","LTDebt","OtherLTDebt",
        "CurrAssets","CurrLiab","CommStock","TotRevenue","GrossProfit"]

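Since the two lists are matched up purely by position, a quick sanity check can catch a mismatch early. Here's a minimal sketch; the label_map name is only for illustration and isn't used anywhere else in this series:

```python
stats = ["Net income available to common shareholders",
         "Total assets",
         "Net cash provided by operating activities",
         "Long-term debt",
         "Other long-term liabilities",
         "Total current assets",
         "Total current liabilities",
         "Common stock",
         "Total revenue",
         "Gross profit"]

indx = ["NetIncome","TotAssets","CashFlowOps","LTDebt","OtherLTDebt",
        "CurrAssets","CurrLiab","CommStock","TotRevenue","GrossProfit"]

# both lists must line up one-to-one
assert len(stats) == len(indx)

# hypothetical helper: long statement name -> short label
label_map = dict(zip(stats, indx))
print(label_map["Total assets"])  # TotAssets
```
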
Next, we'll define a few functions to filter the relevant screening criteria we just defined in the stats & indx lists, and to calculate an F-Score for each item in our tickers list (read that last part as 'for each equity in the S&P'). We'll also have to do a bit of data cleansing to transform the string inputs to numeric so we can use them appropriately later on.

Here's the function to filter the relevant screening criteria:

import pandas as pd

def info_filter(df,stats,indx):
    """function to filter relevant financial information for each 
       equity and transform string inputs to numeric"""
    tickers = df.columns
    all_stats = {}
    for ticker in tickers:
        try:
            temp = df[ticker]
            ticker_stats = []
            for stat in stats:
                ticker_stats.append(temp.loc[stat])
            all_stats['{}'.format(ticker)] = ticker_stats
        except KeyError:
            # a line item is missing for this ticker, so skip it
            print("can't read data for ",ticker)

    all_stats_df = pd.DataFrame(all_stats,index=indx)

    # cleansing of fundamental data imported in dataframe:
    # strip thousands separators, then convert strings to numbers.
    # Work on the whole frame rather than indexing by tickers, because
    # tickers skipped above are not columns here and would raise a KeyError.
    all_stats_df = all_stats_df.replace({',': ''}, regex=True)
    for ticker in all_stats_df.columns:
        all_stats_df[ticker] = pd.to_numeric(all_stats_df[ticker].values,errors='coerce')
    return all_stats_df
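
To see the cleansing step in isolation, here's a minimal sketch on a made-up frame. The tickers AAA and BBB and their values are invented for illustration, and the "-" placeholder is only an assumption about how a missing value might show up in scraped data:

```python
import pandas as pd

# hypothetical scraped values: strings with thousands separators,
# plus a "-" standing in for a missing figure
raw = pd.DataFrame(
    {"AAA": ["1,234,567", "89,000"], "BBB": ["-", "45,600"]},
    index=["NetIncome", "TotAssets"],
)

# strip the commas, then coerce every column to numeric
cleaned = raw.replace({",": ""}, regex=True)
cleaned = cleaned.apply(pd.to_numeric, errors="coerce")
print(cleaned)
```

Anything that can't be parsed becomes NaN. Since any comparison against NaN evaluates to False, a missing input ends up scoring the affected F-Score signal as 0 rather than raising an error.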

And here's the function to calculate the F-Score of each tickers item:

def piotroski_f(df_cy,df_py,df_py2):
    """function to calculate f-score of each equity and output information as dataframe"""
    f_score = {}
    tickers = df_cy.columns
    for ticker in tickers:
        ROA_FS = int(df_cy.loc["NetIncome",ticker]/((df_cy.loc["TotAssets",ticker]+df_py.loc["TotAssets",ticker])/2) > 0)
        CFO_FS = int(df_cy.loc["CashFlowOps",ticker] > 0)
        ROA_D_FS = int(df_cy.loc["NetIncome",ticker]/((df_cy.loc["TotAssets",ticker]+df_py.loc["TotAssets",ticker])/2) > df_py.loc["NetIncome",ticker]/((df_py.loc["TotAssets",ticker]+df_py2.loc["TotAssets",ticker])/2))
        CFO_ROA_FS = int(df_cy.loc["CashFlowOps",ticker]/df_cy.loc["TotAssets",ticker] > df_cy.loc["NetIncome",ticker]/((df_cy.loc["TotAssets",ticker]+df_py.loc["TotAssets",ticker])/2))
        LTD_FS = int((df_cy.loc["LTDebt",ticker] + df_cy.loc["OtherLTDebt",ticker])<(df_py.loc["LTDebt",ticker] + df_py.loc["OtherLTDebt",ticker]))
        CR_FS = int((df_cy.loc["CurrAssets",ticker]/df_cy.loc["CurrLiab",ticker])>(df_py.loc["CurrAssets",ticker]/df_py.loc["CurrLiab",ticker]))
        DILUTION_FS = int(df_cy.loc["CommStock",ticker] <= df_py.loc["CommStock",ticker])
        GM_FS = int((df_cy.loc["GrossProfit",ticker]/df_cy.loc["TotRevenue",ticker])>(df_py.loc["GrossProfit",ticker]/df_py.loc["TotRevenue",ticker]))
        ATO_FS = int(df_cy.loc["TotRevenue",ticker]/((df_cy.loc["TotAssets",ticker]+df_py.loc["TotAssets",ticker])/2)>df_py.loc["TotRevenue",ticker]/((df_py.loc["TotAssets",ticker]+df_py2.loc["TotAssets",ticker])/2))
        f_score[ticker] = [ROA_FS,CFO_FS,ROA_D_FS,CFO_ROA_FS,LTD_FS,CR_FS,DILUTION_FS,GM_FS,ATO_FS]
    f_score_df = pd.DataFrame(f_score,index=["PosROA","PosCFO","ROAChange","Accruals","Leverage","Liquidity","Dilution","GM","ATO"])
    return f_score_df

This function (above) is the meat & potatoes of everything we've done so far. Take some time to really go through this for loop and make sure you understand how I'm using the 10 screening criteria to assess the financial performance/health of each tickers item. If you've been paying extra close attention, you'll notice that in stats & indx I selected 10 screening criteria, but when defining the components of an F-Score in the previous post, I only listed 9. Why is that? Because the 9 signals don't map one-to-one onto line items: some signals need two inputs (the leverage signal adds long-term debt and other long-term liabilities, for example), while total assets gets reused across several signals. All told, that comes to 10 line items. Strange, right?
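
To see where the 10 comes from, here's a small sketch that tallies which indx labels each signal in piotroski_f reads. The signal_inputs dictionary is purely illustrative bookkeeping, put together by reading the function above; it isn't part of the strategy code:

```python
# which indx labels each F-Score signal reads (hypothetical bookkeeping)
signal_inputs = {
    "PosROA":    {"NetIncome", "TotAssets"},
    "PosCFO":    {"CashFlowOps"},
    "ROAChange": {"NetIncome", "TotAssets"},
    "Accruals":  {"CashFlowOps", "NetIncome", "TotAssets"},
    "Leverage":  {"LTDebt", "OtherLTDebt"},
    "Liquidity": {"CurrAssets", "CurrLiab"},
    "Dilution":  {"CommStock"},
    "GM":        {"GrossProfit", "TotRevenue"},
    "ATO":       {"TotRevenue", "TotAssets"},
}

# union of every label used by at least one signal
all_inputs = set().union(*signal_inputs.values())
print(len(signal_inputs), len(all_inputs))  # 9 10
```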

After defining the above functions, the next step is to filter through the results to select the tickers with the highest F-Score values, and arrange them in a list in descending order. The way I did that is as follows:

# Selecting items with highest f-score
transformed_df_cy = info_filter(combined_financials_cy,stats,indx)
transformed_df_py = info_filter(combined_financials_py,stats,indx)
transformed_df_py2 = info_filter(combined_financials_py2,stats,indx)

f_score_df = piotroski_f(transformed_df_cy,transformed_df_py,transformed_df_py2)
tot_f_score = f_score_df.sum().sort_values(ascending=False)
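
If the sum-and-sort step feels abstract, here's a toy version with three made-up tickers and invented 0/1 signals. The cutoff of 7 is only an assumption for the sake of the example, not a rule from the strategy:

```python
import pandas as pd

signals = ["PosROA","PosCFO","ROAChange","Accruals","Leverage",
           "Liquidity","Dilution","GM","ATO"]

# hypothetical 0/1 signals for three invented tickers
f_score_df = pd.DataFrame(
    {"AAA": [1, 1, 0, 1, 1, 0, 1, 1, 1],
     "BBB": [1, 1, 1, 1, 1, 1, 1, 1, 1],
     "CCC": [0, 1, 0, 0, 1, 0, 1, 0, 0]},
    index=signals,
)

# summing down each column gives the total F-Score per ticker
tot_f_score = f_score_df.sum().sort_values(ascending=False)

# keep only the strongest names (cutoff is an assumption)
top_tickers = tot_f_score[tot_f_score >= 7].index.tolist()
print(top_tickers)  # ['BBB', 'AAA']
```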

And that, ladies & gents, is how to calculate Piotroski F-Scores for all of the equities in the S&P 500.

The next step in the trading strategy is to do a similar exercise and compute the Cyclically Adjusted Price-to-Earnings ratios for all tickers items as well. I'll dive into that in the next post. Until then, adiós!

DISCLAIMER: To be brutally honest, I do not know if this strategy can consistently generate abnormal risk-adjusted rates of return. I am simply sharing this idea so others can contest or explore its potential usefulness. The information in this post does not constitute investment advice. I will not accept liability for any loss or damage, including without limitation any loss of profit, which may arise directly or indirectly from use of or reliance on such information.