Algorithmic Short Selling with Python

You're reading from Algorithmic Short Selling with Python: Refine your algorithmic trading edge, consistently generate investment ideas, and build a robust long/short product

Product type Paperback
Published in Sep 2021
Publisher Packt
ISBN-13 9781801815192
Length 376 pages
Edition 1st Edition
Author (1):
Laurent Bernut

Appendix: Stock Screening

Data download and processing

We'll start by downloading the ticker list from Wikipedia. This uses the pd.read_html function we saw in Chapter 4, Long/Short Methodologies: Absolute and Relative, which parses every HTML table on a page into a list of dataframes:

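import pandas as pd

# `website` (the URL of the Wikipedia constituents page) is assumed to be defined beforehand
web_df = pd.read_html(website)[0]  # first table on the page
tickers_list = list(web_df['Symbol'])
tickers_list = tickers_list[:]  # put slice bounds here to shorten the run
print('tickers_list', len(tickers_list))
web_df.head()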

To work on a subset of the universe, put slice bounds in the brackets of tickers_list[:]. For example, tickers_list[:20] keeps only the first 20 tickers.

This is where the action happens. The engine room contains three nested loops:

  1. Batch download: this is the high-level loop. OHLCV data is downloaded into a multi-index dataframe in a succession of batches. The number of iterations is a function of the length of the tickers list and the batch size: 505 constituents with a batch size of 20 take 26 batches (25 full batches and a last batch of 5 tickers).
  2. Drop level loop: this breaks the multi-index dataframe into single-ticker OHLCV dataframes. The number of iterations equals the batch size. Regimes are processed at this level.
  3. Absolute/relative process: there are two passes. The first pass processes the absolute series. The variables are then reset to the relative series, which is processed in the second pass. There is an option to save the ticker information as a CSV file. The last row dictionary is created at the end of the second pass.
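
The batch arithmetic of the first loop can be checked with a short, self-contained sketch. The ticker names are placeholders and make_batches is a hypothetical helper, not a function from the book. It uses ceiling division, so the batch count is right whether or not the list length is an exact multiple of the batch size.

```python
def make_batches(tickers, batch_size):
    """Split a list of tickers into consecutive batches (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Ceiling division: a partial final batch still counts as one batch
    n_batches = (len(tickers) + batch_size - 1) // batch_size
    return [tickers[i * batch_size:(i + 1) * batch_size] for i in range(n_batches)]


# 505 placeholder tickers standing in for the index constituents
placeholder_tickers = [f"TICK{i}" for i in range(505)]
batches = make_batches(placeholder_tickers, batch_size=20)

print(len(batches))      # number of batches
print(len(batches[-1]))  # size of the final, partial batch
```

With 505 tickers and a batch size of 20, this gives 26 batches, the last one holding 5 tickers.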

Next, let's go through the process step-by-step:

  1. Benchmark closing price download and currency adjustment. This needs to be done only once, so it is placed at the beginning of the sequence.
  2. Dataframes and lists instantiation.
  3. Loop size: number of iterations necessary to loop over the tickers_list.
  4. Outer loop: batch download:
    1. m,n: index along the batch_list.
    2. batch_download: download using yfinance.
    3. Print the batch tickers, controlled by a Boolean, if you want to see the ticker names.
    4. Download batch.
    5. try/except: append failed tickers to the failed list.
  5. Second loop: Single stock drop level loop:
    1. Drop level to ticker level.
    2. Calculate swings and regime: abs/rel.
  6. Third loop: absolute/relative series:
    1. Process regimes in absolute series.
    2. Reset variables to relative series and process regimes a second time.
  7. Boolean to provide a save_ticker_df option.
  8. Create a dictionary with last row values.
  9. Append list of dictionary rows.
  10. Create a dataframe last_row_df from dictionary.
  11. score column: lateral sum of regime methods in absolute and relative.
  12. Join last_row_df with web_df.
  13. Boolean save_regime_df.
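
To make the drop level step concrete, here is a minimal sketch on synthetic data. It assumes the batch dataframe has two column levels, with the price field on level 0 and the ticker on level 1, which is the layout expected from a group_by = 'column' download. The tickers, the values, and the single_ticker_frame helper are all made up for illustration; this is not the book's yf_droplevel function.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a batch download: columns are (field, ticker) pairs.
# Assumption: this mirrors a group_by='column' download. Values are made up.
fields = ["Open", "High", "Low", "Close", "Volume"]
tickers = ["AAA", "BBB"]  # placeholder tickers
columns = pd.MultiIndex.from_product([fields, tickers])
index = pd.date_range("2021-01-04", periods=3, freq="D")
batch = pd.DataFrame(
    np.arange(len(index) * len(columns), dtype=float).reshape(len(index), -1),
    index=index,
    columns=columns,
)


def single_ticker_frame(batch_df, ticker):
    """Return a flat OHLCV frame for one ticker (hypothetical helper)."""
    # xs selects the ticker on column level 1 and drops that level
    return batch_df.xs(ticker, axis=1, level=1).dropna()


for ticker in tickers:
    df = single_ticker_frame(batch, ticker)
    print(ticker, list(df.columns), df.shape)
```

Each resulting dataframe has plain Open, High, Low, Close, and Volume columns, which is the shape the per-ticker regime functions work on.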

Let's publish the code and give further explanations afterwards:

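# Appendix: The Engine Room
# Helper functions (yf_droplevel, swings, regime, ...) and parameters
# (start, end, batch_size, ...) are assumed to be defined beforehand
import pandas as pd
import yfinance as yf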
 
# Benchmark: downloaded once, before the loops
bm_df = pd.DataFrame()
bm_df[bm_col] = round(yf.download(tickers= bm_ticker,start= start, end = end,interval = "1d",
                 group_by = 'column',auto_adjust = True, prepost = True, 
                 threads = True, proxy = None)['Close'],dgt)
bm_df[ccy_col] = 1  # currency factor of 1, i.e. no conversion
print('benchmark',bm_df.tail(1))
 
regime_df = pd.DataFrame()
last_row_df = pd.DataFrame()
last_row_list = []
failed = []
 
# Ceiling division gives the number of batches; +1 because range() below starts at 1
loop_size = (len(tickers_list) + batch_size - 1) // batch_size + 1
for t in range(1,loop_size): 
    m = (t - 1) * batch_size
    n = t * batch_size
    batch_list = tickers_list[m:n]
    if show_batch:
        print(batch_list,m,n)
        
    try:
        batch_download = round(yf.download(tickers= batch_list,start= start, end = end,
                            interval = "1d",group_by = 'column',auto_adjust = True,
                                  prepost = True, threads = True, proxy = None),dgt)
    except Exception:
        failed.extend(batch_list)  # the whole batch failed to download
        continue

    for ticker in batch_list:
        try:
            df = yf_droplevel(batch_download,ticker)
            # Swings and regimes, first on the absolute then on the relative series
            df = swings(df,rel = False)
            df = regime(df,lvl = 3,rel = False)
            df = swings(df,rel = True)
            df = regime(df,lvl = 3,rel= True)
            _o,_h,_l,_c = lower_upper_OHLC(df,relative = False)

            # Pass 1 uses the absolute OHLC columns; pass 2 uses the relative ones
            for a in range(2):
                df['sma'+str(_c)[:1]+str(st)+str(lt)] = regime_sma(df,_c,st,lt)
                df['bo'+str(_h)[:1]+str(_l)[:1]+ str(slow)] = regime_breakout(df,_h,_l,window)
                df['tt'+str(_h)[:1]+str(fast)+str(_l)[:1]+ str(slow)] = turtle_trader(df, _h, _l, slow, fast)
                _o,_h,_l,_c = lower_upper_OHLC(df,relative = True)
            last_row_list.append(last_row_dictionary(df))
        except Exception:
            failed.append(ticker)  # this ticker could not be processed
last_row_df = pd.DataFrame.from_dict(last_row_list)
 
if save_last_row_df:
    last_row_df.to_csv('last_row_df_'+ str(last_row_df['date'].max())+'.csv', date_format='%Y%m%d')
print('failed',failed)
 
last_row_df['score']= last_row_df[regime_cols].sum(axis=1)
regime_df = web_df[web_df_cols].set_index('Symbol').join(
    last_row_df[last_row_df_cols].set_index('Symbol'), how='inner').sort_values(by='score')
 
if save_regime_df:
    regime_df.to_csv('regime_df_'+ str(last_row_df['date'].max())+'.csv', date_format='%Y%m%d')

last_row_list.append(last_row_dictionary(df)) runs once per ticker, after the third loop has processed both the absolute and the relative series. The list therefore grows by one dictionary per ticker, across all batches. Once the three loops are finished, we create the last_row_df dataframe from this list of dictionaries using pd.DataFrame.from_dict(last_row_list). Building a list of dictionaries and rolling it up into a dataframe once is faster than appending rows to a dataframe one at a time, because each append copies the dataframe. There is an option to save a datestamped version. The score column is a lateral sum of all the regime methodologies. The regime dataframe is created by joining the Wikipedia web dataframe and the last row dataframe, with the Symbol column set as the index on both sides, and is then sorted by score in ascending order. Again, there is an option to save a datestamped version.
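
The roll-up, score, and join steps can be tried on toy data. The sketch below is only an illustration: the tickers, dates, sectors, regime values, and the three shortened regime column names are invented, and the real column lists (regime_cols, web_df_cols, last_row_df_cols) are set elsewhere in the book.

```python
import pandas as pd

# Made-up last-row dictionaries, one per ticker; regime column names are placeholders
last_row_list = [
    {"Symbol": "AAA", "date": "2021-09-30", "sma": 1, "bo": 1, "tt": 1},
    {"Symbol": "BBB", "date": "2021-09-30", "sma": -1, "bo": -1, "tt": 0},
    {"Symbol": "CCC", "date": "2021-09-30", "sma": 1, "bo": -1, "tt": 0},
]
regime_cols = ["sma", "bo", "tt"]

# One dataframe construction at the end instead of growing a frame row by row
last_row_df = pd.DataFrame(last_row_list)

# Lateral (row-wise) sum of the regime columns
last_row_df["score"] = last_row_df[regime_cols].sum(axis=1)

# Stand-in for the Wikipedia table: descriptive columns keyed by Symbol
web_df = pd.DataFrame(
    {"Symbol": ["AAA", "BBB", "CCC"], "Sector": ["Tech", "Energy", "Health"]}
)

# Inner join on the Symbol index, then sort by score in ascending order
regime_df = (
    web_df.set_index("Symbol")
    .join(last_row_df.set_index("Symbol"), how="inner")
    .sort_values(by="score")
)
print(regime_df[["Sector", "score"]])
```

Because the sort is ascending, the lowest-scoring names come first. The inner join keeps only the symbols present in both dataframes, so tickers that ended up in the failed list drop out of the result.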

Next, let's visualize what the market is doing with a few heatmaps.
