DadOverflow.com

Musings of a dad with too much time on his hands and not enough to do. Wait. Reverse that.


Finding sub-ranges in my dataset

File this under: there-has-to-be-a-simpler-way-to-do-this-in-pandas-but-I-haven’t-found-what-that-is

Recently, I’ve been playing with some financial data to get a better understanding of the yield curve. Related to yield and inverted yield curves are the periods of recession in the US economy. In my work, I wanted to first build a chart that indicated the periods of recession and ultimately overlay that with yield curve data. Little did I realize the challenge of just coding that first part.

I downloaded a dataset of recession data, which contains a record for every calendar quarter from the 1960s to present day and a 0 or 1 to indicate whether the economy was in recession for that quarter, with “1” indicating that it was. What I needed to do was pull all the records with a “1” indicator and find the start and end times for each of those ranges so that I could paint them onto a chart.

I’ve heard it said before that any time you have to write a loop over your pandas dataframe, you’re probably doing it wrong. I’m certainly doing a loop here and I have a nagging suspicion there’s probably a more elegant way to achieve the solution. Nevertheless, here’s what I came up with to solve my recession chart problem:

Step 1: Bring in the necessary packages

from datetime import date  # used for the chart's x-axis limits in Step 5

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# for easy chart display in jupyter notebook
%matplotlib inline

Step 2: Load in my downloaded recession dataset and take a peek

# recession dates: https://fred.stlouisfed.org/series/JHDUSRGDPBR
df_recessions = pd.read_csv('./data/JHDUSRGDPBR_20220327.csv')

df_recessions['DATE'] = pd.to_datetime(df_recessions.DATE)
df_recessions.head()
The first records of the Recession dataset
df_recessions[df_recessions.JHDUSRGDPBR==1.0].head()
The first records in the dataset where the economy was in recession

Step 3: Mark the start of every period of recession in the dataset

So, now I’m asking myself, “how do I extract the start and stop dates for every period of recession identified in the dataset?” Let’s start by finding just the start dates of recessions, which shouldn’t be too difficult. If I filter down to just the recession quarters and calculate the date difference from one row to the next, any difference greater than three months (I estimated 93 days as three months) means there was a gap in quarters before the current record, so the current record is the start of a new recession. Here’s what I came up with [one further note: my yield curve data only starts in 1990, so I filtered the recession data for 1990 to present]:

df_spans = df_recessions[(df_recessions.DATE.dt.year>=1990) & (df_recessions.JHDUSRGDPBR==1.0)].copy()
df_spans['days_elapsed'] = df_spans.DATE - df_spans.shift(1).DATE
df_spans['ind'] = df_spans.days_elapsed.dt.days.apply(lambda d: 's' if d > 93 else '')
df_spans.iloc[0, df_spans.columns.get_loc('ind')] = 's'  # mark first row as a recession start
df_spans
“s” indicates the start of a new recession
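To make the gap logic easier to see, here is a minimal sketch of the same idea on a handful of made-up quarter-start dates. The `toy` frame and its dates are invented for illustration and are not from the FRED file.

```python
import pandas as pd

# Made-up quarter-start dates flagged as recession quarters (not the FRED data)
toy = pd.DataFrame({'DATE': pd.to_datetime([
    '2001-04-01', '2001-07-01', '2001-10-01',   # one three-quarter run
    '2008-01-01', '2008-04-01',                 # a second run after a long gap
])})

# Time since the previous flagged quarter; the first row has no predecessor (NaT)
toy['days_elapsed'] = toy.DATE.diff()

# Consecutive quarter starts are at most 92 days apart, so more than 93 days means a gap
toy['ind'] = toy.days_elapsed.dt.days.apply(lambda d: 's' if d > 93 else '')

# The first row always starts a run, since nothing precedes it
toy.iloc[0, toy.columns.get_loc('ind')] = 's'

print(toy)
```

Only the 2001-04-01 and 2008-01-01 rows end up marked with “s”, since every other row follows its predecessor by a single quarter.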

Step 4: Find the end date of each recession

Here’s where my approach starts to go off the rails a little. The only way I could think to find the end dates of each recession is to:

  1. Loop through a list of the start dates
  2. In each loop, get the next start date and then grab the date of the record immediately before that one
  3. When I hit the last loop, just consider the last record to be the end date of the most recent recession
  4. With every stop date, add three months since the stop date is only the first day of the quarter and, presumably, the recession more or less lasts the entire quarter

Confusing? Here’s my code:

start_stop_dates = []
start_dates = df_spans.loc[df_spans.ind == 's', 'DATE'].tolist()

for i, start_date in enumerate(start_dates):
    if i < len(start_dates)-1:
        stop_date = df_spans.loc[df_spans.DATE < start_dates[i+1]].iloc[-1].DATE
    else:
        stop_date = df_spans.iloc[-1].DATE
        
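    # add 3 calendar months to each stop date to stretch the value to the full quarter
    # (pd.DateOffset is calendar-aware; np.timedelta64(3, 'M') is only an average-length
    # approximation of three months, and newer pandas versions may reject it)
    start_stop_dates.append((start_date, stop_date + pd.DateOffset(months=3)))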
    
start_stop_dates
Recessions from 1990 to the beginning of 2022
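As a quick sanity check on the "stretch to the full quarter" idea from item 4 above, here is a small sketch using `pd.DateOffset`, which steps by calendar months. The dates are made up, and `toy_spans` and `stretched` are names I'm inventing for illustration.

```python
import pandas as pd

# Made-up (start, stop) pairs where each stop is the first day of the last recession quarter
toy_spans = [
    (pd.Timestamp('2001-04-01'), pd.Timestamp('2001-10-01')),
    (pd.Timestamp('2020-01-01'), pd.Timestamp('2020-04-01')),
]

# Adding three calendar months moves each stop to the first day of the following quarter
stretched = [(start, stop + pd.DateOffset(months=3)) for start, stop in toy_spans]

for start, stop in stretched:
    print(start.date(), '->', stop.date())
```

Each stop date lands exactly on the next quarter boundary, so a shaded band would cover the whole final quarter.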

Step 5: Build my chart

With that start/stop list, I can build my underlying recession chart:

fig, ax = plt.subplots(figsize=(12,6))

_ = ax.plot()
_ = ax.set_xlim([date(1990, 1, 1), date(2022, 4, 1)])
_ = ax.set_ylim([0, 10])

for st, sp in start_stop_dates:
    _ = ax.axvspan(st, sp, alpha=0.2, color='gray')
US Recessions: 1990 – 2021

Phew. All that work and I’m only at the starting point of my yield curve exploration, but that will have to wait for a future post. However, if you can think of a more elegant way to identify these date ranges without having to resort to looping, I’d love to hear it!
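On the question of a loop-free approach, one candidate is the common pandas idiom of labeling consecutive runs with `shift` and `cumsum`, then grouping on that label. Below is a minimal sketch on made-up quarterly data, not the FRED file. The `toy` frame and the `run_id` and `spans` names are placeholders of mine, and I'm assuming the indicator column holds clean 0/1 values with no missing quarters.

```python
import pandas as pd

# Made-up quarterly indicator series (1 = recession); not the FRED file
toy = pd.DataFrame({
    'DATE': pd.date_range('2007-01-01', periods=12, freq='QS'),
    'JHDUSRGDPBR': [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1],
})

in_recession = toy.JHDUSRGDPBR.eq(1)

# A new run begins wherever the flag differs from the previous row;
# the running count of those changes gives every run its own id
run_id = in_recession.ne(in_recession.shift()).cumsum()

# Keep only recession rows, then take the first and last quarter of each run
spans = (toy[in_recession]
         .groupby(run_id[in_recession])['DATE']
         .agg(start='min', stop='max'))

# Stretch each stop date by three calendar months to cover its whole quarter
spans['stop'] = spans['stop'] + pd.DateOffset(months=3)

# Same shape as the looped version: a list of (start, stop) tuples
start_stop_dates = list(spans.itertuples(index=False, name=None))
print(start_stop_dates)
```

The resulting list of tuples could be fed to the same `ax.axvspan` loop used in Step 5. Whether this counts as more elegant than the loop is a judgment call, but it does avoid iterating over start dates by hand.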

Thunder-striking Thunderstruck

I’ve been attending lots of high school sporting events of late: football games, basketball, wrestling, etc. Much of the time, the public address system will play random music prior to the start of the competition, but you’ll often know the game’s about to kick off when you hear that familiar riff from the band AC/DC:

You’ve been…thunderstruck!

I’m certainly happy to see the nods to songs of my youth, and I’m always trying to get my children to appreciate the songs I grew up with. However, Thunderstruck can’t be the only song of its type to fire up the fans for the upcoming contest. What are the attributes of the song that make it the go-to tune for these sorts of events?

  • Upbeat, driving rhythm. Thunderstruck starts off at a fast tempo with driving guitar and drums and builds up into the opening verse and chorus, helping to energize competitor and fan alike.
  • Long introduction. The overall instrumental introduction is long, clocking in at 1 minute, 4 seconds until the first verse kicks in. This long introduction allows sports announcers time to welcome the crowd, introduce the athletes, and provide sundry details.
  • Somewhat innocuous subject matter. Given that these are school events, whatever entertainment material is used should be “safe for school”. While some of the lyrics of Thunderstruck are certainly questionable, they are generally obscure and viewed by most in attendance as benign. To some degree, the song does extol the importance of personal fortitude, so there are probably bonus points to be had for lyrical subject matter that lauds competition and perseverance.

With these attributes in mind, what other song might be a decent substitute for this AC/DC classic? Perusing my own catalog, I’ve come up with several potential alternatives:

Guns n’ Roses — Welcome to the Jungle (45 second intro)

Let’s go ahead and get this one out of the way: Welcome to the Jungle is the common alternative you’ll hear at sporting events in place of AC/DC.

AC/DC — Hard as a Rock (43 second intro), Burnin’ Alive (49 second intro)

AC/DC has repeated their Thunderstruck formula multiple times throughout their career. Here are two examples from their 1995 album, Ballbreaker.

Aerosmith — Back in the Saddle (25 second intro)

This Aerosmith classic doesn’t have a very long lead-in time until the chorus kicks in, but the musical build-up is sure to energize the crowd.

Armored Saint — Left Hook From Right Field (1:14 intro)

Armored Saint provides some great options for these occasions, although several of their songs can be a little explicit, so do your homework.

Cinderella — If You Don’t Like It (46 second intro)

I count Cinderella as one of my favorite 80s bands.

Helmet — Unsung (40 second intro)

While the intro’s a little shorter than ideal, the guitar and drums are driving and sure to set a nice, competitive atmosphere.

Led Zeppelin — Immigrant Song (18 second intro)

As with Guns n’ Roses, I’d be remiss if I didn’t mention this Led Zeppelin classic. My guess is AC/DC has found new popularity due to the band’s inclusion in the various Marvel movies. The same could be said of Led Zeppelin. Immigrant Song would likely strike a chord in the hearts of both the young and not-so-young in attendance.

Powerman 5000 — Top of the World (44 second intro)

Like their name, Powerman 5000 delivers high octane tunes sure to invigorate sports fans and participants alike.

Queen — One Vision (38 second intro)

A great classic from the Iron Eagle soundtrack that needs mentioning.

Saxon — Solid Ball of Rock (1:16 intro)

Saxon is the author of several hard, driving ditties sure to rile up the masses.

Tesla — Modern Day Cowboy (37 second intro), Ez Come Ez Go (56 second intro)

Another of my favorite 80s bands, Tesla has produced some heavy, inspirational tunes worthy of sports competitions. An interesting note is the band’s mention of “the U.S.S.R.” in Modern Day Cowboy.

Honorable Mentions

So, these are my thoughts on potential Thunderstruck replacements from my own song collection. Thoughts? Other suggestions?


© 2024 DadOverflow.com
