Musings of a dad with too much time on his hands and not enough to do. Wait. Reverse that.

Month: June 2020

Family bingo

During the quarantine, one family activity we’ve begun is weekly virtual meetings with family members we’ve been prevented from seeing face-to-face. To add some structure and fun to the meetings, we play simple games like Bingo. It occurred to me that it might be even more fun and interesting to personalize our Bingo games.

For example, take my favorite TV family, The Bundys:

The Bundys

Now, suppose the Bundys were to reunite virtually for a family get-together and decided to play a personalized game of Bingo in the manner I’m proposing. They might first create a list of their names: Al, Peggy, Kelly, and Bud. They might add other names to the list, like Steve, Marcy, and Jefferson. They could also add memorable events like “Polk High” and “Four Touchdowns”, family vacations including “Dumpwater, Florida” and “Lower Uncton, England”, and possessions such as “the Dodge” and “Buck the dog”.

Building on a previous post of mine, they could generate personalized bingo cards like so:

import matplotlib.pyplot as plt
import matplotlib.style as style
import numpy as np
import random

%matplotlib inline
style.use('seaborn-poster')  # on matplotlib 3.6+, this style is named 'seaborn-v0_8-poster'


bundy_data = ['Al', 'Peg', 'Kelly', 'Bud', 'Buck', 'Steve', 'Marcy', 'Jefferson', 'Griff', 'Gary\'s\nShoes', 'Polk High', 
              'Four\nTouchdowns', 'Shoe\nSalesman', 'Lucky', 'Dumpwater,\nFL', 'No Ma\'am', 'Wanker\nCounty', 'Dodge', 
              'Bob\nRooney', 'Officer\nDan', 'Psycho\nDad', 'Ike', 'Seven', 'Anthrax', 'Jim\nJupiter', 'Sticky\nthe Clown',
              'Love &\nMarriage', 'Grandmaster\nB', 'chicken', 'Lower\nUncton', '9674\nJeopardy Ln', 'Ferguson\ntoilets', 
              'Chicago']

rowlen = 5  # bingo cards are usually 5x5

fig = plt.figure(figsize=(8, 8))
ax = fig.gca()
ax.set_xticks(np.arange(0, rowlen + 1))
ax.set_yticks(np.arange(0, rowlen + 1))
plt.grid()
_ = ax.set_xticklabels([])
_ = ax.set_yticklabels([])

for i, ltr in enumerate('BUNDY'):
    x = (i % rowlen) + 0.4
    y = rowlen  # header letters sit just above the top row of the grid
    ax.annotate(ltr, xy=(x, y), xytext=(x, y), size=20, weight='bold')
    
random.shuffle(bundy_data)
for i, phrase in enumerate(bundy_data[:rowlen**2]):
    x = (i % rowlen) + 0.29
    y = (i // rowlen) + 0.5
    ax.annotate(phrase, xy=(x, y), xytext=(x, y))
A personalized Bundy family bingo card

The host calling out the bingo squares could run Python code like the following to generate a random list of squares to call:

nbr_of_picks = 20  # generate, say, 20 squares to call

for _ in range(nbr_of_picks):
    print('{0} - {1}'.format(random.choice('BUNDY'), random.choice(bundy_data).replace('\n', ' ')))

This would generate a list like so:

Y - Marcy
Y - Steve
B - Dodge
N - Ike
U - Grandmaster B
U - Lucky
U - Gary's Shoes
N - Griff
U - Steve
U - Marcy
Y - Bud
B - Psycho Dad
B - Polk High
N - Officer Dan
B - Dodge
B - Wanker County
Y - Anthrax
U - chicken
Y - Shoe Salesman
B - Ferguson toilets
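
Because each pick in the loop above is independent, the same square can come up more than once (in the list above, "B - Dodge" is called twice). If you would rather call each phrase at most once, here is a minimal sketch using random.sample. The draw_calls helper, its seed parameter, and the shortened phrases list are my own illustrations and not part of the code above:

```python
import random

# Hypothetical shortened phrase list, standing in for bundy_data
phrases = ['Al', 'Peg', 'Kelly', 'Bud', 'Buck', 'Polk High',
           'Four\nTouchdowns', 'Dodge']


def draw_calls(phrases, nbr_of_picks, letters='BUNDY', seed=None):
    """Return up to nbr_of_picks (letter, phrase) calls with no phrase repeated."""
    rng = random.Random(seed)
    # random.sample picks without replacement, so no phrase is called twice
    picks = rng.sample(phrases, min(nbr_of_picks, len(phrases)))
    # Strip the line breaks that were only there to fit text into a square
    return [(rng.choice(letters), p.replace('\n', ' ')) for p in picks]


calls = draw_calls(phrases, 5)
for letter, phrase in calls:
    print('{0} - {1}'.format(letter, phrase))
```

Passing a seed makes the list reproducible, which could be handy if the host wants to regenerate the same calls later.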

If your family name is not five characters long, you could use “BINGO” instead or make your cards larger or smaller to match. And, of course, come up with your own family names, events, and so on for the card data.
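
To size the card to the name, one option is to derive the grid from the header word itself. The sketch below leaves out the plotting and only builds the grid of phrases. The build_card helper, the "LEE" family, and its phrase list are all hypothetical:

```python
import random


def build_card(header, phrases, seed=None):
    """Return an n x n grid of phrases, where n is the length of the header."""
    n = len(header)
    if len(phrases) < n * n:
        raise ValueError('need at least {0} phrases for a {1}x{1} card'
                         .format(n * n, n))
    rng = random.Random(seed)
    # Sample without replacement so no phrase appears twice on one card
    picks = rng.sample(phrases, n * n)
    # Slice the flat list of picks into n rows of n squares each
    return [picks[r * n:(r + 1) * n] for r in range(n)]


# Hypothetical three-letter family name with made-up card data
lee_data = ['Grandma', 'Grandpa', 'Lake house', 'Minivan', 'Rex the dog',
            'Road trip', 'Pancakes', 'Camping', 'Game night', 'Soccer']

card = build_card('LEE', lee_data)
for row in card:
    print(row)
```

In the plotting code above, this would amount to setting rowlen to the length of the header word and looping over that word instead of 'BUNDY'.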

Cleaning up Stacked Bar Charts, Part 1

Stacked bar charts are a handy way of conveying a lot of information in a single visual, and pandas makes it pretty easy to generate these charts by passing stacked=True to the plot function.

As great as this operation is, though, you still need to do some cleanup work on your chart afterwards. In this post, I’ll talk about how I clean up the legend generated in a stacked bar chart. For my example, I’ll take a small sample of email data I gathered recently from one of my email accounts.

Bring in the data and do some standard cleanup

import pandas as pd
import matplotlib.pyplot as plt

%matplotlib inline


df_email = pd.read_csv('./data/email_data.csv', names=['email_ts', 'subject', 'category'])

df_email['email_ts'] = pd.to_datetime(df_email.email_ts)
df_email['email_dt'] = df_email.email_ts.dt.date
df_email['dow'] = df_email.email_ts.dt.dayofweek
# just look at 30 or so days of data
df_email = df_email[(df_email.email_ts>'2020-04-14 00:00:00') & (df_email.email_ts<'2020-05-15 00:00:00')]

As a standard practice, whenever I have timestamp data, I add a “date” column and a “day of week” column. If my data spans multiple months, I’ll even add a “month” column. These columns make it much easier to group the data by day, day of week, and month later on.
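
As a small illustration of why those helper columns pay off, here is a sketch on a few made-up rows (the timestamps and categories are hypothetical, not taken from my email data). Once the columns exist, grouping by day of week or month is a one-liner:

```python
import pandas as pd

# Hypothetical sample rows standing in for the real email data
df = pd.DataFrame({
    'email_ts': pd.to_datetime(['2020-04-15 08:30:00', '2020-04-15 17:05:00',
                                '2020-04-22 09:00:00', '2020-05-01 12:00:00']),
    'category': ['work', 'promo', 'work', 'personal'],
})

df['email_dt'] = df.email_ts.dt.date    # calendar date, without the time
df['dow'] = df.email_ts.dt.dayofweek    # Monday=0 ... Sunday=6
df['month'] = df.email_ts.dt.month      # 1=January ... 12=December

# Count emails per day of week and per month
by_dow = df.groupby('dow').size()
by_month = df.groupby('month').size()
print(by_dow)
print(by_month)
```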

Chart the data

Here’s a quick glimpse of the data in my dataset:

fig, ax = plt.subplots(figsize=(12,8))
title = 'Email counts: {0:%d %b %Y} - {1:%d %b %Y}'.format(df_email.email_dt.min(), df_email.email_dt.max())

df_email[['email_dt','dow']].groupby('email_dt').count().plot(ax=ax)
_ = ax.set_title(title)
_ = ax.set_xlabel('Date')
_ = ax.set_ylabel('Email Count')
_ = fig.autofmt_xdate()

Now, create a stacked bar chart

Here’s the type of code I normally write to generate a stacked bar chart:

fig, ax = plt.subplots(figsize=(12,8))
title = 'Email counts: {0:%d %b %Y} - {1:%d %b %Y}'.format(df_email.email_dt.min(), df_email.email_dt.max())

df_email[['email_dt','category','dow']].groupby(['email_dt','category']).count().unstack().\
    plot(stacked=True, kind='bar', ax=ax)

_ = ax.set_title(title)
_ = ax.set_xlabel('Date')
_ = ax.set_ylabel('Email Count')
_ = ax.set_ylim([0,46])  # just to give some space to the legend
_ = fig.autofmt_xdate()
Decent chart, but what’s the deal with that legend?

So, this chart is pretty decent, but that legend needs work. The good news is that three lines of code will clean it up nicely. Here’s my better version:

fig, ax = plt.subplots(figsize=(12,8))
title = 'Email counts: {0:%d %b %Y} - {1:%d %b %Y}'.format(df_email.email_dt.min(), df_email.email_dt.max())

df_email[['email_dt','category','dow']].groupby(['email_dt','category']).count().unstack().\
    plot(stacked=True, kind='bar', ax=ax)

_ = ax.set_title(title)
_ = ax.set_xlabel('Date')
_ = ax.set_ylabel('Email Count')
_ = fig.autofmt_xdate()
_ = ax.set_ylim([0,46])  # just to give some space to the legend

original_legend = [t.get_text() for t in ax.legend().get_texts()]
new_legend = [t.replace('(dow, ', '').replace(')', '') for t in original_legend]
_ = ax.legend(new_legend, title='Category')
Nicer looking legend
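
One caveat with chained replace calls: if a category name itself contained a closing parenthesis, that character would be stripped too. A slightly more defensive sketch uses a regular expression that only removes the outer "(dow, ...)" wrapper. The clean_label helper and the sample labels are hypothetical:

```python
import re


def clean_label(label, prefix='dow'):
    """Turn '(dow, work)' into 'work'; leave any other label unchanged."""
    # Match the whole label: "(", the prefix, ", ", the category, then ")"
    match = re.fullmatch(r'\(' + re.escape(prefix) + r', (.*)\)', label)
    return match.group(1) if match else label


# Hypothetical legend labels, including one with its own parentheses
labels = ['(dow, work)', '(dow, promo (ads))', 'personal']
print([clean_label(t) for t in labels])
```

In the chart code above, this would stand in for the list comprehension that builds new_legend.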

So, with stacked bar charts, this is one approach I take to make the end product look a little nicer. In upcoming posts, I’ll show even more techniques to clean up your charts.
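
For what it's worth, the "(dow, ...)" labels appear because count() counts the remaining dow column, so unstack() produces two-level column names. A different route, sketched here on hypothetical sample rows rather than my email data, is to count rows with size() so the column names come out as plain category names to begin with:

```python
import pandas as pd

# Hypothetical sample rows standing in for the email data used in this post
df = pd.DataFrame({
    'email_dt': ['2020-04-15', '2020-04-15', '2020-04-15', '2020-04-16'],
    'category': ['work', 'promo', 'work', 'personal'],
})

# size() returns a Series of row counts, so unstack() yields single-level
# column names (one per category) instead of (dow, category) pairs
counts = df.groupby(['email_dt', 'category']).size().unstack(fill_value=0)
print(counts)
```

Calling counts.plot(stacked=True, kind='bar', ax=ax) on a table like this should label the legend with the bare category names, which would make the string replacement step unnecessary.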

© 2024 DadOverflow.com
