I’m a big fan of visualizing data. I often present several types of visualization side by side to offer more than one perspective on the data. For example, I might pair a bar chart with a scatter plot to give deeper insight than a single visual would:

import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd

%matplotlib inline


# data from: https://www.kaggle.com/mysarahmadbhat/imdb-top-1000-movies
df = pd.read_csv('./data/regex_imdb.csv').fillna(0)

fig, ax = plt.subplots(1, 2, figsize=(10,6))

d1 = df[df.Year==2019][['Name', 'Gross']].sort_values('Gross').tail(10)
_ = ax[0].barh(d1.Name, d1.Gross)
_ = ax[0].set_xlabel('Gross Earnings')

d2 = df[df.Year==2019][['Run_time', 'Gross', 'Genre']].copy()
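# keep only the first listed genre; astype(str) guards against the 0 that fillna(0) leaves in empty cells
d2['Genre'] = d2.Genre.astype(str).str.split(',').str[0]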

_ = sns.scatterplot(data=d2, x='Run_time', y='Gross', hue='Genre', ax=ax[1])
_ = ax[1].set_xlabel('Runtime (minutes)')

_ = fig.suptitle('Analysis of Movies from 2019')

In this sort of work, I target specific axes to display specific charts. In the example above, I explicitly pushed a bar chart to ax[0] and a scatter plot to ax[1].
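
As a small aside, the shape of the object that plt.subplots returns is what makes this targeting work. The sketch below is not part of the original analysis: it uses empty figures and the non-interactive Agg backend (an assumption, so that it runs outside a notebook) to show that a single row of subplots comes back as a 1-D array, while a grid comes back as a 2-D array indexed by row and column.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs outside Jupyter
import matplotlib.pyplot as plt

# One row, two columns: the axes come back as a 1-D array, so ax_row[0] and ax_row[1] work.
fig_row, ax_row = plt.subplots(1, 2)
print(ax_row.shape)  # (2,)

# Five rows, two columns: the axes come back as a 2-D array, indexed by (row, column).
fig_grid, ax_grid = plt.subplots(5, 2)
print(ax_grid.shape)  # (5, 2)

# Flattening the grid walks every subplot in row-major order.
print(len(ax_grid.ravel()))  # 10

plt.close("all")  # release the figures created for this check
```

With a grid, then, every chart needs both a row index and a column index.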

However, circumstances occasionally demand that I draw the same type of chart in multiple subplots, changing one variable for each. For example, suppose I want a quick view of the top 10 movies by gross earnings for each year from 2010 to 2019, one bar chart per year.

I could write code to target each of these axes explicitly, but that would mean a lot of code and a lot of copy/paste. Instead, I’d rather just write a loop to iterate through the years and write the appropriate bar chart to the appropriate axis.

Looping and rendering the charts come relatively easily to me. What usually trips me up is targeting the right row and column, and I often spend most of my time trying to remember how I solved this problem in the past.

Well, no more! Hopefully this post will serve as a reference any time I need to do this type of work in the future. Ultimately, my solution is just three lines of code:

nbr_of_rows = 5
nbr_of_cols = 2

coords = [(r, c) for r in range(nbr_of_rows) for c in range(nbr_of_cols)]

Here, I set the number of rows and columns I want in my visual and use a list comprehension to pair every row index with every column index. Now I have a nice, pre-built list of coordinates, in row-major order, to leverage in my loop:

fig, ax = plt.subplots(nbr_of_rows, nbr_of_cols, figsize=(12,12))

for i, yr in enumerate(range(2010, 2020)):
    r, c = coords[i]  # grab the pre-built (row, column) pair for this year
    # the ten highest-grossing titles for the year, largest last so it plots at the top
    d = df[df.Year==yr][['Name', 'Gross']].sort_values('Gross').tail(10)
    _ = ax[r, c].barh(d.Name, d.Gross)
    _ = ax[r, c].set_title(f'Top 10 grossing movies in {yr}')

fig.tight_layout()
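
For reference, the coordinate list used in the loop above can be checked without drawing anything. The sketch below rebuilds it and compares it with two alternatives that are not in the original post, itertools.product and divmod. Both are offered only as equivalent ways to get the same row-major pairs, not as a replacement for the approach above.

```python
from itertools import product

nbr_of_rows = 5
nbr_of_cols = 2

# The list comprehension from the post: row-major (row, column) pairs.
coords = [(r, c) for r in range(nbr_of_rows) for c in range(nbr_of_cols)]

# Alternative 1: itertools.product yields the same pairs in the same order.
coords_product = list(product(range(nbr_of_rows), range(nbr_of_cols)))

# Alternative 2: divmod converts a running index straight into (row, column).
coords_divmod = [divmod(i, nbr_of_cols) for i in range(nbr_of_rows * nbr_of_cols)]

print(coords[:4])                                  # [(0, 0), (0, 1), (1, 0), (1, 1)]
print(coords == coords_product == coords_divmod)   # True
```

The divmod form is handy when only the loop index is available, since divmod(i, nbr_of_cols) gives the row and column for the i-th chart directly.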

Awesome: one less thing I have to think about!