Musings of a dad with too much time on his hands and not enough to do. Wait. Reverse that.

Category: technology

Making Music with Pandas

This year I’ve started taking guitar lessons. While I’m eager to jump into learning a bunch of songs, my instructor is keen on me developing foundational knowledge in music theory, scales, modes, and so forth, which I’m perfectly fine with as well.

So far, we’ve covered several ways to play major scales, the pentatonic minor scale, and the natural minor scale. We also talked about scale “relatives”: every major scale has a relative minor scale that uses the same notes but starts on a different root, and every natural minor scale has a relative major in the same way.

My instructor then gave me this assignment: play any major scale from the low E string to the high E string, transition into the scale’s relative minor by dropping down three frets, and finish playing out the relative minor scale.

As I’ve been practicing this task, though, I often find myself off by a fret. I have to ask myself, “self, what major scale did you start in? C major? So why are you playing the G# natural minor scale?”

What would really help my practice is to have a handy cheatsheet to show me all the notes in each major scale and highlight the relative minor scale of each major. I could write it all out by hand, but why do that when I have Python and Pandas at my disposal! Here’s what I came up with:

Import my packages

I really only need pandas for this work:

import pandas as pd

Generate the twelve major scales

Here’s the code I came up with to calculate all the notes in each scale. Each scale consists of 15 notes spanning two octaves, from the root up to the same root two octaves higher:

# make up my list of notes
chromatic_scale_ascending = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']
# since I usually start on the low E string, rearrange the notes starting on E
scale_from_e = (chromatic_scale_ascending + chromatic_scale_ascending)[4:16]

# the major scale pattern, in steps from one note to the next:
# whole, whole, half, whole, whole, whole (the final half step lands back on the root)
key_steps = [2, 2, 1, 2, 2, 2]  # on the guitar, a whole step is two frets
# repeat the chromatic notes so that stepping up from any root never runs off the end
three_octaves = scale_from_e * 3
major_keys = []
for root in scale_from_e:
    steps_from_root = three_octaves.index(root)
    major_scale = [root]
    # construct the seven unique notes in the scale
    for step in key_steps:
        steps_from_root += step
        major_scale.append(three_octaves[steps_from_root])

    # repeat the seven notes and end on the root: 15 notes spanning two octaves
    major_keys.append(major_scale * 2 + [root])

Drop the scales into Pandas for the looks

Writing my list of lists to a pandas dataframe and then displaying that dataframe in a Jupyter notebook makes everything look nice. More importantly, I can use the dataframe’s style property to highlight the relative minor scale within each major scale:

df_major_keys = pd.DataFrame(major_keys)

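# use this function to highlight the relative minor scales in orange
def highlight_natural_minor(data):
    # Styler.apply with axis=None expects a same-shaped dataframe of CSS strings,
    # so start from empty styles rather than a copy of the note names
    df = pd.DataFrame('', index=data.index, columns=data.columns)
    df.iloc[:, 5:13] = 'background-color: orange'
    return df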

df_major_keys.style.apply(highlight_natural_minor, axis=None)

…and here’s my handy major/minor scale cheatsheet:

My major-relative-minor-scale cheatsheet

Column 0 is the tonic/root of the major scale, while columns 5 through 12 hold the relative minor scale of that major. So we can see that the E major scale contains the C# minor scale. As a real-world example, Ozzy’s Crazy Train apparently moves between the A major and F# minor scales, which sound just great together (assuming you ignore Ozzy’s personal eccentricities).
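The “drop down three frets” rule can also be checked directly in code. Here’s a minimal sketch, separate from the cheatsheet code above, that finds the relative minor root three semitones below a major root and builds the natural minor scale from its step pattern. The names CHROMATIC, build_scale, and relative_minor_root are my own for illustration, and the sketch spells every note with sharps just like the cheatsheet does:

```python
CHROMATIC = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']
MAJOR_STEPS = [2, 2, 1, 2, 2, 2]  # frets between successive notes of a major scale
MINOR_STEPS = [2, 1, 2, 2, 1, 2]  # frets between successive notes of a natural minor scale


def build_scale(root, steps):
    """Return the seven notes of a scale, wrapping around the chromatic list."""
    position = CHROMATIC.index(root)
    notes = [root]
    for step in steps:
        position = (position + step) % 12
        notes.append(CHROMATIC[position])
    return notes


def relative_minor_root(major_root):
    """The relative minor starts three semitones (frets) below the major root."""
    return CHROMATIC[(CHROMATIC.index(major_root) - 3) % 12]


for major_root in ['C', 'E', 'A']:
    minor_root = relative_minor_root(major_root)
    minor_scale = build_scale(minor_root, MINOR_STEPS)
    # a major scale and its relative minor should share exactly the same notes
    same_notes = set(minor_scale) == set(build_scale(major_root, MAJOR_STEPS))
    print(major_root, 'major ->', minor_root, 'minor:', minor_scale, '| same notes:', same_notes)
```

So if I start in C major and catch myself playing G# minor, a quick call to relative_minor_root('C') reminds me I should have landed on A.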

So here’s a cool way to merge my interests in music and Python and Pandas into one large mash of goodness.

Dealing with missing dates in your dataframes

I work with a lot of time-based data and occasionally have to deal with data that has chunks of missing time. For example, consider my Kindle “free time” reading data. May 2019 was a good reading month for me, so I’d like to take a closer look at that month and chart out the ebb and flow of my reading time by day. To start with, I’ll just take a look at my reading times by day for the month:

cols = ['accessdate', 'read_mins']
df_may = df_dayinfo[(df_dayinfo.accessdate >= '2019-05-01') & (df_dayinfo.accessdate <= '2019-05-31')][cols]
df_may
My reading minutes for May 2019: there are a few days when I didn’t read at all

If I were to chart that data out, it wouldn’t quite be accurate as there are gaps in the month:

import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(15, 6))
_ = df_may.groupby('accessdate').sum().plot(ax=ax, marker='o')
_ = ax.axhline(y=0.0, xmin=0.0, xmax=1.0, color='gray', ls='--')
_ = ax.set_title('My Reading Time for May 2019')
_ = ax.set_xlabel('Date')
_ = ax.set_ylabel('Reading Time (minutes)')

In the past, to account for these missing days, I’d build a second dataframe with every day in the month and 0.0 read minutes. Then, I’d combine the two dataframes so that I would have entries for all the days of the month:

from datetime import datetime, timedelta

# create a dataframe with all the days of the month and 0.0 read times
start = datetime(2019, 5, 1)
end = datetime(2019, 6, 1)
may_zeros = [[start + timedelta(days=x), 0.0] for x in range(0, (end-start).days)]
df_may_zeros = pd.DataFrame(may_zeros, columns=['accessdate', 'read_mins'])

# now, stack my 0.0 read time df with my actual data and sum by day to get a full representation of the month
df_may1 = pd.concat([df_may, df_may_zeros]).groupby('accessdate').sum()

# finally, create the chart (df_may1 is already grouped by day)
fig, ax = plt.subplots(figsize=(15, 6))
_ = df_may1.plot(ax=ax)
_ = ax.axhline(y=0.0, xmin=0.0, xmax=1.0, color='gray', ls='--')
_ = ax.set_title('My Reading Time for May 2019')
_ = ax.set_xlabel('Date')
_ = ax.set_ylabel('Reading Time (minutes)')
Reading times in May including missed days

So, problem solved, but it turns out Pandas has an even better way to solve this problem: use Pandas’ date_range function along with reindex:

# use date_range to create an index for every day in May 2019
idx = pd.date_range('05-01-2019', '05-31-2019')
# now, group my real data by day, reindex it with the days in May, and fill any missing values with 0
df_may2 = df_may.groupby('accessdate').sum().reindex(idx, fill_value=0)

# now, we can create the chart
fig, ax = plt.subplots(figsize=(15, 6))

# I can overlay my original chart to see the differences if I want
# _ = df_may.groupby('accessdate').sum().plot(ax=ax, color='r')
_ = df_may2.plot(ax=ax, marker='o')
_ = ax.axhline(y=0.0, xmin=0.0, xmax=1.0, color='gray', ls='--')
_ = ax.set_title('My Reading Time for May 2019')
_ = ax.set_xlabel('Date')
_ = ax.set_ylabel('Reading Time (minutes)')
The same results with the aid of date_range
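
If you want to try the date_range and reindex approach without my Kindle data, here’s a minimal, self-contained sketch. The reading log below is made up for illustration, and df_sample and daily are just names I picked:

```python
import pandas as pd

# made-up reading log with gaps: no rows at all for May 3, 4, or 6
df_sample = pd.DataFrame({
    'accessdate': pd.to_datetime(
        ['2019-05-01', '2019-05-02', '2019-05-02', '2019-05-05', '2019-05-07']),
    'read_mins': [30.0, 10.0, 15.0, 45.0, 20.0],
})

# one index entry for every day in the window I care about
idx = pd.date_range('2019-05-01', '2019-05-07')

# sum the minutes per day, then reindex so the missing days show up as 0.0
daily = df_sample.groupby('accessdate')['read_mins'].sum().reindex(idx, fill_value=0.0)
print(daily)
```

Note that this only works cleanly when accessdate is a real datetime column; if it holds plain strings, the reindex won’t line up with the dates from date_range.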

If you don’t want to assume your missing days are simply 0.0 values, leave out fill_value and reindex will, by default, fill the missing values with NaN. You could then run interpolate over your dataframe to estimate the missing values from the neighboring days.
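
As a rough sketch of that idea, assuming a few made-up daily totals (the numbers and the names daily, with_gaps, and filled are invented for illustration), reindex without fill_value leaves NaN on the missing days and interpolate fills them in. By default interpolate draws a straight line between the known values on either side:

```python
import pandas as pd

# made-up daily totals with no entries for May 2 or May 3
daily = pd.Series(
    [30.0, 60.0, 20.0],
    index=pd.to_datetime(['2019-05-01', '2019-05-04', '2019-05-05']),
)

idx = pd.date_range('2019-05-01', '2019-05-05')

# without fill_value, the missing days come back as NaN
with_gaps = daily.reindex(idx)

# linear interpolation (the default) estimates the gaps from the surrounding days
filled = with_gaps.interpolate()
print(filled)
```

Whether interpolating makes sense depends on the data: for reading minutes, a day with no entry probably really was a zero, but for something like a temperature sensor that dropped offline, an estimate is likely closer to the truth.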

In addition to date_range, Pandas has lots of other general-purpose functions worth checking out. So now you have two ways to fill in missing dates in your dataframes!

Windows PowerToys

There are a few good articles out there on Windows PowerToys–Microsoft’s return to authoring little, helpful utilities for those looking to work more effectively in their Windows environments.

As of this writing, three utilities come bundled with the project: FancyZones, PowerRename, and Shortcut Guide (although Shortcut Guide is less a utility and more a handy cheatsheet on the cool shortcuts you can trigger with the Windows key on your keyboard).

The tool I’m really looking forward to is PowerLauncher. This tool purports to be an application launcher on par with the likes of Launchy and Wox. Currently, I’m a big user of SlickRun. It’s handy and very flexible. If I can’t get SlickRun to perform a particular action, I can always wrap the action in a PowerShell or batch script and have SlickRun launch that. I’ve used SlickRun for years, but I think it’s high time Microsoft itself provided a more flexible way to launch applications and other actions from the keyboard. There’s clearly a market for it.

I don’t see any timelines for the release of the utility, but I will be eagerly checking in on the GitHub page for updates.


© 2025 DadOverflow.com
