DadOverflow.com

Musings of a dad with too much time on his hands and not enough to do. Wait. Reverse that.


Bash in your notebook

#!/bin/bash

At work recently, I had to call an internal REST API for some data I needed to process. To test the API, I fired up Ubuntu in my Windows Subsystem for Linux and ran a cURL command against it. That went well, so I created a new Jupyter Notebook in which to call the API: I wanted to take the response data, load it into a pandas DataFrame, and create a chart. Easy, right?

So, I called the API with requests and promptly received an SSL “bad handshake” error. Like many others, I struggled to resolve this error. Clearly, the server hosting that API was misconfigured in some way. However, I didn’t own the server and had no real recourse to get the issue fixed, so I decided to call cURL directly from my notebook; this led me to the bash magic command.

With the bash magic command, you can tell Jupyter Notebook to run all the commands in your cell as if you were executing them at a bash command prompt…even if you’re running Jupyter Notebook on a Windows operating system (as long as a bash executable, such as the one WSL provides, is available). How cool is that?! Furthermore, with the --out argument, you can capture all your cell output in a variable for easy processing. Check this out:

(Note: I’m using the free lyrics API from lyrics.ovh in my examples below)

%%bash --out lyrics1
curl https://private-anon-cd823708f8-lyricsovh.apiary-proxy.com/v1/the%20beatles/here%20comes%20the%20sun

The bash magic command returns output as a string. Since the output is really JSON, I can easily convert it to a Python dictionary with the json.loads function:

import json
json_lyrics1 = json.loads(lyrics1)
print(json_lyrics1['lyrics'])
That JSON String easily converts to standard JSON
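
For comparison, here is a minimal sketch of the same capture-then-parse idea in plain Python, using the standard subprocess module instead of the bash magic. The helper name run_and_capture is made up for illustration, and the "API call" is a stand-in child process that prints a small JSON document, so the sketch runs without a network connection or cURL installed.

```python
import json
import subprocess
import sys


def run_and_capture(command):
    """Run a command (a list of arguments) and return its standard output as a string."""
    completed = subprocess.run(
        command,
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode the output bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


# Stand-in for the cURL call (an assumption for this sketch): a child Python
# process that prints a small JSON document. In a notebook, the command would
# instead be something like ["curl", "-s", "<the lyrics URL>"].
fake_api_call = [sys.executable, "-c", "print('{\"lyrics\": \"Here comes the sun\"}')"]

raw_output = run_and_capture(fake_api_call)  # a plain string, like the --out variable
parsed = json.loads(raw_output)              # now a Python dictionary
print(parsed["lyrics"])
```

The point is the same as with the magic command: the captured output is just text until json.loads turns it into a dictionary.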

What if you want to call Bash commands in a loop?

Suppose I wanted to get the lyrics of multiple Beatles songs…now what do I do? Well, here’s one hack I came up with: do in-line calls to the shell.

song_list = ['yesterday', 'yellow%20submarine', 'eleanor%20rigby']
song_lyrics = []

for song in song_list:
    lyric = !wsl curl -s 'https://private-anon-cd823708f8-lyricsovh.apiary-proxy.com/v1/the%20beatles/{song}'
    song_lyrics.append(lyric)
    
song_lyrics

Here are a few things to note with my shell operation:

  • Since my operating system is Windows 10, I’m actually shelling out to the Windows command shell, not bash. However, since I have WSL installed on my machine, I can use wsl.exe to run commands in that shell. So, I’m basically calling a shell within a shell to ultimately execute my bash operation.
  • With the braces syntax, I can pass the value of my song variable to my shell command.
  • I pass the silent argument (-s) to cURL to suppress the noise cURL would normally send back to Jupyter Notebook. This allows me to pass just the JSON response to my variable lyric.

One challenge with this approach is that the shell command returns an SList, which is basically a list of strings (one per line of output). I should be able to join those strings together, though, and then parse the result with the json.loads function:

song_list = ['yesterday', 'yellow%20submarine', 'eleanor%20rigby']
song_lyrics = []

for song in song_list:
    lyric = !wsl curl -s 'https://private-anon-cd823708f8-lyricsovh.apiary-proxy.com/v1/the%20beatles/{song}'
    json_lyrics = json.loads(''.join(lyric))
    song_lyrics.append(json_lyrics)
    
song_lyrics

And now I have a list of parsed JSON objects (Python dictionaries) to work with. Awesome!
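
Two pieces of the loop above can be sketched on their own, without touching the network. The function names build_lyrics_url and parse_shell_lines are hypothetical helpers, not part of the original notebook. The first uses urllib.parse.quote so song titles can be written with ordinary spaces rather than hand-typed %20 sequences; the second is the join-then-parse step applied to a list of lines like the one a shell call returns.

```python
import json
from urllib.parse import quote

# Base URL copied from the examples above.
BASE_URL = "https://private-anon-cd823708f8-lyricsovh.apiary-proxy.com/v1"


def build_lyrics_url(artist, song):
    """Percent-encode the artist and song so that spaces become %20."""
    return f"{BASE_URL}/{quote(artist, safe='')}/{quote(song, safe='')}"


def parse_shell_lines(lines):
    """Join the lines returned by a shell call and parse the result as JSON."""
    return json.loads("".join(lines))


url = build_lyrics_url("the beatles", "yellow submarine")
print(url)

# Simulated shell output (made up for this sketch): one JSON document spread over three lines.
sample_lines = ["{", '  "lyrics": "We all live in a yellow submarine"', "}"]
song = parse_shell_lines(sample_lines)
print(song["lyrics"])
```

With a helper like this, the song list could hold plain titles such as 'eleanor rigby' and the encoding would happen in one place.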

For more on the bash magic command, check out this excellent article. Go here to get all my example code.

Music to drive by, Part 4

Another post in my continuing saga to collect quality, offline music to play in my car. For more details, see my previous posts:

I’ve made a few revisions lately to the code I wrote to inventory my music collection–over 300 albums and over 4000 songs–and write select songs to a USB stick for playing in my car.

Dealing with odd characters unfriendly to PowerShell

Several of my file paths include characters such as brackets and ampersands that don’t seem particularly friendly to some of the PowerShell commands I use. In particular, Select-Object -Unique does not seem fond of strings containing these characters. To circumvent this issue, I’ve found that if I escape those strings with [RegEx]::Escape before performing the Unique operation on them, I can avoid a certain amount of pain. However, I’ve also found that Shell’s Namespace method is not fond of escaped characters. So, immediately after I produce my unique set of folders to process, I unescape those strings:

$mp3_folders = dir "$music_folder\*.mp3" -Recurse | where {$dirs_to_exclude -notcontains $_.Directory.FullName} | select Directory | %{[RegEx]::Escape($_.Directory)} | select -Unique | %{[RegEx]::Unescape($_)}

Adding additional fields

There are a lot of ID3 tag fields to filter on

Previously, I was only collecting the “contributing artist”; however, there is also an “album artist” field, so I added that to my inventory as well. A single song often lists multiple people or bands as contributing artists, whereas the album artist is usually just the band or singer who released the CD. Differentiating these two values can be helpful when you start to filter down just what music you want on your thumb drive.

Writing my inventory to a CSV file instead of a JSON file

In my first version of the inventory script, I produced a JSON file representing my music inventory. That’s not necessarily a bad approach; however, if I write my inventory out as a CSV file instead, I can then easily open that file in Excel and start to quality check the metadata in my music files.

Making sure to encode my inventory file as UTF8

I have a fair number of foreign artists in my collection whose names and song titles use accented letters and the like. Encoding my inventory as UTF8 will properly preserve these characters.
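
The inventory script itself is PowerShell, but the CSV-plus-UTF8 idea can be sketched in Python as well. Everything below is an assumption for illustration: the rows are sample data, and the column names simply mirror the fields discussed in this post. The "utf-8-sig" encoding writes a byte order mark at the start of the file, which is what lets Excel recognize the file as UTF-8 when it is opened directly.

```python
import csv
import os
import tempfile

# Sample inventory rows (illustrative only); the real script reads these values from ID3 tags.
inventory = [
    {"song": "Yesterday", "album": "Help!", "album_artist": "The Beatles", "genre": "Rock"},
    {"song": "Águas de Março", "album": "Elis & Tom", "album_artist": "Elis Regina", "genre": "Bossa Nova"},
]

# Write to a temporary folder so the sketch does not clutter the working directory.
csv_path = os.path.join(tempfile.mkdtemp(), "music_inventory.csv")

# newline="" is what the csv module expects; "utf-8-sig" adds the byte order mark.
with open(csv_path, "w", newline="", encoding="utf-8-sig") as handle:
    writer = csv.DictWriter(handle, fieldnames=["song", "album", "album_artist", "genre"])
    writer.writeheader()
    writer.writerows(inventory)

# Read the file back to confirm the accented characters survived the round trip.
with open(csv_path, newline="", encoding="utf-8-sig") as handle:
    rows_read = list(csv.DictReader(handle))

print(rows_read[1]["song"])
```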

Filtering on both album artists and albums

My music collection includes music my children enjoy–much of which I find annoying. For example, I don’t enjoy singing along to the Smurfs soundtrack while stuck in rush hour traffic. So, I’ve added logic to my “copy” script to not only filter out particular artists I don’t want to hear in the car but also filter out specific albums, as well.

Filtering on spoken word audio

I have a few CDs that include speeches or interviews of the artists, such as The Beatles Anthology. Personally, I’d rather hear the Beatles’ music than a speech or interview, so I do my best to filter such audio out of my playlist by excluding any song whose title includes the word speech. I should probably extend that logic to include the word interview, also.

$mp3s_to_write_to_drive = $music_collection | where {$genres_to_include -contains $_.genre} | where {$album_artists_to_exclude -notcontains $_.album_artist} | 
    where {$albums_to_exclude -notcontains $_.album} | where {$_.song -notlike "*speech*"} | sort {random}
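
The same chain of filters can be sketched in Python for readers more at home there. This is not the copy script, only an analogue under stated assumptions: the function name select_songs and the sample rows are made up, and the blocked-word list also includes "interview" as suggested above. One difference to keep in mind is that PowerShell's -contains and -like compare case-insensitively by default, while Python's in operator is case-sensitive, so the title check lowercases the song name first.

```python
import random


def select_songs(collection, genres_to_include, album_artists_to_exclude,
                 albums_to_exclude, blocked_words=("speech", "interview")):
    """Return the tracks that pass every filter, in random order."""
    selected = [
        track for track in collection
        if track["genre"] in genres_to_include
        and track["album_artist"] not in album_artists_to_exclude
        and track["album"] not in albums_to_exclude
        # Lowercase the title so "Speech" and "speech" are treated the same.
        and not any(word in track["song"].lower() for word in blocked_words)
    ]
    random.shuffle(selected)  # shuffle in place, like the final sort {random}
    return selected


# Made-up rows for illustration only.
collection = [
    {"song": "Come Together", "album": "Abbey Road", "album_artist": "The Beatles", "genre": "Rock"},
    {"song": "Opening Speech", "album": "Some Anthology", "album_artist": "The Beatles", "genre": "Rock"},
    {"song": "Smurf Song", "album": "Smurfs Soundtrack", "album_artist": "The Smurfs", "genre": "Children"},
    {"song": "Track One", "album": "Kids Party Mix", "album_artist": "Various", "genre": "Rock"},
]

picked = select_songs(
    collection,
    genres_to_include=["Rock"],
    album_artists_to_exclude=["The Smurfs"],
    albums_to_exclude=["Kids Party Mix"],
)
print([track["song"] for track in picked])
```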

For these improvements and more, check out my new versions here.

Sorting your teammates randomly

You go first? No, you go first.

Every week, my team has a status meeting where we spend part of the time going around our virtual meeting room getting updates from each team member. To try to keep things fair, my manager attempts to randomly pick the next team member to give his or her update. If she really wanted to be fair, she’d simply run this PowerShell command every week:

"Larry","Moe","Curly","Doc","Sneazy","Grumpy"|sort {random}

Here, I start with an array of my team members: Larry, Moe, Curly, Doc, Sneazy, and Grumpy. I pipe that array to the Sort-Object cmdlet. The {random} script block runs Get-Random once for each name, so the list ends up sorted by the large random number generated for each member. This doesn’t feel completely random to me and I’m not sure it would work well for teams larger than nine people, but it certainly works better than drawing names out of a hat–or the manager’s head, for that matter.
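
For comparison, here is a rough Python sketch of the same trick. The first version mirrors the sort-by-random-key idea; the second uses random.shuffle from the standard library, which is the more conventional way to randomize a list in Python. The variable names are only for illustration.

```python
import random

team = ["Larry", "Moe", "Curly", "Doc", "Sneazy", "Grumpy"]

# Analogue of sort {random}: each name is given one random number as its sort key.
speaking_order = sorted(team, key=lambda _name: random.random())

# Alternative: shuffle a copy of the list in place.
shuffled = team.copy()
random.shuffle(shuffled)

print(speaking_order)
print(shuffled)
```

Both versions keep every name exactly once; only the order changes from run to run.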

Hat tip to this blog post for the inspiration.


© 2025 DadOverflow.com
