Musings of a dad with too much time on his hands and not enough to do. Wait. Reverse that.

Ten things I like to do in Jupyter Markdown

One of the great things about Jupyter Notebook is how you can intersperse your code blocks with markdown blocks that you can use to add comments or simply more context around your code. Here are ten ways I like to use markdown in my Jupyter Notebooks.

1. Use hashes for easy titles

In your markdown cell, enter a line like this:

# This becomes an H1 header/title

That line will render as a header (h1 element).

2. Use asterisks and hyphens for bullet points

Try these lines in your markdown cell:

* this is one way to do a bullet point
- this is another way to do a bullet point

Both render as bullet point lists.

3. Use asterisks and underscores for emphasis

Next, try this:

*these words become italicized*
__these words become bold__
Wait…that didn’t render quite as expected

The phrase I wanted to italicize italicized and the phrase I wanted to bold went bold, but both phrases rendered on the same line. What gives? Markdown treats consecutive lines as a single paragraph, so a plain newline doesn’t start a new line in the rendered output. Here’s a simple solution: add a <br> (HTML for line break) at the end of each line where you want a line break. So, write this in your markdown cell:

*these words become italicized*<br>
__these words become bold__

4. Center my headers with some HTML

Instead of using the hash shortcut, code your header elements directly and style them to center:

<h1 style="text-align: center">This header is centered</h1>

Interestingly, I’ve noticed that my centering works in Jupyter Notebook, but not in Jupyter Lab.

5. Create thick dividing lines with HTML

My notebooks that do a lot of exploratory data analysis before jumping into data modeling can get quite lengthy. I find that a nice, thick dividing line between sections can be a great visual indicator of the changing focus of my notebook. In a markdown cell, give this a try:

<hr style="border-top: 5px solid purple; margin-top: 1px; margin-bottom: 1px">

6. Write mathematical formulas

I’m more coder than math guy, but a formula or two can sometimes be helpful in explaining your solution to a problem. Jupyter markdown cells support LaTeX, so give this a whirl:

linear regression: $y = ax + b$
two dimensions: $y = a_{1}x_{1} + a_{2}x_{2} + b$
Ridge Regression: standard OLS loss function + $\alpha \times \sum_{i=1}^{n} a_{i}^{2}$

7. Create hyperlinks

Hyperlinks are easy in markdown:

[Google](https://google.com)

8. Drop in images with HTML

A picture is worth a thousand words:

<img src="mind_blown.gif" style="max-width:50%; max-height:50%">

9. Create nice tables

Use pipes and dashes to create a table in your markdown:

|| sepal length (cm) | sepal width (cm) |
|----|----|----|
|0|5.1|3.5|
|1|4.9|3.0|
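
If the table values already live in Python, typing the pipes by hand gets tedious. Here is a minimal sketch of a helper that builds the same kind of table as a string. The function name `markdown_table` is my own invention for illustration, not part of Jupyter or any library:

```python
def markdown_table(headers, rows):
    """Build a pipe-and-dash markdown table from a header list and row lists."""
    lines = ["| " + " | ".join(str(h) for h in headers) + " |"]
    # One dash group per column forms the separator row under the header.
    lines.append("|" + "|".join("----" for _ in headers) + "|")
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)


table = markdown_table(
    ["", "sepal length (cm)", "sepal width (cm)"],
    [[0, 5.1, 3.5], [1, 4.9, 3.0]],
)
print(table)
```

You can paste the printed text into a markdown cell, or, from a code cell, pass the string to `IPython.display.Markdown` to render it in place.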

10. Escape text with three tick marks

Occasionally, I’ll want to show a code snippet in my markdown or other kind of escaped text. You can do that by surrounding your snippet with three back-tick characters:

```
sample code goes here
```

Bonus: change the background color of your markdown cells

It never occurred to me until recently, but Notebooks bring with them a variety of style classes that you can leverage in your own markdown. Here are four examples (note: this is yet another markdown trick that works in Jupyter Notebook, but not in Jupyter Lab…at least the version I’m presently running):

<div class="alert alert-block alert-info">
This is a blue background
</div>
<div class="alert alert-block alert-warning">
This is a yellow background
</div>
<div class="alert alert-block alert-success">
This is a green background
</div>
<div class="alert alert-block alert-danger">
This is a red background
</div>

For all of this code, check out my notebook here. Also, here are two other great posts on more markdown tips and tricks.

Python bingo

Have a road trip planned this summer? Want to keep the kids from driving you crazy as you drive to Walley World? How about playing the ol’ standard License Plate game but with a twist: License Plate Bingo!

Run this code a couple of times and print out the chart on separate pieces of paper. There are your bingo cards. Give one to each kid and/or adult. Now, hit the road! If you see a license plate from, say, Texas, and you have a “Texas” square, mark it off on your bingo card. If you can mark off a row horizontally, vertically, or diagonally, you win!
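
If the kids start arguing over who actually won, the winning rule is easy to check in code. This is only a sketch under my own assumptions: a square card (4x4 here), with the marked squares tracked as a set of (row, column) pairs. The `has_bingo` name is hypothetical and not part of my original notebook:

```python
def has_bingo(marked, rowlen=4):
    """Return True if any row, column, or diagonal is fully marked.

    `marked` is a set of (row, col) pairs for the squares crossed off so far.
    """
    lines = []
    for i in range(rowlen):
        lines.append([(i, c) for c in range(rowlen)])  # row i
        lines.append([(r, i) for r in range(rowlen)])  # column i
    lines.append([(i, i) for i in range(rowlen)])  # main diagonal
    lines.append([(i, rowlen - 1 - i) for i in range(rowlen)])  # anti-diagonal
    return any(all(square in marked for square in line) for line in lines)


diagonal = {(0, 0), (1, 1), (2, 2), (3, 3)}
print(has_bingo(diagonal))          # True
print(has_bingo({(0, 0), (0, 1)}))  # False
```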

Step 1: Import your packages

import matplotlib.pyplot as plt
import matplotlib.style as style
import numpy as np
import random

%matplotlib inline
# on matplotlib 3.6 and later this style is named 'seaborn-v0_8-poster'
style.use('seaborn-poster')

Step 2: Get your State names

# courtesy of this gist: https://gist.github.com/JeffPaine/3083347
states = ["AL - Alabama", "AK - Alaska", "AZ - Arizona", "AR - Arkansas", "CA - California", "CO - Colorado",
"CT - Connecticut", "DC - Washington DC", "DE - Delaware", "FL - Florida", "GA - Georgia",
"HI - Hawaii", "ID - Idaho", "IL - Illinois", "IN - Indiana", "IA - Iowa",
"KS - Kansas", "KY - Kentucky", "LA - Louisiana", "ME - Maine", "MD - Maryland",
"MA - Massachusetts", "MI - Michigan", "MN - Minnesota", "MS - Mississippi",
"MO - Missouri", "MT - Montana", "NE - Nebraska", "NV - Nevada", "NH - New Hampshire",
"NJ - New Jersey", "NM - New Mexico", "NY - New York", "NC - North Carolina",
"ND - North Dakota", "OH - Ohio", "OK - Oklahoma", "OR - Oregon", "PA - Pennsylvania",
"RI - Rhode Island", "SC - South Carolina", "SD - South Dakota", "TN - Tennessee",
"TX - Texas", "UT - Utah", "VT - Vermont", "VA - Virginia", "WA - Washington", "WV - West Virginia",
"WI - Wisconsin", "WY - Wyoming"]
state_names = [s.split('-')[1].strip() for s in states]

Step 3: Generate your bingo card

random.shuffle(state_names)
rowlen = 4  # make any size card you'd like

fig = plt.figure()
ax = fig.gca()
ax.set_xticks(np.arange(0, rowlen + 1))
ax.set_yticks(np.arange(0, rowlen + 1))
plt.grid()

for i, word in enumerate(state_names[:rowlen**2]):
    x = (i % rowlen) + 0.4   # column position, nudged toward the cell center
    y = (i // rowlen) + 0.5  # row position, nudged toward the cell center
    ax.annotate(word, xy=(x, y), xytext=(x, y))

plt.show()

Python Bingo, FTW!
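
Rather than rerunning the cell once per passenger, you could deal several cards in one go. Below is a rough sketch of that idea without the plotting part. The `make_cards` helper and the stand-in name list are my own assumptions for illustration; in the notebook you would pass in `state_names` instead:

```python
import random


def make_cards(names, n_cards, rowlen=4, seed=None):
    """Return n_cards grids, each a rowlen x rowlen list of distinct names."""
    rng = random.Random(seed)  # a private generator keeps runs reproducible
    cards = []
    for _ in range(n_cards):
        picks = rng.sample(names, rowlen ** 2)  # no repeats within a card
        cards.append([picks[r * rowlen:(r + 1) * rowlen] for r in range(rowlen)])
    return cards


# A short stand-in list; in the post this would be state_names.
sample_names = [f"State {i}" for i in range(1, 21)]
cards = make_cards(sample_names, n_cards=3, seed=42)
for row in cards[0]:
    print(row)
```

Sampling per card means two cards can share states, which is what you want: everyone is hunting for overlapping plates, but no card repeats a state.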

Grab my full source code here.

Choosing the best coffee

Here’s another post in my quest to recreate many of the charts from Machine Learning Plus’s Top 50 matplotlib visualizations:

The perfect cup of coffee

Back in March, TowardsDataScience.com published an article that analyzed a coffee dataset from the Coffee Quality Institute (sounds like a great place to work!). Since I’m always looking for cool datasets to work with and since I love coffee, I thought this would be a great dataset to pull down and visualize in some fashion.

In the article, the author visualizes median coffee data from several countries around the world in polar charts. The polar charts worked well to get all 11 features on the chart at the same time, but every polar chart, from Ethiopia to the United States, looked the same. It was difficult to see how one country’s coffee differed from another’s. I wondered if there might be a better way to show the subtle variations among each country’s coffee. Enter another article I talked about previously: Top 50 matplotlib Visualizations. I thought one chart in particular from that article, the Diverging Bars Chart, might do the trick.

Since each country can produce tens of different brands of coffee, I followed the lead of the original article and grabbed the median value from each country. I then applied the Diverging Bars technique to plot how far each country’s coffee varied from the mean.
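
To make "how far from the mean" concrete, here is a small worked sketch of the standardization step using only the standard library. The country names and scores below are made up for illustration and are not values from the dataset:

```python
from statistics import mean, stdev

# Hypothetical median flavor scores, invented for this example only.
medians = {"Country A": 7.25, "Country B": 7.5, "Country C": 8.0}

scores = list(medians.values())
avg = mean(scores)
spread = stdev(scores)  # sample standard deviation, the same default pandas .std() uses

# z-score: how many standard deviations each median sits from the average
z_scores = {country: (score - avg) / spread for country, score in medians.items()}

for country, z in sorted(z_scores.items(), key=lambda item: item[1]):
    color = "red" if z < 0 else "green"
    print(f"{country}: z = {z:+.2f} ({color})")
```

Countries below the average get a negative score and a red bar; countries above it get a positive score and a green bar.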

One thing that puzzles me, though: in several of the categories, Papua New Guinea comes out on top. Yet if you look at the original article, the author lists the median Ethiopian coffee as coming out on top more often than not. What’s the reason for this discrepancy? I’m not really sure. I think I calculated the medians correctly–my Ethiopian values certainly match the author’s. Perhaps I’m working from a newer dataset than he did?

At any rate, I accomplished my main goal of creating some cool diverging bar charts. Enjoy with your favorite cup of java!

Step 1: Load the data

import matplotlib.pyplot as plt
import pandas as pd

# https://github.com/jldbc/coffee-quality-database
df_coffee = pd.read_csv('./data/arabica_data_cleaned.csv')
df_coffee.head()

Step 2: Code the chart

Since the dataset has multiple features, each of which I’d like to chart, I decided to place my chart-generation code in a function so that I could easily reuse it from feature to feature:

def generate_chart(feature_to_chart, xlabel, title):
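    # take the median of just the charted feature; selecting the column first
    # avoids errors from non-numeric columns on newer pandas versions
    df_chart = df_coffee.groupby('Country.of.Origin')[[feature_to_chart]].median().reset_index()
    # standardize: distance of each country's median from the mean, in standard deviations
    df_chart['z'] = (df_chart[feature_to_chart] - df_chart[feature_to_chart].mean()) / df_chart[feature_to_chart].std()

    df_chart['colors'] = ['red' if x < 0 else 'green' for x in df_chart['z']]
    df_chart.sort_values('z', inplace=True)
    df_chart.reset_index(drop=True, inplace=True)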

    # draw plot
    plt.figure(figsize=(14,10), dpi=80)
    plt.hlines(y=df_chart.index, xmin=0, xmax=df_chart.z, color=df_chart.colors, alpha=0.4, linewidth=5)

    # decorations
    plt.gca().set(ylabel='$Country$', xlabel=xlabel)
    plt.yticks(df_chart.index, df_chart['Country.of.Origin'], fontsize=12)
    plt.title(title, fontdict={'size':20})
    plt.grid(linestyle='--', alpha=0.5)
    plt.show()

Step 3: Generate the chart

Finally, I can call my function and generate the chart:

feature_to_chart = 'Flavor'
xlabel = '${0}$ $Variation$'.format(feature_to_chart)
title = 'Diverging Bars of Median Coffee {0} Rating'.format(feature_to_chart)

generate_chart(feature_to_chart, xlabel, title)
Median Coffee Flavors

Two other interesting charts:

Divergence of the “balance” feature
Divergence of the “acidity” feature
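
Since the chart code lives in a function, extra charts like these only need a loop. Here is a sketch of how the labels could be built per feature. The `chart_labels` helper is my own addition for illustration, and I am assuming 'Balance' and 'Acidity' match the column names in the CSV:

```python
def chart_labels(feature):
    """Return the (xlabel, title) pair used in the post for a given feature."""
    xlabel = '${0}$ $Variation$'.format(feature)
    title = 'Diverging Bars of Median Coffee {0} Rating'.format(feature)
    return xlabel, title


# 'Balance' and 'Acidity' are assumed to match column names in the dataset.
for feature in ['Flavor', 'Balance', 'Acidity']:
    xlabel, title = chart_labels(feature)
    print(feature, '->', title)
    # In the notebook, the next call would draw the chart:
    # generate_chart(feature, xlabel, title)
```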

Check out my complete code here and look for more cool charts to come!

© 2025 DadOverflow.com
