Musings of a dad with too much time on his hands and not enough to do. Wait. Reverse that.

Month: February 2020 (Page 2 of 2)

Dealing with missing dates in your dataframes

I work with a lot of time-based data and occasionally have to work with data where there are chunks of missing time. For example, consider my kindle “free time” reading data. May 2019 was a good reading month for me, so I’d like to take a closer look at that month and chart out the ebb and flow of my reading time by day. To start with, I’ll just take a look at my reading times by day for the month:

cols = ['accessdate', 'read_mins']
df_may = df_dayinfo[(df_dayinfo.accessdate >= '2019-05-01') & (df_dayinfo.accessdate <= '2019-05-31')][cols]
df_may
My reading minutes for May 2019: there are a few days when I didn’t read at all

If I were to chart that data out, it wouldn’t quite be accurate as there are gaps in the month:

fig, ax = plt.subplots(figsize=(15, 6))
_ = df_may.groupby('accessdate').sum().plot(ax=ax, marker='o')
_ = ax.axhline(y=0.0, xmin=0.0, xmax=1.0, color='gray', ls='--')
_ = ax.set_title('My Reading Time for May 2019')
_ = ax.set_xlabel('Date')
_ = ax.set_ylabel('Reading Time (minutes)')

In the past, to accommodate for these missing days, I’d build a second dataframe of all the days in the month and 0.0 read minutes. Then, I’d merge the dataframes together so that I would have entries for all the days of the month:

# create a dataframe with all the days of the month and 0.0 read times
start = datetime(2019, 5, 1)
end = datetime(2019, 6, 1)
may_zeros = [[start + timedelta(days=x), 0.0] for x in range(0, (end-start).days)]
df_may_zeros = pd.DataFrame(may_zeros, columns=['accessdate', 'read_mins'])

# now, merge my 0.0 read time df with my actual data to get a full representation of the month
df_may1 = pd.concat([df_may, df_may_zeros]).groupby('accessdate').sum()

# finally, create the chart
fig, ax = plt.subplots(figsize=(15, 6))
_ = df_may1.groupby('accessdate').sum().plot(ax=ax)
_ = ax.axhline(y=0.0, xmin=0.0, xmax=1.0, color='gray', ls='--')
_ = ax.set_title('My Reading Time for May 2019')
_ = ax.set_xlabel('Date')
_ = ax.set_ylabel('Reading Time (minutes)')
Reading times in May including missed days

So, problem solved, but it turns out Pandas has an even better way to solve this problem: use Pandas’ date_range function along with reindex:

# use date_range to create an index for every day in May 2019
idx = pd.date_range('05-01-2019', '05-31-2019')
# now, group my real data by day, reindex it with the days in May, and fill any missing values with 0
df_may2 = df_may.groupby('accessdate').sum().reindex(idx, fill_value=0)

# now, we can create the chart
fig, ax = plt.subplots(figsize=(15, 6))

# i can overlay my original chart to see the differences if I want
# _ = df_may.groupby('accessdate').sum().plot(ax=ax, color='r')
_ = df_may2.plot(ax=ax, marker='o')
_ = ax.axhline(y=0.0, xmin=0.0, xmax=1.0, color='gray', ls='--')
_ = ax.set_title('My Reading Time for May 2019')
_ = ax.set_xlabel('Date')
_ = ax.set_ylabel('Reading Time (minutes)')
The same results with the aid of date_range

If you don’t want to assume your missing days are simply 0.0 values, reindex will, by default, fill the values with NaN. You could then run interpolate over your dataframe and calculate some other value.

In addition to date_range, Pandas has lots of other general-purpose functions worth checking out. So now you have two ways to fill in missing dates in your dataframes!

Dark Mode

On my workstations and phone, I’ve been trying to switch my applications over to “dark mode” themes where, instead of black letters on bright white backgrounds, you get white letters on dark backgrounds. This color change just seems to feel better on my eyes.

On the more recent versions of Windows 10, switching to dark mode is pretty easy. What’s nicer still is that many applications–like Firefox, Chrome, and Windows Explorer–will simply inherit that dark mode setting from the operating system, so you don’t have to go hunt around in the settings of these applications individually.

On my Android phone, many applications are also beginning to provide dark mode options and, for my background wallpaper, I simply used Zedge to find a free, dark space image to complete the theme.

Give dark mode a try on your devices and see if it helps reduce the strain on your eyes!

Newer posts »

© 2024 DadOverflow.com

Theme by Anders NorenUp ↑