Musings of a dad with too much time on his hands and not enough to do. Wait. Reverse that.


Exploring chess tournament results

Back in March, my son competed in the 2018 Queen City Classic Chess Tournament. The tournament coordinators graciously provided the player results online, although those files no longer appear on the site. At the time, I posted on the challenge of downloading the match results and parsing the values. After that, I had intended to do some exploratory data analysis (EDA) on the data and, ideally, see what sort of machine learning models I might want to build against the data.

Well, I did do some EDA work, but I’ve since grown a little restless and moved on to other projects. So, I want to go ahead and publish the little bit of work I did do on the data. Maybe next year, I’ll get to more interesting data modeling.

The tournament was composed of 699 players from 134 teams. Kindergarteners through high school 12th graders competed. Rated and non-rated players competed. Here’s a visual of that distribution across the grades:

The largest team, Detroit City Chess Club, brought almost 100 players! Here’s a look at the top 10 largest teams:

The average team size, though, was 5.2 players:

There were 14 competition categories by age and rated and non-rated. Dragon Chess Center dominated most categories:

That’s all I’ll post here, but be sure to check out the notebook I put together that has a lot more analysis.

Parsing my DataCamp.com Accomplishments

I’m a big fan of DataCamp.com. I’m in my second year with the training site and am learning valuable data analysis skills all the time.

When you complete a course on the site, they usually send you an email of congratulations, along with a link to a certificate of your accomplishment and a handy link to add your certificate to your LinkedIn profile. I’ve completed multiple courses and added several to my profile; however, I know I’ve missed a few here and there. If you go to your profile page in DataCamp, you’ll see a page listing the different topics your training has covered so far, the tracks you’ve completed, and the courses you’ve completed. Each completed course includes a LinkedIn button allowing you to easily attach that completed course to your LinkedIn profile. That’s all well and good, but I’d also like to be able to download my certificates of completion for each course. It’d be great if DataCamp had a single “download” button that would allow me to download all my certificates of accomplishment at once. No matter: I can use Python to do that. Here’s how I solved that problem:

Step 1: Download my profile page

I could write Python to log into DataCamp.com for me and download my profile page, but for this step, I’ll just do it manually. In the site, manually navigate to the “My Learning Progress” link and then save the profile page to disk.

Step 2: Load the packages we’ll need

For this work, I’ll use BeautifulSoup, urllib.parse, urlretrieve (from urllib.request), and csv:


from bs4 import BeautifulSoup
import urllib.parse
from urllib.request import urlretrieve
import csv

Step 3: Open my saved profile page and load it into a “soup” object:


# Open in binary mode so BeautifulSoup can detect the page's encoding itself
with open('DataCamp.html', 'rb') as f:
    soup = BeautifulSoup(f, 'lxml')

Step 4: Do the hard work

The first thing I need to do is figure out where in the HTML the list of completed courses lives. After some digging around in the HTML, I determined that I need to look for a section element with a profile-courses class. Underneath that element are article nodes, one for each completed course. So, I’ll use BeautifulSoup to get me the list of those article nodes. Next, I’ll iterate through that node list and peel off the two values I’m interested in: the course title and the link to the statement of accomplishment. The course title is easy enough to find: it’s in an h4 tag under the article. The link to the statement of accomplishment is a little dodgier, though. It’s actually part of the query string in the LinkedIn link. No problem. I’ll just grab that link and split out the accomplishment link part. Since the accomplishment link is part of the query string, it’s URL encoded. So, to turn it back into a real boy, er, URL, I’ll use the unquote function of urllib.parse. I’ll write these values to a list for easier processing later:


# Each completed course is an <article> under the profile-courses section
completed_courses = soup.find('section', {'class': 'profile-courses'}).find_all('article')
completed_courses_list = [['course_name', 'certificate_url']]

for completed_course in completed_courses:
    # get_text also copes with an h4 that wraps its title in nested tags
    course_name = completed_course.find('h4').get_text(strip=True)
    # The certificate link is URL encoded inside the LinkedIn button's query string
    linkedin_url = completed_course.find('a', {'class': 'dc-btn--linkedin'})['href']
    cert_url = linkedin_url.split('&url=')[1]
    completed_courses_list.append([course_name, urllib.parse.unquote(cert_url)])
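
Splitting on '&url=' worked for the links on my profile page, but it would also drag along any query parameters that happen to follow the url value. A slightly more defensive sketch might let urllib.parse do the parsing instead. The helper name extract_cert_url and the example link below are made up for illustration; only the url parameter name comes from the code above. One caveat: parse_qs also turns a '+' into a space, which unquote does not do.

```python
from urllib.parse import urlsplit, parse_qs


def extract_cert_url(linkedin_url):
    """Return the decoded value of the 'url' query parameter, or None if absent."""
    query = urlsplit(linkedin_url).query
    # parse_qs percent-decodes values and maps each name to a list of values
    values = parse_qs(query).get('url')
    return values[0] if values else None


# Hypothetical link, shaped like the LinkedIn button's href
example = ('https://www.linkedin.com/profile/add?name=Intro%20to%20Python'
           '&url=https%3A%2F%2Fexample.com%2Fcerts%2F123.pdf&extra=1')
print(extract_cert_url(example))
```

With this helper, the trailing extra=1 parameter in the made-up link is left out of the certificate URL, whereas the split approach would keep it.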

Step 5: Download all my statements of accomplishment

Now that I have an easy list to work from, I’ll download all my certificates in one fell swoop:


for completed_course in completed_courses_list[1:]:
    urlretrieve(completed_course[1], '{0}.pdf'.format(completed_course[0]))
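
One thing to watch for: course titles can contain characters that aren’t legal in file names (a colon or a slash, for example, both of which Windows rejects), so using the raw title as the file name may fail. Here’s a rough sketch of how I might clean the titles first and, since I imported csv anyway, keep an index of what was downloaded. The helper names safe_filename and write_index and the sample rows are hypothetical.

```python
import csv
import io
import re


def safe_filename(title, extension='.pdf'):
    """Replace characters that Windows forbids in file names with underscores."""
    cleaned = re.sub(r'[<>:"/\\|?*]', '_', title).strip().rstrip('.')
    return (cleaned or 'untitled') + extension


def write_index(rows, handle):
    """Write the [course_name, certificate_url] rows (header included) as CSV."""
    csv.writer(handle).writerows(rows)


# Made-up rows in the same shape as completed_courses_list
rows = [['course_name', 'certificate_url'],
        ['Intro to Python: Part 1/2', 'https://example.com/certs/123.pdf']]

print(safe_filename(rows[1][0]))

buffer = io.StringIO()  # stand-in for open('courses.csv', 'w', newline='')
write_index(rows, buffer)
print(buffer.getvalue())
```

In the download loop, that would mean passing safe_filename(completed_course[0]) as the second argument to urlretrieve.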

 

Easy peasy!

Watermarking Jupyter Notebooks, Part 2

In a previous post, I explored different ways in which I might be able to add a watermark to my Jupyter Notebooks.  I also referenced an interesting approach that would allow me to a) watermark all my notebooks at once and b) only have to write CSS once and write it entirely outside of any notebook.  Well, I decided to go ahead and give that approach a try.  Here are my results.

Step 1: Create your custom.css

The post I found on theming your notebooks suggested that you should place your custom CSS in a custom.css file here: ~/.jupyter/custom/custom.css

The tilde is supposed to represent your home directory; although, in different installations of Jupyter Notebooks that I’ve seen, I’ve found the “home” directory for Jupyter to not necessarily be your Windows home directory.  At any rate, on my home computer, I found my Jupyter folder here: %USERPROFILE%\.jupyter

Interestingly, when I navigated to that folder, I found no “custom” directory.  So, I created one and then created a blank custom.css within it.
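
These manual steps could also be scripted. This is only a sketch: it assumes the config folder is either named by the JUPYTER_CONFIG_DIR environment variable or sits at .jupyter under your home directory, which matched my machine but may not match yours (running jupyter --config-dir should tell you the real location). The function name ensure_custom_css is my own.

```python
import os
import tempfile
from pathlib import Path


def ensure_custom_css(config_dir=None):
    """Create <config_dir>/custom/custom.css if missing and return its path."""
    if config_dir is None:
        # Assumption: fall back to ~/.jupyter when JUPYTER_CONFIG_DIR is not set
        config_dir = os.environ.get('JUPYTER_CONFIG_DIR', Path.home() / '.jupyter')
    css_path = Path(config_dir) / 'custom' / 'custom.css'
    css_path.parent.mkdir(parents=True, exist_ok=True)
    css_path.touch(exist_ok=True)  # leaves an existing file's contents alone
    return css_path


# Demo against a throwaway directory rather than the real config folder
with tempfile.TemporaryDirectory() as tmp:
    print(ensure_custom_css(tmp))
```

Calling ensure_custom_css() with no arguments would target the real config folder instead of the temporary one.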

Step 2: Add the necessary CSS to custom.css

Next, I opened my blank custom.css in Notepad++ and added the following CSS code (apologies for the horizontal scrolling, but I’m not entirely sure how to properly format SVG markup inside a CSS file):


div#notebook {
    background-image: url("data:image/svg+xml;utf8,<svg xmlns='http://www.w3.org/2000/svg' version='1.1' height='600px' width='600px'><text x='-350' y='500' fill='lightgray' font-size='80' font-family='Arial' transform='translate(30) rotate(-45 0 0)'>DadOverflow.com</text></svg>");
    background-repeat: repeat;
}
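
To sidestep the long line, and to make the watermark text easy to change, the rule could be generated with a few lines of Python. This is a sketch under assumptions: the function name watermark_css and its parameters are made up, and I percent-encode the SVG with urllib.parse.quote instead of embedding it raw after utf8 as the CSS above does. I have not checked the result in every browser.

```python
from urllib.parse import quote
from xml.sax.saxutils import escape


def watermark_css(text, fill='lightgray', font_size=80, size=600):
    """Build a div#notebook rule whose background is an SVG text watermark."""
    svg = (
        f"<svg xmlns='http://www.w3.org/2000/svg' version='1.1' "
        f"height='{size}px' width='{size}px'>"
        f"<text x='-350' y='500' fill='{fill}' font-size='{font_size}' "
        f"font-family='Arial' transform='translate(30) rotate(-45 0 0)'>"
        f"{escape(text)}</text></svg>"  # escape &, < and > in the watermark text
    )
    # Percent-encode the markup, leaving a few data-URI-friendly characters as-is
    data_uri = 'data:image/svg+xml,' + quote(svg, safe="/:='")
    return (
        'div#notebook {\n'
        f'    background-image: url("{data_uri}");\n'
        '    background-repeat: repeat;\n'
        '}\n'
    )


print(watermark_css('DadOverflow.com'))
```

The printed rule could then be pasted into custom.css, or written there directly from the script.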

Step 3: Start up Jupyter, create a new notebook, and profit!

Now, with your newly minted custom.css, launch Jupyter Notebook and start a new notebook…or even open up an existing one.  You’ll now see your watermark magically appear:

Pretty cool, eh?


© 2025 DadOverflow.com
