Musings of a dad with too much time on his hands and not enough to do. Wait. Reverse that.

Month: June 2018 (Page 2 of 3)

Why do genealogy?

Dick Eastman recently posted a thought-provoking piece on why people devote large portions of their lives to recording their family histories. Eastman correctly asserted that genealogy demands time and attention to detail. It can be expensive and require sacrifice to ensure you’re as accurate as possible. Meanwhile, the rest of our families sit bewildered as to why we would find seemingly mundane ancestors and events so fascinating.

In his post, Eastman postulated on the different motivations of amateur genealogists and identified a dichotomy in the endeavor between family historians and simple name gatherers. Name gatherers fill out pedigree charts with names and dates and little more. Family historians go much deeper, looking for the back stories and nuances that give our families character. The metaphor I’ve always used is one of human anatomy: your names and dates compose the skeleton of your family tree. But skeletons aren’t that interesting to look at. The pictures, biographical information, and stories add flesh to that skeleton and turn it into something truly attractive.

I have no issues with the name gatherer: at least, he’s taking the time to document that which might otherwise be lost. I started as a name gatherer, though I’ve since fallen into the deep end of amateur genealogy. Have you taken the genealogical red pill yet? If not, here are a few reasons why you should consider the hobby:

If not you, who?

I’m quite certain you and your family, living and dead, have achieved noteworthy accomplishments. These accomplishments may have been recorded in random newspapers, trophy cases, or elsewhere but they’ll never be collected in one compendium unless someone takes the time to do so. Who’s going to do that? It’s time to get off the bench and get into the game!

The ol’ Kid’s Family Report

If you have children, it’s almost a certainty that one day they’ll approach you with a school assignment to write about their family history. Imagine the hero you’ll be if you can, within minutes, generate a report for your child detailing the last five or six generations of your family.

Know your medical history

Is there a history of some particular illness in your family? That knowledge might help you adjust habits in your life to avoid participating in that history. You might help your children in that regard, as well. Your family’s medical history is a detail you can uncover as a family historian. Related to my earlier point, my daughter took an anatomy class this year in which one of her assignments was to produce a report of her family’s medical history. Because I had been documenting this information for years, I was able to quickly generate a report for her going back four or five generations of the different medical conditions endured by our ancestors.

Help you better understand your own strengths and weaknesses

Were your ancestors inventive and entrepreneurial? Were they consistent and hard working? Did they have a tendency to take risks or did they live more on the safe side? Understanding the general dispositions of your ancestors might help you feel more confident to start that new business or take on a rather risky task–or maybe just play it safe.

Write the history

If you write your family’s history, you control the narrative. If you have particular heroes in your family, you can choose to emphasize their achievements. You have the power to influence future generations. As Uncle Ben once said, “with great power comes great responsibility.” Obviously, honesty and fairness are critical to incorporate in your narrative. This is your opportunity to paint an appropriate picture of your family for future generations.

Meet heretofore unknown family members

Documenting your family tree will reveal family members you never knew you had–perhaps even in your own town. These people are sure to provide a perspective on your ancestors you may be unaware of.

Connect with a community of creative and hard-working genealogists

My fellow amateur genealogists always seem to surprise me with the creative and tech-savvy ways they solve their genealogical mysteries. Here I am–a professional technologist–and some retired grandmother is showing me a Google search technique I’ve never seen before or demonstrating a use of certain genealogical software I didn’t know existed. Genealogy really does require tenacity, energy, and an attention to detail that is commonplace in this community. It can be refreshing to brush elbows with such people.

Inject your own creativity into your work

Every family historian must be a name gatherer first. The family historian, though, goes well beyond name gathering to truly “flesh out” his tree. Here, I think, is a great opportunity for fun and creativity. For example, I videotape virtually every family event I attend, especially reunions, and will occasionally edit reunion footage together into fun family artifacts. I’ve interviewed several family members and both recorded and transcribed those interviews. I try to record interesting details like each member’s job history and even their childhood heroes. Here is a chance to be creative and really represent your relatives in interesting and unconventional ways.

Channel your inner data nerd

This one may not be for everyone, but if you have data nerdist tendencies, what better data to explore than that of your own family? Analyze family migrations or life expectancy or occupation choices. Your family data is probably completely untapped and filled with interesting trends and outliers.

Time is fleeting

I was lucky enough to catch the genealogy bug while still in college. For the next several years, I would spend one week every summer at my grandfather’s farm, computer in front of me, Grandpa to one side, keying in hundreds of handwritten pedigree charts he had created and maintained over the previous several decades. I was truly a name gatherer at that point. My grandfather’s efforts saved me countless hours of research. He helped me answer questions that might have forever gone unanswered. Because I started as early as I did, I was able to interview three of my grandparents and several great aunts and uncles and collect information that might otherwise have been lost forever. At the very least, for time’s sake, start your name gathering today!

If you decide you want to get started, make sure to check out Dick Eastman’s excellent “getting started” guide.

Parsing my DataCamp.com Accomplishments

I’m a big fan of DataCamp.com. I’m in my second year with the training site and learning valuable data analysis skills all the time.

When you complete a course on the site, they usually send you an email of congratulations, along with a link to a certificate of your accomplishment and a handy link to add your certificate to your LinkedIn profile. I’ve completed multiple courses and added several to my profile; however, I know I’ve missed a few here and there. If you go to your profile page in DataCamp, you’ll see a page listing the different topics your training has covered so far, the tracks you’ve completed, and the courses you’ve completed. Each completed course includes a LinkedIn button allowing you to easily attach that completed course to your LinkedIn profile. That’s all well and good, but I’d also like to be able to download my certificates of completion for each course. It’d be great if DataCamp had a single “download” button that would allow me to download all my certificates of accomplishment at once. No matter: I can use Python to do that. Here’s how I solved that problem:

Step 1: Download my profile page

I could write Python to log into DataCamp.com for me and download my profile page, but for this step, I’ll just do it manually. On the site, navigate to the “My Learning Progress” link and then save the profile page to disk.

Step 2: Load the packages we’ll need

For this work, I’ll use the BeautifulSoup package, along with urllib.parse, urlretrieve (from urllib.request), and csv from the standard library:


from bs4 import BeautifulSoup
import urllib.parse
from urllib.request import urlretrieve
import csv

Step 3: Open my saved profile page and load it into a “soup” object:


with open('DataCamp.html') as f:
    soup = BeautifulSoup(f, 'lxml')

Step 4: Do the hard work

The first thing I need to do is figure out where in the HTML the list of completed courses lives. After some digging around in the HTML, I determined that I need to look for a section element with a profile-courses class. Underneath that element are article nodes, one for each completed course. So, I’ll use BeautifulSoup to get me the list of those article nodes. Next, I’ll iterate through that node list and peel off the two values I’m interested in: the course title and the link to the statement of accomplishment. The course title is easy enough to find: it’s in an h4 tag under the article. The link to the statement of accomplishment is a little dodgier, though. It’s actually part of the query string in the LinkedIn link. No problem. I’ll just grab that link and split out the accomplishment link part. Since the accomplishment link is part of the query string, it’s URL encoded. So, to turn it back into a real boy, er, URL, I’ll use the unquote function of urllib.parse. I’ll write these values to a list for easier processing later:


# Each completed course is an <article> under the <section class="profile-courses"> element
completed_courses = soup.find('section', {'class': 'profile-courses'}).find_all('article')
completed_courses_list = [['course_name', 'certificate_url']]

for completed_course in completed_courses:
    course_name = completed_course.find('h4').string
    linkedin_url = completed_course.find('a', {'class': 'dc-btn--linkedin'})['href']
    # The certificate link is URL encoded inside the LinkedIn link's query string
    cert_url = linkedin_url.split('&url=')[1]
    completed_courses_list.append([course_name, urllib.parse.unquote(cert_url)])
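One caveat with splitting on '&url=': if the LinkedIn link ever carried more parameters after the certificate address, they would come along for the ride. Here is a minimal sketch of an alternative that lets urllib.parse pull out the parameter instead. It assumes the certificate address sits in a url query parameter, as described above; the function name extract_cert_url and the sample link are made up for illustration:

```python
from urllib.parse import urlsplit, parse_qs


def extract_cert_url(linkedin_url):
    """Return the decoded 'url' query parameter, or None if it is missing."""
    query = urlsplit(linkedin_url).query
    # parse_qs decodes percent-encoding, so no separate unquote call is needed
    values = parse_qs(query).get('url')
    return values[0] if values else None


# Hypothetical LinkedIn link shaped like the ones described above
sample = ('https://www.linkedin.com/profile/add?name=Intro+to+Python'
          '&url=https%3A%2F%2Fexample.com%2Fcerts%2F12345.pdf&trk=dc')
print(extract_cert_url(sample))
```

Because parse_qs already decodes the value, this version would stand in for both the split and the unquote steps.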

Step 5: Download all my statements of accomplishment

Now that I have an easy list to work from, I’ll download all my certificates in one fell swoop:


for completed_course in completed_courses_list[1:]:
    urlretrieve(completed_course[1], '{0}.pdf'.format(completed_course[0]))
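One thing to watch for: the course title becomes the file name, and titles can contain characters such as colons or slashes that aren’t allowed in file names on Windows. A small sketch of a helper that could clean up the title first; safe_filename is a hypothetical name, and the character list covers the usual Windows restrictions rather than every possible edge case:

```python
import re


def safe_filename(course_name, extension='.pdf'):
    """Swap characters that Windows rejects in file names for underscores."""
    cleaned = re.sub(r'[<>:"/\\|?*]', '_', course_name).strip()
    return cleaned + extension


# Made-up course title containing a colon and a slash
print(safe_filename('Intro to SQL: Joins/Unions'))
```

The result could then be passed as the second argument to urlretrieve in place of the formatted course title.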

 

Easy peasy!
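The csv package from Step 2 never actually gets used in the steps above. One possible use for it is saving the course list to disk for later reference. This is only a sketch: the rows below are stand-in data in the same shape as completed_courses_list, and save_course_list and the output file name are assumptions for illustration:

```python
import csv

# Stand-in data in the same shape as completed_courses_list (values are made up)
completed_courses_list = [
    ['course_name', 'certificate_url'],
    ['Intro to Python', 'https://example.com/certs/1.pdf'],
    ['Intermediate SQL', 'https://example.com/certs/2.pdf'],
]


def save_course_list(rows, path):
    """Write the header row and course rows to a CSV file."""
    # newline='' keeps the csv module from writing blank lines on Windows
    with open(path, 'w', newline='', encoding='utf-8') as f:
        csv.writer(f).writerows(rows)


save_course_list(completed_courses_list, 'completed_courses.csv')
```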

More handy PowerShell snippets

In another installment of “handy PowerShell snippets”, I offer a few more I’ve used on occasion:

Comparing documents in PowerShell

WinMerge is a great tool for identifying differences between files, but if you want to automate such a process, PowerShell’s Compare-Object is an excellent choice.

Step 1: Load the documents you wish to compare


$first_doc = cat "c:\somepath\file1.txt"
$second_doc = cat "c:\somepath\file2.txt"

Step 2: Perform your comparison.
Note that Compare-Object returns a “<=” to indicate that a given value was found in the first file but not the second, and a “=>” to indicate that a given value was found in the second file but not the first. If you add the -IncludeEqual switch, it also returns a “==” for values found in both files.


$items_in_first_list_not_found_in_second = ( Compare-Object -ReferenceObject $first_doc -DifferenceObject $second_doc | where { $_.SideIndicator -eq "<=" } | % { $_.InputObject } )

Step 3: Analyze your results and profit!
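For anyone who works in Python as well, roughly the same idea can be sketched with sets. This is not a port of Compare-Object: sets ignore duplicates and ordering, so treat it as an approximation. The function name compare_lines and the sample values are made up, and the dictionary keys simply borrow the side indicators described above:

```python
def compare_lines(first_lines, second_lines):
    """Group values by the side indicator Compare-Object would report."""
    first, second = set(first_lines), set(second_lines)
    return {
        '<=': sorted(first - second),   # only in the first list
        '=>': sorted(second - first),   # only in the second list
        '==': sorted(first & second),   # in both lists
    }


# Made-up sample values standing in for the lines of two files
result = compare_lines(['a', 'b', 'c'], ['b', 'c', 'd'])
print(result['<='])
```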

One note of warning: In my experience, Compare-Object doesn’t do well comparing nulls. To avoid these circumstances, when I import the files I wish to compare, I’ll explicitly remove such troublesome values.


$filtered_doc = ( Import-Csv "c:\somepath\somedoc.csv" | where { $null -ne $_.SomeCol } | % { $_.SomeCol } )

 

Join a list of items into a single, comma-delimited line

Sometimes I’ll have a list of items in a file that I’ll need to collapse into a single, delimited line. Here’s a one-liner that will do that:


(cat "c:\somepath\somefile.csv") -join ","

 

Use a configuration file with a PowerShell script

A lot of times, PowerShell devs will either declare all their variables at the top of their scripts or in some sort of a custom configuration file that they load into their scripts. Here’s another option: how about leveraging the .NET framework’s configuration system?

If you’ve ever developed a .NET application, you’re already well aware of how to use configuration files. You can actually use that same strategy with PowerShell. For example, suppose you’ve built up a configuration file like so:


<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <appSettings>
    <add key="test_key" value="dadoverflow.com is awesome and i'm going to tell my friends all about it"/>
  </appSettings>
</configuration>

You can then load that config file into your PowerShell script with the following:


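# Make sure the System.Configuration assembly is loaded (Windows PowerShell / .NET Framework)
Add-Type -AssemblyName System.Configuration
$script_path = $MyInvocation.MyCommand.Path

# OpenExeConfiguration reads the file named <script path>.config
$my_config = [System.Configuration.ConfigurationManager]::OpenExeConfiguration($script_path)
$my_config_val = $my_config.AppSettings.Settings.Item("test_key").Value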

One note: your PowerShell script and config file will need to share the same name. If your PowerShell script is called dadoverflow_is_awesome.ps1, then you’ll want to name your config file dadoverflow_is_awesome.ps1.config.

Here’s a bonus: Yes, it might be easier to just declare your variables at the top of your file and forgo the extra work of crafting such a config file. However, what if one of your configuration values is a password? By leveraging .NET’s configuration system you also get the power to encrypt values in your config file and hide them from prying eyes…but that’s a discussion that merits its own blog post, so stay tuned.


© 2024 DadOverflow.com
