Musings of a dad with too much time on his hands and not enough to do. Wait. Reverse that.

Tag: conferences

Scraping the PyOhio Schedule

The twelfth annual PyOhio conference was held on July 27-28 and…it. was. awesome!

Now, when it comes to planning for a conference, I must admit that I’m a bit “old school.” A day or two before the gathering, I like to print out the schedule and carefully research each session so that I can choose the ones that best meet my work and personal objectives. Often, a conference will let you download a printable schedule; however, I didn’t find any such file on the PyOhio website. No matter, I can write some Python to scrape the schedule from the website and create my own CSV for printing. Here’s what I did:

Step 1: Import the requisite packages

import requests
from bs4 import BeautifulSoup
import csv

Step 2: Grab the schedule page

result = requests.get('https://www.pyohio.org/2019/events/schedule/')
soup = BeautifulSoup(result.content, 'lxml')

Step 3: Parse out the sessions

Unfortunately, I could only attend Saturday, so my code just focuses on pulling the Saturday sessions:

day_2_list = [['start_end', 'slot1', 'slot2', 'slot3', 'slot4']]
day_2 = soup.select('div.day')[1]  # get just Saturday
talks_section = day_2.find('h3', string='Keynotes, Talks, & Tutorials').parent

# iterate across each time block
for time_block in talks_section.select('div.time-block'):
    start_end = time_block.find('div', {'class': 'time-wrapper'}).get_text().replace('to', ' - ')
    time_rec = [start_end, '', '', '', '']
    # now, iterate across each slot within a time block.  a time block can have 1-4 time slots
    for slot in time_block.select('div.time-block-slots'):
        for i, card in enumerate(slot.select('div.schedule-item')):
            class_title = card.select_one('h3').get_text()
            presenter = (card.select('p')[0]).get_text()
            location = (card.select('p')[1]).get_text()
            time_rec[i+1] = '{0}\n{1}\n{2}'.format(class_title, presenter, location)
    day_2_list.append(time_rec)  # after grabbing each slot, write the block to my "day 2" list

Step 4: Write the scraped results to a CSV

csv_file = 'pyohio_20190727_schedule.csv'

with open(csv_file, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(day_2_list)
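
Since every cell packs a title, presenter, and location separated by newlines, it is worth checking that those multi-line cells survive the trip through the csv module. Here is a minimal sketch of that check. The rows are made-up stand-ins in the same shape as day_2_list, and an in-memory buffer takes the place of the real file:

```python
import csv
import io

# Hypothetical rows shaped like day_2_list: a header row, then one row per
# time block with up to four multi-line cells (title, presenter, location).
rows = [
    ['start_end', 'slot1', 'slot2', 'slot3', 'slot4'],
    ['9:00 - 9:45',
     'Sample Talk A\nSample Speaker\nRoom 1',
     'Sample Talk B\nAnother Speaker\nRoom 2',
     '', ''],
]

# An in-memory buffer stands in for the file; newline='' mirrors the open() call.
buffer = io.StringIO(newline='')
csv.writer(buffer).writerows(rows)

# Read the text back: the writer quotes cells containing newlines, so the
# reader should rebuild each one as a single multi-line cell.
buffer.seek(0)
round_tripped = list(csv.reader(buffer))
print(round_tripped[1][1])
```

If the rows read back match the originals, a spreadsheet program should show each session as one three-line cell.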

Sweet! Now I can choose just the right sessions to attend. Get my complete code here.

Taking Care of your Family Heirlooms

As a technologist (and nerd), I love going to technology conferences. Over the years, I’ve been to numerous Microsoft gatherings and a variety of other development and security seminars. Fortunately, all those events were paid for by employers past and present. I did attend one day at the Ohio Genealogical Conference a few years ago, but on my own dime and only after burning a day of vacation. In general, it’s hard to justify the high ticket and travel fees, not to mention finite vacation days, to attend genealogy and other non-work related, but interesting events.

All that said, I don’t know why it never occurred to me before, but several months ago, one of the genealogical podcasts I listen to alerted me to the fact that some of these conferences publish their sessions online, sometimes for free. So, while backing up several gigabytes of media files, I decided to take in the session “Taking Care of your Family Heirlooms” from the 2017 National Archives Virtual Genealogy Fair (I guess there are worse things on which the US government can squander taxpayer money). If you have an hour, check out the session, although the audio isn’t great. For a CliffsNotes version, here are my notes:

Assess your stuff

Over the years, my ancestors have amassed lots of paraphernalia and a lot of that material has filtered down to me. At some point, though, you have to make some hard choices about your artifacts: for every artifact, you should decide whether to keep it, sell it, throw it away, or give it away.

Document your stuff

For all the material you keep, you should document as much as possible. Ask all those “W” (and “H”) questions about each object: who, what, when, where, how. As in: who owned this artifact? What is it? What makes it so valuable? When was it acquired? Where was it acquired? How was it acquired? Etc. All artifacts should be documented in multiple ways. For example, photographs should be scanned and the electronic file documented in some fashion. Likewise, the physical photograph should be safely labeled. The presenter said that if you must apply a label directly on the artifact, use pencil instead of pen, as pencil does less harm.

Store your stuff sensibly

Wet basements and hot attics can do great harm to your heirlooms, as can sunlight, ultraviolet light, pests, family pets, and even dust. Ideally, you store your items safely on, say, the first floor of your home, where the environment is a little more consistently controlled. What sort of containers should you choose? For photos, documents, and such, the presenter recommended PAT-tested, alkaline-buffered containers and, even though such containers are expensive, avoid overfilling them, as that might compromise the contents. Old books like family Bibles (I have several of these) should be stored flat instead of upright like you’d find in a library, as gravity can be a harsh mistress to these worn tomes.

With metal objects, your greatest enemy is rust, and the problem apparently gets worse when metal heirlooms touch one another, so keep them dry, wrapped in acid-free paper, and stored so that they’re not touching other metal objects. During the question-and-answer period of the presentation, someone asked about preserving tin photos, of which I personally have a lot. Apparently, “tin” photos aren’t tin at all, but iron. The rule still applies, though: keep the photos from touching one another, keep them dry, stick them in a box, etc.

Display your stuff?

In general, the speaker wasn’t too keen on displaying one’s heirlooms given the damage sunlight and other environmental factors can inflict on your items (reminds me of how the US Declaration of Independence was hung on the wall of a patent office in DC for over 30 years, degrading it badly). She recommended only displaying them “on special occasions” or, alternatively, displaying a photocopy of the artifact. Photos could be mounted on PAT-approved paper and slipped into polyester sleeving for display. I actually did this with one of my grandmother’s old scrapbooks. I took the further step of printing out labels that I stuck to the photo paper underneath each photo…so I safely stored, displayed, and labeled these precious heirlooms all in one fell swoop!

How to handle your stuff

The basic rule of thumb here is to wash your hands before handling your artifacts. Some experts advocate cotton gloves; others advocate latex-free gloves. One disadvantage to gloves, though, is that they dull your sensitivity to the artifact you’re handling, possibly allowing you to damage it without realizing. Have stiff boards on hand on which you can lay your precious documents flat, and make sure you have clean, de-cluttered surfaces on which to work.

Digitizing your stuff

I strongly recommend digitizing as many of your keepsakes as possible. Certainly photos and slides. I do have several large photos, paintings, and posters for which I struggle to find solutions, since I’ve not been able to find a retail scanner for documents larger than letter-size. In the past, I’ve had to settle for scanning these items a section at a time and then using software to stitch the images together. FedEx stores have oversized scanners that I’ve used in the past, but I also want to look into building a rig for these purposes.

The presenter noted that sometimes, you’ll get the best digital product by farming out the work to a capable vendor–you may not even own the equipment necessary to digitize some of your material. Here is a list of questions the speaker recommended asking potential vendors:

  1. Will you perform a pilot test for me on one of my artifacts so I can be sure of the quality of your work?
  2. Do you do the work in-house or do you send the items out to another location?
  3. What type of file will I get back? With audio, for example, WAV audio files are excellent for archival purposes whereas MP3 files are ideal for sharing with other family members.
  4. Do you adjust your equipment to get the best quality product? If you’re digitizing vinyl albums, ask what needle sizes the vendor uses. If you’re digitizing audio tape, ask the vendor if they adjust the azimuth to get a better reproduction. You may not know these terms, but quality vendors should.

Further Resources

For further questions, you can always email the Archives folks at preservation@nara.gov or inquire@nara.gov. Other sites the speaker recommended include https://www.archives.gov/ and http://www.conservation-us.org/. For good backup strategies for your digital assets, take a look at this site: https://www.lockss.org/.

Going to CodeMash 2018

Who wouldn’t want to be in Sandusky, Ohio in January for some geeky learnin’? Plus: it sounds like that Polar Vortex may be retreating a little for the week, so, bonus!

It’s been something like 7-8 years since I last attended CodeMash. I’ve heard the venue has changed a lot and the conference has expanded significantly. Used to be that the thing would sell out within an hour of going live!

Anyway, I’m looking forward to attending. As with most conferences I patronize, I like to plan ahead by pre-selecting as many sessions as I can: often, I agonize over my decisions because I tend to find multiple interesting sessions happening at the same time. I want to do that agonizing ahead of time so I don’t waste a lot of time making decisions during the conference itself.

Unfortunately, I’m not finding the CodeMash Schedule page too helpful. I’m used to conference schedules where the time slots are on the Y axis and the sessions are listed across the X axis. That way, for, say, the 8am session, I can scroll horizontally across all the scheduled sessions and pick the most appealing one. The CodeMash Schedule page lists all the sessions vertically by their time slots. The data’s there: there just ought to be a better way to display it for my needs.

Well, there is: thanks to Python. I wrote a Jupyter Notebook to ingest the session data via the CodeMash API and then used pandas to pivot the data into a dataframe helpful to my needs. Then, I just exported this dataframe out to Excel and, voilà, I have my solution!
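
To give a feel for the pivot idea, here is a minimal sketch. It is not the actual notebook: the session records are made up, the field names (start, room, title) are assumptions rather than the real CodeMash API's, and plain dictionaries stand in for pandas so the snippet runs with nothing but the standard library.

```python
from collections import defaultdict

# Hypothetical session records; the real CodeMash API fields may differ.
sessions = [
    {'start': '08:00', 'room': 'Room A', 'title': 'Sample Session 1'},
    {'start': '08:00', 'room': 'Room B', 'title': 'Sample Session 2'},
    {'start': '09:15', 'room': 'Room A', 'title': 'Sample Session 3'},
]

# Group titles by start time, then by room (long format -> wide format).
grid = defaultdict(dict)
for session in sessions:
    grid[session['start']][session['room']] = session['title']

# One column per room and one row per time slot; a room with nothing
# scheduled in a slot gets an empty string.
rooms = sorted({session['room'] for session in sessions})
table = [['start'] + rooms]
for start in sorted(grid):
    table.append([start] + [grid[start].get(room, '') for room in rooms])

for row in table:
    print(' | '.join(row))
```

Each row of table is one time slot with the competing sessions side by side, which is the layout described above. The rows could be handed to csv.writer for printing, and in pandas the same reshaping would likely be a single DataFrame.pivot call.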

Now, off to spend a few hours deliberating over the offerings!

© 2024 DadOverflow.com
