Musings of a dad with too much time on his hands and not enough to do. Wait. Reverse that.

Tag: python (Page 26 of 26)

Working around Excel

Every-so-often I find myself having to do some processing with a Microsoft Excel file–grouping data, performing different statistical operations, etc. Doing much of this work within Excel itself is just a pain, so I would always open the file and save the applicable worksheets to CSV files.  But even manually saving the worksheets to CSV was a pain, so I devised a way to do the whole thing with PowerShell.

Unfortunately, PowerShell still has no convenient Convert-ExcelToCSV cmdlet…so I wrote my own. For added convenience, I built my function as a module and even added an Import-Module statement in my profile. Here are some quick steps on how I did that:

  1. Under your %userprofile%\documents directory, make sure you have a WindowsPowerShell directory. If not, create one.
  2. Under that WindowsPowerShell directory, create a modules directory, if one doesn’t already exist.
  3. Finally, under the modules directory, create a directory called Convert-ExcelToCSV.
  4. In that Convert-ExcelToCSV directory, drop a copy of my Convert-ExcelToCSV.psm1 module.
  5. Back under the WindowsPowerShell folder, create a file named Microsoft.PowerShell_profile.ps1.
  6. Edit Microsoft.PowerShell_profile.ps1 and add the statement Import-Module Convert-ExcelToCSV

(For more on PowerShell profiles, check out this Scripting Guy post)

So, that’s pretty cool. Now, you can open up a PowerShell command shell and save your Excel files to CSV in a one-liner:

Convert-ExcelToCSV -FileName "C:\some\path\my_excel_file.xlsx"

But we can still make this even more convenient: how about, when I right-click on an Excel file, I get the option to convert the sucker to CSV right there? Well, we can do that by editing the Windows Registry.

According to my research, in order to cover the older XLS file versions as well as the newer XLSX versions, you have to edit two slightly different paths under the HKEY_CLASSES_ROOT hive. Check out my registry file for enabling this feature for the older file type and my other file for enabling this feature on new versions of Excel files.

To make these registry modifications, you can download my REG files and run them on your machine.

Warning: editing your Registry is a big deal. I’ve tested these on my system and they work, but make sure you review the modifications first and make sure you’re comfortable with everything before proceeding.

With these modifications, you can now convert your Excel files to CSVs with ease.

 

Python makes it even easier

So, I had to do a lot of work to get to a point where I can easily convert my Excel files to CSVs for easier processing down the line. Well, guess what? Python makes it even simpler with pandas.  Check out the function read_excel.  That is just awesome!

So, either way you go–PowerShell or Python–you have options to free yourself from some of the frustrations of Excel!

Going to CodeMash 2018

Who wouldn’t want to be in Sandusky, Ohio in January for some geeky learnin’? Plus: it sounds like that Polar Vortex may be retreating a little for the week, so, bonus!

It’s been something like 7-8 years since I last attended CodeMash. I’ve heard the venue has changed a lot and the conference has expanded significantly. Used to be that the thing would sell out within an hour of going live!

Anyway, I’m looking forward to attending. As with most conferences I patronize, I like to plan ahead by pre-selecting as many sessions as I can: often, I agonize over my decisions because I tend to find multiple, interesting sessions happening at the same time. I want to do that agonizing ahead of time so I don’t waste a lot of time making decisions during the conference itself.

Unfortunately, I’m not finding the CodeMash Schedule page too helpful. I’m used to conference schedules where the time slots are on the Y axis and sessions listed across the X axis. That way, for, say the 8am session, I can scroll horizontally across all the scheduled sessions and pick the most appealing one. The CodeMash Scheduling page lists all the sessions vertically by their time slots. The data’s there: there just ought to be a better way to display it for my needs.

Well, there is: thanks to Python. I wrote a Jupyter Notebook to ingest the session data via the CodeMash API and then used pandas to pivot the data into a dataframe helpful to my needs. Then, I just exported this dataframe out to Excel and, voilà, I have my solution!

Now, off to spend a few hours deliberating over the offerings!

Newer posts »

© 2025 DadOverflow.com

Theme by Anders NorenUp ↑