Lately, I’ve had to run calculations on, and chart, file sizes expressed in different units of measure. For example, I’ll have file sizes in gigabytes but need to plot them against disk capacity in terabytes. I’m a little surprised that Python doesn’t have a ready way to solve this problem. There is the hurry.filesize package, but its function expects a value in bytes. What if you only have a gigabytes value to pass? Well, I came up with my own solution, largely inspired by similar solutions. Here’s my function:

import re


def convert_filesize(size, desired_uom, factor=1024):
    """Converts a provided computer data storage value to a different unit of measure.

    Arguments:
    size -- a string of the current size (ex. '1.5 GB')
    desired_uom -- a string of the new unit of measure (ex. 'TB')
    factor -- the factor used in the conversion (default 1024)

    Returns a tuple of the converted size as a float and as a formatted string.
    """
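    uom_options = ['B', 'KB', 'MB', 'GB', 'TB']
    # Pull the unit of measure (the letters) out of the size string
    supplied_uom = re.search(r'[a-zA-Z]+', size)
    if supplied_uom:
        supplied_uom = supplied_uom.group()
    else:
        raise ValueError('size argument did not contain expected unit of measure')

    # Reject units the lookups below would not find, with a clearer message
    for uom in (supplied_uom, desired_uom):
        if uom not in uom_options:
            raise ValueError('unsupported unit of measure: {0}'.format(uom))

    # The position of a unit in uom_options is the power of the factor it represents
    supplied_size = float(size.replace(supplied_uom, ''))
    supplied_size_in_bytes = supplied_size * (factor ** uom_options.index(supplied_uom))
    converted_size = supplied_size_in_bytes / (factor ** uom_options.index(desired_uom))
    return converted_size, '{0:,.2f} {1}'.format(converted_size, desired_uom)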

Then, you’ll use it like so:

# binary (non-SI) conversion: each unit is 1024 times the previous one
print('Using default conversion factor of 1024:')
print(convert_filesize('1024 B', 'KB'))
print(convert_filesize('1.5 GB', 'MB'))
print(convert_filesize('59.3 GB', 'TB'))

print('\nUsing this IEC/SI conversion factor of 1000:')
# decimal conversion: the IEC reserves KB, MB, GB for powers of 1000 and
# KiB, MiB, GiB for powers of 1024 (https://www.convertunits.com/from/MB/to/GB)
print(convert_filesize('1024 B', 'KB', factor=1000))
print(convert_filesize('1.5 GB', 'MB', factor=1000))
print(convert_filesize('59.3 GB', 'TB', factor=1000))

This produces the following results:

Using default conversion factor of 1024:
(1.0, '1.00 KB')
(1536.0, '1,536.00 MB')
(0.05791015625, '0.06 TB')

Using this IEC/SI conversion factor of 1000:
(1.024, '1.02 KB')
(1500.0, '1,500.00 MB')
(0.0593, '0.06 TB')
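
The first element of each tuple is a plain float, so it can feed straight into calculations or a chart. When the values are already numbers rather than strings, the same index arithmetic can be applied directly. The following is only a minimal sketch: the rescale helper, the UNITS list, and the sample sizes and capacity are all made up for illustration.

```python
UNITS = ['B', 'KB', 'MB', 'GB', 'TB']  # same ordering as uom_options


def rescale(value, from_uom, to_uom, factor=1024):
    """Rescale a numeric value between units (hypothetical helper)."""
    # A positive step count moves to a smaller unit, a negative one to a larger unit
    steps = UNITS.index(from_uom) - UNITS.index(to_uom)
    return value * factor ** steps


file_sizes_gb = [59.3, 128.0, 512.0]  # made-up sample values
capacity_tb = 2.0                     # made-up disk capacity

# Put everything in terabytes so the sizes and the capacity share one axis
sizes_tb = [rescale(s, 'GB', 'TB') for s in file_sizes_gb]
share_of_disk = [s / capacity_tb for s in sizes_tb]
```

With the default factor of 1024, 512 GB comes out as 0.5 TB, which is a quarter of the made-up 2 TB capacity.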

I’m sure there’s much room for improvement, but this routine seems to meet my needs for now.
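
As one possible direction for that improvement, here is a hedged sketch of a variant that accepts units in any letter case and tells decimal units (KB, MB) apart from binary ones (KiB, MiB) instead of taking a factor argument. The names convert_filesize_strict, BYTES_PER_UNIT, and SIZE_PATTERN are hypothetical, and treating KB as 1000 bytes is an assumption that differs from the default of 1024 used above.

```python
import re

# Bytes per unit, keyed by the upper-cased unit name.
# Assumption: decimal units are powers of 1000, binary units powers of 1024.
BYTES_PER_UNIT = {'B': 1}
for power, prefix in enumerate('KMGT', start=1):
    BYTES_PER_UNIT[prefix + 'B'] = 1000 ** power
    BYTES_PER_UNIT[prefix + 'IB'] = 1024 ** power

# A number, optional whitespace, then the letters of the unit
SIZE_PATTERN = re.compile(r'\s*([0-9]*\.?[0-9]+)\s*([A-Za-z]+)\s*')


def convert_filesize_strict(size, desired_uom):
    """Convert a size string such as '1.5 GiB' to desired_uom, returning a float."""
    match = SIZE_PATTERN.fullmatch(size)
    if not match:
        raise ValueError('could not parse size: {0!r}'.format(size))
    number, supplied_uom = match.groups()
    try:
        size_in_bytes = float(number) * BYTES_PER_UNIT[supplied_uom.upper()]
        return size_in_bytes / BYTES_PER_UNIT[desired_uom.upper()]
    except KeyError as err:
        # Turn the failed dictionary lookup into a clearer error
        raise ValueError('unknown unit of measure: {0}'.format(err.args[0])) from None


print(convert_filesize_strict('1.5 GiB', 'MiB'))
print(convert_filesize_strict('1.5 gb', 'mb'))
```

Keeping the bytes-per-unit values in a dictionary means a single call can mix the two conventions, for example converting a GiB value to MB.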