Python3 - TypeError: encoding without a string argument

I thought I'd document this because, although the cause and fix are fairly simple, searching for the error string "encoding without a string argument" returns a lot of hits for a similarly worded but different error: "string argument without an encoding".

An example backtrace might be:

Traceback (most recent call last):
  File "./profiler.py", line 346, in <module>
    meta['config_files']['pdns'] = zip_and_compress(read_file_content('/etc/powerdns/pdns.conf'))
  File "./profiler.py", line 289, in zip_and_compress
    gz = gzip.compress(bytes(s,"utf-8"))
TypeError: encoding without a string argument

The example code is fairly simple:

import base64
import gzip

def read_file_content(path):
    ''' Read the entirety of a file into a variable
    '''
    file_content = None
    with open(path, 'rb') as content_file:
        file_content = content_file.read()

    return file_content

def zip_and_compress(s):
    ''' Config files can get quite sizeable. To keep the size of our output DB down
    we gzip and then ascii armour them
    '''
    gz = gzip.compress(bytes(s,"utf-8"))
    return base64.b64encode(gz).decode("utf-8")


zip_and_compress(read_file_content('/etc/powerdns/pdns.conf'))

The cause of the TypeError is that passing an encoding to bytes() tells it to encode its first argument into a bytes object, so it expects that argument to be a string (str).

However, in read_file_content we're opening the file in binary mode, so read() returns a bytes object rather than a string (the codebase this is sourced from uses the same function to read in a sqlite database, which will fail if you let read() try to decode it):

with open(path, 'rb') as content_file:
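To make the difference between the two similar-looking errors concrete, here's a minimal sketch that calls bytes() with each combination of input type and encoding. The try_bytes helper is purely illustrative and not part of the original profiler code.

```python
def try_bytes(*args):
    """Call bytes(*args) and return the result, or the error text on failure."""
    try:
        return bytes(*args)
    except TypeError as exc:
        return f"TypeError: {exc}"


# str plus an encoding: the normal, working case
print(try_bytes("abc", "utf-8"))

# bytes plus an encoding: the error this post is about
print(try_bytes(b"abc", "utf-8"))

# str with no encoding: the similarly worded but different error
print(try_bytes("abc"))

# bytes with no encoding: works, and simply copies the input
print(try_bytes(b"abc"))
```

So the error in this post means "you gave me an encoding, but nothing that needs encoding", whereas the more commonly found one means the opposite.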

There are two ways to fix this. In the example above, the file being read is a simple text file, so we could just switch to reading it in text mode:

with open(path, 'r') as content_file:

However, doing this means that zip_and_compress would still fail in the same manner if it were later passed a bytes object (perhaps from later reading a file in binary mode).

So, it's better to adjust zip_and_compress() to include a simple type check:

def zip_and_compress(s):
    ''' Config files can get quite sizeable. To keep the size of our output DB down
    we gzip and then ascii armour them
    '''
    if isinstance(s, bytes):
        gz = gzip.compress(s)
    else:
        gz = gzip.compress(bytes(s,"utf-8"))

    return base64.b64encode(gz).decode("utf-8")

Now, we won't try to encode something that is already a bytes object, and all will work happily.
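As a quick sanity check, the sketch below round-trips both a str and a bytes input through the type-checked function. The decompress_and_unzip helper and the sample config line are assumptions added for illustration; they don't appear in the original codebase.

```python
import base64
import gzip


def zip_and_compress(s):
    """Gzip the input, then base64 encode it. Accepts str or bytes."""
    if isinstance(s, bytes):
        gz = gzip.compress(s)
    else:
        gz = gzip.compress(s.encode("utf-8"))
    return base64.b64encode(gz).decode("utf-8")


def decompress_and_unzip(armoured):
    """Hypothetical inverse of zip_and_compress. Always returns bytes."""
    return gzip.decompress(base64.b64decode(armoured))


# Made-up sample content standing in for a real config file
text = "launch=gmysql\n"
raw = text.encode("utf-8")

# Both input types should come back as the same bytes
print(decompress_and_unzip(zip_and_compress(text)) == raw)
print(decompress_and_unzip(zip_and_compress(raw)) == raw)
```

Note that the inverse always hands back bytes, so the caller still has to decide whether to decode the result, which is the same str versus bytes decision that caused the original error.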