Python3 - TypeError: encoding without a string argument
I thought I'd document this because, although the cause and fix are fairly simple, searching for the error string "encoding without a string argument" returns a lot of hits for a similarly worded but different error: "string argument without an encoding".
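Since the two messages are so easy to mix up, here is a minimal sketch that triggers each of them directly. The helper name bytes_error is hypothetical and exists only for this illustration; it is not part of the original code.

```python
def bytes_error(*args):
    """Call bytes(*args) and return the TypeError message, or None if it succeeds.

    Hypothetical helper, used only for this illustration.
    """
    try:
        bytes(*args)
    except TypeError as exc:
        return str(exc)
    return None


# bytes input plus an encoding: the error this post is about
print(bytes_error(b"already bytes", "utf-8"))

# str input with no encoding: the similarly worded, but different, error
print(bytes_error("a string"))

# str input plus an encoding is the valid combination, so no error is raised
print(bytes_error("a string", "utf-8"))
```

The first call fails because an encoding only makes sense when there is a string to encode; the second fails because a string cannot be turned into bytes without one.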
An example backtrace might be:
    Traceback (most recent call last):
      File "./profiler.py", line 346, in <module>
        meta['config_files']['pdns'] = zip_and_compress(read_file_content('/etc/powerdns/pdns.conf'))
      File "./profiler.py", line 289, in zip_and_compress
        gz = gzip.compress(bytes(s,"utf-8"))
    TypeError: encoding without a string argument
The example code is fairly simple:
    import base64
    import gzip

    def read_file_content(path):
        ''' Read the entirety of a file into a variable
        '''
        file_content = None
        # Binary mode: read() returns a bytes object rather than a str
        with open(path, 'rb') as content_file:
            file_content = content_file.read()
        return file_content

    def zip_and_compress(s):
        ''' Config files can get quite sizeable. To keep the size of our output DB down
        we gzip and then ascii armour them
        '''
        gz = gzip.compress(bytes(s,"utf-8"))
        return base64.b64encode(gz).decode("utf-8")

    # Raises TypeError: encoding without a string argument
    zip_and_compress(read_file_content('/etc/powerdns/pdns.conf'))
The cause of the TypeError "encoding without a string argument" is that we're telling bytes() to encode a variable into a bytes object, and it only accepts an encoding when its input is a string. However, in read_file_content() we're opening the file in binary mode, so read() returns a bytes object instead (the codebase this is sourced from uses the same function to read in a SQLite database, which would fail if you let read() try to decode it):

    with open(path, 'rb') as content_file:
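To see the difference between the two modes in isolation, here is a small sketch. It uses a temporary file with placeholder content rather than the real /etc/powerdns/pdns.conf, and the variable names are illustrative only.

```python
import os
import tempfile

# Write a small throwaway text file (placeholder content, not a real config)
with tempfile.NamedTemporaryFile("w", suffix=".conf", delete=False) as tmp:
    tmp.write("example-setting=yes\n")
    path = tmp.name

# Binary mode hands back the raw bytes
with open(path, "rb") as content_file:
    binary_content = content_file.read()

# Text mode decodes the file and hands back a str
with open(path, "r", encoding="utf-8") as content_file:
    text_content = content_file.read()

os.remove(path)

print(type(binary_content).__name__)
print(type(text_content).__name__)

# Encoding the str works; asking bytes() to "encode" the bytes object does not
encoded = bytes(text_content, "utf-8")
try:
    bytes(binary_content, "utf-8")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The content of the file is identical in both cases; only the type returned by read() changes, and that is what bytes() objects to.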
There are two ways to fix this. In the example above, the file being read is a simple text file, so we could just switch to reading it in text mode:

    with open(path, 'r') as content_file:
However, doing this means that zip_and_compress() would still fail in the same way if it were later passed a bytes object (perhaps from reading another file in binary mode). So it's better to adjust zip_and_compress() to include a simple type check:
    def zip_and_compress(s):
        ''' Config files can get quite sizeable. To keep the size of our output DB down
        we gzip and then ascii armour them
        '''
        if isinstance(s, bytes):
            gz = gzip.compress(s)
        else:
            gz = gzip.compress(bytes(s,"utf-8"))
        return base64.b64encode(gz).decode("utf-8")
Now we won't try to encode a bytes object into a bytes object, and everything works happily.
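As a quick sanity check, the sketch below feeds both a str and a bytes object through the type-checked function and reverses the process. The decompress_and_unzip helper and the sample input are assumptions added for illustration; they do not appear in the original codebase.

```python
import base64
import gzip


def zip_and_compress(s):
    """gzip the input, then base64 encode it. Accepts either str or bytes."""
    if isinstance(s, bytes):
        gz = gzip.compress(s)
    else:
        gz = gzip.compress(bytes(s, "utf-8"))
    return base64.b64encode(gz).decode("utf-8")


def decompress_and_unzip(armoured):
    """Hypothetical inverse: strip the base64 armour, then gunzip. Returns bytes."""
    return gzip.decompress(base64.b64decode(armoured))


# Placeholder content standing in for a config file
text_input = "example-setting=yes\n"
bytes_input = text_input.encode("utf-8")

# Both input types should survive the round trip unchanged
print(decompress_and_unzip(zip_and_compress(text_input)) == bytes_input)
print(decompress_and_unzip(zip_and_compress(bytes_input)) == bytes_input)
```

Note that the inverse always returns bytes, so a caller that wants text back would still need to decode the result itself.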