4.12. Help With Bytes On Python 3

natsort officially does not support sorting bytes, because too much can go wrong when automating conversion between bytes and str. Rather than give up on bytes entirely, natsort provides three functions that make it easy to decode bytes to str so that sorting is possible.

natsort.decoder(encoding)

Return a function that can be used to decode bytes to unicode.

Parameters: encoding (str) – The codec to use for decoding. This must be a valid unicode codec.
Returns: A function that takes a single argument and attempts to decode it using the supplied codec. Any UnicodeErrors are raised. If the argument was not of bytes type, it is simply returned as-is.
Return type: decode_function

See also

as_ascii(), as_utf8()

Examples

>>> from natsort import decoder, natsorted
>>> f = decoder('utf8')
>>> f(b'bytes') == 'bytes'
True
>>> f(12345) == 12345
True
>>> # On Python 3, without decoder this would return [b'a10', b'a2']
>>> natsorted([b'a10', b'a2'], key=decoder('utf8')) == [b'a2', b'a10']
True
>>> # On Python 3, without decoder this would raise a TypeError.
>>> natsorted([b'a10', 'a2'], key=decoder('utf8')) == ['a2', b'a10']
True
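
The behaviour described for decoder() can be pictured as a small closure. The following is a minimal sketch written from the description above, not natsort's actual implementation; the name make_decoder is hypothetical and only the standard library is used.

```python
def make_decoder(encoding):
    """Hypothetical stand-in for natsort.decoder, written from its description."""
    def decode(value):
        # Only bytes are decoded; anything else is returned unchanged.
        if isinstance(value, bytes):
            # Strict decoding: invalid input raises UnicodeDecodeError.
            return value.decode(encoding)
        return value
    return decode


decode_utf8 = make_decoder("utf8")
print(decode_utf8(b"bytes"))   # bytes input is decoded to str
print(decode_utf8(12345))      # non-bytes input passes through untouched

try:
    make_decoder("ascii")(b"\xff")  # byte outside the ASCII range
except UnicodeDecodeError as exc:
    print("decode failed:", exc.reason)
```

Because the returned function accepts any object, it can be passed directly as a key, which is how the natsorted examples above use it.
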

natsort.as_ascii(s)

Function to decode an input with the ASCII codec, or return as-is.

Parameters: s – Any object.
Returns: If the input was of type bytes, the return value is a str decoded with the ASCII codec. Otherwise, the input is returned unchanged.
Return type: output

See also

decoder()
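
To illustrate what decoding with the ASCII codec implies, here is a minimal sketch using only the standard library. The helper as_ascii_sketch is a hypothetical stand-in, not natsort's own code, and it assumes decoding errors propagate in the same way as described for decoder().

```python
def as_ascii_sketch(s):
    """Hypothetical stand-in: decode bytes as ASCII, return anything else as-is."""
    return s.decode("ascii") if isinstance(s, bytes) else s


plain = as_ascii_sketch(b"a10")     # pure ASCII bytes decode cleanly to str
untouched = as_ascii_sketch("a2")   # str input is returned unchanged

try:
    # UTF-8 encoded "é" contains bytes outside the ASCII range.
    as_ascii_sketch("é".encode("utf-8"))
    strict = False
except UnicodeDecodeError:
    strict = True

print(plain, untouched, strict)
```

If the data may contain non-ASCII bytes, a UTF-8 decoder is likely the safer choice.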

natsort.as_utf8(s)

Function to decode an input with the UTF-8 codec, or return as-is.

Parameters: s – Any object.
Returns: If the input was of type bytes, the return value is a str decoded with the UTF-8 codec. Otherwise, the input is returned unchanged.
Return type: output

See also

decoder()
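
A function of this shape is typically used as a sort key, so that a mixed list of bytes and str can be compared. The sketch below uses a hypothetical stand-in, as_utf8_sketch, and the built-in sorted() rather than natsort itself, so the ordering is plain lexicographic. It is only meant to show the key mechanics.

```python
def as_utf8_sketch(s):
    """Hypothetical stand-in: decode bytes as UTF-8, return anything else as-is."""
    return s.decode("utf-8") if isinstance(s, bytes) else s


# A mix of str and bytes; comparing them directly raises TypeError on Python 3.
names = ["zebra", "café".encode("utf-8"), b"apple"]

# The key is used for comparison only; the original elements are kept in the result.
ordered = sorted(names, key=as_utf8_sketch)
print(ordered)
```

As in the decoder() examples, the bytes objects appear unchanged in the output because the key affects only how elements are compared.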