1. The natsort module

Simple yet flexible natural sorting in Python.

natsort is a general utility for sorting lists naturally. “Naturally” has no precise definition, but the most common one is that numbers contained within a string should be sorted as numbers rather than as ordinary characters. If you need to present sorted output to a user, you probably want to sort it naturally.

natsort was initially created for sorting scientific output filenames that contained signed floating point numbers in the names. Plenty of algorithms could perform a natural sort on ints, but few could do so on floats; check out this StackOverflow question and its answers and links therein, this ActiveState forum, and of course this great article on natural sorting from CodingHorror.com for examples of what I mean. natsort was created to fill this gap, but has since expanded to handle just about any definition of a number, as well as other sorting customizations.

1.1. Quick Description

When you try to sort a list of strings that contain numbers, Python's built-in sort compares the strings lexicographically, so you might not get the results that you expect:

>>> a = ['2 ft 7 in', '1 ft 5 in', '10 ft 2 in', '2 ft 11 in', '7 ft 6 in']
>>> sorted(a)
['1 ft 5 in', '10 ft 2 in', '2 ft 11 in', '2 ft 7 in', '7 ft 6 in']

Notice that the result has the order (‘1’, ‘10’, ‘2’). This is because the list is sorted in lexicographical order, which treats digits the same way it treats letters (i.e. ‘b’, ‘ba’, ‘c’).
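The character-by-character comparison can be seen directly with the standard library. The snippet below only illustrates lexicographic ordering; it does not involve natsort, and the variable names are chosen for this example.

```python
# Strings are compared one character at a time by code point,
# so '10' sorts before '2' because '1' comes before '2'.
print(ord('1'), ord('2'))

lexicographic = '10' < '2'   # decided by the first character alone
numeric = 10 < 2             # the same values compared as integers

# The same rule applied to letters gives dictionary order.
letters = sorted(['c', 'ba', 'b'])
print(lexicographic, numeric, letters)
```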

natsort provides a function natsorted() that helps sort lists “naturally” (“naturally” is rather ill-defined, but in general it means sorting based on meaning and not on computer code point). Using natsorted() is simple:

>>> from natsort import natsorted
>>> a = ['2 ft 7 in', '1 ft 5 in', '10 ft 2 in', '2 ft 11 in', '7 ft 6 in']
>>> natsorted(a)
['1 ft 5 in', '2 ft 7 in', '2 ft 11 in', '7 ft 6 in', '10 ft 2 in']

natsorted() identifies numbers anywhere in a string and sorts them naturally. Below are some other things you can do with natsort (please see the Examples and Recipes for a quick start guide, or the natsort API for more details).
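To make the idea of "identifying numbers anywhere in a string" concrete, here is a minimal sketch that uses only the standard library. It is not natsort's implementation: the helper name simple_natural_key is hypothetical, and it handles only unsigned integers.

```python
import re

def simple_natural_key(text):
    """Split text into alternating non-digit and digit chunks.

    re.split with a capturing group keeps the digit runs, which always
    land at the odd indices; those are converted to int so that they
    compare numerically instead of character by character.
    """
    parts = re.split(r"(\d+)", text)
    return [int(part) if index % 2 else part for index, part in enumerate(parts)]

a = ['2 ft 7 in', '1 ft 5 in', '10 ft 2 in', '2 ft 11 in', '7 ft 6 in']
result = sorted(a, key=simple_natural_key)
print(result)
```

Because every key alternates str and int in the same positions, two keys never compare a string against a number.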

Note

natsorted() is designed to be a drop-in replacement for the built-in sorted() function. Like sorted(), natsorted() does not sort in-place. To sort a list and keep the result under the same name, you must explicitly assign the output back to that variable:

>>> a = ['2 ft 7 in', '1 ft 5 in', '10 ft 2 in', '2 ft 11 in', '7 ft 6 in']
>>> natsorted(a)
['1 ft 5 in', '2 ft 7 in', '2 ft 11 in', '7 ft 6 in', '10 ft 2 in']
>>> print(a)  # 'a' was not sorted; "natsorted" simply returned a sorted list
['2 ft 7 in', '1 ft 5 in', '10 ft 2 in', '2 ft 11 in', '7 ft 6 in']
>>> a = natsorted(a)  # Now 'a' will be sorted because the sorted list was assigned to 'a'
>>> print(a)
['1 ft 5 in', '2 ft 7 in', '2 ft 11 in', '7 ft 6 in', '10 ft 2 in']

Please see Generating a Reusable Sorting Key and Sorting In-Place for an alternate way to sort in-place naturally.

1.1.1. Sorting Versions

This is handled properly by default (as of natsort version 4.0.0):

>>> a = ['version-1.9', 'version-2.0', 'version-1.11', 'version-1.10']
>>> natsorted(a)
['version-1.9', 'version-1.10', 'version-1.11', 'version-2.0']

If you need to sort release candidates, please see Sorting with Alpha, Beta, and Release Candidates for a useful hack.

1.1.2. Sorting by Real Numbers (i.e. Signed Floats)

This is useful in scientific data analysis and was the default behavior of natsorted() for natsort version < 4.0.0. Use the realsorted() function:

>>> from natsort import realsorted, ns
>>> # Note that when interpreting as signed floats, the below numbers are
>>> #            +5.10,                -3.00,            +5.30,              +2.00
>>> a = ['position5.10.data', 'position-3.data', 'position5.3.data', 'position2.data']
>>> natsorted(a)
['position2.data', 'position5.3.data', 'position5.10.data', 'position-3.data']
>>> natsorted(a, alg=ns.REAL)
['position-3.data', 'position2.data', 'position5.10.data', 'position5.3.data']
>>> realsorted(a)  # shortcut for natsorted with alg=ns.REAL
['position-3.data', 'position2.data', 'position5.10.data', 'position5.3.data']
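The difference between the two interpretations comes down to what counts as "a number" when the string is split. The sketch below contrasts an unsigned-integer key with a signed-float key using only the standard library. The helper names int_key and real_key are hypothetical, the regular expressions are simplified (no exponents, for instance), and this is not how natsort itself is written.

```python
import re

def int_key(text):
    """Treat each run of digits as an unsigned integer."""
    parts = re.split(r"(\d+)", text)
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

def real_key(text):
    """Treat an optional sign, digits, and an optional fraction as one float."""
    parts = re.split(r"([-+]?\d+(?:\.\d+)?)", text)
    return [float(p) if i % 2 else p for i, p in enumerate(parts)]

a = ['position5.10.data', 'position-3.data', 'position5.3.data', 'position2.data']
as_ints = sorted(a, key=int_key)
as_reals = sorted(a, key=real_key)
print(as_ints)
print(as_reals)
```

With int_key the "-" stays part of the text, so 'position-3.data' sorts last; with real_key it becomes the sign of -3.0 and sorts first.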

1.1.3. Locale-Aware Sorting (or “Human Sorting”)

This is where the non-numeric characters are ordered based on their meaning, not on their ordinal value, and the locale-dependent thousands separator and decimal separator are accounted for in the number. This can be achieved with the humansorted() function:

>>> a = ['Apple', 'apple15', 'Banana', 'apple14,689', 'banana']
>>> natsorted(a)
['Apple', 'Banana', 'apple14,689', 'apple15', 'banana']
>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
'en_US.UTF-8'
>>> natsorted(a, alg=ns.LOCALE)
['apple15', 'apple14,689', 'Apple', 'banana', 'Banana']
>>> from natsort import humansorted
>>> humansorted(a)
['apple15', 'apple14,689', 'Apple', 'banana', 'Banana']

You may find you need to explicitly set the locale to get this to work (as shown in the example). Please see Possible Issues with humansorted() or ns.LOCALE and the Installation section below before using the humansorted() function.
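Since the requested locale may not be installed on every system, it can help to guard the call to locale.setlocale(). The sketch below uses only the standard library and sorts with locale.strxfrm rather than natsort. The helper name locale_sorted is hypothetical, and the fallback to the "C" locale is an assumption made for this example. No expected output is shown because the ordering depends on the platform's locale data.

```python
import locale

def locale_sorted(items, locale_name="en_US.UTF-8"):
    """Sort strings using the collation rules of locale_name, if available."""
    previous = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        # The requested locale is not installed; "C" is always available.
        locale.setlocale(locale.LC_COLLATE, "C")
    try:
        return sorted(items, key=locale.strxfrm)
    finally:
        # Restore the original collation setting for the rest of the program.
        locale.setlocale(locale.LC_COLLATE, previous)

words = ['Apple', 'apple', 'Banana', 'banana']
result = locale_sorted(words)
print(result)
```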

1.1.4. Further Customizing Natsort

If you need to use multiple algorithm modifiers at once (such as ns.REAL, ns.LOCALE, and ns.IGNORECASE), you can combine them using the bitwise OR operator (|). For example:

>>> a = ['Apple', 'apple15', 'Banana', 'apple14,689', 'banana']
>>> natsorted(a, alg=ns.REAL | ns.LOCALE | ns.IGNORECASE)
['Apple', 'apple15', 'apple14,689', 'Banana', 'banana']
>>> # The ns enum provides long and short forms for each option.
>>> ns.LOCALE == ns.L
True
>>> # You can customize the convenience functions, too.
>>> natsorted(a, alg=ns.REAL | ns.LOCALE | ns.IGNORECASE) == realsorted(a, alg=ns.L | ns.IC)
True
>>> natsorted(a, alg=ns.REAL | ns.LOCALE | ns.IGNORECASE) == humansorted(a, alg=ns.R | ns.IC)
True

All of the available customizations can be found in the documentation for the ns enum.
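For readers unfamiliar with options that combine through bitwise OR, the general pattern can be sketched with the standard library's enum.IntFlag. The class SortOption and its values are hypothetical and serve only to illustrate the mechanism; they are not natsort's ns definition.

```python
import enum

class SortOption(enum.IntFlag):
    """Each option occupies its own bit, so options can be OR-ed together."""
    REAL = 1
    LOCALE = 2
    IGNORECASE = 4
    # Short names bound to the same values become aliases of the long names.
    R = REAL
    L = LOCALE
    IC = IGNORECASE

combined = SortOption.REAL | SortOption.IGNORECASE
has_ignorecase = SortOption.IGNORECASE in combined
has_locale = SortOption.LOCALE in combined
same_member = SortOption.L is SortOption.LOCALE
print(int(combined), has_ignorecase, has_locale, same_member)
```

A function that receives such a value can test each bit independently, which is why any combination of options fits in a single alg-style argument.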

1.1.5. Sorting Mixed Types

You can mix and match int, float, and str (or unicode) types when you sort:

>>> a = ['4.5', 6, 2.0, '5', 'a']
>>> natsorted(a)
[2.0, '4.5', '5', 6, 'a']
>>> # On Python 2, sorted(a) would return [2.0, 6, '4.5', '5', 'a']
>>> # On Python 3, sorted(a) would raise an "unorderable types" TypeError
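The Python 3 failure, and one crude way around it, can be reproduced with the standard library. The sketch below converts every element to text before building a natural key. The helper name mixed_key is hypothetical, the approach only suits non-negative numbers, and it is not how natsort handles mixed types.

```python
import re

def mixed_key(value):
    """Build a natural sort key from the text form of any int, float, or str."""
    parts = re.split(r"(\d+)", str(value))
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

a = ['4.5', 6, 2.0, '5', 'a']

# On Python 3 the built-in sort cannot order str against int or float.
try:
    sorted(a)
    plain_sort_failed = False
except TypeError:
    plain_sort_failed = True

result = sorted(a, key=mixed_key)
print(plain_sort_failed, result)
```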

1.1.6. Handling Bytes on Python 3

natsort does not officially support the bytes type on Python 3, but convenience functions are provided that help you decode to str first:

>>> from natsort import as_utf8
>>> a = [b'a', 14.0, 'b']
>>> # On Python 2, natsorted(a) would work as expected.
>>> # On Python 3, natsorted(a) would raise a TypeError (bytes() < str())
>>> natsorted(a, key=as_utf8) == [14.0, b'a', 'b']
True
>>> a = [b'a56', b'a5', b'a6', b'a40']
>>> # On Python 2, natsorted(a) would work as expected.
>>> # On Python 3, natsorted(a) would return the same results as sorted(a)
>>> natsorted(a, key=as_utf8) == [b'a5', b'a6', b'a40', b'a56']
True
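The decode-first idea can be sketched without natsort. The helper below, whose name decode_then_key is hypothetical, decodes bytes as UTF-8 and then builds a simple unsigned-integer natural key. It is an illustration of the approach under those assumptions, not the implementation of as_utf8.

```python
import re

def decode_then_key(value):
    """Decode bytes to str (assuming UTF-8), then split out digit runs."""
    if isinstance(value, bytes):
        value = value.decode("utf-8")
    parts = re.split(r"(\d+)", value)
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

a = [b'a56', b'a5', b'a6', b'a40']
plain = sorted(a)                         # byte-by-byte comparison
natural = sorted(a, key=decode_then_key)  # numbers compared as numbers
print(plain)
print(natural)
```

The key is only used for comparison, so the sorted list still contains the original bytes objects.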

1.1.7. Generating a Reusable Sorting Key and Sorting In-Place

Under the hood, natsorted() works by generating a custom sorting key using natsort_keygen() and then passing that key to the built-in sorted(). You can use the natsort_keygen() function yourself to generate a custom sorting key and sort in-place using the list.sort() method.

>>> from natsort import natsort_keygen
>>> natsort_key = natsort_keygen()
>>> a = ['2 ft 7 in', '1 ft 5 in', '10 ft 2 in', '2 ft 11 in', '7 ft 6 in']
>>> natsorted(a) == sorted(a, key=natsort_key)
True
>>> a.sort(key=natsort_key)
>>> a
['1 ft 5 in', '2 ft 7 in', '2 ft 11 in', '7 ft 6 in', '10 ft 2 in']

All of the algorithm customizations mentioned in the Further Customizing Natsort section can also be applied to natsort_keygen() through the alg keyword option.
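The key-generator pattern itself is a function that returns a configured key function. The sketch below shows the pattern with the standard library only. The factory make_natural_key and its ignore_case parameter are hypothetical stand-ins for this example and do not mirror the signature of natsort_keygen().

```python
import re

def make_natural_key(ignore_case=False):
    """Return a key function configured once and reusable across sorts."""
    def key(text):
        if ignore_case:
            text = text.casefold()
        parts = re.split(r"(\d+)", text)
        return [int(p) if i % 2 else p for i, p in enumerate(parts)]
    return key

files = ['File10.txt', 'file2.txt', 'File1.txt']

case_sensitive = sorted(files, key=make_natural_key())

# list.sort() reorders the list in place and returns None.
files.sort(key=make_natural_key(ignore_case=True))
print(case_sensitive)
print(files)
```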

1.1.8. Other Useful Things

1.2. Installation

Installation of natsort is ultra-easy. Simply execute from the command line:

pip install natsort

You can also download the source from https://pypi.org/project/natsort/, or browse the git repository at https://github.com/SethMMorton/natsort.

If you choose to install from source, unzip the source archive, enter the directory, and type:

python setup.py install

If you wish to run the unit tests, enter:

python setup.py test

If you want to build this documentation, enter:

python setup.py build_sphinx

natsort requires Python 2.6 or greater, or Python 3.2 or greater.

The most efficient sorting can occur if you install the fastnumbers package, which helps with the string-to-number conversions. natsort will still run (efficiently) without the package, but if you need to squeeze out that extra juice it is recommended you include this as a dependency. natsort will not require (or check) that fastnumbers is installed.

It is recommended that you install PyICU if you wish to sort in a locale-dependent manner; see Possible Issues with humansorted() or ns.LOCALE for an explanation of why.
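If you want your own code to report whether these optional packages are present, one possible check uses the standard library's importlib (Python 3.4 or later). The helper name is_installed is hypothetical. The check relies on PyICU being imported under the module name icu.

```python
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None

has_fastnumbers = is_installed("fastnumbers")
has_pyicu = is_installed("icu")  # PyICU is imported as "icu"

print("fastnumbers available:", has_fastnumbers)
print("PyICU available:", has_pyicu)
```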

natsort comes with a shell script called natsort, and can also be called from the command line with python -m natsort. The command line script is only installed onto your PATH if you don’t install via a wheel. There is apparently a known bug in the wheel installation process that prevents entry points from being created.