Alpha-numeric sorting

The original version (release 1.2)

This is a sort method I devised in late 1999 or early 2000. It fixes a really annoying thing about lexicographical sorting: numbers usually aren't sorted in a useful manner. To get a good sort order, one has to name files such that the numbers in them all have the same length, i.e. by adding zeroes at the start of the lower numbers (ugly, unnatural). In February 2000 I provided patches for FreeBSD's and OpenBSD's 'ls' so that files no longer have to be named this way. There wasn't much interest at the time, but since then interest in this sort (or rather, compare) function has grown a lot and I've seen lots of references to it. (I had not seen anything like it when I introduced it, although it may have been proposed elsewhere; one can never be sure :))
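To make the problem concrete, here is a minimal sketch in C of a plain lexicographical sort, using only the standard qsort() and strcmp(). The helper names lexicographic_cmp and sort_lexicographic are made up for this illustration and are not part of any 'ls' source.

```c
#include <stdlib.h>
#include <string.h>

/* qsort() callback: plain byte-wise (lexicographical) comparison. */
static int lexicographic_cmp(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;

    return strcmp(*sa, *sb);
}

/* Hypothetical helper: sort an array of names the way a plain sort does. */
void sort_lexicographic(const char **names, size_t count)
{
    qsort(names, count, sizeof names[0], lexicographic_cmp);
}
```

Given rev-0.0.2.tgz, rev-0.0.10.tgz and rev-0.0.9.tgz, this puts rev-0.0.10.tgz first, because the character '1' compares lower than '2' and '9' and the comparison never looks at the number as a whole.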

Updates and coming version 2

2013-12-26: It works on UTF-8 names too, but there is an issue with the choice of 'alpha is bigger than numeric' or vice versa, and possibly with cases where a character set spans a boundary, so that some of it fits into the first n bytes and the rest needs the next byte. I will have a look at that soon.

2014-01-11: I have an idea for sub-ordering, again treating alpha and numeric separately. Description and patch to come.

2014-01-27: I will further finalise the original idea with a small tweak that fixes an issue with the original sort. With both additions the sort/compare function will be released as 'alphanumeric v2'.

Examples, for version 1.2:

1. output from 'ls':

2. output from 'ls -la':

standard:

drwxr-xr-x  2 wouter  user  512 Apr  6 19:48 .
drwxr-xr-x  4 wouter  user  512 Apr  6 19:48 ..
-rw-r--r--  1 wouter  user    0 Apr  6 19:46 rev-0.0.1.tgz
-rw-r--r--  1 wouter  user    0 Apr  6 19:46 rev-0.0.10.tgz
-rw-r--r--  1 wouter  user    0 Apr  6 19:46 rev-0.0.2.tgz
-rw-r--r--  1 wouter  user    0 Apr  6 19:46 rev-0.0.20.tgz
-rw-r--r--  1 wouter  user    0 Apr  6 19:46 rev-0.0.9.tgz
-rw-r--r--  1 wouter  user    0 Apr  6 19:48 rev-0.3.10.tgz
-rw-r--r--  1 wouter  user    0 Apr  6 19:47 rev-0.3.4.tgz

alpha-numeric:

drwxr-xr-x  2 wouter  user  512 Apr  6 19:48 .
drwxr-xr-x  4 wouter  user  512 Apr  6 19:48 ..
-rw-r--r--  1 wouter  user    0 Apr  6 19:46 rev-0.0.1.tgz
-rw-r--r--  1 wouter  user    0 Apr  6 19:46 rev-0.0.2.tgz
-rw-r--r--  1 wouter  user    0 Apr  6 19:46 rev-0.0.9.tgz
-rw-r--r--  1 wouter  user    0 Apr  6 19:46 rev-0.0.10.tgz
-rw-r--r--  1 wouter  user    0 Apr  6 19:46 rev-0.0.20.tgz
-rw-r--r--  1 wouter  user    0 Apr  6 19:47 rev-0.3.4.tgz
-rw-r--r--  1 wouter  user    0 Apr  6 19:48 rev-0.3.10.tgz

3. output from 'ls -1' with a long list of files:

standard:

Brahms__hongaarse_dansen-1.mp3
Brahms__hongaarse_dansen-10.mp3
Brahms__hongaarse_dansen-11.mp3
Brahms__hongaarse_dansen-12.mp3
Brahms__hongaarse_dansen-13.mp3
Brahms__hongaarse_dansen-14.mp3
Brahms__hongaarse_dansen-15.mp3
Brahms__hongaarse_dansen-16.mp3
Brahms__hongaarse_dansen-17.mp3
Brahms__hongaarse_dansen-18.mp3
Brahms__hongaarse_dansen-19.mp3
Brahms__hongaarse_dansen-2.mp3
Brahms__hongaarse_dansen-20.mp3
Brahms__hongaarse_dansen-21.mp3
Brahms__hongaarse_dansen-3.mp3
Brahms__hongaarse_dansen-4.mp3
Brahms__hongaarse_dansen-5.mp3
Brahms__hongaarse_dansen-6.mp3
Brahms__hongaarse_dansen-7.mp3
Brahms__hongaarse_dansen-8.mp3
Brahms__hongaarse_dansen-9.mp3

alpha-numeric:

Brahms__hongaarse_dansen-1.mp3
Brahms__hongaarse_dansen-2.mp3
Brahms__hongaarse_dansen-3.mp3
Brahms__hongaarse_dansen-4.mp3
Brahms__hongaarse_dansen-5.mp3
Brahms__hongaarse_dansen-6.mp3
Brahms__hongaarse_dansen-7.mp3
Brahms__hongaarse_dansen-8.mp3
Brahms__hongaarse_dansen-9.mp3
Brahms__hongaarse_dansen-10.mp3
Brahms__hongaarse_dansen-11.mp3
Brahms__hongaarse_dansen-12.mp3
Brahms__hongaarse_dansen-13.mp3
Brahms__hongaarse_dansen-14.mp3
Brahms__hongaarse_dansen-15.mp3
Brahms__hongaarse_dansen-16.mp3
Brahms__hongaarse_dansen-17.mp3
Brahms__hongaarse_dansen-18.mp3
Brahms__hongaarse_dansen-19.mp3
Brahms__hongaarse_dansen-20.mp3
Brahms__hongaarse_dansen-21.mp3

This is really useful for e.g. ftp servers, where finding the latest package is often a nightmare; it makes kludges like 'latest is xx.xx.xx', meant to help you look, unnecessary. It is also simply a natural sort order: you don't have to add leading zeroes to get the proper sort outcome. That is useful for e.g. a list of photos that one can name in natural order, such as vacation-1.jpg, vacation-2.jpg, ..., vacation-11.jpg, which will be shown in the right order with my sort.
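For readers who want a feel for how such a compare function can work, here is a rough sketch in C. It is not the version 1.2 source and may differ from it in details; it only illustrates the general idea of comparing runs of digits by value and everything else byte by byte. The names alnum_cmp, alnum_cmp_qsort and sort_alnum are made up for this sketch, and how digits order against other characters is simply left to their byte values here, which is an assumption.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sketch of an alpha-numeric compare. Returns <0, 0 or >0 like strcmp().
 * Runs of digits are compared by numeric value, everything else by byte.
 * Assumption: leading zeroes are ignored, so "a01" and "a1" compare equal.
 */
int alnum_cmp(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;

    while (*p != '\0' && *q != '\0') {
        if (isdigit(*p) && isdigit(*q)) {
            const unsigned char *ps, *qs;
            size_t plen, qlen;
            int diff;

            /* Leading zeroes do not change the value; skip them. */
            while (*p == '0')
                p++;
            while (*q == '0')
                q++;
            ps = p;
            qs = q;
            while (isdigit(*p))
                p++;
            while (isdigit(*q))
                q++;
            plen = (size_t)(p - ps);
            qlen = (size_t)(q - qs);

            /* More significant digits means a larger number. */
            if (plen != qlen)
                return plen < qlen ? -1 : 1;
            /* Same length: the digit strings order like the numbers. */
            diff = memcmp(ps, qs, plen);
            if (diff != 0)
                return diff;
        } else {
            /* Not two digit runs: plain byte comparison. */
            if (*p != *q)
                return *p < *q ? -1 : 1;
            p++;
            q++;
        }
    }
    if (*p == *q)
        return 0;                   /* both strings ended */
    return *p == '\0' ? -1 : 1;     /* the shorter string sorts first */
}

/* qsort() callback wrapping alnum_cmp() for an array of strings. */
static int alnum_cmp_qsort(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;

    return alnum_cmp(*sa, *sb);
}

/* Hypothetical helper: sort an array of names alpha-numerically. */
void sort_alnum(const char **names, size_t count)
{
    qsort(names, count, sizeof names[0], alnum_cmp_qsort);
}
```

Calling sort_alnum() on vacation-11.jpg, vacation-2.jpg and vacation-1.jpg gives vacation-1.jpg, vacation-2.jpg, vacation-11.jpg. Digit runs are compared by length and then by content rather than converted with something like strtol(), so very long numbers cannot overflow.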

Source code of version 1.2

To email me go to the email page


