W.H.Scholten | WHS: programming projects » Alpha-numeric sorting
This is a sort method I devised around late 1999 or early 2000. It fixes a really annoying thing about lexicographical sorting: numbers usually aren't sorted in a useful manner. To get a good sort order, one has to name files such that the numbers in them have the same length, i.e. by adding zeroes at the start of a lower number (ugly, unnatural). In February 2000 I provided patches for FreeBSD's and OpenBSD's 'ls' so that files no longer have to be named this way. There wasn't much interest at the time, but since then the interest in this sort (or rather, compare) function has grown a lot and I've seen lots of references to it. (I had not seen anything like it when I introduced it, although it may have been proposed elsewhere; one can never be sure :))
2013-12-26: It works on utf-8 names too, but there is an issue with the choice of 'alpha is bigger than numeric' or vice versa, and possibly with cases such as a character set spanning a byte boundary, so that some of its characters fit into the first n bytes while the rest need the next byte. I will have a look at that soon.
2014-1-11: I have an idea for sub-ordering, again treating alpha and numeric separately. Description and patch to come.
2014-1-27: I will further finalise the original idea with a small tweak that fixes an issue with the original sort. With both additions the sort/compare function will be released as 'alphanumeric v2'.
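The 'alpha is bigger than numeric' choice from the 2013-12-26 note can be illustrated with a small Python sketch. This is only one possible reading of that choice, not the code from the 'ls' patches and not the planned 'alphanumeric v2'; the function name `chunk_key` and the `digits_first` flag are made up for the example, and the utf-8 byte-boundary question is not addressed here.

```python
import re

def chunk_key(name, digits_first=True):
    """Hypothetical sort key: split a name into digit runs and non-digit runs.

    digits_first decides what happens when a numeric run meets an alpha run
    at the same position: True ranks the numeric run lower, False ranks the
    alpha run lower. Both directions are assumptions for illustration.
    """
    num_rank, alpha_rank = (0, 1) if digits_first else (1, 0)
    key = []
    for chunk in re.findall(r"[0-9]+|[^0-9]+", name):
        if "0" <= chunk[0] <= "9":
            # Numeric run: compared by value, not by characters.
            key.append((num_rank, int(chunk), ""))
        else:
            # Alpha run: compared as an ordinary string.
            key.append((alpha_rank, 0, chunk))
    return key

names = ["readme.txt", "10-intro.txt"]
print(sorted(names, key=chunk_key))
# ['10-intro.txt', 'readme.txt']
print(sorted(names, key=lambda n: chunk_key(n, digits_first=False)))
# ['readme.txt', '10-intro.txt']
```

The rank is stored as the first element of each tuple, so a numeric run is never compared directly with a string.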
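standard (ls column output, read down each column):
rev-0.0.1.tgz   rev-0.0.2.tgz   rev-0.0.9.tgz   rev-0.3.4.tgz
rev-0.0.10.tgz  rev-0.0.20.tgz  rev-0.3.10.tgz
alpha-numeric (ls column output, read down each column):
rev-0.0.1.tgz   rev-0.0.9.tgz   rev-0.0.20.tgz  rev-0.3.10.tgz
rev-0.0.2.tgz   rev-0.0.10.tgz  rev-0.3.4.tgz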
standard:
drwxr-xr-x 2 wouter user 512 Apr 6 19:48 .
drwxr-xr-x 4 wouter user 512 Apr 6 19:48 ..
-rw-r--r-- 1 wouter user 0 Apr 6 19:46 rev-0.0.1.tgz
-rw-r--r-- 1 wouter user 0 Apr 6 19:46 rev-0.0.10.tgz
-rw-r--r-- 1 wouter user 0 Apr 6 19:46 rev-0.0.2.tgz
-rw-r--r-- 1 wouter user 0 Apr 6 19:46 rev-0.0.20.tgz
-rw-r--r-- 1 wouter user 0 Apr 6 19:46 rev-0.0.9.tgz
-rw-r--r-- 1 wouter user 0 Apr 6 19:48 rev-0.3.10.tgz
-rw-r--r-- 1 wouter user 0 Apr 6 19:47 rev-0.3.4.tgz
alpha-numeric:
drwxr-xr-x 2 wouter user 512 Apr 6 19:48 .
drwxr-xr-x 4 wouter user 512 Apr 6 19:48 ..
-rw-r--r-- 1 wouter user 0 Apr 6 19:46 rev-0.0.1.tgz
-rw-r--r-- 1 wouter user 0 Apr 6 19:46 rev-0.0.2.tgz
-rw-r--r-- 1 wouter user 0 Apr 6 19:46 rev-0.0.9.tgz
-rw-r--r-- 1 wouter user 0 Apr 6 19:46 rev-0.0.10.tgz
-rw-r--r-- 1 wouter user 0 Apr 6 19:46 rev-0.0.20.tgz
-rw-r--r-- 1 wouter user 0 Apr 6 19:47 rev-0.3.4.tgz
-rw-r--r-- 1 wouter user 0 Apr 6 19:48 rev-0.3.10.tgz
standard:
Brahms__hongaarse_dansen-1.mp3
Brahms__hongaarse_dansen-10.mp3
Brahms__hongaarse_dansen-11.mp3
Brahms__hongaarse_dansen-12.mp3
Brahms__hongaarse_dansen-13.mp3
Brahms__hongaarse_dansen-14.mp3
Brahms__hongaarse_dansen-15.mp3
Brahms__hongaarse_dansen-16.mp3
Brahms__hongaarse_dansen-17.mp3
Brahms__hongaarse_dansen-18.mp3
Brahms__hongaarse_dansen-19.mp3
Brahms__hongaarse_dansen-2.mp3
Brahms__hongaarse_dansen-20.mp3
Brahms__hongaarse_dansen-21.mp3
Brahms__hongaarse_dansen-3.mp3
Brahms__hongaarse_dansen-4.mp3
Brahms__hongaarse_dansen-5.mp3
Brahms__hongaarse_dansen-6.mp3
Brahms__hongaarse_dansen-7.mp3
Brahms__hongaarse_dansen-8.mp3
Brahms__hongaarse_dansen-9.mp3
alpha-numeric:
Brahms__hongaarse_dansen-1.mp3
Brahms__hongaarse_dansen-2.mp3
Brahms__hongaarse_dansen-3.mp3
Brahms__hongaarse_dansen-4.mp3
Brahms__hongaarse_dansen-5.mp3
Brahms__hongaarse_dansen-6.mp3
Brahms__hongaarse_dansen-7.mp3
Brahms__hongaarse_dansen-8.mp3
Brahms__hongaarse_dansen-9.mp3
Brahms__hongaarse_dansen-10.mp3
Brahms__hongaarse_dansen-11.mp3
Brahms__hongaarse_dansen-12.mp3
Brahms__hongaarse_dansen-13.mp3
Brahms__hongaarse_dansen-14.mp3
Brahms__hongaarse_dansen-15.mp3
Brahms__hongaarse_dansen-16.mp3
Brahms__hongaarse_dansen-17.mp3
Brahms__hongaarse_dansen-18.mp3
Brahms__hongaarse_dansen-19.mp3
Brahms__hongaarse_dansen-20.mp3
Brahms__hongaarse_dansen-21.mp3
This is really useful for e.g. ftp servers, where finding the latest package is often a nightmare; it makes kludges like a 'latest is xx.xx.xx' note to help you look unnecessary. Also, it really is a natural sort order: you don't have to add leading zeroes to get the proper result. That is useful for e.g. a list of photos, which one can name in natural order, such as vacation-1.jpg, vacation-2.jpg, ..., vacation-11.jpg, and which my sort will show in the right order.
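To make the idea concrete, here is a minimal Python sketch of an alpha-numeric sort key. It is not the C code from the 'ls' patches, only an illustration of the general technique: split each name into runs of digits and runs of other characters, and compare digit runs by numeric value. The name `alnum_key` is made up for the example, and ranking a numeric run below an alpha run at the same position is an assumption.

```python
import re

def alnum_key(name):
    """Hypothetical sort key: digit runs compare as integers, the rest as text."""
    key = []
    for chunk in re.findall(r"[0-9]+|[^0-9]+", name):
        if "0" <= chunk[0] <= "9":
            # Numeric run: 10 sorts after 9, no leading zeroes needed.
            key.append((0, int(chunk), ""))
        else:
            # Alpha run: plain string comparison.
            key.append((1, 0, chunk))
    return key

photos = ["vacation-11.jpg", "vacation-2.jpg", "vacation-1.jpg", "vacation-10.jpg"]

print(sorted(photos))
# ['vacation-1.jpg', 'vacation-10.jpg', 'vacation-11.jpg', 'vacation-2.jpg']
print(sorted(photos, key=alnum_key))
# ['vacation-1.jpg', 'vacation-2.jpg', 'vacation-10.jpg', 'vacation-11.jpg']
```

In this sketch names that differ only in leading zeroes, such as file-01 and file-1, get equal keys and keep their input order, since Python's sort is stable. A real compare function would need its own tie-break rule for that case.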
Last modified: Mon Jan 27 22:20:41 CET 2014