Sorting strings: OpenOffice.org Calc, Explorer vs. Nautilus, dir vs. ls
(Written on 9:26 PM 9/11/2006 GMT+7)

While sorting my Japanese words in OOo Calc, I noticed that the katakana ア was sorted in among the hiragana あ entries. After some investigation, I concluded that OOo Calc doesn’t distinguish between a hiragana character and its corresponding katakana for sorting purposes. Uppercase and lowercase Latin letters are also treated as the same.
Therefore, the initial order determines the "sorted" order. For example, the following column won’t change if sorted:
| A |
| a |
But the same is true for this column:
| a |
| A |
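The behavior above can be imitated with a small Python sketch. It assumes a stable sort and a comparison key that ignores both case and the hiragana/katakana distinction; the `fold` helper is a hypothetical name of mine, not anything taken from OOo Calc, and I don't know how Calc actually implements its comparison.

```python
# Distance between the katakana block (U+30A1...) and the hiragana block (U+3041...).
KANA_OFFSET = 0x60

def fold(text: str) -> str:
    """Hypothetical sort key: ignore case and the hiragana/katakana distinction."""
    out = []
    for ch in text.casefold():
        # Katakana U+30A1..U+30F6 line up with hiragana U+3041..U+3096.
        if "\u30a1" <= ch <= "\u30f6":
            ch = chr(ord(ch) - KANA_OFFSET)
        out.append(ch)
    return "".join(out)

# sorted() is stable: items whose keys compare equal keep their input order.
print(sorted(["A", "a"], key=fold))    # ['A', 'a']
print(sorted(["a", "A"], key=fold))    # ['a', 'A']
print(sorted(["ア", "あ"], key=fold))  # ['ア', 'あ']
```

Because "A" and "a" (or ア and あ) produce the same key, the sort has no reason to move either of them, so whichever came first stays first.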
Explorer works the same way as OOo Calc, treating capital letters the same as their lowercase counterparts and hiragana the same as katakana.

However, the dir command sorts katakana after hiragana, which is inconsistent with Explorer:
E:\Temp\sorting test>dir
 Volume in drive E is Archive
 Volume Serial Number is A809-0E48

 Directory of E:\Temp\sorting test

09/08/2006  08:13 PM    <DIR>          .
09/08/2006  08:13 PM    <DIR>          ..
09/08/2006  07:47 PM                 0 Aa
09/08/2006  07:47 PM                 0 ab
09/08/2006  07:47 PM                 0 ba
09/08/2006  07:47 PM                 0 Bb
09/08/2006  07:47 PM                 0 あa
09/08/2006  07:47 PM                 0 いb
09/08/2006  07:47 PM                 0 アb
09/08/2006  07:47 PM                 0 イa
But the behavior will change if we use /o:n (sort by name):
E:\Temp\sorting test>dir /o:n
 Volume in drive E is Archive
 Volume Serial Number is A809-0E48

 Directory of E:\Temp\sorting test

09/08/2006  08:13 PM    <DIR>          .
09/08/2006  08:13 PM    <DIR>          ..
09/08/2006  07:47 PM                 0 Aa
09/08/2006  07:47 PM                 0 ab
09/08/2006  07:47 PM                 0 ba
09/08/2006  07:47 PM                 0 Bb
09/08/2006  07:47 PM                 0 あa
09/08/2006  07:47 PM                 0 アb
09/08/2006  07:47 PM                 0 イa
09/08/2006  07:47 PM                 0 いb
This is weird because by default dir already sorts Latin letters by name (in other words, the default behavior should match /o:n).
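Both dir listings can be reproduced in Python with two different sort keys. This is only a sketch under two assumptions of mine that I have not verified against Windows itself: that plain dir orders names by the code points of their uppercased form, and that /o:n additionally ignores the hiragana/katakana distinction. The `kana_fold` helper is a hypothetical name.

```python
# The eight test files, deliberately shuffled.
NAMES = ["Bb", "イa", "ab", "あa", "アb", "Aa", "いb", "ba"]

def kana_fold(text: str) -> str:
    """Lowercase the text and map katakana (U+30A1..U+30F6) onto hiragana."""
    return "".join(
        chr(ord(ch) - 0x60) if "\u30a1" <= ch <= "\u30f6" else ch
        for ch in text.casefold()
    )

# Assumption 1: plain `dir` compares uppercased names code point by code point.
default_order = sorted(NAMES, key=str.upper)

# Assumption 2: `dir /o:n` also treats hiragana and katakana as equal.
by_name_order = sorted(NAMES, key=kana_fold)

print(default_order)  # ['Aa', 'ab', 'ba', 'Bb', 'あa', 'いb', 'アb', 'イa']
print(by_name_order)  # ['Aa', 'ab', 'ba', 'Bb', 'あa', 'アb', 'イa', 'いb']
```

Under these assumptions the two orders agree for Latin letters and differ only for kana, because the whole katakana block sits after the hiragana block in Unicode.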
So how does Ubuntu 6.06 fare? I booted the Live CD and here’s Nautilus in action:

Total mess! Why are kana interspersed among the Latin letters? I couldn’t figure out how that program sorts…
ls (the console command "el-es") is no better:
ubuntu@ubuntu:/media/ntfs/Temp/sorting test$ ls -l
total 0
-r-xr-xr-x 1 root root 0 2006-09-08 12:47 あa
-r-xr-xr-x 1 root root 0 2006-09-08 12:47 イa
-r-xr-xr-x 1 root root 0 2006-09-08 12:47 Aa
-r-xr-xr-x 1 root root 0 2006-09-08 12:47 ab
-r-xr-xr-x 1 root root 0 2006-09-08 12:47 いb
-r-xr-xr-x 1 root root 0 2006-09-08 12:47 アb
-r-xr-xr-x 1 root root 0 2006-09-08 12:47 ba
-r-xr-xr-x 1 root root 0 2006-09-08 12:47 Bb
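One guess that happens to reproduce the ls listing exactly: the comparison ignores kana in its first pass, so only the Latin letters decide the primary order, and the full name is used just to break ties. The Python sketch below tests that guess against the eight file names. It is a hypothesis, not a description of how ls or the locale actually works, and `latin_only_key` is a hypothetical helper.

```python
NAMES = ["Aa", "ab", "ba", "Bb", "あa", "いb", "アb", "イa"]

def latin_only_key(name: str):
    # Assumption: non-ASCII characters (the kana) carry no weight at first,
    # so the primary key is just the lowercased ASCII part of the name.
    primary = "".join(ch for ch in name if ord(ch) < 128).casefold()
    # Ties on the primary key fall back to the raw code points of the name.
    return (primary, name)

ls_order = sorted(NAMES, key=latin_only_key)
print(ls_order)
# ['あa', 'イa', 'Aa', 'ab', 'いb', 'アb', 'ba', 'Bb']
```

With this key, あa and イa both reduce to "a" and land before "aa", while いb and アb both reduce to "b" and land between "ab" and "ba", which is the interleaving seen in the listing.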
I’ve reported those bugs to Ubuntu’s Launchpad.










