[Zebralist] Sorting vowel on Zebra by ICUchain

Natapone Charsombut natapone at gmail.com
Wed Jun 24 19:31:28 CEST 2009


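Dear Team,

I am using Koha 3 and Zebra with an ICU chain, and I have a problem with
the sort order for Thai. Thai words are written with letters and vowels,
and a vowel can be written in front of the letter it belongs to. Zebra
currently returns result sets with all terms that start with a letter
first, followed by the terms that start with a vowel. In the correct
order, we sort on the letter first and only then consider the vowel,
even when the vowel comes first in the written word. For example: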

To avoid font problems, I will use English letters and digits to stand
in for Thai letters and vowels:
[A-Z] as Thai letters
[1-9] as Thai vowels

Current order from Zebra:
ABCD
ACDV
B1AB
1ABC
2ACD

The correct order should be:
ABCD
ACDV
1ABC
2ACD
B1AB

The ICU chain should transform the sort keys as follows:
1ABC -> A1BC
2ACD -> A2CD
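
To make the intended behaviour concrete, here is a minimal sketch in
Python of the ordering I am after. It is not Zebra or ICU configuration,
only an illustration of the rule. It assumes the stand-in alphabet above
(A-Z as letters, 1-9 as vowels), that every letter ranks before every
vowel, and that only a vowel at the very start of a term is moved. The
names RANK, reorder and sort_key are made up for this example.

```python
import re
import string

# Stand-in alphabet from the example: A-Z play Thai letters, 1-9 play Thai vowels.
LETTERS = string.ascii_uppercase
VOWELS = "123456789"

# Assumption: every letter ranks before every vowel, as in the listings above.
RANK = {ch: i for i, ch in enumerate(LETTERS + VOWELS)}

# A vowel at the start of a term, immediately followed by a letter.
LEADING_VOWEL = re.compile(r"^([1-9])([A-Z])")


def reorder(term):
    """Move a leading vowel behind the letter that follows it (1ABC -> A1BC)."""
    return LEADING_VOWEL.sub(r"\2\1", term)


def sort_key(term):
    """Build the comparison key from the reordered term."""
    return [RANK[ch] for ch in reorder(term)]


terms = ["ABCD", "ACDV", "B1AB", "1ABC", "2ACD"]
result = sorted(terms, key=sort_key)
print(result)
```

The original terms are kept for display and only the key is reordered,
which is the split I would expect between display and sort forms in the
chain.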

*** My question: how can I write a transformation rule for this case?

Thank you,
Natapone

PS. My current chain file looks like this:
===========thai_sorting.xml====================
<icu_chain locale="th">
	<transform rule="[:Control:] Any-Remove"/>
	<tokenize rule="c"/>
	<transform rule="[[:WhiteSpace:][:Punctuation:]] Remove"/>
	<transform rule="NFD; [:Nonspacing Mark:] Remove; NFC"/>
	<display/>
	<casemap rule="l"/>
</icu_chain>
==========================================


