[Yazlist] Hangul conversion

Adam Dickmeiss adam at indexdata.dk
Wed Mar 21 08:49:11 CET 2007


Gary Anderson wrote:
> The YAZ library converts UCS value 0x9234 to the triple 0x21 0x5D 0x58.  
> The LOC code tables identify this as a variant of the hangul character 
> 0x4B 0x5D 0x58 which is also represented as 0x9234 in UCS.  Is there a 
> reason that YAZ is selecting the variant instead of the non-variant form?
The short answer is: the XML parser which generates conversion code does 
not read XML comments to get the details. For this particular case, the 
fragment reads:
           <code>
              <marc>215D58</marc>
              <ucs>9234</ucs>
              <utf-8>E988B4</utf-8>
              <name>East Asian ideograph (variant of EACC 4B5D58)</name>
           </code>

Does anybody have suggestions for better (less dirty) ways than reading 
XML comments with regexps?

Doesn't sound right to me:-)
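
One possible direction, offered only as a rough sketch and not as how the 
YAZ generator actually works: if the variant note can be read from the 
<name> text, as in the fragment above, the UCS-to-MARC table could prefer 
non-variant entries without touching comments at all. The <codes> wrapper 
element, the name text of the second sample entry, and the function names 
below are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative sample. The first entry is the fragment quoted above; the
# <codes> wrapper and the second entry's <name> text are assumptions.
SAMPLE = """
<codes>
  <code>
    <marc>215D58</marc>
    <ucs>9234</ucs>
    <utf-8>E988B4</utf-8>
    <name>East Asian ideograph (variant of EACC 4B5D58)</name>
  </code>
  <code>
    <marc>4B5D58</marc>
    <ucs>9234</ucs>
    <utf-8>E988B4</utf-8>
    <name>East Asian ideograph</name>
  </code>
</codes>
"""


def is_variant(name):
    """Treat an entry as a variant if its name says 'variant of'."""
    return "variant of" in name.lower()


def build_ucs_to_marc(xml_text):
    """Map each UCS value to one MARC code, preferring non-variant entries."""
    table = {}  # ucs -> (marc, is_variant)
    for code in ET.fromstring(xml_text).iter("code"):
        marc = code.findtext("marc")
        ucs = code.findtext("ucs")
        name = code.findtext("name", default="")
        if not marc or not ucs:
            continue
        variant = is_variant(name)
        # Keep the first entry seen, unless it is a variant and this one is not.
        if ucs not in table or (table[ucs][1] and not variant):
            table[ucs] = (marc, variant)
    return {ucs: marc for ucs, (marc, _) in table.items()}


if __name__ == "__main__":
    print(build_ucs_to_marc(SAMPLE))
```

With the sample above, UCS 9234 maps to 4B5D58 even though the variant 
entry comes first; a UCS value that only has a variant entry still keeps 
that entry.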

/ Adam


> 
> Gary
> 



