[Yazlist] Another MARC8 conversion problem.
adam at indexdata.dk
Wed Mar 21 18:47:50 CET 2007
Gary Anderson wrote:
> Can you point me to a URL for the CVS?
CVSROOT=:pserver:cvs@cvs.indexdata.dk:/cvs
password is anonymous
cvs co yaz
> Adam Dickmeiss wrote:
>> Gary Anderson wrote:
>>> I am passing the following UTF8 string (Values are hangul characters
>>> given in hex. Ignore spaces) to the converter:
>>> E8 87 BA E7 81 A3 E5 9C B0 E5 8D 80 E5 9C 8B E6 B0 91 E6 89 80
>>> E5 BE 97.
>>> YAZ correctly translates this string to (output in MARC8, hex, ignore
>>> spaces):
>>> 1B 28 42 21 54 2B 21 49 43 21 37 79 21 34 55 21 37 6f 21 46
>>> 4d 21 3F 75 21 30 6A
>>> esc $ 1
>>> Notice that the ending escape sequence (ESC ( B) was not appended to
>>> this string. It appeared at the beginning of my
>>> next string.
>> How did you test this? With yaz-iconv?
>> A call to
>> yaz_iconv(cd, 0, 0, &outp, &outbytesleft);
>> will reset the conversion to the initial state and generate the ESC ( B.
>> I can tell you this: yaz-iconv did not do it. And that's a mistake.
>>> I'm thinking that the yaz_write_marc8_page_chr module you sent in the
>>> patch isn't working, or it needs to be called from somewhere else.
>> Yesterday major changes to siconv.c were made. The new code is
>> simpler, IMHO. I really suggest you check YAZ out via CVS. One thing
>> you'll notice is that the last parameter is gone.
>> The functions yaz_flush_marc8 and yaz_flush_ISO8859_1 do the flushing.
>> They are called when yaz_iconv(cd, 0, 0, &outp, &outbytesleft) is used.
>> You may ask: why this flushing? And why get rid of the last parameter?
>> The last parameter was set (to 1) for the last byte/character in a
>> call to yaz_iconv (with inbuf != 0). The problem is that it may not be
>> the last of the whole input byte sequence.
>> That "last" flag is problematic. Conversion of (large) files requires
>> multiple calls to iconv anyway, with chunks of input that are not
>> necessarily complete input sequences. We must therefore flush at the
>> end anyway.
>> More importantly: we want yaz_iconv to have iconv semantics.
>> In the case of MARC we want the data of each field to be self-contained.
>> To ensure this, we flush after each field's data. For YAZ's MARC utility
>> that's done in marc_iconv_reset (src/marcdisp.c).
>> / Adam
>>> Yazlist mailing list
>>> Yazlist at lists.indexdata.dk