From 73b18f5211143781fdbd9dde8582a29905c85b33 Mon Sep 17 00:00:00 2001 From: Adam Dickmeiss Date: Thu, 17 Sep 2009 11:24:56 +0200 Subject: [PATCH] Update WRT settings, relevance etc. --- doc/pazpar2_conf.xml | 235 +++++++++++++++++++++++++++++--------------------- 1 file changed, 138 insertions(+), 97 deletions(-) diff --git a/doc/pazpar2_conf.xml b/doc/pazpar2_conf.xml index 9f0272d..c2d2227 100644 --- a/doc/pazpar2_conf.xml +++ b/doc/pazpar2_conf.xml @@ -101,80 +101,30 @@ - - - relevance - - - Specifies ICU tokenization and transformation rules - for tokens that are used in Pazpar2's relevance ranking. The 'id' - attribute is currently not used, and the 'locale' - attribute must be set to one of the locale strings - defined in ICU. The child elements listed below can be - in any order, except the 'index' element which logically - belongs to the end of the list. The stated tokenization, - transformation and charmapping instructions are performed - in order from top to bottom. - - - casemap - - - The attribute 'rule' defines the direction of the - per-character casemapping, allowed values are "l" - (lower), "u" (upper), "t" (title). - - - - transform - - - Normalization and transformation of tokens follows - the rules defined in the 'rule' attribute. For - possible values we refer to the extensive ICU - documentation found at the - ICU - transformation home page. Set filtering - principles are explained at the - ICU set and - filtering page. - - - - tokenize - - - Tokenization is the only rule in the ICU chain - which splits one token into multiple tokens. The - 'rule' attribute may have the following values: - "s" (sentence), "l" (line-break), "w" (word), and - "c" (character), the later probably not being - very useful in a pruning Pazpar2 installation. - - - - - - - sort + relevance / sort / mergekey - Specifies ICU tokenization and transformation rules - for tokens that are used in Pazpar2's sorting. The contents - is similar to that of relevance. 
+ Specifies character set normalization for relevancy / sorting + and the mergekey - for the server. These definitions serve as + defaults for services that don't have these given. For the meaning + of these settings refer to the "relevance" element inside service. - mergekey + settings - Specifies ICU tokenization and transformation rules - for tokens that are used in Pazpar2's mergekey. The contents - is similar to that of relevance. + Specifies target settings for the server. These settings serve + as defaults for all services which don't have these given. + The settings element requires one attribute 'src' which specifies + a settings file or a directory. If a directory is given, all + files with suffix .xml are read from this + directory. Refer to + for more information. @@ -322,7 +272,6 @@ - setting @@ -337,22 +286,117 @@ the value to decide how to deal with other data values. - The purpose of using settings in this way can either be to control the behavior of normalization stylesheet in a database- dependent way, or to easily make database-dependent values available to display-logic in your user interface, without having to implement complicated interactions between the user interface and your configuration system. + + + + + relevance + + + Specifies ICU tokenization and transformation rules + for tokens that are used in Pazpar2's relevance ranking. + The 'id' attribute is currently not used, and the 'locale' + attribute must be set to one of the locale strings + defined in ICU. The child elements listed below can be + in any order, except the 'index' element which logically + belongs to the end of the list. The stated tokenization, + transformation and charmapping instructions are performed + in order from top to bottom. + + + casemap + + + The attribute 'rule' defines the direction of the + per-character casemapping; allowed values are "l" + (lower), "u" (upper), "t" (title). 
+ + + + transform + + + Normalization and transformation of tokens follow + the rules defined in the 'rule' attribute. For + possible values we refer to the extensive ICU + documentation found at the + ICU + transformation home page. Set filtering + principles are explained at the + ICU set and + filtering page. + + + + tokenize + + + Tokenization is the only rule in the ICU chain + which splits one token into multiple tokens. The + 'rule' attribute may have the following values: + "s" (sentence), "l" (line-break), "w" (word), and + "c" (character), the latter probably not being + very useful in a pruning Pazpar2 installation. + + + + + + From Pazpar2 version 1.1, the ICU wrapper from YAZ is used. + Refer to the yaz-icu + utility for more information. + + + + + + sort + + + Specifies ICU tokenization and transformation rules + for tokens that are used in Pazpar2's sorting. The content + is similar to that of relevance. + + + + + + mergekey + + + Specifies ICU tokenization and transformation rules + for tokens that are used in Pazpar2's mergekey. The content + is similar to that of relevance. + + + + + + settings + + + Specifies target settings for this service. Refer to + . + + + + + @@ -360,38 +404,35 @@ EXAMPLE Below is a working example configuration: - - - - - - - - - - - - - - - - - - - - - -]]> + + + + + + + + + + + + + + + + + + + + + + + + + ]]> -- 1.7.10.4