doc/querymodel.xml

   1  <chapter id="querymodel">
   2   <!-- $Id: querymodel.xml,v 1.6 2006-06-15 13:41:49 marc Exp $ -->
   3   <title>Query Model</title>
   4
   5   <sect1 id="querymodel-overview">
   6    <title>Query Model Overview</title>
   7
   8
   9    <sect2 id="querymodel-query-languages">
  10     <title>Query Languages</title>
  11
  12     <para>
  13      Zebra was designed as a networked Information Retrieval engine adhering
  14      to the international standards
  15      <ulink url="&url.z39.50;">Z39.50</ulink> and
  16      <ulink url="&url.sru;">SRU</ulink>,
  17      and implements the query model defined there.
  18      Unfortunately, the Z39.50 query model defines only a binary
  19      encoded representation, which is used as transport packaging in
  20      the Z39.50 protocol layer. This representation is not human
  21      readable, nor does it define any convenient way to specify queries.
  22     </para>
  23    <!-- tell about RPN - include link to YAZ
  24         url.yaz.pqf -->
  25
  26    <sect3 id="querymodel-query-languages-pqf">
  27     <title>Prefix Query Format (PQF)</title>
  28
  29    <para>
  30      Index Data has defined a textual representation in the
  31      <literal>Prefix Query Format</literal>, or
  32      <literal>PQF</literal> for short, which has since been adopted by other
  33      parties developing Z39.50 software. It is also often referred to as
  34      <literal>Prefix Query Notation</literal>, or
  35      <literal>PQN</literal> for short, and is thoroughly explained in
  36      <xref linkend="querymodel-pqf"/>.
  37     </para>
  38    </sect3>
  39
  40
  41    <!-- PQF/RPN is natively supported. CQL is NOT . So we need a map -->
  42    <sect3 id="querymodel-query-languages-cql">
  43     <title>Common Query Language (CQL)</title>
  44    <para>
  45      In addition, Zebra can be configured to understand and map the
  46      <literal>Common Query Language</literal>
  47      (<ulink url="&url.cql;">CQL</ulink>)
  48      to PQF. See an introduction on the mapping to the internal query
  49      representation in
  50      <xref linkend="querymodel-cql-to-pqf"/>.
  51     </para>
  52    </sect3>
  53
  54    </sect2>
  55
  56    <sect2 id="querymodel-query-types">
  57     <title>Query types</title>
  58     <para>
  59     </para>
  60
  61     <sect3 id="querymodel-query-type-explain">
  62      <title>Explain Queries</title>
  63      <para>
  64      </para>
  65     </sect3>
  66
  67     <sect3 id="querymodel-query-type-search">
  68      <title>Search Queries</title>
  69      <para>
  70      </para>
  71     </sect3>
  72
  73     <sect3 id="querymodel-query-type-scan">
  74      <title>Scan Queries</title>
  75      <para>
  76      </para>
  77     </sect3>
  78
  79    </sect2>
  80
  81  </sect1>
  82
  83
  84   <sect1 id="querymodel-pqf">
  85    <title>Prefix Query Format structure and syntax</title>
  86    <para>
  87     The <ulink url="&url.yaz.pqf;">PQF grammar</ulink>
  88     is documented in the YAZ manual, and shall not be
  89     repeated here. During search, this textual PQF representation
  90     is always mapped to the equivalent Zebra internal
  91     query parse tree.
  92    </para>
  93
  94    <sect2 id="querymodel-pqf-tree">
  95     <title>PQF tree structure</title>
  96     <para>
  97      The PQF parse tree - or the equivalent textual representation -
  98      may start with one specification of the
  99      <emphasis>attribute set</emphasis> used. It is followed by a query
 100      tree, which
 101      consists of <emphasis>atomic query parts</emphasis>, optionally
 102      paired by <emphasis>boolean binary operators</emphasis>, and
 103      finally <emphasis>recursively combined</emphasis> into
 104      complex query trees.
 105     </para>
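        <para>
         As an illustrative sketch (the terms are arbitrary placeholders,
         not taken from any sample database), the following query names
         the attribute set first and then nests an
         <emphasis>@or</emphasis> subtree inside an
         <emphasis>@and</emphasis> node. Because PQF is a prefix
         notation, each operator is followed by its two operands:
         <screen>
          Z> find @attrset bib-1 @and @or information data retrieval
         </screen>
         This should be read as
         (<emphasis>information</emphasis> OR <emphasis>data</emphasis>)
         AND <emphasis>retrieval</emphasis>.
        </para>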
 106
 107     <sect3 id="querymodel-attribute-sets">
 108      <title>Attribute sets</title>
 109      <para>
 110       Attribute sets define the exact meaning and semantics of queries
 111       issued. Zebra comes with some predefined attribute set
 112       definitions, others can easily be defined and added to the
 113       configuration.
 114       <note>
 115        The Zebra internal query processing is modeled after
 116        the <literal>Bib1</literal> attribute set, and the non-use
 117        attributes type 2-6 are hard-wired in. It is therefore essential
 118        to be familiar with <xref linkend="querymodel-bib1"/>.
 119       </note>
 120      </para>
 121
 122      <table id="querymodel-attribute-sets-table">
 123       <caption>Attribute sets predefined in Zebra</caption>
 124        <!--
 125        <thead>
 126        <tr><td>one</td><td>two</td></tr>
 127       </thead>
 128        -->
 129        <tbody>
 130         <tr>
 131          <td><emphasis>exp-1</emphasis></td>
 132          <td><literal>Explain</literal> attribute set</td>
 133          <td>Special attribute set used on the special automagic
 134           <literal>IR-Explain-1</literal> database to gain information on
 135           server capabilities, database names, and database
 136           semantics.</td>
 137         </tr>
 138         <tr>
 139          <td><emphasis>bib-1</emphasis></td>
 140          <td><literal>Bib1</literal> attribute set</td>
 141          <td>Standard PQF query language attribute set which defines the
 142           semantics of Z39.50 searching. In addition, all of the
 143           non-use attributes (type 2-9) define the Zebra internal query
 144           processing</td>
 145         </tr>
 146         <tr>
 147          <td><emphasis>gils</emphasis></td>
 148          <td><literal>GILS</literal> attribute set</td>
 149          <td>Extension to the <literal>Bib1</literal> attribute set.</td>
 150         </tr>
 151        </tbody>
 152      </table>
 153     </sect3>
 154
 155     <sect3 id="querymodel-boolean-operators">
 156      <title>Boolean operators</title>
 157      <para>
 158       A pair of subquery trees, or of atomic queries, is combined
 159       using the standard boolean operators into new query trees.
 160      </para>
 161
 162      <table id="querymodel-boolean-operators-table">
 163       <caption>Boolean operators</caption>
 164        <!--
 165        <thead>
 166        <tr><td>one</td><td>two</td></tr>
 167       </thead>
 168        -->
 169        <tbody>
 170         <tr><td><emphasis>@and</emphasis></td>
 171          <td>binary <literal>AND</literal> operator</td>
 172          <td>Set intersection of the two operands' hit sets</td>
 173         </tr>
 174         <tr><td><emphasis>@or</emphasis></td>
 175          <td>binary <literal>OR</literal> operator</td>
 176          <td>Set union of the two operands' hit sets</td>
 177         </tr>
 178         <tr><td><emphasis>@not</emphasis></td>
 179          <td>binary <literal>AND NOT</literal> operator</td>
 180          <td>Set difference (relative complement) of the two operands' hit sets</td>
 181         </tr>
 182         <tr><td><emphasis>@prox</emphasis></td>
 183          <td>binary <literal>PROXIMITY</literal> operator</td>
 184          <td>Set intersection of the two operands' hit sets. In
 185           addition, the intersection set is purged of all
 186           documents which do not satisfy the requested query
 187           term proximity. Usually a proper subset of the AND
 188           operation.</td>
 189         </tr>
 190        </tbody>
 191      </table>
 192
 193      <para>
 194       For example, we can combine the terms
 195       <emphasis>information</emphasis> and <emphasis>retrieval</emphasis>
 196       into different searches in the default index of the default
 197       attribute set as follows.
 198       Querying for the union of all documents containing the
 199       terms <emphasis>information</emphasis> OR
 200       <emphasis>retrieval</emphasis>:
 201       <screen>
 202        Z> find @or information retrieval
 203       </screen>
 204      </para>
 205      <para>
 206       Querying for the intersection of all documents containing the
 207       terms <emphasis>information</emphasis> AND
 208       <emphasis>retrieval</emphasis>;
 209       the hit set is a subset of the corresponding
 210       OR query:
 211       <screen>
 212        Z> find @and information retrieval
 213       </screen>
 214      </para>
 215      <para>
 216       Querying for the intersection of all documents containing the
 217       terms <emphasis>information</emphasis> AND
 218       <emphasis>retrieval</emphasis>, taking proximity into account;
 219       the hit set is a subset of the corresponding AND query. The
 220       parameters here request a distance of at most 3 words, in any order:
 221       <screen>
 222        Z> find @prox 0 3 0 2 k 2 information retrieval
 223       </screen>
 224      </para>
 225      <para>
 226       Querying for the intersection of all documents containing the
 227       terms <emphasis>information</emphasis> AND
 228       <emphasis>retrieval</emphasis>, in the same order and near each
 229       other as described in the term list;
 230       the hit set is a subset of the corresponding
 231       PROXIMITY query:
 232       <screen>
 233        Z> find "information retrieval"
 234       </screen>
 235      </para>
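      <para>
       The <emphasis>@not</emphasis> operator listed in the table
       above follows the same pattern. As a sketch using the same two
       example terms, the following query asks for all documents
       containing <emphasis>information</emphasis> but not
       <emphasis>retrieval</emphasis>; the order of the operands
       matters here, unlike for <emphasis>@and</emphasis> and
       <emphasis>@or</emphasis>:
       <screen>
        Z> find @not information retrieval
       </screen>
      </para>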
 236     </sect3>
 237
 238
 239     <sect3 id="querymodel-atomic-queries">
 240      <title>Atomic queries</title>
 241      <para>
 242       Atomic queries are the query parts which work on one access point
 243       only. These consist of an <literal>attribute list</literal>
 244       followed by a <literal>single term</literal> or a
 245       <literal>quoted term list</literal>.
 246      </para>
 247      <para>
 248       Unsupplied non-use attributes type 2-9 are either inherited from
 249       higher nodes in the query tree, or are set to Zebra's default values.
 250       See <xref linkend="querymodel-bib1"/> for details.
 251      </para>
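      <para>
       As a sketch of this inheritance (the use attribute values and
       terms are illustrative assumptions, not taken from a particular
       configuration), a structure attribute given above an
       <emphasis>@and</emphasis> node applies to both atomic queries
       below it, so the following two queries should be equivalent:
       <screen>
        Z> find @attr 4=2 @and @attr 1=4 information @attr 1=1003 retrieval
        Z> find @and @attr 1=4 @attr 4=2 information @attr 1=1003 @attr 4=2 retrieval
       </screen>
      </para>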
 252
 253      <table id="querymodel-atomic-queries-table">
 254       <caption>Atomic queries</caption>
 255        <!--
 256        <thead>
 257        <tr><td>one</td><td>two</td></tr>
 258       </thead>
 259        -->
 260        <tbody>
 261         <tr><td><emphasis>attribute list</emphasis></td>
 262          <td>List of <literal>orthogonal</literal> attributes</td>
 263          <td>Any of the orthogonal attribute types may be omitted,
 264           these are inherited from higher query tree nodes, or if not
 265           inherited, are set to the default Zebra configuration values.
 266          </td>
 267         </tr>
 268         <tr><td><emphasis>term</emphasis></td>
 269          <td>single <literal>term</literal>
 270           or <literal>quoted term list</literal>   </td>
 271          <td>Here the search term or list of search terms is added
 272           to the query</td>
 273         </tr>
 274        </tbody>
 275      </table>
 276      <para>
 277       Querying for the term <emphasis>information</emphasis> in the
 278       default index using the default attribute set, the server choice
 279       of access point/index, and the default non-use attributes:
 280       <screen>
 281        Z> find "information"
 282       </screen>
 283      </para>
 284      <para>
 285       Equivalent query fully specified:
 286       <screen>
 287        Z> find @attrset bib-1 @attr 1=1017 @attr 2=3 @attr 3=3 @attr 4=1 @attr 5=100 @attr 6=1 "information"
 288       </screen>
 289      </para>
 290
 291      <para>
 292       Finding all documents which have empty titles. Notice that the
 293       empty term must be quoted, but is otherwise legal.
 294       <screen>
 295        Z> find @attr 1=4 ""
 296       </screen>
 297      </para>
 298
 299     </sect3>
 300
 301     <sect3 id="querymodel-use-string">
 302      <title>Zebra's special use attribute type 1 of form 'string'</title>
 303      <para>
 304       The numeric <literal>use (type 1)</literal> attribute is usually
 305       taken from a given
 306       attribute set. In addition, Zebra lets you use
 307       <emphasis>any internal index
 308        name defined in your configuration</emphasis>
 309       as use attribute value. This is a great feature for
 310       debugging, and when you do
 311       not need the complexity of defined use attribute values. It is
 312       the preferred way of accessing Zebra indexes directly.
 313      </para>
 314      <para>
 315       Finding all documents which have the term list "information
 316       retrieval" in a Zebra index, using its internal full string name:
 317       <screen>
 318        Z> find @attr 1=sometext "information retrieval"
 319       </screen>
 320      </para>
 321      <para>
 322       Searching the bib-1 use attribute 54 using its string name:
 323       <screen>
 324        Z> find @attr 1=Code-language eng
 325       </screen>
 326      </para>
 327      <para>
 328       Searching in any silly string index - if it's defined in your
 329       indexation rules and can be parsed by the PQF parser.
 330       This is definitely not the recommended use of
 331       this facility, as it might confuse your users with some very
 332       unexpected results.
 333       <screen>
 334        Z> find @attr 1=silly/xpath/alike[@index]/name "information retrieval"
 335       </screen>
 336      </para>
 337      <para>
 338       See <xref linkend="querymodel-bib1-mapping"/> for details, and
 339       <xref linkend="server-sru"/>
 340       for the SRU PQF query extension using string names as a fast
 341       debugging facility.
 342      </para>
 343     </sect3>
 344
 345     <sect3 id="querymodel-use-xpath">
 346      <title>Zebra's special use attribute type 1 of form 'XPath'
 347       for GRS filters</title>
 348      <para>
 349       As we have seen above, it is possible (albeit seldom a great
 350       idea) to emulate
 351       <ulink url="http://www.w3.org/TR/xpath">XPath 1.0</ulink> based
 352       search by defining <literal>use (type 1)</literal>
 353       <emphasis>string</emphasis> attributes which in appearance
 354       <emphasis>resemble XPath queries</emphasis>. There are two
 355       problems with this approach: first, the XPath-look-alike has to
 356       be defined at indexation time, so no new undefined
 357       XPath queries can be entered at search time; and second, it might
 358       confuse users very much that an XPath-alike index name in fact
 359       gets populated from a possibly entirely different XML element
 360       than it pretends to access.
 361      </para>
 362      <para>
 363       When using the <literal>GRS Record Model</literal>
 364       (see <xref linkend="record-model-grs"/>), we have the
 365       possibility to embed <emphasis>live</emphasis>
 366       XPath expressions
 367       in the PQF queries, which are here called
 368       <literal>use (type 1)</literal> <emphasis>xpath</emphasis>
 369       attributes. You must set the
 370       <literal>xpath enable</literal> directive in your
 371       <literal>.abs</literal> config files.
 372      </para>
 373      <note>
 374       Only a <emphasis>very</emphasis> restricted subset of the
 375       <ulink url="http://www.w3.org/TR/xpath">XPath 1.0</ulink>
 376       standard is supported as the GRS record model is simpler than
 377       a full XML DOM structure. See the following examples for
 378       possibilities.
 379      </note>
 380      <para>
 381       Finding all documents which have the term "content"
 382       inside a text node found in a specific XML DOM
 383       <emphasis>subtree</emphasis>, whose starting element is
 384       addressed by XPath:
 385       <screen>
 386        Z> find @attr 1=/root content
 387        Z> find @attr 1=/root/first content
 388       </screen>
 389       <emphasis>Notice that the
 390        XPath must be absolute, i.e., must start with '/', and that the
 391        XPath <literal>descendant-or-self</literal> axis followed by a
 392        text node selection <literal>text()</literal> is implicitly
 393        appended to the stated XPath.
 394       </emphasis>
 395       It follows that the above searches are interpreted as:
 396       <screen>
 397        Z> find @attr 1=/root//text() content
 398        Z> find @attr 1=/root/first//text() content
 399       </screen>
 400      </para>
 401
 402      <para>
 403       The addressing XPath can be filtered by a predicate working on exact
 404       string values in
 405       attributes (in the XML sense). The following returns all those documents which
 406       have the term "english" contained in one of the text subnodes of
 407       the subtree defined by the XPath
 408       <literal>/record/title[@lang='en']</literal>:
 409       <screen>
 410        Z> find @attr 1=/record/title[@lang='en'] english
 411       </screen>
 412      </para>
 413
 414      <para>
 415       Combining numeric indexes, boolean expressions,
 416       and xpath based searches is possible:
 417       <screen>
 418        Z> find @attr 1=/record/title @and foo bar
 419        Z> find @and @attr 1=/record/title foo @attr 1=4 bar
 420       </screen>
 421      </para>
 422      <para>
 423       Escaping PQF keywords and other non-parseable XPath constructs
 424       with <literal>'{ }'</literal> to prevent syntax errors:
 425       <screen>
 426        Z> find @attr {1=/root/first[@attr='danish']} content
 427        Z> find @attr {1=/root/second[@attr='danish lake']} content
 428        Z> find @attr {1=/root/third[@attr='dansk s\xc3\xb8']} content
 429       </screen>
 430      </para>
 431      <warning>
 432       It is worth mentioning that these dynamically performed XPath
 433       queries are a performance bottleneck, as no optimized
 434       specialized indexes can be used. Therefore, avoid the use of
 435       this facility when speed is essential and the database content
 436       size is medium to large.
 437      </warning>
 438     </sect3>
 439
 440    </sect2>
 441
 442    <sect2 id="querymodel-exp1">
 443     <title>Explain Attribute Set</title>
 444     <para>
 445      The Z39.50 standard defines the
 446      <ulink url="&url.z39.50.explain;">Explain</ulink> attribute set
 447      <literal>exp-1</literal>, which is used to discover information
 448      about a server's search semantics and functional capabilities.
 449      Zebra exposes a "classic"
 450      Explain database by base name <literal>IR-Explain-1</literal>, which
 451      is populated with system internal information.
 452     </para>
 453    <para>
 454      The attribute-set <literal>exp-1</literal> consists of a single
 455      <literal>Use (type 1)</literal> attribute.
 456     </para>
 457     <para>
 458      In addition, the non-Use
 459      <literal>bib-1</literal> attributes, that is, the types
 460      <literal>Relation</literal>, <literal>Position</literal>,
 461      <literal>Structure</literal>, <literal>Truncation</literal>,
 462      and <literal>Completeness</literal> are imported from
 463      the <literal>bib-1</literal> attribute set, and may be used
 464      within any explain query.
 465     </para>
 466
 467     <sect3 id="querymodel-exp1-use">
 468     <title>Use Attributes (type = 1)</title>
 469      <para>
 470       The following Explain search attributes are supported:
 471       <literal>ExplainCategory</literal> (@attr 1=1),
 472       <literal>DatabaseName</literal> (@attr 1=3),
 473       <literal>DateAdded</literal> (@attr 1=9),
 474       <literal>DateChanged</literal> (@attr 1=10).
 475      </para>
 476      <para>
 477       A search in the use attribute  <literal>ExplainCategory</literal>
 478       supports only these predefined values:
 479       <literal>CategoryList</literal>, <literal>TargetInfo</literal>,
 480       <literal>DatabaseInfo</literal>, <literal>AttributeDetails</literal>.
 481      </para>
 482      <para>
 483       See <filename>tab/explain.att</filename> and the
 484       <ulink url="&url.z39.50;">Z39.50</ulink> standard
 485       for more information.
 486      </para>
 487     </sect3>
 488
 489     <sect3>
 490      <title>Explain searches with yaz-client</title>
 491      <para>
 492       Classic Explain only defines retrieval of Explain information
 493       via ASN.1. Practically no Z39.50 client supports this. Fortunately
 494       they don't have to - Zebra allows retrieval of this information
 495       in other formats:
 496       <literal>SUTRS</literal>, <literal>XML</literal>,
 497       <literal>GRS-1</literal> and <literal>ASN.1</literal> Explain.
 498      </para>
 499
 500      <para>
 501       List supported categories to find out which explain commands are
 502       supported:
 503       <screen>
 504        Z> base IR-Explain-1
 505        Z> find @attr exp1 1=1 categorylist
 506        Z> form sutrs
 507        Z> show 1+2
 508       </screen>
 509      </para>
 510
 511      <para>
 512       Get target info, that is, investigate which databases exist at
 513       this server endpoint:
 514       <screen>
 515        Z> base IR-Explain-1
 516        Z> find @attr exp1 1=1 targetinfo
 517        Z> form xml
 518        Z> show 1+1
 519        Z> form grs-1
 520        Z> show 1+1
 521        Z> form sutrs
 522        Z> show 1+1
 523       </screen>
 524      </para>
 525
 526      <para>
 527       List all supported databases. The number of hits
 528       is the number of databases found, which most commonly are the
 529       following two:
 530       the <literal>Default</literal> and the
 531       <literal>IR-Explain-1</literal> databases.
 532       <screen>
 533        Z> base IR-Explain-1
 534        Z> find @attr exp1 1=1 databaseinfo
 535        Z> form sutrs
 536        Z> show 1+2
 537       </screen>
 538      </para>
 539
 540      <para>
 541       Get database info record for database <literal>Default</literal>.
 542       <screen>
 543        Z> base IR-Explain-1
 544        Z> find @and @attr exp1 1=1 databaseinfo @attr exp1 1=3 Default
 545       </screen>
 546       Identical query with explicitly specified attribute set:
 547       <screen>
 548        Z> base IR-Explain-1
 549        Z> find @attrset exp1 @and @attr 1=1 databaseinfo @attr 1=3 Default
 550       </screen>
 551      </para>
 552
 553      <para>
 554       Get attribute details record for database
 555       <literal>Default</literal>.
 556       This query is very useful to study the internal Zebra indexes.
 557       If records have been indexed using the <literal>alvis</literal>
 558       XSLT filter, the string representation names of the known indexes can be
 559       found.
 560       <screen>
 561        Z> base IR-Explain-1
 562        Z> find @and @attr exp1 1=1 attributedetails @attr exp1 1=3 Default
 563       </screen>
 564       Identical query with explicitly specified attribute set:
 565       <screen>
 566        Z> base IR-Explain-1
 567        Z> find @attrset exp1 @and @attr 1=1 attributedetails @attr 1=3 Default
 568       </screen>
 569      </para>
 570     </sect3>
 571
 572    </sect2>
 573
 574    <sect2 id="querymodel-bib1">
 575     <title>Bib1 Attribute Set</title>
 576     <para>
 577      Something about querying to be written ..
 578     </para>
 579     <para>
 580      Most of the information contained in this section is an excerpt of
 581      the <literal>ATTRIBUTE SET BIB-1 (Z39.50-1995)
 582       SEMANTICS</literal>,
 583      found at <ulink url="&url.z39.50.attset.bib1.1995;">The BIB-1
 584       Attribute Set Semantics</ulink> from 1995, also in an updated
 585      <ulink url="&url.z39.50.attset.bib1;">Bib-1
 586       Attribute Set</ulink>
 587      version from 2003. Index Data is not the copyright holder of this
 588      information.
 589     </para>
 590
 591
 592    <sect3 id="querymodel-bib1-use">
 593      <title>Use Attributes (type 1)</title>
 594
 595
 596     <para>
 597      A use attribute specifies an access point for any atomic query.
 598      These access points are highly dependent on the attribute set used
 599      in the query, and are user configurable using the following
 600      default configuration files:
 601      <filename>tab/bib1.att</filename>,
 602      <filename>tab/dan1.att</filename>,
 603      <filename>tab/explain.att</filename>, and
 604      <filename>tab/gils.att</filename>.
 605      New attribute sets can be added by adding new
 606      <filename>tab/*.att</filename> configuration files, which need to
 607      be sourced in the main configuration <filename>zebra.cfg</filename>.
 608      </para>
 609
 610     <para>
 611      In addition, Zebra allows the access of
 612      <emphasis>internal index names</emphasis> and <emphasis>dynamic
 613      XPath</emphasis> as use attributes.
 614      See <xref linkend="querymodel-use-string"/> and
 615      <xref linkend="querymodel-use-xpath"/> for
 616      alternative access to the Zebra internal index names and XPath queries.
 617     </para>
 618
 619     <para>
 620      Phrase search for <emphasis>information retrieval</emphasis> in
 621      the title-register:
 622      <screen>
 623       Z> find @attr 1=4 "information retrieval"
 624      </screen>
 625     </para>
 626     </sect3>
 627
 628     <sect3 id="querymodel-bib1-relation">
 629      <title>Relation Attributes (type 2)</title>
 630
 631      <para>
 632       Relation attributes describe the relationship of the access
 633       point (left side
 634       of the relation) to the search term as qualified by the attributes (right
 635       side of the relation), e.g., Date-publication &lt;= 1975.
 636       </para>
 637
 638      <table id="querymodel-bib1-relation-table">
 639       <caption>Relation Attributes (type 2)</caption>
 640       <thead>
 641         <tr>
 642          <td>Relation</td>
 643          <td>Value</td>
 644          <td>Notes</td>
 645         </tr>
 646        </thead>
 647        <tbody>
 648         <tr>
 649          <td> Less than</td>
 650          <td>1</td>
 651          <td>supported</td>
 652         </tr>
 653         <tr>
 654          <td>Less than or equal</td>
 655          <td>2</td>
 656          <td>supported</td>
 657         </tr>
 658         <tr>
 659          <td>Equal</td>
 660          <td>3</td>
 661          <td>default</td>
 662         </tr>
 663         <tr>
 664          <td>Greater or equal</td>
 665          <td>4</td>
 666          <td>supported</td>
 667         </tr>
 668         <tr>
 669          <td>Greater than</td>
 670          <td>5</td>
 671          <td>supported</td>
 672         </tr>
 673         <tr>
 674          <td>Not equal</td>
 675          <td>6</td>
 676          <td>unsupported</td>
 677         </tr>
 678         <tr>
 679          <td>Phonetic</td>
 680          <td>100</td>
 681          <td>unsupported</td>
 682         </tr>
 683         <tr>
 684          <td>Stem</td>
 685          <td>101</td>
 686          <td>unsupported</td>
 687         </tr>
 688         <tr>
 689          <td>Relevance</td>
 690          <td>102</td>
 691          <td>supported</td>
 692         </tr>
 693         <tr>
 694          <td>AlwaysMatches</td>
 695          <td>103</td>
 696          <td>supported</td>
 697         </tr>
 698        </tbody>
 699      </table>
 700
 701      <para>
 702       The relation attribute
 703       <literal>relevance (102)</literal> is supported, see
 704       <xref linkend="administration-ranking"/> for full information.
 705       <!-- always-matches (103) not supported for all indexes -->
 706      </para>
 707
 708     <para>
 709      All ordering operations are based on a lexicographical ordering,
 710      <emphasis>except</emphasis> when the
 711      structure attribute <literal>numeric (109)</literal> is used. In
 712      this case, ordering is numerical. See
 713       <xref linkend="querymodel-bib1-structure"/>.
 714     </para>
 715
 716      <para>
 717      Ranked search for <emphasis>information retrieval</emphasis> in
 718      the title-register
 719      (see <xref linkend="administration-ranking"/> for the gory details):
 720      <screen>
 721       Z> find @attr 1=4 @attr 2=102 "information retrieval"
 722      </screen>
 723     </para>
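      <para>
       The relation <emphasis>Date-publication &lt;= 1975</emphasis>
       mentioned above could be sketched as follows, assuming that the
       bib-1 use attribute 31 (Date-publication) is indexed in your
       configuration. The first query compares lexicographically; the
       second adds the structure attribute
       <literal>numeric (109)</literal> and additionally assumes that
       a numeric index has been defined for this access point:
       <screen>
        Z> find @attr 1=31 @attr 2=2 1975
        Z> find @attr 1=31 @attr 2=2 @attr 4=109 1975
       </screen>
      </para>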
 724     </sect3>
 725
 726     <sect3 id="querymodel-bib1-position">
 727      <title>Position Attributes (type 3)</title>
 728
 729      <para>
 730       The position attribute specifies the location of the search term
 731       within the field or subfield in which it appears.
 732      </para>
 733
 734      <table id="querymodel-bib1-position-table">
 735       <caption>Position Attributes (type 3)</caption>
 736       <thead>
 737         <tr>
 738          <td>Position</td>
 739          <td>Value</td>
 740          <td>Notes</td>
 741         </tr>
 742        </thead>
 743        <tbody>
 744         <tr>
 745          <td>First in field </td>
 746          <td>1</td>
 747          <td>unsupported</td>
 748         </tr>
 749         <tr>
 750          <td>First in subfield</td>
 751          <td>2</td>
 752          <td>unsupported</td>
 753         </tr>
 754         <tr>
 755          <td>Any position in field</td>
 756          <td>3</td>
 757          <td>default</td>
 758         </tr>
 759        </tbody>
 760      </table>
 761
 762     <para>
 763       The position attribute values <literal>first in field (1)</literal>
 764       and <literal>first in subfield (2)</literal> are unsupported.
 765       Using them does not trigger an error, but silently defaults to
 766       <literal>any position in field (3)</literal>.
 767       <!-- It should -->
 768       </para>
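      <para>
       As a sketch of this fallback (the term is an arbitrary
       placeholder), the following two title searches should therefore
       produce the same hit set:
       <screen>
        Z> find @attr 1=4 @attr 3=1 information
        Z> find @attr 1=4 @attr 3=3 information
       </screen>
      </para>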
 769     </sect3>
 770
 771     <sect3 id="querymodel-bib1-structure">
 772      <title>Structure Attributes (type 4)</title>
 773
 774      <para>
 775       The structure attribute specifies the type of search
 776       term. This causes the search to be mapped on
 777       different Zebra internal indexes, which must have been defined
 778       at index time.
 779      </para>
 780
 781      <para>
 782       The possible values of the
 783       <literal>structure attribute (type 4)</literal> can be defined
 784       using the configuration file
 785       <filename>tab/default.idx</filename>.
 786       The default configuration is summarized in this table.
 787      </para>
 788
 789      <table id="querymodel-bib1-structure-table">
 790       <caption>Structure Attributes (type 4)</caption>
 791       <thead>
 792         <tr>
 793          <td>Structure</td>
 794          <td>Value</td>
 795          <td>Notes</td>
 796         </tr>
 797        </thead>
 798        <tbody>
 799         <tr>
 800          <td>Phrase </td>
 801          <td>1</td>
 802          <td>default</td>
 803         </tr>
 804         <tr>
 805          <td>Word</td>
 806          <td>2</td>
 807          <td>supported</td>
 808         </tr>
 809         <tr>
 810          <td>Key</td>
 811          <td>3</td>
 812          <td>supported</td>
 813         </tr>
 814         <tr>
 815          <td>Year</td>
 816          <td>4</td>
 817          <td>supported</td>
 818         </tr>
 819         <tr>
 820          <td>Date (normalized)</td>
 821          <td>5</td>
 822          <td>supported</td>
 823         </tr>
 824         <tr>
 825          <td>Word list</td>
 826          <td>6</td>
 827          <td>supported</td>
 828         </tr>
 829         <tr>
 830          <td>Date (un-normalized)</td>
 831          <td>100</td>
 832          <td>unsupported</td>
 833         </tr>
 834         <tr>
 835          <td>Name (normalized) </td>
 836          <td>101</td>
 837          <td>unsupported</td>
 838         </tr>
 839         <tr>
 840          <td>Name (un-normalized) </td>
 841          <td>102</td>
 842          <td>unsupported</td>
 843         </tr>
 844         <tr>
 845          <td>Structure</td>
 846          <td>103</td>
 847          <td>unsupported</td>
 848         </tr>
 849         <tr>
 850          <td>Urx</td>
 851          <td>104</td>
 852          <td>supported</td>
 853         </tr>
 854         <tr>
 855          <td>Free-form-text</td>
 856          <td>105</td>
 857          <td>supported</td>
 858         </tr>
 859         <tr>
 860          <td>Document-text</td>
 861          <td>106</td>
 862          <td>supported</td>
 863         </tr>
 864         <tr>
 865          <td>Local-number</td>
 866          <td>107</td>
 867          <td>supported</td>
 868         </tr>
 869         <tr>
 870          <td>String</td>
 871          <td>108</td>
 872          <td>unsupported</td>
 873         </tr>
 874         <tr>
 875          <td>Numeric string</td>
 876          <td>109</td>
 877          <td>supported</td>
 878         </tr>
 879        </tbody>
 880      </table>
 881     </sect3>
 882
 883     <para>
 884      The structure attribute value <literal>local-number
 885       (107)</literal>
 886      is supported, and always maps to the Zebra internal document ID.
 887      </para>
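          <para>
           As a hedged illustration, assuming a record with the internal
           document ID <literal>10</literal> exists in the database, a
           local-number search along these lines should hit that single
           record. The ID value is an assumption made for this example:
           <screen>
            Z> find @attr 4=107 10
           </screen>
          </para>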
 888
 889     <para>
 890      For example, in
 891      the GILS schema (<literal>gils.abs</literal>), the
 892      west-bounding-coordinate is indexed as type <literal>n</literal>,
 893      and is therefore searched by specifying
 894      <emphasis>structure</emphasis>=<emphasis>Numeric String</emphasis>.
 895      To match all those records with west-bounding-coordinate greater
 896      than -114 we use the following query:
 897      <screen>
 898       Z> find @attr 4=109 @attr 2=5 @attr gils 1=2038 -114
 899      </screen>
 900     </para>
 901
 902     <sect3 id="querymodel-bib1-truncation">
 903      <title>Truncation Attributes (type = 5)</title>
 904
 905      <para>
 906       The truncation attribute specifies whether variations of one or
 907       more characters are allowed between search term and hit terms, or
 908       not. Using non-default truncation attributes will broaden the
 909       document hit set of a search query.
 910      </para>
 911
 912      <table id="querymodel-bib1-truncation-table">
 913       <caption>Truncation Attributes (type 5)</caption>
 914       <thead>
 915         <tr>
 916          <td>Truncation</td>
 917          <td>Value</td>
 918          <td>Notes</td>
 919         </tr>
 920        </thead>
 921        <tbody>
 922         <tr>
 923          <td>Right truncation </td>
 924          <td>1</td>
 925          <td>supported</td>
 926         </tr>
 927         <tr>
 928          <td>Left truncation</td>
 929          <td>2</td>
 930          <td>supported</td>
 931         </tr>
 932         <tr>
 933          <td>Left and right truncation</td>
 934          <td>3</td>
 935          <td>supported</td>
 936         </tr>
 937         <tr>
 938          <td>Do not truncate</td>
 939          <td>100</td>
 940          <td>default</td>
 941         </tr>
 942         <tr>
 943          <td>Process # in search term</td>
 944          <td>101</td>
 945          <td>supported</td>
 946         </tr>
 947         <tr>
 948          <td>RegExpr-1 </td>
 949          <td>102</td>
 950          <td>supported</td>
 951         </tr>
 952         <tr>
 953          <td>RegExpr-2</td>
 954          <td>103</td>
 955          <td>supported</td>
 956         </tr>
 957        </tbody>
 958      </table>
 959
 960      <para>
 961       Truncation attribute value
 962       <literal>Process # in search term (101)</literal> is a
 963       poor-man's regular expression search. It maps
 964       each <literal>#</literal> to <literal>.*</literal>, and
 965       then performs a <literal>Regexp-1 (102)</literal> regular
 966       expression search.
 967      </para>
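           <para>
            As a small sketch of this mapping (the term
            <literal>inform#n</literal> is only an illustrative assumption),
            the following two queries against the title register would be
            expected to behave the same way:
            <screen>
             Z> find @attr 1=4 @attr 5=101 inform#n
             Z> find @attr 1=4 @attr 5=102 inform.*n
            </screen>
           </para>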
 968      <para>
 969       Truncation attribute value
 970       <literal>Regexp-1 (102)</literal> is a normal regular expression
 971       search, see <xref linkend="querymodel-regular"/>.
 972      </para>
 973      <para>
 974       Truncation attribute value
 975       <literal>Regexp-2 (103)</literal> is a Zebra specific extension
 976       which allows <emphasis>fuzzy</emphasis> matches. One single
 977       error in spelling of search terms is allowed, i.e., a document
 978       is hit if it includes a term which can be mapped to the used
 979       search term by one character substitution, addition, deletion or
 980       change of position.
 981       </para>
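           <para>
            For instance, assuming the title register contains the term
            <literal>information</literal>, a fuzzy search for the misspelled
            term below would be expected to hit those documents, since a
            single character addition maps it to the indexed term:
            <screen>
             Z> find @attr 1=4 @attr 5=103 informaton
            </screen>
           </para>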
 982       <!--
 983       Special 104, 105, 106 are deprecated and will be removed! -->
 984     </sect3>
 985
 986     <sect3 id="querymodel-bib1-completeness">
 987     <title>Completeness Attributes (type = 6)</title>
 988      <para>
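 989       This attribute is only used to choose between the register
 990       types <literal>w</literal> and <literal>p</literal>. Completeness
 991       is ignored whenever some other register type is selected.
 992       <literal>Incomplete field (1)</literal> is the default and makes
 993       Zebra use register type <literal>w</literal>.
 994       <literal>Complete subfield (2)</literal> and
 995       <literal>Complete field (3)</literal> both trigger register type <literal>p</literal>.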
 996      </para>
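           <para>
            A hedged example, assuming the title field is indexed in both
            register types: the first query below is directed to the
            <literal>w</literal> register, while the second asks for a
            complete field match and is therefore directed to the
            <literal>p</literal> register. The search term is an
            illustrative assumption:
            <screen>
             Z> find @attr 1=4 @attr 6=1 "information retrieval"
             Z> find @attr 1=4 @attr 6=3 "information retrieval"
            </screen>
           </para>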
 997     </sect3>
 998    </sect2>
 999
1000
1001    <sect2 id="querymodel-zebra-attr-search">
1002     <title>Zebra specific Search Extensions to all Attribute Sets</title>
1003     <para>
1004      Zebra extends the Bib1 attribute types, and these extensions are
1005      recognized regardless of the attribute
1006      set used in a <literal>search</literal> operation query.
1007     </para>
1008
1009      <table id="querymodel-zebra-attr-search-table">
1010       <caption>Zebra Search Attribute Extensions</caption>
1011        <thead>
1012         <tr>
1013          <td>Name</td>
1014          <td>Value</td>
1015          <td>Operation</td>
1016          <td>Zebra version</td>
1017         </tr>
1018       </thead>
1019        <tbody>
1020         <tr>
1021          <td>Embedded Sort</td>
1022          <td>7</td>
1023          <td>search</td>
1024          <td>1.1</td>
1025         </tr>
1026         <tr>
1027          <td>Term Set</td>
1028          <td>8</td>
1029          <td>search</td>
1030          <td>1.1</td>
1031         </tr>
1032         <tr>
1033          <td>Rank Weight</td>
1034          <td>9</td>
1035          <td>search</td>
1036          <td>1.1</td>
1037         </tr>
1038         <tr>
1039          <td>Approx Limit</td>
1040          <td>9</td>
1041          <td>search</td>
1042          <td>1.4</td>
1043         </tr>
1044         <tr>
1045          <td>Term Reference</td>
1046          <td>10</td>
1047          <td>search</td>
1048          <td>1.4</td>
1049         </tr>
1050        </tbody>
1051       </table>
1052
1053     <sect3 id="querymodel-zebra-attr-sorting">
1054      <title>Zebra Extension Embedded Sort Attribute (type 7)</title>
1055     </sect3>
1056     <para>
1057      The embedded sort is a way to specify sorting within a query, thus
1058      removing the need to send a Sort Request separately. It is
1059      faster, and it does not require clients to deal with the Sort
1060      Facility.
1061     </para>
1062     <para>
1063      The possible values of attribute <literal>type 7</literal> are
1064      <literal>1</literal> (ascending) and
1065      <literal>2</literal> (descending).
1066      The attributes+term (APT) node that carries the sort is separate
1067      from the rest of the query and must be <literal>@or</literal>'ed with it.
1068      The term associated with this APT is the sorting level as an integer,
1069      where <literal>0</literal> means primary sort,
1070      <literal>1</literal> means secondary sort, and so forth.
1071      See also <xref linkend="administration-ranking"/>.
1072     </para>
1073     <para>
1074      For example, searching for water, sorted by title (ascending):
1075      <screen>
1076       Z> find @or @attr 1=1016 water @attr 7=1 @attr 1=4 0
1077      </screen>
1078     </para>
1079     <para>
1080      Or, searching for water, sorted by title ascending, then by date descending:
1081      <screen>
1082       Z> find @or @or @attr 1=1016 water @attr 7=1 @attr 1=4 0 @attr 7=2 @attr 1=30 1
1083      </screen>
1084     </para>
1085
1086     <sect3 id="querymodel-zebra-attr-estimation">
1087      <title>Zebra Extension Term Set Attribute (type 8)</title>
1088     </sect3>
1089     <para>
1090      The Term Set feature is a facility that allows a search to store
1091      hitting terms in a "pseudo" result set; it thus combines a search (as
1092      usual) with a scan-like facility. It requires a client that can handle
1093      named result sets, since the search generates two result sets. The value for
1094      attribute 8 is the name of a result set (string). The terms in
1095      the named term set are returned as SUTRS records.
1096     </para>
1097     <para>
1098      For example, searching for u in title, right truncated, and
1099      storing the result in a term set named 'aset':
1100      <screen>
1101       Z> find @attr 5=1 @attr 1=4 @attr 8=aset u
1102      </screen>
1103     </para>
1104     <warning>
1105      The model has one serious flaw: we don't know the size of the term
1106      set. This feature is experimental. Do not use in production code.
1107     </warning>
1108
1109     <sect3 id="querymodel-zebra-attr-weight">
1110      <title>Zebra Extension Rank Weight Attribute (type 9)</title>
1111     </sect3>
1112     <para>
1113      Rank weight is a way to pass a value to a ranking algorithm, so
1114      that one APT has one value while another has a different one.
1115      See also <xref linkend="administration-ranking"/>.
1116     </para>
1117     <para>
1118      For example, searching for utah in title with weight 30 as well
1119      as in any field with weight 20:
1120      <screen>
1121       Z> find @attr 2=102 @or @attr 9=30 @attr 1=4 utah @attr 9=20 utah
1122      </screen>
1123     </para>
1124
1125     <sect3 id="querymodel-zebra-attr-limit">
1126      <title>Zebra Extension Approximative Limit Attribute (type 9)</title>
1127     </sect3>
1128     <para>
1129      Newer Zebra versions normally estimate the hit count for every APT
1130      (leaf) in the query tree. These hit counts are returned as part of
1131      the searchResult-1 facility in the binary encoded Z39.50 search
1132      response packages.
1133     </para>
1134     <para>
1135      By setting a limit for the APT we can make Zebra switch to
1136      approximate hit counts when a certain hit count limit is
1137      reached. A value of zero means exact hit count.
1138     </para>
1139     <para>
1140      For example, we might be interested in the exact hit count for a, but
1141      for b we allow hit count estimates for 1000 and higher:
1142      <screen>
1143       Z> find @and a @attr 9=1000 b
1144      </screen>
1145     </para>
1146     <note>
1147      The estimated hit count facility makes searches faster, as one
1148      only needs to process large hit lists partially.
1149     </note>
1150     <warning>
1151      This facility clashes with rank weight, because there all
1152      documents in the hit lists need to be examined for scoring and
1153      re-sorting.
1154      It is an experimental
1155      extension. Do not use in production code.
1156     </warning>
1157
1158     <sect3 id="querymodel-zebra-attr-termref">
1159      <title>Zebra Extension Term Reference Attribute (type 10)</title>
1160     </sect3>
1161     <para>
1162      Zebra supports the searchResult-1 facility. If attribute 10 is
1163      given, its value specifies the subqueryId returned as part of the
1164      search result. It is a way for a client to name an APT part of a
1165      query.
1166     </para>
1167     <!--
1168     <para>
1169      <screen>
1170      </screen>
1171     </para>
1172     -->
1173     <warning>
1174      Experimental. Do not use in production code.
1175     </warning>
1176
1177
1178    </sect2>
1179
1180
1181    <sect2 id="querymodel-zebra-attr-scan">
1182     <title>Zebra specific Scan Extensions to all Attribute Sets</title>
1183     <para>
1184      Zebra extends the Bib1 attribute types, and these extensions are
1185      recognized regardless of the attribute
1186      set used in a <literal>scan</literal> operation query.
1187     </para>
1188      <table id="querymodel-zebra-attr-scan-table">
1189       <caption>Zebra Scan Attribute Extensions</caption>
1190        <thead>
1191         <tr>
1192          <td><emphasis>Name and Type</emphasis></td>
1193          <td>Operation</td>
1194          <td>Zebra version</td>
1195         </tr>
1196       </thead>
1197        <tbody>
1198         <tr>
1199          <td><emphasis>Result Set Narrow (type 8)</emphasis></td>
1200          <td>scan</td>
1201          <td>1.3</td>
1202         </tr>
1203         <tr>
1204          <td><emphasis>Approximative Limit (type 9)</emphasis></td>
1205          <td>scan</td>
1206          <td>1.4</td>
1207         </tr>
1208        </tbody>
1209       </table>
1210
1211     <sect3 id="querymodel-zebra-attr-xyz">
1212      <title>Zebra Extension Result Set Narrow (type 8)</title>
1213     </sect3>
1214     <para>
1215      If attribute 8 is given for scan, the value is the name of a
1216      result set. Each hit count in scan is @and'ed with the result set
1217      given.
1218     </para>
1219     <!--
1220     <para>
1221      <screen>
1222      </screen>
1223     </para>
1224     -->
1225     <warning>
1226      Experimental and buggy. Definitely not to be used in production code.
1227     </warning>
1228
1229     <sect3 id="querymodel-zebra-attr-xyz">
1230      <title>Zebra Extension Approximative Limit (type 9)</title>
1231     </sect3>
1232     <para>
1233      The approximative limit (as for search) is a way to enable
1234      approximate hit counts for scan.
1235     </para>
1236     <!--
1237     <para>
1238      <screen>
1239      </screen>
1240     </para>
1241     -->
1242     <warning>
1243      Experimental. Do not use in production code.
1244     </warning>
1245
1246
1247    </sect2>
1248
1249
1250    <sect2 id="querymodel-bib1-mapping">
1251     <title>Mapping from Bib1 Attributes to Zebra internal
1252      register indexes</title>
1253     <para>
1254      TO-DO
1255      </para>
1256
1257
1258      <!-- see in util/zebramap.c
1259       int zebra_maps_attr
1260
1261   if (completeness_value == 2 || completeness_value == 3)
1262         *complete_flag = 1;
1263     else
1264         *complete_flag = 0;
1265     *reg_id = 0;
1266
1267     *sort_flag =(sort_relation_value > 0) ? 1 : 0;
1268     *search_type = "phrase";
1269     strcpy(rank_type, "void");
1270     if (relation_value == 102)
1271     {
1272         if (weight_value == -1)
1273             weight_value = 34;
1274         sprintf(rank_type, "rank,w=%d,u=%d", weight_value, use_value);
1275     }
1276     if (relation_value == 103)
1277     {
1278         *search_type = "always";
1279         *reg_id = 'w';
1280         return 0;
1281     }
1282     if (*complete_flag)
1283         *reg_id = 'p';
1284     else
1285         *reg_id = 'w';
1286     switch (structure_value)
1287     {
1288     case 6:   /* word list */
1289         *search_type = "and-list";
1290         break;
1291     case 105: /* free-form-text */
1292         *search_type = "or-list";
1293         break;
1294     case 106: /* document-text */
1295         *search_type = "or-list";
1296         break;
1297     case -1:
1298     case 1:   /* phrase */
1299     case 2:   /* word */
1300     case 108: /* string */
1301         *search_type = "phrase";
1302         break;
1303    case 107: /* local-number */
1304         *search_type = "local";
1305         *reg_id = 0;
1306         break;
1307     case 109: /* numeric string */
1308         *reg_id = 'n';
1309         *search_type = "numeric";
1310         break;
1311     case 104: /* urx */
1312         *reg_id = 'u';
1313         *search_type = "phrase";
1314         break;
1315     case 3:   /* key */
1316         *reg_id = '0';
1317         *search_type = "phrase";
1318         break;
1319     case 4:  /* year */
1320         *reg_id = 'y';
1321         *search_type = "phrase";
1322         break;
1323     case 5:  /* date */
1324         *reg_id = 'd';
1325         *search_type = "phrase";
1326         break;
1327     default:
1328         return -1;
1329     }
1330     return 0;
1331
1332      -->
1333
1334
1335     <para>
1336      <emphasis>Use</emphasis> attributes are interpreted according to the
1337      attribute sets which have been loaded in the
1338     <literal>zebra.cfg</literal> file, and are matched against specific
1339      fields as specified in the <literal>.abs</literal> file which
1340      describes the profile of the records which have been loaded.
1341      If no Use attribute is provided, a default of Bib-1 Any is assumed.
1342     </para>
1343
1344     <para>
1345      If a <emphasis>Structure</emphasis> attribute of
1346      <emphasis>Phrase</emphasis> is used in conjunction with a
1347      <emphasis>Completeness</emphasis> attribute of
1348      <emphasis>Complete (Sub)field</emphasis>, the term is matched
1349      against the contents of the phrase (long word) register, if one
1350      exists for the given <emphasis>Use</emphasis> attribute.
1351      A phrase register is created for those fields in the
1352      <literal>.abs</literal> file that contain a
1353      <literal>p</literal>-specifier.
1354      <!-- ### whatever the hell _that_ is -->
1355     </para>
1356
1357     <para>
1358      If <emphasis>Structure</emphasis>=<emphasis>Phrase</emphasis> is
1359      used in conjunction with <emphasis>Incomplete Field</emphasis> (the
1360      default value for <emphasis>Completeness</emphasis>), the
1361      search is directed against the normal word registers, but if the term
1362      contains multiple words, the term will only match if all of the words
1363      are found immediately adjacent, and in the given order.
1364      The word search is performed on those fields that are indexed as
1365      type <literal>w</literal> in the <literal>.abs</literal> file.
1366     </para>
1367
1368     <para>
1369      If the <emphasis>Structure</emphasis> attribute is
1370      <emphasis>Word List</emphasis>,
1371      <emphasis>Free-form Text</emphasis>, or
1372      <emphasis>Document Text</emphasis>, the term is treated as a
1373      natural-language, relevance-ranked query.
1374      This search type uses the word register, i.e. those fields
1375      that are indexed as type <literal>w</literal> in the
1376      <literal>.abs</literal> file.
1377     </para>
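          <para>
           As a sketch only: the attribute mapping in
           <filename>util/zebramap.c</filename> names the search type
           <literal>and-list</literal> for <emphasis>Word List</emphasis> and
           <literal>or-list</literal> for <emphasis>Free-form Text</emphasis>.
           Assuming that these names mean what they suggest, the first query
           below would be expected to require both words in the title, and
           the second either of them. The search terms are illustrative
           assumptions:
           <screen>
            Z> find @attr 1=4 @attr 4=6 "information retrieval"
            Z> find @attr 1=4 @attr 4=105 "information retrieval"
           </screen>
          </para>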
1378
1379     <para>
1380      If the <emphasis>Structure</emphasis> attribute is
1381      <emphasis>Numeric String</emphasis> the term is treated as an integer.
1382      The search is performed on those fields that are indexed
1383      as type <literal>n</literal> in the <literal>.abs</literal> file.
1384     </para>
1385
1386     <para>
1387      If the <emphasis>Structure</emphasis> attribute is
1388      <emphasis>URx</emphasis> the term is treated as a URX (URL) entity.
1389      The search is performed on those fields that are indexed as type
1390      <literal>u</literal> in the <literal>.abs</literal> file.
1391     </para>
1392
1393     <para>
1394      If the <emphasis>Structure</emphasis> attribute is
1395      <emphasis>Local Number</emphasis> the term is treated as a
1396      native Zebra Record Identifier.
1397     </para>
1398
1399     <para>
1400      If the <emphasis>Relation</emphasis> attribute is
1401      <emphasis>Equals</emphasis> (default), the term is matched
1402      in a normal fashion (modulo truncation and processing of
1403      individual words, if required).
1404      If <emphasis>Relation</emphasis> is <emphasis>Less Than</emphasis>,
1405      <emphasis>Less Than or Equal</emphasis>,
1406      <emphasis>Greater than</emphasis>, or <emphasis>Greater than or
1407       Equal</emphasis>, the term is assumed to be numerical, and a
1408      standard regular expression is constructed to match the given
1409      expression.
1410      If <emphasis>Relation</emphasis> is <emphasis>Relevance</emphasis>,
1411      the standard natural-language query processor is invoked.
1412     </para>
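          <para>
           For example, assuming a date index that is reachable through use
           attribute <literal>30</literal>, a query for all records dated
           1990 or later could be written with the relation
           <emphasis>Greater than or Equal</emphasis> (value
           <literal>4</literal>). The year is an illustrative assumption:
           <screen>
            Z> find @attr 1=30 @attr 2=4 1990
           </screen>
          </para>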
1413
1414     <para>
1415      For the <emphasis>Truncation</emphasis> attribute,
1416      <emphasis>No Truncation</emphasis> is the default.
1417      <emphasis>Left Truncation</emphasis> is not supported.
1418      <emphasis>Process # in search term</emphasis> is supported, as is
1419      <emphasis>Regexp-1</emphasis>.
1420      <emphasis>Regexp-2</emphasis> enables the fault-tolerant (fuzzy)
1421      search. By default, a single error (deletion, insertion,
1422      replacement) is accepted when terms are matched against the register
1423      contents.
1424     </para>
1425    </sect2>
1426
1427    <sect2  id="querymodel-regular">
1428     <title>Zebra Regular Expressions in Truncation Attribute (type = 5)</title>
1429
1430     <para>
1431      Each term in a query is interpreted as a regular expression if
1432      the truncation value is either <emphasis>Regexp-1 (@attr 5=102)</emphasis>
1433      or <emphasis>Regexp-2 (@attr 5=103)</emphasis>.
1434      Both query types follow the same syntax with the operands:
1435     </para>
1436
1437      <table id="querymodel-regular-operands-table">
1438       <caption>Regular Expression Operands</caption>
1439        <!--
1440        <thead>
1441        <tr><td>one</td><td>two</td></tr>
1442       </thead>
1443        -->
1444        <tbody>
1445         <tr>
1446          <td><emphasis>x</emphasis></td>
1447          <td>Matches the character <emphasis>x</emphasis>.</td>
1448         </tr>
1449         <tr>
1450          <td><emphasis>.</emphasis></td>
1451          <td>Matches any character.</td>
1452         </tr>
1453         <tr>
1454          <td><emphasis>[ .. ]</emphasis></td>
1455          <td>Matches the set of characters specified;
1456          such as <literal>[abc]</literal> or <literal>[a-c]</literal>.</td>
1457         </tr>
1458        </tbody>
1459       </table>
1460
1461     <para>
1462      The above operands can be combined with the following operators:
1463     </para>
1464
1465
1466      <table id="querymodel-regular-operators-table">
1467       <caption>Regular Expression Operators</caption>
1468        <!--
1469        <thead>
1470        <tr><td>one</td><td>two</td></tr>
1471       </thead>
1472        -->
1473        <tbody>
1474         <tr>
1475          <td><emphasis>x*</emphasis></td>
1476          <td>Matches <emphasis>x</emphasis> zero or more times.
1477           Priority: high.</td>
1478         </tr>
1479         <tr>
1480          <td><emphasis>x+</emphasis></td>
1481          <td>Matches <emphasis>x</emphasis> one or more times.
1482           Priority: high.</td>
1483         </tr>
1484         <tr>
1485          <td><emphasis>x?</emphasis></td>
1486          <td> Matches <emphasis>x</emphasis> zero or one time.
1487           Priority: high.</td>
1488         </tr>
1489         <tr>
1490          <td><emphasis>xy</emphasis></td>
1491          <td> Matches <emphasis>x</emphasis>, then <emphasis>y</emphasis>.
1492          Priority: medium.</td>
1493         </tr>
1494         <tr>
1495          <td><emphasis>x|y</emphasis></td>
1496          <td> Matches either <emphasis>x</emphasis> or <emphasis>y</emphasis>.
1497          Priority: low.</td>
1498         </tr>
1499         <tr>
1500          <td><emphasis>( )</emphasis></td>
1501          <td>The order of evaluation may be changed by using parentheses.</td>
1502         </tr>
1503        </tbody>
1504       </table>
1505
1506     <para>
1507      If the first character of the <emphasis>Regexp-2</emphasis> query
1508      is a plus character (<literal>+</literal>) it marks the
1509      beginning of a section with non-standard specifiers.
1510      The next plus character marks the end of the section.
1511      Currently Zebra only supports one specifier, the error tolerance,
1512      which consists of one digit.
1513     </para>
1514
1515     <para>
1516      Since the plus operator is normally a suffix operator, the addition to
1517      the query syntax doesn't violate the syntax for standard regular
1518      expressions.
1519     </para>
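          <para>
           Read literally, the description above suggests that a tolerance
           of two errors could be requested as sketched below. The misspelled
           search term is an illustrative assumption, and the exact behaviour
           should be verified against the Zebra version in use:
           <screen>
            Z> find @attr 1=4 @attr 5=103 "+2+infomaton"
           </screen>
          </para>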
1520
1521     <para>
1522      For example, a phrase search with regular expressions in
1523      the title-register is performed like this:
1524      <screen>
1525       Z> find @attr 1=4 @attr 5=102 "informat.* retrieval"
1526      </screen>
1527     </para>
1528
1529     <para>
1530      Combinations with other attributes are possible. For example, a
1531      ranked search with a regular expression
1532      (see <xref linkend="administration-ranking"/> for the gory details):
1533      <screen>
1534       Z> find @attr 1=4 @attr 5=102 @attr 2=102 "informat.* retrieval"
1535      </screen>
1536     </para>
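          <para>
           The operators listed in the tables of this section can be combined
           in the same way. As a sketch with illustrative terms, the first
           query below would be expected to match both spellings
           <literal>color</literal> and <literal>colour</literal>, and the
           second both <literal>database</literal> and
           <literal>knowledgebase</literal>:
           <screen>
            Z> find @attr 1=4 @attr 5=102 "colou?r"
            Z> find @attr 1=4 @attr 5=102 "(data|knowledge)base"
           </screen>
          </para>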
1537    </sect2>
1538
1539
1540    <!--
1541    <para>
1542     The RecordType parameter in the <literal>zebra.cfg</literal> file, or
1543     the <literal>-t</literal> option to the indexer tells Zebra how to
1544     process input records.
1545     Two basic types of processing are available - raw text and structured
1546     data. Raw text is just that, and it is selected by providing the
1547     argument <emphasis>text</emphasis> to Zebra. Structured records are
1548     all handled internally using the basic mechanisms described in the
1549     subsequent sections.
1550     Zebra can read structured records in many different formats.
1551    </para>
1552    -->
1553   </sect1>
1554
1555
1556   <sect1 id="querymodel-cql-to-pqf">
1557    <title>Server Side CQL to PQF Query Translation</title>
1558    <para>
1559     Using the
1560     <literal>&lt;cql2rpn&gt;l2rpn.txt&lt;/cql2rpn&gt;</literal>
1561       YAZ Frontend Virtual
1562     Hosts option, one can configure
1563     the YAZ Frontend CQL-to-PQF
1564     converter, specifying the interpretation of various
1565     <ulink url="&url.cql;">CQL</ulink>
1566     indexes, relations, etc. in terms of Type-1 query attributes.
1567     <!-- The  yaz-client config file -->
1568    </para>
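         <para>
          A minimal sketch of what such a mapping file might contain is shown
          below. The selection of indexes and the attribute values are
          assumptions made for illustration; the YAZ manual remains the
          authoritative description of the file format:
          <screen>
           set.cql = info:srw/cql-context-set/1/cql-v1.1
           set.dc = info:srw/cql-context-set/1/dc-v1.1
           index.cql.serverChoice = 1=1016
           index.dc.title = 1=4
           relation.eq = 2=3
           relationModifier.relevant = 2=102
           structure.* = 4=1
          </screen>
          In this sketch each line maps one CQL construct to the Type-1
          attributes that are sent to Zebra, for example the relation
          modifier <literal>relevant</literal> to relevance ranking
          (<literal>2=102</literal>).
         </para>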
1569    <para>
1570     For example, using server-side CQL-to-PQF conversion, one might
1571     query a Zebra server like this:
1572     <screen>
1573     <![CDATA[
1574      yaz-client localhost:9999
1575      Z> querytype cql
1576      Z> find text=(plant and soil)
1577      ]]>
1578     </screen>
1579      and, if properly configured, even static relevance ranking can
1580      be performed using CQL query syntax:
1581     <screen>
1582     <![CDATA[
1583      Z> find text = /relevant (plant and soil)
1584      ]]>
1585      </screen>
1586    </para>
1587
1588    <para>
1589     The same configuration can also be used to
1590     search using client-side CQL-to-PQF conversion.
1591     The only differences are <literal>querytype cql2rpn</literal>
1592     instead of
1593     <literal>querytype cql</literal>, and the call specifying a local
1594     conversion file:
1595     <screen>
1596     <![CDATA[
1597      yaz-client -q local/cql2pqf.txt localhost:9999
1598      Z> querytype cql2rpn
1599      Z> find text=(plant and soil)
1600      ]]>
1601      </screen>
1602    </para>
1603
1604    <para>
1605     Exhaustive information can be found in the
1606     section "Specification of CQL to RPN mappings" in the YAZ manual,
1607     <ulink url="http://www.indexdata.dk/yaz/doc/tools.tkl#tools.cql.map">
1608      http://www.indexdata.dk/yaz/doc/tools.tkl#tools.cql.map</ulink>,
1609     and shall therefore not be repeated here.
1610    </para>
1611   <!--
1612   <para>
1613     See
1614       <ulink url="http://www.loc.gov/z3950/agency/zing/cql/dc-indexes.html">
1615       http://www.loc.gov/z3950/agency/zing/cql/dc-indexes.html</ulink>
1616     for the Maintenance Agency's work-in-progress mapping of Dublin Core
1617     indexes to Attribute Architecture (util, XD and BIB-2)
1618     attributes.
1619    </para>
1620    -->
1621  </sect1>
1622
1623
1624
1625 </chapter>
1626
1627  <!-- Keep this comment at the end of the file
1628  Local variables:
1629  mode: sgml
1630  sgml-omittag:t
1631  sgml-shorttag:t
1632  sgml-minimize-attributes:nil
1633  sgml-always-quote-attributes:t
1634  sgml-indent-step:1
1635  sgml-indent-data:t
1636  sgml-parent-document: "zebra.xml"
1637  sgml-local-catalogs: nil
1638  sgml-namecase-general:t
1639  End:
1640  -->