1 <chapter id="querymodel">
2 <!-- $Id: querymodel.xml,v 1.11 2006-06-22 14:01:55 marc Exp $ -->
3 <title>Query Model</title>
5 <sect1 id="querymodel-overview">
6 <title>Query Model Overview</title>
9 <sect2 id="querymodel-query-languages">
10 <title>Query Languages</title>
Zebra was born as a networked Information Retrieval engine adhering
14 to the international standards
15 <ulink url="&url.z39.50;">Z39.50</ulink> and
16 <ulink url="&url.sru;">SRU</ulink>,
18 <literal>type-1 Reverse Polish Notation (RPN)</literal> query
Unfortunately, this model defines only a binary-encoded
representation, which is used as transport packaging in
the Z39.50 protocol layer. This representation is not human
readable, nor does it define any convenient way to specify queries.
26 Since the <literal>type-1 (RPN)</literal>
27 query structure has no direct, useful string
28 representation, every origin application needs to provide some
29 form of mapping from a local query notation or representation to it.
33 <sect3 id="querymodel-query-languages-pqf">
34 <title>Prefix Query Format (PQF)</title>
Index Data has defined a textual representation, the
<literal>Prefix Query Format</literal>, or
<literal>PQF</literal> for short, which maps
<literal>one-to-one</literal> to binary encoded
<literal>type-1 RPN</literal> query packages.
It has been adopted by other
parties developing Z39.50 software, and is often referred to as
<literal>Prefix Query Notation</literal>, or
<literal>PQN</literal> for short. See
<xref linkend="querymodel-pqf"/> for further explanations and
descriptions of Zebra's capabilities.
51 <sect3 id="querymodel-query-languages-cql">
52 <title>Common Query Language (CQL)</title>
The query model of the <literal>type-1 RPN</literal>,
expressed in <literal>PQF/PQN</literal>, is natively supported.
On the other hand, the default <literal>SRU</literal>
web service query language, the <literal>Common Query Language</literal>
<ulink url="&url.cql;">CQL</ulink>, is not natively supported.
61 Zebra can be configured to understand and map CQL to PQF. See
62 <xref linkend="querymodel-cql-to-pqf"/>.
68 <sect2 id="querymodel-operation-types">
69 <title>Operation types</title>
Zebra supports all three
<literal>Z39.50/SRU</literal> operations defined in the
standards: <literal>explain</literal>, <literal>search</literal>,
and <literal>scan</literal>. A short description of the
functionality and purpose of each follows.
78 <sect3 id="querymodel-operation-type-explain">
79 <title>Explain Operation</title>
The <emphasis>syntax</emphasis> of Z39.50/SRU queries is
well known to any client, but the specific
<emphasis>semantics</emphasis> - taking into account a
particular server's functionalities and abilities - must be
discovered from case to case. Enter the
86 <literal>explain</literal> operation, which provides the means
<emphasis>fields</emphasis> (also called
<emphasis>indexes</emphasis> or <emphasis>access points</emphasis>)
are provided, which default parameters the server uses, which
retrieval document formats are defined, and which specific parts
of the general query model are supported.
Z39.50 embeds the <literal>explain</literal> operation
97 <literal>search</literal> in the magic
98 <literal>IR-Explain-1</literal> database;
99 see <xref linkend="querymodel-exp1"/>.
In SRU, <literal>explain</literal> is an entirely separate
operation, which returns a <literal>ZeeRex
XML</literal> record according to the
structure defined by the protocol.
108 In both cases, the information gathered through
109 <literal>explain</literal> operations can be used to
auto-configure a client user interface to the server's
115 <sect3 id="querymodel-operation-type-search">
116 <title>Search Operation</title>
Search and retrieve interactions are the raison d'être.
119 They are used to query the remote database and
120 return search result documents. Search queries span from
121 simple free text searches to nested complex boolean queries,
122 targeting specific indexes, and possibly enhanced with many
123 query semantic specifications. Search interactions are the heart
124 and soul of Z39.50/SRU servers.
128 <sect3 id="querymodel-operation-type-scan">
129 <title>Scan Operation</title>
The <literal>scan</literal> operation is a helper function,
which operates on one index or access point at a time.
136 the means to investigate the content of specific indexes.
Scanning an index returns a handful of terms actually found in
the indexes, and in addition the <literal>scan</literal>
operation returns the number of documents indexed by each term.
140 A search client can use this information to propose proper
141 spelling of search terms, to auto-fill search boxes, or to
142 display controlled vocabularies.
151 <sect1 id="querymodel-pqf">
152 <title>Prefix Query Format structure and syntax</title>
The <ulink url="&url.yaz.pqf;">PQF grammar</ulink>
is documented in the YAZ manual, and is not
repeated here. During search, this textual PQF representation
is always mapped to the equivalent Zebra internal
161 <sect2 id="querymodel-pqf-tree">
162 <title>PQF tree structure</title>
164 The PQF parse tree - or the equivalent textual representation -
165 may start with one specification of the
166 <emphasis>attribute set</emphasis> used. Following is a query
consists of <emphasis>atomic query parts (APT)</emphasis> or
<emphasis>named result sets</emphasis>, possibly
paired by <emphasis>boolean binary operators</emphasis>, and
finally <emphasis>recursively combined</emphasis> into
175 <sect3 id="querymodel-attribute-sets">
176 <title>Attribute sets</title>
178 Attribute sets define the exact meaning and semantics of queries
issued. Zebra comes with some predefined attribute set
definitions; others can easily be defined and added to the
185 <table id="querymodel-attribute-sets-table"
186 frame="all" rowsep="1" colsep="1" align="center">
188 <caption>Attribute sets predefined in Zebra</caption>
192 <td>Attribute set</td>
201 <td><literal>Explain</literal></td>
202 <td><literal>exp-1</literal></td>
203 <td>Special attribute set used on the special automagic
204 <literal>IR-Explain-1</literal> database to gain information on
205 server capabilities, database names, and database
210 <td><literal>Bib1</literal></td>
211 <td><literal>bib-1</literal></td>
212 <td>Standard PQF query language attribute set which defines the
213 semantics of Z39.50 searching. In addition, all of the
214 non-use attributes (type 2-9) define the hard-wired
220 <td><literal>GILS</literal></td>
221 <td><literal>gils</literal></td>
<td>Extension to the <literal>Bib1</literal> attribute set.</td>
227 <td><literal>IDXPATH</literal></td>
228 <td><literal>idxpath</literal></td>
<td>Hardwired XPath-like attribute set, only available for
indexing with the GRS record model</td>
239 The use attributes (type 1) of the predefined attribute sets can
240 be reconfigured by tweaking the files
241 <filename>tab/*.att</filename>.
242 New attribute sets can be defined by adding similar files in the
243 configuration path of the server.
247 The Zebra internal query processing is modeled after
248 the <literal>Bib1</literal> attribute set, and the non-use
249 attributes type 2-6 are hard-wired in. It is therefore essential
250 to be familiar with <xref linkend="querymodel-bib1-nonuse"/>.
254 <sect3 id="querymodel-boolean-operators">
255 <title>Boolean operators</title>
257 A pair of subquery trees, or of atomic queries, is combined
258 using the standard boolean operators into new query trees.
261 <table id="querymodel-boolean-operators-table"
262 frame="all" rowsep="1" colsep="1" align="center">
264 <caption>Boolean operators</caption>
273 <tr><td><literal>@and</literal></td>
274 <td>binary <literal>AND</literal> operator</td>
<td>Set intersection of the hit sets of two atomic queries</td>
277 <tr><td><literal>@or</literal></td>
278 <td>binary <literal>OR</literal> operator</td>
<td>Set union of the hit sets of two atomic queries</td>
281 <tr><td><literal>@not</literal></td>
282 <td>binary <literal>AND NOT</literal> operator</td>
<td>Set difference (relative complement) of the hit sets of two atomic queries</td>
285 <tr><td><literal>@prox</literal></td>
<td>binary <literal>PROXIMITY</literal> operator</td>
<td>Set intersection of the hit sets of two atomic queries. In
addition, the intersection set is purged of all
documents which do not satisfy the requested query
term proximity. Usually a proper subset of the AND
297 For example, we can combine the terms
298 <emphasis>information</emphasis> and <emphasis>retrieval</emphasis>
299 into different searches in the default index of the default
300 attribute set as follows.
301 Querying for the union of all documents containing the
302 terms <emphasis>information</emphasis> OR
303 <emphasis>retrieval</emphasis>:
305 Z> find @or information retrieval
309 Querying for the intersection of all documents containing the
310 terms <emphasis>information</emphasis> AND
311 <emphasis>retrieval</emphasis>:
The hit set is a subset of the corresponding
315 Z> find @and information retrieval
319 Querying for the intersection of all documents containing the
320 terms <emphasis>information</emphasis> AND
321 <emphasis>retrieval</emphasis>, taking proximity into account:
The hit set is a subset of the corresponding
325 Z> find @prox 0 3 0 2 k 2 information retrieval
See <ulink url="&url.yaz.pqf;">PQF grammar</ulink> for details.
330 Querying for the intersection of all documents containing the
331 terms <emphasis>information</emphasis> AND
332 <emphasis>retrieval</emphasis>, in the same order and near each
other as described in the term list.
The hit set is a subset of the corresponding
337 Z> find "information retrieval"
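The boolean combinations above can also be generated programmatically. The following is a minimal Python sketch, not part of Zebra or YAZ: the helper name pqf and the tuple layout are assumptions made for illustration, and only the operators @and, @or and @not are covered (the proximity operator takes extra parameters and is left out).

```python
def pqf(node):
    """Render a plain term, or a nested (operator, left, right) tuple, as PQF."""
    if isinstance(node, str):
        # Quote empty terms and term lists so they stay a single PQF token.
        return '"%s"' % node if (not node or " " in node) else node
    operator, left, right = node
    if operator not in ("and", "or", "not"):
        raise ValueError("unsupported operator: %r" % (operator,))
    # Prefix notation: the operator comes first, then both operands.
    return "@%s %s %s" % (operator, pqf(left), pqf(right))


print(pqf(("or", "information", "retrieval")))
print(pqf(("and", "information", ("not", "retrieval", "storage"))))
```

The resulting strings can be pasted after the find command in yaz-client. The term storage in the second call is an illustrative value only.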
343 <sect3 id="querymodel-atomic-queries">
344 <title>Atomic queries (APT)</title>
Atomic queries are the query parts which work on one access point
347 only. These consist of <literal>an attribute list</literal>
348 followed by a <literal>single term</literal> or a
349 <literal>quoted term list</literal>, and are often called
350 <emphasis>Attributes-Plus-Terms (APT)</emphasis> queries.
353 Unsupplied non-use attributes type 2-9 are either inherited from
354 higher nodes in the query tree, or are set to Zebra's default values.
355 See <xref linkend="querymodel-bib1"/> for details.
358 <table id="querymodel-atomic-queries-table"
359 frame="all" rowsep="1" colsep="1" align="center">
361 <caption>Atomic queries</caption>
364 <tr><td>one</td><td>two</td></tr>
369 <td><emphasis>attribute list</emphasis></td>
370 <td>List of <literal>orthogonal</literal> attributes</td>
371 <td>Any of the orthogonal attribute types may be omitted,
372 these are inherited from higher query tree nodes, or if not
373 inherited, are set to the default Zebra configuration values.
377 <td><emphasis>term</emphasis></td>
378 <td>single <literal>term</literal>
379 or <literal>quoted term list</literal> </td>
380 <td>Here the search terms or list of search terms is added
386 Querying for the term <emphasis>information</emphasis> in the
default index using the default attribute set, the server choice
388 of access point/index, and the default non-use attributes.
390 Z> find "information"
394 Equivalent query fully specified including all default values:
396 Z> find @attrset bib-1 @attr 1=1017 @attr 2=3 @attr 3=3 @attr 4=1 @attr 5=100 @attr 6=1 "information"
401 Finding all documents which have empty titles. Notice that the
402 empty term must be quoted, but is otherwise legal.
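To make the defaulting of unsupplied attributes concrete, here is a small Python sketch that expands a term into a fully specified query. The default values are copied from the fully specified example query in this section; the function name full_apt and its parameters are hypothetical, and inheritance from higher query tree nodes is not modeled.

```python
# Defaults as listed in the fully specified example query of this section.
DEFAULTS = {1: "1017", 2: "3", 3: "3", 4: "1", 5: "100", 6: "1"}


def full_apt(term, attributes=None):
    """Spell out every attribute type 1-6, using defaults where none is given."""
    merged = dict(DEFAULTS)
    merged.update(attributes or {})
    parts = ["@attr %d=%s" % (kind, merged[kind]) for kind in sorted(merged)]
    # The term is always quoted, so an empty term is rendered as "".
    return '@attrset bib-1 %s "%s"' % (" ".join(parts), term)


print(full_apt("information"))
print(full_apt("", {1: 4}))
```

The second call overrides only the use attribute and keeps the remaining defaults; it also shows the quoted empty term.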
411 <sect3 id="querymodel-resultset">
412 <title>Named Result Sets</title>
414 Named result sets are supported in Zebra, and result sets can be
415 used as operands without limitations.
418 After the execution of a search, the result set is available at
419 the server, such that the client can use it for subsequent
searches or retrieval requests. The Z39.50 standard actually
stresses the fact that result sets are volatile. A result set may cease
to exist at any point after the search, in which case the server will
send a diagnostic to the effect that the requested
result set does not exist any more.
428 Defining a named result set and re-using it in the next query,
429 using <literal>yaz-client</literal>.
431 Z> f @attr 1=4 mozart
433 Number of hits: 43, setno 1
435 Z> f @and @set 1 @attr 1=4 amadeus
437 Number of hits: 14, setno 2
439 Z> f @attr 1=1016 beethoven
441 Number of hits: 26, setno 3
447 Named result sets are only supported by the Z39.50 protocol.
448 The SRU web service is stateless, and therefore the notion of
named result sets does not exist when accessing a Zebra server by
455 <sect3 id="querymodel-use-string">
456 <title>Zebra's special use attribute type 1 of form 'string'</title>
The numeric <literal>use (type 1)</literal> attribute is usually
referred to from a given
attribute set. In addition, Zebra lets you use
<emphasis>any internal index
name defined in your configuration</emphasis>
as use attribute value. This is a great feature for
debugging, and when you do
not need the complexity of defined use attribute values. It is
the preferred way of accessing Zebra indexes directly.
Finding all documents which have the term list "information
retrieval" in a Zebra index, using its internal full string
name. Scanning the same index.
473 Z> find @attr 1=sometext "information retrieval"
474 Z> scan @attr 1=sometext aterm
478 Searching or scanning
the bib-1 use attribute 54 using its string name:
481 Z> find @attr 1=Code-language eng
482 Z> scan @attr 1=Code-language ""
486 It is possible to search
487 in any silly string index - if it's defined in your
indexing rules and can be parsed by the PQF parser.
489 This is definitely not the recommended use of
490 this facility, as it might confuse your users with some very
493 Z> find @attr 1=silly/xpath/alike[@index]/name "information retrieval"
497 See also <xref linkend="querymodel-bib1-mapping"/> for details, and
498 <xref linkend="server-sru"/>
for the SRU PQF query extension using string names as a fast
504 <sect3 id="querymodel-use-xpath">
505 <title>Zebra's special use attribute type 1 of form 'XPath'
506 for GRS filters</title>
508 As we have seen above, it is possible (albeit seldom a great
510 <ulink url="http://www.w3.org/TR/xpath">XPath 1.0</ulink> based
511 search by defining <literal>use (type 1)</literal>
<emphasis>string</emphasis> attributes which in appearance
<emphasis>resemble XPath queries</emphasis>. There are two
problems with this approach: first, the XPath-look-alike has to
be defined at indexing time, so no new undefined
XPath queries can be entered at search time, and second, it might
confuse users very much that an XPath-alike index name in fact
gets populated from a possibly entirely different XML element
than it pretends to access.
522 When using the <literal>GRS Record Model</literal>
523 (see <xref linkend="record-model-grs"/>), we have the
possibility to embed <emphasis>live</emphasis>
526 in the PQF queries, which are here called
527 <literal>use (type 1)</literal> <emphasis>xpath</emphasis>
528 attributes. You must enable the
529 <literal>xpath enable</literal> directive in your
530 <literal>.abs</literal> config files.
533 Only a <emphasis>very</emphasis> restricted subset of the
534 <ulink url="http://www.w3.org/TR/xpath">XPath 1.0</ulink>
535 standard is supported as the GRS record model is simpler than
536 a full XML DOM structure. See the following examples for
540 Finding all documents which have the term "content"
541 inside a text node found in a specific XML DOM
542 <emphasis>subtree</emphasis>, whose starting element is
545 Z> find @attr 1=/root content
546 Z> find @attr 1=/root/first content
548 <emphasis>Notice that the
549 XPath must be absolute, i.e., must start with '/', and that the
XPath <literal>descendant-or-self</literal> axis followed by a
551 text node selection <literal>text()</literal> is implicitly
552 appended to the stated XPath.
554 It follows that the above searches are interpreted as:
556 Z> find @attr 1=/root//text() content
557 Z> find @attr 1=/root/first//text() content
562 Searching inside attribute strings is possible:
564 Z> find @attr 1=/link/@creator morten
Filter the addressing XPath by a predicate working on exact
571 attributes (in the XML sense) can be done: return all those docs which
572 have the term "english" contained in one of all text subnodes of
573 the subtree defined by the XPath
574 <literal>/record/title[@lang='en']</literal>. And similar
577 Z> find @attr 1=/record/title[@lang='en'] english
578 Z> find @attr 1=/link[@creator='sisse'] sibelius
579 Z> find @attr 1=/link[@creator='sisse']/description[@xml:lang='da'] sibelius
584 Combining numeric indexes, boolean expressions,
585 and xpath based searches is possible:
587 Z> find @attr 1=/record/title @and foo bar
588 Z> find @and @attr 1=/record/title foo @attr 1=4 bar
592 Escaping PQF keywords and other non-parseable XPath constructs
593 with <literal>'{ }'</literal> to prevent syntax errors:
595 Z> find @attr {1=/root/first[@attr='danish']} content
596 Z> find @attr {1=/record/@set} oai
It is worth mentioning that these dynamically performed XPath
queries are a performance bottleneck, as no optimized
specialized indexes can be used. Therefore, avoid the use of
this facility when speed is essential, and the database content
size is medium to large.
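The rules for XPath use attributes given in this section (absolute path, escaping with braces) can be captured in a few lines. This Python sketch is illustrative only: the function name xpath_query is hypothetical, and it assumes that wrapping the attribute in braces is harmless even for paths that would parse without them.

```python
def xpath_query(xpath, term):
    """Build a PQF query that uses an XPath expression as use attribute."""
    if not xpath.startswith("/"):
        # Only absolute XPath expressions are accepted, as noted in this section.
        raise ValueError("XPath use attributes must start with '/'")
    # Braces keep predicates and PQF keywords from confusing the PQF parser.
    return "@attr {1=%s} %s" % (xpath, term)


print(xpath_query("/root/first[@attr='danish']", "content"))
print(xpath_query("/record/@set", "oai"))
```

Both calls reproduce the escaped example queries of this section.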
611 <sect2 id="querymodel-exp1">
612 <title>Explain Attribute Set</title>
614 The Z39.50 standard defines the
<ulink url="&url.z39.50.explain;">Explain</ulink> attribute set
<literal>exp-1</literal>, which is used to discover information
about a server's search semantics and functional capabilities.
Zebra exposes a "classic"
Explain database by base name <literal>IR-Explain-1</literal>, which
is populated with system internal information.
623 The attribute-set <literal>exp-1</literal> consists of a single
624 <literal>Use (type 1)</literal> attribute.
627 In addition, the non-Use
628 <literal>bib-1</literal> attributes, that is, the types
629 <literal>Relation</literal>, <literal>Position</literal>,
630 <literal>Structure</literal>, <literal>Truncation</literal>,
631 and <literal>Completeness</literal> are imported from
632 the <literal>bib-1</literal> attribute set, and may be used
633 within any explain query.
636 <sect3 id="querymodel-exp1-use">
637 <title>Use Attributes (type = 1)</title>
The following Explain search attributes are supported:
<literal>ExplainCategory</literal> (@attr 1=1),
<literal>DatabaseName</literal> (@attr 1=3),
<literal>DateAdded</literal> (@attr 1=9),
<literal>DateChanged</literal> (@attr 1=10).
646 A search in the use attribute <literal>ExplainCategory</literal>
647 supports only these predefined values:
648 <literal>CategoryList</literal>, <literal>TargetInfo</literal>,
649 <literal>DatabaseInfo</literal>, <literal>AttributeDetails</literal>.
652 See <filename>tab/explain.att</filename> and the
653 <ulink url="&url.z39.50;">Z39.50</ulink> standard
654 for more information.
659 <title>Explain searches with yaz-client</title>
Classic Explain only defines retrieval of Explain information
via ASN.1. Practically no Z39.50 client supports this. Fortunately
they don't have to - Zebra allows retrieval of this information
665 <literal>SUTRS</literal>, <literal>XML</literal>,
666 <literal>GRS-1</literal> and <literal>ASN.1</literal> Explain.
670 List supported categories to find out which explain commands are
674 Z> find @attr exp1 1=1 categorylist
681 Get target info, that is, investigate which databases exist at
682 this server endpoint:
685 Z> find @attr exp1 1=1 targetinfo
List all supported databases. The number of hits
is the number of databases found, which most commonly are the
699 the <literal>Default</literal> and the
700 <literal>IR-Explain-1</literal> databases.
703 Z> find @attr exp1 1=1 databaseinfo
710 Get database info record for database <literal>Default</literal>.
713 Z> find @and @attr exp1 1=1 databaseinfo @attr exp1 1=3 Default
715 Identical query with explicitly specified attribute set:
718 Z> find @attrset exp1 @and @attr 1=1 databaseinfo @attr 1=3 Default
723 Get attribute details record for database
724 <literal>Default</literal>.
725 This query is very useful to study the internal Zebra indexes.
726 If records have been indexed using the <literal>alvis</literal>
727 XSLT filter, the string representation names of the known indexes can be
731 Z> find @and @attr exp1 1=1 attributedetails @attr exp1 1=3 Default
733 Identical query with explicitly specified attribute set:
736 Z> find @attrset exp1 @and @attr 1=1 attributedetails @attr 1=3 Default
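The Explain queries in this section all follow one pattern, which the Python sketch below generates. It is an illustration under assumptions: the function name explain_query is hypothetical, the category names are the four listed earlier in this section, and the form without a database name is derived by analogy with the examples rather than taken from them.

```python
CATEGORIES = ("categorylist", "targetinfo", "databaseinfo", "attributedetails")


def explain_query(category, database=None):
    """Build an exp1 query for a category, optionally limited to one database."""
    name = category.lower()
    if name not in CATEGORIES:
        raise ValueError("unsupported ExplainCategory value: %r" % (category,))
    by_category = "@attr 1=1 %s" % name  # use attribute 1: ExplainCategory
    if database is None:
        return "@attrset exp1 %s" % by_category
    # Use attribute 3 (DatabaseName) narrows the result to a single database.
    return "@attrset exp1 @and %s @attr 1=3 %s" % (by_category, database)


print(explain_query("databaseinfo", "Default"))
print(explain_query("AttributeDetails", "Default"))
```

The queries are meant to be sent to the IR-Explain-1 database.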
743 <sect2 id="querymodel-bib1">
744 <title>Bib1 Attribute Set</title>
746 Most of the information contained in this section is an excerpt of
747 the <literal>ATTRIBUTE SET BIB-1 (Z39.50-1995)
found at <ulink url="&url.z39.50.attset.bib1.1995;">The BIB-1
Attribute Set Semantics</ulink> from 1995, also in an updated
751 <ulink url="&url.z39.50.attset.bib1;">Bib-1
752 Attribute Set</ulink>
753 version from 2003. Index Data is not the copyright holder of this
754 information, except for the configuration details, the listing of
755 Zebra's capabilities, and the example queries.
759 <sect3 id="querymodel-bib1-use">
760 <title>Use Attributes (type 1)</title>
763 A use attribute specifies an access point for any atomic query.
These access points are highly dependent on the attribute set used
765 in the query, and are user configurable using the following
766 default configuration files:
767 <filename>tab/bib1.att</filename>,
768 <filename>tab/dan1.att</filename>,
769 <filename>tab/explain.att</filename>, and
770 <filename>tab/gils.att</filename>.
771 New attribute sets can be added by adding new
772 <filename>tab/*.att</filename> configuration files, which need to
773 be sourced in the main configuration <filename>zebra.cfg</filename>.
In addition, Zebra allows the access of
778 <emphasis>internal index names</emphasis> and <emphasis>dynamic
779 XPath</emphasis> as use attributes; see
780 <xref linkend="querymodel-use-string"/> and
781 <xref linkend="querymodel-use-xpath"/>.
785 Phrase search for <emphasis>information retrieval</emphasis> in
786 the title-register, scanning the same register afterwards:
788 Z> find @attr 1=4 "information retrieval"
789 Z> scan @attr 1=4 information
797 <sect2 id="querymodel-bib1-nonuse">
798 <title>Zebra general Bib1 Non-Use Attributes (type 2-6)</title>
800 <sect3 id="querymodel-bib1-relation">
801 <title>Relation Attributes (type 2)</title>
804 Relation attributes describe the relationship of the access
806 of the relation) to the search term as qualified by the attributes (right
side of the relation), e.g., Date-publication &lt;= 1975.
810 <table id="querymodel-bib1-relation-table"
811 frame="all" rowsep="1" colsep="1" align="center">
813 <caption>Relation Attributes (type 2)</caption>
828 <td>Less than or equal</td>
838 <td>Greater or equal</td>
843 <td>Greater than</td>
868 <td>AlwaysMatches</td>
876 The relation attribute
877 <literal>relevance (102)</literal> is supported, see
878 <xref linkend="administration-ranking"/> for full information.
879 <!-- always-matches (103) not supported for all indexes -->
883 All ordering operations are based on a lexicographical ordering,
<emphasis>except</emphasis> when the
885 <literal>structure attribute numeric (109)</literal> is used. In
886 this case, ordering is numerical. See
887 <xref linkend="querymodel-bib1-structure"/>.
891 Ranked search for <emphasis>information retrieval</emphasis> in
894 Z> find @attr 1=4 @attr 2=102 "information retrieval"
899 <sect3 id="querymodel-bib1-position">
900 <title>Position Attributes (type 3)</title>
903 The position attribute specifies the location of the search term
904 within the field or subfield in which it appears.
907 <table id="querymodel-bib1-position-table"
908 frame="all" rowsep="1" colsep="1" align="center">
910 <caption>Position Attributes (type 3)</caption>
920 <td>First in field </td>
925 <td>First in subfield</td>
930 <td>Any position in field</td>
The position attribute values <literal>first in field (1)</literal>
and <literal>first in subfield (2)</literal> are unsupported.
Using them does not trigger an error, but silently defaults to
<literal>any position in field (3)</literal>.
946 <sect3 id="querymodel-bib1-structure">
947 <title>Structure Attributes (type 4)</title>
950 The structure attribute specifies the type of search
951 term. This causes the search to be mapped on
952 different Zebra internal indexes, which must have been defined
957 The possible values of the
958 <literal>structure attribute (type 4)</literal> can be defined
using the configuration file
<filename>tab/default.idx</filename>.
The default configuration is summarized in this table.
964 <table id="querymodel-bib1-structure-table"
965 frame="all" rowsep="1" colsep="1" align="center">
967 <caption>Structure Attributes (type 4)</caption>
997 <td>Date (normalized)</td>
1007 <td>Date (un-normalized)</td>
1009 <td>unsupported</td>
1012 <td>Name (normalized) </td>
1014 <td>unsupported</td>
1017 <td>Name (un-normalized) </td>
1019 <td>unsupported</td>
1024 <td>unsupported</td>
1032 <td>Free-form-text</td>
1037 <td>Document-text</td>
1042 <td>Local-number</td>
1049 <td>unsupported</td>
1052 <td>Numeric string</td>
The structure attribute value
<literal>Word list (6)</literal>
is supported, and maps to the boolean <literal>AND</literal>
combination of words supplied. The word list is useful when
Google-like bag-of-words queries need to be translated from a GUI
query language to PQF. For example, the following queries
1069 Z> find @attr 1=Title @attr 4=6 "mozart amadeus"
1070 Z> find @attr 1=Title @and mozart amadeus
1075 The structure attribute value
1076 <literal>Free-form-text (105)</literal> and
1077 <literal>Document-text (106)</literal>
are supported, and both map to the boolean <literal>OR</literal>
1079 combination of words supplied. The following queries
1082 Z> find @attr 1=Body-of-text @attr 4=105 "bach salieri teleman"
1083 Z> find @attr 1=Body-of-text @attr 4=106 "bach salieri teleman"
1084 Z> find @attr 1=Body-of-text @or bach @or salieri teleman
This <literal>OR</literal> list of terms is very useful in
1087 combination with relevance ranking:
1089 Z> find @attr 1=Body-of-text @attr 2=102 @attr 4=105 "bach salieri teleman"
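The equivalences stated above for the structure values 6, 105 and 106 can be reproduced with a short Python sketch. It only rewrites query strings and says nothing about how Zebra evaluates them internally; the function name expand_term_list is hypothetical.

```python
# Structure values and the boolean operator they correspond to, per this section.
OPERATOR_FOR_STRUCTURE = {6: "@and", 105: "@or", 106: "@or"}


def expand_term_list(use, structure, terms):
    """Rewrite a quoted term list into the equivalent explicit boolean query."""
    words = terms.split()
    if not words:
        raise ValueError("term list must contain at least one word")
    operator = OPERATOR_FOR_STRUCTURE[structure]
    parts = []
    # Every word except the last is preceded by one operator (prefix notation).
    for word in words[:-1]:
        parts.extend([operator, word])
    parts.append(words[-1])
    return "@attr 1=%s %s" % (use, " ".join(parts))


print(expand_term_list("Title", 6, "mozart amadeus"))
print(expand_term_list("Body-of-text", 105, "bach salieri teleman"))
```

A GUI front end could use such a rewrite when it needs the explicit boolean form, for example to combine the words with further operators.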
1094 The structure attribute value
1095 <literal>Local number (107)</literal>
is supported, and always maps to the Zebra internal document ID,
irrespective of which use attribute is specified. The following queries
1098 have exactly the same unique record in the hit set:
1100 Z> find @attr 4=107 10
1101 Z> find @attr 1=4 @attr 4=107 10
1102 Z> find @attr 1=1010 @attr 4=107 10
1108 the GILS schema (<literal>gils.abs</literal>), the
1109 west-bounding-coordinate is indexed as type <literal>n</literal>,
1110 and is therefore searched by specifying
1111 <emphasis>structure</emphasis>=<emphasis>Numeric String</emphasis>.
1112 To match all those records with west-bounding-coordinate greater
1113 than -114 we use the following query:
1115 Z> find @attr 4=109 @attr 2=5 @attr gils 1=2038 -114
1119 The exact mapping between PQF queries and Zebra internal indexes
1120 and index types is explained in
1121 <xref linkend="querymodel-bib1-mapping"/>.
1126 <sect3 id="querymodel-bib1-truncation">
1127 <title>Truncation Attributes (type = 5)</title>
1130 The truncation attribute specifies whether variations of one or
more characters are allowed between search term and hit terms, or
1132 not. Using non-default truncation attributes will broaden the
1133 document hit set of a search query.
1136 <table id="querymodel-bib1-truncation-table"
1137 frame="all" rowsep="1" colsep="1" align="center">
1139 <caption>Truncation Attributes (type 5)</caption>
1149 <td>Right truncation </td>
1154 <td>Left truncation</td>
1159 <td>Left and right truncation</td>
1164 <td>Do not truncate</td>
1169 <td>Process # in search term</td>
The truncation attribute values 1-3 behave as expected:
1189 Z> scan @attr 1=Body-of-text schnittke
1195 Z> find @attr 1=Body-of-text @attr 5=1 schnittke
1197 Number of hits: 95, setno 7
1199 Z> find @attr 1=Body-of-text @attr 5=2 schnittke
1201 Number of hits: 81, setno 6
1203 Z> find @attr 1=Body-of-text @attr 5=3 schnittke
1205 Number of hits: 95, setno 8
1210 The truncation attribute value
1211 <literal>Process # in search term (101)</literal> is a
1212 poor-man's regular expression search. It maps
1213 each <literal>#</literal> to <literal>.*</literal>, and
then performs a <literal>Regexp-1 (102)</literal> regular
1215 expression search. The following two queries are equivalent:
1217 Z> find @attr 1=Body-of-text @attr 5=101 schnit#ke
1218 Z> find @attr 1=Body-of-text @attr 5=102 schnit.*ke
1220 Number of hits: 89, setno 10
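The mapping from truncation value 101 to 102 described above is a plain text substitution, sketched here in Python. The function name is hypothetical, and the sketch assumes the term contains no other regular expression metacharacters, since nothing is escaped.

```python
def rewrite_truncation_101(use, term):
    """Turn a 'Process # in search term' query into the Regexp-1 equivalent."""
    # Each '#' stands for any run of characters, written '.*' as a regexp.
    pattern = term.replace("#", ".*")
    return "@attr 1=%s @attr 5=102 %s" % (use, pattern)


print(rewrite_truncation_101("Body-of-text", "schnit#ke"))
```

The printed query matches the second of the two equivalent queries shown above.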
1225 The truncation attribute value
<literal>Regexp-1 (102)</literal> is a normal regular expression search,
1227 see <xref linkend="querymodel-regular"/> for details.
1229 Z> find @attr 1=Body-of-text @attr 5=102 schnit+ke
1230 Z> find @attr 1=Body-of-text @attr 5=102 schni[a-t]+ke
1235 The truncation attribute value
<literal>Regexp-2 (103)</literal> is a Zebra specific extension
1237 which allows <emphasis>fuzzy</emphasis> matches. One single
1238 error in spelling of search terms is allowed, i.e., a document
1239 is hit if it includes a term which can be mapped to the used
1240 search term by one character substitution, addition, deletion or
1243 Z> find @attr 1=Body-of-text @attr 5=100 schnittke
1245 Number of hits: 81, setno 14
1247 Z> find @attr 1=Body-of-text @attr 5=103 schnittke
1249 Number of hits: 103, setno 15
1255 <sect3 id="querymodel-bib1-completeness">
1256 <title>Completeness Attributes (type = 6)</title>
The <literal>Completeness Attributes (type = 6)</literal>
are used to specify that a given search term or term list is either
1262 part of the terms of a given index/field
1263 (<literal>Incomplete subfield (1)</literal>), or is
1264 what literally is found in the entire field's index
1265 (<literal>Complete field (3)</literal>).
1268 <table id="querymodel-bib1-completeness-table"
1269 frame="all" rowsep="1" colsep="1" align="center">
1270 <caption>Completeness Attributes (type = 6)</caption>
1273 <td>Completeness</td>
1280 <td>Incomplete subfield</td>
1285 <td>Complete subfield</td>
<td>deprecated</td>
1290 <td>Complete field</td>
The <literal>Completeness Attributes (type = 6)</literal>
are only partially and conditionally
supported in the sense that they are ignored if the hit index is
1301 not of structure <literal>type="w"</literal> or
1302 <literal>type="p"</literal>.
1305 <literal>Incomplete subfield (1)</literal> is the default, and
1307 register <literal>type="w"</literal>, whereas
1308 <literal>Complete field (3)</literal> triggers
1309 search and scan in index <literal>type="p"</literal>.
The <literal>Complete subfield (2)</literal> is a remnant
of the <literal>MARC</literal>
binary format days. Zebra does not support it, but silently maps it
to <literal>Complete field (3)</literal>.
1319 The exact mapping between PQF queries and Zebra internal indexes
1320 and index types is explained in
1321 <xref linkend="querymodel-bib1-mapping"/>.
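As a compact summary of the completeness rules above, the following Python sketch maps each completeness value to the index type that is searched. The table and function names are hypothetical; the sketch covers only the case where the completeness attribute is honored, that is, indexes of type w or p.

```python
# Completeness value -> index type, following the description in this section.
REGISTER_FOR_COMPLETENESS = {
    1: "w",  # Incomplete subfield (the default)
    2: "p",  # Complete subfield, silently treated as Complete field
    3: "p",  # Complete field
}


def register_type(completeness=1):
    """Return the index type letter searched for a completeness value."""
    if completeness not in REGISTER_FOR_COMPLETENESS:
        raise ValueError("unknown completeness value: %r" % (completeness,))
    return REGISTER_FOR_COMPLETENESS[completeness]


print(register_type(), register_type(2), register_type(3))
```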
1329 <sect1 id="querymodel-zebra">
1330 <title>Advanced Zebra PQF Features</title>
The Zebra internal query engine has been extended to address specific needs
not covered by the <literal>bib-1</literal> attribute set query
model. These extensions are <emphasis>non-standard</emphasis>
and <emphasis>non-portable</emphasis>: most functional extensions
are modeled over the <literal>bib-1</literal> attribute set,
defining type 7-9 attributes.
There are also the special
<literal>string</literal> type index names for the
<literal>idxpath</literal> attribute set.
1344 <sect2 id="querymodel-zebra-attr-search">
1345 <title>Zebra specific Search Extensions to all Attribute Sets</title>
1347 Zebra extends the Bib1 attribute types, and these extensions are
1348 recognized regardless of attribute
1349 set used in a <literal>search</literal> operation query.
1352 <table id="querymodel-zebra-attr-search-table"
1353 frame="all" rowsep="1" colsep="1" align="center">
1355 <caption>Zebra Search Attribute Extensions</caption>
1361 <td>Zebra version</td>
1366 <td>Embedded Sort</td>
1378 <td>Rank Weight</td>
1384 <td>Approx Limit</td>
1390 <td>Term Reference</td>
1398 <sect3 id="querymodel-zebra-attr-sorting">
1399 <title>Zebra Extension Embedded Sort Attribute (type 7)</title>
1402 The embedded sort is a way to specify sort within a query - thus
1403 removing the need to send a Sort Request separately. It is both
1404 faster and does not require clients to deal with the Sort
1408 The possible values of attribute <literal>type 7</literal> are
1409 <literal>1</literal> (ascending) and
1410 <literal>2</literal> (descending).
1411 The attributes+term (APT) node is separate from the
1412 rest and must be <literal>@or</literal>'ed.
1413 The term associated with APT is the sorting level in integers,
1414 where <literal>0</literal> means primary sort,
1415 <literal>1</literal> means secondary sort, and so forth.
1416 See also <xref linkend="administration-ranking"/>.
1419 For example, searching for water, sorted by title (ascending):
1421 Z> find @or @attr 1=1016 water @attr 7=1 @attr 1=4 0
1425 Or, searching for water, sorted by title ascending, then by date descending:
1427 Z> find @or @or @attr 1=1016 water @attr 7=1 @attr 1=4 0 @attr 7=2 @attr 1=30 1
1431 <sect3 id="querymodel-zebra-attr-estimation">
1432 <title>Zebra Extension Term Set Attribute (type 8)</title>
1435 The Term Set feature is a facility that allows a search to store
1436 hitting terms in a "pseudo" result set; it thus combines a search (as usual)
1437 with a scan-like facility. It requires a client that can handle named result
1438 sets, since the search generates two result sets. The value for
1439 attribute 8 is the name of a result set (string). The terms in
1440 the named term set are returned as SUTRS records.
1443 For example, searching for u in title, right truncated, and
1444 storing the result in a term set named 'aset':
1446 Z> find @attr 5=1 @attr 1=4 @attr 8=aset u
1450 The model has one serious flaw: the size of the term set is
1451 not known. Experimental. Do not use in production code.
1454 <sect3 id="querymodel-zebra-attr-weight">
1455 <title>Zebra Extension Rank Weight Attribute (type 9)</title>
1458 Rank weight is a way to pass a value to a ranking algorithm, so
1459 that one APT has one value while another has a different one.
1460 See also <xref linkend="administration-ranking"/>.
1463 For example, searching for utah in title with weight 30 as well
1464 as any with weight 20:
1466 Z> find @attr 2=102 @or @attr 9=30 @attr 1=4 utah @attr 9=20 utah
1470 <sect3 id="querymodel-zebra-attr-limit">
1471 <title>Zebra Extension Approximative Limit Attribute (type 9)</title>
1474 Newer Zebra versions normally estimate the hit count for every APT
1475 (leaf) in the query tree. These hit counts are returned as part of
1476 the searchResult-1 facility in the binary encoded Z39.50 search
1480 By setting a limit for the APT we can make Zebra switch to
1481 approximate hit counts when a certain hit count limit is
1482 reached. A value of zero means exact hit count.
1485 For example, we might be interested in the exact hit count for a, but
1486 for b we allow hit count estimates for 1000 and higher.
1488 Z> find @and a @attr 9=1000 b
1492 The estimated hit count facility makes searches faster, as one
1493 only needs to process large hit lists partially.
1496 This facility clashes with rank weight, because there all
1497 documents in the hit lists need to be examined for scoring and
1499 It is an experimental
1500 extension. Do not use in production code.
1503 <sect3 id="querymodel-zebra-attr-termref">
1504 <title>Zebra Extension Term Reference Attribute (type 10)</title>
1507 Zebra supports the <literal>searchResult-1</literal> facility.
1508 If the <literal>Term Reference Attribute (type 10)</literal> is
1509 given, it specifies a subqueryId value returned as part of the
1510 search result. It is a way for a client to name an APT part of a
1520 Experimental. Do not use in production code.
1527 <sect2 id="querymodel-zebra-attr-scan">
1528 <title>Zebra specific Scan Extensions to all Attribute Sets</title>
1530 Zebra extends the Bib1 attribute types, and these extensions are
1531 recognized regardless of attribute
1532 set used in a <literal>scan</literal> operation query.
1534 <table id="querymodel-zebra-attr-scan-table"
1535 frame="all" rowsep="1" colsep="1" align="center">
1537 <caption>Zebra Scan Attribute Extensions</caption>
1543 <td>Zebra version</td>
1548 <td>Result Set Narrow</td>
1554 <td>Approximative Limit</td>
1562 <sect3 id="querymodel-zebra-attr-narrow">
1563 <title>Zebra Extension Result Set Narrow (type 8)</title>
1566 If attribute <literal>Result Set Narrow (type 8)</literal>
1567 is given for <literal>scan</literal>, the value is the name of a
1568 result set. Each hit count in <literal>scan</literal> is
1569 <literal>@and</literal>'ed with the result set given.
1572 Consider for example
1573 the case of scanning all title fields around the
1574 scanterm <emphasis>mozart</emphasis>, then refining the scan by
1575 issuing a filtering query for <emphasis>amadeus</emphasis> to
1576 restrict the scan to the result set of the query:
1578 Z> scan @attr 1=4 mozart
1581 mozartforskningen (1)
1585 Z> f @attr 1=4 amadeus
1587 Number of hits: 15, setno 2
1589 Z> scan @attr 1=4 @attr 8=2 mozart
1592 mozartforskningen (0)
1600 Experimental. Do not use in production code.
1603 <sect3 id="querymodel-zebra-attr-approx">
1604 <title>Zebra Extension Approximative Limit (type 9)</title>
1607 The <literal>Zebra Extension Approximative Limit (type
1608 9)</literal> is a way to enable approximate
1609 hit counts for <literal>scan</literal>, in the same
1610 way as for <literal>search</literal> hit counts.
1619 Experimental and buggy. Definitely not to be used in production code.
1626 <sect2 id="querymodel-idxpath">
1627 <title>Zebra special IDXPATH Attribute Set for GRS indexing</title>
1629 The attribute-set <literal>idxpath</literal> consists of a single
1630 <literal>Use (type 1)</literal> attribute. All non-use attributes
1634 This feature is enabled by defining the
1635 <literal>xpath enable</literal> option in the GRS filter
1636 <literal>*.abs</literal> configuration files. If one wants to use
1637 the special <literal>idxpath</literal> numeric attribute set, the
1638 main Zebra configuration file <filename>zebra.cfg</filename>
1639 directive <literal>attset: idxpath.att</literal> must be enabled.
1641 <warning>The <literal>idxpath</literal> is deprecated, may not be
1642 supported in future Zebra versions, and should definitely
1643 not be used in production code.
1646 <sect3 id="querymodel-idxpath-use">
1647 <title>IDXPATH Use Attributes (type = 1)</title>
1649 This attribute set allows one to search GRS filter indexed
1650 records by XPATH-like structured index names.
1653 <warning>The <literal>idxpath</literal> option defines hard-coded
1654 index names, which might clash with your own index names.
1657 <table id="querymodel-idxpath-use-table"
1658 frame="all" rowsep="1" colsep="1" align="center">
1660 <caption>Zebra specific IDXPATH Use Attributes (type 1)</caption>
1665 <td>String Index</td>
1671 <td>XPATH Begin</td>
1673 <td>_XPATH_BEGIN</td>
1674 <td>deprecated</td>
1680 <td>deprecated</td>
1683 <td>XPATH CData</td>
1685 <td>_XPATH_CDATA</td>
1686 <td>deprecated</td>
1689 <td>XPATH Attribute Name</td>
1691 <td>_XPATH_ATTR_NAME</td>
1692 <td>deprecated</td>
1695 <td>XPATH Attribute CData</td>
1697 <td>_XPATH_ATTR_CDATA</td>
1698 <td>deprecated</td>
1705 See <filename>tab/idxpath.att</filename> for more information.
1708 Search for all documents starting with root element
1709 <literal>/root</literal> (either using the numeric or the string
1712 Z> find @attrset idxpath @attr 1=1 @attr 4=3 root/
1713 Z> find @attr idxpath 1=1 @attr 4=3 root/
1714 Z> find @attr 1=_XPATH_BEGIN @attr 4=3 root/
1718 Search for all documents where specific nested XPATH
1719 <literal>/c1/c2/../cn</literal> exists. Notice the very
1720 counter-intuitive <emphasis>reverse</emphasis> notation!
1722 Z> find @attrset idxpath @attr 1=1 @attr 4=3 cn/cn-1/../c1/
1723 Z> find @attr 1=_XPATH_BEGIN @attr 4=3 cn/cn-1/../c1/
1727 Search for CDATA string <emphasis>text</emphasis> in any element
1729 Z> find @attrset idxpath @attr 1=1016 text
1730 Z> find @attr 1=_XPATH_CDATA text
1734 Search for CDATA string <emphasis>anothertext</emphasis> in any
1737 Z> find @attrset idxpath @attr 1=1015 anothertext
1738 Z> find @attr 1=_XPATH_ATTR_CDATA anothertext
1742 Search for all documents which have an XML element node
1743 including an XML attribute named <emphasis>creator</emphasis>
1745 Z> find @attrset idxpath @attr 1=3 @attr 4=3 creator
1746 Z> find @attr 1=_XPATH_ATTR_NAME @attr 4=3 creator
1750 Combining usual <literal>bib-1</literal> attribute set searches
1751 with <literal>idxpath</literal> attribute set searches:
1753 Z> find @and @attr idxpath 1=1 @attr 4=3 link/ @attr 1=4 mozart
1754 Z> find @and @attr 1=_XPATH_BEGIN @attr 4=3 link/ @attr 1=_XPATH_CDATA mozart
1758 Scanning is supported on all <literal>idxpath</literal>
1759 indexes, whether specified as numeric use attributes or as string
1762 Z> scan @attrset idxpath @attr 1=1016 text
1763 Z> scan @attr 1=_XPATH_ATTR_CDATA anothertext
1764 Z> scan @attrset idxpath @attr 1=3 @attr 4=3 ''
1772 <sect2 id="querymodel-bib1-mapping">
1773 <title>Mapping from Bib1 Attributes to Zebra internal
1774 register indexes</title>
1780 <!-- see in util/zebramap.c
1783 if (completeness_value == 2 || completeness_value == 3)
1789 *sort_flag =(sort_relation_value > 0) ? 1 : 0;
1790 *search_type = "phrase";
1791 strcpy(rank_type, "void");
1792 if (relation_value == 102)
1794 if (weight_value == -1)
1796 sprintf(rank_type, "rank,w=%d,u=%d", weight_value, use_value);
1798 if (relation_value == 103)
1800 *search_type = "always";
1808 switch (structure_value)
1810 case 6: /* word list */
1811 *search_type = "and-list";
1813 case 105: /* free-form-text */
1814 *search_type = "or-list";
1816 case 106: /* document-text */
1817 *search_type = "or-list";
1820 case 1: /* phrase */
1822 case 108: /* string */
1823 *search_type = "phrase";
1825 case 107: /* local-number */
1826 *search_type = "local";
1829 case 109: /* numeric string */
1831 *search_type = "numeric";
1835 *search_type = "phrase";
1839 *search_type = "phrase";
1843 *search_type = "phrase";
1847 *search_type = "phrase";
1858 <emphasis>Use</emphasis> attributes are interpreted according to the
1859 attribute sets which have been loaded in the
1860 <literal>zebra.cfg</literal> file, and are matched against specific
1861 fields as specified in the <literal>.abs</literal> file which
1862 describes the profile of the records which have been loaded.
1863 If no Use attribute is provided, a default of Bib-1 Any is assumed.
1867 If a <emphasis>Structure</emphasis> attribute of
1868 <emphasis>Phrase</emphasis> is used in conjunction with a
1869 <emphasis>Completeness</emphasis> attribute of
1870 <emphasis>Complete (Sub)field</emphasis>, the term is matched
1871 against the contents of the phrase (long word) register, if one
1872 exists for the given <emphasis>Use</emphasis> attribute.
1873 A phrase register is created for those fields in the
1874 <literal>.abs</literal> file that contain a
1875 <literal>p</literal>-specifier.
1876 <!-- ### whatever the hell _that_ is -->
1880 If <emphasis>Structure</emphasis>=<emphasis>Phrase</emphasis> is
1881 used in conjunction with <emphasis>Incomplete Field</emphasis> (the
1882 default value for <emphasis>Completeness</emphasis>), the
1883 search is directed against the normal word registers, but if the term
1884 contains multiple words, the term will only match if all of the words
1885 are found immediately adjacent, and in the given order.
1886 The word search is performed on those fields that are indexed as
1887 type <literal>w</literal> in the <literal>.abs</literal> file.
1891 If the <emphasis>Structure</emphasis> attribute is
1892 <emphasis>Word List</emphasis>,
1893 <emphasis>Free-form Text</emphasis>, or
1894 <emphasis>Document Text</emphasis>, the term is treated as a
1895 natural-language, relevance-ranked query.
1896 This search type uses the word register, i.e. those fields
1897 that are indexed as type <literal>w</literal> in the
1898 <literal>.abs</literal> file.
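<para>
 A minimal sketch of such queries, using the structure values
 <literal>6</literal> (Word List) and <literal>105</literal>
 (Free-form Text). It assumes that the Any index
 (<literal>@attr 1=1016</literal>) is indexed as type
 <literal>w</literal>; the search terms are made up, and no hit counts
 are shown because they depend on the indexed data.
</para>
<screen>
 Z> find @attr 1=1016 @attr 4=6 "plant soil"
 Z> find @attr 1=1016 @attr 4=105 "plant soil"
</screen>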
1902 If the <emphasis>Structure</emphasis> attribute is
1903 <emphasis>Numeric String</emphasis> the term is treated as an integer.
1904 The search is performed on those fields that are indexed
1905 as type <literal>n</literal> in the <literal>.abs</literal> file.
1909 If the <emphasis>Structure</emphasis> attribute is
1910 <emphasis>URx</emphasis> the term is treated as a URX (URL) entity.
1911 The search is performed on those fields that are indexed as type
1912 <literal>u</literal> in the <literal>.abs</literal> file.
1916 If the <emphasis>Structure</emphasis> attribute is
1917 <emphasis>Local Number</emphasis> the term is treated as a
1918 native Zebra Record Identifier.
1922 If the <emphasis>Relation</emphasis> attribute is
1923 <emphasis>Equals</emphasis> (default), the term is matched
1924 in a normal fashion (modulo truncation and processing of
1925 individual words, if required).
1926 If <emphasis>Relation</emphasis> is <emphasis>Less Than</emphasis>,
1927 <emphasis>Less Than or Equal</emphasis>,
1928 <emphasis>Greater than</emphasis>, or <emphasis>Greater than or
1929 Equal</emphasis>, the term is assumed to be numerical, and a
1930 standard regular expression is constructed to match the given
1932 If <emphasis>Relation</emphasis> is <emphasis>Relevance</emphasis>,
1933 the standard natural-language query processor is invoked.
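<para>
 The relation values can be sketched as follows. The first query
 assumes a date index (<literal>@attr 1=30</literal>) that is indexed as
 type <literal>n</literal>, and asks for values greater than or equal
 to a made-up year using the Bib-1 relation
 <literal>4</literal>; the second requests relevance ranking with
 relation <literal>102</literal>. Both are illustrations only, and
 results depend on the indexed data.
</para>
<screen>
 Z> find @attr 1=30 @attr 2=4 @attr 4=109 1990
 Z> find @attr 1=1016 @attr 2=102 "plant soil"
</screen>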
1937 For the <emphasis>Truncation</emphasis> attribute,
1938 <emphasis>No Truncation</emphasis> is the default.
1939 <emphasis>Left Truncation</emphasis> is not supported.
1940 <emphasis>Process # in search term</emphasis> is supported, as is
1941 <emphasis>Regxp-1</emphasis>.
1942 <emphasis>Regxp-2</emphasis> enables the fault-tolerant (fuzzy)
1943 search. As a default, a single error (deletion, insertion,
1944 replacement) is accepted when terms are matched against the register
1949 <sect2 id="querymodel-regular">
1950 <title>Zebra Regular Expressions in Truncation Attribute (type = 5)</title>
1953 Each term in a query is interpreted as a regular expression if
1954 the truncation value is either <emphasis>Regxp-1 (@attr 5=102)</emphasis>
1955 or <emphasis>Regxp-2 (@attr 5=103)</emphasis>.
1956 Both query types follow the same syntax with the operands:
1959 <table id="querymodel-regular-operands-table"
1960 frame="all" rowsep="1" colsep="1" align="center">
1962 <caption>Regular Expression Operands</caption>
1965 <tr><td>Operand</td><td>Description</td></tr>
1970 <td><literal>x</literal></td>
1971 <td>Matches the character <literal>x</literal>.</td>
1974 <td><literal>.</literal></td>
1975 <td>Matches any character.</td>
1978 <td><literal>[ .. ]</literal></td>
1979 <td>Matches the set of characters specified;
1980 such as <literal>[abc]</literal> or <literal>[a-c]</literal>.</td>
1986 The above operands can be combined with the following operators:
1989 <table id="querymodel-regular-operators-table"
1990 frame="all" rowsep="1" colsep="1" align="center">
1991 <caption>Regular Expression Operators</caption>
1994 <tr><td>Operator</td><td>Description</td></tr>
1999 <td><literal>x*</literal></td>
2000 <td>Matches <literal>x</literal> zero or more times.
2001 Priority: high.</td>
2004 <td><literal>x+</literal></td>
2005 <td>Matches <literal>x</literal> one or more times.
2006 Priority: high.</td>
2009 <td><literal>x?</literal></td>
2010 <td>Matches <literal>x</literal> zero or one time.
2011 Priority: high.</td>
2014 <td><literal>xy</literal></td>
2015 <td> Matches <literal>x</literal>, then <literal>y</literal>.
2016 Priority: medium.</td>
2019 <td><literal>x|y</literal></td>
2020 <td> Matches either <literal>x</literal> or <literal>y</literal>.
2024 <td><literal>( )</literal></td>
2025 <td>The order of evaluation may be changed by using parentheses.</td>
2031 If the first character of the <literal>Regxp-2</literal> query
2032 is a plus character (<literal>+</literal>) it marks the
2033 beginning of a section with non-standard specifiers.
2034 The next plus character marks the end of the section.
2035 Currently Zebra only supports one specifier, the error tolerance,
2036 which consists of one digit.
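<para>
 Following this description, a <literal>Regxp-2</literal> term
 prefixed with <literal>+2+</literal> would request an error tolerance
 of two instead of the default of one. The query below is a sketch
 derived from the description only; it reuses the index and term from
 the truncation examples earlier in this chapter and has not been
 checked against a running server.
</para>
<screen>
 Z> find @attr 1=Body-of-text @attr 5=103 "+2+schnittke"
</screen>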
2040 Since the plus operator is normally a suffix operator the addition to
2041 the query syntax doesn't violate the syntax for standard regular
2046 For example, a phrase search with regular expressions in
2047 the title-register is performed like this:
2049 Z> find @attr 1=4 @attr 5=102 "informat.* retrieval"
2054 Combinations with other attributes are possible. For example, a
2055 ranked search with a regular expression:
2057 Z> find @attr 1=4 @attr 5=102 @attr 2=102 "informat.* retrieval"
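<para>
 A few further sketches show how the operands and operators combine.
 The terms are made up for illustration. The first query uses the
 optional operator, the second uses parentheses so that the
 alternation applies only to the first part of the word, and the third
 uses a character set. Without the parentheses, and assuming that
 alternation binds less tightly than concatenation as in standard
 regular expressions, <literal>data|knowledgebase</literal> would
 instead match either <literal>data</literal> or
 <literal>knowledgebase</literal>.
</para>
<screen>
 Z> find @attr 1=4 @attr 5=102 "colou?r"
 Z> find @attr 1=4 @attr 5=102 "(data|knowledge)base"
 Z> find @attr 1=4 @attr 5=102 "[bc]ook"
</screen>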
2065 The RecordType parameter in the <literal>zebra.cfg</literal> file, or
2066 the <literal>-t</literal> option to the indexer tells Zebra how to
2067 process input records.
2068 Two basic types of processing are available - raw text and structured
2069 data. Raw text is just that, and it is selected by providing the
2070 argument <literal>text</literal> to Zebra. Structured records are
2071 all handled internally using the basic mechanisms described in the
2072 subsequent sections.
2073 Zebra can read structured records in many different formats.
2079 <sect1 id="querymodel-cql-to-pqf">
2080 <title>Server Side CQL to PQF Query Translation</title>
2083 <literal>&lt;cql2rpn&gt;l2rpn.txt&lt;/cql2rpn&gt;</literal>
2084 YAZ Frontend Virtual
2085 Hosts option, one can configure
2086 the YAZ Frontend CQL-to-PQF
2087 converter, specifying the interpretation of various
2088 <ulink url="&url.cql;">CQL</ulink>
2089 indexes, relations, etc. in terms of Type-1 query attributes.
2090 <!-- The yaz-client config file -->
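<para>
 As a rough sketch, the option might appear in a YAZ Frontend Virtual
 Hosts configuration file as shown below. The file names
 <filename>zebra.cfg</filename> and <filename>cql2pqf.txt</filename>
 are assumptions chosen for illustration, and a real configuration
 will usually contain further elements.
</para>
<screen><![CDATA[
 <yazgfs>
   <server>
     <config>zebra.cfg</config>
     <cql2rpn>cql2pqf.txt</cql2rpn>
   </server>
 </yazgfs>
]]></screen>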
2093 For example, using server-side CQL-to-PQF conversion, one might
2094 query a Zebra server like this:
2097 yaz-client localhost:9999
2099 Z> find text=(plant and soil)
2102 and - if properly configured - even static relevance ranking can
2103 be performed using CQL query syntax:
2106 Z> find text = /relevant (plant and soil)
2112 By the way, the same configuration can be used to
2113 search using client-side CQL-to-PQF conversion:
2114 (the only difference is <literal>querytype cql2rpn</literal>
2116 <literal>querytype cql</literal>, and the call specifying a local
2120 yaz-client -q local/cql2pqf.txt localhost:9999
2121 Z> querytype cql2rpn
2122 Z> find text=(plant and soil)
2128 Exhaustive information can be found in the
2129 section "Specification of CQL to RPN mappings" in the YAZ manual,
2130 <ulink url="http://www.indexdata.dk/yaz/doc/tools.tkl#tools.cql.map">
2131 http://www.indexdata.dk/yaz/doc/tools.tkl#tools.cql.map</ulink>,
2132 and shall therefore not be repeated here.
2137 <ulink url="http://www.loc.gov/z3950/agency/zing/cql/dc-indexes.html">
2138 http://www.loc.gov/z3950/agency/zing/cql/dc-indexes.html</ulink>
2139 for the Maintenance Agency's work-in-progress mapping of Dublin Core
2140 indexes to Attribute Architecture (util, XD and BIB-2)
2150 <!-- Keep this comment at the end of the file
2155 sgml-minimize-attributes:nil
2156 sgml-always-quote-attributes:t
2159 sgml-parent-document: "zebra.xml"
2160 sgml-local-catalogs: nil
2161 sgml-namecase-general:t