- <section id="record-model-domxml-internal">
- <title>&dom; filter internal record representation</title>
- <para>When indexing, an &xml; Reader is invoked to split the input
- files into suitable record &xml; pieces. Each record piece is then
- transformed to an &xml; &dom; structure, which is essentially the
- record model. Only &xslt; transformations can be applied during
- index, search and retrieval. Consequently, output formats are
- restricted to whatever &xslt; can deliver from the record &xml;
- structure, be it other &xml; formats, HTML, or plain text. In case
- you have <literal>libxslt1</literal> running with E&xslt; support,
- you can use this functionality inside the &dom;
- filter configuration &xslt; stylesheets.
+ <section id="record-model-domxml-canonical-index">
+ <title>Canonical Indexing Format</title>
+
+ <para>
+ &dom; &xml; indexing comes in two flavors: plain &xml; documents
+ governed purely by processing instructions, and, much like the
+ Alvis filter indexing format, &xml; documents
+ containing <literal>&lt;record&gt;</literal> and
+ <literal>&lt;index&gt;</literal> instruction elements from the magic
+ namespace <literal>xmlns:z="http://indexdata.dk/zebra-2.0"</literal>.
+ </para>
+
+ <section id="record-model-domxml-canonical-index-pi">
+ <title>Processing-instruction governed indexing format</title>
+
+ <para>The output of a processing-instruction driven
+ indexing &xslt; stylesheet must contain
+ processing instructions named
+ <literal>zebra-2.0</literal>.
+ This output is then
+ parsed using &dom; methods, and each instruction is
+ applied to the <emphasis>element, including its
+ subtree, that directly follows the processing instruction</emphasis>.
+ </para>
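+ <para>
+ As a minimal sketch of how a stylesheet can emit such instructions,
+ consider the following hypothetical example. It is not the
+ <literal>dom-index-pi.xsl</literal> file used in the command further
+ down. It assumes MARCXML input in the
+ <literal>http://www.loc.gov/MARC21/slim</literal> namespace, and the
+ choice of fields (001 and 245$a) is an assumption made for
+ illustration only.
+ <screen>
+ <![CDATA[
+ <?xml version="1.0" encoding="UTF-8"?>
+ <!-- Hypothetical sketch, not the distributed dom-index-pi.xsl. -->
+ <xsl:stylesheet version="1.0"
+     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
+     xmlns:m="http://www.loc.gov/MARC21/slim"
+     exclude-result-prefixes="m">
+
+   <xsl:output method="xml" indent="yes" encoding="UTF-8"/>
+
+   <!-- Suppress text nodes that no template handles explicitly. -->
+   <xsl:template match="text()"/>
+
+   <!-- Assumption: each input record is a MARCXML m:record element. -->
+   <xsl:template match="m:record">
+     <!-- Record-level instruction, emitted just before the record element. -->
+     <xsl:processing-instruction name="zebra-2.0">
+       <xsl:text>record id=</xsl:text>
+       <xsl:value-of select="normalize-space(m:controlfield[@tag='001'])"/>
+     </xsl:processing-instruction>
+     <record>
+       <!-- Index instruction applies to the element that follows it. -->
+       <xsl:processing-instruction name="zebra-2.0">index control:w</xsl:processing-instruction>
+       <control>
+         <xsl:value-of select="normalize-space(m:controlfield[@tag='001'])"/>
+       </control>
+       <xsl:processing-instruction name="zebra-2.0">index title:w any:w</xsl:processing-instruction>
+       <title>
+         <xsl:value-of select="normalize-space(m:datafield[@tag='245']/m:subfield[@code='a'])"/>
+       </title>
+     </record>
+   </xsl:template>
+
+ </xsl:stylesheet>
+ ]]>
+ </screen>
+ The sketch relies only on the standard &xslt; 1.0
+ <literal>xsl:processing-instruction</literal> element. Placing each
+ instruction immediately before its target element matches the rule
+ that an instruction acts on the element directly following it.
+ </para>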
+ <para>
+ For example, the output of the command
+ <screen>
+ xsltproc dom-index-pi.xsl marc-one.xml
+ </screen>
+ might look like this:
+ <screen>
+ <![CDATA[
+ <?xml version="1.0" encoding="UTF-8"?>
+ <?zebra-2.0 record id=11224466 rank=42?>
+ <record>
+ <?zebra-2.0 index control:w?>
+ <control>11224466</control>
+ <?zebra-2.0 index title:w title:p title:s any:w?>
+ <title>How to program a computer</title>
+ </record>
+ ]]>
+ </screen>