-->
<!NOTATION PDF SYSTEM "PDF">
]>
-<!-- $Id: book.xml,v 1.51 2007-01-18 09:24:47 marc Exp $ -->
+<!-- $Id: book.xml,v 1.57 2007-03-30 11:35:04 marc Exp $ -->
<book id="metaproxy">
<bookinfo>
<title>Metaproxy - User's Guide and Reference</title>
</para>
</section>
+ <section id="installation.rpm">
+ <title>Installation on RPM based Linux Systems</title>
+ <para>
+ All external dependencies for Metaproxy are available as
+ RPM packages, either from your distribution site, or from the
+ <ulink url="http://fr.rpmfind.net/">RPMfind</ulink> site.
+ </para>
+ <para>
+ For example, the required Boost C++ development
+ libraries can be installed on Red Hat Fedora Core 4 and 5 like this:
+ <screen>
+ wget ftp://fr.rpmfind.net/wlinux/fedora/core/updates/testing/4/SRPMS/boost-1.33.0-3.fc4.src.rpm
+ sudo rpmbuild --buildroot src/ --rebuild -p fc4/boost-1.33.0-3.fc4.src.rpm
+ sudo rpm -U /usr/src/redhat/RPMS/i386/boost-*rpm
+ </screen>
+ </para>
+ <para>
+ The <ulink url="&url.yaz;">YAZ</ulink> library is needed to
+ compile &metaproxy;. See the YAZ pages
+ for more information on available RPM packages.
+ </para>
+ <para>
+ There is currently no official RPM package for YAZ++.
+ See the <ulink url="&url.yaz.pp;">YAZ++</ulink> pages
+ for more information on a Unix tarball install.
+ </para>
+ <para>
+ With these packages installed, the usual configure + make
+ procedure can be used for Metaproxy as outlined in
+ <xref linkend="installation.unix"/>.
+ </para>
+ </section>
+
<section id="installation.windows">
<title>Installation on Windows</title>
<para>
</section>
</chapter>
+<chapter id="yazproxy-comparison">
+ <title>YAZ Proxy Comparison</title>
+ <para>
+ The table below lists facilities supported by either
+ <ulink url="&url.yazproxy;">YAZ Proxy</ulink> or Metaproxy.
+ </para>
+<table id="yazproxy-comparison-table">
+ <title>Metaproxy / YAZ Proxy comparison</title>
+ <tgroup cols="3">
+ <thead>
+ <row>
+ <entry>Facility</entry>
+ <entry>Metaproxy</entry>
+ <entry>YAZ Proxy</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>Z39.50 server</entry>
+ <entry>Using filter <literal>frontend_net</literal></entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>SRU server</entry>
+ <entry>Supported with filter <literal>sru_z3950</literal></entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>Z39.50 client</entry>
+ <entry>Supported with filter <literal>z3950_client</literal></entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>SRU client</entry>
+ <entry>Unsupported</entry>
+ <entry>Unsupported</entry>
+ </row>
+ <row>
+ <entry>Connection reuse</entry>
+ <entry>Supported with filter <literal>session_shared</literal></entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>Connection share</entry>
+ <entry>Supported with filter <literal>session_shared</literal></entry>
+ <entry>Unsupported</entry>
+ </row>
+ <row>
+ <entry>Result set reuse</entry>
+ <entry>Supported with filter <literal>session_shared</literal></entry>
+ <entry>Within one Z39.50 session / HTTP keep-alive</entry>
+ </row>
+ <row>
+ <entry>Record cache</entry>
+ <entry>Unsupported</entry>
+ <entry>Supported for last result set within one Z39.50 session / HTTP keep-alive</entry>
+ </row>
+ <row>
+ <entry>Z39.50 Virtual database, i.e. select any Z39.50 target for database</entry>
+ <entry>Supported with filter <literal>virt_db</literal></entry>
+ <entry>Unsupported</entry>
+ </row>
+ <row>
+ <entry>SRU Virtual database, i.e. select any Z39.50 target for path</entry>
+ <entry>Supported with filter <literal>virt_db</literal>,
+ <literal>sru_z3950</literal></entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>Multi target search</entry>
+ <entry>Supported with filter <literal>multi</literal> (round-robin)</entry>
+ <entry>Unsupported</entry>
+ </row>
+ <row>
+ <entry>Retrieval and search limits</entry>
+ <entry>Unsupported</entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>Bandwidth limits</entry>
+ <entry>Unsupported</entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>Connect limits</entry>
+ <entry>Unsupported</entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>Retrieval sanity check and conversions</entry>
+ <entry>Supported using filter <literal>record_transform</literal></entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>Query check</entry>
+ <entry>
+ Supported in a limited way using <literal>query_rewrite</literal>
+ </entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>Query rewrite</entry>
+ <entry>Supported with <literal>query_rewrite</literal></entry>
+ <entry>Unsupported</entry>
+ </row>
+ <row>
+ <entry>Session invalidate for -1 hits</entry>
+ <entry>Unsupported</entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>Architecture</entry>
+ <entry>Multi-threaded + select for networked modules (such as
+ <literal>frontend_net</literal>)</entry>
+ <entry>Single-threaded using select</entry>
+ </row>
+
+ <row>
+ <entry>Extensibility</entry>
+ <entry>Most functionality implemented as loadable modules</entry>
+ <entry>Unsupported and experimental</entry>
+ </row>
+
+ <row>
+ <entry><ulink url="&url.usemarcon;">USEMARCON</ulink></entry>
+ <entry>Unsupported</entry>
+ <entry>Supported</entry>
+ </row>
+
+ <row>
+ <entry>Portability</entry>
+ <entry>
+ Requires YAZ, YAZ++ and modern C++ compiler supporting
+ <ulink url="&url.boost;">Boost</ulink>.
+ </entry>
+ <entry>
+ Requires YAZ and YAZ++.
+ STL is not required so pretty much any C++ compiler out there should work.
+ </entry>
+ </row>
+
+ </tbody>
+ </tgroup>
+</table>
+</chapter>
+
<chapter id="architecture">
<title>The Metaproxy Architecture</title>
<para>
plugins that provide new filters. The filter API is small and
conceptually simple, but there are many details to master. See
the section below on
- <link linkend="extensions">extensions</link>.
+ <link linkend="filters">Filters</link>.
</para>
</listitem>
</varlistentry>
sets Z39.50 packages to Z_Close, and HTTP_Request packages to
HTTP_Response err code 400 packages, and adds a suitable bounce
message.
- The bounce filter is usually added at end of each filter chain
- config.xml to prevent infinite hanging of for example HTTP
+ The bounce filter is usually added at the end of each filter chain route
+ to prevent infinite hanging of, for example, HTTP
requests packages when only the Z39.50 client partial sink
filter is found in the
route.
</section>
<section>
+ <title><literal>cql_rpn</literal>
+ (mp::filter::CQLtoRPN)</title>
+ <para>
+ A query language transforming filter which catches Z39.50
+ <literal>searchRequest</literal>
+ packages containing <literal>CQL</literal> queries, transforms
+ those to <literal>RPN</literal> queries,
+ and sends the <literal>searchRequests</literal> on to the next
+ filters. Among other things, it is useful in an SRU context.
+ </para>
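+ <para>
+ A minimal configuration sketch follows. It assumes that the
+ filter reads its CQL-to-RPN mapping from a file named in a
+ <literal>conversion</literal> element; both the element name and
+ the path <literal>etc/cql2pqf.txt</literal> are assumptions made
+ for illustration and should be checked against the filter's
+ reference documentation.
+ <screen><![CDATA[
+ <!-- assumed syntax: mapping file path is hypothetical -->
+ <filter type="cql_rpn">
+   <conversion file="etc/cql2pqf.txt"/>
+ </filter>
+]]></screen>
+ </para>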
+ </section>
+
+ <section>
<title><literal>frontend_net</literal>
(mp::filter::FrontendNet)</title>
<para>
<title><literal>http_file</literal>
(mp::filter::HttpFile)</title>
<para>
- A partial sink which swallows only HTTP_Request packages, and
+ A partial sink which swallows only
+ <literal>HTTP_Request</literal> packages, and
returns the contents of files from the local
filesystem in response to HTTP requests.
It lets Z39.50 packages and all other forthcoming package types
<title><literal>query_rewrite</literal>
(mp::filter::QueryRewrite)</title>
<para>
- Rewrites Z39.50 Type-1 and Type-101 (``RPN'') queries by a
+ Rewrites Z39.50 <literal>Type-1</literal>
+ and <literal>Type-101</literal> (``<literal>RPN</literal>'')
+ queries by a
three-step process: the query is transliterated from Z39.50
packet structures into an XML representation; that XML
representation is transformed by an XSLT stylesheet; and the
<title><literal>session_shared</literal>
(mp::filter::SessionShared)</title>
<para>
- When this is finished, it will implement global sharing of
+ This filter implements global sharing of
result sets (i.e. between threads and therefore between
- clients), yielding performance improvements especially when
- incoming requests are from a stateless environment such as a
- web-server, in which the client process representing a session
- might be any one of many. However:
+ clients), yielding performance improvements by clever resource
+ pooling.
</para>
- <warning>
- <para>
- This filter is not yet completed.
- </para>
- </warning>
</section>
<section>
which returns the response to the client.
</para>
</section>
- <section id="checking.xml.syntax">
+
+ <section id="config-file-modularity">
+ <title>Config file modularity</title>
+ <para>
+ Metaproxy XML configuration snippets can be reused by other
+ filters using the <literal>XInclude</literal> standard, as seen in
+ the <literal>etc/config-sru-to-z3950.xml</literal> example SRU
+ configuration.
+ <screen><![CDATA[
+ <filter id="sru" type="sru_z3950">
+ <database name="Default">
+ <xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
+ href="explain.xml"/>
+ </database>
+ </filter>
+]]></screen>
+ </para>
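+ <para>
+ The included <literal>explain.xml</literal> file is then an
+ ordinary XML document of its own. As a rough sketch, assuming it
+ holds a ZeeRex explain record, it might look like the following;
+ the host, port and database values are placeholders and are not
+ taken from the distributed example.
+ <screen><![CDATA[
+ <!-- hypothetical explain.xml; all values are placeholders -->
+ <explain xmlns="http://explain.z3950.org/dtd/2.0/">
+   <serverInfo protocol="SRU">
+     <host>localhost</host>
+     <port>9000</port>
+     <database>Default</database>
+   </serverInfo>
+ </explain>
+]]></screen>
+ </para>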
+ </section>
+
+ <section id="config-file-syntax-check">
<title>Config file syntax checking</title>
<para>
The distribution contains RelaxNG Compact and XML syntax checking
</chapter>
+ <chapter id="sru-server">
+ <title>Combined SRU webservice and Z39.50 server configuration</title>
+ <para>
+ Metaproxy can act as an
+ <ulink url="&url.sru;">SRU</ulink> and
+ <ulink url="&url.srw;">SRW</ulink>
+ web service server, which translates web service requests to
+ <ulink url="&url.z39.50;">ANSI/NISO Z39.50</ulink> packages and
+ sends them off to commonly available targets.
+ </para>
+ <para>
+ A typical setup for this operation needs a filter route including the
+ following modules:
+ </para>
+
+ <table id="sru-server-table-config" frame="top">
+ <title>SRU/Z39.50 Server Filter Route Configuration</title>
+ <tgroup cols="3">
+ <thead>
+ <row>
+ <entry>Filter</entry>
+ <entry>Importance</entry>
+ <entry>Purpose</entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry><literal>frontend_net</literal></entry>
+ <entry>required</entry>
+ <entry>Accepting HTTP connections and passing them to following
+ filters. Since this filter also accepts Z39.50 connections, the
+ server works as SRU and Z39.50 server on the same port.</entry>
+ </row>
+ <row>
+ <entry><literal>sru_z3950</literal></entry>
+ <entry>required</entry>
+ <entry>Accepting SRU GET/POST/SOAP explain and
+ searchRetrieve requests for the configured databases.
+ Explain requests are directly served from the static XML configuration.
+ SearchRetrieve requests are
+ transformed to Z39.50 search and present packages.
+ All other HTTP and Z39.50 packages are passed unaltered.</entry>
+ </row>
+ <row>
+ <entry><literal>http_file</literal></entry>
+ <entry>optional</entry>
+ <entry>Serving HTTP requests from the filesystem. This is only
+ needed if the server should serve XSLT stylesheets, static HTML
+ files or JavaScript for thin browser-based clients.
+ Z39.50 packages are passed unaltered.</entry>
+ </row>
+ <row>
+ <entry><literal>cql_rpn</literal></entry>
+ <entry>required</entry>
+ <entry>Usually, Z39.50 servers do not talk CQL, hence the
+ translation of the CQL query language to RPN is mandatory in
+ most cases. Affects only Z39.50 search packages.</entry>
+ </row>
+ <row>
+ <entry><literal>record_transform</literal></entry>
+ <entry>optional</entry>
+ <entry>Some Z39.50 backend targets cannot present XML record
+ syntaxes in commonly wanted element sets. Using this filter, one
+ can transform binary MARC records to MARCXML records, and
+ further transform those to any needed XML schema/format by XSLT
+ transformations. Changes only Z39.50 present packages.</entry>
+ </row>
+ <row>
+ <entry><literal>session_shared</literal></entry>
+ <entry>optional</entry>
+ <entry>The stateless nature of web services requires frequent
+ re-searching of the same targets for display of paged result set
+ records. This might be an unacceptable burden for the accessed
+ backend Z39.50 targets, and this module can be added for
+ efficient backend target resource pooling.</entry>
+ </row>
+ <row>
+ <entry><literal>z3950_client</literal></entry>
+ <entry>required</entry>
+ <entry>Finally, a Z39.50 package sink is needed in the filter
+ chain to provide the response packages. The Z39.50 client module
+ is used to access external targets over the network, but any
+ future local Z39.50 package sink could be used instead.</entry>
+ </row>
+ <row>
+ <entry><literal>bounce</literal></entry>
+ <entry>required</entry>
+ <entry>Any Metaproxy package arriving here did not do so on
+ purpose, and is bounced back with connection closure. This
+ prevents infinite package hanging inside the SRU server.</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+ <para>
+ A typical minimal example <ulink url="&url.sru;">SRU</ulink> and
+ <ulink url="&url.srw;">SRW</ulink> server configuration file is found
+ in the tarball distribution at
+ <literal>etc/config-sru-to-z3950.xml</literal>.
+ </para>
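+ <para>
+ The required filters from the table above can be chained in a
+ single route, in the order listed. The following is only a sketch
+ under stated assumptions: the <literal>port</literal>,
+ <literal>conversion</literal> and <literal>timeout</literal>
+ elements, the port number and the file paths are illustrative and
+ are not taken from this guide. The shipped
+ <literal>etc/config-sru-to-z3950.xml</literal> remains the
+ authoritative example.
+ <screen><![CDATA[
+ <route id="start">
+   <!-- required: accepts HTTP (SRU) and Z39.50 on one port;
+        the port value is an assumption -->
+   <filter type="frontend_net">
+     <port>@:9000</port>
+   </filter>
+   <!-- required: serves explain, turns searchRetrieve into Z39.50 -->
+   <filter type="sru_z3950">
+     <database name="Default">
+       <xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
+                   href="explain.xml"/>
+     </database>
+   </filter>
+   <!-- optional http_file would go here -->
+   <!-- required: CQL to RPN; the mapping file path is hypothetical -->
+   <filter type="cql_rpn">
+     <conversion file="etc/cql2pqf.txt"/>
+   </filter>
+   <!-- optional record_transform and session_shared would go here -->
+   <!-- required: Z39.50 package sink; the timeout element is assumed -->
+   <filter type="z3950_client">
+     <timeout>30</timeout>
+   </filter>
+   <!-- required: closes anything that was not handled above -->
+   <filter type="bounce"/>
+ </route>
+]]></screen>
+ Keeping <literal>bounce</literal> last means that HTTP packages
+ not consumed by an earlier filter are answered instead of being
+ left hanging.
+ </para>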
+ <para>
+ Of course, any other Metaproxy modules can be integrated into an
+ SRU server solution, including, but not limited to, load balancing,
+ multiple target querying
+ (see <xref linkend="multidb"/>), and complex RPN query rewrites.
+ </para>
+
+ </chapter>
+
+ <!--
<chapter id="extensions">
<title>Writing extensions for Metaproxy</title>
<para>### To be written</para>
</chapter>
-
+ -->