1 <!-- $Id: book.xml,v 1.14 2006-04-23 19:08:56 adam Exp $ -->
3 <title>Metaproxy - User's Guide and Reference</title>
5 <firstname>Mike</firstname><surname>Taylor</surname>
8 <firstname>Adam</firstname><surname>Dickmeiss</surname>
12 <holder>Index Data ApS</holder>
16 Metaproxy is a universal router, proxy and encapsulated
17 metasearcher for information retrieval protocols. It accepts,
18 processes, interprets and redirects requests from IR clients using
19 standard protocols such as
20 <ulink url="&url.z39.50;">ANSI/NISO Z39.50</ulink>
21 (and in the future <ulink url="&url.sru;">SRU</ulink>
22 and <ulink url="&url.srw;">SRW</ulink>), as
23 well as functioning as a limited
24 <ulink url="&url.http;">HTTP</ulink> server.
25 Metaproxy is configured by an XML file which
26 specifies how the software should function in terms of routes that
27 the request packets can take through the proxy, each step on a
28 route being an instantiation of a filter. Filters come in many
29 types, one for each operation: accepting Z39.50 packets, logging,
30 query transformation, multiplexing, etc. Further filter-types can
31 be added as loadable modules to extend Metaproxy functionality,
35 The terms under which Metaproxy will be distributed have yet to be
36 established, but it will not necessarily be open source; so users
37 should not at this stage redistribute the code without explicit
38 written permission from the copyright holders, Index Data ApS.
43 <imagedata fileref="common/id.png" format="PNG"/>
46 <imagedata fileref="common/id.eps" format="EPS"/>
53 <chapter id="introduction">
54 <title>Introduction</title>
58 <ulink url="&url.metaproxy;">Metaproxy</ulink>
59 is a standalone program that acts as a universal router, proxy and
60 encapsulated metasearcher for information retrieval protocols such
61 as <ulink url="&url.z39.50;">Z39.50</ulink>, and in the future
62 <ulink url="&url.sru;">SRU</ulink> and <ulink url="&url.srw;">SRW</ulink>.
63 To clients, it acts as a server of these protocols: it can be searched,
64 records can be retrieved from it, etc.
65 To servers, it acts as a client: it searches in them,
66 retrieves records from them, etc. It satisfies its clients'
67 requests by transforming them, multiplexing them, forwarding them
68 on to zero or more servers, merging the results, transforming
69 them, and delivering them back to the client. In addition, it
70 acts as a simple <ulink url="&url.http;">HTTP</ulink> server; support
71 for further protocols can be added in a modular fashion, through the
72 creation of new filters.
77 Cold bananas, fish, pyjamas,
78 Mutton, beef and trout!
79 - attributed to Cole Porter.
82 Metaproxy is a more capable alternative to
83 <ulink url="&url.yazproxy;">YAZ Proxy</ulink>,
84 being more powerful, flexible, configurable and extensible. Among
85 its many advantages over the older, more pedestrian work are
86 support for multiplexing (encapsulated metasearching), routing by
87 database name, authentication and authorisation and serving local
88 files via HTTP. Equally significant, its modular architecture
89 facilitates the creation of pluggable modules implementing further
93 This manual will briefly describe Metaproxy's licensing situation
94 before giving an overview of its architecture, then discussing the
95 key concept of a filter in some depth and giving an overview of
96 the various filter types, then discussing the configuration file
97 format. After this come several optional chapters which may be
98 freely skipped: a detailed discussion of virtual databases and
99 multi-database searching, some notes on writing extensions
100 (additional filter types) and a high-level description of the
101 source code. Finally comes the reference guide, which contains
102 instructions for invoking the <command>metaproxy</command>
103 program, and detailed information on each type of filter,
110 <chapter id="licence">
111 <title>The Metaproxy Licence</title>
113 <emphasis role="strong">
114 No decision has yet been made on the terms under which
115 Metaproxy will be distributed.
117 It is possible that, unlike
118 other Index Data products, Metaproxy may not be released under a
119 free-software licence such as the GNU GPL. Until a decision has
120 been made and publicly announced, and unless it has been
121 delivered to you under other specific terms, please treat Metaproxy as
122 though it were proprietary software.
123 The code should not be redistributed without explicit
124 written permission from the copyright holders, Index Data ApS.
130 <chapter id="architecture">
131 <title>The Metaproxy Architecture</title>
133 The Metaproxy architecture is based on three concepts:
134 the <emphasis>package</emphasis>,
135 the <emphasis>route</emphasis>
136 and the <emphasis>filter</emphasis>.
140 <term>Packages</term>
143 A package is a request or response, encoded in some protocol,
144 issued by a client, making its way through Metaproxy, sent to or
145 received from a server, or sent back to the client.
148 The core of a package is the protocol unit - for example, a
149 Z39.50 Init Request or Search Response, or an SRU searchRetrieve
150 URL or Explain Response. In addition to this core, a package
151 also carries some extra information added and used by Metaproxy
155 In general, packages are doctored as they pass through
156 Metaproxy. For example, when the proxy performs authentication
157 and authorisation on a Z39.50 Init request, it removes the
158 authentication credentials from the package so that they are not
159 passed on to the back-end server; and when search-response
160 packages are obtained from multiple servers, they are merged
161 into a single unified package that makes its way back to the
170 Packages make their way through routes, which can be thought of
171 as programs that operate on the package data-type. Each
172 incoming package initially makes its way through a default
173 route, but may be switched to a different route based on various
174 considerations. Routes are made up of sequences of filters (see
183 Filters provide the individual instructions within a route, and
184 effect the necessary transformations on packages. A particular
185 configuration of Metaproxy is essentially a set of filters,
186 described by configuration details and arranged in order in one
187 or more routes. There are many kinds of filter - about a dozen
188 at the time of writing, with more appearing all the time - each
189 performing a specific function and configured by different
193 The word ``filter'' is sometimes used rather loosely, in two
194 different ways: it may be used to mean a particular
195 <emphasis>type</emphasis> of filter, as when we speak of ``the
196 auth_simple filter'' or ``the multi filter''; or it may be used
197 to mean a specific <emphasis>instance</emphasis> of a filter
198 within a Metaproxy configuration. For example, a single
199 configuration will often contain multiple instances of the
200 <literal>z3950_client</literal> filter. In
201 operational terms, each of these is a separate filter. In practice,
202 context always makes it clear which sense of the word ``filter''
206 Extensibility of Metaproxy is primarily through the creation of
207 plugins that provide new filters. The filter API is small and
208 conceptually simple, but there are many details to master. See
210 <link linkend="extensions">extensions</link>.
216 Since packages are created and handled by the system itself, and
217 routes are conceptually simple, most of the remainder of this
218 document concentrates on filters. A brief overview of the
219 filter types follows, along with some thoughts on possible future
226 <chapter id="filters">
227 <title>Filters</title>
231 <title>Introductory notes</title>
233 It's useful to think of Metaproxy as an interpreter providing a small
234 number of primitives and operations, but operating on a very
235 complex data type, namely the ``package''.
238 A package represents a Z39.50 or SRU/W request (whether for Init,
239 Search, Scan, etc.) together with information about where it came
240 from. Packages are created by front-end filters such as
241 <literal>frontend_net</literal> (see below), which reads them from
242 the network; other front-end filters are possible. They then pass
243 along a route consisting of a sequence of filters, each of which
244 transforms the package and may also have side-effects such as
245 generating logging. Eventually, the route will yield a response,
246 which is sent back to the origin.
249 There are many kinds of filter: some are defined statically
250 as part of Metaproxy, and others may be provided by third parties
251 and dynamically loaded. They all conform to the same simple API
252 of essentially two methods: <function>configure()</function> is
253 called at startup time, and is passed a DOM tree representing that
254 part of the configuration file that pertains to this filter
255 instance: it is expected to walk that tree extracting relevant
256 information; and <function>process()</function> is called every
257 time the filter has to process a package.
260 While all filters provide the same API, there are different modes
261 of functionality. Some filters are sources: they create
263 (<literal>frontend_net</literal>);
264 others are sinks: they consume packages and return a result
265 (<literal>z3950_client</literal>,
266 <literal>backend_test</literal>,
267 <literal>http_file</literal>);
268 the others are true filters, that read, process and pass on the
269 packages they are fed
270 (<literal>auth_simple</literal>,
271 <literal>log</literal>,
272 <literal>multi</literal>,
273 <literal>query_rewrite</literal>,
274 <literal>session_shared</literal>,
275 <literal>template</literal>,
276 <literal>virt_db</literal>).
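<para>
To make the three modes concrete, the following sketch shows a single
route written in the configuration syntax described in
<link linkend="configuration">the configuration chapter</link>.
It is an illustration only: the <literal>id</literal> values are
hypothetical, and each <literal>refid</literal> is assumed to name a
filter of the indicated type defined elsewhere in the same file.
</para>
<screen><![CDATA[
<route id="start">
  <!-- source: creates packages from incoming network requests -->
  <filter refid="frontend"/>   <!-- assumed type: frontend_net -->
  <!-- true filter: reads the package, logs it and passes it on -->
  <filter refid="logger"/>     <!-- assumed type: log -->
  <!-- sink: consumes the package and returns a result -->
  <filter refid="backend"/>    <!-- assumed type: z3950_client -->
</route>
]]></screen>
<para>
A package enters at the source, passes through the true filter and
ends at the sink; the response then travels back along the same
sequence in the opposite direction.
</para>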
281 <section id="overview.filter.types">
282 <title>Overview of filter types</title>
284 We now briefly consider each of the types of filter supported by
285 the core Metaproxy binary. This overview is intended to give a
286 flavour of the available functionality; more detailed information
287 about each type of filter is included below in
288 <link linkend="filterref"
289 >the reference guide to Metaproxy filters</link>.
292 The filters are here named by the string that is used as the
293 <literal>type</literal> attribute of a
294 <literal>&lt;filter&gt;</literal> element in the configuration
295 file to request them, with the name of the class that implements
296 them in parentheses. (The classname is not needed for normal
297 configuration and use of Metaproxy; it is useful only to
301 The filters are here listed in alphabetical order:
305 <title><literal>auth_simple</literal>
306 (mp::filter::AuthSimple)</title>
308 Simple authentication and authorisation. The configuration
309 specifies the name of a file that is the user register, which
310 lists <varname>username</varname>:<varname>password</varname>
311 pairs, one per line, colon separated. When a session begins, it
312 is rejected unless username and password are supplied, and match
313 a pair in the register. The configuration file may also specify
314 the name of another file that is the target register: this lists
315 <varname>username</varname>:<varname>dbname</varname>,<varname>dbname</varname>...
316 sets, one per line, with multiple database names separated by
317 commas. When a search is processed, it is rejected unless the
318 database to be searched is one of those listed as available to
324 <title><literal>backend_test</literal>
325 (mp::filter::Backend_test)</title>
327 A sink that provides dummy responses in the manner of the
328 <literal>yaz-ztest</literal> Z39.50 server. This is useful only
329 for testing. Seriously, you don't need this. Pretend you didn't
330 even read this section.
335 <title><literal>frontend_net</literal>
336 (mp::filter::FrontendNet)</title>
338 A source that accepts Z39.50 connections from a port
339 specified in the configuration, reads protocol units, and
340 feeds them into the next filter in the route. When the result is
341 received, it is returned to the origin.
346 <title><literal>http_file</literal>
347 (mp::filter::HttpFile)</title>
349 A sink that returns the contents of files from the local
350 filesystem in response to HTTP requests. (Yes, Virginia, this
351 does mean that Metaproxy is also a Web-server in its spare time. So
352 far it does not contain either an email-reader or a Lisp
353 interpreter, but that day is surely coming.)
358 <title><literal>log</literal>
359 (mp::filter::Log)</title>
361 Writes logging information to standard output, and passes on
362 the package unchanged.
367 <title><literal>multi</literal>
368 (mp::filter::Multi)</title>
370 Performs multicast searching.
372 <link linkend="multidb">the extended discussion</link>
373 of virtual databases and multi-database searching below.
378 <title><literal>query_rewrite</literal>
379 (mp::filter::QueryRewrite)</title>
381 Rewrites Z39.50 Type-1 and Type-101 (``RPN'') queries by a
382 three-step process: the query is transliterated from Z39.50
383 packet structures into an XML representation; that XML
384 representation is transformed by an XSLT stylesheet; and the
385 resulting XML is transliterated back into the Z39.50 packet
391 <title><literal>session_shared</literal>
392 (mp::filter::SessionShared)</title>
394 When this is finished, it will implement global sharing of
395 result sets (i.e. between threads and therefore between
396 clients), yielding performance improvements especially when
397 incoming requests are from a stateless environment such as a
398 web-server, in which the client process representing a session
399 might be any one of many. However:
403 This filter is not yet completed.
409 <title><literal>template</literal>
410 (mp::filter::Template)</title>
412 Does nothing at all, merely passing the packet on. (Maybe it
413 should be called <literal>nop</literal> or
414 <literal>passthrough</literal>?) This exists not to be used, but
415 to be copied - to become the skeleton of new filters as they are
416 written. As with <literal>backend_test</literal>, this is not
417 intended for civilians.
422 <title><literal>virt_db</literal>
423 (mp::filter::Virt_db)</title>
425 Performs virtual database selection: based on the name of the
426 database in the search request, a server is selected, and its
427 address added to the request in a <literal>VAL_PROXY</literal>
428 otherInfo packet. It will subsequently be used by a
429 <literal>z3950_client</literal> filter.
431 <link linkend="multidb">the extended discussion</link>
432 of virtual databases and multi-database searching below.
437 <title><literal>z3950_client</literal>
438 (mp::filter::Z3950Client)</title>
440 Performs Z39.50 searching and retrieval by proxying the
441 packages that are passed to it. Init requests are sent to the
442 address specified in the <literal>VAL_PROXY</literal> otherInfo
443 attached to the request: this may have been specified by the client,
444 or generated by a <literal>virt_db</literal> filter earlier in
445 the route. Subsequent requests are sent to the same address,
446 which is remembered at Init time in a Session object.
452 <section id="future.directions">
453 <title>Future directions</title>
455 Some other filters that do not yet exist, but which would be
456 useful, are briefly described. These may be added in future
457 releases (or may be created by third parties, as loadable
463 <term><literal>frontend_cli</literal> (source)</term>
466 Command-line interface for generating requests.
471 <term><literal>frontend_sru</literal> (source)</term>
474 Receive SRU (and perhaps SRW) requests.
479 <term><literal>sru2z3950</literal> (filter)</term>
482 Translate SRU requests into Z39.50 requests.
487 <term><literal>sru_client</literal> (sink)</term>
490 SRU searching and retrieval.
495 <term><literal>srw_client</literal> (sink)</term>
498 SRW searching and retrieval.
503 <term><literal>opensearch_client</literal> (sink)</term>
506 A9 OpenSearch searching and retrieval.
516 <chapter id="configuration">
517 <title>Configuration: the Metaproxy configuration file format</title>
521 <title>Introductory notes</title>
523 If Metaproxy is an interpreter providing operations on packages, then
524 its configuration file can be thought of as a program for that
525 interpreter. Configuration is by means of a single file, the name
526 of which is supplied as the sole command-line argument to the
527 <command>metaproxy</command> program. (See
528 <link linkend="progref">the reference guide</link>
529 below for more information on invoking Metaproxy.)
532 The configuration files are written in XML. (But that's just an
533 implementation detail - they could just as well have been written
534 in YAML or Lisp-like S-expressions, or in a custom syntax.)
537 Since XML has been chosen, an XML schema,
538 <filename>config.xsd</filename>, is provided for validating
539 configuration files. This file is supplied in the
540 <filename>etc</filename> directory of the Metaproxy distribution. It
541 can be used by (among other tools) the <command>xmllint</command>
542 program supplied as part of the <literal>libxml2</literal>
546 xmllint --noout --schema etc/config.xsd my-config-file.xml
549 (A recent version of <literal>libxml2</literal> is required, as
550 support for XML Schemas is a relatively recent addition.)
554 <section id="overview.xml.structure">
555 <title>Overview of XML structure</title>
557 All elements and attributes are in the namespace
558 <ulink url="http://indexdata.dk/yp2/config/1"/>.
559 This is most easily achieved by setting the default namespace on
560 the top-level element, as here:
563 <yp2 xmlns="http://indexdata.dk/yp2/config/1">
566 The top-level element is &lt;yp2&gt;. This contains a
567 &lt;start&gt; element, a &lt;filters&gt; element and a
568 &lt;routes&gt; element, in that order. &lt;filters&gt; is
569 optional; the other two are mandatory. All three are
573 The &lt;start&gt; element is empty, but carries a
574 <literal>route</literal> attribute, whose value is the name of
575 the route at which to start running - analogous to the name of the
576 start production in a formal grammar.
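<para>
The overall shape described so far can be sketched as follows. This
skeleton uses only the elements and attributes named in this section;
the route name <literal>start</literal> is illustrative, and the
comments mark where real filter definitions would go. It is a sketch
of the structure rather than a useful configuration, since a route
with no filters does nothing.
</para>
<screen><![CDATA[
<?xml version="1.0"?>
<yp2 xmlns="http://indexdata.dk/yp2/config/1">
  <!-- mandatory: names the route at which processing starts -->
  <start route="start"/>
  <!-- optional: filters defined here can be referenced from routes -->
  <filters>
    <!-- zero or more <filter id="..." type="..."> elements -->
  </filters>
  <!-- mandatory: one route must have the id named in <start> -->
  <routes>
    <route id="start">
      <!-- zero or more <filter> elements -->
    </route>
  </routes>
</yp2>
]]></screen>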
579 If present, &lt;filters&gt; contains zero or more &lt;filter&gt;
580 elements. Each filter carries a <literal>type</literal> attribute
581 which specifies what kind of filter is being defined
582 (<literal>frontend_net</literal>, <literal>log</literal>, etc.)
583 and contains various elements that provide suitable configuration
584 for a filter of its type. The filter-specific elements are
586 <link linkend="filterref">the reference guide below</link>.
587 Filters defined in this part of the file must carry an
588 <literal>id</literal> attribute so that they can be referenced
592 &lt;routes&gt; contains one or more &lt;route&gt; elements, each
593 of which must carry an <literal>id</literal> attribute. One of the
594 routes must have the ID value that was specified as the start
595 route in the &lt;start&gt; element's <literal>route</literal>
596 attribute. Each route contains zero or more &lt;filter&gt;
597 elements. These are of two types. They may be empty, but carry a
598 <literal>refid</literal> attribute whose value is the same as the
599 <literal>id</literal> of a filter previously defined in the
600 &lt;filters&gt; section. Alternatively, a filter within a route
601 may omit the <literal>refid</literal> attribute, but contain
602 configuration elements similar to those used for filters defined
603 in the &lt;filters&gt; section. (In other words, each filter in a
604 route may be included either by reference or by physical
610 <section id="example.configuration">
611 <title>An example configuration</title>
613 The following is a small, but complete, Metaproxy configuration
614 file (included in the distribution as
615 <literal>metaproxy/etc/config0.xml</literal>).
616 This file defines a very simple configuration that simply proxies
617 to whatever backend server the client requests, but logs each
618 request and response. This can be useful for debugging complex
619 client-server dialogues.
622 <?xml version="1.0"?>
623 <yp2 xmlns="http://indexdata.dk/yp2/config/1">
624 <start route="start"/>
626 <filter id="frontend" type="frontend_net">
629 <filter id="backend" type="z3950_client">
634 <filter refid="frontend"/>
636 <filter refid="backend"/>
642 It works by defining a single route, called
643 <literal>start</literal>, which consists of a sequence of three
644 filters. The first and last of these are included by reference:
645 their <literal>&lt;filter&gt;</literal> elements have
646 <literal>refid</literal> attributes that refer to filters defined
647 within the prior <literal>&lt;filters&gt;</literal> section. The
648 middle filter is included inline in the route.
651 The three filters in the route are as follows: first, a
652 <literal>frontend_net</literal> filter accepts Z39.50 requests
653 from any host on port 9000; then these requests are passed through
654 a <literal>log</literal> filter that emits a message for each
655 request; they are then fed into a <literal>z3950_client</literal>
656 filter, which forwards the requests to the client-specified
657 backend Z39.50 server. When the response arrives, it is handed
658 back to the <literal>log</literal> filter, which emits another
659 message; and then to the front-end filter, which returns the
660 response to the client.
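<para>
For comparison, here is a sketch of how the same route might look if
the <literal>log</literal> filter were defined in the
<literal>&lt;filters&gt;</literal> section and included by reference,
like the other two. This variant is not taken from the distribution:
the <literal>id</literal> value <literal>logger</literal> is
hypothetical, and it is an assumption that the
<literal>log</literal> filter can be used with no configuration
elements of its own.
</para>
<screen><![CDATA[
<filters>
  <!-- hypothetical id; assumes log needs no configuration elements -->
  <filter id="logger" type="log"/>
</filters>
<routes>
  <route id="start">
    <filter refid="frontend"/>
    <!-- by reference, instead of an inline filter element -->
    <filter refid="logger"/>
    <filter refid="backend"/>
  </route>
</routes>
]]></screen>
<para>
Defining a filter once and referring to it by
<literal>refid</literal> keeps the routes short; inline inclusion
keeps a filter's configuration next to the place where it is used.
</para>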
667 <chapter id="multidb">
668 <title>Virtual databases and multi-database searching</title>
672 <title>Introductory notes</title>
674 <title>Lark's vomit</title>
676 This chapter goes into a level of technical detail that is
677 probably not necessary in order to configure and use Metaproxy.
678 It is provided only for those who like to know how things work.
679 You should feel free to skip on to the next section if this one
680 doesn't seem like fun.
684 Two of Metaproxy's filters are concerned with multiple-database
685 operations. Of these, <literal>virt_db</literal> can work alone
686 to control the routing of searches to one of a number of servers,
687 while <literal>multi</literal> can work with the output of
688 <literal>virt_db</literal> to perform multicast searching, merging
689 the results into a unified result-set. The interaction between
690 these two filters is necessarily complex: it reflects the real,
691 irreducible complexity of multicast searching in a protocol such
692 as Z39.50 that separates initialisation from searching, and in
693 which the database to be searched is not known at initialisation
697 Hold on tight - this may get a little hairy.
700 In the general course of things, a Z39.50 Init request may carry
701 with it an otherInfo packet of type <literal>VAL_PROXY</literal>,
702 whose value indicates the address of a Z39.50 server to which the
703 ultimate connection is to be made. (This otherInfo packet is
704 supported by YAZ-based Z39.50 clients and servers, but has not yet
705 been ratified by the Maintenance Agency and so is not widely used
706 in non-Index Data software. We're working on it.)
707 The <literal>VAL_PROXY</literal> packet functions
708 analogously to the absoluteURI-style Request-URI used with the GET
709 method when a web browser asks a proxy to forward its request: see
711 <ulink url="http://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html#sec5.1.2"
714 <ulink url="http://www.w3.org/Protocols/rfc2616/rfc2616.html"
715 >the HTTP 1.1 specification</ulink>.
718 The role of the <literal>virt_db</literal> filter is to rewrite
719 this otherInfo packet dependent on the virtual database that the
720 client wants to search. For example, a <literal>virt_db</literal>
721 filter could be set up so that searches in the virtual database
722 ``lc'' are forwarded to the Library of Congress server, and
723 searches in the virtual database ``id'' are forwarded to the toy
724 GILS database that Index Data hosts for testing purposes. A
725 <literal>virt_db</literal> configuration to make this switch would
729 <filter type="virt_db">
731 <database>lc</database>
732 <target>z3950.loc.gov:7090/Voyager</target>
735 <database>id</database>
736 <target>indexdata.dk/gils</target>
738 </filter>]]></screen>
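<para>
Since the address written by <literal>virt_db</literal> is later
used by a <literal>z3950_client</literal> filter, the
<literal>virt_db</literal> filter has to come earlier in the route
than the <literal>z3950_client</literal> filter. The following
sketch shows one way of arranging this. The <literal>id</literal>
values are hypothetical, the configuration of the individual filters
is left out, and the fragment is not taken from the distribution.
</para>
<screen><![CDATA[
<filters>
  <filter id="frontend" type="frontend_net">
    <!-- port configuration omitted -->
  </filter>
  <filter id="vdb" type="virt_db">
    <!-- virtual database definitions as in the listing above -->
  </filter>
  <filter id="backend" type="z3950_client">
  </filter>
</filters>
<routes>
  <route id="start">
    <filter refid="frontend"/>
    <!-- chooses the back-end address from the database name -->
    <filter refid="vdb"/>
    <!-- connects to the address chosen by the previous filter -->
    <filter refid="backend"/>
  </route>
</routes>
]]></screen>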
740 When Metaproxy receives a Z39.50 Init request from a client, it
741 doesn't immediately forward that request to the back-end server.
742 Why not? Because it doesn't know <emphasis>which</emphasis>
743 back-end server to forward it to until the client sends a search
744 request that specifies the database that it wants to search in.
745 Instead, it just treasures the Init request up in its heart; and,
746 later, the first time the client does a search on one of the
747 specified virtual databases, a connection is forged to the
748 appropriate server and the Init request is forwarded to it. If,
749 later in the session, the same client searches in a different
750 virtual database, then a connection is forged to the server that
751 hosts it, and the same cached Init request is forwarded there,
755 All of this clever Init-delaying is done by the
756 <literal>frontend_net</literal> filter. The
757 <literal>virt_db</literal> filter knows nothing about it; in
758 fact, because the Init request that is received from the client
759 doesn't get forwarded until a Search request is received, the
760 <literal>virt_db</literal> filter (and the
761 <literal>z3950_client</literal> filter behind it) doesn't even get
762 invoked at Init time. The <emphasis>only</emphasis> thing that a
763 <literal>virt_db</literal> filter ever does is rewrite the
764 <literal>VAL_PROXY</literal> otherInfo in the requests that pass
772 <chapter id="extensions">
773 <title>Writing extensions for Metaproxy</title>
774 <para>### To be written</para>
780 <chapter id="classes">
781 <title>Classes in the Metaproxy source code</title>
785 <title>Introductory notes</title>
787 <emphasis>Stop! Do not read this!</emphasis>
788 You won't enjoy it at all. You should just skip ahead to
789 <link linkend="refguide">the reference guide</link>,
791 <!-- The remainder of this paragraph is lifted verbatim from
792 Douglas Adams' _Hitch Hiker's Guide to the Galaxy_, chapter 8 -->
793 you things you really need to know, like the fact that the
794 fabulously beautiful planet Bethselamin is now so worried about
795 the cumulative erosion by ten billion visiting tourists a year
796 that any net imbalance between the amount you eat and the amount
797 you excrete whilst on the planet is surgically removed from your
798 bodyweight when you leave: so every time you go to the lavatory it
799 is vitally important to get a receipt.
802 This chapter contains documentation of the Metaproxy source code, and is
803 of interest only to maintainers and developers. If you need to
804 change Metaproxy's behaviour or write a new filter, then you will most
805 likely find this chapter helpful. Otherwise it's a waste of your
806 good time. Seriously: go and watch a film or something.
807 <citetitle>This is Spinal Tap</citetitle> is particularly good.
810 Still here? OK, let's continue.
813 In general, classes seem to be named big-endianly, so that
814 <literal>FactoryFilter</literal> is not a filter that filters
815 factories, but a factory that produces filters; and
816 <literal>FactoryStatic</literal> is a factory for the statically
817 registered filters (as opposed to those that are dynamically
822 <section id="individual.classes">
823 <title>Individual classes</title>
825 The classes making up the Metaproxy application are here listed by
826 class-name, with the names of the source files that define them in
831 <title><literal>mp::FactoryFilter</literal>
832 (<filename>factory_filter.cpp</filename>)</title>
834 A factory class that exists primarily to provide the
835 <literal>create()</literal> method, which takes the name of a
836 filter class as its argument and returns a new filter of that
837 type. To enable this, the factory must first be populated by
838 calling <literal>add_creator()</literal> for static filters (this
839 is done by the <literal>FactoryStatic</literal> class, see below)
840 and <literal>add_creator_dyn()</literal> for filters loaded
846 <title><literal>mp::FactoryStatic</literal>
847 (<filename>factory_static.cpp</filename>)</title>
849 A subclass of <literal>FactoryFilter</literal> which is
850 responsible for registering all the statically defined filter
851 types. It does this by knowing about all those filters'
852 structures, which are listed in its constructor. Merely
853 instantiating this class registers all the static classes. It is
854 for the benefit of this class that <literal>struct
855 metaproxy_1_filter_struct</literal> exists, and that all the filter
856 classes provide a static object of that type.
861 <title><literal>mp::filter::Base</literal>
862 (<filename>filter.cpp</filename>)</title>
864 The virtual base class of all filters. The filter API is, on the
865 surface at least, extremely simple: two methods.
866 <literal>configure()</literal> is passed a DOM tree representing
867 that part of the configuration file that pertains to this filter
868 instance, and is expected to walk that tree extracting relevant
869 information. And <literal>process()</literal> processes a
870 package (see below). That surface simplicity is a bit
871 misleading, as <literal>process()</literal> needs to know a lot
872 about the <literal>Package</literal> class in order to do
878 <title><literal>mp::filter::AuthSimple</literal>,
879 <literal>Backend_test</literal>, etc.
880 (<filename>filter_auth_simple.cpp</filename>,
881 <filename>filter_backend_test.cpp</filename>, etc.)</title>
883 Individual filters. Each of these is implemented by a header and
884 a source file, named <filename>filter_*.hpp</filename> and
885 <filename>filter_*.cpp</filename> respectively. All the header
886 files should be pretty much identical, in that they declare the
887 class, including a private <literal>Rep</literal> class and a
888 member pointer to it, and the two public methods. The only extra
889 information in any filter header is additional private types and
890 members (which should really all be in the <literal>Rep</literal>
891 anyway) and private methods (which should also remain known only
892 to the source file, but C++'s brain-damaged design requires this
893 dirty laundry to be exhibited in public. Thanks, Bjarne!)
896 The source file for each filter needs to supply:
901 A definition of the private <literal>Rep</literal> class.
906 Some boilerplate constructors and destructors.
911 A <literal>configure()</literal> method that uses the
912 appropriate XML fragment.
917 Most important, the <literal>process()</literal> method that
918 does all the actual work.
925 <title><literal>mp::Package</literal>
926 (<filename>package.cpp</filename>)</title>
928 Represents a package on its way through the series of filters
929 that make up a route. This is essentially a Z39.50 or SRU APDU
930 together with information about where it came from, which is
931 modified as it passes through the various filters.
936 <title><literal>mp::Pipe</literal>
937 (<filename>pipe.cpp</filename>)</title>
939 This class provides a compatibility layer so that we have an IPC
940 mechanism that works the same under Unix and Windows. It's not
941 particularly exciting.
946 <title><literal>mp::RouterChain</literal>
947 (<filename>router_chain.cpp</filename>)</title>
954 <title><literal>mp::RouterFleXML</literal>
955 (<filename>router_flexml.cpp</filename>)</title>
962 <title><literal>mp::Session</literal>
963 (<filename>session.cpp</filename>)</title>
970 <title><literal>mp::ThreadPoolSocketObserver</literal>
971 (<filename>thread_pool_observer.cpp</filename>)</title>
978 <title><literal>mp::util</literal>
979 (<filename>util.cpp</filename>)</title>
981 A namespace of various small utility functions and classes,
982 collected together for convenience. Most importantly, includes
983 the <literal>mp::util::odr</literal> class, a wrapper for YAZ's
989 <title><literal>mp::xml</literal>
990 (<filename>xmlutil.cpp</filename>)</title>
992 A namespace of various XML utility functions and classes,
993 collected together for convenience.
999 <section id="other.source.files">
1000 <title>Other Source Files</title>
1002 In addition to the Metaproxy source files that define the classes
1003 described above, there are a few additional files which are
1004 briefly described here:
1008 <term><literal>metaproxy_prog.cpp</literal></term>
1011 The main function of the <command>metaproxy</command> program.
1016 <term><literal>ex_router_flexml.cpp</literal></term>
1019 Identical to <literal>metaproxy_prog.cpp</literal>: it's not clear why.
1024 <term><literal>test_*.cpp</literal></term>
1027 Unit-tests for various modules.
1033 ### Still to be described:
1034 <literal>ex_filter_frontend_net.cpp</literal>,
1035 <literal>filter_dl.cpp</literal>,
1036 <literal>plainfile.cpp</literal>,
1037 <literal>tstdl.cpp</literal>.
1044 <chapter id="refguide">
1045 <title>Reference guide</title>
1047 The material in this chapter is drawn directly from the individual
1048 manual entries. In particular, the Metaproxy invocation section is
1049 available using <command>man metaproxy</command>, and the section
1050 on each individual filter is available using the name of the filter
1051 as the argument to the <command>man</command> command.
1055 <section id="progref">
1056 <title>Metaproxy invocation</title>
1061 <section id="filterref">
1062 <title>Reference guide to Metaproxy filters</title>
1069 <!-- Keep this comment at the end of the file
1074 sgml-minimize-attributes:nil
1075 sgml-always-quote-attributes:t
1078 sgml-parent-document: "main.xml"
1079 sgml-local-catalogs: nil
1080 sgml-namecase-general:t
1081 nxml-child-indent: 1