<chapter id="proxy-reference">
<title>Proxy Reference</title>
<section id="proxy-operation">
<title>Operating Environment</title>
<para>
The YAZ proxy is a single program. After startup it spawns
a child process (except on Windows or if option <literal>-X</literal> is given).
The child process is the core of the proxy and handles all
communication with clients and servers. If the child process dies
unexpectedly, the parent process restarts it and reports
the reason. For the options accepted by the YAZ proxy,
see <xref linkend="proxy-usage"/>.
</para>
<para>
As an option, the proxy may change user identity to a less privileged
user.
</para>
<section id="proxy-target">
<title>Specifying the Backend Server</title>
<para>
When the proxy receives a Z39.50 Initialize Request from a Z39.50
client, it determines the backend server by the following rules:
<orderedlist>
<listitem>
<para>If the <literal>InitializeRequest</literal> PDU from the
client includes an
<link linkend="otherinfo-encoding"><literal>otherInfo</literal></link>
element with OID
<literal>1.2.840.10003.10.1000.81.1</literal>, then the
contents of that element specify the server to be used, in the
usual YAZ address format (typically
<literal>tcp:<parameter>hostname</parameter>:<parameter>port</parameter></literal>)
as described in
<ulink url="http://www.indexdata.dk/yaz/doc/comstack.addresses.tkl"
>the Addresses section of the YAZ manual</ulink>.
</para>
</listitem>
<listitem>
<para>Otherwise, the proxy uses the default server, if one was
specified in the proxy configuration file. See
<xref linkend="proxy-config-target"/>.
</para>
</listitem>
<listitem>
<para>Otherwise, the proxy uses the default server, if one was
specified on the command line with the <literal>-t</literal>
option.
</para>
</listitem>
<listitem>
<para>Otherwise, the proxy closes the connection with
the client.
</para>
</listitem>
</orderedlist>
</para>
<section id="proxy-keepalive">
<title>Keep-alive Facility</title>
<para>
Keep-alive is a facility where the proxy keeps the connection to the
backend server open, even if the client closes its connection to the proxy.
</para>
<para>
If the same or another client later connects to the proxy and requests the
same backend, it is reassigned to this backend session. In this case, the
proxy sends an initialize response directly to the client and the
initialize handshake with the backend is omitted.
</para>
<para>
When a client reconnects, query and record caching work better if the
proxy assigns it to the same backend session as before, and the result set
(if any) can be reused. To achieve this, Index Data defined a session
cookie that identifies the backend session.
</para>
<para>
The cookie is defined by the client and is sent as part of the
Initialize Request, passed in an
<link linkend="otherinfo-encoding"><literal>otherInfo</literal></link>
element with OID <literal>1.2.840.10003.10.1000.81.2</literal>.
</para>
<para>
Clients that do not send a cookie as part of the Initialize Request
may still see better performance, since the init handshake is saved.
</para>
<para>
Refer to <xref linkend="proxy-config-keepalive"/> on how to set up
configuration parameters for keepalive.
</para>
<section id="proxy-config-file">
<title>Proxy Configuration File</title>
<para>
The proxy may read a configuration file using option
<literal>-c</literal> followed by the filename of a config file.
</para>
<para>
The config file is XML based. The YAZ proxy must be compiled
with <ulink url="http://www.xmlsoft.org/">libxml2</ulink> and
<ulink url="http://xmlsoft.org/XSLT/">libXSLT</ulink> support in
order for the config file facility to be enabled.
</para>
<para>To check that a config file is well-formed, yazproxy may
be invoked without specifying a listening port, i.e.
<screen>
yazproxy -c myconfig.xml
</screen>
If this does not produce errors, the file is well-formed.
</para>
<section id="proxy-config-header">
<title>Proxy Configuration Header</title>
<para>
The proxy config file must have a root element called
<literal>proxy</literal>, scoped within the namespace
<literal>xmlns="http://indexdata.dk/yazproxy/schema/0.8/"</literal>.
All information except an optional XML header must be stored
within the <literal>proxy</literal> element.
</para>
<screen><![CDATA[
<?xml version="1.0"?>
<proxy xmlns="http://indexdata.dk/yazproxy/schema/0.8/">
 <!-- content here .. -->
</proxy>
]]></screen>
<section id="proxy-config-target">
<title>target</title>
<para>
The element <literal>target</literal>, which may be repeated zero
or more times with parent element <literal>proxy</literal>, contains
information about each backend target.
The <literal>target</literal> element has two attributes:
<literal>name</literal> (required), which holds the logical name of the
backend target, and <literal>default</literal> (optional), which
(when given) specifies that the backend target is the default target.
This is equivalent to command line option <literal>-t</literal>.
</para>
<screen><![CDATA[
<?xml version="1.0"?>
<proxy xmlns="http://indexdata.dk/yazproxy/schema/0.8/">
 <target name="server1" default="1">
  <!-- description of server1 .. -->
 </target>
 <target name="server2">
  <!-- description of server2 .. -->
 </target>
</proxy>
]]></screen>
<section id="proxy-config-url">
<title>url</title>
<para>
The <literal>url</literal> element, which may be repeated one or more times,
should be a child of the <literal>target</literal> element.
The CDATA of <literal>url</literal> is the Z-URL of the backend.
</para>
<para>
Multiple <literal>url</literal> elements may be used. In that case, when
a client initiates a session, the proxy chooses the URL with the lowest
number of active sessions, thereby distributing the load. It is
assumed that each URL represents the same database (data).
</para>
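<para>
As a rough illustration of the load-distribution case, a target with
two <literal>url</literal> elements might look like the fragment below.
The target name and both host names are placeholders invented for this
sketch; substitute the Z-URLs of your own backends.
</para>
<screen><![CDATA[
<target name="server1">
 <!-- hypothetical backends, both assumed to serve the same data -->
 <url>backend1.example.org:210</url>
 <url>backend2.example.org:210</url>
</target>
]]></screen>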
<section id="proxy-config-target-timeout">
<title>target-timeout</title>
<para>
The element <literal>target-timeout</literal> is a child of element
<literal>target</literal> and specifies the time in seconds before
a target session is shut down.
</para>
<para>
This can also be specified on the command line by using option
<literal>-T</literal>. Refer to OPTIONS.
</para>
<section id="proxy-config-client-timeout">
<title>client-timeout</title>
<para>
The element <literal>client-timeout</literal> is a child of element
<literal>target</literal> and specifies the time in seconds before
a client session is shut down.
</para>
<para>
This can also be specified on the command line by using option
<literal>-i</literal>. Refer to OPTIONS.
</para>
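<para>
A minimal sketch that sets both timeouts for one target could look as
follows. The target name, the host name and the values 600 and 180 are
illustrative assumptions, not recommended defaults.
</para>
<screen><![CDATA[
<target name="server1">
 <url>backend.example.org:210</url>         <!-- hypothetical backend -->
 <target-timeout>600</target-timeout>       <!-- seconds, example value -->
 <client-timeout>180</client-timeout>       <!-- seconds, example value -->
</target>
]]></screen>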
<section id="proxy-config-keepalive">
<title>keepalive</title>
<para>The <literal>keepalive</literal> element holds information about
the keepalive Z39.50 sessions. Keepalive sessions are proxy-to-backend
sessions that are no longer associated with a client session.
</para>
<para>The <literal>keepalive</literal> element, which is a child of
the <literal>target</literal> element, holds two elements:
<literal>bandwidth</literal> and <literal>pdu</literal>.
The <literal>bandwidth</literal> is the maximum total number of bytes
transferred to/from the target. If a target session exceeds this
limit, it is shut down (and no longer kept alive).
The <literal>pdu</literal> is the maximum number of requests sent
to the target. If a target session exceeds this limit, it is
shut down. The idea of these two limits is to avoid very long
sessions that use up resources in a backend (for example, one that leaks).
</para>
<para>
The following sets the maximum number of bytes transferred in a
target session to 1 MB and the maximum number of requests to 400.
</para>
<screen><![CDATA[
<keepalive>
 <bandwidth>1048576</bandwidth>
 <pdu>400</pdu>
</keepalive>
]]></screen>
<section id="proxy-config-limit">
<title>limit</title>
<para>
The <literal>limit</literal> element specifies bandwidth and PDU (request)
limits for an active session.
The proxy records the bandwidth used and the number of requests during
the last 60 seconds. The <literal>limit</literal> element may include the
elements <literal>bandwidth</literal>, <literal>pdu</literal>,
and <literal>retrieve</literal>. The <literal>bandwidth</literal>
limits the number of bytes transferred within the last minute.
The <literal>pdu</literal> limits the number of requests in the last
minute. The <literal>retrieve</literal> holds the maximum number of
records to be retrieved in one Present Request.
</para>
<para>
If a bandwidth or PDU limit is reached, the proxy postpones the
requests to the target and waits one or more seconds. The idea of the
limit is to ensure that clients that download hundreds or thousands of
records do not hurt other users.
</para>
<para>
The following sets the maximum number of bytes transferred per minute to
512 KB (524288 bytes) and the maximum number of requests per minute to 40.
</para>
<screen><![CDATA[
<limit>
 <bandwidth>524288</bandwidth>
 <pdu>40</pdu>
</limit>
]]></screen>
<para>
Typically the limits for keepalive are much higher than
the per-minute limits for an active session.
</para>
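<para>
To show how the two kinds of limit relate, the sketch below places a
<literal>keepalive</literal> and a <literal>limit</literal> element in
the same target. The target name and the <literal>retrieve</literal>
value of 50 are assumptions made for illustration only; the other
values repeat the ones used in the examples for each element.
</para>
<screen><![CDATA[
<target name="server1">
 <keepalive>
  <bandwidth>1048576</bandwidth>  <!-- whole-session cap, in bytes -->
  <pdu>400</pdu>                  <!-- whole-session cap, in requests -->
 </keepalive>
 <limit>
  <bandwidth>524288</bandwidth>   <!-- bytes per minute -->
  <pdu>40</pdu>                   <!-- requests per minute -->
  <retrieve>50</retrieve>         <!-- records per Present Request, example value -->
 </limit>
</target>
]]></screen>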
<section id="proxy-config-attribute">
<title>attribute</title>
<para>
The <literal>attribute</literal> element specifies whether to accept or
reject a particular attribute type, value pair.
Well-behaved targets will reject unsupported attributes on their
own. This feature is useful for targets that do not gracefully
handle unsupported attributes.
</para>
<para>
Attribute elements may be repeated. The proxy inspects the attribute
specifications in the order in which they appear in the configuration file.
When a given attribute specification matches a given attribute list
in a query, the proxy takes the appropriate action (reject or accept).
</para>
<para>
If no attribute specification matches the attribute list in a query,
the query is accepted and passed to the backend target.
</para>
<para>
The <literal>attribute</literal> element has two required attributes:
<literal>type</literal>, which is the Attribute Type-1 type, and
<literal>value</literal>, which is the Attribute Type-1 value.
The special type/value <literal>*</literal> matches any attribute
type/value. A value may also be specified as a comma-separated
list of values, or as a range written as
low value, dash, high value.
</para>
<para>
If attribute <literal>error</literal> is given, it holds a
Bib-1 diagnostic which is sent to the client if the particular
type, value pair is part of a query.
</para>
<para>
If attribute <literal>error</literal> is not given, the attribute
type, value pair is accepted and passed to the backend target.
</para>
<para>
A target that supports use attributes 1, 4, and 1000 through 1003, and
no other use attributes, could use the following rules:
</para>
<screen><![CDATA[
<attribute type="1" value="1,4,1000-1003"/>
<attribute type="1" value="*" error="114"/>
]]></screen>
<section id="proxy-config-syntax">
<title>syntax</title>
<para>
The <literal>syntax</literal> element specifies whether to accept or
reject a particular record syntax request from the client.
</para>
<para>
The <literal>syntax</literal> element has one required attribute:
<literal>type</literal>, which is the Preferred Record Syntax.
</para>
<para>
If attribute <literal>error</literal> is given, it holds a
Bib-1 diagnostic which is sent to the client if the particular
record syntax is part of a Present or Search Request.
</para>
<para>
If attribute <literal>error</literal> is not given, the record syntax
is accepted and passed to the backend target.
</para>
<para>
If attribute <literal>marcxml</literal> is given, the proxy will
perform MARC21 to MARCXML conversion. In this case the
<literal>type</literal> should be XML. The proxy will use
preferred record syntax USMARC/MARC21 against the backend target.
</para>
<para>
If attribute <literal>stylesheet</literal> is given, the proxy
will convert XML records from the server via XSLT. It is important
that the content from the server is XML. If used in conjunction with
attribute <literal>marcxml</literal>, the MARC to MARCXML conversion
takes place before the XSLT conversion.
</para>
<para>
If attribute <literal>identifier</literal> is given, it is the
SRW/SRU record schema identifier for the resulting output record (after
MARCXML and/or XSLT conversion).
</para>
<para>
If sub element <literal>title</literal> is given (as a child element
of <literal>syntax</literal>), then it is the official SRW/SRU
name of the resulting record schema.
</para>
<para>
If sub element <literal>name</literal> is given, it is an alias
for the record schema identifier. Multiple <literal>name</literal>s
may be given.
</para>
<example>
<title>MARCXML conversion</title>
<para>To accept USMARC and offer MARCXML plus Dublin Core (via
XSLT conversion), the following configuration could be used:
</para>
<screen><![CDATA[
<target name="mytarget">
 <syntax type="usmarc"/>
 <syntax type="xml" marcxml="1"
   identifier="info:srw/schema/1/marcxml-v1.1">
  <title>MARCXML</title>
  <name>marcxml</name>
 </syntax>
 <syntax type="xml" marcxml="1" stylesheet="MARC21slim2SRWDC.xsl"
   identifier="info:srw/schema/1/dc-v1.1">
  <title>Dublin Core</title>
 </syntax>
 <syntax type="*" error="238"/>
</target>
]]></screen>
</example>
<section id="proxy-config-explain">
<title>explain</title>
<para>
The <literal>explain</literal> element holds Explain information
for SRW/SRU about the server in the target section. This
information must include a <literal>serverInfo</literal> element
with the database (URL path) under which this target is made available.
</para>
<screen><![CDATA[
<explain xmlns="http://explain.z3950.org/dtd/2.0/">
 <serverInfo>
  <host>myhost.org</host>
  <port>8000</port>
  <database>mydatabase</database>
 </serverInfo>
 <!-- remaining Explain stuff -->
</explain>
]]></screen>
<para>
In the above case, the SRW/SRU service is available as
<literal>http://myhost.org:8000/mydatabase</literal>.
</para>
<section id="proxy-config-cql2rpn">
<title>cql2rpn</title>
<para>
The CDATA of <literal>cql2rpn</literal> refers to a CQL to RPN conversion
file for the server in the target section. This element
is required for SRW/SRU searches to operate against a Z39.50
server that does not support CQL. Most Z39.50 servers only support
Type-1/RPN, so this is usually required.
See the YAZ documentation for more information about the
<ulink url="http://indexdata.dk/yaz/doc/tools.tkl#tools.cql.pqf">CQL
to PQF</ulink> conversion. See also the file
<filename>pqf.properties</filename> in the <filename>etc</filename>
(or <replaceable>prefix/share/yazproxy</replaceable>)
directory of the YAZ proxy.
</para>
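<para>
As a small sketch, a target that points at the distributed
<filename>pqf.properties</filename> file might be written like this.
The target name and host name are hypothetical, and the relative file
name assumes the proxy is started from the directory that holds the
file; adjust the path to match your installation.
</para>
<screen><![CDATA[
<target name="server1">
 <url>backend.example.org:210</url>   <!-- hypothetical backend -->
 <cql2rpn>pqf.properties</cql2rpn>    <!-- path is an assumption -->
</target>
]]></screen>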
<section id="proxy-config-preinit">
<title>preinit</title>
The element <literal>preinit</literal> is a child of element
<literal>target</literal> and specifies the number of spare
connections to a target. By default the proxy creates no spare
connections. If the proxy uses a target exclusively or
heavily, preinit sessions ensure that target sessions
have been established before a client connects, which greatly
reduces the cost of the connect-init handshake. Never set this to
<section id="proxy-config-max-clients">
<title>max-clients</title>
<para>
The element <literal>max-clients</literal> is a child of element
<literal>proxy</literal> and specifies the total number of
allowed connections to targets (all targets). If this limit
is reached, the proxy closes the least recently used connection.
</para>
<para>
Note that many Unix systems impose a limit on the number of
open files allowed in a single process, typically in the
range 256 (Solaris) to 1024 (Linux).
The proxy uses 2 sockets per session plus a few files
for logging. As a rule of thumb, ensure that 2*max-clients + 5
files can be opened by the proxy process.
</para>
<para>
Using the <ulink url="http://www.gnu.org/software/bash/bash.html">bash</ulink>
shell, you can set the limit with
<literal>ulimit -n <replaceable>no</replaceable></literal>.
Use <literal>ulimit -a</literal> to display limits.
</para>
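<para>
To make the rule of thumb concrete: with an open-file limit of 1024,
a <literal>max-clients</literal> value of 500 needs about
2*500 + 5 = 1005 descriptors, which just fits. The value 500 is only an
example chosen to match that limit, not a recommendation.
</para>
<screen><![CDATA[
<proxy xmlns="http://indexdata.dk/yazproxy/schema/0.8/">
 <!-- 2*500 + 5 = 1005 descriptors, below a limit of 1024 -->
 <max-clients>500</max-clients>
</proxy>
]]></screen>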
<section id="proxy-config-log">
<title>log</title>
<para>
The element <literal>log</literal> is a child of element
<literal>proxy</literal> and specifies what is to be logged by the
proxy.
</para>
<para>
Specify the log file with command-line option <literal>-l</literal>.
</para>
<para>
The text of the <literal>log</literal> element is a sequence of
options separated by white space. See the table below:
<table frame="top"><title>Logging options</title>
<tgroup cols="2">
<colspec colwidth="1*"/>
<colspec colwidth="2*"/>
<thead>
<row>
<entry>Option</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>client-apdu</literal></entry>
<entry>
Log APDUs as reported by YAZ for the
communication between the client and the proxy.
This facility is equivalent to the APDU logging that
happens when using option <literal>-a</literal>, except that
the proxy logs to the same file as given
by <literal>-l</literal>.
</entry>
</row>
<row>
<entry><literal>server-apdu</literal></entry>
<entry>
Log APDUs as reported by YAZ for the
communication between the proxy and the server (backend).
</entry>
</row>
<row>
<entry><literal>client-requests</literal></entry>
<entry>
Log a brief description of requests transferred between
the client and the proxy. The name of the request and the size
of the APDU are logged.
</entry>
</row>
<row>
<entry><literal>server-requests</literal></entry>
<entry>
Log a brief description of requests transferred between
the proxy and the server (backend). The name of the request
and the size of the APDU are logged.
</entry>
</row>
</tbody>
</tgroup>
</table>
</para>
<para>
To log communication between the proxy and the backend in detail, the
following configuration could be used:
</para>
<screen><![CDATA[
<target name="mytarget">
 <log>server-apdu server-requests</log>
</target>
]]></screen>
<section id="query-cache">
<title>Query Caching</title>
<para>
Simple stateless clients often send identical Z39.50 searches
within a relatively short period of time (e.g. in order to produce a
results-list page, the next page,
a single full record, etc.). For many targets, it is
much more expensive to produce a new result set than to
reuse an existing one.
</para>
<para>
The proxy addresses this by remembering the last query for each
backend target, so that if an identical query is received next, it
is turned into Present Requests rather than new Search Requests.
</para>
<para>
A future release will probably allow for
an arbitrary-sized cache for targets supporting named result sets.
</para>
<para>
You can enable or disable query caching using option <literal>-o</literal>.
</para>
<section id="record-cache">
<title>Record Caching</title>
<para>
As an option, the proxy may also cache result set records for the
last query.
The proxy takes into account the Record Syntax and CompSpec.
The CompSpec includes simple element set names as well.
By default the cache is 200000 bytes per session.
</para>
<section id="query-validation">
<title>Query Validation</title>
<para>
The proxy may also be configured to trap particular attributes in
Type-1 queries and send Bib-1 diagnostics back to the client without
even consulting the backend target. This facility may be useful if
a target does not properly issue diagnostics when unsupported attributes
are used.
</para>
<section id="record-validation">
<title>Record Syntax Validation</title>
<para>
The proxy may be configured to accept, reject or convert records.
When a record syntax is accepted, the proxy passes search/present
requests to the backend target under the assumption that the target can
honor the request (in fact it may not). When a record syntax is rejected
as unsupported, the proxy returns a diagnostic to the
client. Finally, the proxy may convert records.
</para>
<para>
The proxy can convert from MARC to MARCXML and thereby offer an
XML version of any MARC record, as long as it is ISO2709 encoded.
If the proxy is compiled with libXSLT support, it can also
convert XML records via XSLT.
</para>
610 <title>Other Optimizations</title>
612 We've had some plans to support global caching of result set records,
613 but this has not yet been implemented.
617 <section id="proxy-usage">
618 <title>Proxy Usage (man page)</title>
619 <refentry id="yazproxy-man">
<section id="otherinfo-encoding">
<title>OtherInformation Encoding</title>
<para>
The proxy uses the OtherInformation definition to carry
information about the target address and cookie.
</para>
<screen>
  OtherInformation ::= [201] IMPLICIT SEQUENCE OF SEQUENCE{
    category                  [1] IMPLICIT InfoCategory OPTIONAL,
    information               CHOICE{
      characterInfo           [2] IMPLICIT InternationalString,
      binaryInfo              [3] IMPLICIT OCTET STRING,
      externallyDefinedInfo   [4] IMPLICIT EXTERNAL,
      oid                     [5] IMPLICIT OBJECT IDENTIFIER}}

  InfoCategory ::= SEQUENCE{
    categoryTypeId            [1] IMPLICIT OBJECT IDENTIFIER OPTIONAL,
    categoryValue             [2] IMPLICIT INTEGER}
</screen>
<para>
The <literal>categoryTypeId</literal> is either
OID 1.2.840.10003.10.1000.81.1 or 1.2.840.10003.10.1000.81.2,
for proxy target and proxy cookie respectively. The
integer element <literal>categoryValue</literal> is set to 0.
The target address or cookie value is stored in element
<literal>characterInfo</literal> of the <literal>information</literal>
choice.
</para>
<!-- Keep this comment at the end of the file
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-parent-document: "yazproxy.xml"
sgml-local-catalogs: nil
sgml-namecase-general:t
-->