# $Id: ZOOM.pod,v 1.9 2005-11-17 15:31:06 mike Exp $

=head1 NAME

ZOOM - Perl extension implementing the ZOOM API for Information Retrieval

=head1 SYNOPSIS

 use ZOOM;
 eval {
     $conn = new ZOOM::Connection($host, $port);
     $conn->option(preferredRecordSyntax => "usmarc");
     $rs = $conn->search_pqf('@attr 1=4 dinosaur');
     print $rs->record(0)->render();
 };
 if ($@) {
     print "Error ", $@->code(), ": ", $@->message(), "\n";
 }

=head1 DESCRIPTION

This module provides a nice, Perlish implementation of the ZOOM
Abstract API described and documented at http://zoom.z3950.org/api/

The ZOOM module is implemented as a set of thin classes on top of the
non-OO functions provided by this distribution's C<Net::Z3950::ZOOM>
module, which in turn is a thin layer on top of the ZOOM-C code
supplied as part of Index Data's YAZ Toolkit. Because ZOOM-C is also
the underlying code that implements ZOOM bindings in C++, Visual
Basic, Scheme, Ruby, .NET (including C#) and other languages, this
Perl module works compatibly with those other implementations. (Of
course, the point of a public API such as ZOOM is that all
implementations should be compatible anyway; but knowing that the same
code is running is reassuring.)
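
The ZOOM module provides two enumerations (C<ZOOM::Error> and
C<ZOOM::Event>), a single utility function C<diag_str()> in the C<ZOOM>
package itself, and eight classes:
C<ZOOM::Connection>,
C<ZOOM::ResultSet>,
C<ZOOM::Record>,
C<ZOOM::Exception>,
C<ZOOM::ScanSet>,
C<ZOOM::Package>,
C<ZOOM::Query>
and
C<ZOOM::Options>.

Of these, the Query class is abstract, and has two concrete
subclasses, C<ZOOM::Query::CQL> and C<ZOOM::Query::PQF>.

Many useful ZOOM applications can be built using only the Connection,
ResultSet, Record and Exception classes, as in the example above.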

A typical application will begin by creating a Connection object,
then using that to execute searches that yield ResultSet objects, then
fetching records from the result-sets to yield Record objects. If an
error occurs, an Exception object is thrown and can be dealt with.

More sophisticated applications might also browse the server's indexes
to create a ScanSet, from which indexed terms may be retrieved; others
might send ``Extended Services'' Packages to the server, to achieve
non-standard tasks such as database creation and record update.
Searching using a query syntax other than PQF can be done using a
query object of one of the Query subclasses. Finally, sets of options
may be manipulated independently of the objects they are associated
with using an Options object.

In general, method calls throw an exception if anything goes wrong, so
you don't need to test for success after each call. See the section
below on the Exception class for details.

=head1 UTILITY FUNCTION

=head2 ZOOM::diag_str()

 $msg = ZOOM::diag_str(ZOOM::Error::INVALID_QUERY);

Returns a human-readable English-language string corresponding to the
error code passed as its only parameter. This works for any error-code
returned from
C<ZOOM::Exception::code()>,
C<ZOOM::Connection::error_x()>
or
C<ZOOM::Connection::errcode()>,
irrespective of whether it is a member of the C<ZOOM::Error>
enumeration or drawn from the BIB-1 diagnostic set.

=head1 CLASSES

The eight ZOOM classes are described here in ``sensible order'':
first, the four commonly used classes, in the order that they will
tend to be used in most programs (Connection, ResultSet, Record,
Exception); then the four more esoteric classes in descending order of
how often they are needed.

With the exception of the Options class, which is an extension to the
ZOOM model, the introduction to each class includes a link to the
relevant section of the ZOOM Abstract API.

=head2 ZOOM::Connection

 $conn = new ZOOM::Connection("indexdata.dk:210/gils");
 print("server is '", $conn->option("serverImplementationName"), "'\n");
 $conn->option(preferredRecordSyntax => "usmarc");
 $rs = $conn->search_pqf('@attr 1=4 mineral');
 $ss = $conn->scan('@attr 1=1003 a');
 if ($conn->errcode() != 0) {
     die("something went wrong: " . $conn->errmsg());
 }

This class represents a connection to an information retrieval server,
using an IR protocol such as ANSI/NISO Z39.50, SRW (the
Search/Retrieve Webservice), SRU (the Search/Retrieve URL) or
OpenSearch. Not all of these protocols require a low-level connection
to be maintained, but the Connection object nevertheless provides a
location for the necessary cache of configuration and state
information, as well as a uniform API to the connection-oriented
facilities (searching, index browsing, etc.) provided by these
protocols.

See the description of the C<Connection> class in the ZOOM Abstract
API at
http://zoom.z3950.org/api/zoom-current.html#3.2

=head4 new()

 $conn = new ZOOM::Connection("indexdata.dk", 210);
 $conn = new ZOOM::Connection("indexdata.dk:210/gils");
 $conn = new ZOOM::Connection("tcp:indexdata.dk:210/gils");
 $conn = new ZOOM::Connection("http:indexdata.dk:210/gils");

Creates a new Connection object, and immediately connects it to the
specified server. If you want to make a new Connection object but
delay forging the connection, use the C<create()> and C<connect()>
methods.

This constructor can be called with two arguments or a single
argument. In the former case, the arguments are the name and port
number of the Z39.50 server to connect to; in the latter case, the
single argument is a YAZ service-specifier string of the form

[I<scheme>:]I<host>[:I<port>][/I<databaseName>]

Here, the I<host> and I<port> parts are as in the two-argument
form; the I<databaseName>, if provided, specifies the name of the
database to be used in subsequent searches on this connection; and the
optional I<scheme> (default C<tcp>) indicates what protocol should be
used. At present, the following schemes are supported:

=over 4

=item tcp

Z39.50 connection.

=item ssl

Z39.50 connection encrypted using SSL (Secure Sockets Layer). Not
many servers support this, but Index Data's Zebra is one that does.

=item unix

Z39.50 connection on a Unix-domain (local) socket, in which case the
I<host> portion of the string is instead used as a filename in the
filesystem.

=item http

SRW connection using SOAP over HTTP.

=back

Support for SRU will follow in the fullness of time.
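
To make the service-specifier syntax concrete, here is a minimal sketch that splits such a string into its parts. It is illustration only: C<parse_service_specifier()> is a hypothetical helper, not part of ZOOM or YAZ, and the real parsing done inside YAZ may differ in edge cases. In particular, treating the whole remainder of a C<unix> specifier as the filename is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of ZOOM: split a service specifier of the
# form [scheme:]host[:port][/databaseName] into a hash of its parts.
my %KNOWN_SCHEME = map { $_ => 1 } qw(tcp ssl unix http);

sub parse_service_specifier {
    my ($spec) = @_;
    my %parts = (scheme => "tcp");    # tcp is the documented default

    # A leading "word:" is a scheme only if it names a known scheme;
    # otherwise "localhost:210" would be misread as scheme "localhost".
    if ($spec =~ /^([a-z]+):(.*)$/s && $KNOWN_SCHEME{$1}) {
        ($parts{scheme}, $spec) = ($1, $2);
    }

    # Assumption: for unix sockets everything that remains is the filename.
    if ($parts{scheme} eq "unix") {
        $parts{host} = $spec;
        return \%parts;
    }

    my ($hostport, $db) = split m{/}, $spec, 2;
    my ($host, $port)   = split /:/, $hostport, 2;
    $parts{host}         = $host;
    $parts{port}         = $port if defined $port;
    $parts{databaseName} = $db   if defined $db;
    return \%parts;
}

for my $spec ("indexdata.dk:210/gils", "http:indexdata.dk:210/gils", "unix:/tmp/zoom.sock") {
    my $p = parse_service_specifier($spec);
    print join(", ", map { "$_=$p->{$_}" } sort keys %$p), "\n";
}
```

The known-scheme check is the one design choice worth noting: without it, a bare C<host:port> pair is ambiguous with C<scheme:host>.
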

If an error occurs, an exception is thrown. This may indicate a
networking problem (e.g. the host is not found or unreachable), or a
protocol-level problem (e.g. a Z39.50 server rejected the Init
request).

=head4 create() / connect()

 $options = new ZOOM::Options();
 $options->option(implementationName => "my client");
 $conn = create ZOOM::Connection($options);
 $conn->connect($host, 0);

The usual Connection constructor, C<new()>, brings a new object into
existence and forges the connection to the server all in one
operation, which is often what you want. For applications that need
more control, however, these two methods separate the two steps,
allowing additional steps in between such as the setting of options.

C<create()> creates and returns a new Connection object, which is
I<not> connected to any server. It may be passed an options block, of
type C<ZOOM::Options> (see below), into which options may be set
before or after the creation of the Connection. The connection to the
server may then be forged by the C<connect()> method, the arguments of
which are the same as those of the C<new()> constructor.

=head4 error_x() / errcode() / errmsg() / addinfo() / diagset()

 ($errcode, $errmsg, $addinfo, $diagset) = $conn->error_x();
 $errcode = $conn->errcode();
 $errmsg = $conn->errmsg();
 $addinfo = $conn->addinfo();
 $diagset = $conn->diagset();

These methods may be used to obtain information about the last error
to have occurred on a connection - although typically they will not
be used, as the same information is available through the
C<ZOOM::Exception> that is thrown when the error occurs. The
C<errcode()>,
C<errmsg()>,
C<addinfo()>
and
C<diagset()>
methods each return one element of the diagnostic, and
C<error_x()>
returns all four at once.

See the C<ZOOM::Exception> class for the interpretation of these elements.

=head4 option() / option_binary()

 print("server is '", $conn->option("serverImplementationName"), "'\n");
 $conn->option(preferredRecordSyntax => "usmarc");
 $conn->option_binary(iconBlob => "foo\0bar");
 die if length($conn->option_binary("iconBlob")) != 7;

Objects of the Connection, ResultSet, ScanSet and Package classes
carry with them a set of named options which affect their behaviour in
certain ways. See the ZOOM-C options documentation for details:

=over 4

=item *

Connection options are listed at
http://indexdata.com/yaz/doc/zoom.tkl#zoom.connections

=item *

ResultSet options are listed at
http://indexdata.com/yaz/doc/zoom.resultsets.tkl
I<### move this observation down to the appropriate place>

=item *

ScanSet options are listed at
http://indexdata.com/yaz/doc/zoom.scan.tkl
I<### move this observation down to the appropriate place>

=item *

Package options are listed at
http://indexdata.com/yaz/doc/zoom.ext.html
I<### move this observation down to the appropriate place>

=back

These options are set and fetched using the C<option()> method, which
may be called with either one or two arguments. In the two-argument
form, the option named by the first argument is set to the value of
the second argument, and its old value is returned. In the
one-argument form, the value of the specified option is returned.

For historical reasons, option values are not binary-clean, so that a
value containing a NUL byte will be returned in truncated form. The
C<option_binary()> method behaves identically to C<option()> except
that it is binary-clean, so that values containing NUL bytes are set
and returned correctly.
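
The difference between the two methods can be pictured without a server. The sketch below only models the truncation described above: C<c_string_view()> is a hypothetical function that mimics how a NUL-terminated string layer would see a Perl value. It does not call ZOOM, and it is not a claim about how the module is implemented internally.

```perl
use strict;
use warnings;

# Hypothetical model of a non-binary-clean layer: a NUL-terminated string
# ends at the first NUL byte, so anything after it is lost.
sub c_string_view {
    my ($bytes) = @_;
    my $nul = index($bytes, "\0");
    return $nul < 0 ? $bytes : substr($bytes, 0, $nul);
}

my $value = "foo\0bar";

# Perl strings carry their own length, so the NUL is just another byte.
print "binary-clean length: ", length($value), "\n";

# Seen through the NUL-terminated view, only "foo" survives.
print "truncated length:    ", length(c_string_view($value)), "\n";
```

This is why the example at the top of this section expects a length of 7 from C<option_binary()>: three bytes, the NUL, and three more bytes.
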

=head4 search() / search_pqf()

 $rs = $conn->search(new ZOOM::Query::CQL('title=dinosaur'));
 # The next two lines are equivalent
 $rs = $conn->search(new ZOOM::Query::PQF('@attr 1=4 dinosaur'));
 $rs = $conn->search_pqf('@attr 1=4 dinosaur');

The principal purpose of a search-and-retrieve protocol is searching
(and, er, retrieval), so the principal method used on a Connection
object is C<search()>. It accepts a single argument, a C<ZOOM::Query>
object (or, more precisely, an object of a subclass of this class);
and it creates and returns a new ResultSet object representing the set
of records resulting from the search.

Since queries using PQF (Prefix Query Format) are so common, we make
them a special case by providing a C<search_pqf()> method. This is
identical to C<search()> except that it accepts a string containing
the query rather than an object, thereby obviating the need to create
a C<ZOOM::Query::PQF> object. See the documentation of that class for
information about PQF.

=head4 scan()

Many Z39.50 servers allow you to browse their indexes to find terms to
search for. This is done using the C<scan()> method, which creates and
returns a new ScanSet object representing the set of terms resulting
from the scan.

C<scan()> takes a single argument, but it has to work hard: it
specifies both what index to scan for terms, and where in the index to
start scanning. What's more, the specification of what index to scan
includes multiple facets, such as what database fields it's an index
of (author, subject, title, etc.) and whether to scan for whole fields
or single words (e.g. the title ``I<The Empire Strikes Back>'', or the
four words ``Back'', ``Empire'', ``Strikes'' and ``The'', interleaved
with words from other titles in the same index).

All of this is done by using a single term from the PQF query as the
C<scan()> argument. (At present, only PQF is supported, although
there is no reason in principle why CQL and other query syntaxes
should not be supported in future.) The attributes associated with
the term indicate which index is to be used, and the term itself
indicates the point in the index at which to start the scan. For
example, if the argument is C<@attr 1=4 fish>, then

=over 4

=item @attr 1=4

This is the BIB-1 attribute with type 1 (meaning access-point, which
specifies an index), and value 4 (which means ``title''). So the scan
is in the title index.

=item fish

Start the scan from the lexicographically earliest term that is equal
to or falls after ``fish''.

=back

The argument C<@attr 1=4 @attr 6=3 fish> would behave similarly; but
the BIB-1 attribute 6=3 means completeness=``complete field'', so the
scan would be for complete titles rather than for words occurring in
titles.

This takes a bit of getting used to.
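
One way to get used to it is to pull a scan argument apart by hand. The sketch below is illustration only: C<split_scan_argument()> is a hypothetical helper, not part of ZOOM, and it handles just the simple C<@attr type=value> form used in the examples above (no attribute-set names, no quoted terms). The meanings in the lookup table are the two given in the text; a real BIB-1 table is much larger.

```perl
use strict;
use warnings;

# Meanings taken from the text above; everything else is left unexplained.
my %MEANING = (
    "1=4" => "access point: title",
    "6=3" => "completeness: complete field",
);

# Hypothetical helper (not part of ZOOM): peel the leading
# "@attr type=value" pairs off a scan argument. What is left over is the
# term that marks where in the index the scan starts.
sub split_scan_argument {
    my ($arg) = @_;
    my @attrs;
    while ($arg =~ s/^\s*\@attr\s+(\d+)=(\S+)\s+//) {
        push @attrs, "$1=$2";
    }
    return { attrs => \@attrs, term => $arg };
}

for my $arg ('@attr 1=4 fish', '@attr 1=4 @attr 6=3 fish') {
    my $parsed = split_scan_argument($arg);
    print "start scanning at '$parsed->{term}'\n";
    for my $attr (@{ $parsed->{attrs} }) {
        my $meaning = exists $MEANING{$attr} ? $MEANING{$attr} : "not covered here";
        print "  $attr ($meaning)\n";
    }
}
```
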

I<###> discuss how the values of options affect scanning.

=head4 package()

 $p = $conn->package();
 $o = new ZOOM::Options();
 $o->option(databaseName => "newdb");
 $p = $conn->package($o);

Creates and returns a new C<ZOOM::Package>, to be used in invoking an
Extended Service. An options block may optionally be passed in. See
the C<ZOOM::Package> documentation.

=head4 destroy()

Destroys a Connection object, tearing down any low-level connection
associated with it and freeing its resources. It is an error to reuse
a Connection that has been C<destroy()>ed.

=head2 ZOOM::ResultSet

=head2 ZOOM::Exception

In general, method calls throw an exception (of class
C<ZOOM::Exception>) if anything goes wrong, so you don't need to test
for success after each call. Exceptions are caught by enclosing the
main code in an C<eval{}> block and checking C<$@> on exit from that
block, as in the code-sample above.
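
The C<eval{}> pattern can be tried out without a server. The sketch below is a hedged illustration: C<Demo::Exception> is a hypothetical stand-in that models only the C<code()> and C<message()> methods used in the synopsis, and C<run_guarded()> is an invented helper name. The point it makes is that C<$@> may hold a plain string from some other C<die>, so it is worth checking that it is an object before calling methods on it.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical stand-in for ZOOM::Exception, so the sketch runs without a
# server. Only code() and message(), as used in the synopsis, are modelled.
{
    package Demo::Exception;
    sub new {
        my ($class, $code, $message) = @_;
        return bless { code => $code, message => $message }, $class;
    }
    sub code    { return $_[0]{code} }
    sub message { return $_[0]{message} }
}

# Run a block of work inside eval{} and turn any failure into a report line.
sub run_guarded {
    my ($work) = @_;
    my $ok = eval { $work->(); 1 };
    return "ok" if $ok;

    my $err = $@;    # copy at once: a later eval would overwrite $@
    if (blessed($err) && $err->can("code") && $err->can("message")) {
        return "Error " . $err->code() . ": " . $err->message();
    }

    # Anything else (a plain die string, say) is not an exception object.
    chomp(my $text = "$err");
    return "Unexpected failure: $text";
}

print run_guarded(sub { die Demo::Exception->new(1, "example failure") }), "\n";
print run_guarded(sub { die "plain string\n" }), "\n";
print run_guarded(sub { return 1 }), "\n";
```

In a real program the code reference would contain the ZOOM calls, and the object test could be tightened to check for C<ZOOM::Exception> specifically.
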

There are a small number of exceptions to this rule: the three
record-fetching methods in the C<ZOOM::ResultSet> class, which include
C<record()>
and
C<record_immediate()>,
can all return undefined values for legitimate reasons, under
circumstances that do not merit throwing an exception. For this
reason, the return values of these methods should be checked. See the
individual methods' documentation for details.

=head1 ENUMERATIONS

The ZOOM module provides two enumerations that list possible return
values from particular functions. They are described in the following
sections.

=head2 ZOOM::Error

 if ($@->code() == ZOOM::Error::QUERY_PQF) {
     return "your query was not accepted";
 }

This class provides a set of manifest constants representing some of
the possible error codes that can be raised by the ZOOM module. The
methods that return error-codes are
C<ZOOM::Exception::code()>,
C<ZOOM::Connection::error_x()>
and
C<ZOOM::Connection::errcode()>.

The C<ZOOM::Error> class provides constants including
C<UNSUPPORTED_PROTOCOL>,
C<UNSUPPORTED_QUERY>,
C<INVALID_QUERY>
and
C<QUERY_PQF>,
each of which specifies a client-side error. Since errors may also be
diagnosed by the server, and returned to the client, error codes may
also take values from the BIB-1 diagnostic set of Z39.50, listed at
the Z39.50 Maintenance Agency's web-site at
http://www.loc.gov/z3950/agency/defns/bib1diag.html

All error-codes, whether client-side from the C<ZOOM::Error>
enumeration or server-side from the BIB-1 diagnostic set, can be
translated into human-readable messages by passing them to the
C<ZOOM::diag_str()> utility function.

=head2 ZOOM::Event

 if ($conn->last_event() == ZOOM::Event::CONNECT) {
     print "Connected!\n";
 }

In applications that need it - mostly complex multiplexing
applications - the C<ZOOM::Connection::last_event()> method is used to
return an indication of the last event that occurred on a particular
connection. It always returns a value drawn from this enumeration,
whose values include C<NONE>, C<CONNECT>, C<SEND_DATA>, C<RECV_DATA>,
C<TIMEOUT>, C<UNKNOWN>, C<SEND_APDU>, C<RECV_APDU> and C<RECV_RECORD>.

You almost certainly don't need to know about this. Frankly, I'm not
sure how to use it myself.

=head1 SEE ALSO

The ZOOM abstract API,
http://zoom.z3950.org/api/zoom-current.html

The C<Net::Z3950::ZOOM> module, included in the same distribution as this one.

The C<Net::Z3950> module, which this one supersedes.
http://perl.z3950.org/

The documentation for the ZOOM-C module of the YAZ Toolkit, which this
module is built on. Specifically, its lists of options are useful.
http://indexdata.com/yaz/doc/zoom.tkl

The BIB-1 diagnostic set of Z39.50,
http://www.loc.gov/z3950/agency/defns/bib1diag.html

=head1 AUTHOR

Mike Taylor, E<lt>mike@indexdata.comE<gt>

=head1 COPYRIGHT AND LICENCE

Copyright (C) 2005 by Index Data.

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself, either Perl version 5.8.4 or,
at your option, any later version of Perl 5 you may have available.