- <bookinfo>
- <title>Pazpar2 - User's Guide and Reference</title>
- <author>
- <firstname>Sebastian</firstname><surname>Hammer</surname>
- </author>
- <copyright>
- <year>&copyright-year;</year>
- <holder>Index Data</holder>
- </copyright>
- <abstract>
- <simpara>
- Pazpar2 is a high-performance, user interface-independent, data
- model-independent metasearching
- middleware featuring merging, relevance ranking, record sorting,
- and faceted results.
- </simpara>
- <simpara>
- This document is a guide and reference to Pazpar version &version;.
- </simpara>
- <simpara>
- <inlinemediaobject>
- <imageobject>
- <imagedata fileref="common/id.png" format="PNG"/>
- </imageobject>
- <imageobject>
- <imagedata fileref="common/id.eps" format="EPS"/>
- </imageobject>
- </inlinemediaobject>
- </simpara>
- </abstract>
- </bookinfo>
-
- <chapter id="introduction">
- <title>Introduction</title>
- <para>
- Pazpar2 is a stand-alone metasearch client with a webservice API, designed
- to be used either from a browser-based client (JavaScript, Flash, Java,
- etc.), from from server-side code, or any combination of the two.
- Pazpar2 is a highly optimized client designed to
- search many resources in parallel. It implements record merging,
- relevance-ranking and sorting by arbitrary data content, and facet
- analysis for browsing purposes. It is designed to be data model
- independent, and is capable of working with MARC, DublinCore, or any
- other XML-structured response format -- XSLT is used to normalize and extract
- data from retrieval records for display and analysis. It can be used
- against any server which supports the Z39.50 protocol. Proprietary
- backend modules can be used to support a large number of other protocols
- (please contact Index Data for further information about this).
- </para>
- <para>
- Additional functionality such as
- user management, attractive displays are expected to be implemented by
- applications that use pazpar2. Pazpar2 is user interface independent.
- Its functionality is exposed through a simple REST-style webservice API,
- designed to be simple to use from an Ajax-enbled browser, Flash
- animation, Java applet, etc., or from a higher-level server-side language
- like PHP or Java. Because session information can be shared between
- browser-based logic and your server-side scripting, there is tremendous
- flexibility in how you implement your business logic on top of pazpar2.
- </para>
- <para>
- Once you launch a search in pazpar2, the operation continues behind the
- scenes. Pazpar2 connects to servers, carries out searches, and
- retrieves, deduplicates, and stores results internally. Your application
- code may periodically inquire about the status of an ongoing operation,
- and ask to see records or other result set facets. Result become
- available immediately, and it is easy to build end-user interfaces which
- feel extremely responsive, even when searching more than 100 servers
- concurrently.
- </para>
- <para>
- Pazpar2 is designed to be highly configurable. Incoming records are
- normalized to XML/UTF-8, and then further normalized using XSLT to a
- simple internal representation that is suitable for analysis. By
- providing XSLT stylesheets for different kinds of result records, you
- can tune pazpar2 to work against different kinds of information
- retrieval servers. Finally, metadata is extracted, in a configurable
- way, from this internal record, to support display, merging, ranking,
- result set facets, and sorting. Pazpar2 is not bound to a specific model
- of metadata, such as DublinCore or MARC -- by providing the right
- configuration, it can work with a number of different kinds of data in
- support of many different applications.
- </para>
- <para>
- Pazpar2 is designed to be efficient and scalable. You can set it up to
- search several hundred targets in parallel, or you can use it to support
- hundreds of concurrent users. It is implemented with the same attention
- to performance and economy that we use in our indexing engines, so that
- you can focus on building your application, without worrying about the
- details of metasearch logic. You can devote all of your attention to
- usability and let pazpar2 do what it does best -- metasearch.
+ <bookinfo>
+ <title>Pazpar2 - User's Guide and Reference</title>
+ <author>
+ <firstname>Sebastian</firstname><surname>Hammer</surname>
+ </author>
+ <author>
+ <firstname>Adam</firstname><surname>Dickmeiss</surname>
+ </author>
+ <author>
+ <firstname>Marc</firstname><surname>Cromme</surname>
+ </author>
+ <author>
+ <firstname>Jakub</firstname><surname>Skoczen</surname>
+ </author>
+ <author>
+ <firstname>Mike</firstname><surname>Taylor</surname>
+ </author>
+ <releaseinfo>&version;</releaseinfo>
+ <copyright>
+ <year>&copyright-year;</year>
+ <holder>Index Data</holder>
+ </copyright>
+ <abstract>
+ <simpara>
+ Pazpar2 is a high-performance metasearch engine featuring
+ merging, relevance ranking, record sorting,
+ and faceted results.
+ It is middleware: it has no user interface of its own, but can be
+ configured and controlled by an XML-over-HTTP web-service to provide
+ metasearching functionality behind any user interface.
+ </simpara>
+ <simpara>
+ This document is a guide and reference to Pazpar2 version &version;.
+ </simpara>
+ <simpara>
+ <inlinemediaobject>
+ <imageobject>
+ <imagedata fileref="common/id.png" format="PNG"/>
+ </imageobject>
+ <imageobject>
+ <imagedata fileref="common/id.eps" format="EPS"/>
+ </imageobject>
+ </inlinemediaobject>
+ </simpara>
+ </abstract>
+ </bookinfo>
+
+ <chapter id="introduction">
+ <title>Introduction</title>
+
+ <section id="what.pazpar2.is">
+ <title>What Pazpar2 is</title>
+ <para>
+ Pazpar2 is a stand-alone metasearch engine with a web-service API, designed
+ to be used either from a browser-based client (JavaScript, Flash,
+ Java applet,
+ etc.), from server-side code, or any combination of the two.
+ Pazpar2 is a highly optimized client designed to
+ search many resources in parallel. It implements record merging,
+ relevance-ranking and sorting by arbitrary data content, and facet
+ analysis for browsing purposes. It is designed to be data-model
+ independent, and is capable of working with MARC, DublinCore, or any
+ other <ulink url="&url.xml;">XML</ulink>-structured response format
+ -- <ulink url="&url.xslt;">XSLT</ulink> is used to normalize and extract
+ data from retrieval records for display and analysis. It can be used
+ against any server which supports the
+ <ulink url="&url.z39.50;">Z39.50</ulink> or <ulink url="&url.sru;">SRU/SRW</ulink>
+ protocol. Proprietary
+ backend modules can function as connectors between these standard
+ protocols and any non-standard API, including web-site scraping, to
+ support a large number of other protocols.
+ </para>
+ <para>
+ Additional functionality such as
+ user management and attractive displays are expected to be implemented by
+ applications that use Pazpar2. Pazpar2 itself is user-interface independent.
+ Its functionality is exposed through a simple XML-based web-service API,
+ designed to be easy to use from an Ajax-enabled browser, Flash
+ animation, Java applet, etc., or from a higher-level server-side language
+ like PHP, Perl or Java. Because session information can be shared between
+ browser-based logic and server-side scripting, there is tremendous
+ flexibility in how you implement application-specific logic on top
+ of Pazpar2.
+ </para>
+ <para>
+ Once you launch a search in Pazpar2, the operation continues behind the
+ scenes. Pazpar2 connects to servers, carries out searches, and
+ retrieves, deduplicates, and stores results internally. Your application
+ code may periodically inquire about the status of an ongoing operation,
+ and ask to see records or result set facets. Results become
+ available immediately, and it is easy to build end-user interfaces that
+ feel extremely responsive, even when searching more than 100 servers
+ concurrently.
+ </para>
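+ <para>
+ The request cycle described above can be sketched as follows. This is
+ only an illustration, not a reference client: the endpoint address
+ (localhost, port 9004, path search.pz2), the command names
+ (init, search, stat, show, termlist) and their parameter names are
+ assumptions that should be checked against the web-service protocol
+ reference for your Pazpar2 version. The helper functions
+ command_url and search_cycle are hypothetical names introduced here.
+ The sketch only builds the request URLs and performs no network access.
+ </para>

```python
from urllib.parse import urlencode

# Assumed endpoint: host, port, and path depend on your deployment.
BASE_URL = "http://localhost:9004/search.pz2"


def command_url(command, session=None, **params):
    """Build the URL for one web-service request (hypothetical helper)."""
    query = {"command": command}
    if session is not None:
        query["session"] = session
    query.update(params)
    return BASE_URL + "?" + urlencode(query)


def search_cycle(session, query, num=20):
    """Return, in order, the URLs an application might request
    for one search (command and parameter names are assumptions)."""
    return [
        # Launch the search; it then continues in the background.
        command_url("search", session, query=query),
        # Poll the status of the ongoing operation.
        command_url("stat", session),
        # Ask for the first page of merged records.
        command_url("show", session, start=0, num=num),
        # Ask for one result set facet.
        command_url("termlist", session, name="subject"),
    ]


if __name__ == "__main__":
    # A session would first be obtained with an init request.
    print(command_url("init"))
    for url in search_cycle("12345", "water"):
        print(url)
```

+ <para>
+ An application would typically repeat the status and record requests
+ until the search has finished, redrawing the result list each time.
+ </para>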
+ <para>
+ Pazpar2 is designed to be highly configurable. Incoming records are
+ normalized to XML/UTF-8, and then further normalized using XSLT to a
+ simple internal representation that is suitable for analysis. By
+ providing XSLT stylesheets for different kinds of result records, you
+ can configure Pazpar2 to work against different kinds of information
+ retrieval servers. Finally, metadata is extracted in a configurable
+ way from this internal record, to support display, merging, ranking,
+ result set facets, and sorting. Pazpar2 is not bound to a specific model
+ of metadata, such as DublinCore or MARC: by providing the right
+ configuration, it can work with any combination of different kinds of data in
+ support of many different applications.
+ </para>
+ <para>
+ Pazpar2 is designed to be efficient and scalable. You can set it up to
+ search several hundred targets in parallel, or you can use it to support
+ hundreds of concurrent users. It is implemented with the same attention
+ to performance and economy that we use in our indexing engines, so that
+ you can focus on building your application without worrying about the
+ details of metasearch logic. You can devote all of your attention to
+ usability and let Pazpar2 do what it does best -- metasearch.
+ </para>
+ <para>
+ Pazpar2 is our attempt to re-think the traditional paradigms for
+ implementing and deploying metasearch logic, with an uncompromising
+ approach to performance, and attempting to make maximum use of the
+ capabilities of modern browsers. The demo user interface that
+ accompanies the distribution is but one example. If you think of new
+ ways of using Pazpar2, we hope you'll share them with us, and if we
+ can provide assistance with regard to training, design, programming,
+ integration with different backends, hosting, or support, please don't
+ hesitate to contact us. If you'd like to see functionality in Pazpar2
+ that is not there today, let us know as well. It may
+ already be in our development pipeline, or there might be a
+ possibility for you to help out by sponsoring development time or
+ code. Either way, get in touch and we will give you straight answers.
+ </para>
+ <para>
+ Enjoy!
+ </para>
+ <para>
+ Pazpar2 is covered by the GNU General Public License (GPL) version 2.
+ See <xref linkend="license"/> for further information.
+ </para>
+ </section>
+
+ <section id="connectors">
+ <title>Connectors to non-standard databases</title>
+ <para>
+ If you wish to connect to commercial or other databases which do not
+ support open standards, please contact Index Data on
+ <email>info@indexdata.com</email>. We have a
+ proprietary framework for building connectors that enable Pazpar2
+ to access
+ thousands of online databases, in addition to the vast number of catalogs
+ and online services that support the Z39.50/SRU/SRW protocols.
+ </para>
+ </section>
+
+ <section id="name">
+ <title>A note on the name Pazpar2</title>
+ <para>
+ The name Pazpar2 derives from three sources. First, it is
+ Index Data's second major piece of software that does parallel
+ searching of Z39.50 targets. Second, it is a near-homophone
+ of Passepartout, the ever-helpful servant in Jules Verne's novel
+ Around the World in Eighty Days (who helpfully uses the language
+ of his master). Finally, "passe par tout" means something like
+ "passes through anything" in French -- in other words, a universal
+ solution, or if you like a MasterKey.
+ </para>
+ </section>
+ </chapter>
+
+ <chapter id="installation">
+ <title>Installation</title>
+ <para>
+ The Pazpar2 package includes documentation as well
+ as the Pazpar2 server. The package also includes a simple user
+ interface called "test1", which consists of a single HTML page and a single
+ JavaScript file to illustrate the use of Pazpar2.
+ </para>
+ <para>
+ Pazpar2 depends on the following tools/libraries:
+ <variablelist>
+ <varlistentry><term><ulink url="&url.yaz;">YAZ</ulink></term>
+ <listitem>
+ <para>
+ The popular Z39.50 toolkit for the C language.
+ YAZ <emphasis>must</emphasis> be compiled with Libxml2/Libxslt support.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry><term><ulink url="&url.icu;">International
+ Components for Unicode (ICU)</ulink></term>
+ <listitem>
+ <para>
+ ICU provides Unicode support for non-English languages with
+ character sets outside the range of 7-bit ASCII, like
+ Greek, Russian, German and French. Pazpar2 uses the ICU
+ Unicode character conversions, Unicode normalization, case
+ folding and other fundamental operations needed in
+ tokenization, normalization and ranking of records.
+ </para>
+ <para>
+ Compiling, linking, and usage of the ICU libraries is optional,
+ but strongly recommended for usage in an international
+ environment.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ <para>
+ In order to compile Pazpar2, a C compiler which supports C99 or later
+ is required.
+ </para>
+
+ <section id="installation.unix">
+ <title>Installation from source on Unix (including Linux, MacOS, etc.)</title>
+ <para>
+ The latest source code for Pazpar2 is available from
+ <ulink url="&url.pazpar2.download;"/>.
+ Most Unix-based operating systems have the required
+ tools available as binary packages.
+ For example, if the Libxml2 and Libxslt development packages
+ are already installed, use them.
+ </para>
+
+ <para>
+ Ensure that the development libraries and header files are
+ available on your system before compiling Pazpar2. For installation
+ of YAZ, refer to the Installation chapter of the YAZ manual at
+ <ulink url="&url.yaz.install;"/>.
+ </para>
+ <para>
+ Once the dependencies are in place, Pazpar2 can be unpacked and
+ installed as follows:
+ </para>
+ <screen>
+ tar xzf pazpar2-VERSION.tar.gz
+ cd pazpar2-VERSION
+ ./configure
+ make
+ sudo make install
+ </screen>
+ <para>
+ Running <literal>make install</literal> installs the manpages as well as the
+ Pazpar2 server, <literal>pazpar2</literal>,
+ in PREFIX<literal>/sbin</literal>.
+ By default, PREFIX is <literal>/usr/local/</literal>. This can be
+ changed with the configure option <option>--prefix</option>.
+ </para>
+ </section>
+
+ <section id="installation.win32">
+ <title>Installation from source on Windows</title>
+ <para>
+ Pazpar2 can be built for Windows using
+ <ulink url="&url.vstudio;">Microsoft Visual Studio</ulink>.
+ The support files for building Pazpar2 on Windows are located in the
+ <filename>win</filename> directory. The compilation is performed
+ using the <filename>win/makefile</filename>, which is to be
+ processed by the NMAKE utility that is part of Visual Studio.