1 # $Id: PQF.pm,v 1.8 2007-10-05 12:12:34 mike Exp $
3 package Net::Z3950::PQF;
9 use Net::Z3950::PQF::Node;
11 our $VERSION = '0.04';
16 Net::Z3950::PQF - Perl extension for parsing PQF (Prefix Query Format)
21 $parser = new Net::Z3950::PQF();
22 $node = $parser->parse('@and @attr 1=1003 kernighan @attr 1=4 unix');
23 print $node->render(0);
27 This library provides a parser for PQF (Prefix Query Format), an ugly
28 but precise string format for expressing Z39.50 Type-1 queries. This
29 format is widely used behind the scenes of Z39.50 applications, and is
30 also used extensively with test-harness programs such as the YAZ
31 command-line client, C<yaz-client>. A few particularly misguided
32 souls have been known to type it by hand.
34 Unlike PQF itself, this module
35 is simple to use. Create a parser object, then pass PQF strings
36 into its C<parse()> method to yield parse-trees. The trees are made
37 up of nodes whose types are subclasses of
38 C<Net::Z3950::PQF::Node>.
39 and have names of the form
40 C<Net::Z3950::PQF::somethingNode>. You may find it helpful to use
41 C<Data::Dumper> to visualise the structure of the returned
44 What is a PQF parse-tree good for? Not much. You can render a
45 human-readable version by invoking the top node's C<render()> method,
46 which is probably useful only for debugging. Or you can turn it into
47 tree of nodes like those passed into SimpleServer search handlers
48 using C<toSimpleServer()>. If you want to do anything useful, such as
49 implementing an actual query server that understands PQF, you'll have
56 $parser = new Net::Z3950::PQF();
58 Creates a new parser object.
74 $query = '@and @attr 1=1003 kernighan @attr 1=4 unix';
75 $node = $parser->parse($query);
77 die "parse($query) failed: " . $parser->errmsg();
80 Parses the PQF string provided as its argument. If an error occurs,
81 then an undefined value is returned, and the error message can be
82 obtained by calling the C<errmsg()> method. Otherwise, the top node
83 of the parse tree is returned.
85 $node2 = $parser->parse($query, "zthes");
86 $node3 = $parser->parse($query, "1.2.840.10003.3.13");
88 A second argument may be provided after the query itself. If it is
89 provided, then it is taken to be either the name or the OID of a
90 default attribute set, which attributes specified in the query belong
91 to if no alternative attribute set is explicitly specified within the
92 query. When this second argument is absent, the default attribute set
99 my($text, $attrset) = @_;
100 $attrset = "bib-1" if !defined $attrset;
102 $this->{text} = $text;
103 return $this->_parse($attrset, {});
107 # PRIVATE to parse();
109 # Underlying parse function. $attrset is the default attribute-set to
110 # use for attributes that are not specified with an explicit set, and
111 # $attrhash is hash of attributes (at most one per type per
112 # attribute-set) to be applied to all nodes below this point. The
113 # keys of this hash are of the form "<attrset>:<type>" and the values
114 # are the corresponding attribute values.
118 my($attrset, $attrhash) = @_;
120 $this->{text} =~ s/^\s+//;
122 ### This rather nasty hack for quoted terms doesn't recognised
123 # backslash-quoted embedded double quotes.
124 if ($this->{text} =~ s/^"(.*?)"//) {
125 return $this->_leaf('term', $1, $attrhash);
128 # Also recognise multi-word terms enclosed in {curly braces}
129 if ($this->{text} =~ s/^{(.*?)}//) {
130 return $this->_leaf('term', $1, $attrhash);
133 my $word = $this->_word();
134 if ($word eq '@attrset') {
135 $attrset = $this->_word();
136 return $this->_parse($attrset, $attrhash);
138 } elsif ($word eq '@attr') {
139 $word = $this->_word();
142 $word = $this->_word();
144 my($type, $val) = ($word =~ /(.*)=(.*)/);
146 $h{"$attrset:$type"} = $val;
147 return $this->_parse($attrset, \%h);
149 } elsif ($word eq '@and' || $word eq '@or' || $word eq '@not') {
150 my $sub1 = $this->_parse($attrset, $attrhash);
151 my $sub2 = $this->_parse($attrset, $attrhash);
152 if ($word eq '@and') {
153 return new Net::Z3950::PQF::AndNode($sub1, $sub2);
154 } elsif ($word eq '@or') {
155 return new Net::Z3950::PQF::OrNode($sub1, $sub2);
156 } elsif ($word eq '@not') {
157 return new Net::Z3950::PQF::NotNode($sub1, $sub2);
159 die "Houston, we have a problem";
162 } elsif ($word eq '@prox') {
163 return $this->_error("proximity not yet implemented");
165 } elsif ($word eq '@set') {
166 $word = $this->_word();
167 return $this->_leaf('rset', $word, $attrhash);
170 # It must be a bareword
171 return $this->_leaf('term', $word, $attrhash);
175 # PRIVATE to _parse();
179 $this->{text} =~ s/^\s+//;
180 $this->{text} =~ s/^(\S+)//;
185 # PRIVATE to _parse();
190 $this->{errmsg} = join("", @msg);
195 # PRIVATE to _parse();
198 my($type, $word, $attrhash) = @_;
201 foreach my $key (sort keys %$attrhash) {
202 my($attrset, $type) = split /:/, $key;
203 push @attrs, [ $attrset, $type, $attrhash->{$key} ];
206 if ($type eq 'term') {
207 return new Net::Z3950::PQF::TermNode($word, @attrs);
208 } elsif ($type eq 'rset') {
209 return new Net::Z3950::PQF::RsetNode($word, @attrs);
211 die "_leaf() called with type='$type' (should be 'term' or 'rset')";
218 print $parser->errmsg();
220 Returns the last error-message generated by a failed attempt to parse
227 return $this->{errmsg};
233 The C<Net::Z3950::PQF::Node> module.
235 The definition of the Type-1 query in the Z39.50 standard, the
236 relevant section of which is on-line at
237 http://www.loc.gov/z3950/agency/markup/09.html#3.7
239 The documentation of Prefix Query Format in the YAZ Manual, the
240 relevant section of which is on-line at
241 http://indexdata.com/yaz/doc/tools.tkl#PQF
245 Mike Taylor, E<lt>mike@indexdata.comE<gt>
247 =head1 COPYRIGHT AND LICENSE
249 Copyright 2004 by Index Data ApS.
251 This library is free software; you can redistribute it and/or modify
252 it under the same terms as Perl itself.