1 # $Id: PQF.pm,v 1.7 2004-12-23 10:24:12 mike Exp $
3 package Net::Z3950::PQF;
9 use Net::Z3950::PQF::Node;
11 our $VERSION = '0.03';
16 Net::Z3950::PQF - Perl extension for parsing PQF (Prefix Query Format)
21 $parser = new Net::Z3950::PQF();
22 $node = $parser->parse('@and @attr 1=1003 kernighan @attr 1=4 unix');
23 print $node->render(0);
27 This library provides a parser for PQF (Prefix Query Format), an ugly
28 but precise string format for expressing Z39.50 Type-1 queries. This
29 format is widely used behind the scenes of Z39.50 applications, and is
30 also used extensively with test-harness programs such as the YAZ
31 command-line client, C<yaz-client>. A few particularly misguided
32 souls have been known to type it by hand.
34 Unlike PQF itself, this module
35 is simple to use. Create a parser object, then pass PQF strings
36 into its C<parse()> method to yield parse-trees. The trees are made
37 up of nodes whose types are subclasses of
38 C<Net::Z3950::PQF::Node>.
39 and have names of the form
40 C<Net::Z3950::PQF::somethingNode>. You may find it helpful to use
41 C<Data::Dumper> to visualise the structure of the returned
44 What is a PQF parse-tree good for? Not much. You can render a
45 human-readable version by invoking the top node's C<render()> method,
46 which is probably useful only for debugging. If you want to do
47 anything useful, such as implementing an actual query server that
48 understands PQF, you'll have to walk the tree.
54 $parser = new Net::Z3950::PQF();
56 Creates a new parser object.
72 $query = '@and @attr 1=1003 kernighan @attr 1=4 unix';
73 $node = $parser->parse($query);
75 die "parse($query) failed: " . $parser->errmsg();
78 Parses the PQF string provided as its argument. If an error occurs,
79 then an undefined value is returned, and the error message can be
80 obtained by calling the C<errmsg()> method. Otherwise, the top node
81 of the parse tree is returned.
83 $node2 = $parser->parse($query, "zthes");
84 $node3 = $parser->parse($query, "1.2.840.10003.3.13");
86 A second argument may be provided after the query itself. If it is
87 provided, then it is taken to be either the name or the OID of a
88 default attribute set, which attributes specified in the query belong
89 to if no alternative attribute set is explicitly specified within the
90 query. When this second argument is absent, the default attribute set
97 my($text, $attrset) = @_;
98 $attrset = "bib-1" if !defined $attrset;
100 $this->{text} = $text;
101 return $this->_parse($attrset, {});
105 # PRIVATE to parse();
107 # Underlying parse function. $attrset is the default attribute-set to
108 # use for attributes that are not specified with an explicit set, and
109 # $attrhash is hash of attributes (at most one per type per
110 # attribute-set) to be applied to all nodes below this point. The
111 # keys of this hash are of the form "<attrset>:<type>" and the values
112 # are the corresponding attribute values.
116 my($attrset, $attrhash) = @_;
118 $this->{text} =~ s/^\s+//;
120 ### This rather nasty hack for quoted terms doesn't recognised
121 # backslash-quoted embedded double quotes.
122 if ($this->{text} =~ s/^"(.*?)"//) {
123 return $this->_leaf('term', $1, $attrhash);
126 # Also recognise multi-word terms enclosed in {curly braces}
127 if ($this->{text} =~ s/^{(.*?)}//) {
128 return $this->_leaf('term', $1, $attrhash);
131 my $word = $this->_word();
132 if ($word eq '@attrset') {
133 $attrset = $this->_word();
134 return $this->_parse($attrset, $attrhash);
136 } elsif ($word eq '@attr') {
137 $word = $this->_word();
140 $word = $this->_word();
142 my($type, $val) = ($word =~ /(.*)=(.*)/);
144 $h{"$attrset:$type"} = $val;
145 return $this->_parse($attrset, \%h);
147 } elsif ($word eq '@and' || $word eq '@or' || $word eq '@not') {
148 my $sub1 = $this->_parse($attrset, $attrhash);
149 my $sub2 = $this->_parse($attrset, $attrhash);
150 if ($word eq '@and') {
151 return new Net::Z3950::PQF::AndNode($sub1, $sub2);
152 } elsif ($word eq '@or') {
153 return new Net::Z3950::PQF::OrNode($sub1, $sub2);
154 } elsif ($word eq '@not') {
155 return new Net::Z3950::PQF::NotNode($sub1, $sub2);
157 die "Houston, we have a problem";
160 } elsif ($word eq '@prox') {
161 return $this->_error("proximity not yet implemented");
163 } elsif ($word eq '@set') {
164 $word = $this->_word();
165 return $this->_leaf('rset', $word, $attrhash);
168 # It must be a bareword
169 return $this->_leaf('term', $word, $attrhash);
173 # PRIVATE to _parse();
177 $this->{text} =~ s/^\s+//;
178 $this->{text} =~ s/^(\S+)//;
183 # PRIVATE to _parse();
188 $this->{errmsg} = join("", @msg);
193 # PRIVATE to _parse();
196 my($type, $word, $attrhash) = @_;
199 foreach my $key (sort keys %$attrhash) {
200 my($attrset, $type) = split /:/, $key;
201 push @attrs, [ $attrset, $type, $attrhash->{$key} ];
204 if ($type eq 'term') {
205 return new Net::Z3950::PQF::TermNode($word, @attrs);
206 } elsif ($type eq 'rset') {
207 return new Net::Z3950::PQF::RsetNode($word, @attrs);
209 die "_leaf() called with type='$type' (should be 'term' or 'rset')";
216 print $parser->errmsg();
218 Returns the last error-message generated by a failed attempt to parse
225 return $this->{errmsg};
231 The C<Net::Z3950::PQF::Node> module.
233 The definition of the Type-1 query in the Z39.50 standard, the
234 relevant section of which is on-line at
235 http://www.loc.gov/z3950/agency/markup/09.html#3.7
237 The documentation of Prefix Query Format in the YAZ Manual, the
238 relevant section of which is on-line at
239 http://indexdata.com/yaz/doc/tools.tkl#PQF
243 Mike Taylor, E<lt>mike@indexdata.comE<gt>
245 =head1 COPYRIGHT AND LICENSE
247 Copyright 2004 by Index Data ApS.
249 This library is free software; you can redistribute it and/or modify
250 it under the same terms as Perl itself.