[ZOOM] Catching up

Ashley Sanders zzaascs at irwell.mimas.ac.uk
Wed Nov 14 13:19:12 CET 2001


Mike,

> P.S.  C'mon Ashley, where's 1.0g?!  :-)

I was going to try and do it yesterday, but 53 ZOOM emails just proved
too much for me.

So, in no particular order.... and I know I've probably missed
loads of wheat in the chaff :-)

Jakob suggested:

>   for(size_t sz = 0; sz < rs.size(); ++ sz)
>      cout << rs[sz]["title"] << endl;

I almost suggested this myself, but backed away from doing so. I
rather like it, but another side of me just says it is syntactic
sugar for the weak of mind :-)
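
A minimal sketch of what that sugar could look like, assuming a record
keeps its fields in a std::map and already has a field() method. The
class layout, add() and the storage are made up for illustration and
are not taken from zoom.hh:

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the ZOOM record class.
class record {
public:
    explicit record(const std::map<std::string, std::string>& fields)
        : fields_(fields) {}

    // The plain accessor: an unknown field yields an empty string.
    std::string field(const std::string& name) const {
        std::map<std::string, std::string>::const_iterator it = fields_.find(name);
        return it == fields_.end() ? std::string() : it->second;
    }

    // The sugar: rec["title"] just forwards to field("title").
    std::string operator[](const std::string& name) const { return field(name); }

private:
    std::map<std::string, std::string> fields_;
};

// Hypothetical stand-in for the ZOOM result set class.
class resultSet {
public:
    void add(const record& r) { records_.push_back(r); }
    std::size_t size() const { return records_.size(); }

    // The sugar: rs[i] instead of a named getter; no bounds checking.
    const record& operator[](std::size_t i) const { return records_[i]; }

private:
    std::vector<record> records_;
};
```

With those two operators in place, the loop quoted above compiles
unchanged, and rs[sz].field("title") remains available for anyone who
prefers to spell it out.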

Mike:

> Now we decided that rather than add the recordtype-specific methods to
> the record class subtypes themselves, we'd add them to the parallel 
> hierarchy of data class subtypes.  I would welcome input from anyone
> who remembers why we decided to do it that way!

Because I thought it would be useful. I'll admit that they are
perhaps redundant, but I still prefer it over doing a dynamic_cast
to find out the type. But if we're getting rid of data anyway...?

Re: Discussion about constructors of connection class.

My understanding was that zoom.hh was the public interface to ZOOM
and so anything not explicitly declared public was therefore
private. So the fact that constructors aren't shown doesn't mean that
C++ can go and create default constructors, but does mean that they
are private. We do need to make this explicit somewhere (and I'm
not sure we yet have all the constructors correct anyway.)
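
One way of making this explicit, as a sketch only. The (host, port)
constructor and the accessors are assumptions for illustration, not
the agreed interface; the point is just that the copy operations are
written down as private instead of being left to the compiler:

```cpp
#include <string>
#include <type_traits>

// Hypothetical connection class; the signature is illustrative only.
class connection {
public:
    // The only public way to make a connection in this sketch. Because a
    // constructor is declared, no default constructor is generated.
    connection(const std::string& host, int port) : host_(host), port_(port) {}

    const std::string& host() const { return host_; }
    int port() const { return port_; }

private:
    // Spelled out so the compiler cannot generate public versions.
    connection(const connection&);             // declared, never defined
    connection& operator=(const connection&);  // declared, never defined

    std::string host_;
    int port_;
};

// Compile-time check that user code really cannot default-construct or copy.
static_assert(!std::is_default_constructible<connection>::value,
              "connection must not be default-constructible");
static_assert(!std::is_copy_constructible<connection>::value,
              "connection must not be copyable");
```

With a newer compiler the two private declarations could be written
as "= delete" instead, which gives a clearer error message.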

Re: pointless comments in zoom.hh.

Yup they are pointless, but I think they were just notes for us,
rather than being useful for a user. They need changing.

Jakob:

> - enum ZOOM::record::recordSyntax, a enu of record syntaxes will
> allways be wrong.

I agree that the enum will always be wrong. But, ZOOM was meant to
hide the peccadillos of Z39.50 and I think OIDs are something that
should be hidden at the expense of the enum always being wrong. I
think we should just make the enum as complete as possible and if
anyone wants a new syntax adding, then they contact the ZOOM
maintainers and it gets added (but see below.)

Jakob, then Mike,

>> - enum ZOOM::record::recordSyntax, a enu of record syntaxes will
>>   allways be wrong.  in my current work we uses
>>   1.2.840.10003.5.1000.105.221 to send special present formats. and
>>   as it is a private OID used. 
>
>This is an excellent point - we don't want to preclude the use of
>non-standard record syntaxes.  Anyone want to propose a good way to
>handle this? 

I've never seen a problem with implementations adding their own
private syntaxes to the enum (I do this already with what little
implementation I've done.) If it's a private OID then it
shouldn't ever appear in public ZOOM and its use is only between
consenting parties.
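
A sketch of how an implementation might do that while still keeping
OIDs out of sight. The enumerator names, the PRIVATE_BASE convention
and oidFor() are all hypothetical; the public list is deliberately
incomplete, and the private OID is the one Jakob quoted:

```cpp
#include <string>

// Hypothetical, deliberately incomplete enum of record syntaxes.
enum recordSyntax {
    SYNTAX_UNKNOWN = 0,
    SYNTAX_USMARC,
    SYNTAX_SUTRS,
    SYNTAX_GRS1,
    // Assumed convention: values from here up belong to the implementation
    // and never appear in the public ZOOM header.
    SYNTAX_PRIVATE_BASE = 1000,
    SYNTAX_PRIVATE_PRESENT = SYNTAX_PRIVATE_BASE + 1
};

// Internal mapping from enum value to OID; the user never sees the OIDs.
std::string oidFor(recordSyntax syntax) {
    switch (syntax) {
    case SYNTAX_USMARC:          return "1.2.840.10003.5.10";
    case SYNTAX_SUTRS:           return "1.2.840.10003.5.101";
    case SYNTAX_GRS1:            return "1.2.840.10003.5.105";
    case SYNTAX_PRIVATE_PRESENT: return "1.2.840.10003.5.1000.105.221";
    default:                     return "";  // unknown to this implementation
    }
}
```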

Mike:

> However, given that the potential for such misunderstanding exists,
> maybe it would be better to change the C++ binding so that the
> two-argument version of the option() method returns void?  Maybe (as a
> wise man once wrote) "each [method] should do one thing well".

I've not got a problem with the set option function returning
the old value. It is rather analogous to the C/C++ code of:

    int b = c++;

But a decision would have to be made as to whether the return
value needs to be freed/deleted by the caller or whether it is
something in static memory, and if so, for how long the value it
contains is valid. Using STL strings does make this
simpler/neater -- but C/C++ programmers have been coping with the
old way of doing things for years. It's slightly messy but a
trivial point that can be cleared up later.
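
For illustration, a sketch of the STL string version, where the old
value comes back by value and the ownership question goes away. The
class body and the std::map storage are assumptions, not the binding
as it stands:

```cpp
#include <map>
#include <string>

// Hypothetical connection class holding string-valued options.
class connection {
public:
    // One-argument form: read an option; an unset option reads as "".
    std::string option(const std::string& key) const {
        std::map<std::string, std::string>::const_iterator it = options_.find(key);
        return it == options_.end() ? std::string() : it->second;
    }

    // Two-argument form: set the option and return the previous value.
    // Returning std::string by value means the caller owns the copy, so
    // there is nothing to free and no static buffer to go stale.
    std::string option(const std::string& key, const std::string& value) {
        std::string old = option(key);
        options_[key] = value;
        return old;
    }

private:
    std::map<std::string, std::string> options_;
};
```

A caller who does not care about the old value simply ignores the
return, much as with the void version Mike suggests.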

Mike:

> In case anyone didn't "get" that, the message is that I am starting to
> think that the rawdata objects are an extra layer of added complexity 
> for no or marginal gain.

Have we decided how to allow access to the internal structure of a
record? DOM? I'm sorry, but I can't work out what has been decided
here.

Jakob:

> Personly i thing all record's shut be returned as a raw data, and a easy
> way for the user to decode it with libs for handling different encodings.

Libs for handling decoding of raw data are fine, but out of scope
for ZOOM. ZOOM does need a simple way of getting at things like
titles for people who want to write simple programs that do
useful things. Hence we need the field() function.

Re: DOM

I'm happy with DOM as long as it is optional and I don't have to have
anything to do with it. I just don't have the time or the need to get
into it.

Feel free to ignore the rest of this message as it is probably
out of scope.

Adam:

> We could consider an abstract result set class. That is we
> define an interface that all result sets expose. Then do the
> same thing for connections and records. Having said that, I
> think that's going somewhat too far, for now. It also makes
> it even more painful to discuss other language bindings (except  
> for Java, perhaps). So I won't say anymore on this until I'm
> forced to;)

I did also think you could make the connection class virtual.  Then
you could have a connection type (like we have now) that searches one
target, or another connection type that is capable of searching
multiple targets. The user (i.e. the programmer) selects which connection
type they require and that would be the only difference between single
target and multi target searching. But that removes the manager
class and so is very un-ZOOM, so I'll shut up.
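
Roughly what I had in mind, as a sketch only. Every name here is
hypothetical, search() returns a bare hit count where a real binding
would return a result set, and the canned hit count stands in for
actually talking to a target:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Abstract interface shared by both kinds of connection.
class connection {
public:
    virtual ~connection() {}
    virtual std::size_t search(const std::string& query) = 0;
};

// Searches one target. cannedHits stands in for the remote server's answer.
class singleTarget : public connection {
public:
    singleTarget(const std::string& host, std::size_t cannedHits)
        : host_(host), hits_(cannedHits) {}

    const std::string& host() const { return host_; }
    std::size_t search(const std::string&) override { return hits_; }

private:
    std::string host_;
    std::size_t hits_;
};

// Fans the same query out to several targets and adds up the hits.
class multiTarget : public connection {
public:
    // Targets are borrowed, not owned; they must outlive this object.
    void add(connection* target) { targets_.push_back(target); }

    std::size_t search(const std::string& query) override {
        std::size_t total = 0;
        for (std::size_t i = 0; i < targets_.size(); ++i)
            total += targets_[i]->search(query);
        return total;
    }

private:
    std::vector<connection*> targets_;
};
```

Code written against connection& then works the same way whichever
type the programmer picked.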

Re: merging result sets.

As to Sebastian's merging result sets from multiple searches,
then this is exactly what the new version of COPAC tries to do.
And as Sebastian says, it ain't easy. It's slow because you have
to retrieve the whole result set from each of the targets. We
currently use a very simplistic de-dup method, but the cpu needed
for something more sophisticated won't be large and the delay from
downloading all the records will be the limiting factor. If the
combined number of hits reaches an arbitrary limit (currently set at
750 I think) then we just present the result sets one after the
other. And we also sort -- but that ain't perfect yet either. As
Sebastian said, even if a target sorts you can't rely on its
method and neither can you rely on the filing characters
indicator in MARC records for helping with removing leading
articles of a title. And we should be getting funding to
make it all a bit more sophisticated.
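
The overall shape of that merge, as a sketch and not the COPAC code.
Records are reduced to title strings, and the match key (lower-cased
letters and digits only) is an assumption made up for the example,
as are the function names:

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical match key: lower-case the title, drop everything but
// letters and digits.
std::string matchKey(const std::string& title) {
    std::string key;
    for (std::size_t i = 0; i < title.size(); ++i) {
        unsigned char ch = static_cast<unsigned char>(title[i]);
        if (std::isalnum(ch)) key += static_cast<char>(std::tolower(ch));
    }
    return key;
}

bool keyLess(const std::string& a, const std::string& b) {
    return matchKey(a) < matchKey(b);
}

// Merge fully retrieved result sets. Once the combined hit count reaches
// the limit, skip de-dup and sorting and present the sets one after another.
std::vector<std::string> mergeResults(
        const std::vector<std::vector<std::string> >& sets, std::size_t limit) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < sets.size(); ++i) total += sets[i].size();

    const bool dedup = total < limit;
    std::set<std::string> seen;
    std::vector<std::string> merged;
    for (std::size_t i = 0; i < sets.size(); ++i)
        for (std::size_t j = 0; j < sets[i].size(); ++j)
            // Keep the first record seen for each key when de-duplicating.
            if (!dedup || seen.insert(matchKey(sets[i][j])).second)
                merged.push_back(sets[i][j]);

    if (dedup) std::sort(merged.begin(), merged.end(), keyLess);
    return merged;
}
```

Note that this only works on records already downloaded, which is
where the time goes.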

Ashley.

-- 
Ashley Sanders                                a.sanders at mcc.ac.uk
COPAC: A public bibliographic database from MIMAS, funded by JISC
             http://copac.ac.uk/ - copac at mimas.ac.uk


