[Ex-plain] Latest DTD (was ZIG)

Alan Kent ajk at mds.rmit.edu.au
Fri Mar 8 00:36:39 CET 2002


On Thu, Mar 07, 2002 at 11:25:57AM +0000, Robert Sanderson wrote:
> > * The same name is being used for search, sort, and scan. The assumption
> >   is these belong to the same space of index names. Is this true all the 
> >   time? Should the index name be unique in a single XML record?
> 
> I don't follow?

Sorry - I was typing fast (wife waiting to go home!) so was a bit
terse on a few things. A fundamental assumption I have is that we
are trying to describe a Z39.50 database, and so want to map directly
on to Z39.50 concepts. This is not quite true in the proposal (which
does not stress me out). The Z39.50 protocol does not define the
concept of 'indexes'. It defines attributes, elements, etc.
So if you are introducing a new concept, a clear definition of
exactly what it is and how it relates to Z39.50 I think would be
important. You might have an implicit understanding of what it means
with respect to your implementation of Z39.50, and I don't mind
the concept, but it needs to be documented.

> Index describes a single index in the database.  This allows for multiple 
> attributes to be grouped together as aliases for a single index. This will 
> be especially useful (IMO) when we get BIB2, as then we can have an author 
> index with both BIB1 and BIB2 attributes associated with it.

'index' is not a Z39.50 protocol concept that I know of. How exactly
do you map 'index' onto Z39.50? Term List? etc. (I mean I can guess,
but guessing is not good enough for an interoperable spec, is it :-)

> In the latest version there is no <name> element in <index> as it was 
> irrelevant, so what do you mean by index name? 

I think I meant the id attribute of <index ...>

For CCL to RPN, I would like to have a symbolic name to use in text
query languages (a 'name'). I would try to use the 'id' attribute
in the current proposal to do this. (Also relevant for CQL in SRW.)
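
To make the idea concrete, here is a minimal Python sketch of using an
index 'id' as the symbolic name in a text query. Everything in it is an
assumption for illustration: the INDEX_MAP table, the ids 'title' and
'author', and the function name are hypothetical, and the output uses a
PQF-style prefix notation (as used by some Z39.50 toolkits) rather than
anything defined in the proposal.

```python
# Hypothetical table built from the <index id="..."> entries of an
# Explain-- record: index id -> (attribute set, [(type, value), ...]).
# The ids and attribute values here are illustrative only.
INDEX_MAP = {
    "title": ("Bib-1", [(1, 4)]),      # use attribute 4 = title
    "author": ("Bib-1", [(1, 1003)]),  # use attribute 1003 = author
}


def ccl_term_to_pqf(clause):
    """Turn a single 'name=term' clause into a PQF-style string.

    Only the simplest CCL form is handled (one qualifier, one term);
    boolean operators, truncation and so on are deliberately left out.
    """
    name, sep, term = clause.partition("=")
    if not sep:
        raise ValueError("expected a clause of the form name=term")
    # A KeyError here means the index id is not in the record.
    attr_set, attrs = INDEX_MAP[name.strip()]
    attr_part = " ".join(f"@attr {t}={v}" for t, v in attrs)
    return f'@attrset {attr_set} {attr_part} "{term.strip()}"'


print(ccl_term_to_pqf("title=hamlet"))
```

The point is only that the symbolic name has to resolve to an attribute
list somewhere, so the spec needs to say which field carries that name.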

> > * What does 'primary' mean on indexes?
> 
> To use the example above, if the database maintainer wanted people to use 
> BIB2 rather than BIB1 as a transition phase, then they would put 
> primary="true" on the BIB2 map.  Equally the same goes for <host>, 
> <title>, <description> etc.

Sorry, wrong question. I meant on <indexTitle>. It's not defined on
<index>, only <indexTitle> and <map>. I understand the <map>
attribute now. But there is only one <indexTitle>, so why have a
'primary' attribute?

> > * What is <indexType>?
> 
> The 'type' of the index, if this is not verifiable by the attributes. (But 
> can be used even if it is)
> For example, I have my own attribute set for collectable card games.  It 
> has an attribute and index for 'card name' and if I were doing a cross db 
> title search I would want to search this, but it wouldn't be possible to 
> know to do this without some typing mechanism.

Can you give a concrete example of what you put in the element? Is it
something specific to your implementation or a generic Z39.50 concept?

> > * Do you really want to turn attribute types (numbers) into different
> >   schema elements? This would restrict the population space and gives
> >   a big long list of element names. Would it be better to use numbers
> >   avoiding mapping problems? I was not sure what <hitcount> etc meant.
> 
> Yes.  Otherwise you need to put attribute set on Every element which has 
> them as use and access are both '1' (etc)

Sorry, I don't understand exactly (and I might not have been clear).
By enumerating attribute types (which are numbers in Z39.50) as elements
in the DTD, it means if anyone ever comes up with a new attribute type,
the DTD needs to be extended. This seemed architecturally bad to me.
Why not just types and values as in the spec? I must admit I don't
understand your comment. I might be missing the point.

I keep relating things back to the Z39.50 spec. If I want to form a
query, I need to know the global attribute set to use, and then
the attribute list (where V3 allows an attribute set per type/value).

I am suggesting something like:

    <map attributeSet='Bib-1'> <!-- default global attr set -->
        <attr type='1' value='3'/> <!-- title -->
        <attr type='2' value='1'/>
        <attr type='3' value='7'/>
        ...etc...
    </map>

rather than:

    <map attributeSet='Bib-1'> <!-- default global attr set -->
        <use>3</use> <!-- title -->
        <posit>2</posit>
        <struct>3</struct>
        ...etc...
    </map>

Your form is easier to read, but not easier to use automatically for
forming queries etc (I must define a table mapping element names to
attribute type numbers, and extend the table if anyone introduces
a new element name etc).

Or are we talking different wavelengths?
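
A small Python sketch of the difference between the two forms, under
stated assumptions: the attribute values are illustrative only, the
ELEMENT_TO_TYPE table and both function names are hypothetical, and the
type numbers follow the Bib-1 convention (1 = use, 3 = position,
4 = structure). The generic form yields (type, value) pairs directly;
the named-element form needs the hand-maintained table and fails on any
element name the table does not know.

```python
import xml.etree.ElementTree as ET

GENERIC = """<map attributeSet='Bib-1'>
  <attr type='1' value='4'/>
  <attr type='3' value='3'/>
  <attr type='4' value='1'/>
</map>"""

NAMED = """<map attributeSet='Bib-1'>
  <use>4</use>
  <posit>3</posit>
  <struct>1</struct>
</map>"""

# Needed only for the named-element form; must be extended by hand
# whenever a new element name (attribute type) is introduced.
ELEMENT_TO_TYPE = {"use": 1, "posit": 3, "struct": 4}


def attrs_from_generic(xml_text):
    """Read (type, value) pairs straight from <attr> elements."""
    root = ET.fromstring(xml_text)
    return [(int(a.get("type")), int(a.get("value")))
            for a in root.findall("attr")]


def attrs_from_named(xml_text):
    """Read (type, value) pairs via the element-name lookup table."""
    root = ET.fromstring(xml_text)
    pairs = []
    for child in root:
        if child.tag not in ELEMENT_TO_TYPE:
            raise KeyError(f"no attribute type known for <{child.tag}>")
        pairs.append((ELEMENT_TO_TYPE[child.tag], int(child.text.strip())))
    return pairs


print(attrs_from_generic(GENERIC))
print(attrs_from_named(NAMED))
```

Both calls produce the same attribute list for these inputs; only the
second depends on a table that lives outside the record.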

> > * Should attribute values distinguish between string and numeric attribute
> >   values?
> 
> Why, and if so, how?

Scratch this - stupid question. Got mixed up with GRS-1 element names.
There is a choice for attribute type in RPN queries, but it's numeric
or complex (weird stuff). I am happy sticking to numeric attribute values.

> > * What is a <sortKeyword>?
> 
> A keyword that is accepted for sorting.

Sorry, what does this mean with respect to the Z39.50 spec? Do you mean
the 'sortfield' member of the SortKey type?

As an aside, what do people normally use with sorting? Sort fields,
attribute lists, or element specifications? (I am talking about in
the SortKey CHOICE.)

> > * What are legal format names? (What is the exact format etc)
> 
> The OID, or the official name for it?  GRS-1, SUTRS, MARC, XML, SGML, etc

I guess the OID would make the most sense (silly me). It saves having
to define standard tables - and is extensible without changing the
Explain-- spec.

> > * Should there be able to be a description for element sets (rather than
> >   just the names?)
> 
> Why?  I can see a Title for elementset, but if we allow a description, 
> then we should also allow a description for each index. 

I guess I am thinking it would be sensible for Explain-- to be pretty
compatible with (a subset of) Explain. Explain has a description field
for ElementSetDetails, but no title. For F and B, no description is
needed. But what about element set 'X' (locally defined)?

Description per index is not silly either though (optional if you
like). I would like to use the index names (ids?) in text queries
(CCL, CQL, etc). I would like to show a user a list of 'indexes' they
can query on. Descriptions I think are useful for anything that may
be shown to a user.

> > <indexInfo> 
> > <index id="DatabaseName" search="true" scan="true" sort="false"> 
> > <indexTitle primary="false">Database Name</indexTitle> 
> > <indexType>clever</indexType> 
> > <map attributeset="1.2.840.10003.3.1000.62.0.10.1.33541.5992.210"
> > primary="false"> 
> > <use>numeric 1</use> 
> > </map> 
> > </index> 
> 
> Should be, IMO, <use> 1 </use> as use attributes are only ever numeric.
> See indexType and primary explanation above. 

That was actually a bug! I will fix :-)

> > <index id="F" search="false" scan="false" sort="true"> 
> > <indexTitle primary="false"></indexTitle> 
> > <indexType>clever</indexType> 
> > <map primary="false"> 
> > </map> 
> > </index> 
> 
> This is the old and stupid way of representing a sort by keyword, although 
> one which I quite liked.
> 
> New way is <sortKeyword> F </sortKeyword>, which I like even more.

Oh, so by sortKeyword you mean 'sortfield' in 'SortKey' in the ASN.1
of Z39.50. I will move my definition.


One new question - I did not do the last mod time yet. What format should
that be in?

Alan



