ISO/IEC JTC 1/SC34 N261

ISO/IEC JTC 1/SC34

Information Technology --

Document Description and Processing Languages

Title:	Proposer's Response to SC 34 N 252 - National Body Comments Received on SC 34 N 229 – Topic Map Data Model – An Infoset-Based Proposal
Source:	SC34 Secretariat
Project:	Topic Maps
Project editor:	L. M. Garshol
Status:	Editor's Response
Action:	For review and comment
Date:
Summary:
Distribution:	SC34 and Liaisons
Refer to:	252
Supersedes:
Reply to:	Dr. James David Mason (ISO/IEC JTC1/SC34 Chairman) Y-12 National Security Complex Information Technology Services Bldg. 9113 M.S. 8208 Oak Ridge, TN 37831-8208 U.S.A. Telephone: +1 865 574-6973 Facsimile: +1 865 574-1896 E-mail: mailto:[email protected] http://www.y12.doe.gov/sgml/sc34/sc34oldhome.htm Ms. Sara Hafele, ISO/IEC JTC 1/SC 34 Secretariat American National Standards Institute 25 West 43rd Street New York, NY 10036 Tel: +1 212 642-4937 Fax: +1 212 840-2298 E-mail: [email protected]

Response to SC34 N 252

SC34 N 252 contains national body comments from the Japanese and UK national bodies on SC34 N 229, the infoset-based topic map data model proposal. This document is the formal response from Lars Marius Garshol to these comments. Some of the issues raised here will be discussed in more depth on the SC34 mailing list.

Response to comments from Japan

NP processing

The National Body of Japan recognizes the importance of Data Model and Processing Model. For the development of those models in SC34, new projects should be officially added to the SC34 projects; first of all, processing of NP ballots should be started. The National Body of Japan requests the NP ballots.

The editors of the infoset-based model strongly support the idea that this work should be given some form of official status, and preferably end up in the form of a normative technical report, and later as part of a second edition of the standard. We ask that this be discussed on the SC34 mailing list.

Title

The wording "Processing model" should be added in the title of this document, because there are some descriptions on the processing model in "3. XTM processing model".

True. The title has been changed accordingly.

Relationship

There should be a clarification on the relationship between Data Model and Processing Model and a description on the positions of those models among the topic maps related standards.

The editors of the infoset-based model very strongly agree with this, and will create a requirements document for the topic map data model work. This document is intended to provide an appropriate focus for a discussion of these issues, and to record the consensus that emerges from these discussions.

Section 2.1: The topic map information item

The following [topic map] should be added: [topic map] This is the set of topic map information items specified by mergeMap elements.

Topic maps referenced from the XTM document are merged into the topic map information item created by that document and retain no separate existence of their own. This is because the information about which XTM documents made up the topic map in serialized form is considered a lexical detail that is not part of the topic map itself. Like the order of the attributes of an element in an XML document, it is not considered relevant for applications, and is therefore left out.

Applications that wish to retain the information about which parts of the merged topic map came from which XTM documents can use the scope-adding capabilities of the <mergeMap> element.
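The scope-adding approach mentioned above can be illustrated with a minimal Python sketch. It assumes that merging simply unions sets of characteristics and that the added themes are attached to every characteristic coming from the referenced document. The names `Characteristic`, `merge_with_added_themes` and the theme identifier `from-remote-doc` are hypothetical and belong neither to the infoset-based model nor to XTM 1.0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Characteristic:
    """Hypothetical stand-in for a scoped topic characteristic."""
    value: str
    scope: frozenset = frozenset()


def merge_with_added_themes(target, incoming, added_themes):
    """Return the union of `target` and `incoming`, where every incoming
    characteristic has `added_themes` added to its scope."""
    extra = frozenset(added_themes)
    merged = set(target)
    for item in incoming:
        merged.add(Characteristic(item.value, item.scope | extra))
    return merged


# Characteristics built from the referencing document (assumed data).
local = {Characteristic("Oslo")}
# Characteristics built from the document referenced by <mergeMap>.
remote = {
    Characteristic("Oslo"),
    Characteristic("Christiania", frozenset({"historic"})),
}

result = merge_with_added_themes(local, remote, {"from-remote-doc"})
```

In this sketch the unscoped local "Oslo" and the remote "Oslo" remain distinct after the merge, because the added theme makes their scopes differ. That difference is what lets an application tell which document a characteristic came from.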

Section 2.4: Variant information items

The following [parameter] and [variant] should be added: [parameter] This is the set of Locator information items. [variant] This is the set of Variant information items.

The contents of the <parameter> element of XTM can be found in the [scope] property of the variant information item. This property could have been named [parameter] to follow the XTM syntax, but for the sake of clarity and internal consistency the name [scope] was preferred.

While the <variant> element may indeed have nested <variant> children, the exact form of this tree structure is not logically significant. Its only purpose is to simplify the XML form of the topic map through the inheritance of themes in the scopes of the variants. The infoset-based model recognizes this, and section 3.6 describes how the variant tree is flattened into a set of variant information items without loss of significant information.
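The flattening described above can be sketched in Python as follows. This is only an illustration of the idea that each nested variant inherits the themes of its ancestors; the normative rules are those of section 3.6 of the proposal, which are not reproduced here. The class `VariantElement`, the function `flatten` and the theme names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VariantElement:
    """Hypothetical in-memory form of a <variant> element."""
    parameters: frozenset                 # themes given in <parameters>
    name: Optional[str] = None            # variant name, if one is present
    children: list = field(default_factory=list)  # nested <variant> elements


def flatten(variant, inherited=frozenset()):
    """Return a set of (name, scope) pairs, one per named variant.

    The scope of each variant is its own themes plus all themes
    inherited from its ancestor variants.
    """
    scope = inherited | variant.parameters
    items = set()
    if variant.name is not None:
        items.add((variant.name, scope))
    for child in variant.children:
        items |= flatten(child, scope)
    return items


tree = VariantElement(
    frozenset({"display"}),
    "NYC",
    [VariantElement(frozenset({"short"}), "NY")],
)
flat = flatten(tree)
```

The nesting disappears in `flat`, but the inner variant still carries the `display` theme it inherited, so no scope information is lost.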

Section 3: XTM processing model

The <roleSpec> element should be added.

It is true that there is no separate section for this element, but the correct processing of it is described in section 3.11. The text was written in this way since that was considered simpler.

Response to comments from the UK

General comments

The data model fails to support all features of ISO/IEC 13250, and provides information that is not part of an ISO/IEC 13250 information set. The model must be fully conformant with ISO/IEC 13250 rather than being based on a derivative from the international standard for which there are no formally recognized definitions.

It is true that the current proposal fails to support all features of ISO/IEC 13250, and this is an acknowledged defect that the editors intend to fix as work on this document continues.

The inclusion of features not currently part of ISO/IEC 13250 is part of an effort to bring the XTM 1.0 specification into ISO/IEC 13250 with no loss of coherence on the part of the resulting whole. This work is thus part of a larger work that will provide formally recognized definitions for these terms. The editors will create a requirements document in order to document issues like this one for participants in the standardization process who may not have been present at the meetings where such issues were discussed.

No allowance is made for the use of facets as part of the data model.

It is true that no explicit allowance for facets is made; it is, however, intended that the as-yet-unwritten section 4 will demonstrate how <facet> elements can be represented in terms of the existing infoset model. Specifically, it is thought that topics reifying the resources to which the facet values are assigned will be able to represent all the information facets can carry.

Section 1: Purpose and scope

Remove 1st and 2nd sentences of second paragraph (they are unsuitable for an international standard).
Remove "serve many purposes" from end of remaining text in 2nd paragraph.
Remove all material after the 2nd paragraph, especially the last sentence (copyright cannot be claimed on material submitted for use as a proposed international standard).

Accepted. The text in question described the status of the document at the time of its writing and submission to SC34, but as a result of the decisions made by the Montréal meeting it is no longer correct, and will be replaced by text describing the document's current status.

Section 2.2: Topic information items

The required unique identifier of a topic should be distinguished from other potential source locators (such as a count of topics in an XPath statement).

This section describes the structure of a topic information item independent of the possible serialization syntaxes for topic maps. It is up to the specifications describing how to build instances of the model from serialized forms of topic maps to describe how source locators are to be assigned.

What might be inserted in this section is text placing particular requirements on the allowed forms of source locators. The editors are uncertain what forms such requirements might take, however. Feedback on this issue would be welcome.

The set of sort names assigned to a topic should also be part of the information set as it may adjust the order in which topics are presented.

This comment raises some interesting issues. One possible answer to this comment would be to say that the set of sort names assigned to the topic information item can be found in the existing structure by locating all variant information items whose [scope] properties contain topic information items whose [subject indicators] properties contain a locator whose [notation] property is set to "URI" and whose [address] property is set to "http://www.topicmaps.org/xtm/1.0/core.xtm#sort".

It might be that topic map implementations should keep this information directly on the topic information item, however, in order to be able to perform common operations like the sorting of topics more quickly. The question is whether the infoset-based model should describe only the minimum set of information topic map implementations must provide to applications, or whether it should also provide information beyond this. The answer to this question belongs in a requirements document.
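The lookup of sort names through the existing structure, as described above, can be sketched in Python. The classes below are hypothetical simplifications of the information items, and the sketch assumes that variant items are reached through the base names of a topic. Only the sort subject indicator address is taken from the text above.

```python
from dataclasses import dataclass, field

SORT_PSI = "http://www.topicmaps.org/xtm/1.0/core.xtm#sort"


@dataclass(frozen=True)
class Locator:
    notation: str
    address: str


@dataclass(eq=False)
class Topic:
    subject_indicators: set = field(default_factory=set)  # Locator items
    base_names: list = field(default_factory=list)        # BaseName items


@dataclass(eq=False)
class Variant:
    value: str
    scope: list = field(default_factory=list)             # Topic items


@dataclass(eq=False)
class BaseName:
    value: str
    variants: list = field(default_factory=list)          # Variant items


def is_sort_theme(topic):
    """True if the topic has the sort subject indicator."""
    return Locator("URI", SORT_PSI) in topic.subject_indicators


def sort_names(topic):
    """Collect the values of all variants scoped by a sort theme."""
    return [
        variant.value
        for base_name in topic.base_names
        for variant in base_name.variants
        if any(is_sort_theme(theme) for theme in variant.scope)
    ]


sort_theme = Topic(subject_indicators={Locator("URI", SORT_PSI)})
the_hague = Topic(base_names=[
    BaseName("The Hague", [
        Variant("hague, the", [sort_theme]),
        Variant("Den Haag", []),
    ]),
])
names = sort_names(the_hague)
```

An implementation that caches this result on the topic item would trade memory for speed, which is the design question raised in the preceding paragraph.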

Section 2.3/2.4

The set of sort name and display names should not be grouped in a single Variants information item as they have different processes applied to them. They should be provided as separate information sets.

As described above, this model represents sort and display names by means of variant names scoped by topics having particular subject indicators. The model as described thus contains the information necessary to apply these processes.

Section 2.5: Occurrence information items

The last sentence reads "Occurrence information items are considered equal if the values of their [value], [resource], [scope], and [class] properties are equal." Is this true if their two source locators differ?

Two occurrence information items whose [source locators] values differ may be considered equal, yes. This corresponds to two <occurrence> elements with identical contents occurring in the same XTM document, but with different values in their id attributes. It is the opinion of the editors that the occurrence information items created from these two elements should be considered equal.
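A minimal Python sketch of this equality rule follows. It compares the four properties named in section 2.5 and deliberately leaves [source locators] out of the comparison. The class `Occurrence`, its field names and the example addresses are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Occurrence:
    """Hypothetical occurrence information item."""
    value: object
    resource: object
    scope: frozenset
    occ_class: object
    # compare=False excludes this field from equality and hashing.
    source_locators: frozenset = field(default=frozenset(), compare=False)


# Two <occurrence> elements with identical contents but different ids.
first = Occurrence(
    None, "http://example.org/doc.html", frozenset(), "specification",
    frozenset({"http://example.org/map.xtm#occ1"}),
)
second = Occurrence(
    None, "http://example.org/doc.html", frozenset(), "specification",
    frozenset({"http://example.org/map.xtm#occ2"}),
)

are_equal = first == second
distinct_items = {first, second}
```

Under these assumptions the two items compare equal and collapse into one member of a set, even though their source locators differ.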

An interesting question is whether two occurrence information items attached to different topic information items, and whose [value], [resource], [scope], and [class] properties are equal should be considered equal or not. If not, should a [topic] property be added? If such a property should be added, what other classes of information items should get similar parent properties? Feedback on this issue would be most welcome.

(A similar question can be raised in other clauses, but there we are talking about references between topic maps. Here we are talking about references outside of the topic map, where the statement is more unsupportable.)

The [source locators] property contains the locators by which the information items can be addressed, not the locators by which the information items address the world outside the topic map. The resource that is an occurrence of information about a topic is addressed by the [resource] property, not the [source locators] property. It may well be that the text should have made this clearer than it currently does.

Section 2.9: Unique value constraints

The statement "No two information items within the same topic map information set may contain the same locator information item in their [source locators] property." needs to be proved. Why cannot two associations contain the same source locators? Surely a pair of topics can be connected by more than one association.

If two different association information items contain the same source locator in their [source locators] property after the deserialization of an XTM document, this can only have happened if the <association> elements that caused them to be created appeared in the same XML entity and had the same XML ID. This would be a violation of the XTM DTD.

The purpose of the [source locators] property is to allow items in a topic map information set to be addressed by means of locators. If two items have the same source locator, that is no longer possible.
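The uniqueness constraint discussed above can be checked with a short Python sketch. It assumes that each information item is represented as a pair of an identifier and its set of source locators; the function name, the item identifiers and the addresses are hypothetical.

```python
from collections import defaultdict


def duplicate_source_locators(items):
    """Map each source locator used by more than one item to those items.

    `items` is an iterable of (item_id, source_locators) pairs.
    An empty result means the constraint holds.
    """
    owners = defaultdict(list)
    for item_id, locators in items:
        for locator in locators:
            owners[locator].append(item_id)
    return {loc: ids for loc, ids in owners.items() if len(ids) > 1}


items = [
    ("association-1", {"http://example.org/map.xtm#a1"}),
    ("association-2", {"http://example.org/map.xtm#a2"}),
    ("topic-1", {"http://example.org/map.xtm#a1"}),
]
clashes = duplicate_source_locators(items)
```

Here the locator ending in `#a1` no longer identifies a single item, so it cannot be used to address either of the two items that carry it.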

One of the uses for such addressing is reification, as defined by XTM 1.0. It may be that the semantics of reification should be described as part of the infoset-based model, but until there is agreement on the requirements the infoset-based model should fulfill it is impossible to tell whether this text belongs here or elsewhere. The editors of the infoset-based model will create such a requirements document to allow these issues to be addressed.

(The fifth and sixth of the listed constraints also need to be discussed. The latter is incomplete at very least as topics can have the same base name, providing the names at least have a different scope.)

Constraint number five is "A topic information item may not contain in its [source locators] property the same locator information item that can be found in the [subject address] property of another." What this requirement is saying is that a topic information item may not be the subject of a topic in the topic map. Or, it may be that it says that the XML element that gave rise to the information item may not be the subject of a topic in the topic map. Whether this makes sense or not depends on the precise semantics of the <resourceRef> element of XTM 1.0. The editors think it important that this question be addressed in normative text, but are undecided as to where and how this ought to happen.

Constraint number six is "Two topic information items may not contain equal base name information items in their [base names] property." Section 2.3 states that base name information items are equal only if both their [value] and [scope] properties are equal, so constraint number six already allows base names to have equal values provided their scopes are different.
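The interaction between base name equality and constraint number six can be sketched in Python. The sketch assumes that a base name is fully described by its [value] and [scope] properties for the purpose of equality, as stated above; the class, the function and the example data are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class BaseName:
    """Hypothetical base name item; equal only if value and scope match."""
    value: str
    scope: frozenset = frozenset()


def constraint_six_violations(topics):
    """Return pairs of topic ids that share an equal base name.

    `topics` maps a topic id to the set of its BaseName items.
    """
    return [
        (id_a, id_b)
        for (id_a, names_a), (id_b, names_b) in combinations(topics.items(), 2)
        if names_a & names_b
    ]


topics = {
    "paris-france": {BaseName("Paris", frozenset({"france"}))},
    "paris-texas": {BaseName("Paris", frozenset({"texas"}))},
    "paris-duplicate": {BaseName("Paris", frozenset({"france"}))},
}
violations = constraint_six_violations(topics)
```

The first two topics share the value "Paris" without violating the constraint, because their scopes differ. Only the third topic, whose base name has both the same value and the same scope as the first, is reported.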