ISO/IEC JTC 1/SC34 N0327
ISO/IEC JTC 1/SC34
Document Description and Processing Languages
|Title:||Note on issues to be decided on scope|
|Source:||Marc de Graauw, JTC1/SC34|
|Project editor:||Steven R. Newcomb, Michel Biezunski, Martin Bryan|
|Action:||For review and comment|
|Distribution:||SC34 and Liaisons|
|Reply to:||Dr. James David Mason
(ISO/IEC JTC1/SC34 Chairman)
Y-12 National Security Complex
Information Technology Services
Bldg. 9113 M.S. 8208
Oak Ridge, TN 37831-8208 U.S.A.
Telephone: +1 865 574-6973
Facsimile: +1 865 574-1896
E-mail: [email protected]
Ms. Sara Hafele, ISO/IEC JTC 1/SC 34 Secretariat
American National Standards Institute
25 West 43rd Street
New York, NY 10036
Tel: +1 212 642-4937
Fax: +1 212 840-2298
E-mail: [email protected]
Note on issues to be decided on scope
Discussion paper 14 07 2002
This is version 1. (Note: minor editorial modifications were made to this note by Lars Marius Garshol in order to mark it up in XML. One point was also added.)
This note discusses three related SAM issues:
The intention of the author is to give a survey of existing viewpoints, not to add new argumentation. This note will be input for the Montréal 2002 meeting of SC34 WG3. Any new argumentation can be forwarded there by the attendees, or before that meeting on the mailing list by non-attendees. Such posted arguments should be read in conjunction with this note. Since I am off on holiday, I will not be able to change this note before Montréal. The attendees there can do so if they wish.
Several people have signalled the need for a more structured scope. Three discussions merit special attention:
Towards a general theory of scope, Steve Pepper & Geir Ove Grønmo.
None of these are ready to be accepted as a definitive proposal yet.
Another position, of course, is that we should not pursue a more structured scope but leave this up to applications.
Questions the meeting can decide on:
Is a more structured scope to be pursued?
If so, how will this take place: as part of the SAM effort, as part of a TMCL effort, or as a separate effort?
This issue breaks down into several smaller and/or new ones.
There is a proposal to adapt the terminology and replace the terms 'intersection'/'union' with 'any subjects'/'all subjects'.
Intersection and union do not apply to topics, but to sets, so the current usage of intersection and union is not correct. The proposal for the new terminology is:
When we rephrase scope-as-union in this way we get: if we have topic X with name Y which is scoped by topics A and B, then Y is a valid name whenever A applies _or_ B applies.
When we rephrase scope-as-intersection we get: if we have topic X with name Y which is scoped by topics A and B, then Y is a valid name whenever A applies _and_ B applies.
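The two readings can be sketched in a few lines of Python. This is only an illustration of the rephrasing above, not part of any proposal: the function names `valid_any` and `valid_all` are hypothetical, and a "context" is modelled, as an assumption, as the set of subjects that currently apply.

```python
def valid_any(scope, context):
    """'Any subjects' reading (formerly 'union'):
    the name is valid when at least one scoping subject applies."""
    return any(subject in context for subject in scope)


def valid_all(scope, context):
    """'All subjects' reading (formerly 'intersection'):
    the name is valid only when every scoping subject applies."""
    return all(subject in context for subject in scope)


# Name Y of topic X is scoped by topics A and B.
scope_of_y = {"A", "B"}

print(valid_any(scope_of_y, {"A"}))       # True: A applies
print(valid_all(scope_of_y, {"A"}))       # False: B does not apply
print(valid_all(scope_of_y, {"A", "B"}))  # True: both apply
```

The two functions differ only in `any` versus `all`, which is the whole of the distinction the new terminology is meant to name.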
Overview of terminology:
A choice should be made between the 'any subjects'/'all subjects' views.
If it is decided to pursue a more structured scope, quite possibly both views will be supported. From the viewpoint of user requirements the issue is then less important, though still relevant.
Pros of the 'any subjects' view:
It is compatible with ISO 13250:2000 and XTM 1.0.
Cons of the 'any subjects' view:
It contains redundancy. Example: if we say that scope is union, that would mean that the following topic
[finland = "Finland" / norwegian swedish
         = "Finland" / norwegian
         = "Finland" / swedish]
contains two redundancies, and is actually equivalent to
[finland = "Finland" / norwegian swedish]
which would seem to imply that the rules for equivalence and redundancy elimination need to be modified.
ISO 13250:2000 and XTM 1.0 are internally inconsistent. If scope is defined as 'any subjects', the merging rule for the topic naming constraint cannot be correct.
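The redundancy in the Finland example can be made concrete with a small sketch. It assumes one possible redundancy rule for the 'any subjects' view, which the note itself does not spell out: a name assignment is redundant when the same name is also assigned with a scope that is a proper superset. The function name `reduce_any_subjects` and the tuple representation are hypothetical.

```python
def reduce_any_subjects(assignments):
    """Drop name assignments made redundant under the 'any subjects' view.

    Assumed rule (not taken from ISO 13250 or XTM 1.0): an assignment
    (name, scope) is redundant if the same name also occurs with a scope
    that is a proper superset, since that larger scope already makes the
    name valid whenever any subject of the smaller scope applies.
    """
    unique = set(assignments)  # also removes exact duplicates
    return {
        (name, scope)
        for name, scope in unique
        if not any(n == name and scope < other for n, other in unique)
    }


finland = [
    ("Finland", frozenset({"norwegian", "swedish"})),
    ("Finland", frozenset({"norwegian"})),
    ("Finland", frozenset({"swedish"})),
]

# Only the assignment scoped by both norwegian and swedish survives.
print(len(reduce_any_subjects(finland)))  # 1
```

Under this assumed rule the three assignments collapse to the single one shown in the equivalent topic above.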
Pros of the 'all subjects' view:
It is easier to implement (also see discussion of unconstrained scope) and use.
One can still make assertions of the 'any subject' kind, by making the assertion twice and scoping each differently.
It is internally consistent, and requires no changes to merging and redundancy removal rules.
Cons of the 'all subjects' view:
It is not backwards compatible.
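The workaround mentioned among the pros of the 'all subjects' view (making the assertion twice, each with a different scope) can be sketched as follows. The helper `name_is_valid` and the data layout are hypothetical and serve only to illustrate the argument.

```python
def name_is_valid(assertions, name, context):
    """'All subjects' view: a name is valid in a context if some assertion
    of that name has a scope whose subjects all apply in the context."""
    return any(n == name and scope <= context for n, scope in assertions)


# One assertion per scoping topic emulates an 'any subjects' statement.
finland_names = [
    ("Finland", frozenset({"norwegian"})),
    ("Finland", frozenset({"swedish"})),
]

print(name_is_valid(finland_names, "Finland", frozenset({"norwegian"})))  # True
print(name_is_valid(finland_names, "Finland", frozenset({"swedish"})))    # True
print(name_is_valid(finland_names, "Finland", frozenset({"dutch"})))      # False
```

The name is accepted when either Norwegian or Swedish applies, which is the effect the 'any subjects' view would give for a single assertion scoped by both.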
Question to be decided: Do we take the 'any subjects' or 'all subjects' view?
Is a scoped topic map assignment 'not valid' or 'not known to be valid' outside its scope? For example, take the following assertion:
[economy = "economie" / dutch]
Does this mean that "economie" is not a valid basename when the context is English (and not also Dutch), or merely that the basename is not known to be valid?
Discussion on the mailing list seems to be moving towards consensus on the 'not known to be valid' view.
There is a textual proposal:
All topic characteristic assignments have a scope, which defines the extent to which the statement represented by the assignment is valid. Outside the context represented by the scope the statement is not known to be valid. Formally, a scope is composed of a set of subjects that together define the context. That is, the topic characteristic is known to be valid only in contexts where all the subjects in the scope apply.
Or 'any of the subjects', depending on 2b.
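A minimal sketch of the 'not known to be valid' reading, using the 'all subjects' wording of the textual proposal. The function name `validity` and the use of `None` for "not known" are assumptions made for this illustration only.

```python
from typing import Optional


def validity(scope: frozenset, context: frozenset) -> Optional[bool]:
    """Return True when the assignment is known to be valid in the context.

    Outside its scope the assignment is 'not known to be valid', which is
    modelled here as None rather than False: the function never claims
    that the statement is invalid.
    """
    if scope <= context:  # all scoping subjects apply
        return True
    return None


# [economy = "economie" / dutch]
economie_scope = frozenset({"dutch"})

print(validity(economie_scope, frozenset({"dutch"})))    # True
print(validity(economie_scope, frozenset({"english"})))  # None
```

An application that took the 'not valid' view instead would return `False` in the second case; the difference only shows when callers distinguish `None` from `False`.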
Question to be decided: Is this definition accepted? If not, what is?
There is a problem with the ISO definition of scope.
Explanation: The ISO 13250 definition is strange, since the unconstrained scope here is defined to be equivalent to the scope made up of all topics in the Topic Map. I.e. when we have a topic map which consists of two topics:
<topicmap>
  <topic id="NL">
    <topname>
      <basename>Netherlands</basename>
    </topname>
  </topic>
  <topic id="MdG">
    <topname>
      <basename>Marc de Graauw</basename>
    </topname>
  </topic>
</topicmap>
then ISO 13250 would imply that this is equivalent to:
<topicmap>
  <topic id="MdG">
    <topname scope="MdG NL">
      <basename>Marc de Graauw</basename>
    </topname>
  </topic>
  <topic id="NL">
    <topname>
      <basename>Netherlands</basename>
    </topname>
  </topic>
</topicmap>
Since these two topic maps will behave differently when merging with a third one, I do not see how this could be correct. It is also strange to use a topic to scope its own name as a general principle, although this would be correct in some circumstances. Further, the second Topic Map would seem to suggest that the basename "Marc de Graauw" is not valid outside the defined scope (or not known to be valid), while the first Topic Map would seem to suggest that there is no known limitation to the validity of the basename "Marc de Graauw". Though strictly speaking this interpretation is up to applications, it still seems counterintuitive.
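The difference in merging behaviour can be sketched with a deliberately simplified reading of the topic naming constraint: two topics merge when they share a base name in the same scope. Everything here is hypothetical illustration. In particular, the unconstrained scope is represented by an opaque sentinel so that the sketch does not prejudge what it should be.

```python
# Opaque marker for "no scope given"; deliberately not interpreted as a set.
UNCONSTRAINED = "<unconstrained>"


def would_merge(names_a, names_b):
    """Simplified topic naming constraint (an assumption of this sketch):
    two topics merge if they have an identical (base name, scope) pair."""
    return bool(set(names_a) & set(names_b))


# Topic "MdG" as written in the first and in the second topic map above.
mdg_first_map = [("Marc de Graauw", UNCONSTRAINED)]
mdg_second_map = [("Marc de Graauw", frozenset({"MdG", "NL"}))]

# A topic from a third topic map, with the same base name and no scope.
third_map_topic = [("Marc de Graauw", UNCONSTRAINED)]

print(would_merge(mdg_first_map, third_map_topic))   # True
print(would_merge(mdg_second_map, third_map_topic))  # False
```

Under this simplified rule the first topic map merges its "MdG" topic with the third map's topic and the second does not, which is the discrepancy described above.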
There are a few alternatives for the unconstrained scope:
1. It is the empty set. (Only in 'all subjects' view.)
2. It is the set of all topics. (Only in 'any subjects' view.)
3. Leave it undefined.
4. The unconstrained scope is the set of all topics that are 'known' to be relevant for a Topic Map to an application. These would include:
   - all topics in the Topic Map under scrutiny,
   - all topics in Topic Maps that are to be merged.
5. The unconstrained scope is the union of all topics currently used as scopes.
Solution 2 has the problem that the set of all topics cannot be iterated over. This might cause problems with TMQL implementations. (Note that proposals for structured scope might re-introduce this problem.)
Solution 3 is possible if the merging process is well defined. This would probably mean that the nature of the unconstrained scope is decided implicitly, and in that case it is better to decide it explicitly.
Solutions 4 and 5 are similar and look a lot like ISO 13250:2000. They do not seem to avoid the problem sketched above: merging behaviour is different after the unconstrained scope is 'translated' into an explicitly given scope.