ISO/IEC JTC1/SC34 N443

ISO/IEC

ISO/IEC JTC 1/SC34

Information Technology —

Document Description and Processing Languages

Title: Topic Maps — Data Model
Source: Lars Marius Garshol, Graham Moore, JTC1 / SC34
Project: ISO 13250: Topic Maps
Project editor: Steven R. Newcomb, Michel Biezunski, Martin Bryan
Status: Committee draft
Action: For ballot
Date: 2003-11-02
Summary:
Distribution: SC34 and Liaisons
Refer to:
Supersedes:
Reply to: Dr. James David Mason
(ISO/IEC JTC1/SC34 Chairman)
Y-12 National Security Complex
Information Technology Services
Bldg. 9113 M.S. 8208
Oak Ridge, TN 37831-8208 U.S.A.
Telephone: +1 865 574-6973
Facsimile: +1 865 574-1896
E-mail: mailto:[email protected]
http://www.y12.doe.gov/sgml/sc34/sc34oldhome.htm

Mr. G. Ken Holman
(ISO/IEC JTC 1/SC 34 Secretariat - Standards Council of Canada)
Crane Softwrights Ltd.
Box 266,
Kars, ON K0A-2E0 CANADA
Telephone: +1 613 489-0999
Facsimile: +1 613 489-0995
Network: [email protected]

Topic Maps — Data Model

Contents

1 Scope
2 Normative references
3 Terms and definitions
4 The metamodel
4.1 Introduction
4.2 The fundamental types
4.3 Constraints
5 The data model
5.1 Locator items
5.2 Source locators
5.3 The topic map item
5.4 Topic items
5.4.1 Subjects and topics
5.4.2 Identifying subjects
5.4.3 Topic characteristics
5.4.4 Scope
5.4.5 Reification
5.4.6 Properties
5.5 Topic name items
5.6 Variant items
5.7 Occurrence items
5.8 Association items
5.9 Association role items
6 Merging
6.1 General
6.2 Merging topic items
6.3 Merging topic name items
6.4 Merging variant items
6.5 Merging occurrence items
6.6 Merging association items
6.7 Merging association role items
6.8 Merging locator items
7 Published subjects
7.1 General
7.2 The type-instance relationship
7.3 The supertype-subtype relationship
7.4 Variant name scopes
7.5 Topic characteristic types
7.6 Topic map constructs

Foreword

ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

ISO/IEC 13250-2 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information Technology, Subcommittee SC 34, Document Description and Processing Languages.

ISO/IEC 13250 consists of the following parts, under the general title Topic Maps:

  • Part 1: Overview and Basic Concepts
  • Part 2: Data Model
  • Part 3: XML Syntax
  • Part 4: Canonicalization

Introduction

Topic maps are abstract structures that can encode knowledge and connect this encoded knowledge to relevant information resources. Topic maps are organized around topics, which represent subjects of discourse; associations, representing relationships between the subjects; and occurrences, which connect the subjects to pertinent information resources.

Topic maps may be represented in many ways: using topic map syntaxes in files, inside databases, as internal data structures in running programs, and even mentally in the minds of humans. All these forms are different ways of representing the same abstract structure. It is that structure which this part of ISO/IEC 13250 defines, in the form of a data model.

Ed. Note:

What to do with the assoc-role-type-player issue?

Topic Maps — Data Model

1 Scope

NOTE:

This clause defines the scope of this part of ISO/IEC 13250. It should not be confused with the concept of "scope" defined in 5.4.4, which only applies in the context of topic maps.

This part of ISO/IEC 13250 specifies a data model for topic maps. It defines the abstract structure of topic maps, using the information set formalism, and to some extent their interpretation, using prose. The rules for merging in topic maps are also defined, as are some fundamental published subjects.

The purpose of the data model is to define the interpretation of the topic map interchange syntaxes, and to serve as a foundation for the definition of supporting standards for canonicalization, querying, constraints, and so on. All of these standards fall outside the scope of this part of ISO/IEC 13250, however.

2 Normative references

The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

NOTE:

Each of the following documents has a unique identifier that is used to cite the document in the text. The unique identifier consists of the part of the reference up to the first comma.

Unicode, The Unicode Standard, Version 3.0, The Unicode Consortium, Reading, Massachusetts, USA, Addison-Wesley Developer's Press, 2000, ISBN 0-201-61633-5

IETF RFC 2396, Uniform Resource Identifiers (URI): Generic Syntax, Internet Standards Track Specification, August 1998, http://www.ietf.org/rfc/rfc2396.txt

IETF RFC 2732, Format for Literal IPv6 Addresses in URLs, Internet Standards Track Specification, December 1999, http://www.ietf.org/rfc/rfc2732.txt

XML Infoset, XML Information Set, World Wide Web Consortium, 24 October 2001, http://www.w3.org/TR/2001/REC-xml-infoset-20011024

HyTime, ISO 10744:1997: Information Processing — Text and office systems — Hypermedia/Time-based Structuring Language (HyTime), ISO, 1997

3 Terms and definitions

For the purposes of this part of ISO/IEC 13250, the following terms and definitions apply.

3.1
association

a representation of a relationship between one or more subjects

3.2
association role player

a topic participating in an association

3.3
association role

a representation of the involvement of a subject in a relationship represented by an association

3.4
association role type

a topic defining the nature of the participation of an association role player in an association

3.5
association type

a topic representing a subject which describes the nature of the relationship represented by the association

3.6
base name

a name or label for a subject, expressed as a string

3.7
fundamental types

the types of values that are so basic that they have no topic map-defined semantics independent of the context they appear in

3.8
information item

an abstract representation of a topic map construct

3.9
information resource

a resource that can be represented as a sequence of bytes, and thus could potentially be retrieved over a network

3.10
instance

any subject that belongs to a particular type

3.11
locator

a string conforming to some locator notation that references one or more information resources

3.12
locator notation

a definition of the formal syntax and interpretation of a class of locators

3.13
merging

a process applied to topic maps in order to reduce the number of redundant topic map constructs

3.14
occurrence

a representation of a relationship between a subject and an information resource

3.15
occurrence type

a topic attached to an occurrence to describe the nature of the relationship between the subject and the information resource linked by that occurrence

3.16
published subject

any subject for which there exists at least one published subject indicator

3.17
published subject identifier

the subject identifier of a published subject indicator

3.18
published subject indicator

a subject indicator that is published and maintained at an advertised location for the purposes of supporting topic map interchange and mergeability

3.19
reification

making a topic represent a subject that is a topic map construct

3.20
scope

the context within which a topic characteristic assignment is valid

3.21
source locator

a locator assigned to a topic map construct in order to allow it to be referred to

3.22
statement

the assignment of a topic characteristic to a topic

3.23
subject

anything whatsoever, regardless of whether it exists or has any other specific characteristics, about which anything whatsoever may be asserted by any means whatsoever. In particular, it is anything on which the creator of a topic map chooses to discourse.

3.24
subject identifier

a locator that refers to a subject indicator

3.25
subject indicator

an information resource that is referred to from a topic map in an attempt to unambiguously identify the subject of a topic to a human being. Any information resource can become a subject indicator by being referred to as such from within some topic map, whether or not it was intended by its publisher to be a subject indicator.

3.26
subject locator

a locator that refers to the information resource that is the subject of a topic. The topic thus represents that particular information resource; i.e. the information resource is the subject of the topic.

3.27
supertype-subtype relationship

the relationship between a more general type (the supertype) and a specialization of that type (the subtype)

3.28
topic

a symbol used within a topic map to represent some subject, about which the creator of the topic map wishes to make statements

3.29
topic characteristic

a topic name, occurrence, or association role belonging to some topic

3.30
topic characteristic assignment

a statement that a certain topic characteristic belongs to a certain topic

3.31
topic map

a set of topics and associations

3.32
topic name

a name for a topic, consisting of the base form, known as the base name, and variants of that base form, known as variant names

3.33
type

an abstraction that captures characteristics common to a set of subjects

3.34
unconstrained scope

the scope used to indicate that a topic characteristic assignment is considered to have unlimited validity

3.35
variant name

an alternative form of a base name that may be more suitable in certain contexts than the base name itself

4 The metamodel

4.1 Introduction

The metamodel used in this document is the same as that used by the XML Information Set [XML Infoset]. A topic map's information set consists of a number of information items, which are abstract representations of topic map constructs. Every information item is an instance of some information item type, which specifies a number of named properties which the information item must have. Throughout this part of ISO/IEC 13250 the term "information item" refers to the information item types defined in this model, while information items of particular types are referred to as "topic items", "topic name items", and so on.

Ed. Note:

Should we define the term 'information item' at all?

The names of these properties are written in square brackets: [property name], following the convention used in [XML Infoset]. Every property has an associated type that constrains what values it may have. Properties are not allowed to have null as their value unless this is explicitly stated in the definition of the property.

Certain properties in the model are specified as computed properties, which means that they are specified in terms of how their values may be produced from other properties in the model. These properties are specified for reasons of convenience or to better reflect the conceptual model but are strictly speaking redundant.

All types defined in this part of ISO/IEC 13250, whether fundamental types or information item types, have a well-defined test of equality. This equality test is used to avoid duplicate values in properties whose values are of type set. Information items have identity, independent of their values, so items can be compared both by identity and by value. Equality throughout this part of ISO/IEC 13250 should be taken to mean equality according to the rules defined for the types of the values being compared.

UML diagrams (UML[2]) are used in addition to the infoset formalism for purposes of illustration. These diagrams are purely informative, and in cases of discrepancy between the diagrams and normative prose, the prose is definitive.

Issue (sam-conformance):

Should the DM have a conformance section of its own? If so, what does it mean to conform to the DM?

4.2 The fundamental types

The values of information item properties may be either other information items, or values of the fundamental types, which are the types of values that are so basic that they have no topic map-defined semantics independent of the context they appear in. The fundamental types are:

String

Strings are sequences of Unicode scalar values conforming to Unicode Normalization Form C [Unicode].

Strings are equal if they consist of the exact same sequence of Unicode scalar values. This implies that all comparisons are case-sensitive.

Set

Sets are collections of zero or more unordered elements that contain no elements that are equal to each other. In this data model, the elements of a set are always information items.

Two sets are equal unless there exists an element in one set for which no equal element can be found in the other.

Null

Null is used to indicate that properties have no value; it does not necessarily indicate that the value of the property is unknown. In this model null can never be contained in a set.

Null is distinct from all other values (including the empty set and the empty string); it is only equal to itself.
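As an informal illustration of the three fundamental types, the following Python sketch models strings as NFC-normalized `str` values, sets as `frozenset`, and null as `None`. The helper name `to_model_string` is an assumption of this sketch and is not defined by this part of ISO/IEC 13250; plain strings stand in for information items as set elements.

```python
import unicodedata

def to_model_string(text: str) -> str:
    """Return text in Unicode Normalization Form C (hypothetical helper)."""
    return unicodedata.normalize("NFC", text)

# The letter e-acute as one precomposed code point, and as "e" followed by
# a combining acute accent.
precomposed = "\u00e9"
decomposed = "e\u0301"

# The raw sequences differ, but both are the same string once in NFC.
raw_equal = precomposed == decomposed
nfc_equal = to_model_string(precomposed) == to_model_string(decomposed)

# Comparison is by exact sequence of scalar values, hence case-sensitive.
case_equal = to_model_string("Topic") == to_model_string("topic")

# Sets are unordered and never hold two equal elements.
# (Strings stand in for information items here.)
sets_equal = frozenset({"a", "b"}) == frozenset({"b", "a", "a"})

# Null is distinct from the empty string and from the empty set.
null_is_distinct = None != "" and None != frozenset()
```

In this sketch only `nfc_equal`, `sets_equal` and `null_is_distinct` evaluate to true; `raw_equal` and `case_equal` are false.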

4.3 Constraints

The model defined in this part of ISO/IEC 13250 contains not only fundamental types and information item types with named properties, but also constraints on the allowed instances of the model. The purpose of these constraints is to prevent inconsistencies in instances of the data model.

5 The data model

5.1 Locator items

An information resource is a resource that can be represented as a sequence of bytes, and thus could potentially be retrieved over a network. Topic maps can refer to information resources external to themselves in order to make statements about them. These information resources are not part of the topic map; they are only referenced from it.

A locator is a string conforming to some locator notation that references one or more information resources. Locators are always expressed in some locator notation, which is a definition of the formal syntax and interpretation of a class of locators. The definition of locator notations is outside the scope of this part of ISO/IEC 13250.

Locator items represent locators. Locator items have the following properties:

  1. [notation]: A non-empty string. The string is the name of the notation used by this locator. If the string is "URI" the notation is that described in [IETF RFC 2396] and modified in [IETF RFC 2732]; if it is "HyTime" the notation is one of those described in [HyTime]. If it is neither, the first two characters of the string must be "X-"; all values that do not begin with "X-" are reserved.

  2. [reference]: A non-empty string. The string is the locator, whose interpretation and syntax are governed by the value of the [notation] property.

Equality rule: Locator items are equal if they have the same values in their [notation] and [reference] properties.

NOTE:

This part of ISO/IEC 13250 does not require normalization to be applied to the syntactical expressions of locators in order to detect that syntactically different but logically equivalent locators are in fact equivalent. The application of such logic is encouraged, however. As it cannot be guaranteed that normalization will be performed, reliance on normalization is strongly discouraged.
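The locator item and its equality rule can be sketched as a small value class. This is only an illustration under stated assumptions: the class name `LocatorItem` and the constant `DEFINED_NOTATIONS` are inventions of this sketch, and no normalization of the reference string is attempted, so syntactically different references compare as unequal.

```python
from dataclasses import dataclass

# Notation names given a meaning by the text above; any other name must
# begin with "X-". (Constant name is hypothetical.)
DEFINED_NOTATIONS = {"URI", "HyTime"}

@dataclass(frozen=True)
class LocatorItem:
    notation: str
    reference: str

    def __post_init__(self):
        # Both properties must be non-empty strings.
        if not self.notation or not self.reference:
            raise ValueError("[notation] and [reference] must be non-empty")
        # Names outside the defined ones are reserved unless prefixed "X-".
        if (self.notation not in DEFINED_NOTATIONS
                and not self.notation.startswith("X-")):
            raise ValueError(f"notation {self.notation!r} is reserved")

# The generated __eq__ compares [notation] and [reference], which matches
# the equality rule for locator items.
a = LocatorItem("URI", "http://example.org/a")
b = LocatorItem("URI", "http://example.org/a")
c = LocatorItem("URI", "http://EXAMPLE.org/a")  # not normalized, so unequal to a
```

Because the class is frozen, its instances are hashable and can be placed in sets, where `a` and `b` collapse into a single element while `c` remains separate.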

5.2 Source locators

A source locator is a locator assigned to a topic map construct in order to allow it to be referred to. It is not specified how and when source locators are assigned to information items, and source locators may be freely assigned to information items.

NOTE:

In a sense source locators are identifiers for topic map constructs devoid of any specified semantics, and these may be automatically assigned to information items to provide them with an identifier or to identify the origin of the item.

When an instance of the data model is created through deserialization from some topic map syntax, source locators are created that point back to the syntactical constructs that gave rise to the information items in the data model instance. In these cases the source locators will point to the minimal syntactical construct of origin, which means that for topic items created from the XTM syntax, for example, the source locator will point to the originating topic element, rather than the containing topicMap element.

Topic map constructs may have any number of source locators, since when duplicate information items are merged the resulting information item inherits all the source locators of the original information items.

Constraint: Duplicate source locators

It is an error for two different information items to have locator items that are equal in their [source locators] properties, unless they are topic items. If they are topic items they must be merged according to the procedure in 6.2.
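One possible way to check this constraint is sketched below. The function name `check_source_locators`, the tuple layout and the use of plain strings in place of locator items are all assumptions made for illustration. The sketch reports pairs of items that violate the constraint separately from pairs of topic items that would have to be merged.

```python
from collections import defaultdict
from itertools import combinations

def check_source_locators(items):
    """items: list of (item_id, kind, source_locators) tuples.

    Returns (errors, merges): pairs of item ids that violate the
    constraint, and pairs of topic items that must be merged.
    """
    # Index every item by each of its source locators.
    by_locator = defaultdict(list)
    for item_id, kind, locators in items:
        for locator in locators:
            by_locator[locator].append((item_id, kind))

    errors, merges = set(), set()
    for sharing in by_locator.values():
        for (id_a, kind_a), (id_b, kind_b) in combinations(sharing, 2):
            if kind_a == "topic" and kind_b == "topic":
                merges.add((id_a, id_b))   # topics sharing a locator merge
            else:
                errors.add((id_a, id_b))   # anything else is an error
    return errors, merges

items = [
    ("t1", "topic", {"file.xtm#x"}),
    ("t2", "topic", {"file.xtm#x", "file.xtm#y"}),
    ("a1", "association", {"file.xtm#z"}),
    ("o1", "occurrence", {"file.xtm#z"}),
]
errors, merges = check_source_locators(items)
```

With the sample data, the two topics are reported for merging and the association and occurrence sharing a locator are reported as an error.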

5.3 The topic map item

A topic map is a set of topics and associations. Its purpose is to convey information about subjects through the assignment of characteristics to topics representing those subjects. The topic map itself has no meaning or significance beyond its use as a container for the information about those subjects; in particular, the topic map does not represent anything but itself.

NOTE:

However, while the topic map does not represent anything it may be reified in order to make statements about the topic map (that is, the collection of topics and associations) as a whole. These statements may for example provide traditional metadata such as author, version, copyright, or they may reference system metadata such as a schema for the topic map, external documentation of it, and so on.

The topic map item represents the topic map. Topic map items have the following properties:

  1. [topics]: A set of topic items. All the topics in the topic map.

  2. [associations]: A set of association items. All the associations in the topic map.

  3. [reifier]: A topic item, or null. If present, the topic that reifies this topic map construct.

    Computed value: if there exists a topic item in whose [subject identifiers] property can be found a locator item equal to one in the [source locators] property of this information item, that topic item is the value of the [reifier] property. If not, its value is null.

  4. [base locator]: A locator item, or null. A reference to the location where the topic map is stored.

  5. [source locators]: A set of locator items. The source locators of the topic map.

5.4 Topic items

5.4.1 Subjects and topics

A subject can be anything whatsoever, regardless of whether it exists or has any other specific characteristics, about which anything whatsoever may be asserted by any means whatsoever. In particular, it is anything on which the creator of a topic map chooses to discourse.

A topic is a symbol used within a topic map to represent some subject, about which the creator of the topic map wishes to make statements. Topics are used to represent subjects in order to allow statements to be made about the subjects through the assignment of characteristics to the topics that represent them. A statement is the assignment of a topic characteristic to a topic.

Every topic represents one, and only one, subject. The process of merging ensures that whenever two topics are known to represent the same subject they are merged. It may well be, however, that two topics represent the same subject without this being detectable by the rules of this part of ISO/IEC 13250. Merging beyond the minimal merging required by the rules of Clause 6 is freely allowed. Most commonly this will be done by inferring the subject of the topics from their characteristics.

EXAMPLE:

Examples of subjects for which topics may be created are:

  • The moon.

  • The Soviet Union. This subject no longer exists as an organizational unit, but the idea still exists, and so is still a subject.

  • The letters 'A', 'B', 'C', and 'D'. This is a single subject, a set with four elements.

  • Plato's notion of the good. This subject is different from, but related to, "good" in the abstract, and John Stuart Mill's notion of "good".

5.4.2 Identifying subjects

Formal identification of subjects with locators allows topic maps to be merged safely and precisely, and also allows the definition of subjects with semantics that can be implemented in topic map systems. Examples of such subjects can be found in Clause 7.

A subject indicator is an information resource that is referred to from a topic map in an attempt to unambiguously identify the subject of a topic to a human being. Any information resource can become a subject indicator by being referred to as such from within some topic map, whether or not it was intended by its publisher to be a subject indicator.

A subject identifier is a locator that refers to a subject indicator. Topic maps contain only subject identifiers, and consequently it is the subject identifier that is the basis for merging; the subject indicator is ignored during merging.

A subject locator is a locator that refers to the information resource that is the subject of a topic. The topic thus represents that particular information resource; i.e. the information resource is the subject of the topic. Different locators are assumed to identify different information resources.

EXAMPLE:

Consider the URI http://www.topicmaps.org. If given as the subject locator of topic A this would mean that topic A represents the information resource identified by this URI. However, using it as the subject identifier of topic B would mean that B represents what is described in that information resource. At the time of writing this would seem to be the organization known as TopicMaps.Org. (Note: the organization; the real-world institution known by that name. This is different from the subject of A, which is the web page itself.)

Note the uncertainty in the last sentence above. The information resource in question is a subject indicator for topic B, but it was not written to be a subject indicator (that is, it is not a published subject indicator), and so is not entirely unambiguous with respect to what subject it indicates. Nor is it guaranteed to be stable, so at the time of reading it may indicate some other subject, or it may no longer exist.

5.4.3 Topic characteristics

A topic characteristic is a topic name, occurrence, or association role belonging to some topic. A topic characteristic assignment is a statement that a certain topic characteristic belongs to a certain topic. In the data model this is represented by the inclusion of an information item representing a topic characteristic in the value of a property of a topic item. Any topic characteristic assignment constitutes a statement about the subject represented by the topic.

The properties of topic items that do not represent topic characteristics are not statements about the subject; they are statements about the topic. As such they are part of the topic map machinery, rather than statements about the subject represented in the topic map.

5.4.4 Scope

All topic characteristic assignments have a scope. The scope represents the context within which a topic characteristic assignment is valid. Outside the context represented by the scope the assignment is not known to be valid. Formally, a scope is composed of a set of topics that together define the context. That is, the topic characteristic is known to be valid only in contexts where all the subjects in the scope apply.

The unconstrained scope is the scope used to indicate that a topic characteristic assignment is considered to have unlimited validity. In the model this is represented by the empty set.

Precisely how a subject, or a set of subjects, defines a context is not defined by this part of ISO/IEC 13250, but left for those creating topic maps to define as part of the definition of their subjects.

EXAMPLE:

Examples of the use of scope are given below:

  • "Suomi" is the name of the country Finland in Finnish. This corresponds to assigning the topic name "Suomi" to a topic representing Finland, and giving it as scope a topic representing Finnish.

  • According to Norman Davies, World War II started on June 6, 1937 (Davies[1]). This corresponds to creating a topic representing WWII, and assigning to it the string "June 6, 1937" as an occurrence of type "start date", and giving this occurrence as scope a topic representing the person Norman Davies.

  • According to Peter T. Daniels, the Devanagari script is an instance of the script type "abugida," whereas according to William Bright it is an "alphasyllabary". This corresponds to having two "type-instance" associations, each scoped with a topic representing the relevant authority.
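The formal reading of scope as a set of topics can be illustrated with a short sketch. How a context is represented is not defined by this part of ISO/IEC 13250; modelling a context as a set of topics, and the names `valid_in` and `UNCONSTRAINED`, are assumptions of this sketch. Strings stand in for topic items.

```python
# The unconstrained scope is represented by the empty set.
UNCONSTRAINED = frozenset()

def valid_in(scope: frozenset, context: frozenset) -> bool:
    """True if every topic in the scope applies in the given context.

    A False result means only that the assignment is not known to be
    valid in that context, not that it is known to be invalid.
    """
    return scope <= context

# The name "Suomi" scoped by a topic representing Finnish.
suomi_scope = frozenset({"finnish"})

in_finnish_ui = valid_in(suomi_scope, frozenset({"finnish", "formal"}))
in_english_ui = valid_in(suomi_scope, frozenset({"english"}))
unscoped_anywhere = valid_in(UNCONSTRAINED, frozenset({"english"}))
```

The subset test captures the rule that all subjects in the scope must apply, and it makes the empty set valid in every context without a special case.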

5.4.5 Reification

The act of reification is the act of making a topic represent a subject that is a topic map construct. For example, creating a topic that represents an association is reification.

In many cases it is desirable to be able to attach additional information to topic map constructs such as topic names or associations. One may want to give an association occurrences, or to give an occurrence a name. The basic topic map model does not allow this, but through reification this can be done by creating a topic that reifies the topic map construct. The necessary information can then be attached to the reifying topic, and the reification relationship is present in a structured form that can reliably be detected by implementations.

Reification is achieved by giving the reifying topic a subject identifier that refers to the topic map construct that is being reified. In model terms, this means that if an information item has a source locator item that is equal to one of the items in the [subject identifiers] property of a topic, that topic item reifies the information item.
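The model-level rule can be sketched as a lookup from an information item to the topic that reifies it. The function name `find_reifier`, the dictionary layout and the use of plain strings for locators are assumptions made for illustration only.

```python
def find_reifier(item_source_locators, topics):
    """Return the id of the topic that reifies an item, or None.

    topics maps a topic id to that topic's set of subject identifiers.
    """
    for topic_id, subject_identifiers in topics.items():
        # A shared locator between the item's source locators and the
        # topic's subject identifiers establishes reification.
        if item_source_locators & subject_identifiers:
            return topic_id
    return None

topics = {
    "membership-topic": {"map.xtm#assoc1"},
    "finland": {"http://example.org/psi/finland"},
}

# An association whose source locator is used as a subject identifier.
assoc_reifier = find_reifier({"map.xtm#assoc1"}, topics)

# A topic name that no topic refers to is not reified.
name_reifier = find_reifier({"map.xtm#name7"}, topics)
```

Here `assoc_reifier` is the id of the reifying topic, while `name_reifier` is `None`, corresponding to a null reifier.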

NOTE:

One topic cannot reify another. A topic reifying a topic map construct in reality represents the real-world thing that that topic map construct represents. A topic reifying an association really represents the relationship represented by that association, and so if one topic were to reify another that would mean that the topic represents the subject of the other, and so the two would have to merge, since they would have the same subject.

Ed. Note:

Definition contradicts this note...

5.4.6 Properties

Topic items represent topics. Topic items have the following properties:

  1. [topic names]: A set of topic name items. This is the set of topic names assigned to this topic.

  2. [occurrences]: A set of occurrence items. This is the set of occurrences assigned to this topic.

  3. [roles played]: A set of association role items. This is the set of association roles played by this topic.

    Computed value: the set of all association role items whose [player] property value is this topic item.

  4. [subject identifiers]: A set of locator items. The locators referring to the subject indicators of this topic.

  5. [subject locator]: A locator item, or null. The locator, if present, refers to the information resource that is the subject of this topic.

  6. [reified]: An information item, or null. If given, the topic map construct that is the subject of this topic.

    Computed value: if any information item contains in its [source locators] property a locator item equal to one in the [subject identifiers] property of this topic item, that information item is the value of the [reified] property. If no such information item is found the value is null.

  7. [source locators]: A set of locator items. The source locators of the topic.

  8. [parent]: An information item. The topic map containing the topic.

    Computed value: the topic map item whose [topics] property contains this topic item.
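A computed property such as [roles played] can be sketched as a value derived on demand rather than stored. The class and attribute names below are hypothetical, and passing a shared list of all roles to each topic is merely one convenient arrangement for the example.

```python
class AssociationRole:
    def __init__(self, player):
        self.player = player  # the topic playing this role

class Topic:
    def __init__(self, name, all_roles):
        self.name = name
        self._all_roles = all_roles  # shared list of every role in the map

    @property
    def roles_played(self):
        # Derived from the roles' [player] values; nothing is stored on
        # the topic itself, so the value can never become inconsistent.
        return {role for role in self._all_roles if role.player is self}

all_roles = []
puccini = Topic("Puccini", all_roles)
tosca = Topic("Tosca", all_roles)

composer_role = AssociationRole(puccini)
work_role = AssociationRole(tosca)
all_roles.extend([composer_role, work_role])

puccini_roles = puccini.roles_played
```

Because the value is recomputed on each access, adding a further role with `puccini` as its player is reflected the next time `roles_played` is read.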

Equality rule: Two topic items are equal if they have:

  • at least one equal locator item in their [subject identifiers] properties,

  • at least one equal locator item in their [source locators] properties,

  • equal locator items in their [subject locator] properties,

  • an equal locator in the [subject identifiers] property of the one topic item and the [source locators] property of the other, or

  • the same information item in their [reified] properties.
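The five conditions of the equality rule can be sketched as a single predicate. This is an illustration under stated assumptions: the names `TopicItem` and `topics_equal` are hypothetical, strings stand in for locator items, [reified] is held as a plain field rather than computed, and two null values in [subject locator] or [reified] are assumed not to count as a match.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class TopicItem:
    subject_identifiers: set = field(default_factory=set)
    subject_locator: Optional[str] = None
    source_locators: set = field(default_factory=set)
    reified: Optional[object] = None

def topics_equal(a: TopicItem, b: TopicItem) -> bool:
    return bool(
        # a shared subject identifier
        a.subject_identifiers & b.subject_identifiers
        # a shared source locator
        or a.source_locators & b.source_locators
        # equal, non-null subject locators (assumption: nulls do not match)
        or (a.subject_locator is not None
            and a.subject_locator == b.subject_locator)
        # a subject identifier of one equal to a source locator of the other
        or a.subject_identifiers & b.source_locators
        or a.source_locators & b.subject_identifiers
        # the same reified item, compared by identity
        or (a.reified is not None and a.reified is b.reified)
    )

moon_a = TopicItem(subject_identifiers={"http://example.org/psi/moon"})
moon_b = TopicItem(subject_identifiers={"http://example.org/psi/moon",
                                        "http://example.org/psi/luna"})
moon_c = TopicItem(source_locators={"http://example.org/psi/moon"})
page = TopicItem(subject_locator="http://example.org/page")
```

In this sketch `moon_a` equals both `moon_b` (shared subject identifier) and `moon_c` (subject identifier matching a source locator), while `page` equals none of them.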

Constraint: Single reified

The computation that produces the value of the [reified] property must yield a single information item, as topics are required to have only one subject.

Constraint: Topic identity required

All topic items must have a value for at least one of the [subject identifiers], [subject locator], and [source locators] properties that is neither the empty set nor null.

NOTE:

Locators which refer directly to subjects which are not information resources must be used with caution. They should not be used in the [subject locator] property, as this is intended only for references to information resources. Rather, they should be placed in the [subject identifiers] property.

The isbn URN scheme used to identify books (IETF RFC 2288[3]), for example, does not reference information resources, and so should not be put in the [subject locator] property, but instead in the [subject identifiers] property.

5.5 Topic name items

A topic name is a name for a topic, consisting of the base form, known as the base name, and variants of that base form, known as variant names. It is the topic name which is a topic characteristic; the base name and variant names are only parts of the topic name characteristic.

Topic names may have a type, which defines what kind of name the topic name represents. They always have a scope, which defines in what contexts the topic name is an appropriate label for the subject. A topic may have any number of topic names.

A base name is a name or label for a subject, expressed as a string. That is, it is something that identifies the subject (though not necessarily uniquely) and can be used as a label for the subject in user interfaces. The notion of a base name corresponds closely to the common sense notion of a name.

NOTE:

Suitable base names for people, countries, and organizations are their names, while base names for documents, musical works, and movies might be their titles. Base names may have variant names, which are alternative forms of the base name that may be more appropriate in specific contexts. Essentially, a base name is a specialized kind of occurrence.

Topic name items represent topic names. Topic name items have the following properties:

  1. [value]: A string. The base name of the topic.

  2. [type]: A topic item, or null. If given, the topic defining what kind of name is represented by this topic name.

  3. [scope]: A set of topic items. The scope that represents the validity of this topic name as a label for the topic.

  4. [variants]: A set of variant name items. The variant names that are alternative forms of the base name.

  5. [reifier]: A topic item, or null. If present, the topic that reifies this topic map construct.

    Computed value: if there exists a topic item in whose [subject identifiers] property can be found a locator item equal to one in the [source locators] property of this information item, that topic item is the value of the [reifier] property. If not, its value is null.

  6. [source locators]: A set of locator items. The source locators of this topic name.

  7. [parent]: An information item. The topic containing the topic name.

    Computed value: the topic item whose [topic names] property contains this topic name item.

Equality rule: Topic name items are equal if the values of their [value], [type], [scope], and [parent] properties are equal.
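
As an informal illustration, the equality rule for topic name items can be sketched in Python. The class name TopicNameItem, the use of plain strings as stand-ins for topic items, and the example locators are assumptions made for this sketch; the [variants] and [reifier] properties are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class TopicNameItem:
    """Hypothetical stand-in for a topic name item; topics are plain strings."""
    value: str                        # [value]: the base name
    parent: str                       # [parent]: the topic containing the name
    type: Optional[str] = None        # [type]: a topic, or None for null
    scope: FrozenSet[str] = frozenset()
    # [source locators] take no part in the equality rule.
    source_locators: FrozenSet[str] = field(default=frozenset(), compare=False)


a = TopicNameItem("Norge", parent="norway", scope=frozenset({"norwegian"}),
                  source_locators=frozenset({"file:a.xtm#n1"}))
b = TopicNameItem("Norge", parent="norway", scope=frozenset({"norwegian"}),
                  source_locators=frozenset({"file:b.xtm#n7"}))
c = TopicNameItem("Norge", parent="norway", scope=frozenset({"bokmaal"}))

print(a == b)  # equal: same [value], [type], [scope], and [parent]
print(a == c)  # not equal: the scopes differ
```

Because a and b are equal, placing both in the same [topic names] set would trigger the merging procedure in 6.3.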

5.6 Variant items

A variant name is an alternative form of a base name that may be more suitable in certain contexts than the base name itself. The scope of the variant name is the only basis for establishing what variant name is most suitable in any given situation. A variant name may be a string, but it may also be any other kind of information resource.

NOTE:

When choosing a label for a topic, the topic name considered most appropriate should be chosen, and then the form of that topic name best suited for display in that particular context should be chosen, which may be the base name or one of its variants. This part of ISO/IEC 13250 does not constrain the process by which this is done.

7.4 defines some published subjects that may be useful for scoping variant names.

Variant items represent variant names. Variant items have the following properties:

  1. [value]: A string (which may be empty), or null. If given, the variant name.

  2. [resource]: A locator item, or null. If given, a reference to the information resource that is the variant name.

  3. [scope]: A non-empty set of topic items. The scope that describes in what context(s) the variant name may be preferred as a label for the topic.

  4. [reifier]: A topic item, or null. If present, the topic that reifies this topic map construct.

    Computed value: if there exists a topic item whose [subject identifiers] property contains a locator item equal to one in the [source locators] property of this information item, that topic item is the value of the [reifier] property. Otherwise, its value is null.

  5. [source locators]: A set of locator items. The source locators of the variant name.

  6. [parent]: An information item. The topic name containing the variant.

    Computed value: the topic name item whose [variants] property contains this variant item.

Equality rule: Variant items are equal if the values of their [value], [resource], [scope], and [parent] properties are equal.

Constraint: Variant scope

The value of the [scope] property of each variant item must be a proper superset of the value of the [scope] property of the topic name item in its [parent] property; that is, it must contain every topic item in the scope of the topic name and at least one more.

Constraint: Value/resource exclusion

Exactly one of the [value] and [resource] properties must be null.
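
The two constraints on variant items can be checked mechanically. The following Python sketch is an illustration only: the function name check_variant, its parameters, and the representation of topics and locators as strings are assumptions, with None standing for null.

```python
from typing import FrozenSet, List, Optional


def check_variant(value: Optional[str],
                  resource: Optional[str],
                  scope: FrozenSet[str],
                  parent_name_scope: FrozenSet[str]) -> List[str]:
    """Return the names of the violated constraints (empty list if none)."""
    violations = []
    # Variant scope: every topic of the parent's scope plus at least one more.
    # For frozensets, ">" tests for a proper superset.
    if not scope > parent_name_scope:
        violations.append("Variant scope")
    # Value/resource exclusion: exactly one of the two must be null (None).
    # An empty string is a valid, non-null value.
    if (value is None) == (resource is None):
        violations.append("Value/resource exclusion")
    return violations


print(check_variant("", None, frozenset({"en", "sort"}), frozenset({"en"})))
print(check_variant("x", None, frozenset({"en"}), frozenset({"en"})))
```

The first call reports no violations; the second reports the variant scope constraint, since the variant's scope merely equals that of its topic name.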

5.7 Occurrence items

An occurrence is a representation of a relationship between a subject and an information resource. The subject in question is that represented by the topic which contains the occurrence. The information resource may either be a string inside the topic map or an external information resource. Occurrences are essentially a specialized kind of association, where one participant in the association must be an information resource. An occurrence type is a topic attached to an occurrence to describe the nature of the relationship between the subject and the information resource linked by that occurrence.

All occurrences have a scope, which defines the contexts in which the occurrence relationship between the information resource and the subject is valid.

Occurrence items represent occurrences. Occurrence items have the following properties:

  1. [value]: A string, or null. The string, if present, is the information resource the occurrence connects with the subject.

  2. [resource]: A locator item, or null. If given, a reference to the information resource the occurrence connects with the subject.

  3. [scope]: A set of topic items. The scope that describes in what context the occurrence relationship may be considered valid.

  4. [type]: A topic item, or null. The topic that defines the nature of the occurrence relationship.

  5. [reifier]: A topic item, or null. If present, the topic that reifies this topic map construct.

    Computed value: if there exists a topic item whose [subject identifiers] property contains a locator item equal to one in the [source locators] property of this information item, that topic item is the value of the [reifier] property. Otherwise, its value is null.

  6. [source locators]: A set of locator items. The source locators of the occurrence.

  7. [parent]: An information item. The topic containing the occurrence.

    Computed value: the topic item whose [occurrences] property contains this occurrence item.

Equality rule: Occurrence items are equal if the values of their [value], [resource], [scope], [type], and [parent] properties are equal.

Constraint: Value/resource exclusion

Exactly one of the [value] and [resource] properties must be null.

Issue (xml-data-representation):

How should XML data be represented in the data model?

5.8 Association items

An association is a representation of a relationship between one or more subjects. Associations have an association type, a topic representing a subject which describes the nature of the relationship represented by the association.

An association role is a representation of the involvement of a subject in a relationship represented by an association. An association role connects two pieces of information within an association: the association role player, that is, a topic participating in an association, and the association role type, that is, a topic defining the nature of the participation of an association role player in an association.

EXAMPLE:

An example of an association might be the 'authorship' relationship between Henrik Ibsen and the play 'Peer Gynt'. In this relationship there are two roles: Ibsen plays the role of 'author', while 'Peer Gynt' plays the role of 'work'.

Another example might be the 'parenthood' relationship between Hamlet, King Hamlet, and Queen Gertrude. This relationship has three roles: Hamlet plays the role of 'child', the King that of 'father', and the Queen that of 'mother'.

All associations have a scope, which defines the context in which the relationship represented by the association can be considered valid. The scope applies to the assignment of the roles to the topics playing them in the same way as it does to the association as a whole.

Association items represent associations. Association items have the following properties:

  1. [type]: A topic item, or null. If given, the topic that defines the nature of the relationship represented by the association.

  2. [scope]: A set of topic items. The scope that describes in what context the association may be considered valid.

  3. [roles]: A non-empty set of association role items. The association roles for all the topics that participate in this relationship.

  4. [reifier]: A topic item, or null. If present, the topic that reifies this topic map construct.

    Computed value: if there exists a topic item whose [subject identifiers] property contains a locator item equal to one in the [source locators] property of this information item, that topic item is the value of the [reifier] property. Otherwise, its value is null.

  5. [source locators]: A set of locator items. The source locators of the association.

  6. [parent]: An information item. The topic map containing the association.

    Computed value: the topic map item whose [associations] property contains this association item.

Equality rule: Association items are equal if the values of their [scope], [type], and [roles] properties are equal.
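
The two examples given earlier in this clause can be written down as data. The Python sketch below is a simplification made for illustration: the class name AssociationItem is an assumption, topics are plain strings, and each association role is reduced to a (role type, role player) pair rather than a full association role item with its own [parent] property.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

Role = Tuple[str, str]  # (role type, role player), both topics as strings


@dataclass(frozen=True)
class AssociationItem:
    """Hypothetical stand-in; equality uses [type], [roles], and [scope]."""
    type: Optional[str]
    roles: FrozenSet[Role]
    scope: FrozenSet[str] = frozenset()

    def __post_init__(self):
        # [roles] is defined as a non-empty set.
        if not self.roles:
            raise ValueError("[roles] must be a non-empty set")


authorship = AssociationItem(
    type="authorship",
    roles=frozenset({("author", "henrik-ibsen"), ("work", "peer-gynt")}),
)
parenthood = AssociationItem(
    type="parenthood",
    roles=frozenset({("child", "hamlet"),
                     ("father", "king-hamlet"),
                     ("mother", "queen-gertrude")}),
)
# Roles form a set, so the order in which they are listed is irrelevant.
same_authorship = AssociationItem(
    type="authorship",
    roles=frozenset({("work", "peer-gynt"), ("author", "henrik-ibsen")}),
)
print(authorship == same_authorship)
```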

5.9 Association role items

NOTE:

See 5.8 for the definition of the term 'association role'.

Association role items represent association roles. Association role items have the following properties:

  1. [player]: A topic item, or null. If given, the topic that plays this role in the association.

  2. [type]: A topic item, or null. If given, the topic that represents the nature of the involvement of the association role player in the association.

  3. [reifier]: A topic item, or null. If present, the topic that reifies this topic map construct.

    Computed value: if there exists a topic item whose [subject identifiers] property contains a locator item equal to one in the [source locators] property of this information item, that topic item is the value of the [reifier] property. Otherwise, its value is null.

  4. [source locators]: A set of locator items. The source locators of this association role.

  5. [parent]: An information item. The association containing the association role.

    Computed value: the association item whose [roles] property contains this association role item.

Equality rule: Association role items are equal if the values of their [type], [player], and [parent] properties are equal.

6 Merging

6.1 General

Merging is a process applied to topic maps in order to reduce the number of redundant topic map constructs. This clause specifies the situations in which merging must occur; these rules are, however, not sufficient to ensure that all redundant information is removed from a topic map.

Any change to a topic map that causes a set to contain two information items equal to each other must be followed by the merging of those two information items according to the rules given below for the type of information item to which the two equal information items belong.

6.2 Merging topic items

The procedure for merging two topic items A and B is given below.

  1. Create a new topic item C.

  2. Replace A by C wherever it appears in one of the following properties of some information item: [topics], [scope], [type], and [player].

  3. Repeat for B.

  4. Set C's [source locators] property to the union of the values of A and B's [source locators] properties.

  5. Set C's [subject identifiers] property to the union of the values of A and B's [subject identifiers] properties.

  6. Set C's [subject locator] property to the union of the values of A and B's [subject locator] properties.

    NOTE:

    This must be a single locator, since the property only contains a single locator, and it is forbidden to merge topics with different values in this property.

  7. Set C's [topic names] property to the union of the values of A and B's [topic names] properties.

  8. Set C's [occurrences] property to the union of the values of A and B's [occurrences] properties.

Constraint: Subject locator collision

Two topics being merged cannot have different values in their [subject locator] properties.
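
Steps 4 to 8 of the procedure, together with the subject locator collision constraint, can be sketched in Python as follows. The class Topic and the function merge_topics are hypothetical names; locators, topic names, and occurrences are reduced to strings. Steps 2 and 3 (replacing references to A and B elsewhere in the topic map) are omitted, and the sketch assumes that a null [subject locator] does not collide with a non-null one.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Topic:
    """Hypothetical, simplified topic item."""
    source_locators: Set[str] = field(default_factory=set)
    subject_identifiers: Set[str] = field(default_factory=set)
    subject_locator: Optional[str] = None
    topic_names: Set[str] = field(default_factory=set)
    occurrences: Set[str] = field(default_factory=set)


def merge_topics(a: Topic, b: Topic) -> Topic:
    """Create topic C from A and B (steps 1 and 4 to 8 only)."""
    # Constraint: subject locator collision (assumption: None never collides).
    if (a.subject_locator is not None and b.subject_locator is not None
            and a.subject_locator != b.subject_locator):
        raise ValueError("subject locator collision")
    locator = a.subject_locator if a.subject_locator is not None else b.subject_locator
    return Topic(
        source_locators=a.source_locators | b.source_locators,
        subject_identifiers=a.subject_identifiers | b.subject_identifiers,
        subject_locator=locator,
        topic_names=a.topic_names | b.topic_names,
        occurrences=a.occurrences | b.occurrences,
    )


a = Topic(source_locators={"file:a.xtm#ibsen"},
          subject_identifiers={"http://example.org/psi/ibsen"},
          topic_names={"Henrik Ibsen"})
b = Topic(source_locators={"file:b.xtm#t42"},
          subject_identifiers={"http://example.org/psi/ibsen"},
          topic_names={"Ibsen, Henrik"})
c = merge_topics(a, b)
print(sorted(c.topic_names))
```

A complete implementation would also have to merge any topic name or occurrence items that become equal once they share C as their parent, as required by 6.1.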

6.3 Merging topic name items

The procedure for merging two topic name items A and B is given below.

  1. Create a new topic name item C.

  2. Set C's [source locators] property to the union of the values of the [source locators] properties of A and B.

  3. Set C's [value] property to the value of the [value] property of A. B's value is equal to that of A, and need therefore not be taken into account.

  4. Set C's [scope] property to the value of the [scope] property of A. B's value is equal to that of A, and need therefore not be taken into account.

  5. Set C's [variants] property to the union of the [variants] properties of A and B.

  6. Remove A and B from the [topic names] property of the topic item in their [parent] properties, and add C.

6.4 Merging variant items

The procedure for merging two variant items A and B is given below.

  1. Create a new variant item, C.

  2. Set C's [source locators] property to the union of the values of A's and B's [source locators] properties.

  3. Set C's [value] property to the value of A's [value] property. B's value is equal to that of A, and need therefore not be taken into account.

  4. Set C's [resource] property to the value of A's [resource] property. B's value is equal to that of A, and need therefore not be taken into account.

  5. Set C's [scope] property to the value of A's [scope] property. B's value is equal to that of A, and need therefore not be taken into account.

  6. Remove A and B from the [variants] property of the topic name item in their [parent] properties, and add C.

6.5 Merging occurrence items

The procedure for merging two occurrence items A and B is given below.

  1. Create a new occurrence item, C.

  2. Set C's [source locators] property to the union of the values of A's and B's [source locators] properties.

  3. Set C's [value] property to the value of A's [value] property. B's value is equal to that of A, and need therefore not be taken into account.

  4. Set C's [resource] property to the value of A's [resource] property. B's value is equal to that of A, and need therefore not be taken into account.

  5. Set C's [scope] property to the value of A's [scope] property. B's value is equal to that of A, and need therefore not be taken into account.

  6. Set C's [type] property to the value of A's [type] property. B's value is equal to that of A, and need therefore not be taken into account.

  7. Remove A and B from the [occurrences] property of the topic item in their [parent] properties, and add C.

6.6 Merging association items

The procedure for merging two association items A and B is given below.

  1. Create a new association item, C.

  2. Set C's [source locators] property to the union of the values of A's and B's [source locators] properties.

  3. Set C's [scope] property to the value of A's [scope] property. B's value is equal to that of A, and need therefore not be taken into account.

  4. Set C's [type] property to the value of A's [type] property. B's value is equal to that of A, and need therefore not be taken into account.

  5. Set C's [roles] property to the value of A's [roles] property. B's value is equal to that of A, and need therefore not be taken into account.

  6. Remove A and B from the [associations] property of the topic map item in their [parent] properties, and add C.

6.7 Merging association role items

The procedure for merging two association role items A and B is given below.

  1. Create a new association role item, C.

  2. Set C's [source locators] property to the union of the values of A's and B's [source locators] properties.

  3. Set C's [type] property to the value of A's [type] property. B's value is equal to that of A, and need therefore not be taken into account.

  4. Set C's [player] property to the value of A's [player] property. B's value is equal to that of A, and need therefore not be taken into account.

  5. Remove A and B from the [roles] property of the association item in their [parent] properties, and add C.

6.8 Merging locator items

Locator items are not merged. If a locator item being added to a set is equal to one already in it, the new locator item can be discarded as redundant.
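
This behaviour is what an ordinary set type already provides. In the Python sketch below, the class name Locator and the example URIs are assumptions, as is the premise that locator items are equal when their [notation] and [reference] properties are equal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locator:
    """Hypothetical locator item, compared by [notation] and [reference]."""
    notation: str
    reference: str


locators = {Locator("URI", "http://example.org/a")}
# Adding an equal locator leaves the set unchanged; the new item is discarded.
locators.add(Locator("URI", "http://example.org/a"))
locators.add(Locator("URI", "http://example.org/b"))
print(len(locators))
```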

7 Published subjects

7.1 General

A published subject indicator is a subject indicator that is published and maintained at an advertised location for the purposes of supporting topic map interchange and mergeability. A published subject is any subject for which there exists at least one published subject indicator. A published subject identifier is the subject identifier of a published subject indicator.

This clause defines a number of published subjects for core subjects in order to achieve interoperability through consistent behaviour. These subjects are central to this part of ISO/IEC 13250, yet there is no requirement that they be used, and alternative subjects for this functionality may be used instead.

All published subjects defined by this part of ISO/IEC 13250 are distinct.

7.2 The type-instance relationship

A type is an abstraction that captures characteristics common to a set of subjects. Any subject that belongs to a particular type is known as an instance of that type. A type may itself be an instance of another type, and there is no limit to the number of types a subject may be an instance of.

The type-instance relationship is not transitive. That is, if B is an instance of the type A, and B is a type of which C is an instance, it does not follow that C is an instance of A.

The type-instance relationship between two topics can be asserted using an association item that conforms to the following rules:

  • The [type] property must be set to a topic item that has in its [subject identifiers] property a locator item with [notation] set to "URI" and [reference] set to "http://psi.topicmaps.org/sam/1.0/#type-instance".

  • The [roles] property must contain exactly two association role items.

  • One of the association role items in the [roles] property must have its [type] property set to a topic item whose [subject identifiers] property contains a locator item with [notation] set to "URI" and [reference] set to "http://psi.topicmaps.org/sam/1.0/#type". The [player] property will then contain the topic item representing the type.

  • The other association role item in the [roles] property must have its [type] property set to a topic item whose [subject identifiers] property contains a locator item with [notation] set to "URI" and [reference] set to "http://psi.topicmaps.org/sam/1.0/#instance". The [player] property will then contain the topic item representing the instance.

Association items that use one or more of the published subject identifiers defined in this clause, but which do not conform to these structural rules, are not considered to represent type-instance relationships.

Scope applies to this association type in just the same way as it does to any other.
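
The structural rules above lend themselves to a simple conformance check. The Python sketch below is illustrative: the function name type_instance_pair and the data layout are assumptions. A topic is represented only by the set of URIs in its [subject identifiers] property, and each role is a pair of the role type's subject identifiers and the role player.

```python
from typing import List, Optional, Set, Tuple

TYPE_INSTANCE = "http://psi.topicmaps.org/sam/1.0/#type-instance"
TYPE = "http://psi.topicmaps.org/sam/1.0/#type"
INSTANCE = "http://psi.topicmaps.org/sam/1.0/#instance"

Role = Tuple[Set[str], str]  # (subject identifiers of the role type, player)


def type_instance_pair(assoc_type_ids: Set[str],
                       roles: List[Role]) -> Optional[Tuple[str, str]]:
    """Return (type player, instance player), or None if the rules are not met."""
    if TYPE_INSTANCE not in assoc_type_ids or len(roles) != 2:
        return None
    type_idx = [i for i, (ids, _) in enumerate(roles) if TYPE in ids]
    inst_idx = [i for i, (ids, _) in enumerate(roles) if INSTANCE in ids]
    # Exactly one role of each kind, and they must be two different roles.
    if len(type_idx) == 1 and len(inst_idx) == 1 and type_idx != inst_idx:
        return roles[type_idx[0]][1], roles[inst_idx[0]][1]
    return None


roles = [({TYPE}, "playwright"), ({INSTANCE}, "henrik-ibsen")]
print(type_instance_pair({TYPE_INSTANCE}, roles))
```

An association that uses the identifiers but has, for example, three roles yields None, matching the rule that non-conforming associations are not considered to represent type-instance relationships.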

7.3 The supertype-subtype relationship

The supertype-subtype relationship is the relationship between a more general type (the supertype) and a specialization of that type (the subtype). If B is the subtype of A, it follows that every instance of B is also an instance of A. The converse is not necessarily true. A type may have any number of subtypes and supertypes.

The supertype-subtype relationship is transitive, which means that if B is a subtype of A, and C a subtype of B, C is also a subtype of A.

NOTE:

Loops in this relationship are allowed, and should be interpreted to mean that the sets of instances for all types in the loop are the same. This does not, however, necessarily imply that the types are the same.
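
Transitivity and the treatment of loops can be illustrated with a small traversal. In this Python sketch, the function name all_supertypes and the dictionary representation of the asserted supertype-subtype relationships are assumptions made for illustration.

```python
from typing import Dict, Set


def all_supertypes(subtype_of: Dict[str, Set[str]], start: str) -> Set[str]:
    """Return every direct or indirect supertype of start.

    subtype_of maps a type to the set of its directly asserted supertypes.
    The seen set guarantees termination even when the relationship loops.
    """
    seen: Set[str] = set()
    stack = [start]
    while stack:
        current = stack.pop()
        for parent in subtype_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


hierarchy = {"poodle": {"dog"}, "dog": {"mammal"}}
print(sorted(all_supertypes(hierarchy, "poodle")))

loop = {"a": {"b"}, "b": {"a"}}
print(sorted(all_supertypes(loop, "a")))
```

In the looped case each type appears among its own supertypes, which reflects the interpretation that all types in the loop share the same set of instances.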

The supertype-subtype relationship between two types can be asserted using an association item that conforms to the following rules:

  • The [type] property must be set to a topic item that has in its [subject identifiers] property a locator item with [notation] set to "URI" and [reference] set to "http://psi.topicmaps.org/sam/1.0/#supertype-subtype".

  • The [roles] property must contain exactly two association role items.

  • One of the association role items in the [roles] property must have its [type] property set to a topic item whose [subject identifiers] property contains a locator item with [notation] set to "URI" and [reference] set to "http://psi.topicmaps.org/sam/1.0/#supertype". The [player] property will then contain the topic item representing the supertype.

  • The other association role item in the [roles] property must have its [type] property set to a topic item whose [subject identifiers] property contains a locator item with [notation] set to "URI" and [reference] set to "http://psi.topicmaps.org/sam/1.0/#subtype". The [player] property will then contain the topic item representing the subtype.

Association items that use one or more of the published subject identifiers defined in this clause, but which do not conform to these structural rules, are not considered to represent supertype-subtype relationships.

Scope applies to this association type in just the same way as it does to any other.

EXAMPLE:

This means that if 'a' is an instance of 'b' in the scope 'Y' and 'X', and 'b' is a subtype of 'c' in the scope 'Y' and 'Z', then 'a' is an instance of 'c' only in the context where all three topics 'X', 'Y', and 'Z' apply.
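
In set terms, the scope of the inferred relationship in this example is the union of the scopes of the two asserted associations. A minimal Python sketch, with hypothetical variable names and topics represented as strings:

```python
# Scope of "'a' is an instance of 'b'".
instance_scope = frozenset({"X", "Y"})
# Scope of "'b' is a subtype of 'c'".
subtype_scope = frozenset({"Y", "Z"})

# The inferred "'a' is an instance of 'c'" holds only where both apply.
inferred_scope = instance_scope | subtype_scope
print(sorted(inferred_scope))
```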

7.4 Variant name scopes

Sort names are a form of topic name used to sort topics. Sort names are sorted in Unicode code point order. To obtain a desired sort order, the sort names must therefore be chosen so that sorting them in code point order yields that order.

Sort names can be represented by variant items whose [scope] property contains a topic item whose [subject identifiers] property contains a locator item with http://psi.topicmaps.org/sam/1.0/#sort in its [reference] property and "URI" in its [notation] property.
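
The effect of code point ordering, and the reason sort names are needed, can be seen in a short Python sketch. Python compares strings by code point, so the built-in sorted function applies the ordering described above. The labels and sort names used here are invented for illustration.

```python
# Display labels mapped to hypothetical sort names chosen by the author.
sort_names = {
    "Zebra": "zebra",
    "apple": "apple",
    "\u00c5se": "aase",  # "Åse", with U+00C5 as its first character
}

# Sorting the labels directly: "Z" (U+005A) < "a" (U+0061) < "Å" (U+00C5).
by_label = sorted(sort_names)

# Sorting by sort name gives the order the author intended.
by_sort_name = sorted(sort_names, key=lambda label: sort_names[label])

print(by_label)
print(by_sort_name)
```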

Display names are a form of topic name intended to be used as the preferred label for the topic.

Display names can be represented by variant items whose [scope] property contains a topic item whose [subject identifiers] property contains a locator item with http://psi.topicmaps.org/sam/1.0/#display in its [reference] property and "URI" in its [notation] property.

7.5 Topic characteristic types

A unique topic characteristic is one where the value of the characteristic effectively identifies the subject of the topic, which means that if two different topics have the same value for a unique characteristic they also represent the same subject, and so must be merged.

A topic name, occurrence, or association item represents a unique topic characteristic if its [type] property contains a topic item which plays the role of instance in a type-instance association conforming to the structural rule in 7.2 where a topic item plays the role of type, and has a locator representing http://psi.topicmaps.org/sam/1.0/#unique-characteristic (notation "URI") in its [subject identifiers] property.

Topics having unique topic characteristics are to be merged according to the procedure in 6.2 if:

  • each has a topic name item representing a unique topic characteristic in its [topic names] property where two such topic name items have equal values in their [value], [type], and [scope] properties, but not in their [parent] properties,

  • each has an occurrence item representing a unique topic characteristic in its [occurrences] property where two such occurrence items have equal values in their [value], [type], [resource], and [scope] properties, but not in their [parent] properties,

  • each has an association role item whose [parent] property contains an association representing a unique topic characteristic where two such association items have equal values in their [type] and [scope] properties and where all association role items in their [roles] properties are equal, except for the two association role items that triggered the merging, which only need to be equal in their [type] properties.
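
The first of these conditions, concerning topic names, can be sketched as a grouping operation. The Python function below is illustrative only: the name topics_to_merge and the tuple layout are assumptions, and the input is assumed to contain only topic names already known to represent unique topic characteristics.

```python
from collections import defaultdict
from typing import FrozenSet, Iterable, List, Optional, Set, Tuple

# (parent topic, [value], [type], [scope]) of a unique-characteristic name.
Name = Tuple[str, str, Optional[str], FrozenSet[str]]


def topics_to_merge(names: Iterable[Name]) -> List[Set[str]]:
    """Group parent topics whose names agree in [value], [type], and [scope]."""
    groups = defaultdict(set)
    for parent, value, name_type, scope in names:
        groups[(value, name_type, scope)].add(parent)
    # Only groups with more than one distinct parent require merging.
    return [parents for parents in groups.values() if len(parents) > 1]


names = [
    ("t1", "A-1", "part-number", frozenset()),
    ("t2", "A-1", "part-number", frozenset()),
    ("t3", "B-7", "part-number", frozenset()),
]
print(topics_to_merge(names))
```

Each returned group would then be merged pairwise using the procedure in 6.2.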

7.6 Topic map constructs

This clause describes published subjects for the main topic map constructs, useful as types for topics reifying topic map constructs of various kinds.

The subject identifier http://psi.topicmaps.org/sam/1.0/#topic-name (notation "URI") identifies the type of topic names, as described in 5.5.

The subject identifier http://psi.topicmaps.org/sam/1.0/#variant (notation "URI") identifies the type of variant names, as described in 5.6.

The subject identifier http://psi.topicmaps.org/sam/1.0/#occurrence (notation "URI") identifies the type of occurrences, as described in 5.7.

The subject identifier http://psi.topicmaps.org/sam/1.0/#association (notation "URI") identifies the type of associations, as described in 5.8.

The subject identifier http://psi.topicmaps.org/sam/1.0/#association-role (notation "URI") identifies the type of association roles, as described in 5.9.

Bibliography

Davies, Europe: A History, Norman Davies, Oxford University Press, 1996, ISBN 0-19-820171-0

UML, Unified Modeling Language (UML), Version 1.5, Object Management Group, http://www.omg.org/technology/documents/formal/uml.htm

IETF RFC 2288, Using Existing Bibliographic Identifiers as Uniform Resource Names, Informational Memo, February 1998, http://www.ietf.org/rfc/rfc2288.txt