ISO/IEC JTC 1/SC 34N0549


ISO/IEC JTC 1/SC 34

Information Technology --
Document Description and Processing Languages

TITLE: Topic Map Constraint Language
SOURCE: Mr. Graham Moore; Ms. Mary Nishikawa; Mr. Dmitry Bogachev
PROJECT: CD 19756: Information Technology - Topic Maps - Constraint Language (TMCL)
PROJECT EDITOR: Mr. Dmitry Bogachev; Mr. Graham Moore; Ms. Mary Nishikawa
STATUS: Editors' Draft
ACTION: For review and comment
DATE: 2004-10-16
DISTRIBUTION: SC34 and Liaisons
REFER TO: N0548 - 2004-10-16 - Topic Map Constraint Language (TMCL) Requirements and Use Cases
REPLY TO:

Dr. James David Mason
(ISO/IEC JTC 1/SC 34 Chairman)
Y-12 National Security Complex
Bldg. 9113, M.S. 8208
Oak Ridge, TN 37831-8208 U.S.A.
Telephone: +1 865 574-6973
Facsimile: +1 865 574-1896
Network: [email protected]
http://www.y12.doe.gov/sgml/sc34/
ftp://ftp.y12.doe.gov/pub/sgml/sc34/

Mr. G. Ken Holman
(ISO/IEC JTC 1/SC 34 Secretariat - Standards Council of Canada)
Crane Softwrights Ltd.
Box 266,
Kars, ON K0A-2E0 CANADA
Telephone: +1 613 489-0999
Facsimile: +1 613 489-0995
Network: [email protected]
http://www.jtc1sc34.org



Topic Map Constraint Language

1. Introduction

1.1 Overview

Topic Map Constraint Language [TMCL] provides a means to express constraints on topic maps conforming to ISO/IEC 13250:2000 [13250]; these will be over and above the constraints currently defined in the Topic Map Data Model [DM].

2. TMCL

2.1 Introduction

TMCL is designed to allow users to constrain any aspect of the topic map data model [DM]. TMCL adopts TMQL [TMQLreq] as a means to express both the topic map constructs to be constrained and the topic map structures that must exist in order for the constraint to be met. TMCL defines TMCL-Schema and TMCL-Rule. TMCL-Schema provides a type-based model of constraints and is defined in terms of a more abstract model, TMCL-Model. TMCL-Rule provides a generalised model of constraint based on TMQL. For each language a model, semantics and syntax are defined.

Both TMCL-Rule and TMCL-Schema define sets of constraints. In general, these constraints consist of terms that identify the parts of the topic map to be constrained and terms that define the predicate, or truth, that must hold for the topic map to be considered consistent.

2.2 TMCL-Rule and TMCL-Schema Validation Semantics

TMCL-Rule and TMCL-Schema are used to constrain instances of the Topic Map Data Model. If the topic map is valid with respect to the constraints being tested, then validation is said to have succeeded. More formally:

Given:
  TopicMap: tm1
  Schema  : sc1
Then:
  Validate(tm1, sc1) => (true, notifyItem*)  | (false, conflictItem+, notifyItem*)

[NOTE 1] We might want to return a topic map here rather than true or false.

[gdm] Maybe, but not convincing. We should provide machinery that the result data model can be expressed as a topic map but not mandate it.

The Validate function is defined as follows:

  1. Evaluate the 'selector' part of the constraint. This results in one or more bindings of topic map information items to variables.
  2. For each value of each bound variable in the selector that also occurs in the constrainer expression, evaluate the constrainer expression.
  3. If the constrainer expression returns an empty result set, i.e., no matches were found, then the topic map data model instance does not meet the constraint.
  4. Repeat the process for each constraint in the schema. The topic map is valid with respect to the schema if all constraints are met.
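
The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the topic map is modelled as a plain list of dictionaries, the selector and constrainer are ordinary Python callables standing in for TMQL expressions, and the names Constraint, validate, selector and constrainer are hypothetical. The notifyItem part of the result is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple

Binding = Dict[str, object]  # variable name -> topic map information item


@dataclass
class Constraint:
    """Hypothetical stand-in for one constraint: a selector plus a constrainer."""
    name: str
    selector: Callable[[object], Iterable[Binding]]     # step 1: yields bindings
    constrainer: Callable[[object, Binding], Iterable]  # step 2: yields matches


def validate(topic_map, schema: List[Constraint]) -> Tuple[bool, List[str]]:
    """Return (valid, conflicts) for a topic map checked against a schema."""
    conflicts = []
    for constraint in schema:                            # step 4: every constraint
        for binding in constraint.selector(topic_map):   # step 1
            matches = list(constraint.constrainer(topic_map, binding))  # step 2
            if not matches:                              # step 3: empty result set
                conflicts.append(f"{constraint.name}: not met for {binding}")
    return (not conflicts, conflicts)


# Toy topic map: every topic of type "person" must have at least one name.
tm = [
    {"id": "puccini", "type": "person", "names": ["Puccini"]},
    {"id": "unnamed", "type": "person", "names": []},
]
has_name = Constraint(
    name="person-has-name",
    selector=lambda tm: ({"T": t} for t in tm if t["type"] == "person"),
    constrainer=lambda tm, b: b["T"]["names"],
)
valid, conflicts = validate(tm, [has_name])
```

Returning the list of conflicts alongside the boolean mirrors the (false, conflictItem+, ...) form of the Validate result given earlier.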

2.3 TMCL-Model

The following model constructs describe the constraint language that underlies TMCL-Schema (section 2.4). The model is split into three parts: the set of predicate types used to identify the thing to be constrained (TMQL core predicate set), the set of constraints for different TMDM constructs (extended TMQL predicates), and the schema constructs that use predicates and constraints to define a complete schema.

[gdm] The format of the following needs fixing.

Predicates

The predicates define which TMDM construct is to be constrained. The constructs here are a set of TMQL core predicates, designed so that they can be composed together to form complex expressions. There is no OR or NOT: an extra schema rule needs to be defined for OR, and NOT needs some consideration.

[ISSUE 1] Are the TMQL core predicates built into TMCL?

[mn] See section 4 in A Proposed Foundational Model for Topic Maps. Lars Marius Garshol. ISO/IEC JTC 1/SC 34N0529.

[NOTE 2] The set of TMQL core predicates is designed so that the predicates can be composed together to form complex expressions. There is no OR or NOT. TMCL needs to define an extra schema rule for OR. NOT needs some consideration.

topic-predicate(loc-predicate* srclocs, 
                loc-predicate* resrefs,
                loc-predicate* subjinsd,
                basename-predicate* names,
                occurrence-predicate* occurrences,
                type-predicate* types)
               
loc-predicate(RegEx match-rule)

type-predicate(topic-predicate type)

basename-predicate(type-predicate type,
                   scope-predicate scope,
                   value-predicate value,
                   variant-predicate* variants)

variant-predicate(scope-predicate scope,
                  value-predicate value)
            
occurrence-predicate(type-predicate type,
                     scope-predicate scope,
                     value-predicate value | loc-predicate resref)

scope-predicate(topic-predicate* topics)

association-predicate(role-predicate* roles,
                      type-predicate type,
                      scope-predicate scope)

role-predicate(topic-predicate role-type,
               topic-predicate role-player)

value-predicate(RegEx match-rule)

[mn] There is value-predicate(RegEx match-rule) and loc-predicate(RegEx match-rule). Would the RegEx match-rule be different for these? This needs further explanation.
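
To illustrate how the predicates compose by conjunction only (no OR or NOT), the following Python sketch models a reduced topic-predicate. It is a simplification under stated assumptions: only srclocs and types are covered, a type-predicate is reduced to a plain type identifier instead of a nested topic-predicate, a RegEx match-rule is assumed to match the whole locator, and the class names are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Topic:
    """Toy topic: source locators and type identifiers only."""
    srclocs: List[str] = field(default_factory=list)
    types: List[str] = field(default_factory=list)


@dataclass
class LocPredicate:
    match_rule: str  # regular expression; assumed to match the whole locator

    def matches(self, locator: str) -> bool:
        return re.fullmatch(self.match_rule, locator) is not None


@dataclass
class TopicPredicate:
    srclocs: List[LocPredicate] = field(default_factory=list)
    types: List[str] = field(default_factory=list)  # simplified type-predicates

    def matches(self, topic: Topic) -> bool:
        # Every sub-predicate must be satisfied: composition is pure conjunction.
        locs_ok = all(
            any(p.matches(loc) for loc in topic.srclocs) for p in self.srclocs
        )
        types_ok = all(t in topic.types for t in self.types)
        return locs_ok and types_ok


predicate = TopicPredicate(
    srclocs=[LocPredicate(r"http://example\.org/.*")],
    types=["composer"],
)
puccini = Topic(srclocs=["http://example.org/puccini"], types=["composer", "person"])
tosca = Topic(srclocs=["http://example.org/tosca"], types=["opera"])
selected = [t for t in (puccini, tosca) if predicate.matches(t)]
```

An empty predicate matches every topic in this sketch, which is one possible reading of omitted sub-predicates.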

Constraint Constructs

The following set of constraint constructs utilise the predicates defined above to complete the constraint model.

 

topic-constraint(loc-constraint* srclocs, 
                 loc-constraint* resrefs,
                 loc-constraint* subjinsd,
                 basename-constraint* names,
                 occurrence-constraint* occurrences,
                 type-constraint* types,
                 topic-predicate* one-of,
                 topic-predicate same-topic-as,
                 play-role-constraint playsrole,
                 role-constraint typesrole)
          
loc-constraint(RegEx match-rule, Int maxcard, Int mincard)

value-constraint(RegEx match-rule)

type-constraint(topic-predicate type)

basename-constraint(type-predicate type,
                    scope-constraint scope,
                    value-constraint value,
                    variant-constraint* variants,
                    Int cardMin,
                    Int cardMax,
                    String* OneOf)

variant-constraint(scope-constraint scope,
                   value-constraint value)            

occurrence-constraint(type-predicate type,
                      scope-constraint scope,
                      value-constraint value | loc-constraint resref,
                      Int cardMin,
                      Int cardMax,
                      String* OneOf)

scope-constraint(topic-predicate* topics)

association-constraint(role-constraint* roles,
                       type-constraint type,
                       scope-constraint scope)
                       
role-constraint(topic-predicate role-type,
                topic-predicate* all-players-from,
                topic-predicate some-players-from,
                topic-predicate one-of,
                Int cardMin,
                Int cardMax)

play-role-constraint(topic-predicate role-type,
                     topic-predicate association-type,
                     scope-constraint scope,
                     Int cardMin,
                     Int cardMax,
                     role-constraint other-players)
                     

The following schema definitions allow constraints to be constructed consisting of predicates for selection and constraints for assertion.

topic-schema(topic-predicate, topic-constraint)
association-schema(association-predicate, association-constraint)

[gdm] I really thought that we could simplify this further, but the model is so big that this seems like the minimum of constructs required. A reference model would greatly aid the process here.

2.4 TMCL-Schema

This proposal is a variant of Ontopia Schema Language [OSL] and TMSchema [TMS].

[gdm] I see this more and more as a possible syntax and that the model above more clearly expresses and is more expressive than what follows.

The following definition is a language that can be used to constrain classes of topics and associations.

TopicMap Schema

Used to group together a collection of constraints.

  TopicMapSchema:
    TopicSchema *
    AssociationSchema *
    SameTopicAs TMQLExp* 

Topic Identification

Topic Identification is used to identify exactly one topic.

  TopicIdentification:
    SrcLocators      # URI *  
    SubjectIndicator # URI *  
    SubjectAddress   # URI     
    TMQLExp          # String 

Set of topic identification constructs.

TopicSet:
  TopicIdentification *

Topic Schema

TopicSchema:
  Type                       # TopicIdentification
  SubjectAddressSchema      
  SubjectIndicatorSchema* 
  BaseNameSchema *
  InternalOccurrenceSchema *
  ExternalOccurrenceSchema *
  OneOfSchema?
  SameTopicAsSchema?
  PlayRoleSchema *
  RoleSchema *

Same Topic As

Standardized mechanism for capturing additional merging rules.

SameTopicAsSchema:
  Matches TMQLExp   // select $A where 
                       connected-to($A, $area), 
                       connected-to($this, $area), 
                       $A\=$this,
                       gets-benefits($A, $lots), 
                       gets-benefits($this, $lots),
                       $B\=$this, 
                       has-doctor($A, ?Y),
                       has-doctor($this, ?Y),
                       $C\=$this                   
                       // $this is already bound.   

SubjectIndicator Schema

Constrains the cardinality and shape of subject indicator locators.

SubjectIndicatorSchema:
  cardMin           # Integer
  cardMax           # Integer
  match             # Regular Expression

SubjectAddress Schema

Constrains the cardinality and shape of the subject address locator.

SubjectAddressSchema:
  cardMin           # Integer
  cardMax           # Integer
  match             # Regular Expression

Base Name Schema

Constrains topic names.

BaseNameSchema:
  type              # TopicIdentification
  scope             # TopicSet
  cardMin           # Integer
  cardMax           # Integer
  dataType          # xsd and custom xml schemas
  one of            # String*
  match             # Regular Expression*
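
As a rough illustration of how a BaseNameSchema might be checked, the Python sketch below covers cardMin, cardMax, one of and match. It rests on several assumptions not stated in this draft: the names passed in have already been filtered by type and scope, a cardMax of None means unbounded, an empty one of or match list means no restriction, and a name satisfies match if it fully matches at least one of the expressions. dataType is omitted, and the method name check is hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BaseNameSchema:
    card_min: int = 0
    card_max: Optional[int] = None               # assumption: None = unbounded
    one_of: List[str] = field(default_factory=list)
    match: List[str] = field(default_factory=list)

    def check(self, names: List[str]) -> List[str]:
        """Return a list of problems; an empty list means the names conform."""
        problems = []
        if len(names) < self.card_min:
            problems.append(f"expected at least {self.card_min} name(s), found {len(names)}")
        if self.card_max is not None and len(names) > self.card_max:
            problems.append(f"expected at most {self.card_max} name(s), found {len(names)}")
        for name in names:
            if self.one_of and name not in self.one_of:
                problems.append(f"{name!r} is not one of the allowed values")
            if self.match and not any(re.fullmatch(p, name) for p in self.match):
                problems.append(f"{name!r} matches none of the expressions")
        return problems


# Exactly one name, which must start with an upper-case letter.
schema = BaseNameSchema(card_min=1, card_max=1, match=[r"[A-Z].*"])
ok = schema.check(["Puccini"])
missing = schema.check([])
bad = schema.check(["puccini", "Verdi"])
```

Collecting problems instead of stopping at the first one keeps the result usable as input for conflict reporting.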

InternalOccurrenceSchema

Constrains internal occurrences.

InternalOccurrenceSchema:
  type              # TopicIdentification
  scope             # TopicSet
  cardMin           # Integer
  cardMax           # Integer
  dataType          # xsd and custom schemas
  one of            # String*
  match             # Regular Expression *  

ExternalOccurrenceSchema

Constrains external occurrences.

ExternalOccurrenceSchema:
  type              # TopicIdentification
  scope             # TopicSet
  cardMin           # Integer
  cardMax           # Integer
  one of            # URI*
  match             # Regular Expression *  

AssociationSchema

Constrains classes of association.

AssociationSchema:
  type              # TopicIdentification
  scope             # TopicSet
  RoleSchema+

Role Schema

Constrains the nature of roles on associations of specific types.

RoleSchema:
  roleType          # TopicIdentification
  cardMin           # Integer
  cardMax           # Integer
  allPlayersFrom    # TopicSet  // list of Types
  somePlayersFrom   # TopicSet  // list of Types
  oneOf             # TopicSet  // list of topics
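
The Python sketch below shows one possible reading of a RoleSchema check for a single association. The semantics are assumptions inferred from the field names and are not fixed by this draft: allPlayersFrom requires every player to be an instance of at least one listed type, somePlayersFrom requires at least one such player, and oneOf restricts players to the listed topics. Topics and types are plain string identifiers, and check_roles and types_of are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class RoleSchema:
    role_type: str
    card_min: int = 0
    card_max: Optional[int] = None                  # assumption: None = unbounded
    all_players_from: Set[str] = field(default_factory=set)   # type ids
    some_players_from: Set[str] = field(default_factory=set)  # type ids
    one_of: Set[str] = field(default_factory=set)             # topic ids


def check_roles(schema: RoleSchema, players: List[str],
                types_of: Dict[str, Set[str]]) -> List[str]:
    """Check the players of schema.role_type in one association.

    players: ids of the topics playing the role; types_of: topic id -> type ids.
    """
    problems = []
    n = len(players)
    if n < schema.card_min or (schema.card_max is not None and n > schema.card_max):
        problems.append(f"{schema.role_type}: {n} player(s) outside cardinality bounds")
    if schema.all_players_from:
        for p in players:
            if not types_of.get(p, set()) & schema.all_players_from:
                problems.append(f"{p}: not an instance of any allowed type")
    if schema.some_players_from and not any(
        types_of.get(p, set()) & schema.some_players_from for p in players
    ):
        problems.append(f"{schema.role_type}: no player from the required types")
    if schema.one_of:
        for p in players:
            if p not in schema.one_of:
                problems.append(f"{p}: not one of the allowed topics")
    return problems


composer_role = RoleSchema("composer", card_min=1, card_max=1,
                           all_players_from={"person"})
types_of = {"puccini": {"person"}, "tosca": {"opera"}}
good = check_roles(composer_role, ["puccini"], types_of)
wrong_type = check_roles(composer_role, ["tosca"], types_of)
```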

Play Role Schema

Constrains the nature of participation in associations.

PlayRoleSchema:
  roleType          # TopicIdentification
  associationType   # TopicIdentification
  scope             # TopicSet
  cardMin           # Integer
  cardMax           # Integer
  otherPlayers      # RoleSchema*

One Of

One of is used to define a controlled vocabulary.

OneOfSchema:
  one-of            # TopicSet  

Mapping to TMCL Model

[gdm] Once we agree the model looks OK and TMCL-Schema is more solid, we need to define this.

Syntax for TMCL-Schema

To be done...

2.5 TMCL-Rule

TMCL-Rule allows a set of assertions about topic maps to be declared. It is a rule-based language that leverages TMQL constructs for specifying conditions and assertions.

Note: Relation to Schematron

TMCL-Rule is close to ISO/IEC 19757-3 (Document Schema Definition Languages (DSDL) - Part 3: Rule-based validation - Schematron) [DSDL]. Schematron allows validation rules to be defined for XML documents. TMCL-Rule leverages experience from other rule-based languages and allows constraints to be specified based on TMDM.

RuleSchemaItem

The RuleSchemaItem collects together a set of rules that can be used to validate a topic map. There is exactly one schema information item in each information set.

RuleSchemaItem: 
   ID                 #defines schema ID
   Name?              #defines schema Name
   RuleItem*          #set of rules
   DiagnosticItem*    #provides more specific details for assertions and reports

RuleItem

The RuleItem defines a set of assertions about a topic map. It consists of an optional context item, optional let items, and one or more assertion and/or report items.

RuleItem: 
   ID             #defines rule ID 
   Name?          #defines rule Name
   ContextItem?   #locates topic map data model information items to be constrained.
   LetItem*       #introduces local variables which can be used in assertions and report items 
   AssertItem*    #if test is negative AssertItem generates ConflictItem
   ReportItem*    #if test is positive ReportItem generates NotifyItem

ContextItem

The ContextItem is used to locate the topic map data model information items to be constrained. It allows assertions to be expressed in the form "forevery X,Y... where P(X,Y...) satisfies Q(X,Y,...)". Variables defined in the ContextItem can be used in LetItems, AssertItems and ReportItems. The ContextItem is optional. If a rule does not have a ContextItem, its assertions are evaluated in the context of the full topic map.

ContextItem: 
   ForEvery+      #list of variables
   Where          #TMQL predicate expression with free variables from ForEvery list

LetItem

The LetItem introduces a local variable which can be used in AssertItems and ReportItems.

LetItem: 
   Variable       #variable which receives value
   Where          #TMQL predicate expression which generates value

AssertItem

If a rule has a ContextItem, then an AssertItem is an assertion about the topic map information items located by the ContextItem. In this case the assertion can use variables defined in the ContextItem. If a rule does not have a ContextItem, assertions are evaluated in the context of the full topic map. If the test is negative, the AssertItem generates a ConflictItem.

AssertItem: 
   Test           #TMQL expression which can include variables from ContextItem and LetItems and returns true or false
   Message        #string which can include variables(and simple path expressions) from ContextItem and LetItems
   Diagnostics    #list of DiagnosticItem IDs, is used for detailed notification  

Note 1:
Rules without a ContextItem allow constraints defined on the full topic map to be expressed.

Example 1: The topic map must have more than 20 topics of "musician" type.

Example 2: The topic map must have a topic for a composer who was born in Milan.

Note 2:
If a constraint can be formulated in the form "forevery X,Y... where P(X,Y...) satisfies Q(X,Y,...)", the preferable form of the rule includes an explicit ContextItem.

ReportItem

If a rule has a ContextItem, then a ReportItem is an assertion about the topic map information items located by the ContextItem. In this case the assertion can use variables defined in the ContextItem. If a rule does not have a ContextItem, report assertions are evaluated in the context of the full topic map. If the test is positive, the ReportItem generates a NotifyItem.

ReportItem: 
   Test           #TMQL expression which can include variables from ContextItem and LetItems and returns true or false
   Message        #string which can include variables(and simple path expressions) from ContextItem and LetItems
   Diagnostics    #list of DiagnosticItem IDs (with 0 or more parameters) is used for detailed notification  

DiagnosticItem

A DiagnosticItem provides more specific details for assertion and report notifications. It can include variables and simple path expressions. Variables receive values when the diagnostic item is called during rule evaluation. A DiagnosticItem can also provide recommendations for conflict resolution.

DiagnosticItem: 
   Parameter*    #list of variables
   Message       #string scoped by language which can include variables and simple path expressions

ConflictItem

ConflictItem: 
   RuleID             # reference to rule which generates conflict
   TestMessage        # string representing Test item from assertion
   ContextBinding*    # defines binding for variables from ContextItem 
   Message            # string
   DiagnosticMessage* # string

NotifyItem

NotifyItem: 
   RuleID             # reference to rule which generates report
   TestMessage        # string representing Test item from assertion
   ContextBinding*    # defines binding for variables from ContextItem 
   Message            # string
   DiagnosticMessage* # string
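
The evaluation of a single rule, as described by the items above, might be sketched in Python as follows. This is a hedged illustration only: TMQL expressions are replaced by plain Python callables, the topic map is a list of dictionaries, LetItems and DiagnosticItems are omitted, and the function name evaluate is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Binding = Dict[str, object]
Test = Callable[[object, Binding], bool]  # stand-in for a TMQL test expression


@dataclass
class AssertItem:
    test: Test
    message: str


@dataclass
class ReportItem:
    test: Test
    message: str


@dataclass
class RuleItem:
    id: str
    context: Optional[Callable[[object], Iterable[Binding]]] = None
    asserts: List[AssertItem] = field(default_factory=list)
    reports: List[ReportItem] = field(default_factory=list)


@dataclass
class ConflictItem:
    rule_id: str
    context_binding: Binding
    message: str


@dataclass
class NotifyItem:
    rule_id: str
    context_binding: Binding
    message: str


def evaluate(rule: RuleItem, topic_map) -> Tuple[List[ConflictItem], List[NotifyItem]]:
    conflicts, notifications = [], []
    # Without a ContextItem the tests run once, against the full topic map.
    bindings = rule.context(topic_map) if rule.context is not None else [{}]
    for binding in bindings:
        for a in rule.asserts:
            if not a.test(topic_map, binding):   # negative test -> ConflictItem
                conflicts.append(ConflictItem(rule.id, binding, a.message))
        for r in rule.reports:
            if r.test(topic_map, binding):       # positive test -> NotifyItem
                notifications.append(NotifyItem(rule.id, binding, r.message))
    return conflicts, notifications


# Rule with a context: every composer must have a recorded place of birth.
tm = [
    {"id": "puccini", "type": "composer", "born_in": "Lucca"},
    {"id": "verdi", "type": "composer", "born_in": None},  # missing in this toy map
]
rule = RuleItem(
    id="composer-birthplace",
    context=lambda tm: ({"C": t} for t in tm if t["type"] == "composer"),
    asserts=[AssertItem(lambda tm, b: b["C"]["born_in"] is not None,
                        "composer must have a place of birth")],
    reports=[ReportItem(lambda tm, b: b["C"]["born_in"] == "Lucca",
                        "composer was born in Lucca")],
)
conflicts, notifications = evaluate(rule, tm)

# Rule without a context, in the style of Example 1.
count_rule = RuleItem(
    id="enough-musicians",
    asserts=[AssertItem(lambda tm, b: sum(t["type"] == "musician" for t in tm) > 20,
                        "topic map must have more than 20 musicians")],
)
count_conflicts, _ = evaluate(count_rule, tm)
```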

Syntax for TMCL-Rule

[gdm] To be done.

2.6 Combining TMCL-Rule and TMCL-Schema

TMCL-Rule and TMCL-Schema expressions can be combined in the same TMCL schema. It is also possible to insert rules inside of type descriptions. In this case rules have simplified syntax.

Issues

  1. What to do with variants? - drop in under TopicNameSchema [mn] Drop it under TopicNameSchema for now, and request other ideas.
  2. merging of schemas.

    Schema Merging: Given two topic map schemas that constrain topics of the same type, the new components are the union of all the schema components.

    Explicit notion of conflict. Leave resolution open.

    Define notion of conflict explicitly for each schema construct.

    Need use cases for schema merging. ANDing or OR or some combination (selective) merging.

  3. Does merging of topic maps merge schemas? [mn] May need to define exactly when the merging takes place for TMs and schemas for those TMs.

    Connection from TM to Schema. SuperClass Subclass in Topic Map, constraints in schema. Dependency issues. 'Informative' connection between a TM and schema. Schema for connections between map topic and schema. Separate doc on how to connect them. Annex? [mn] Seems too important for an annex.

  4. computed properties / inferences - when to execute
  5. Subclass-Of influence - All constraints on super class apply to subclass - tightening is allowed by sub type. - Ability to open up and not always tighten. - Use of exceptions? - Use of named constraints to remove restriction. New can then override.
  6. OWL
  7. disjoint from (add in some form)
  8. Same individual as - not in tmcl.
  9. check other OWL pieces.
  10. Virtual Ontology, OWL ont inclusion, use case - OWL compatibility.
  11. Topic Merging - conflicts identified - resolution open.
  12. NOT - maybe
  13. List of allowed types in a map: psi for topic type plus use of strict on map related to tmcl in tms
  14. Default values for properties. Inference/computed property.
  15. Restrict that a given topic can only play a given role once in tm. Wife in Married. - role restrictions back into topic. Deal with conflicts.
  16. Is Instance of Topic Allowed in scope - needed topic type set, with instances.
  17. Closed collection Open collections.
  18. Which types can have a given occurrence - occurrence schema
  19. What can be reified. Again PSIs for TM constructs. Class must be disjoint - default.
  20. Syntax - In XTM files / in with the ontology topic map? A nice textual syntax: CSS-like, TMQL?, AsTMa=, LTM. [mn] I think it would be good to have two syntaxes in the standard: XML and LTM.
  21. Is there a core model?, hope so.
  22. Overall picture - 1st draft.

2.7 Topic Map Representation of Constraints

[gdm] To be done.

2.8 Topic Map Schema References

TMCL enables topic map authors to specify a schema to which the topic map is conformant. This is achieved by reifying the topic map with a topic and assigning an occurrence to that topic of type TMCLSchemaReference.

The following PSI is used to denote the occurrence type for schema references.

http://www.isotopicmaps.org/tmcl/tmcl.html#TMCLSchemaReference

This is used to type an occurrence on a topic that reifies the topic map in order to reference the schema for this topic map instance. The value of the occurrence must reference a valid TMCL XML representation or a topic of type Schema.

2.9 Schema Composition

Schema composition is the ability to take two or more schemas and compose them into a single schema. Given that a schema consists of a set of constraints, schema composition merely takes all the constraints from all schemas being composed and returns a single schema that consists of all constraints. Applications are free to identify and remove redundant constraints and to throw exceptions should any constraints be contradictory.

More formally:

Given
  Schema : s1, s2
  Constraint : c1, c2, c3, c4, c5
  s1 := {c1, c2, c3}  
  s2 := {c4, c5}
Then
  Compose(s1, s2) => s3
  s3 := {c1, c2, c3, c4, c5}
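
A minimal Python sketch of Compose, assuming a schema can be represented as a set of hashable constraint identifiers (the function name compose and the string identifiers are illustrative only). Set union also drops constraints that are literally identical, which is one of the redundancy removals an application is free to perform; contradiction detection is not attempted here.

```python
from typing import FrozenSet, Iterable


def compose(*schemas: Iterable[str]) -> FrozenSet[str]:
    """Return a schema containing every constraint of every input schema."""
    result = set()
    for schema in schemas:
        result.update(schema)
    return frozenset(result)


s1 = {"c1", "c2", "c3"}
s2 = {"c4", "c5"}
s3 = compose(s1, s2)
```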

3. Summary

[gdm] To be done.

4. References

  1. [13250] ISO/IEC 13250:2002 Topic Maps. ISO, Geneva, 2002.
  2. [TMCL] Summary of Voting on SC 34 N 226 - CD Ballot for Topic Map Constraint Language (New JTC 1 NP Number - ISO/IEC 19756). ISO/IEC JTC1 SC34 N0259. 4 October 2002.
  3. [TMQLreq] TMQL requirements (1.2.0). ISO/IEC JTC1 SC34 N0448. Lars Marius Garshol, Robert Barta. 7 November 2003.
  4. [DM] Topic Maps - Data Model. ISO/IEC JTC 1/SC34N443. Lars Marius Garshol, Graham Moore, 2 November 2003.
  5. [W3C Schema] XML Schema Part 2: Datatypes W3C Recommendation, Paul V. Biron, Ashok Malhotra, World Wide Web Consortium. 10 May 2001.
  6. [LTM] Linear Topic Map Notation: Definition and introduction version 1.2, Lars Marius Garshol. Ontopia AS. 15 May 2002.
  7. [PSI] Published Subjects: Introduction and Basic Requirements, OASIS Published Subjects Committee Recommendation, Steve Pepper, Editor. OASIS. 24 June 2003.
  8. [XMLinfoset] XML Information Set, W3C Recommendation. J. Cowan and R. Tobin, Editors. World Wide Web Consortium. 24 October 2001.
  9. [OSL] Ontopia Schema Language 2.0, Reference Specification. Ontopia. 22 December 2003.
  10. [TMS] TMSchema, TMSchema - proposal for TMCL Lite. Dmitry Bogachev. 11 April 2004.
  11. [DSDL] Document Schema Definition Languages, Rule-based validation - Schematron Draft International Standard. ISO/IEC 19757-3. 2004.