TITLE: TMQL requirements (1.0.0)
SOURCE: Hans Holger Rath and Lars Marius Garshol
PROJECT:
PROJECT EDITOR:
STATUS:
ACTION: For information and review
DATE: 2001-08-29
DISTRIBUTION: SC34 and Liaisons
REFER TO: The attached file is in HTML format.
REPLY TO: Dr. James David Mason
Editors: Hans Holger Rath, empolis GmbH
Version: 1.0.0
Last changed: 2001-08-23
This document sets down the requirements that will guide
the work on the Topic Map Query Language (TMQL), a query language for topic
maps that is to become an ISO standard. These requirements document the
intentions of the standards editors, as informed by the user community. The
purpose of this document is to make it clear what can be expected to come out
of the TMQL process, and to encourage the user community to make their needs
known to the editors.
This document has requirements for the TMQL standard as
a whole, and for the query part of TMQL in particular. Additional requirements
for the modification part of TMQL will have to be defined at a later stage.
The following key words are used to indicate the degree
of certainty associated with each particular requirement:
Shall
Means that the requirement is absolute.
Should
Means that the requirement is a goal.
May
Means that the requirement is
considered important, but that it is not yet clear whether TMQL should conform
to it or not.
Please note that some requirements are only implicitly
specified through the form of other requirements. This document must be read
with care.
Feedback on this requirements document is requested.
This section contains the main TMQL requirements in
summarized form for easy reference:
1.
TMQL shall have a concise and human-readable syntax.
2.
The execution of TMQL queries shall be defined in terms of
operations on an abstract data model for topic maps and possibly also an
environment (about which, see section 3.4).
3.
TMQL query results shall be instances of an abstract TMQL data
model (see requirement 3.3:2).
4.
TMQL shall be independent of any particular interface between
clients and the query processor.
5.
TMQL shall support all natural languages equally well. That
is, TMQL shall be fully internationalized with respect to text representation,
text ordering, etc. (See also requirements 3.5:7 and 3.5:8.)
6.
The TMQL standard shall be defined in two parts: the first
covering querying only, the second adding support for modifications.
7.
The TMQL standard shall not unduly constrain the form of
implementations.
8.
The TMQL standard shall be formal, fully define the results of
queries (so that any given query can only have one correct result in any given
context) and, in so far as possible, be human-readable.
9.
TMQL shall be usable across a wide range of foreseeable
platforms and applications over an extended lifetime (20-50 years).
The following general requirements apply, in addition to
those already mentioned.
1.
TMQL queries shall be able to span multiple topic maps.
2.
The TMQL standard should be defined based on a set of use
cases representing general classes of queries expected to be common.
3.
The TMQL standard shall define error situations, and how TMQL
processors are required to react to them.
4.
The TMQL language shall be extensible. TMQL shall define
controlled mechanisms for third-party extensions, e.g., domain-specific
extensions.
5.
The TMQL standard shall contain a conformance clause, stating
the conditions under which TMQL implementations may claim to conform to the
standard.
6.
The TMQL language may support the definition and use of logic
inferencing rules.
7.
The TMQL standard may define a distributed addressing scheme
for topic maps.
The following requirements apply to the TMQL syntax,
beyond those stated above.
1.
The syntax shall be defined in terms of a formal grammar.
2.
The TMQL query syntax shall be designed to be easily
embeddable into XML documents and programming language source code.
3.
The syntax should be designed so that queries expected to be
common are as easy to write as possible.
4.
An XML syntax for TMQL queries may be defined.
The following requirements apply to the formal
underpinnings of the TMQL standard, beyond those already stated.
1.
TMQL shall not define its own data model, but be based on one
common to the entire family of topic map standards.
2.
TMQL may extend the common data model in order to be able to
represent query results which are not topic maps, but merely sets and lists of
topic map objects, resources, and perhaps also primitive values like strings
and numbers.
3.
The definition of TMQL should be based on an abstract query
algebra, which in turn should be based on the data model.
4.
The TMQL language should support nesting of queries to form
sub-queries.
5.
The TMQL language should support sorting query results by
parts of the query results.
6.
The algebra may include operators such as: merge, comparison,
boolean logic operators, set operators, matching by scope operators,
cardinality/count, string matching operators, and aggregation.
7.
TMQL may support returning associations, topic names, and
occurrences that were not present in the queried topic map(s).
8.
TMQL may define operations for constructing topic maps from
the results of TMQL queries.
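As an informal illustration of requirements 4 to 6 above, the following Python sketch shows a nested query, set operators, a count, and sorting applied to query results. It is only a sketch: the function `composers_of`, the sample data, and the representation of results as Python sets are assumptions made for this example and are not part of any TMQL definition.

```python
def composers_of(works, composed_by):
    """Inner query (hypothetical): the set of composers of the given works."""
    return {composer for work, composer in composed_by if work in works}


# Toy data, invented for this example.
composed_by = [("tosca", "puccini"), ("falstaff", "verdi"), ("parsifal", "wagner")]
born_in_italy = {"puccini", "verdi", "dante"}

# Nested query: the result of the inner query feeds the outer set operations.
composers = composers_of({"tosca", "falstaff", "parsifal"}, composed_by)

italian_composers = composers & born_in_italy   # set intersection
others = composers - born_in_italy              # set difference
everyone = composers | born_in_italy            # set union

count = len(italian_composers)                  # cardinality/count

# Sorting here uses plain code point order, not internationalized collation.
ordered = sorted(italian_composers)
```

Treating results as sets of topic identifiers keeps the operators simple; a real algebra would operate on instances of the data model rather than on strings.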
The following concepts relating to the
self-containedness of queries have been identified. They are described here in
order to clarify the list of requirements that follows them. Please note
that these concepts, and how they may apply to TMQL as it will be defined, are
not yet understood, and the descriptions themselves do not constitute
requirements.
TMQL environment
This is the environment in which TMQL
queries are evaluated. It may contain things like ID-to-topic map-mappings,
identifier to variable/function/predicate/etc mappings, base URIs used to
resolve relative URIs, and so on.
Inter-query context
This is the execution context for TMQL
queries, as modified by previous queries. It is not clear what this context may
contain.
Intra-query context
This is the execution context for a
TMQL query, as modified or set up by the query itself. It may contain
identifier to value mappings, a base URI for resolving relative URIs,
specifications of nested queries, specifications of local functions/predicates,
and so on.
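One possible reading of the TMQL environment and the intra-query context can be sketched in Python as follows. The class names, field names, and example URIs are hypothetical, and the rule that a base URI set by the query overrides the one in the environment is an assumption made for illustration; this document does not specify it.

```python
from dataclasses import dataclass, field
from typing import Optional
from urllib.parse import urljoin


@dataclass
class Environment:
    """Hypothetical TMQL environment: settings shared by all queries."""
    topic_maps: dict = field(default_factory=dict)  # ID -> topic map
    base_uri: str = ""                              # resolves relative URIs


@dataclass
class IntraQueryContext:
    """Hypothetical context set up by a single query for its own use."""
    env: Environment
    bindings: dict = field(default_factory=dict)    # identifier -> value
    base_uri: Optional[str] = None                  # optional per-query base URI

    def resolve(self, uri: str) -> str:
        # Assumption: a base URI declared by the query takes precedence
        # over the one found in the environment.
        base = self.base_uri if self.base_uri is not None else self.env.base_uri
        return urljoin(base, uri)


env = Environment(base_uri="http://example.org/maps/")
ctx = IntraQueryContext(env=env, bindings={"composer": "puccini"})
resolved = ctx.resolve("opera.xtm#puccini")
```

An inter-query context is deliberately left out of the sketch, since it is not yet clear what it would contain.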
The following requirements apply to the
self-containedness of queries, in addition to the requirements above this
section.
1.
The TMQL standard shall clearly define whether it uses each of
the concepts above and, if so, how.
2.
Any particular TMQL query shall contain all the information
necessary to interpret it (with respect to its TMQL environment, inter-query
context, and intra-query context).
3.
The TMQL standard should make use of a TMQL environment, and
an intra-query context to specify the interpretation of queries.
4.
The TMQL standard may make use of the notion of an inter-query
context to specify how queries are interpreted.
The TMQL standard is part of a larger family and
community of standards, and the following requirements apply to its integration
in this community.
1.
TMQL shall be based on ISO 13250.
2.
TMQL shall be based on the topic map data model currently
being defined. (Thus, TMQL will also support XTM.)
3.
URIs used in TMQL queries shall be normalized in the manner
defined by the topic map data model.
4.
If value equality for topic map objects is not defined by the
topic map data model, it shall be defined by the TMQL standard.
5.
TMQL shall relate to the Topic Map Conceptual
Model through the topic map data model.
6.
TMQL shall be harmonized with the Topic Map Constraint
Language (TMCL) currently being defined, in the sense that there shall be no overlap
of functionality, the standards shall be compatible, and features from TMCL
shall be reused in TMQL wherever suitable.
7.
The character set of TMQL shall be Unicode.
8.
Ordering of strings in TMQL shall be based on
externally-defined specifications for internationalized string collation.
Candidates are the Unicode Collation Algorithm and ISO 14651.
9.
The syntax of URIs within TMQL queries shall be governed by
the rules of RFC 2396.
This section lists some general classes of queries which
TMQL may support. The queries are only informally defined since there currently
is no data model to define them in terms of. This list has not been tested for
completeness.
1.
Find all topics with specific names whose scopes match a
specific scope.
2.
Find all topics with specific resources as occurrences whose
scopes match a specific scope.
3.
Find all topics playing one of a set of roles in an
association of one of a set of types whose scopes match a specific scope.
4.
Find all topics playing one of a set of roles in an
association of one of a set of types, where one of a set of topics plays one of
a set of roles, whose scopes match a specific scope.
5.
Find the topic that has a specific resource as one of its
subject indicators.
6.
Find the topic that has a specific resource as its subject
address.
7.
Find all topics that play one of a set of roles in instances
of one of a set of association types.
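To make the first query class above concrete, here is a minimal Python sketch over a toy in-memory model. The `Topic` and `Name` classes, the function `topics_by_name`, and the sample data are all hypothetical. The sketch also assumes that a scope "matches" when every theme of the requested scope appears in the scope of the name; the actual matching rule is left open by this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    value: str
    scope: frozenset = frozenset()  # themes in which the name is valid


@dataclass(frozen=True)
class Topic:
    id: str
    names: tuple = ()


def topics_by_name(topics, wanted_names, scope):
    """Return topics with one of wanted_names in a scope containing `scope`."""
    scope = frozenset(scope)
    return [
        t for t in topics
        # Assumed matching rule: requested scope is a subset of the name's scope.
        if any(n.value in wanted_names and scope <= n.scope for n in t.names)
    ]


topics = [
    Topic("puccini", (Name("Giacomo Puccini", frozenset({"english"})),)),
    Topic("tosca", (Name("Tosca", frozenset({"english", "italian"})),)),
    Topic("boheme", (Name("La Bohème", frozenset({"italian"})),)),
]

found = topics_by_name(topics, {"Tosca", "La Bohème"}, {"english"})
```

With this data only the topic `tosca` is returned, because the name "La Bohème" is not valid in the `english` scope.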
1.
Find all associations whose scopes match a specific scope.
2.
Find all associations that are instances of a specific type.
3.
Find all associations where one of a set of topics plays any
role, and whose scopes match a specific scope.
4.
Find all associations where one of a set of topics plays one of
a set of roles, and whose scopes match a specific scope.
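The last two association queries above might look like this in a Python sketch. The `Association` class, the function `associations_with`, and the sample data are hypothetical, and scope matching is again assumed to mean that the requested scope is a subset of the scope of the association.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    type: str
    roles: tuple                    # (role type, topic id) pairs
    scope: frozenset = frozenset()


def associations_with(assocs, topics, roles=None, scope=()):
    """Associations in which one of `topics` plays a role.

    With roles=None any role qualifies; otherwise the role must be in `roles`.
    """
    scope = frozenset(scope)
    return [
        a for a in assocs
        if scope <= a.scope  # assumed scope-matching rule
        and any(
            player in topics and (roles is None or role in roles)
            for role, player in a.roles
        )
    ]


a1 = Association("composed-by", (("work", "tosca"), ("composer", "puccini")),
                 frozenset({"music"}))
a2 = Association("born-in", (("person", "puccini"), ("place", "lucca")),
                 frozenset({"biography"}))
a3 = Association("composed-by", (("work", "falstaff"), ("composer", "verdi")),
                 frozenset({"music"}))
assocs = [a1, a2, a3]

any_role = associations_with(assocs, {"puccini"})
as_composer = associations_with(assocs, {"puccini"}, roles={"composer"},
                                scope={"music"})
```

Here `any_role` contains both associations involving `puccini`, while `as_composer` is narrowed to the one where that topic plays the `composer` role in the `music` scope.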
1.
Find the object that has a specific resource as its source
locator.
2.
Find all objects that are direct instances of a specific type.
3.
Find all objects that are instances of a specific type or any
of its subtypes.
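The difference between direct instances and instances of a type or any of its subtypes can be illustrated with a short Python sketch. The function names, the representation of the type hierarchy as pairs, and the sample data are assumptions made for this example only.

```python
def all_subtypes(type_id, subtype_of):
    """Return type_id plus every type reachable through subtype links."""
    seen = {type_id}
    stack = [type_id]
    while stack:
        current = stack.pop()
        for sub, sup in subtype_of:
            if sup == current and sub not in seen:
                seen.add(sub)
                stack.append(sub)
    return seen


def instances_of(type_id, instance_of, subtype_of, direct_only=False):
    """Objects typed by type_id, optionally including its subtypes."""
    types = {type_id} if direct_only else all_subtypes(type_id, subtype_of)
    return sorted(obj for obj, typ in instance_of if typ in types)


# Hypothetical hierarchy: (subtype, supertype) and (object, type) pairs.
subtype_of = [("opera", "musical-work"), ("aria", "musical-work"),
              ("comic-opera", "opera")]
instance_of = [("tosca", "opera"), ("falstaff", "comic-opera"),
               ("nessun-dorma", "aria"), ("puccini", "composer")]

direct = instances_of("opera", instance_of, subtype_of, direct_only=True)
transitive = instances_of("opera", instance_of, subtype_of)
```

The `seen` set guards against cycles in the subtype links, so the traversal terminates even on a malformed hierarchy.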
1.
Find all the names of the topics in a particular set of topics,
whose scopes match a particular scope.
2.
Find all the occurrences of the topics in a particular set of
topics, whose scopes match a particular scope.
3.
Find all the occurrences of any of a particular set of types
of the topics in a particular set of topics, whose scopes match a particular
scope.
4.
Find all the resources that are subject indicators of the
topics in a particular set of topics.
5.
Find the resources that are the addressable subjects of the
topics in a particular set of topics.
Requirements listed in this section are for various
reasons not in the scope of TMQL.
1.
The TMQL standard shall not include an API to query processors
in parts 1 (querying) or 2 (modification) of the standard. One may be defined
in later parts.
2.
The TMQL standard shall not define mechanisms for specifying
validity constraints on topic maps. It may be used by other specifications and
software to define such constraints.
3.
The TMQL standard shall not define a natural language query
interface.
This document is based on input from
·
many TopicMaps.Org meetings,
·
many ISO SC34 meetings,
·
the topicmapmail, xtm-wg, and tmql-wg mailing lists, and
·
work done by a former editor, Ann Wrightson.
Changes from version 0.9.0 to version 1.0:
·
The word 'update' was replaced with 'modification' in
section 1, second paragraph.
·
Links were added to the text of requirements 3.5:1,
3.5:2, 3.5:6, and 3.5:7.
·
Requirements 2:2, 2:3, 2:6, 2:8, 3.1:4, 3.4:2, 3.5:4,
3.5:6, and 4:1 were modified.
·
Requirements 3.1:7 and 3.2:4 were added.
Changes from version 0.8.2 to version 0.9:
·
Section 3.4 was added.
·
Sub-section numbers were added.
·
Prose of introductory paragraphs slightly improved.
·
Requirements 2:3, 2:6, 2:8, 2:9, 3.1:3, 3.1:4, 3.2:2,
and 3.3:6 were modified.
·
Requirements 3.1:5, 3.1:6, 3.3:4, 3.3:5, 3.5:9, and
3.6.4:3 were added.
·
Revision history was added.