ISO/IEC JTC 1/SC 34 N0851

Information Technology --
Document Description and Processing Languages

TITLE: Disposition of Comments to JTC 1/SC 34 N 799 - Text for CD ballot for ISO/IEC 19757-7: Document Schema Definition Language (DSDL) Part 7 - Character Repertoire Description Language (CRDL)
PROJECT: CD 19757-7: Information technology - Document Schema Definition Languages (DSDL) - Part 7: Character Repertoire Description Language (CRDL)
STATUS: Disposition of comments
ACTION: For information
DATE: 2007-03-24
DISTRIBUTION: SC34 and Liaisons
REFER TO: N0799 - 2006-11-21 - Text for CD ballot for ISO/IEC 19757-7: Document Schema Definition Language (DSDL) Part 7 - Character Repertoire Description Language (CRDL)
N0818 - 2007-02-22 - Summary of Voting on JTC 1/SC 34 N 799 - Text for CD ballot for ISO/IEC 19757-7: Document Schema Definition Language (DSDL) Part 7 - Character Repertoire Description Language (CRDL)

Dr. James David Mason
(ISO/IEC JTC 1/SC 34 Chairman)
Mr. G. Ken Holman
(ISO/IEC JTC 1/SC 34 Secretariat - Standards Council of Canada)
Disposition of Comments to N818 Summary of Voting - Document Schema
Definition Language (DSDL) - Part 7: Character Repertoire Description
Language (CRDL)


23 March 2006

1. Canada

1) ISO/IEC 10646 should be referenced normatively in ISO/IEC 19757-7

Accept in principle, but ISO/IEC 10646 is already listed as a normative
reference in Clause 2.

2) The establishment of named collections should be done in
   consultation with ISO/IEC/JTC1 SC2

Accept.  SC34 will send a liaison report to SC2 and request a review
of the upcoming FCD.

3) Formally recognize ISO/IEC 10646 collections and collection names
   and ids and the Collection Registry as candidates for named

Accept.  The project editor is instructed to provide a mechanism for
referencing fixed collections of 10646.  Furthermore, he is
instructed to provide mechanisms for referencing the IANA registry,
the ISO/IEC 15897 cultural registry, and other registries.  These
mechanisms can specify either numbers or names (including aliases)
of collections.

4) Formally recognize the IANA charset registry and the labels and
   collections derivable for each label as candidates for named

Accept.  Cite the IANA registry as a normative reference in
Clause 2, but see the disposition of the second comment from Norway.

5) The IANA charset registry should be the source of collection labels
   and associated definitions instead of the referenced collections
   document from IANA.

Accept in principle.  However, Clause 8.7 is already intended to allow
charsets in the IANA charset registry.

2. Japan

1) Drop grapheme clusters from the first version of this part since
   they make both standardization and implementations significantly

Accept.  As a result, the phrase "grapheme cluster" should not be used
in this standard.

The project editor is instructed to add a note that this part of 19757
can handle combining characters but cannot detect impermissible
combinations of base characters and combining characters (e.g., "b"
followed by the accent character).
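
As a rough illustration of this limitation, the sketch below checks a
string one code point at a time against a repertoire modeled as a plain
Python set.  The set contents and the function name in_repertoire are
hypothetical and are not taken from CRDL.

```python
# Hypothetical repertoire: Basic Latin lowercase letters plus U+0301
# (COMBINING ACUTE ACCENT).  The contents are illustrative only.
REPERTOIRE = {chr(c) for c in range(ord("a"), ord("z") + 1)} | {"\u0301"}


def in_repertoire(text, repertoire=REPERTOIRE):
    """Return True if every code point of text is in the repertoire."""
    return all(ch in repertoire for ch in text)


# "e" + accent and "b" + accent both pass: the check sees individual
# code points and has no notion of which combinations are permissible.
print(in_repertoire("e\u0301"))
print(in_repertoire("b\u0301"))
```

A per-code-point check of this kind accepts the combining character
wherever it appears, which is the behavior the note is meant to document.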

Moreover, the upcoming FCD should provide a mechanism for
normalization as a pre-processor.  Users are allowed to specify which
normalization algorithm, or no normalization (default), shall be
applied, while implementations are allowed to skip normalization.
"W3C Character Model for the World Wide Web 1.0: Fundamentals" and
"Unicode Normalization Forms" (Unicode Standard Annex #15) shall be
referenced informatively so that any of the normalizations from W3C
and Unicode can be specified.
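
A minimal sketch of such a pre-processor, assuming the four Unicode
normalization forms of Unicode Standard Annex #15 and using Python's
standard unicodedata module.  The function name preprocess and the use
of None to mean "no normalization" are assumptions made for
illustration; they are not CRDL syntax.

```python
import unicodedata

# Forms accepted by unicodedata.normalize; None models "no normalization",
# which is the default described above.
SUPPORTED_FORMS = {"NFC", "NFD", "NFKC", "NFKD"}


def preprocess(text, form=None):
    """Optionally normalize text before repertoire validation."""
    if form is None:
        return text
    if form not in SUPPORTED_FORMS:
        raise ValueError(f"unsupported normalization form: {form}")
    return unicodedata.normalize(form, text)


decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT
print(len(preprocess(decomposed)))         # 2: left unchanged
print(len(preprocess(decomposed, "NFC")))  # 1: composed to U+00E9
```

Whether validation accepts the text can therefore depend on the chosen
form: a repertoire that contains U+00E9 but not U+0301 would accept the
NFC result and reject the unnormalized input.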

2) Introduce foreign elements and attributes using NVDL

The project editor is instructed to provide two sets of schemas: one
in RELAX NG and another using the combination of RELAX NG and NVDL.
Both schemas are normative.  No matter which of the two schemas is
used, validation results are guaranteed to be the same.  CRDL
implementors are not required to implement NVDL.

3. Norway

1) Please be consistent with terminology of ISO 10646 on repertoires

Use "repertoire" rather than "collection", because collections as
defined in 10646 do not have a kernel/hull and also because the name
of this part is "Character Repertoire Description Language".  The
definition of "repertoire" in this part shall be consistent with
(though not identical to) that in 10646.

2) References to non ISO or IEC standards need clearance.

In response to other comments, references to W3C, Unicode, and IANA
are mandatory.  Moreover, several standards from SC34 already
reference W3C and Unicode normatively.  However, the project editor
is instructed to confirm whether a normative reference to the IANA
charset registry is allowed.

3) The ISO 10646 term "collection" does not include Unicode grapheme

Accept.  Drop grapheme clusters.

4) The term "character" is not the same in ISO 10646 and Unicode.

Accept.  We normatively reference 10646 (but not Unicode) for the
definition of characters.

5) We propose that terms that are used in this standard be based on
   other ISO standards.

Accept in principle.  Since grapheme clusters are dropped, the terms
in this part of 19757 are expected to be consistent with other ISO

6) That the regular expression definition varies from one Unicode
   version to another makes it difficult to use in an ISO standard.

Unfortunately, SC34 is handling XML, which is based on Unicode.  As a
result, it is not possible to stop using Unicode regular expressions
without dropping important features.

Clause 8.2 is intended to ensure that implementations can report an
error when they cannot use appropriate versions of Unicode.  However,
Clause 8.2 refers only to minUcsVersion and not to maxUcsVersion.
Clause 8.2 shall also refer to maxUcsVersion.
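
The kind of check Clause 8.2 is meant to enable might be sketched as
follows.  This assumes that minUcsVersion and maxUcsVersion carry dotted
version strings and that an implementation should report an error when
its own character data falls outside that range; the precise semantics
are for the standard to define.  The helper names are hypothetical.

```python
import unicodedata


def parse_version(s):
    """Turn a dotted version string such as '5.0' into a 3-tuple of ints."""
    parts = [int(p) for p in s.split(".")]
    parts += [0] * (3 - len(parts))  # pad so "5.0" compares equal to "5.0.0"
    return tuple(parts)


def check_ucs_version(available, min_version=None, max_version=None):
    """Raise RuntimeError if available is outside [min_version, max_version].

    Either bound may be None, meaning that bound is not specified.
    """
    have = parse_version(available)
    if min_version is not None and have < parse_version(min_version):
        raise RuntimeError(
            f"version {available} is older than required minimum {min_version}")
    if max_version is not None and have > parse_version(max_version):
        raise RuntimeError(
            f"version {available} is newer than allowed maximum {max_version}")


# unicodedata.unidata_version is the Unicode version of the running
# interpreter's character database; it stands in for "the version the
# implementation can use".
check_ucs_version(unicodedata.unidata_version, min_version="4.0")
```

Without the upper bound, an implementation with newer character data
than the schema author expected would have no basis for reporting an
error, which is the gap the disposition closes.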

The project editor is instructed to reference the regular expressions
of "XQuery 1.0 and XPath 2.0 Functions and Operators" from W3C rather
than those of XML Schema Part 2.  However, the project editor is
instructed to impose restrictions on regular expressions, as deemed
necessary.