ISO/IEC JTC 1/SC 34 N0632

ISO/IEC JTC 1/SC 34

Information Technology --
Document Description and Processing Languages

TITLE: Disposition of comments on SC34 N0617: Document Schema Definition Languages (DSDL) - Part 7: Character Repertoire Validation Language (CRVL)
SOURCE: Mr. Martin Bryan
PROJECT: CD 19757-7: Document Schema Definition Languages (DSDL) Part 7 - Character repertoire validation
PROJECT EDITOR: Mr. MURATA Makoto [FAMILY Given]
STATUS: Agreed disposition of comments
ACTION: Editors to create next CD text
DATE: 2005-05-23
DISTRIBUTION: SC34 and Liaisons
REFER TO: N0593b - 2005-02-19 - Ballot due 2005-05-19 CD 19757-7 Document Schema Definition Language (DSDL) - Part 7: Character Repertoire Validation Language (CRVL)
N0593 - 2005-02-19 - Document Schema Definition Language (DSDL) - Part 7: Character Repertoire Validation Language (CRVL)
N0617 - 2005-05-20 - Summary of Voting on JTC 1/SC 34 N 593 - Document Schema Definition Language (DSDL) - Part 7: Character Repertoire Validation Language (CRVL)
REPLY TO:

Dr. James David Mason
(ISO/IEC JTC 1/SC 34 Chairman)
Y-12 National Security Complex
Bldg. 9113, M.S. 8208
Oak Ridge, TN 37831-8208 U.S.A.
Telephone: +1 865 574-6973
Facsimile: +1 865 574-1896
Network: [email protected]
http://www.y12.doe.gov/sgml/sc34/
ftp://ftp.y12.doe.gov/pub/sgml/sc34/

Mr. G. Ken Holman
(ISO/IEC JTC 1/SC 34 Secretariat - Standards Council of Canada)
Crane Softwrights Ltd.
Box 266,
Kars, ON K0A-2E0 CANADA
Telephone: +1 613 489-0999
Facsimile: +1 613 489-0995
Network: [email protected]
http://www.jtc1sc34.org



Comment Disposition for the CD ballot of DSDL Part 7 (Amsterdam, 2005 May)

ISO/IEC JTC1 SC34 WG1

2005 May

Canadian Comments

Canadian Comments Disposition

In section 4, the notation for "unknown whether character x is in collection A or not" should be "unknown(x, A)".

Accept.

On page 2, in the last Note on this page, the reference is missing from the text.

Accept. Reference to XML Schema Part 2.

In section 7.9, we believe the first word, "Some", should be "See".

Accept.

Japan

Japanese Comments Disposition

(1) Reference to the second edition of XML Schema Part 2.

Accept.

(2) Introduce an annex for the use of CRVL from Schematron.

Accept.

(3) Kernels and hulls should be more clearly presented in Section 5.

Accept. The editor is instructed to make the text much clearer.

(4) The intention of each operator should be presented in prose.

Accept.

(5) The semantics of the alt operator is doubtful. We propose that

a) in(x, A B) holds when
 - in(x, A) and not notin(x, B), or
 - in(x, B) and not notin(x, A)

and

b) unknown(x, A B) holds when
 - unknown(x, A) and unknown(x, B),
 - in(x, A) and notin(x, B), or
 - in(x, B) and notin(x, A)

WG1 was unable to arrive at a reasonable definition of the alt operator. In particular, the alt operator cannot be made to return appropriate results when the CRDL processor cannot retrieve external collections and therefore uses "unknown" instead.

Therefore, drop the alt, kernel, and hull operators and introduce an operator for creating a character collection from two character groups.
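
For reference, the semantics proposed in comment (5) can be tabulated mechanically. The following Python sketch is illustrative only and is not part of the CD text: the names Membership and proposed_alt are hypothetical, and input pairs that the proposal does not mention are returned as None rather than guessed.

```python
from enum import Enum
from itertools import product


class Membership(Enum):
    """The three possible answers for 'is character x in collection A?'."""
    IN = "in"
    NOTIN = "notin"
    UNKNOWN = "unknown"


def proposed_alt(a, b):
    """Apply rules a) and b) of comment (5) to the memberships of x in A and in B.

    Returns None for pairs the proposal leaves unstated (an assumption of
    this sketch; the comment does not say what should hold in those cases).
    """
    IN, NOTIN, UNKNOWN = Membership.IN, Membership.NOTIN, Membership.UNKNOWN
    # Rule a): in(x, A B)
    if (a is IN and b is not NOTIN) or (b is IN and a is not NOTIN):
        return IN
    # Rule b): unknown(x, A B)
    if (
        (a is UNKNOWN and b is UNKNOWN)
        or (a is IN and b is NOTIN)
        or (b is IN and a is NOTIN)
    ):
        return UNKNOWN
    return None


if __name__ == "__main__":
    # Print the full table for all nine input pairs.
    for a, b in product(Membership, repeat=2):
        result = proposed_alt(a, b)
        print(a.value, b.value, "->", result.value if result else "(not stated)")
```

Under this reading, three of the nine input pairs (both notin, or one notin and one unknown) are not covered by either rule.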

(6) When dereferencing the URI of a ref element causes a network error, what should happen? We propose that the processor should report an error and should be allowed to continue normal processing by assuming that "unknown" holds.

Accept.
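
The behaviour accepted in item (6) might be sketched as follows. This is a minimal illustration under stated assumptions, not processor requirements: resolve_ref, membership, and the fetch callable are hypothetical names, a retrieved collection is modelled as a plain Python set of characters, and a network failure is modelled as an OSError.

```python
import logging

logger = logging.getLogger("crdl_sketch")  # hypothetical logger name


def resolve_ref(uri, fetch):
    """Dereference a ref URI using the supplied fetch callable.

    On a network error the error is reported and None is returned, so the
    caller can continue processing instead of aborting.
    """
    try:
        return fetch(uri)
    except OSError as exc:  # ConnectionError, TimeoutError, etc. are subclasses
        logger.error("cannot dereference %s: %s", uri, exc)
        return None


def membership(ch, collection):
    """Return 'in', 'notin', or 'unknown' for character ch.

    A collection of None stands for a ref that could not be retrieved,
    for which 'unknown' is assumed to hold for every character.
    """
    if collection is None:
        return "unknown"
    return "in" if ch in collection else "notin"


if __name__ == "__main__":
    def failing_fetch(uri):
        raise ConnectionError("network unreachable")

    collection = resolve_ref("http://example.org/collection.xml", failing_fetch)
    print(membership("a", collection))
```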

(7) The intention of named collections is unclear. Are they represented by CRVL schemas that can be obtained by dereferencing HTTP URIs? Or are they allowed to be imaginary? In particular, how do we describe "iso-8859-1" (to be precise, the set of characters in this charset) in CRVL?

Since XML already allows IANA charsets and other implementation-dependent encodings as the value of the encoding declaration, we will allow IANA charsets and other implementation-dependent encodings as named collections and introduce an operator (e.g., <namedCollection encoding="shift_jis"/>) for representing named collections. Conformant implementations need not implement all charset names.
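
One way an implementation might evaluate such a named collection is to test whether a character can be encoded in the named charset. The Python sketch below is illustrative only. It assumes that "encodable by Python's codec of that name" is an acceptable proxy for membership in the charset's character set, and it relies on Python's codec registry, which is not identical to the IANA registry. The function name in_named_collection is hypothetical. A charset name that the implementation does not support yields "unknown", in line with the statement that conformant implementations need not implement all charset names.

```python
import codecs


def in_named_collection(ch, encoding):
    """Return 'in', 'notin', or 'unknown' for character ch and a charset name."""
    try:
        codecs.lookup(encoding)
    except LookupError:
        # This implementation does not support the charset name.
        return "unknown"
    try:
        ch.encode(encoding)
    except UnicodeEncodeError:
        return "notin"
    return "in"


if __name__ == "__main__":
    print(in_named_collection("\u00e9", "iso-8859-1"))  # LATIN SMALL LETTER E WITH ACUTE
    print(in_named_collection("\u3042", "iso-8859-1"))  # HIRAGANA LETTER A
    print(in_named_collection("\u3042", "shift_jis"))
    print(in_named_collection("a", "no-such-charset-x"))
```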

(8) Introduce CRVL processors and define conformance.

Accept. In particular, (1) make the support of IANA charsets optional, and (2) make the support of graphemes optional.

Turkish Comments

Turkish Comments Disposition

in(x,A) has two different meanings. I believe this is a typo.

We appreciate this comment. Yes, it is a typo.

The assertion "in(x,<hull>A</hull>) does not hold" does not necessarily hold. It does not seem to be a sound statement. In particular, given that in(x,<kernel>A</kernel>) holds, in(x,<hull>A</hull>) must hold.

Because of the Japanese comment on the alt operator, we decided to remove the kernel and hull operators.

The assertion "unknown(x,<kernel>A</kernel>) when in(x, A) or unknown(x, A)" does not make sense. Whenever the predicate in(x,A) holds, in(x,<hull>A</hull>) should hold as well, but unknown(x,<kernel>A</kernel>) should not hold at the same time.

Because of the Japanese comment on the alt operator, we decided to remove the kernel and hull operators.

The assertion "notin(x,<kernel>A</kernel>) does not hold" does not necessarily hold. It does not seem to be a sound statement.

Because of the Japanese comment on the alt operator, we decided to remove the kernel and hull operators.

The assertion "unknown(x,<kernel>A</kernel>) when notin(x, A) or unknown(x, A)" is not sound, because whenever notin(x,A) is true, unknown(x,<kernel>A</kernel>) does not hold.

Because of the Japanese comment on the alt operator, we decided to remove the kernel and hull operators.

The definition in 7.7 does not differ from that in 7.4.

Because of this comment and the Japanese comment, the alt operator is dropped.

Comments raised during the meeting

Comments raised during the meeting Disposition

1) There is no way of uniquely identifying either a hull or a kernel in a way that would allow parts of a character set definition to be reused. In particular, it should be possible for a URI specified in a <ref> element to refer to a specific hull or kernel (and probably union, intersection, difference and alt) in an existing definition.

Allow xml:id as an optional attribute for every element so that fragment identifiers may use their values.

2) Only one character collection can be defined in a file. This limitation means that it is not possible to build up useful collections of collections. It should be possible to create a file that defines multiple sets of characterCollection elements. Each characterCollection should be required to have a unique identifier through which it can be referenced in a <ref> element.

Allow xml:id as an optional attribute for every element so that fragment identifiers may use their values.
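
The xml:id approach adopted for items 1) and 2) could be resolved by a processor roughly as sketched below. This Python sketch is illustrative only: the wrapper element collections, the identifier values, the example URI, and the function find_by_fragment are hypothetical, and the document text is passed in directly rather than fetched. Only the characterCollection element name and the use of xml:id values as fragment identifiers come from the comments above.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical file holding several collections, each with its own xml:id.
SAMPLE = """\
<collections>
  <characterCollection xml:id="latin"/>
  <characterCollection xml:id="kana"/>
</collections>
"""


def find_by_fragment(document_text, uri):
    """Return the element whose xml:id equals the fragment identifier of uri.

    Returns the root element when uri has no fragment, and None when no
    element carries the requested identifier.
    """
    _, fragment = urldefrag(uri)
    root = ET.fromstring(document_text)
    if not fragment:
        return root
    for element in root.iter():
        if element.get(XML_ID) == fragment:
            return element
    return None


if __name__ == "__main__":
    target = find_by_fragment(SAMPLE, "http://example.org/sets.xml#kana")
    print(target.tag, target.get(XML_ID))
```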

3) There are no examples of how to use namedCollections based on existing UCS character sets. The name specified has, according to the definition, to be entered as a URI. UCS provides no URIs that can be used for this purpose. Without examples it is not possible to determine how to correctly reference UCS character set names, especially if early versions of UCS are identified in minUCSversion.

Accept in principle. Some examples illustrating the use of charsets shall be added.

Foreword: Parts 7 and 8 are incorrectly named and have no acronym provided. (The versions used in Part 4 must be repeated in Part 7.)

Accept.

Page 1: The acronym CRVL should be appended to the title (as in Part 4).

Accept, but the name of this language is Character Repertoire Description Language and the acronym is CRDL.

2 Normative References: A specific version of the Unicode standard should be referenced, together with at least the name (and/or URL) of the publisher.

Accept in principle. Since Unicode 1 is significantly incompatible with Unicode 2 (the reallocation of Korean characters), we have to be specific. The editor is instructed to provide a proposal.

3 Terms and Definitions: Please add definitions for character (distinguishing it from glyph and code point), hull and kernel.

Accept.

5 Kernel and Hull: In the note, and in subsequent notes that reference the same document, place the title of the referenced document in italics (or have the text in italics and the title in roman).

Accept.

A pictorial representation of a hull with a set of kernels, some of which are unions or intersections of kernels, should be added to clarify the concepts being used. (If you number the kernels the numbers could be referenced in subsequent clauses.)

Accept.

7.2 to 7.10: Add an explanatory sentence at the start of each clause, and provide an example of use. (Examples should include things like ranges of characters, etc., to add realism.)

Accept.

Introduce a mechanism for handling grapheme collections.

Accept. The editor is instructed to provide the syntax. The support of grapheme collections shall be optional.