ISO/IEC JTC 1/SC34 N0245


Information Technology --

Document Description and Processing Languages

Title: Report of Official Foreign Travel
to Canada, 10-19 August 2001
Source: James D. Mason, Chairman, JTC1/SC34
Project: All SC34 Projects
Project editor: All SC34 Editors
Status: This report was submitted to the U.S. Department of Energy and the National Nuclear Security Administration as part of the requirements for official travel by the author.
Date: 12 September 2001
Distribution: SC34 and Liaisons
Refer to:
Supersedes: SC34 N228
Reply to: Dr. James David Mason
(ISO/IEC JTC1/SC34 Chairman)
Y-12 National Security Complex
Information Technology Services
Bldg. 9113 M.S. 8208
Oak Ridge, TN 37831-8208 U.S.A.
Telephone: +1 865 574-6973
Facsimile: +1 865 574-1896
E-mail: [email protected]

Ms. Sara Hafele, ISO/IEC JTC 1/SC 34 Secretariat
American National Standards Institute
11 West 42nd Street
New York, NY 10036
Tel: +1 212 642 4976
Fax: +1 212 840 2298
E-mail: [email protected]




Report of Official Foreign Travel to Canada
10-19 August 2001

James David Mason
Internet, SGML, and Integration Services
Information Technology Services

5 September 2001

Prepared by the
Y-12 National Security Complex
Oak Ridge, Tennessee 37831
managed by
BWXT Y-12, L.L.C.
for the
U.S. Department of Energy
under contract DE-AC05-00OR22800




This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.


In support of DOE's use of SGML, XML, HTML, and related standards, I have served since 1985 as Chairman of the international committee responsible for SGML and related standards, ISO/IEC JTC1/SC34 (SC34) and its predecessor organizations. During my August 2001 trip, I attended the summer 2001 meeting of SC34/WG3 in Montréal, Canada. I also attended Extreme Markup Languages 2001, a major conference on the use of SGML and XML sponsored by IDEAlliance, and participated in the reorganizational meeting of the XTM development group, which is in transition from being an independent group to becoming a Member Section of OASIS (Organization for the Advancement of Structured Information Standards).

Supporting standards development gives the Department of Energy/National Nuclear Security Administration (DOE/NNSA) and the Y-12 National Security Complex (Y-12) the opportunity both to provide input into the process and to benefit from contact with some of the leading experts in the subject matter. For some years, Oak Ridge has been the location to which other DOE sites turn for expertise in SGML, XML, and related topics.

Note: This report continues a series, the most recent of which, Y/WPP-017, reported on the Spring 2000 meeting of SC34 in Berlin, Germany. Copies of documentation for all SC34 meetings are available from the SC34 site on the Web. This report is also available on the SC34 Web site. Hyperlinks in the online report connect it to the documents it references.


Over the course of the past two decades, SGML (Standard Generalized Markup Language, ISO 8879:1986) and its applications, including HTML (Hypertext Markup Language), and profiles, most notably XML (Extensible Markup Language), have come to dominate the interchange and use of structured data. SGML and many of the standards related to it were developed and are maintained by ISO/IEC JTC1/SC34 (SC34), which I chair.

The SC34 project gaining the most attention recently is Topic Maps (ISO/IEC 13250:2000), which describes metadata structures for organizing and indexing large collections of information resources. The Topic Map standard seems poised to have a major effect on knowledge-management applications. Topic Maps are being used in the knowledge base for the Ferret analytical engine developed at Y-12 and are being investigated as a mechanism for maintaining and publishing classification guidance on a DOE-wide basis. Topic Maps also have good potential as a structuring tool in other knowledge-preservation activities.

In August 2001, I attended a series of meetings in Montréal related to the support of SC34 standards and their application. SC34's Working Group 3 (SC34/WG3), Information Association, which is responsible for Topic Maps, met on Saturday, 11 August. The Extreme Markup Languages 2001 conference, sponsored by IDEAlliance, followed during the next week. On Saturday, 18 August, the XTM development group held a restructuring meeting.

Summer Meeting of ISO/IEC JTC1/SC34/WG3, Montréal, Canada

The SC34/WG3 meeting on Saturday, 11 August 2001 was attended by nine experts representing five countries (France, Germany, Norway, the United Kingdom, and the United States) and one external liaison body (International SGML/XML Users' Group). The meeting was chaired by Steve Pepper, Convenor of WG3 and head of the Norwegian delegation to SC34.

SC34/WG3 works mainly on matters of hypertext and multimedia documents and linking. The new Topic Maps standard (ISO/IEC 13250), published last year, occupies most of WG3's effort. At recent meetings WG3 has concerned itself with starting new projects for models (conceptual, data, processing) and languages (query, constraint) for Topic Map support. At this meeting, because some of these projects are still out for initial approval ballot, WG3 discussed issues of general concern, such as determining the relationships among the support facilities and planning a course of development that makes the best use of resources.

Two of the editors of ISO/IEC 13250 (Michel Biezunski and Steve Newcomb) presented a defect report that had been submitted by Japan (SC34 N238) and proposed a response to it (SC34 N239).

WG3 examined two documents related to Topic Map support models: a data model proposed by Lars Marius Garshol (SC34 N241) and a processing model proposed by Newcomb and Biezunski (SC34 N243). We decided to proceed with a Core Model (Level 0 of the overall Topic Map model) on the basis of the Newcomb and Biezunski document and an Infoset Model (Level 1) based on the core model, starting with the Garshol proposal. (An Infoset, short for "information set," might be seen as a collection of all the structural information a processor can extract from an SGML or XML document.) The model project may examine connections between the Topic Maps models and other modeling techniques, such as Express (the modeling language of ISO 10303, STEP) or UML and MOF (Unified Modeling Language and Meta Object Facility, from the Object Management Group).
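
The parenthetical description of an Infoset can be illustrated with a small Python sketch. It is only an informal illustration, not the Infoset Model that WG3 is developing: the function name, the tuple layout, and the sample document are all assumptions made for this example, and the sketch ignores comments, processing instructions, and text that follows a child element.

```python
import xml.etree.ElementTree as ET


def information_items(xml_text):
    """Return a flat list of (kind, path, value) tuples for one XML document.

    Hypothetical helper: "kind" is one of "element", "attribute", or
    "characters"; "path" is a slash-separated location built from tag names.
    """
    root = ET.fromstring(xml_text)
    items = []

    def walk(element, path):
        here = f"{path}/{element.tag}"
        # Record the element itself.
        items.append(("element", here, None))
        # Record its attributes in a stable (sorted) order.
        for name, value in sorted(element.attrib.items()):
            items.append(("attribute", f"{here}/@{name}", value))
        # Record non-blank character data directly inside the element.
        text = (element.text or "").strip()
        if text:
            items.append(("characters", here, text))
        # Recurse into child elements in document order.
        for child in element:
            walk(child, here)

    walk(root, "")
    return items


# Hypothetical sample document, invented for this illustration.
SAMPLE = '<report id="r1"><title>Trip</title><para>Montreal</para></report>'

if __name__ == "__main__":
    for item in information_items(SAMPLE):
        print(item)
```

The list that results is independent of the surface syntax of the document (quoting style, whitespace between tags), which is the sense in which an information set abstracts away from markup.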

ISO/IEC 13250 is specified in terms of HyTime (Hypermedia/Time-Based Structuring Language, ISO/IEC 10744). Although HyTime is immensely powerful and has heavily influenced other projects, such as the W3C's work on advanced hyperlinking, it has a reputation for being difficult to understand and apply. After the adoption of the Topic Map standard last year, a small group, a number of whom are active in SC34, began working on XTM, a project to create an XML interchange representation of Topic Maps, with hyperlinking according to W3C recommendations rather than full HyTime linking. The XTM development group has successfully completed its initial goals and is reorganizing (see below). At the Berlin meeting of SC34/WG3, it was decided to move the technical work on XTM models and interchange formats back into SC34, and that transfer is one of the sources from which the Topic Map models have begun. The XTM document type definition is out for ballot as a technical corrigendum for ISO/IEC 13250. At this meeting, WG3 endorsed establishing a liaison with the group when it completes its move to OASIS, a consortium in the structured-information industry.

The Recommendations of the WG3 meeting are available online. Documents distributed during the meeting are listed in Appendix C.

Conference: Extreme Markup Languages 2001

The Graphic Communications Association (GCA), started as an affiliate of Printing Industries of America (PIA), has been a supporter of SGML and its applications from the earliest days. Its conferences on SGML-related topics had already grown steadily over the years, but the arrival of first HTML and then XML has caused an explosion of participation in both North America and Europe. Earlier this summer, GCA became separate from PIA and changed its name to IDEAlliance. Extreme Markup Languages is IDEAlliance's most technical conference in the area of SGML, XML, and related technologies.

This year's Extreme Markup Languages conference in Montréal continued several themes from last year's, particularly the nature of markup languages, schema languages, and the relationship between RDF and Topic Maps.

Last year's conference, with Michael Sperberg-McQueen's keynote on the "Meaning and Interpretation of Markup" and Allen Renear's "The Descriptive/Procedural Distinction Is Flawed," seems to have started a trend of considering what SGML and XML actually mean. This year Wendell Piez took up the theme with his "Beyond the Descriptive vs. Procedural Distinction," taking a rhetorical approach to both the intent and the meaning of markup and providing a new look at validation strategies. In "XML, Stylesheets and the Remathematicalization of Formal Content," Andrea Asperti and colleagues at the University of Bologna examined formal proofs and the means of linking the logical content of mathematical expressions to their presentation. Schema languages for XML have proliferated recently, and at the conference they underwent formal analysis from Henry Thompson and Richard Tobin, as well as from Makoto Murata. (Murata also presented the most recent status of RELAX NG, which merges his RELAX schema language with James Clark's TREX.) The theoretical theme of the conference filtered its way down into specific technologies like Topic Maps, as could be seen in "Towards a General Theory of Scope," by Steve Pepper and Geir Ove Grønmo.

Elaine Svenonius's keynote, "The Intellectual Foundations of Knowledge Representations," which presented an ontology of cataloging and knowledge-representation systems, set the tone for much of the conference. More than a third of the presentations at the conference dealt with knowledge representation, usually through some aspects of Topic Maps or RDF. Indeed, a third of those knowledge-related presentations dealt with Topic Maps and RDF. The SGML community started out just trying to capture documents in an enduring electronic format. Now we've moved far beyond that: we're concerned with what those documents mean, and with what the documents we can't capture mean. SGML and XML have moved beyond being metagrammars for tagging text and have spawned metalanguages for navigating information.

Another major theme of the conference was information transformation. There were a number of presentations and tutorials on XSL and related topics. Papers like "XSL and Hyperdocuments: Applying XSL to Arbitrary Groves and Hyperdocuments," by Eliot Kimber and his colleagues at DataChannel, show how transformation has moved far beyond simply applying stylesheets to prepare documents for printing.

The conference was quite lively. Interest in the SGML/XML world continues to grow rapidly, as does, more importantly, support for SGML/XML applications.

As mentioned above, is the operating name of the group that created the XTM (XML Topic Map) interchange specification. Having accomplished its initial goal of developing XTM, it has passed the technical work back to SC34/WG3. The group is now in the process of becoming a member section of OASIS. As part of OASIS, the group will be able to promote the use of Topic Maps and develop applications and profiles for using Topic Maps.

(The name is connected to a Web site, that supported the XTM effort. The Internet domain name "" belonged to Michel Biezunski, who has transferred it to OASIS, which will support the Web site on its servers. OASIS is an established industry consortium that was formerly known as SGML Open. OASIS, in cooperation with the United Nations, has developed the ebXML electronic-business specification. It has cosponsored numerous conferences with GCA, and one of its committees is responsible for maintenance of DocBook, an XML document specification that is widely used in publishing for the computer industry.)

As part of the reorganization under OASIS, the group is developing a new charter that will be submitted to the OASIS board of directors. At this meeting, we selected interim officers: Eric Freese, the chairman of the earlier organization, is the interim chairman of the new one. Steve Pepper, the Convenor of SC34/WG3, is the interim marketing lead. I am the interim technical lead, responsible for coordinating any technical committees we create.

Conclusion and Recommendations

The world of SGML appears to be quite healthy, whether one looks at the fundamental level of standards development or surface layers of application.

Although DOE has been involved with SGML and related standards since the late 1970s, interest in these subjects has tended to reside in specialized groups. The rise of the WWW brought a casual, if frequently effective, use of SGML (in the form of HTML) to a wide community but did not spread wide understanding of the underlying technology. The rise of XML and its adoption by major software houses suggest that use will become even more widespread. For some uses, a casual approach to XML may suffice. However, for records, product data, interpretive knowledge bases, and other mission-sensitive information, DOE should take an active position on the development and use of SGML-related standards.

The growth of Topic Maps and other XML-based mechanisms for knowledge engineering has potentially great impacts on mission-critical information for DOE and NNSA. As NNSA's weapons programs increasingly call for electronic data capture, there is a need for stable mechanisms for both capturing and cataloging the information. Particularly in the case of stockpile life-extension programs, there is a need for this data to be usable for decades after it is collected. Current methods of collecting the data do not offer adequate assurance that the data will continue to be usable. Adoption and implementation of standard methods based in SGML/XML should be a high priority for DOE and NNSA.

The application of XML and Topic Maps to knowledge management in projects such as that for the Ferret classification engine should be pursued. This technology will aid the creation and maintenance of knowledge bases and the extension of the Ferret engine beyond its current local application. Application of Topic Maps to classification guidance at the Office of Nuclear Security Information should lead to better distribution of classification information within DOE and NNSA.

Because DOE is one of the organizations adopting SC34 standards, it should continue active participation in SC34's work, particularly the work on Topic Maps. As DOE's use of these standards increases, so will the need for continued commitment to their maintenance and extension. DOE should also keep aware of developments in the realm of applications by participating in conferences and developers' groups. Furthermore, DOE should establish more internal means for sharing tools, techniques, and applications. Extension of the NWIG metadata system and construction of a comprehensive records system such as that proposed by Y-12's WRAP project can profit from DOE's future support of SGML/XML. Ferret technology seems a good candidate for extension to other DOE facilities and perhaps for commercialization as well. Y-12, as the leader in development of SGML-related standards, is in a good position to continue also as a leader in their application.


Future meetings

SC34 has the following meetings scheduled for the next year:

8-13 December 2001
10 March 2002
May 2002

Project meetings may also be scheduled between SC34 meetings.

SC34 continues to schedule most of its meetings in conjunction with conferences sponsored by GCA. These conferences generally deal with SGML, XML, HyTime, DSSSL, and related topics; combining meetings with the GCA conferences allows a reduction in the number of trips for experts who participate in both activities. My travel to this meeting was supported in part by GCA.

