Information Technology —

Document Description and Processing Languages

TITLE: Report of Official Foreign Travel to France
June 7-20, 2000
SOURCE: James David Mason
PROJECT: All SC34 projects
PROJECT EDITOR: All SC34 editors
ACTION: For information
DATE: 14 July 2000
DISTRIBUTION: SC34 and Liaisons
REPLY TO: Dr. James David Mason
(ISO/IEC JTC1/SC34 Chairman)
Information Technology Services
Oak Ridge Y-12 Plant
Bldg. 9113, M.S. 8208
Oak Ridge, TN 37831-8208 U.S.A.
Telephone: +1 423 574-6973
Facsimile: +1 423 574-18964
Network: [email protected]

Ms. Marisa Peacock, ISO/IEC JTC 1/SC 34 Secretariat
American National Standards Institute
11 West 42nd Street
New York, NY 10036
Tel: +1 212 642 4976
Fax: +1 212 840 2298
Email: [email protected]





Report of Official Foreign Travel to France
June 7-20, 2000

James David Mason
Internet, SGML, and Integration Services
Information Technology Services

July 11, 2000




Prepared by the
Oak Ridge Y-12 Plant
Oak Ridge, Tennessee 37831
managed by
Lockheed Martin Energy Systems, Inc.
for the
U.S. Department of Energy
under contract DE-AC05-84OR21400



This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.



The Department of Energy (DOE) has moved rapidly toward electronic production, management, and dissemination of scientific and technical information. The World-Wide Web (WWW) has become a primary means of information dissemination. Electronic commerce (EC) is becoming the preferred means of procurement. DOE, like other government agencies, depends on and encourages the use of international standards in data communications. Like most government agencies, DOE has expressed a preference for openly developed standards over proprietary designs promoted as "standards" by vendors. In particular, there is a preference for standards developed by organizations such as the International Organization for Standardization (ISO) and the American National Standards Institute (ANSI) that use open, public processes to develop their standards.

Among the most widely adopted international standards is the Standard Generalized Markup Language (SGML, ISO 8879:1986, FIPS 152), to which DOE long ago made a commitment. Besides the official commitment, which has resulted in several specialized projects, DOE makes heavy use of coding derived from SGML: Most documents on the WWW are coded in HTML ("Hypertext Markup Language"), which is an application of SGML. The World-Wide Web Consortium (W3C), with the backing of major software houses like Adobe, IBM, Microsoft, Netscape, Oracle, and Sun, is promoting XML ("eXtensible Markup Language"), a class of SGML applications, for the future of the WWW and the basis for EC.

In support of DOE's use of these standards, I have served since 1985 as Chairman of the international committee responsible for SGML and related standards, ISO/IEC JTC1/SC34 (SC34) and its predecessor organizations. During my June 2000 trip, I chaired the spring 2000 meeting of SC34 in Paris, France. I also attended a major conference on the use of SGML and XML and led a meeting of the International SGML/XML Users' Group (ISUG).

In addition to the widespread use of the WWW among DOE's plants and facilities in Oak Ridge and among DOE sites across the nation, there are several SGML-based projects at the Oak Ridge Y-12 Plant. Our local project team developed an SGML-based publications system that has been used for several major reports at the Y-12 Plant and Oak Ridge National Laboratory (ORNL). SGML is a component of the Weapons Records Archiving and Preservation (WRAP) project at the Y-12 Plant and is the format for catalog metadata chosen for weapons records by the Nuclear Weapons Information Group (NWIG). The "Ferret" system for automated classification analysis will use XML to structure its knowledge base.

Supporting standards development allows DOE and the Y-12 plant the opportunity both to provide input into the process and to benefit from contact with some of the leading experts in the subject matter. Oak Ridge has been for some years the location to which other DOE sites turn for expertise in SGML and related topics.

Note: This report continues a series, the most recent of which, Y/ES-341, reported on the Spring 1999 meeting of SC34 in Granada, Spain. Other meetings of SC34 during 1999 did not result in foreign trip reports; copies of documentation for these meetings are available from the SC34 site on the WWW.

This report is available on the SC34 Web site. Hyperlinks in the online report connect it to the documents it references, both on the SC34 site and at other locations, particularly W3C.


In the Joint Technical Committee on Information Technology (JTC1) of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), the responsibility for standards in the area of Document Description and Processing Languages lies with ISO/IEC JTC1/SC34 (SC34), which I chair.

One of SC34's standards, SGML, is among the most widely used of all ISO standards. It was adopted by the European Community and the U.S. Department of Defense in the 1980s and by DOE soon afterwards. SGML has been widely used for industrial documentation, legal and insurance publishing, and in many other areas. Within DOE, the Nuclear Weapons Information Group (NWIG) has adopted SGML as the form for metadata in catalogs of weapons data at DOE sites.

SGML is the base on which HTML, the coding convention for most documents on the WWW, was built. W3C has recently been promoting a more flexible approach to coding systems that it calls XML, a potentially very large class of SGML applications that is already becoming dominant in EC on the WWW. Because HTML, as a single SGML application, has only one set of tags to identify information elements, developers of WWW content have been frustrated with its limitations. XML, which allows users to develop new SGML applications with elements and tags designed to reflect their particular information needs, is gaining wide acceptance. Both Microsoft and Netscape support XML in their WWW browsers, and Adobe, IBM, Microsoft, Netscape, Oracle, Sun, and other major software houses support it across their product lines. The W3C has replaced HTML 4.0 with a new XML application, XHTML.
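
As a rough illustration of the point about user-defined elements, the following Python sketch reads a small XML record whose tags describe the information itself rather than its presentation. The element names (record, title, site, date) and the function record_fields are invented for this example and are not taken from any NWIG or DOE schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog record. The element names are illustrative only;
# an XML application is free to define whatever tags suit its data.
RECORD = """\
<record>
  <title>Sample technical report</title>
  <site>Oak Ridge</site>
  <date>2000-07-11</date>
</record>
"""


def record_fields(xml_text):
    """Return a dict mapping each child element's tag to its text."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


fields = record_fields(RECORD)
print(fields["site"])
```

Because the tags name the content, a program can pick out a field such as the site directly, which the fixed tag set of HTML does not allow.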

The other projects under development in SC34 are also of interest to the government because they can be used with SGML to develop comprehensive and powerful publications and information management systems. Among the most notable projects are DSSSL (ISO/IEC 10179:1996) and HyTime (ISO/IEC 10744:1992, 1997). W3C is using these standards as the basis for a suite of simplified standards to support stylesheets and linking for XML, just as the full standards support complex SGML applications. Another standard that is gaining attention recently is Topic Maps (ISO/IEC 13250), which allows links to be applied to documents without the need to modify them. The Topic Map standard seems poised to have a major effect on knowledge-management applications. Because these standards have the backing of the major software houses, they are likely to have wide influence in DOE.

Spring Meeting of ISO/IEC JTC1/SC34, Paris, France

The SC34 meeting was held at the Palais des Congrès de Paris. The attendance at the spring meeting of SC34 included 23 experts representing 10 national bodies (Canada, France, Germany, Japan, Korea, the Netherlands, Norway, Sweden, the United Kingdom, and the United States) and two external liaison bodies (SGML Users' Group and ISO TC184/SC4, Industrial Data).

The opening plenary was held on Saturday, 10 June 2000, with reports from national bodies, liaison organizations, and project editors. After the opening plenary, SC34 broke into its component Working Groups: Markup Languages (WG1), Information Presentation (WG2), and Information Association (WG3). In past years, SC34 met the week before the GCA conferences; this year we chose to hold the SC34 plenaries at the beginning and end of the conference, with WG meetings scheduled at free intervals during the conference.

Working Group Meetings

WG1: Markup Languages

SC34's oldest ISO standard, SGML (ISO 8879:1986), the basis for many other SC34 standards as well as for the W3C's XML, XSL, and XLink/XPointer/XPath, is stable and well supported. SC34 has published two Technical Corrigenda (TCs) to SGML to support internationalization of text (through Unicode/ISO 10646) and to formalize expression of some of the constraints imposed on applications by XML.

At this meeting, SC34/WG1 only reviewed the status of the standard and undertook no development work. The WG1 convenor presented the SC34 report at the standards update session at the GCA conference.

WG2: Information Presentation

SC34/WG2 continued maintenance of its standards on fonts and related topics. WG2 presented a demonstration of a new commercial product that is the first complete implementation of DSSSL. WG2 is planning a meeting in Canada before the next SC34 meeting to work on a revision of DSSSL.

The Recommendations of the WG2 meeting are available online.

WG3: Information Association

SC34/WG3 works mainly on matters of hypertext and multimedia documents and linking. HyTime (ISO/IEC 10744), a standard for manipulating multimedia data and hyperlinking diverse forms of information, continues to inspire new applications. A new standard, Topic Maps (ISO/IEC 13250), published earlier this year, uses the HyTime techniques that support the ability to add navigational tools to existing bodies of information. The Topic Map approach draws on a variety of sources, including thesaurus design and cataloging techniques from information science. What the HyTime technology adds to these is a noninvasive means of adding metadata to a variety of sources and the means of creating new information assets from this metadata. The Topic Maps standard is attracting wide attention and was the subject of a major track at the GCA conference.
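
The noninvasive character of this approach can be sketched in a few lines of Python. This is a simplified model for illustration only, not the ISO/IEC 13250 interchange syntax: the class names, the relation label, and the document addresses are all hypothetical. The point is that topics and their associations live in a separate structure that merely points at existing documents, which are never modified.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    # Addresses of existing documents that discuss this topic.
    occurrences: list = field(default_factory=list)


@dataclass
class TopicMap:
    topics: dict = field(default_factory=dict)
    # Each association is a (topic name, relation, topic name) triple.
    associations: list = field(default_factory=list)

    def add_occurrence(self, topic_name, address):
        """Record that the document at `address` is relevant to a topic."""
        topic = self.topics.setdefault(topic_name, Topic(topic_name))
        topic.occurrences.append(address)

    def associate(self, a, relation, b):
        """Link two topics, creating them if they do not exist yet."""
        for name in (a, b):
            self.topics.setdefault(name, Topic(name))
        self.associations.append((a, relation, b))

    def related(self, name):
        """Topics linked to `name` by any association, in insertion order."""
        found = []
        for a, _, b in self.associations:
            if a == name:
                found.append(b)
            elif b == name:
                found.append(a)
        return found


# Hypothetical usage: the addresses stand in for unmodified source files.
tm = TopicMap()
tm.add_occurrence("HyTime", "docs/report-a.sgm")
tm.add_occurrence("Topic Maps", "docs/report-b.sgm")
tm.associate("Topic Maps", "builds-on", "HyTime")
print(tm.related("HyTime"))
```

Keeping the map outside the documents is what allows new navigational views to be built over material that cannot or should not be edited.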

Results of the Meeting

SC34 is pleased that its standards continue to attract attention and new applications. The group is particularly pleased by the high level of participation in its work and by the explosion of attention to Topic Maps.

The Resolutions of the SC34 Meeting are available online as formal statements of the accomplishments of the meeting. The SC34 library also includes the Report of the SC34 Secretariat, which lists all the formal projects in SC34 and their editors.

Conference: XML Europe 2000

The Graphic Communications Association (GCA, an affiliate of Printing Industries of America) has been a supporter of SGML and its applications from the earliest days. Their conferences on SGML-related topics had grown steadily over the years, but the arrival of first HTML and then XML has caused an explosion of participation in both North America and Europe. A measure of the success of XML is the number of major computing suppliers that have links to XML topics featured on their WWW sites (e.g., Adobe, IBM, Microsoft, Oracle, Sun, Xerox) and are distributing free XML software (IBM, Microsoft, Sun).

The conference, which generally had six concurrent tracks, was too vast for me to absorb by myself (I have the proceedings in both paper and electronic form for anyone wanting to inspect them). Much of the attention at the conference (and the associated vendor showcase) is on EC technology. Many vendors are showing tools for putting existing databases and product catalogs on the Web using XML technology. However, there also seems to be a resurgence of some of the traditional SGML/XML applications, such as high-quality publishing. Occasionally the two streams merge, as when a publisher such as Barnes and Noble moves to make their traditional products available in electronic form (for "e-books") through their online bookstore. Another area that continues to attract attention is healthcare informatics; Europe seems to be leading the way, with resistance continuing from HMOs in this country. There continues to be interest in the application of SGML/XML to represent product data from STEP/EXPRESS (work begun in SC34 and continued in TC184/SC4).

The track on Topic Maps drew so much attention that GCA is thinking of adding a new conference just on that area next year. I attended almost all the sessions, looking for refinements in my ideas about how to apply Topic Maps to local projects and for tools to aid in the manipulation and visualization of data represented in maps.

The conference was quite lively, and interest in the SGML/XML world continues to grow rapidly.

ISUG (International SGML/XML Users' Group)

The SGML Users' Group was formed at GCA's 1984 conference at Oxford University. Incorporated as ISUG, a nonprofit organization with offices in the United Kingdom, it now has branches in most Western European countries, as well as affiliates in the U.S. and Canada. ISUG regularly sends a delegation to SC34 meetings and provides editors for several standards. This is my second year as president of ISUG. At the Annual General Meeting, held in conjunction with XML Europe, we discussed ways of improving our outreach and services to members. One new service may be a reduced rate for individual memberships in OASIS (Organization for the Advancement of Structured Information Standards), an industry consortium developing applications of SGML, XML, and other standards. Copies of the ISUG newsletter are available in my office.

Conclusion and Recommendations

The world of SGML appears to be quite healthy, whether one looks at the fundamental level of standards development or surface layers of application.

Although DOE has been involved with SGML and its predecessors since the late 1970s, interest in these subjects has tended to reside in specialized groups. The rise of the WWW brought a casual, if frequently effective, use of SGML (in the form of HTML) to a wide community but did not spread wide understanding of the underlying technology. The rise of XML and its adoption by major software houses suggests that use will become even more widespread. For some uses, a casual approach to XML may suffice. However, for records, product data, and other mission-sensitive information, DOE should take an active position on the development and use of SGML-related standards.

The rapprochement between SGML/XML and STEP/EXPRESS, as reflected in the joint work from SC34 and TC184/SC4, has a potential for benefit to DOE and Defense Programs. As DOE's weapons programs increasingly call for electronic data capture, there is a need for stable mechanisms for both capturing and cataloging the information. Particularly in the case of stockpile life-extension programs, there is a need for this data to be usable for decades after it is collected. Current methods of collecting the data do not offer adequate assurance that the data will continue to be usable. Adoption and implementation of standard methods based in SGML/XML should be a high priority for DOE.

The application of XML and Topic Maps to knowledge management in projects such as that for the Ferret classification engine should be pursued. This technology will aid the creation and maintenance of knowledge bases and the extension of the Ferret engine beyond its current local application.

Because DOE is one of the organizations adopting SC34 standards, it should continue active participation in SC34's work, particularly the work on Topic Maps. As DOE's use of these standards increases, the need for continued commitment to their maintenance and extension will increase as a consequence. DOE should also keep aware of developments in the realm of applications by participating in conferences and developers' groups. Furthermore, DOE should establish more internal means for sharing tools, techniques, and applications. Extension of the NWIG metadata system and construction of a comprehensive records system such as that proposed by the Y-12 Plant's WRAP project can profit from DOE's future support of SGML/XML. Ferret technology seems a good candidate for extension to other DOE facilities and perhaps for commercialization as well. The Y-12 Plant, as the leader in development of SGML-related standards, is in a good position to continue also as a leader in their application.


Future meetings

SC34 has the following meetings scheduled for the next year:

2-7 December 2000

May 2001, Berlin, Germany


Project meetings may also be scheduled between SC34 meetings.

SC34 continues to schedule most of its meetings in conjunction with conferences sponsored by GCA. These conferences generally deal with SGML, XML, HyTime, DSSSL, and related topics; combining meetings with the GCA conferences allows a reduction in the number of trips for experts who participate in both activities. My travel to this meeting was supported in part by GCA.