Title: | Report of Official Foreign Travel to Germany, 17 May-1 June 2001 |
Source: | James D. Mason, Chairman, JTC1/SC34 |
Project: | All SC34 Projects |
Project editor: | All SC34 Editors |
Status: | This report was submitted to the U.S. Department of Energy and the National Nuclear Security Administration as part of the requirements for official travel by the author. |
Action: | |
Date: | 18 June 2001 |
Summary: | |
Distribution: | SC34 and Liaisons |
Refer to: | |
Supersedes: | SC34 N170 |
Reply to: | Dr. James David Mason (ISO/IEC JTC1/SC34 Chairman) Y-12 National Security Complex Information Technology Services Bldg. 9113 M.S. 8208 Oak Ridge, TN 37831-8208 U.S.A. Telephone: +1 865 574-6973 Facsimile: +1 865 574-1896 E-mail: mailto:[email protected] http://www.y12.doe.gov/sgml/sc34/sc34oldhome.htm Ms. Sara Hafele, ISO/IEC JTC 1/SC 34 Secretariat American National Standards Institute 11 West 42nd Street New York, NY 10036 Tel: +1 212 642 4976 Fax: +1 212 840 2298 E-mail: [email protected] |
Y/WPP-017
Y-12 National Security Complex
James David Mason
Internet, SGML, and Integration Services
SAIC
18 June 2001
Prepared by the
Y-12 National Security Complex
Oak Ridge,
Tennessee 37831
managed by
BWXT Y-12, L.L.C.
for the
U.S.
DEPARTMENT OF ENERGY
under contract DE-AC05-00OR22800
DISCLAIMER
This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.
The Department of Energy (DOE) and associated agencies have moved rapidly toward electronic production, management, and dissemination of scientific and technical information. The World-Wide Web (WWW) has become a primary means of information dissemination. Electronic commerce (EC) is becoming the preferred means of procurement. DOE, like other government agencies, depends on and encourages the use of international standards in data communications. Like most government agencies, DOE has expressed a preference for openly developed standards over proprietary designs promoted as "standards" by vendors. In particular, there is a preference for standards developed by organizations such as the International Organization for Standardization (ISO) and the American National Standards Institute (ANSI) that use open, public processes to develop their standards.
Among the most widely adopted international standards is the Standard Generalized Markup Language (SGML, ISO 8879:1986, FIPS 152), to which DOE long ago made a commitment. Besides the official commitment, which has resulted in several specialized projects, DOE makes heavy use of coding derived from SGML: Most documents on the WWW are coded in HTML (Hypertext Markup Language), which is an application of SGML. The World-Wide Web Consortium (W3C), with the backing of major software houses like Adobe, IBM, Microsoft, Netscape, Oracle, and Sun, is promoting XML (eXtensible Markup Language), a class of SGML applications, for the future of the WWW and the basis for EC.
In support of DOE's use of these standards, I have served since 1985 as Chairman of the international committee responsible for SGML and related standards, ISO/IEC JTC1/SC34 (SC34) and its predecessor organizations. During my May 2001 trip, I chaired the spring 2001 meeting of SC34 in Berlin, Germany. I also attended XML Europe 2001, a major conference on the use of SGML and XML sponsored by the Graphic Communications Association (GCA), and chaired a meeting of the International SGML/XML Users' Group (ISUG).
In addition to the widespread use of the WWW among DOE's plants and facilities in Oak Ridge and among DOE sites across the nation, there have been several past and present SGML- and XML-based projects at the Y-12 National Security Complex (Y-12). Our local project team has done SGML and XML development at Y-12 and Oak Ridge National Laboratory (ORNL) since the late 1980s. SGML is a component of the Weapons Records Archiving and Preservation (WRAP) project at Y-12 and is the format for catalog metadata chosen for weapons records by the Nuclear Weapons Information Group (NWIG). The "Ferret" system for automated classification analysis uses XML to structure its knowledge base. The Ferret team also provides XML consulting to OSTI and DOE Headquarters, particularly the National Nuclear Security Administration (NNSA).
Supporting standards development allows DOE and Y-12 the opportunity both to provide input into the process and to benefit from contact with some of the leading experts in the subject matter. Oak Ridge has been for some years the location to which other DOE sites turn for expertise in SGML, XML, and related topics.
Note: This report continues a series, the most recent of which, Y/WPP-003, reported on the Spring 2000 meeting of SC34 in Paris, France. Other meetings of SC34 during 2000 did not result in foreign trip reports; copies of documentation for these meetings are available from the SC34 site on the Web (http://www.y12.doe.gov/sgml/sc34/sc34oldhome.htm).
This report is available on the SC34 Web site at http://www.y12.doe.gov/sgml/sc34/document/0228.htm. Hyperlinks in the online report connect it to the documents it references on both the SC34 site and at other locations, particularly W3C.
In the Joint Technical Committee on Information Technology (JTC1) of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), the responsibility for standards in the area of Document Description and Processing Languages lies with ISO/IEC JTC1/SC34 (SC34), which I chair.
One of SC34's standards, SGML (ISO 8879), is among the most widely used of all ISO standards. It was adopted by the European Community and the U.S. Department of Defense in the 1980s and by DOE soon afterwards. SGML has been widely used in industrial documentation, legal and insurance publishing, and many other areas. Within DOE, the Nuclear Weapons Information Group (NWIG) has adopted SGML as the form for metadata in catalogs of weapons data at DOE sites.
SGML is the base on which HTML (http://www.w3.org/TR/html4/), the coding convention for most documents on the WWW, was built. W3C has recently been promoting a more flexible approach to coding systems that they call XML (http://www.w3.org/XML/Activity), which is a potentially very large class of SGML applications that is already becoming dominant in EC on the WWW. Because HTML, as a single SGML application, has only one set of tags to identify information elements, developers of WWW content have been frustrated with its limitations. XML, which allows users to develop new SGML applications with elements and tags designed to reflect their particular information needs, is gaining wide acceptance. Both Microsoft and Netscape support XML in their WWW browsers, and Adobe, IBM, Microsoft, Netscape, Oracle, Sun, and other major software houses support it across their product lines. The W3C has replaced HTML 4.0 with a new XML application, XHTML (http://www.w3.org/TR/xhtml1/).
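The contrast between HTML's fixed tag set and XML's user-defined elements can be illustrated with a minimal sketch. The element names (catalog, record, title, site, year) and the data are invented for illustration and do not come from any DOE or SC34 application; the code uses only Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML application: these element names are assumptions made
# for this sketch, chosen to describe the information rather than its layout.
CATALOG_XML = """\
<catalog>
  <record id="r1">
    <title>Assembly procedure</title>
    <site>Site A</site>
    <year>1987</year>
  </record>
  <record id="r2">
    <title>Inspection report</title>
    <site>Site B</site>
    <year>1993</year>
  </record>
</catalog>
"""


def records_for_site(xml_text, site):
    """Return (id, title) pairs for records whose <site> element matches."""
    root = ET.fromstring(xml_text)
    return [
        (rec.get("id"), rec.findtext("title"))
        for rec in root.iter("record")
        if rec.findtext("site") == site
    ]


if __name__ == "__main__":
    print(records_for_site(CATALOG_XML, "Site A"))
```

Because the tags name the information they contain, a program can select records by meaning (here, by site) instead of guessing from presentation markup such as headings or table cells.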
The SC34 project gaining the most attention recently is Topic Maps (ISO/IEC 13250:2000), which describes metadata structures for organizing and indexing large collections of information resources. The Topic Map standard seems poised to have a major effect on knowledge-management applications. Topic Maps are being used for the Ferret knowledge base and are being investigated as a mechanism for maintaining and publishing classification guidance on a DOE-wide basis. Topic Maps also have good potential in other knowledge-preservation activities.
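As a rough illustration of the kind of metadata structure involved, the sketch below models the three basic Topic Map constructs (topics, occurrences that point to information resources, and associations between topics) as plain Python objects. The class and method names are assumptions made for this example; they are not taken from ISO/IEC 13250 or from the Ferret system.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    topic_id: str
    name: str
    # Occurrences: addresses of information resources relevant to the topic.
    occurrences: list = field(default_factory=list)


@dataclass
class Association:
    kind: str
    members: dict  # role name -> topic_id


class TopicMap:
    """Hypothetical in-memory topic map; not an implementation of the standard."""

    def __init__(self):
        self.topics = {}
        self.associations = []

    def add_topic(self, topic_id, name, occurrences=()):
        self.topics[topic_id] = Topic(topic_id, name, list(occurrences))

    def associate(self, kind, **members):
        # Every member of an association must be a known topic.
        for topic_id in members.values():
            if topic_id not in self.topics:
                raise KeyError(topic_id)
        self.associations.append(Association(kind, dict(members)))

    def related(self, topic_id):
        """Return sorted names of topics that share an association with topic_id."""
        names = set()
        for assoc in self.associations:
            ids = list(assoc.members.values())
            if topic_id in ids:
                names.update(self.topics[t].name for t in ids if t != topic_id)
        return sorted(names)


if __name__ == "__main__":
    tm = TopicMap()
    tm.add_topic("sgml", "SGML", ["http://example.org/sgml.html"])
    tm.add_topic("xml", "XML")
    tm.associate("derived-from", source="sgml", derivative="xml")
    print(tm.related("sgml"))
```

The index (topics and associations) lives apart from the resources it points to, which is what lets one map organize a large collection without modifying the documents themselves.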
The SC34 meeting was held at the Internationales Congress Centrum in Berlin, Germany. The attendance at the spring meeting of SC34 included 22 experts representing 8 countries (Canada, France, Germany, Japan, the Netherlands, Norway, the United Kingdom, and the United States) and two external liaison bodies (SGML Users' Group and ISO TC184/SC4, Industrial Data).
The opening plenary was held on Saturday, 10 June 2001, with reports from national bodies, liaison organizations, and project editors. After the opening plenary, SC34 broke into its component Working Groups: Markup Languages (WG1), Information Presentation (WG2), and Information Association (WG3). Following the pattern established in 2001, we held SC34 plenaries at the beginning and end of the GCA conference, with WG meetings scheduled at free intervals during the conference.
SC34/WG1 is responsible for SC34's oldest ISO standard, SGML (ISO 8879:1986), the basis for many other SC34 standards as well as for the W3C's XML suite of recommendations. SGML is stable and well supported. SC34 has published two Technical Corrigenda (TCs) to SGML to support internationalization of text (through UNICODE/ISO 10646) and to formalize expression of some of the constraints imposed on applications by XML.
At this meeting, SC34/WG1 examined the current status of RELAX (Regular Expression Language for XML, http://www.xml.gr.jp/relax/), a method for defining XML applications that has been proposed as an ISO technical report. While generally sympathetic to the RELAX mechanism and its goals, WG1 raised procedural questions about how the report has been processed. WG1 also proposed to SC34 the development of a standard (rather than a technical report) for advanced application definition, based on the latest designs by the developer of RELAX and other similar projects.
SC34/WG2 continued maintenance of its standards on fonts and related topics, as well as of SPDL (Standard Page Description Language, ISO/IEC 10180) and DSSSL (Document Style Semantics and Specification Language, ISO/IEC 10179). New DSSSL applications are entering the commercial market, including one for printing Braille. The WG2 Convenor also presented a report on WG2 activities at the GCA conference.
The Recommendations of the WG2 meeting are available online at http://www.y12.doe.gov/sgml/sc34/document/0217.htm.
SC34/WG3 works mainly on matters of hypertext and multimedia documents and linking. The new Topic Maps (ISO/IEC 13250, http://www.y12.doe.gov/sgml/sc34/document/0129.pdf) standard, published last year, occupies most of WG3's effort. Since the Paris meeting of SC34, WG3 has had new projects for a Topic Map Conceptual Model and a Topic Map Query Language approved. At this meeting, WG3 proposed a project for a Topic Map Constraint Language.
ISO/IEC 13250 is specified in terms of HyTime (Hypermedia/Time-Based Structuring Language, ISO/IEC 10744). Although HyTime is immensely powerful and has heavily influenced other projects, such as the W3C's work on advanced hyperlinking, it has a reputation for being difficult to understand and apply. Since the adoption of the Topic Map standard last year, a small group, a number of whom are active in SC34, has been working on XTM, a project to create an XML interchange representation of Topic Maps, with hyperlinking according to W3C recommendations rather than full HyTime linking. Although the XTM development group, operating as TopicMaps.org (http://www.topicmaps.org/), has been successful in many of its goals, it has been considering reorganization. At this meeting of WG3, it was decided to move the technical work on XTM models and interchange formats back into SC34. In particular, the XTM document type definition has been proposed as a technical corrigendum for ISO/IEC 13250. (Other parts of the work of TopicMaps.org, such as promotion of the standard and support for the user community, will probably move to OASIS, the Organization for the Advancement of Structured Information Standards, a consortium in the structured-information industry, http://www.oasis-open.org/.)
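To give a sense of what an XML interchange representation with W3C-style linking looks like in practice, the following sketch reads a small topic map fragment and builds an index of topic names and resource links. The fragment is modeled on XTM 1.0 as published by TopicMaps.org; the element names and namespace URIs shown should be checked against the specification before being relied on, and the sample topics and example.org address are invented.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as understood from XTM 1.0 and XLink; verify before reuse.
XTM_NS = "http://www.topicmaps.org/xtm/1.0/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Invented sample data: two topics, one with a linked resource (an occurrence).
SAMPLE = f"""\
<topicMap xmlns="{XTM_NS}" xmlns:xlink="{XLINK_NS}">
  <topic id="sgml">
    <baseName><baseNameString>SGML</baseNameString></baseName>
    <occurrence>
      <resourceRef xlink:href="http://example.org/sgml-overview.html"/>
    </occurrence>
  </topic>
  <topic id="xml">
    <baseName><baseNameString>XML</baseNameString></baseName>
  </topic>
</topicMap>
"""


def index_topics(xtm_text):
    """Map each topic id to (base name, list of occurrence link targets)."""
    ns = {"xtm": XTM_NS}
    href_attr = f"{{{XLINK_NS}}}href"  # ElementTree's {namespace}name form
    root = ET.fromstring(xtm_text)
    index = {}
    for topic in root.findall("xtm:topic", ns):
        name = topic.findtext("xtm:baseName/xtm:baseNameString", namespaces=ns)
        links = [
            ref.get(href_attr)
            for ref in topic.findall("xtm:occurrence/xtm:resourceRef", ns)
        ]
        index[topic.get("id")] = (name, links)
    return index


if __name__ == "__main__":
    for topic_id, (name, links) in index_topics(SAMPLE).items():
        print(topic_id, name, links)
```

The links are ordinary XLink href attributes, so a generic XML parser is enough to follow them; no HyTime-aware software is required, which is the practical appeal of the XTM approach described above.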
The Recommendations of the WG3 meeting are available online at http://www.y12.doe.gov/sgml/sc34/document/0222.htm.
SC34 is pleased that its standards continue to attract attention and new applications. The group is particularly pleased by the high level of participation in its work and by the excitement that Topic Maps is generating. The increase in the number of projects related to Topic Maps and the consolidation of the technical work on the standard in SC34/WG3 reflect the maturing of this area of standardization.
The Resolutions of the SC34 Meeting (http://www.y12.doe.gov/sgml/sc34/document/0213res.htm) are available online as formal statements of the accomplishments of the meeting. The SC34 library also includes the Report of the SC34 Secretariat (http://www.y12.doe.gov/sgml/sc34/document/0205.doc), which lists all the formal projects in SC34 and their editors. Documents distributed during the meeting are listed in Appendix C.
The GCA (an affiliate of Printing Industries of America) has been a supporter of SGML and its applications from the earliest days. Their conferences on SGML-related topics had already grown steadily over the years, but the arrival of first HTML and then XML has caused an explosion of participation in both North America and Europe.
The conference, which generally had several concurrent tracks, was too vast for me to absorb by myself (I have the proceedings in both paper and electronic form for anyone wanting to inspect them). Much of the attention at the conference (and the associated vendor showcase) is on EC technology. Many vendors are showing tools for putting existing databases and product catalogs on the Web using XML technology. However, there also seems to be a resurgence of some of the traditional SGML/XML applications, such as high-quality publishing. As I have done at many earlier GCA conferences, I participated in a session in which standards-developing bodies (e.g., ISO, W3C) reported on the current state of their work.
The track on Topic Maps and knowledge management continues to draw attention, as it did last year in Paris. I attended all the sessions, looking for refinements for my ideas about how to apply Topic Maps to local projects and for tools to aid in the manipulation and visualization of data represented in maps. I presented a paper on the use of topic maps for building the knowledge base for the Ferret classification engine developed by Y-12. I had previously presented a preliminary approach to an XML knowledge base at an August 2000 GCA conference in Montréal. The current approach represents the entire knowledge base in the XTM application; the paper was well received. My paper, Y/WPP-011, is available online at http://www.y12.doe.gov/~mxm/open/Papers/Ferret.PDF.
The conference was quite lively. Interest in the SGML/XML world continues to grow rapidly, as does, more importantly, support for SGML/XML applications.
The SGML Users' Group was formed at GCA's 1984 conference at Oxford University. Incorporated as ISUG, a nonprofit organization with offices in the United Kingdom, it now has branches in most Western European countries (http://www.isgmlug.org/). ISUG regularly sends a delegation to SC34 meetings and provides editors for several standards, including HyTime and Topic Maps. This is my third year as president of ISUG. At the Annual General Meeting, held in conjunction with XML Europe, we discussed ways of improving our outreach and services to members. One new service may be a reduced rate for individual memberships in OASIS. We are also looking at ways to support the emerging Topic Map community. Copies of the ISUG newsletter are available in my office.
The world of SGML appears to be quite healthy, whether one looks at the fundamental level of standards development or surface layers of application.
Although DOE has been involved with SGML and related standards since the late 1970s, interest in these subjects has tended to reside in specialized groups. The rise of the WWW brought a casual, if frequently effective, use of SGML (in the form of HTML) to a wide community but did not spread wide understanding of the underlying technology. The rise of XML and its adoption by major software houses suggests that use will become even more widespread. For some uses, a casual approach to XML may suffice. However, for records, product data, interpretive knowledge bases, and other mission-sensitive information, DOE should take an active position on the development and use of SGML-related standards.
The growth of Topic Maps and other XML-based mechanisms for knowledge engineering has potentially great impacts on mission-critical information for DOE and NNSA. As NNSA's weapons programs increasingly call for electronic data capture, there is a need for stable mechanisms for both capturing and cataloging the information. Particularly in the case of stockpile life-extension programs, there is a need for this data to be usable for decades after it is collected. Current methods of collecting the data do not offer adequate assurance that the data will continue to be usable. Adoption and implementation of standard methods based in SGML/XML should be a high priority for DOE and NNSA.
The application of XML and Topic Maps to knowledge management in projects such as that for the Ferret classification engine should be pursued. This technology will aid the creation and maintenance of knowledge bases and the extension of the Ferret engine beyond its current local application.
Because DOE is one of the organizations adopting SC34 standards, it should continue active participation in SC34's work, particularly the work on Topic Maps. As DOE's use of these standards increases, the need for continued commitment to their maintenance and extension will increase as a consequence. DOE should also keep aware of developments in the realm of applications by participating in conferences and developers' groups. Furthermore, DOE should establish more internal means for sharing tools, techniques, and applications. Extension of the NWIG metadata system and construction of a comprehensive records system such as that proposed by Y-12's WRAP project can profit from DOE's future support of SGML/XML. Ferret technology seems a good candidate for extension to other DOE facilities and perhaps for commercialization as well. Y-12, as the leader in development of SGML-related standards, is in a good position to continue also as a leader in their application.
SC34 has the following meetings scheduled for the next year:
Group | Dates | Location | Host |
SC34/WG3 | 11 August 2001 | Montréal | GCA |
SC34 | 8-13 December 2001 | Orlando | GCA |
SC34 | May 2002 | Barcelona | GCA |
Project meetings may also be scheduled between SC34 meetings.
SC34 continues to schedule most of its meetings in conjunction with conferences sponsored by GCA. These conferences generally deal with SGML, XML, HyTime, DSSSL, and related topics; combining meetings with the GCA conferences allows a reduction in the number of trips for experts who participate in both activities. My travel to this meeting was supported in part by GCA.
The attendance list from the SC34 meeting.
The Resolutions from the SC34 meeting.
The documents issued during the meeting.