ISO/IEC JTC 1/SC34 N0332

ISO/IEC JTC 1/SC34

Information Technology --

Document Description and Processing Languages

Title: Report of Official Foreign Travel to Canada
2-10 August 2002
Source: James D. Mason, Chairman, JTC1/SC34
Project: All SC34 Projects
Project editor: All SC34 Editors
Status: This report was submitted to the U.S. Department of Energy and the National Nuclear Security Administration as part of the requirements for official travel by the author.
Action:
Date: 30 August 2002
Summary:
Distribution: SC34 and Liaisons
Refer to:
Supersedes: SC34 N324
Reply to: Dr. James David Mason
(ISO/IEC JTC1/SC34 Chairman)
Y-12 National Security Complex
Information Technology Services
Bldg. 9113 M.S. 8208
Oak Ridge, TN 37831-8208 U.S.A.
Telephone: +1 865 574-6973
Facsimile: +1 865 574-1896
E-mail: mailto:[email protected]
http://www.y12.doe.gov/sgml/sc34/sc34oldhome.htm

Ms. Sara Hafele Desautels, ISO/IEC JTC 1/SC 34 Secretariat
American National Standards Institute
11 West 42nd Street
New York, NY 10036
Tel: +1 212 642 4976
Fax: +1 212 840 2298
E-mail: [email protected]


Y/WPP-095

Y-12 National Security Complex

Report of Official Foreign Travel to Canada
2-10 August 2002




James David Mason
Internet, SGML, and Integration Services
Information Technology Services
SAIC



26 August 2002

Prepared by the
Y-12 National Security Complex
Oak Ridge, Tennessee 37831
managed by
BWXT Y-12, L.L.C.
for the
U.S. DEPARTMENT OF ENERGY
under contract DE-AC05-00OR22800

DISCLAIMER

This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.

 

 

Abstract

In support of DOE's use of SGML, XML, HTML, and related standards, I have served since 1985 as Chairman of the international committee responsible for SGML and related standards, ISO/IEC JTC1/SC34 (SC34) and its predecessor organizations. During my August 2002 trip, I attended the summer 2002 meeting of SC34/WG3 in Montréal, Canada. I also read a paper at Extreme Markup Languages 2002, a major conference on the use of SGML and XML sponsored by IDEAlliance.

Supporting standards development allows the Department of Energy/National Nuclear Security Administration (DOE/NNSA) and the Y-12 National Security Complex (Y-12) the opportunity both to provide input into the process and to benefit from contact with some of the leading experts in the subject matter. Oak Ridge has been for some years the location to which other DOE sites turn for expertise in SGML, XML, and related topics.

Note: This report continues a series, the most recent of which, Y/WPP-035, reported on the Spring 2002 meeting of SC34 in Barcelona, Catalonia, Spain. Copies of documentation for all SC34 meetings are available from the SC34 site on the Web: http://www.y12.doe.gov/sgml/sc34/sc34oldhome.htm. This report is available on the SC34 Web site at http://www.y12.doe.gov/sgml/sc34/document/0324.htm. Hyperlinks in the online report connect it to the documents it references.

Introduction

Over the course of the past two decades, SGML (Standard Generalized Markup Language, ISO 8879:1986) and its applications, including HTML (Hypertext Markup Language), and profiles, most notably XML (Extensible Markup Language), have come to dominate the interchange and use of structured data. SGML and many of the standards related to it were developed and are maintained by ISO/IEC JTC1/SC34 (SC34), which I chair.

One of the SC34 projects gaining the most attention recently is Topic Maps (ISO/IEC 13250:2002), which describes metadata structures for organizing and indexing large collections of information resources. The Topic Map standard seems poised to have a major effect on knowledge-management applications. Topic Maps are being used in the knowledge base for the Ferret analytical engine developed at Y-12 and are being investigated as a mechanism for maintaining and publishing classification guidance on a DOE-wide basis. Topic Maps also have good potential as a structuring tool in other knowledge-preservation activities.

In August 2002, I attended a series of meetings in Montréal related to the support of SC34 standards and their application. SC34's Working Group 3 (SC34/WG3), Information Association, which is responsible for Topic Maps, met on Saturday, 3 August. The Extreme Markup Languages 2002 conference, sponsored by IDEAlliance, followed during the next week.

Summer Meeting of ISO/IEC JTC1/SC34/WG3, Montréal, Canada

The SC34/WG3 meeting on 3-5 August 2002 was attended by 14 experts representing six countries (France, Germany, Japan, Norway, the United Kingdom, and the United States) and two external liaison bodies (International SGML/XML Users' Group, OASIS). I chaired the meeting in the absence of Steve Pepper, Convenor of WG3.

SC34/WG3 works mainly on matters of hypertext and multimedia documents and linking. The newly revised Topic Maps standard (ISO/IEC 13250, http://www.y12.doe.gov/sgml/sc34/document/0322.htm), which was reissued this year, occupies most of WG3's effort. Since last year, the standard has been revised to include the XML Topic Maps (XTM) interchange data structure developed by TopicMaps.org (now operating under OASIS). At this meeting, WG3 examined two documents related to Topic Map support models, the Standard Application Model, edited by Lars Marius Garshol and Graham Moore (SC34 N329, http://www.y12.doe.gov/sgml/sc34/document/0329.htm), and a Reference Model, edited by Steve Newcomb and Michel Biezunski (SC34 N298, http://www.y12.doe.gov/sgml/sc34/document/0298R1.htm).

The major result of the examination of the models was a series of decisions that are documented in the meeting report. Among the subjects discussed in considerable detail were naming of topics, merging, and the implications of new interpretations for syntax. Probably the most significant decision was a reinterpretation of the "Topic Naming Constraint" to remove some of the undesirable effects of automatic merging that it imposed. Some of these decisions may result in amendments/corrigenda to the Topic Maps standard at a later date.
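For readers unfamiliar with the Topic Naming Constraint, the following minimal sketch illustrates the rule as originally stated in ISO/IEC 13250: topics that share a base name in the same scope are assumed to represent the same subject and are merged. The data layout, the function name merge_by_name, and the sample topics are hypothetical illustrations, not part of the standard or of any WG3 document, and the sketch does not attempt to model the reinterpretation agreed at this meeting.

```python
from collections import defaultdict


def merge_by_name(topics):
    """Group topic ids that share a (base_name, scope) pair.

    `topics` maps a topic id to a set of (base_name, scope) pairs.
    Returns a list of sets of ids that name-based merging would combine.
    Hypothetical helper for illustration only.
    """
    parent = {tid: tid for tid in topics}

    def find(x):
        # Follow parent links to the representative of x's group.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    first_seen = {}  # (base_name, scope) -> first topic id using that name
    for tid, names in topics.items():
        for key in names:
            if key in first_seen:
                union(first_seen[key], tid)
            else:
                first_seen[key] = tid

    groups = defaultdict(set)
    for tid in topics:
        groups[find(tid)].add(tid)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Hypothetical sample data: t1 and t2 really are the same subject,
# but p1 and p2 are two different people who happen to share a name.
topics = {
    "t1": {("Paris", "city")},
    "t2": {("Paris", "city"), ("Paname", "nickname")},
    "t3": {("Paris", "mythology")},
    "p1": {("John Smith", "unconstrained")},
    "p2": {("John Smith", "unconstrained")},
}
merged = merge_by_name(topics)
```

In this sample, t1 and t2 are merged as intended, but so are p1 and p2, which is one kind of undesirable automatic merging that a purely name-based rule can produce.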

The Report of the WG3 meeting is available online at http://www.y12.doe.gov/sgml/sc34/document/0331.htm. No new documents were distributed at the meeting.

Conference: Extreme Markup Languages 2002

IDEAlliance (formerly the Graphic Communications Association, an affiliate of Printing Industries of America) has been a supporter of SGML and its applications from the earliest days. Its conferences on SGML-related topics had grown steadily over the years, and the arrival of first HTML and then XML has caused an explosion of participation in both North America and Europe. Extreme Markup Languages is IDEAlliance's most technical conference in the area of SGML, XML, and related technologies.

This year's Extreme Markup Languages conference in Montréal revisited several themes from the previous two years, particularly the nature of markup languages, schema languages, and the relationship between RDF and Topic Maps.

This year's discussion of the nature of markup continued with several papers on algebraic approaches to fundamental subjects and on analysis of ways of applying markup to complex subjects. Gavin Nicol's presentation on "Core range algebra" led the algebraic approach and was frequently mentioned by later presenters. Michael Sperberg-McQueen, Allen Renear, David Dubin, and Claus Huitfeldt discussed inferences that can be drawn from markup, returning to a theme that started with Sperberg-McQueen's keynote from two years ago. Wendell Piez, who last year took a rhetorical approach to markup, this year approached it from the perspective of semiotics and structural linguistics. Jeni Tennison showed an algebraic approach to comparing markup languages. Simon St. Laurent examined the possibilities of using stand-off markup (rather than the more conventional embedded tags), an approach favored by hypertext pioneer Ted Nelson. Stand-off markup requires pointers such as can be derived from Nicol's range algebra. Tennison and Piez delivered another paper that showed how to use pointer-based approaches to do layered annotation and multiple hierarchies in a single document. Patrick Durusau presented a different approach to hierarchies, with techniques for selecting among concurrent or overlapping trees in conventional tag-based markup.
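As a rough illustration of what stand-off markup means in practice, the sketch below keeps the text free of tags and stores annotations separately as character ranges pointing into it. This is only a minimal sketch under assumed conventions (half-open character offsets, the hypothetical names Annotation, extract, and crosses); it is not the range algebra or any other technique presented at the conference.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    """A stand-off annotation: a label plus a half-open range [start, end)."""
    start: int
    end: int
    label: str


def extract(text, ann):
    """Return the span of text that an annotation points at."""
    return text[ann.start:ann.end]


def crosses(a, b):
    """True if two ranges partially overlap, so neither contains the other.

    Such ranges cannot both be expressed as properly nested embedded tags.
    """
    return (a.start < b.start < a.end < b.end) or (b.start < a.start < b.end < a.end)


# The text itself carries no markup at all.
text = "The quick brown fox jumps"

# Hypothetical annotations kept apart from the text.
annotations = [
    Annotation(4, 15, "adjectives"),    # points at "quick brown"
    Annotation(10, 19, "noun-phrase"),  # points at "brown fox"
]

spans = [extract(text, a) for a in annotations]
overlapping = crosses(annotations[0], annotations[1])
```

Because the two ranges cross, they could not be written as nested start and end tags in a single tree, which is the kind of overlap that pointer-based approaches are meant to accommodate.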

Jack Park's keynote on the Open Hyperdocument system proposed by Douglas Engelbart examined the implications of having a massively hyperlinked online system for collaboration. The idea of a hyperlinked system for managing information is usually traced back to Vannevar Bush's 1945 article, "As We May Think" (http://www.theatlantic.com/unbound/flashbks/computer/bushf.htm), but Bush's vision could not be implemented with the technology of the day. His intellectual successor, Engelbart, created the ancestor of all modern hypertext systems, NLS (for "oN-Line System"), in the late 1960s; one of the many secondary contributions of this system was the invention of the computer mouse. The continuing theme of Engelbart's research, and of Park's keynote, is the "augmentation of human intellect" through technology. Among the projects that Park discussed are some to put tools for interaction and collaboration in the hands of schoolchildren (e.g., the "Nexist" project, http://nexist.sourceforge.net/, and his paper from an earlier IDEAlliance conference, "Bringing Knowledge Technologies to the Classroom," http://www.thinkalong.com/JP/ParkKT2001.pdf). Eugene Kim, from Engelbart's Bootstrap Institute, and Ken Holman presented a paper on the data structures and interchange formats being used in the Open Hyperdocument system.

Bush's original article proposed that his hypothetical tool for knowledge management, the "Memex," go beyond mere indexing. To Bush, what distinguishes the human mind is its ability to form associations, and the Memex was to be, among other things, a tool for collecting webs of associations. Today, among the most promising tools for collecting associations are systems based on Topic Maps. As at most recent conferences, there were several papers on Topic Maps, ranging from Eric Freese's question "So why aren't Topic Maps ruling the world?" to Mary Nishikawa's presentation of how a large corporation uses Topic Maps in an intranet. Vinh Lê, from the DOE office of Information Classification and Control Policy, and I presented a paper on "Topic Maps for Managing Classification Guidance."

The other papers at the conference were spread over a wide range of subjects, including automatic indexing, instructional workstations, synchronized multimedia, and development tools. There were several papers on querying and database techniques. Because there were parallel tracks, I was unable to attend all the sessions.

As in past years, the conference was quite lively. It showed not only continued rapid growth of interest in the SGML/XML world but also, and probably more importantly, a wide range of intellectual inquiry into techniques and new areas of application. Other XML conferences have increasingly become dominated by electronic-business concerns. Extreme Markup Languages, however, remains centered on the original concern of those of us who developed structured markup more than twenty years ago: enhancing the effectiveness of communication among humans.

Conclusion and Recommendations

The world of markup languages appears to be quite healthy, whether one looks at the fundamental level of standards development or the upper layers of application.

Although DOE has been involved with SGML and structured markup since the late 1970s, interest in these subjects has tended to reside in specialized groups. The rise of the WWW brought a casual, if frequently effective, use of SGML (in the form of HTML) to a wide community but did not spread wide understanding of the underlying technology. The rise of XML and its adoption by major software houses suggests that use will become even more widespread. For some uses, a casual approach to XML may suffice. However, for records, product data, interpretive knowledge bases, and other mission-sensitive information, DOE should take an active position on the development and use of SGML-related standards.

The growth of Topic Maps and other XML-based mechanisms for knowledge engineering has potentially great impacts on mission-critical information for DOE and NNSA. As NNSA's weapons programs increasingly call for electronic data capture, there is a need for stable mechanisms for both capturing and cataloging the information. Particularly in the case of stockpile life-extension programs, there is a need for this data to be usable for decades after it is collected. Current methods of collecting the data do not offer adequate assurance that the data will continue to be usable. Adoption and implementation of standard methods based on SGML/XML should be a high priority for DOE and NNSA.

The application of XML and Topic Maps to knowledge management in projects such as that for the Ferret classification engine should be pursued. The application of Topic Maps to classification guidance at the office of Information Classification and Control Policy, on which we reported at this conference, should lead to better distribution of classification information within DOE and NNSA. The work of SC34/WG3 at this meeting has resulted in some rethinking of the design of the Topic Map being planned for managing classification guidance. Participation in the meeting was highly beneficial. DOE should look for ways of extending the Topic Map technique beyond its current applications.

Because DOE is one of the organizations adopting SC34 standards, it should continue active participation in SC34's work, particularly the work on Topic Maps. As DOE's use of these standards increases, the need for continued commitment to their maintenance and extension will increase as a consequence. DOE should also keep aware of developments in the realm of applications by participating in conferences and developers' groups. Furthermore, DOE should establish more internal means for sharing tools, techniques, and applications. Extension of the NWIG metadata system and construction of a comprehensive records system such as that proposed by Y-12's WRAP project can profit from DOE's future support of SGML/XML. Ferret technology seems a good candidate for extension to other DOE facilities and perhaps for commercialization as well. Y-12, as the leader in development of SGML-related standards, is in a good position to continue also as a leader in their application.

 

Future meetings

SC34 has the following meetings scheduled for the next year:

Group    Dates                 Location

SC34     7-12 December 2002    Baltimore
SC34     May 2003              Amsterdam or Brussels
SC34     December 2003         Philadelphia

Project meetings may also be scheduled between SC34 meetings.

SC34 continues to schedule most of its meetings in conjunction with conferences sponsored by IDEAlliance. These conferences generally deal with SGML, XML, HyTime, DSSSL, and related topics; combining meetings with the IDEAlliance conferences allows a reduction in the number of trips for experts who participate in both activities.

Supplementary documents

The attendance list from the SC34/WG3 meeting.

The Report of the SC34/WG3 meeting.