ISO/IEC JTC 1/SC 34 N0651
ISO/IEC JTC 1/SC 34
Information Technology --
Document Description and Processing Languages
TITLE: | Official trip report of SC34 Chairman - Amsterdam May 2005 |
SOURCE: | Dr. James David Mason |
STATUS: | Trip report |
ACTION: | For information |
DATE: | 2005-06-21 |
DISTRIBUTION: | SC34 and Liaisons |
REPLY TO: |
Dr. James David Mason
Mr. G. Ken Holman |
Y/WPP-115
Report of Official Foreign Travel to the Netherlands
20 May–1 June 2005
James David Mason
Internet, SGML, and Integration Services
Information Technology Services
SAIC
Prepared by the
Y-12 National Security Complex
Oak Ridge, Tennessee 37831
managed by
BWXT Y-12, L.L.C.
for the
U.S. DEPARTMENT OF ENERGY
under contract DE-AC05-00OR22800
Abstract
How can DOE, NNSA, and Y-12 best handle the integration of information from diverse sources, and what will best ensure that legacy data will survive changes in computing systems for the future? Although there is no simple answer, it is becoming increasingly clear throughout the information-management industry that a key component of both preservation and integration of information is the adoption of standardized data formats. The most notable standardized format is XML, to which almost all data is now migrating. XML is derived from SGML, as is HTML, the common language of the World Wide Web. Y-12 already has a knowledge preservation and integration project, the Production Readiness Assessment Topic Map (PRATM), based on XML data, and Y-12 is supporting similar projects at DOE Headquarters, such as the Guidance Streamlining Initiative (GSI).
In support of DOE's use of SGML, XML, HTML, Topic Maps, and related standards, I have served since 1985 as Chairman of the international committee responsible for SGML and standards derived from it, ISO/IEC JTC1/SC34 (SC34) and its predecessor organizations. During my May 2005 trip, I chaired the spring 2005 meeting of SC34 in Amsterdam, Netherlands. I also attended XTech 2005, a major conference on the use of SGML and XML sponsored by IDEAlliance.
Supporting standards development allows the Department of Energy/National Nuclear Security Administration (DOE/NNSA) and the Y‑12 National Security Complex (Y‑12) the opportunity both to provide input into the process and to benefit from contact with some of the leading experts in the subject matter. Y-12 has numerous projects that depend on XML and its applications. Oak Ridge has been for some years the location to which other DOE sites turn for expertise in SGML, XML, and XML-based information.
Note: This report continues a series, the most recent of which, Y/WPP-101, reported on the May 2003 meeting of SC34/WG3 and the IDEAlliance conference XML Europe 2003 in London, England. Copies of documentation for all SC34 meetings are available from the SC34 site on the Web (http://www.y12.doe.gov/sgml/sc34/sc34.htm and http://www.jtc1sc34.org/). This report is available on the SC34 Web site at http://www.y12.doe.gov/sgml/sc34/document/0651.htm. Hyperlinks in the online report connect it to the documents it references.
Introduction
Over the course of the past two decades, SGML (Standard Generalized Markup Language, ISO 8879:1986) and its applications, including HTML (Hypertext Markup Language), and profiles, most notably XML (Extensible Markup Language), have come to dominate the interchange and use of structured data. SGML and many of the standards related to it were developed and are maintained by ISO/IEC JTC1/SC34 (SC34), which I chair.
The SC34 project gaining the most attention recently is Topic Maps (ISO/IEC 13250:2002), which describes metadata structures for organizing and indexing large collections of information resources. The Topic Map standard seems poised to have a major effect on knowledge-management applications. Topic Maps are being used in a knowledge base for Production Readiness planning (PRATM) at Y-12 and are being investigated as a mechanism for maintaining and publishing classification guidance on a DOE-wide basis. Topic Maps also have good potential as a structuring tool in other knowledge-preservation activities.
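As a rough illustration of the kind of structure a topic map provides, the following minimal Python sketch models topics (subjects with names), occurrences (pointers to information resources about a topic), and associations (typed relationships in which each topic plays a role). The class names, identifiers, and file paths are hypothetical; they are not taken from ISO/IEC 13250 or from the PRATM, and the sketch omits most of what the standard defines.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A subject, its names, and locators of resources about it."""
    topic_id: str
    names: list = field(default_factory=list)
    occurrences: list = field(default_factory=list)


@dataclass(frozen=True)
class Association:
    """A typed relationship in which each member topic plays a role."""
    assoc_type: str
    roles: tuple  # (role name, topic id) pairs


class TopicMap:
    def __init__(self):
        self.topics = {}
        self.associations = []

    def add_topic(self, topic_id, name, *occurrences):
        self.topics[topic_id] = Topic(topic_id, [name], list(occurrences))

    def associate(self, assoc_type, **roles):
        # Refuse associations that mention topics the map does not contain.
        for topic_id in roles.values():
            if topic_id not in self.topics:
                raise KeyError(f"unknown topic: {topic_id}")
        self.associations.append(Association(assoc_type, tuple(roles.items())))

    def related(self, topic_id):
        """List (association type, role, topic id) for the other members."""
        found = []
        for assoc in self.associations:
            if any(member == topic_id for _, member in assoc.roles):
                found.extend(
                    (assoc.assoc_type, role, member)
                    for role, member in assoc.roles
                    if member != topic_id
                )
        return found


# Hypothetical example data, for illustration only.
tm = TopicMap()
tm.add_topic("casting", "Casting operation", "docs/casting-procedure.xml")
tm.add_topic("furnace", "Induction furnace", "docs/furnace-manual.xml")
tm.add_topic("operator", "Furnace operator")
tm.associate("uses-equipment", process="casting", equipment="furnace")
tm.associate("performed-by", process="casting", person="operator")

print(tm.related("casting"))
```

The point of the design is that the index (topics and associations) lives apart from the resources it describes, so the same collection of documents can be navigated by subject without being modified.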
Another project gaining attention is Document Schema Definition Languages (DSDL, ISO/IEC 19757), which is drawing participation partly in reaction to the World Wide Web Consortium's XML Schema project.
In May 2005, I attended a series of meetings in Amsterdam related to the support of SC34 standards and their application. SC34 and its Working Group 3 (SC34/WG3), Information Association, which is responsible for Topic Maps, met on Sunday, 22 May. The XTech 2005 conference, sponsored by IDEAlliance, followed during the next week. SC34 meetings continued during the conference, ending on 26 May.
Spring Meeting of ISO/IEC JTC1/SC34, Amsterdam, Netherlands
The SC34 meeting was held at the Amsterdam RAI Conference Center in Amsterdam, Netherlands. The attendance at the spring meeting of SC34 included 28 experts from 8 countries (Australia, Canada, Japan, Korea, the Netherlands, Norway, the United Kingdom, and the United States) and one external liaison body (ISUG, the International SGML/XML Users Group, of which I am President).
The opening plenary was held on Sunday, 22 May 2005, with reports from national bodies, liaison organizations, and project editors. After the opening plenary, SC34 broke into its component Working Groups: Markup Languages (WG1), Information Presentation (WG2), and Information Association (WG3). Following the pattern established in 2001, we held SC34 plenary sessions at the beginning and end of the IDEAlliance conference, with WG meetings scheduled before the conference and then at free intervals during it.
Working Group Meetings
WG1: Markup Languages
SC34/WG1 is responsible for SC34's oldest ISO standard, SGML (ISO 8879:1986), the basis for many other SC34 standards as well as for the W3C's XML suite of recommendations. SGML is stable and well supported. SC34 has published two Technical Corrigenda (TCs) to SGML to support internationalization of text (through Unicode, ISO/IEC 10646) and to formalize expression of some of the constraints imposed on applications by XML.
At this meeting, SC34/WG1 concentrated on its project Document Schema Definition Languages (DSDL, ISO/IEC 19757), which is to provide a pipelined facility for combining mechanisms for defining XML document structures and validating instance documents. This standard now consists of nine normative parts plus a technical report. All parts were worked on at this meeting: Part 2 (RELAX NG) is already an approved standard but is being extended through amendment, and the other parts are at various stages of development.
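The pipelining idea can be sketched in a few lines of Python: independent validation stages, each checking a different kind of constraint, are run in sequence over the same parsed document and their messages are collected. This is only an illustration of the general approach under stated assumptions; the stage functions, element names, and messages are hypothetical, and the code does not use DSDL syntax or any part of ISO/IEC 19757.

```python
import xml.etree.ElementTree as ET


def check_structure(root):
    """Grammar-style stage: the root must be <report> with a <title> child."""
    errors = []
    if root.tag != "report":
        errors.append(f"root element is <{root.tag}>, expected <report>")
    if root.find("title") is None:
        errors.append("missing <title>")
    return errors


def check_rules(root):
    """Rule-style stage: every <section> must carry a non-empty id."""
    return [
        "section without id"
        for section in root.iter("section")
        if not section.get("id")
    ]


def validate(xml_text, stages):
    """Parse once, then run each stage in order and collect all messages."""
    root = ET.fromstring(xml_text)
    messages = []
    for stage in stages:
        messages.extend(stage(root))
    return messages


# Hypothetical pipeline and sample documents.
PIPELINE = [check_structure, check_rules]
good = "<report><title>Trip</title><section id='s1'/></report>"
bad = "<report><section/></report>"

print(validate(good, PIPELINE))  # no messages expected
print(validate(bad, PIPELINE))   # one message from each stage
```

Keeping each stage separate means a grammar-based check and a rule-based check can be developed and replaced independently, which is the motivation for a multi-part standard rather than a single schema language.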
The Recommendations of the WG1 meeting are available online at http://www.y12.doe.gov/sgml/sc34/document/0623.htm.
WG2: Information Presentation
SC34/WG2 continued maintenance of its standards on fonts and related topics, as well as of SPDL (Standard Page Description Language, ISO/IEC 10180) and DSSSL (Document Style Semantics and Specification Language, ISO/IEC 10179). An amendment to DSSSL and one to ISO/IEC TR 19758 (DSSSL Library) were approved for processing as separate documents. Amendments were also approved for ISO/IEC 9541 (Glyph Shape Representation). Work was begun on a new standard for describing the capabilities of document rendering systems.
The Recommendations of the WG2 meeting are available online at http://www.y12.doe.gov/sgml/sc34/document/0622.htm, and the report of the meeting is at http://www.y12.doe.gov/sgml/sc34/document/0332.htm.
WG3: Information Association
SC34/WG3 works mainly on matters of hypertext and multimedia documents and linking. The new Topic Maps (ISO/IEC 13250, http://www.y12.doe.gov/sgml/sc34/document/0322.pdf) standard, published in 2002, occupies most of WG3's effort.
WG3 is currently concentrating on the development of a Reference Model (ISO/IEC 13250-5) and a Data Model (ISO/IEC 13250-2) for topic maps in preparation for the development of a topic-map Query Language (ISO/IEC CD 18048) and Constraint Language (ISO/IEC CD 19756). New versions of both models were presented by their editors and analyzed by the WG. Several documents related to the constraint and query languages were also discussed. Work was also done on the Canonical Syntax (ISO/IEC 13250-4) and XML Syntax for Topic Maps (ISO/IEC 13250-3).
The World Wide Web Consortium (W3C) Semantic Web Best Practices and Deployment Working Group has begun work on interoperability between RDF (Resource Description Framework) and Topic Maps. WG3 members involved in the project urged further support from the ISO community.
The Recommendations of the WG3 meeting are available online at http://www.y12.doe.gov/sgml/sc34/document/0625.htm.
Results of the Meeting
SC34 is pleased that its standards continue to attract attention and new applications. The group is particularly pleased by the high level of participation in its work and by the excitement that DSDL and Topic Maps are generating. The increase in the number of projects related to schema languages and Topic Maps, as well as the consolidation of the technical work in SC34, reflects the maturing of these areas of standardization.
The Resolutions of the SC34 Meeting (http://www.y12.doe.gov/sgml/sc34/document/0626res.htm) are available online as formal statements of the accomplishments of the meeting. The SC34 library also includes the Report of the SC34 Secretariat (http://www.y12.doe.gov/sgml/sc34/document/0647.htm), which lists all the formal projects in SC34 and their editors. Documents distributed during the meeting are listed in Appendix C and are available online through links at http://www.y12.doe.gov/sgml/sc34/document/0650.htm.
Conference: XTech 2005
IDEAlliance, an industry association, has been a supporter of SGML and its applications from the earliest days. Their conferences on SGML-related topics had already grown steadily over the years, but the arrival of first HTML and then XML has caused an explosion of participation in both North America and Europe. The conferences in Europe, first known as Markup, then XML Europe, and now XTech, have been going on for more than 20 years (I spoke on technical publishing at ORNL at Markup 1984 in Oxford).
The XTech 2005 conference, which generally had several concurrent tracks, was too vast for me to absorb by myself (the proceedings are at http://www.idealliance.org/proceedings/xtech05/). This year's conference was, in my opinion and that of many others to whom I have spoken, a considerable improvement over those of the past few years. The coverage was wider, and there was more technical content. The conference Web site is http://www.xtech-conference.org/. There is a good review of the conference by Micah Dubinko on XML.com (http://www.xml.com/pub/a/2005/06/01/deviant.html), and there is a collection of conference blogs at http://www.planetxtech.org/.
The track on Topic Maps and knowledge management continues to draw attention; I attended most of the sessions, looking for refinements for my ideas about how to apply Topic Maps to local projects and for tools to aid in the manipulation and visualization of data represented in maps. One of the papers, which ISUG has already published in interChange, the ISUG journal, was Kal Ahmed's "Topic Mapping the Restoration: 17th Century London on the Semantic Web." Although Ahmed's immediate subject, conversion of Samuel Pepys's celebrated Diary into an online topic map, might seem remote from DOE's current work, it is actually quite relevant to the problems of knowledge representation and preservation: it is necessary to model events, persons and their functions, locations, and resources. A paper that is more clearly related to manufacturing issues is "How can ontologies help repair your car?" by Martin Bryan and Reuben Wright (to be published in the next issue of interChange). Both of these papers have immediate relevance to the PRATM. Kal Ahmed gave a second interesting paper, on building automated tools for generating software documentation as topic maps (particularly in Java-based environments). Other topic maps papers included studies of their application to healthcare and an overview of how they are being used by LexisNexis in a Web-services environment. Another significant presentation was by Lars Marius Garshol and Steve Pepper, of Ontopia (the vendor supporting much of DOE's topic map work), on the work at W3C on harmonization of Topic Maps and RDF.
Another major theme of the conference was Web-browser tools. One of the opening keynotes was by Mike Shaver, project coordinator at the Mozilla Foundation. Much of the attention paid to the Mozilla Firefox browser has been about its security advantages over Internet Explorer, but for this conference the important issue was its XML interfaces. There is a section on the conference on the mozillaZine at http://www.mozillazine.org/talkback.html?article=6750.
I was particularly interested in the conference thread on open data, which began with the opening keynote by Paula Le Dieu, executive director of Creative Commons. The thread might be summarized as knowledge preservation and tools for creative reuse of legacy information. John Wilbanks stirred up a lively discussion with his paper, "Towards a Science Commons." (His father is Tom Wilbanks, at ORNL; I hope to republish his paper in interChange.)
The closing keynote was by Jean Paoli, the XML Architect at Microsoft. Paoli reiterated Microsoft's commitment to XML as a native data format, predicting that 75% of the world's new documents would be created in XML by 2010. (Microsoft has announced that it will use XML as the native data format in the next release of Office, rather than the binary formats it has used heretofore.)
The conference was quite lively. Interest in the SGML/XML world continues to grow rapidly, as does, more importantly, support for SGML/XML applications. A report on the conference will appear in the June 2005 issue of interChange.
Conclusion and Recommendations
The world of SGML/XML appears to be quite healthy, whether one looks at the fundamental level of standards development or surface layers of application.
Although DOE has been involved with SGML and related standards since the early 1980s, interest in these subjects has tended to reside in specialized groups. The rise of the WWW brought a casual, if frequently effective, use of SGML (in the form of HTML) to a wide community but did not spread wide understanding of the underlying technology. The rise of XML and its adoption by major software houses suggests that use will become even more widespread. For some uses, a casual approach to XML may suffice. However, for records, product data, interpretive knowledge bases, and other mission-sensitive information, DOE should take an active position on the development and use of XML-related standards.
The growth of Topic Maps and other XML-based mechanisms for knowledge engineering has potentially great impacts on mission-critical information for DOE and NNSA. As NNSA's weapons programs increasingly call for Electronic Data Capture (EDC), there is a need for stable mechanisms for capturing, integrating, and cataloging the information. Particularly in the case of stockpile life-extension programs, there is a need for this data to be usable for decades after it is collected. Current methods of EDC do not offer adequate assurance that the data will continue to be usable. Adoption and implementation of standard methods based in XML should be a high priority for DOE and NNSA.
The application of XML and Topic Maps to knowledge management in projects such as that for the PRATM and the knowledge base for the Ferret classification engine should be pursued. This technology will aid the creation and maintenance of knowledge bases, as well as the extension of the Ferret engine beyond classification to new applications. The GSI projects in the Information Classification and Control Policy organization to develop an XML-based publishing system for classification guidance and a topic-map guidance-management system are examples of how this technology can be applied.
Because DOE is one of the organizations adopting SC34 standards, it should continue active participation in SC34's work, particularly the work on Topic Maps. As DOE's use of these standards increases, so will the need for continued commitment to their maintenance and extension. DOE should also keep aware of developments in the realm of applications by participating in conferences and developers' groups. Furthermore, DOE should establish more internal means for sharing tools, techniques, and applications. Ferret technology seems a good candidate for extension to other DOE facilities and perhaps for commercialization as well. Y-12, as the leader in development of SGML-related standards, is in a good position to continue also as a leader in their application. The systems for publishing and managing classification guidance will perhaps show a way for even wider DOE application of XML techniques.
Future meetings
SC34 has the following meetings scheduled for the next year:
Group | Dates | Location |
SC34/WG3 | 28–31 July 2005 | Montréal |
SC34 | November 2005 | Atlanta |
SC34 | May 2006 | Seoul |
Project meetings may also be scheduled between SC34 meetings.
SC34 continues to schedule most of its meetings in conjunction with conferences sponsored by IDEAlliance. These conferences generally deal with SGML, XML, HyTime, Topic Maps, and related topics; combining meetings with the IDEAlliance conferences allows a reduction in the number of trips for experts who participate in both activities. My participation in the XTech 2005 conference was supported in part by IDEAlliance.
Appendix A
James David Mason: Itinerary, 20 May–1 June 2005
Dates | Location | Contacts | Purpose |
20–21 May 2005 | Knoxville, Amsterdam | | Travel |
22–26 May 2005 | Amsterdam | Marion Elledge | Meeting of ISO/IEC JTC1/SC34 |
25–27 May 2005 | Amsterdam | Marion Elledge | Conference: XTech 2005 |
28–31 May 2005 | Amsterdam | | Weekend, holiday, and vacation |
1 June 2005 | Amsterdam, Knoxville | | Return travel |
Appendix C
Literature Acquired
ISO Technical Committees are literature intensive. ISO/IEC JTC1/SC34 distributed documents 623-647 in the course of the Amsterdam meeting. These documents are available over the WWW through links from SC34's site; the current document register that covers the documents discussed at the meeting is at http://www.y12.doe.gov/sgml/sc34/document/0650.htm