ISO/IEC JTC 1/SC 34 N0855
ISO/IEC JTC 1/SC 34
Information Technology --
Document Description and Processing
Languages
TITLE: | Official trip report of SC34 Chairman - Oslo March 2007 |
SOURCE: | Dr. James David Mason |
STATUS: | Trip report, published as Y/WPP-129 by the Y-12 National Security Complex |
ACTION: | For information |
DATE: | 2007-04-09 |
DISTRIBUTION: | SC34 and Liaisons |
REPLY TO: |
Dr. James David Mason
Mr. G. Ken Holman |
Y/WPP-129
ISO/IEC JTC1/SC34 N855
Report of Official Foreign Travel
to Oslo, Norway
14-26 March 2007
James David Mason
Internet, SGML, and Integration Services
Information Technology
Services
SAIC
9 April 2007
Prepared by the
Y-12 National
Security Complex
Oak Ridge, Tennessee 37831
managed by
BWXT Y-12,
L.L.C.
for the
U.S. DEPARTMENT OF
ENERGY
under contract DE-AC05-00OR22800
Abstract
How can DOE, NNSA, and Y-12 best handle the integration of information from diverse sources, and what will best ensure that legacy data will survive changes in computing systems for the future? Although there is no simple answer, it is becoming increasingly clear throughout the information-management industry that a key component of both preservation and integration of information is the adoption of standardized data formats. The most notable standardized format is XML, to which almost all data is now migrating. XML is derived from SGML, as is HTML, the common language of the World Wide Web.
XML is becoming increasingly important as part of the Y-12 data infrastructure. Y-12 is implementing a new generation of XML-based publishing systems. Y-12 already has a knowledge preservation and integration project, the Production Readiness Assessment Topic Map (PRATM), based on XML data, and Y-12 is supporting similar projects at DOE Headquarters, such as the Guidance Streamlining Initiative (GSI). XML data is also used in Engineering Releases, and a project is beginning on the capture of product build histories in XML.
In support of DOE's use of SGML, XML, HTML, Topic Maps, and related standards, I have served since 1985 as Chairman of the international committee responsible for SGML and standards derived from it, ISO/IEC JTC1/SC34 (SC34) and its predecessor organizations. During my March 2007 trip, I chaired the spring 2007 meeting of SC34 in Oslo, Norway. I also spoke at Topic Maps 2007, a new international conference on applications of SC34's ISO/IEC 13250. After having chaired this organization for 22 years, I have decided not to seek another term of office, though I plan to continue in the technical work of the committee.
Supporting standards development allows the Department of Energy/National Nuclear Security Administration (DOE/NNSA) and the Y‑12 National Security Complex (Y‑12) the opportunity both to provide input into the process and to benefit from contact with some of the leading experts in the subject matter. Y-12 has numerous projects that depend on XML and its applications. Oak Ridge has been for some years the location to which other DOE sites turn for expertise in SGML, XML, and XML-based information.
SC34 is currently the base in ISO for some very controversial standardization work involving XML representation of the output for common suites of office applications. Because DOE and NNSA are dependent on these applications in their daily operations, they have a vital stake in being active in the standardization work.
Note: This report continues a series, the most recent of which, Y/WPP-115, reported on the May 2005 meeting of SC34 and the IDEAlliance conference XML Europe 2005 in Amsterdam, the Netherlands. Copies of documentation for all SC34 meetings are available from the SC34 site on the Web: (http://www1.y12.doe.gov/capabilities/sgml/sc34.htm and http://www.jtc1sc34.org/). This report is available on the SC34 Web site at http://www.y12.doe.gov/sgml/sc34/document/0855.htm. Hyperlinks in the online report connect it to the documents it references.
Introduction
Over the course of the past two decades, SGML (Standard Generalized Markup Language, ISO 8879:1986) and its applications, including HTML (Hypertext Markup Language), and profiles, most notably XML (Extensible Markup Language), have come to dominate the interchange and use of structured data. SGML and many of the standards related to it and XML were developed and are maintained by ISO/IEC JTC1/SC34 (SC34), which I have chaired for 22 years.
SGML- and XML-based publishing systems have been developed and deployed at numerous DOE and NNSA facilities over the past fifteen or so years. The most recent efforts at Y-12 are currently in progress with the Arbortext Epic system from PTC. The Arbortext system is also being installed at Headquarters and at other field sites, including Sandia and Livermore.
One of the SC34 projects gaining attention recently is Topic Maps (ISO/IEC 13250:2002), which describes metadata structures for organizing and indexing large collections of information resources. The Topic Map standard seems poised to have a major effect on knowledge-management applications. Topic Maps are being used in a knowledge base for Production Readiness planning (PRATM) at Y-12 and are being investigated as a mechanism for maintaining and publishing classification guidance on a DOE-wide basis. Topic Maps also have good potential as a structuring tool in other knowledge-preservation activities.
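As a rough illustration of the kind of metadata structure described above, the following minimal sketch models topics that carry names and pointers to information resources, and merges two collections when their topics share a subject identifier. It is a simplification for illustration only, not the ISO/IEC 13250 data model: the `Topic` class, the `merge_topic_maps` function, the example URIs, and the file names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Topic:
    """A stand-in for one subject (hypothetical, simplified structure)."""
    identifiers: set[str]                                # subject identifiers (URIs)
    names: set[str] = field(default_factory=set)         # human-readable names
    occurrences: set[str] = field(default_factory=set)   # links to resources


def merge_topic_maps(*maps: list[Topic]) -> list[Topic]:
    """Merge topics that share at least one subject identifier."""
    merged: list[Topic] = []
    for topic in (t for m in maps for t in m):
        # Collect every already-merged topic that overlaps with this one.
        matches = [m for m in merged if m.identifiers & topic.identifiers]
        combined = Topic(set(topic.identifiers), set(topic.names),
                         set(topic.occurrences))
        for match in matches:
            combined.identifiers |= match.identifiers
            combined.names |= match.names
            combined.occurrences |= match.occurrences
            merged.remove(match)
        merged.append(combined)
    return merged


# Two independent collections that describe the same (made-up) subject.
engineering = [Topic({"http://example.org/subject/part-42"},
                     {"Part 42"}, {"drawing-42.xml"})]
production = [Topic({"http://example.org/subject/part-42"},
                    {"Component 42"}, {"build-history-42.xml"})]

result = merge_topic_maps(engineering, production)
print(len(result), sorted(result[0].names))
```

The design point is that the two sources never need to agree on names or layout; agreeing on an identifier for the subject is enough for their information to be combined under one topic.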
Another project gaining attention is Document Schema Definition Languages (DSDL, ISO/IEC 19757), which is drawing participation partially because of reactions to the excessive complexity of the World Wide Web Consortium's XML Schema project.
In 2006, OASIS, a standards-developing organization with liaison to SC34, submitted the Open Document Format for Office Applications (ODF) to JTC1. Processed as a Publicly Available Specification, ODF has become ISO/IEC 26300, assigned to SC34. ODF has been widely adopted, particularly by governmental organizations that are encouraged to use Open Source software. Perhaps in response to this, Microsoft submitted the XML format implemented in Office 2007 to the European Computer Manufacturers Association (ECMA International), which has forwarded it to ISO as Office Open XML (OOXML). The latter has been issued for ballot as DIS 29500 since I returned from Oslo. Attention to these two projects has led to a major increase in participation in SC34.
In March 2007, I attended a series of meetings in Oslo related to the support of SC34 standards and their application. The Topic Maps 2007 conference was held on March 20 and 21. SC34 met Thursday, 22 March through Saturday, 24 March.
Conference: Topic Maps 2007
Topic Maps 2007 is a new international conference that has developed out of a series of Norwegian conferences called Emnekart (i.e., Topic Maps). The conference was sponsored by OASIS, a standards body, the Norwegian Computer Society, the University of Oslo, and a number of vendors working in the Topic Maps area. The conference opened with a day of tutorials and then held a day with several concurrent tracks for technical papers.
The conference organizers asked me to deliver the opening keynote and to present a paper that I had done two years ago for the conference Extreme Markup Languages 2004 in Montréal. Both the keynote and the paper discussed Topic Map-based projects I had done at Y-12 over the past six years; they also included some new original research into topic maps (methodology for representing the complexities of pipe organs).
Topic Maps are being widely adopted for a bewildering variety of projects. In Norway, governmental agencies from the local to the national level are standardizing on Topic Map technology for portals to deliver services and information to the citizens. Topic Maps' unique ability to merge diverse sources of information has led to their use as a national aggregator of information. Norway has long used Topic Map technology as a means of coordinating assignments for students in its public schools. Norway is not alone in such portals: half a world away, Nepal is building a Topic Map-based portal for knowledge about the Himalaya Mountains. The conference also included an update on the five years of experience by the U.S. Internal Revenue Service in delivering regulatory information to tax professionals.
Topic Maps were originally developed for cataloging and indexing tools. The conference included a variety of papers on catalogs, ranging from the collections of the Finnish National Gallery to a Web-based service selling indie films. Topic maps are also being investigated in academic cataloging projects, ranging from student work at the University of Leipzig to collections of Korean folksongs.
Spring Meeting of ISO/IEC JTC1/SC34, Oslo, Norway
The SC34 meeting was held at the offices of Bouvet AS in Oslo, Norway. The attendance at the spring meeting of SC34 included 37 experts from 10 countries (Australia, Canada, Germany, Italy, Japan, Korea, the Netherlands, Norway, the United Kingdom, and the United States) and one external liaison body (ECMA International, the European Computer Manufacturers Association). This was one of the largest meetings of SC34 in recent years.
The opening plenary was held on Thursday, 22 March 2007, with reports from national bodies, liaison organizations, and project editors. After the opening plenary, SC34 broke into its component Working Groups: Markup Languages (WG1), Information Presentation (WG2), and Information Association (WG3). A closing plenary was held on Saturday, 24 March.
Working Group Meetings
WG1: Markup Languages
SC34's work began with SGML (ISO 8879:1986), the basis for many other SC34 standards as well as for the W3C's XML suite of recommendations. WG1 has moved beyond its original project to others that reflect current emphasis on XML.
At this meeting, SC34/WG1 concentrated on its project Document Schema Definition Languages (DSDL, ISO/IEC 19757), which is to provide a pipelined facility for combining mechanisms for defining XML document structures and validating instance documents. This standard now consists of nine normative parts plus a technical report. All but one of the parts were worked on at this meeting, either to progress texts to higher stages of approval or to process amendments. WG1 also worked with WG2 on revisions of ISO/IEC TR 9573, which documents the application of XML to publication of ISO standards.
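The idea of a pipelined validation facility can be sketched in a few lines. The example below is not DSDL syntax and does not use any DSDL implementation; it only illustrates, under that assumption, how independent checks (one structural, one rule-based) might be chained over a single parsed XML document. The function names, element names, and sample document are all hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List

# A validator takes a parsed document and returns a list of error messages.
Validator = Callable[[ET.Element], List[str]]


def check_root(expected: str) -> Validator:
    """Structural check: the root element must have the expected name."""
    def validate(root: ET.Element) -> List[str]:
        if root.tag == expected:
            return []
        return [f"root is <{root.tag}>, expected <{expected}>"]
    return validate


def check_required_child(parent: str, child: str) -> Validator:
    """Rule-based check: every <parent> must contain at least one <child>."""
    def validate(root: ET.Element) -> List[str]:
        return [
            f"<{parent}> is missing a <{child}> child"
            for element in root.iter(parent)
            if element.find(child) is None
        ]
    return validate


def run_pipeline(xml_text: str, validators: List[Validator]) -> List[str]:
    """Parse once, then apply each validator in order and collect all errors."""
    root = ET.fromstring(xml_text)
    errors: List[str] = []
    for validate in validators:
        errors.extend(validate(root))
    return errors


# Made-up instance document: the second <section> has no <title>.
report = "<report><section><title>Intro</title></section><section/></report>"
pipeline = [check_root("report"), check_required_child("section", "title")]
errors = run_pipeline(report, pipeline)
print(errors)
```

Keeping each check as a separate step means a new kind of constraint can be added to the pipeline without rewriting the others.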
WG1 is responsible for handling ODF, ISO/IEC 26300, and will be responsible for OOXML, ISO/IEC DIS 29500, when it arrives in SC34.
The Recommendations of the WG1 meeting are available online at http://www1.y12.doe.gov/capabilities/sgml/sc34/document/0836.htm.
WG2: Information Presentation
SC34/WG2 continued maintenance of its standards on fonts and related topics. Work was begun on ISO/IEC DIS 24754, a new standard for describing the capabilities of document rendering systems.
The Recommendations of the WG2 meeting are available online at http://www1.y12.doe.gov/capabilities/sgml/sc34/document/0837.htm.
WG3: Information Association
SC34/WG3 works mainly on matters of hypertext and multimedia documents and linking. The Topic Maps (ISO/IEC 13250, http://www1.y12.doe.gov/capabilities/sgml/sc34/document/0322.pdf) standard, published in 2002, occupies most of WG3's effort.
WG3 is currently conducting a major revision of ISO/IEC 13250, breaking it into a multipart standard, with the addition of the Topic-Map Query Language (ISO/IEC CD 18048) and Constraint Language (ISO/IEC CD 19756). These projects are at varying states of completion, with the Reference Model (Part 5) and Data Model (Part 2) being most advanced. A project for representing Dublin Core Metadata in Topic Map form is just starting (ISO/IEC WD 29111).
The Recommendations of the WG3 meeting are available online at http://www1.y12.doe.gov/capabilities/sgml/sc34/document/0838.htm.
Results of the Meeting
SC34 is pleased that its standards continue to attract attention and new applications. The group is particularly pleased by the high level of participation in its work and by the excitement that DSDL and Topic Maps are generating. The increase in the number of projects related to schema languages and Topic Maps, as well as the consolidation of the technical work in SC34, reflects the maturing of these areas of standardization. There was no work on either ODF or OOXML at this meeting, but together they promise to keep attention on SC34 in the near future.
I have chaired SC34 and its predecessor organizations since 1985. I joined INCITS V1, the U.S. committee that participates in SC34, in 1981. During this time I have observed SGML and its offspring go from exotic technology that was fighting for its existence against the now-forgotten technologies that were then in fashion to being the primary means of representation and exchange of structured data. In the past few months, Ken Holman, who has been a long-time worker in SC34 and has for the past five years managed its Secretariat for the Standards Council of Canada, has decided not to take another term. A new Secretariat host is currently being sought. There is some uncertainty about when my term as Chairman will end, but a new Secretariat has the right to nominate a new Chairman. Accordingly, I have decided that when the new Secretariat is selected, I shall join Mr. Holman in standing down from official positions and returning to standards development rather than administration.
The Resolutions of the SC34 Meeting (http://www1.y12.doe.gov/capabilities/sgml/sc34/document/0839res.htm) are available online as formal statements of the accomplishments of the meeting. The SC34 library also includes the Report of the SC34 Secretariat (http://www1.y12.doe.gov/capabilities/sgml/sc34/document/0842.htm), which lists all the formal projects in SC34 and their editors. Documents distributed during the meeting are listed in Appendix C and are available online through links at http://www1.y12.doe.gov/capabilities/sgml/sc34/document/0850.htm.
Conclusion and Recommendations
The world of SGML/XML appears to be quite healthy, whether one looks at the fundamental level of standards development or surface layers of application. At the very least, contention over the standardization of XML for office application suites suggests that SC34 will be lively in the next year or so.
Although DOE has been involved with SGML and related standards since the early 1980s, interest in these subjects has tended to reside in specialized groups. The rise of the WWW brought a casual, if frequently effective, use of SGML (in the form of HTML) to a wide community but did not spread wide understanding of the underlying technology. The rise of XML and its adoption by major software houses suggests that use will become even more widespread. For some uses, a casual approach to XML may suffice. However, for records, product data, interpretive knowledge bases, and other mission-sensitive information, DOE should take an active position on the development and use of XML-related standards.
The growth of Topic Maps and other XML-based mechanisms for knowledge engineering has potentially great impacts on mission-critical information for DOE and NNSA. As NNSA's weapons programs increasingly call for Electronic Data Capture (EDC), there is a need for stable mechanisms for capturing, integrating, and cataloging the information. Particularly in the case of stockpile life-extension programs, there is a need for this data to be usable for decades after it is collected. Current methods of EDC do not offer adequate assurance that the data will continue to be usable. Adoption and implementation of standard methods based in XML should be a high priority for DOE and NNSA.
The application of XML and Topic Maps to knowledge management in projects such as that for the PRATM and the knowledge base for the Ferret classification engine should be pursued. This technology will aid the creation and maintenance of knowledge bases, as well as the extension of the Ferret engine beyond classification to new applications. The GSI projects in the Information Classification and Control Policy organization to develop an XML-based publishing system for classification guidance and a topic-map guidance-management system are examples of how this technology can be applied.
Because DOE is one of the organizations adopting SC34 standards, it should continue active participation in SC34's work, particularly the work on Topic Maps. Because its daily operations involve applications such as Microsoft Office, DOE has a very large stake in the outcome of standards now in SC34 or coming to it soon. As DOE's use of SC34's standards increases, the need for continued commitment to their maintenance and extension will increase as a consequence. DOE should also keep aware of developments in the realm of applications by participating in conferences and developers' groups. Furthermore, DOE should establish more internal means for sharing tools, techniques, and applications. Ferret technology seems a good candidate for extension to other DOE facilities and perhaps for commercialization as well. Y-12, as the leader in development of SGML-related standards, is in a good position to continue also as a leader in their application. The systems for publishing and managing classification guidance will perhaps show a way for even wider DOE application of XML techniques.
Future meetings
SC34 has the following meetings scheduled for the next year:
Group | Dates | Location |
SC34/WG1 and WG3 | August 2007 | Montréal |
SC34 | October or November 2007 | Kyoto or Leipzig |
SC34 | March 2008 | Oslo? |
Project meetings may also be scheduled between SC34 meetings.
My participation in the Topic Maps 2007 conference and attendance at the SC34 meetings was supported in part by the organizers of the conference.
Appendix A
James David Mason: Itinerary, 14-26 March 2007
Dates | Location | Contacts | Purpose |
14-15 March 2007 | Knoxville, Oslo | | Travel |
16-20 March 2007 | Oslo | | Weekend, holiday, and vacation |
21 March 2007 | Oslo | Are Gulbrandsen | Conference: Topic Maps 2007 |
22-24 March 2007 | Oslo | Lars Marius Garshol | Meeting of ISO/IEC JTC1/SC34 |
25 March 2007 | Oslo | | Weekend |
26 March 2007 | Oslo, Knoxville | | Return travel |
Appendix B
Principal Contacts
The attendance list for the SC34 meeting is at SC34 840.
Literature Acquired
ISO Technical Committees are literature intensive. ISO/IEC JTC1/SC34 distributed documents 623-647 in the course of the Oslo meeting. These documents are available over the WWW through links from SC34's site; the current document register that covers the documents discussed at the meeting is at http://www1.y12.doe.gov/capabilities/sgml/sc34/document/0850.htm
DISCLAIMER
This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.