Submission from the UK of an initial Working Draft for the proposed DSDL
standard. Annex 1 identifies the user requirements for the proposed standard.
This document is submitted as originally supplied. Although the user
requirements are contained in an annex that is marked as normative, the UK
does not consider that these requirements, which are instructions to the
Project Editor, should remain as normative requirements on the users of the
published standard. SC 34 may like to consider whether these requirements
should be contained in a separate User Requirements document that could form
definitive instructions to the editor.
ISO/IEC JTC 1/SC34 N264
ISO/IEC JTC 1/SC34
Information Technology --
Document Description and Processing Languages
TITLE: | U.K. National Body Contribution to First Working Draft of Document Schema Definition Language (DSDL) |
SOURCE: | G. Williams, U.K. |
PROJECT: | |
PROJECT EDITOR: | M. Bryan |
STATUS: | First Working Draft |
ACTION: | This document was included in the NWI comments, but the U.K. intended to have it distributed separately in its entirety to serve as a base document for further development. |
DATE: | |
DISTRIBUTION: | SC34 and Liaisons |
REFER TO: | |
REPLY TO: | Dr. James David Mason |
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work.
In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1. Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International Standard requires approval by at least 75% of the national bodies casting a vote.
International Standard ISO/IEC 13240 was prepared by Joint Technical Committee JTC1, Information technology.
Introduction
SGML Document Type Definitions (DTDs) allow document structures to be formally modelled but do not allow details of data types or data relationships to be recorded in an XML-compatible way. While the W3C XML Schema Definition language (XSD) does allow data types to be used to validate the contents of SGML elements and values of attributes, it does not allow the relationships between the values of different attributes and contents of elements to be validated. A new, compact, efficient and XML-based document type definition for the integrated description of document structures, data types and data relationships will make it possible to automate the processing of structured information resources to the level required by business users, whose requirements are more demanding than those identified by the publishing community for which SGML was originally developed. The standard will also define the scope and notation for converting and interworking a core subset of document structure, data type, and data relationship constraint models among the three notations: DSDL, DTD declarations and XSD.
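As an illustration of the kind of relationship between attribute values referred to above, the following minimal Python sketch checks a constraint that spans two attributes of one element. The element name `range`, its `min` and `max` attributes, and the function name are hypothetical and chosen only for illustration; nothing here is DSDL syntax.

```python
import xml.etree.ElementTree as ET


def find_range_violations(xml_text):
    """Return (min, max) pairs for every <range> element whose min exceeds its max.

    The element and attribute names are hypothetical; they do not come from the draft.
    """
    root = ET.fromstring(xml_text)
    violations = []
    for elem in root.iter("range"):
        low = int(elem.get("min"))
        high = int(elem.get("max"))
        # Each value on its own is a valid integer; only the pair can be inconsistent.
        if low > high:
            violations.append((low, high))
    return violations


sample = '<doc><range min="1" max="5"/><range min="9" max="2"/></doc>'
print(find_range_violations(sample))
```

A data type check on each attribute in isolation would accept both `range` elements in the sample; only a rule that relates the two values can reject the second one.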
1 Scope
1.1 Definition of scope
This International Standard, known as the Document Schema Definition Language (DSDL), allows the definition of document structures, data types and data relationship constraints that can be applied to data represented using the ISO/IEC 8879 Standard Generalized Markup Language and its derivatives, such as ISO/IEC 10744, Hypermedia/Time-based Structuring Language (HyTime), and the W3C Extensible Markup Language (XML).
2 Conformance
To be defined
3 Normative references
ISO 8879:1986, Information processing -- Text and office systems -- Standard Generalized Markup Language (SGML)
W3C Extensible Markup Language (XML) (http://www.w3.org/TR/REC-xml)
W3C XML Schema Part 2: Datatypes (http://www.w3.org/TR/xmlschema-2/)
4 Definitions
5 Symbols and abbreviations
DSDL
Document Schema Definition Language
SGML
Standard Generalized Markup Language (ISO/IEC 8879)
XML
W3C Extensible Markup Language
6 Documentation Conventions
Any references in this document to industry and proprietary standards, products, user groups, and publications are not normative, and do not imply endorsement by ISO, IEC, or their national member bodies or affiliates. Any brand names or trademarks mentioned are the property of their respective owners.
The formal definitions are expressed using the W3C XML subset of SGML.
The formal definitions are part of the text of this International Standard and are protected by copyright. In order to facilitate conformance to DSDL, the formal definitions may be copied as specified in the following copyright notice: Copyright (C) 200? International Organization for Standardization. Permission to copy in any form is granted for use with conforming DSDL systems and applications as defined in ISO/IEC ????, provided this notice is included in all copies. The permission to copy does not apply to any other material in this International Standard.
Note 5. This document uses editorial conventions mandated by the ISO with which the reader should be familiar in order to understand the implications of certain words.
The text describing each construct emphasizes semantics, while the formal XML definition provides the rigorous syntactic definitions underlying the text descriptions.
Note 6. For this reason, it is recommended that the reader refer to the XML definitions while reading the textual descriptions. Although the XML definition always follows the related text, the user may find it helpful to read the XML first in some cases.
When a construct is first introduced, it is described in the text. If the construct occurs in the formal XML specification, both the formal XML name and a full name in English are presented, as follows:
- The element form full construct name (XMLname) ...
- The attribute full construct name (XMLname) ...
7 ???
Annex 1 (normative): Requirements
This standard is designed to provide the following functionality:
- The standard shall provide a means of expressing, in SGML/XML instance format, all of the markup declarations permitted by the WebSGML profile of ISO/IEC 8879 and by Version 1.0 of the W3C Extensible Markup Language (XML).
- The standard shall be capable of identifying external data resources that may validly be included within document instances that conform to the model, including data instances that are in notations other than that defined in this standard.
- The standard shall be capable of identifying the notations required to process those parts of document instances that are not encoded according to the standard.
- The standard should allow the representation within document instances of data in a clearly identified notation that is not intended to be processed by programs that are conformant with this standard.
- The standard shall be capable of importing parts of models from external sources.
- The standard shall provide a means of constraining the number of times a particular element may occur at a given point in a document model to be within a range with specified minimum and/or maximum values.
- The standard shall be capable of identifying the character set to be used to constrain the contents of elements or attributes.
- The standard shall provide a means of constraining the content of attribute values and elements to conform to a particular datatype or pattern based on a formally named, standardized, set of datatyping rules.
- The standard shall provide a means of identifying a set of permitted values against which the content of a particular element or attribute value shall be checked for validity. The set of permitted values may be provided as an external resource, or by reference to an external service using a standardized API.
- The standard shall provide a means by which the model of a document can be altered in response to the contents of a particular element or attribute (e.g. if the contents of an element or attribute recording the sex of a person is set to "Male" the use of any elements or attributes related to pregnancy should be forbidden).
- The standard shall provide facilities for defining "model types" that can form the basis for the models of elements in multiple document type definitions in such a way that users can restrict the use of parts of the model and add application-specific elements to the models at those points at which they are appropriate.
- The standard shall provide a means by which the authority responsible for defining part or all of a document structure can be uniquely identified, with elements defined by different authorities being identifiable as such within document instances.
- The standard shall provide a means by which sections of a document structure can be temporarily disabled without having to define a new document structure.
- The standard shall provide a means by which the rationale for an element, attribute or other information component can be recorded as an annotation to its declaration.
- The standard shall be designed in such a way that it can be extended to include the functions of ISO/IEC 8879 not included in the normative part of this standard.
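Several of the requirements above (occurrence ranges, datatype patterns, permitted values, and models that change in response to a value) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rule tables, the element and attribute names (`person`, `phone`, `id`, `sex`, `pregnancy`) and the `validate` function are all hypothetical, and the sketch does not represent DSDL syntax or any conforming implementation.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rule tables. Every element, attribute and rule name below is an
# assumption made for illustration; none of them is defined by this draft.
OCCURRENCE_RANGES = {("person", "phone"): (1, 3)}            # (parent, child) -> (min, max)
PATTERNS = {("person", "id"): r"P\d{4}"}                     # (element, attribute) -> regex
PERMITTED_VALUES = {("person", "sex"): {"Male", "Female"}}   # (element, attribute) -> values
FORBIDDEN_WHEN = [("person", "sex", "Male", "pregnancy")]    # element, attribute, value, child


def validate(xml_text):
    """Return a list of human-readable constraint violations (empty if none)."""
    errors = []
    for elem in ET.fromstring(xml_text).iter():
        # Occurrence range: count direct children and compare with the bounds.
        for (parent, child), (lowest, highest) in OCCURRENCE_RANGES.items():
            if elem.tag == parent:
                count = len(elem.findall(child))
                if not lowest <= count <= highest:
                    errors.append(f"{parent}: {count} {child} elements, expected {lowest} to {highest}")
        # Datatype pattern: the whole attribute value must match the pattern.
        for (tag, attr), pattern in PATTERNS.items():
            value = elem.get(attr) if elem.tag == tag else None
            if value is not None and re.fullmatch(pattern, value) is None:
                errors.append(f"{tag}/@{attr}: '{value}' does not match {pattern}")
        # Permitted values: the attribute value must be in the enumerated set.
        for (tag, attr), allowed in PERMITTED_VALUES.items():
            value = elem.get(attr) if elem.tag == tag else None
            if value is not None and value not in allowed:
                errors.append(f"{tag}/@{attr}: '{value}' is not a permitted value")
        # Conditional model: a given attribute value forbids a given child element.
        for tag, attr, trigger, forbidden in FORBIDDEN_WHEN:
            if elem.tag == tag and elem.get(attr) == trigger and elem.find(forbidden) is not None:
                errors.append(f"{tag}: {forbidden} is forbidden when {attr} is '{trigger}'")
    return errors


valid_doc = '<people><person id="P0001" sex="Female"><phone/><pregnancy/></person></people>'
invalid_doc = '<people><person id="X1" sex="Male"><pregnancy/></person></people>'
print(validate(valid_doc))
print(validate(invalid_doc))
```

In this sketch an absent attribute is simply skipped rather than reported; whether a missing value should count as a violation is a design decision the draft leaves open.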
Annex 2 (normative): XML DTD for DSDL
Annex 3 (normative): DSDL Description of DSDL
Annex 4 (informative): Alphabetical List of DSDL Components
4.1 DSDL components common to SGML and XML
The following DSDL components can be used to describe
documents conforming to the WebSGML subset of ISO/IEC 8879:
Possible DSDL element/attribute | Defined in clause | Equivalent ISO 8879 Construct | Equivalent XML DTD construct | Equivalent XML Schema element |
<attribute | | [143] attribute definition | AttlistDecl | <attribute |
<attribute | | [144] attribute name | Name | <attribute |
<attribute | | [145] declared value | AttType | <attribute |
<attribute | | [147] default value | DefaultDecl | <attribute |
<attribute | | [147] default value ["FIXED"] | DefaultDecl | <attribute |
<attribute | | [147] default value ["IMPLIED"|"REQUIRED"] | DefaultDecl | <attribute |
<characterSet | | [173] character set description | EncodingDecl | encoding |
<comment | Should this be <annotation? | [91] comment declaration | Comment | N/A |
<data | | From Relax-NG | N/A | <simpleType |
<element | | [116] element declaration | elementdecl | <element |
<element | | [30] generic identifier | Name | <element |
<element | Do we need this? Does it need to conflate with type? | [125] declared content | contentspec | <any |
<element | | From Relax-NG | N/A | <element |
<element | | Extension based on W3C XML Schema that generalizes the specifically named options provided in Relax-NG | N/A | <element |
<element | | From W3C XML Schema (Relax-NG uses a separate ref element) | N/A | <element |
<externalEntity | | [108] external entity specification | GEDecl | N/A |
<externalEntity | | [102] entity name | Name | N/A |
<externalEntity | | [73] external identifier | ExternalID | N/A |
<externalEntity | | [41] notation name | NDataDecl | N/A |
<group | | [127] model group (with modifications based on W3C XML Schema that generalize the specifically named options provided in Relax-NG) | children (as modified by W3C XML Schema) | <complexType |
<inclusion name | | [104] parameter entity name | PEDecl | N/A |
<inclusion | Do we still need to separate out the definition of external parameter entities from their call, or should we move these two properties to the <include element? | | PEDef | N/A (moved to the import request) |
<include | | [60] parameter entity reference | PEReference | <import (but unnamed, with direct reference to the source, see above) |
<localEntity | | [101] entity declaration | GEDecl | N/A |
<localProcess | Do we need this? | [44] processing instruction | PI | N/A |
<markedSection | | [93] marked section declaration | CDSect | N/A |
<notation | | [148] notation declaration | NotationDecl | <notation |
<notation | | [41] notation name | Name | <notation |
<notation | | [149] notation identifier | ExternalID | <notation |
<permittedValue | | Based on W3C XML Schema enumeration and Relax-NG value elements. Extends [145] declared value [name token group] to constrain contents of text fields as well as attribute values | Enumeration (as extended to element content by W3C XML Schema and Relax-NG) | <enumeration |
<schema | Do we need a public identifier? | [110] document type declaration [external identifier] | doctypedecl External ID | <schema + <import or <include |
<schema | | [111] document type name | doctypedecl Name | N/A |
<text | | [47] character data | #PCDATA | |
4.2 DSDL components specific to SGML
The following extensions could be made if it is decided that
DSDL should be able to express all constructs in SGML document instances as
well as the WebSGML subset.
Possible DSDL element/attribute | Defined in clause | Equivalent ISO 8879 Construct |
<applicationInfo | Do we need this? | [199] application-specific information |
<attribute source | | [147] default value ["IMPLIED"|"REQUIRED"| |
<capacitySet publicIdentifier | Do we need this? | [180] capacity set |
<characterDescription | Do we need this? | [176] character description |
<characterDescription startingFrom | Do we need this? | [177] described character set number |
<characterDescription for | Do we need this? | [179] number of characters |
<characterDescription becomes | Do we need this? | [178] base character set number, "UNUSED" or literal |
<externalEntity | Do we need this? | [109] entity type |
<externalEntity | Do we need this? Should the data attributes be defined as the contents of the entity definition? | [149.2] data attribute specification |
<dataTagGroup elementName | Do we need this? Could the data tag details somehow be added directly to the element declaration? | [133] data tag group |
<dataTagGroup paddingTemplate | | [137] data tag padding template |
<dataTagTemplate | | [136] data tag template |
<delimiterAssignment name literal | Do we need this? | [191] general delimiters |
<delimiters | Do we need this? | [190] delimiter set |
<element documentTypes | | [28] document type specification |
<element end-character | Do we need this? | [17] NET-enabling start-tag |
<element mixed | Do we need this? | [25] mixed content |
<element omitStart | | [123] start-tag minimization |
<element omitEnd | | [124] end-tag minimization |
<element rankStem | Do we need this? | [120] rank stem |
<element rankSuffix | Do we need this? | [121] rank suffix |
<element unclosed | Do we need this? | [17] unclosed start-tag |
<exclusions elementNames | | [140] exclusions |
<explicitLink sourceDocType resultDocType | | [158] explicit link specification |
<features | Do we need this? | [195] feature use |
<functionChars | Do we need this? | [186] function character identification |
<idLinkSet | | [168.1] ID link set declaration |
<implicitlink sourceDocType | | [157] implicit link source |
<inclusions elementNames | | [139] inclusions |
<linkRule sourceElementNames | | [163.1] link rule [source element specification] |
<linkRule resultElementNames | | [166.1] explicit link rule [result element specification] |
<linkSet name | | [164] link set name |
<linktype | | [154] link type declaration |
<linktype name | | [155] link type name |
<linktype href publicIdentifier | | [73] external identifier |
<markedSection status | | [93] marked section declaration |
<namingRules | Do we need this? | [189] naming rules |
<quantities | Do we need this? | [194] quantity set |
<reservedName changeFrom changeTo | Do we need this? | [193] reserved name use |
<schema sgmlDeclaration | Do we need this? | [171] SGML declaration |
<sgmlDeclaration name | Do we need this? | [171] SGML declaration |
<shortRefDelimiters | Do we need this? | [191] short reference delimiters |
<shortRefSet name | | [150] short reference mapping declaration |
<shunnedChars useControls | Do we need this? | [184] shunned character number |
<simpleLink | | [156] simple link specification |
<syntax publicIdentifier | Do we need this? | [183] public concrete syntax |
<useLink linkSetName postLinkSetName | | [165] source element specification [USELINK] |
<useMap name elementNames | | [152] short reference use declaration |