ISO/IEC JTC 1/SC34 N0355

ISO/IEC JTC 1/SC34

Information Technology --

Document Description and Processing Languages

Title: Summary of Voting on JTC 1/SC 34 N 320 Rev – FCD Ballot to ISO/IEC 19757-2 – Document Schema Definition Languages (DSDL) – Part 2: Grammar-based validation – RELAX NG
Source: SC 34 Secretariat
Project: ISO/IEC 19757 – Document Schema Definition Languages (DSDL)
Project editors: J. Clark
Status:
Action: Project Editors are requested to review comments and take them into consideration when preparing revised text.
Date: 26 November 2002
Summary: Based on the result of voting, this document has been APPROVED.
Distribution: SC34 and Liaisons
Refer to:
Supersedes:

Reply to:

Dr. James David Mason
(ISO/IEC JTC1/SC34 Chairman)
Y-12 National Security Complex
Information Technology Services
Bldg. 9113 M.S. 8208
Oak Ridge, TN 37831-8208 U.S.A.
Telephone: +1 865 574-6973
Facsimile: +1 865 574-1896
E-mail:  [email protected]
http://www.y12.doe.gov/sgml/sc34/sc34oldhome.htm

Mrs. Sara Desautels, ISO/IEC JTC 1/SC 34 Secretariat
American National Standards Institute
25 West 43rd Street
New York, NY 10036
Tel: +1 212 642-4937
Fax: +1 212 840-2298
E-mail: [email protected]

 

SC 34 Voting Summary on JTC 1/SC 34 N 320 Rev

FCD to ISO/IEC 19757-2 – Document Schema Definition Languages (DSDL) –

Part 2: Grammar-based validation – RELAX NG

 

 

Column key:
A = APPROVAL OF THE DRAFT AS PRESENTED
B = APPROVAL OF THE DRAFT WITH COMMENTS AS GIVEN ON THE ATTACHED
C = DISAPPROVAL OF THE DRAFT FOR REASONS ON THE ATTACHED
D = Acceptance of these reasons and appropriate changes in the text will change our vote to approval
E = ABSTENTION (For Reasons Below)

P-Member             A    B    C    D    E
Brazil
Canada               X
China
Denmark
France
Ireland
Italy
Japan
Republic of Korea         X
Netherlands          X
Norway               X
United Kingdom                 X    X
United States             X
TOTAL                3    2    1    1


SC 34 National Body Comments on N 320 Rev

Japan

1) 10.2 Prohibited Paths

An <attribute> with an infinite name class that does not have <text> as its content should be prohibited.

2) 10.5 Restrictions on interleave

Relax the constraints on <interleave> without causing non-determinism.

3) 10.5 Restrictions on interleave

Disallow <interleave> occurring in <oneOrMore>.

 

United Kingdom

UK Vote: DISAPPROVAL OF THE DRAFT FOR REASONS APPENDED BELOW.

Acceptance of these reasons and appropriate changes in the text will change our vote to approval.

 

General

The UK is concerned about the potential confusion between the short names assigned to the existing ISO TR 22250-1 (RELAX Core) and the proposed Part 2 of ISO 19757 (RELAX NG). The UK notes that the formal standard does not fully conform to the technical report's recommendations, so RELAX NG cannot be said to be an extension of RELAX Core. If the TR is not withdrawn, the short form of one of the two documents should be changed to avoid the expectation that the standard is dependent on a TR. (It is noted that there is no connection between the short form of the name and the full title of the document in either case. Perhaps there should be, and RELAX NG should be changed to something like DSDL - GBV.)

 

Clause 3

1) Terms should be ordered alphabetically to simplify the finding of relevant definitions.

2) Add definitions for the following terms used without definition in the text:

- match
- weakly match
- content type
- in-scope grammar
- mixed sequence
- union

 

Clause 3.1

The definition of resource is ambiguous: the word “potentially” does not clarify whether a resource must or must not be addressable by a URI. If the resource is something that has identity but cannot be addressed using a URI (e.g. “Acts Ch5 V2”), how can it be used by 19757-2?

 

Clause 3.4

The term “another URI” is misleading given that another URI could still be a relative URI. Change to “a complete URI”.

 

Clause 3.10

Change “an NCName” to “a local name”. (The definition of local name introduces the need for it to be an NCName: the important point for the name definition is that the second part is the local name, not that it conforms to the NCName rule.)

 

Clause 3.14

It is not clear what form a “specification” takes. Either explain the format or remove the words “specification of”.

 

Clause 3.22

The term “equivalence relation” is not explained in the text or in the definitions. There is no clear explanation as to why a datatype should consist of a “set of strings”: for example, rules for validating dates are certainly not defined in terms of sets of strings, and neither are rules for defining ranges of numeric values, etc., which are typically used to constrain datatypes.

 

Clause 4.2.1

The word “thus” should not be used in ISO standards (see the definition of m). The definition of m mentions the possibility of consecutive strings, which are not permitted in XML or by the rules in 19757-2. The definition might be better stated as:

- m ranges over sequences of elements and strings; a sequence with a single member is considered the same as that member; there are sequences ranged over by m that cannot occur as the children of an element because the sequences ranged over by m may contain consecutive strings and may contain strings that are empty.

 

Clause 4.2.2

The definition of the subscript c in p :c ct is not explained, and neither is the term content-type (see comment on clause 3).

 

Clause 4.2.3

For the third listed entry, something needs to be said about the de-duplication implied by the union process.

 

Clause 6

The URL for identifying the specification in ISO 19757-2 should start with a reference to ISO, and not to an outside organization. I would suggest it takes the form:

http://www.iso.ch/jtc1/sc34/ISO19757/Part2/1.0

The text of the second paragraph will make it impossible to use any extensions or updates to IETF RFC 2396 to identify resources. Given that we know of at least one planned extension, to allow the full Unicode code set to be used for parts of URIs that will form an Internationalized Resource Identifier (IRI), it would seem wise to add the phrase “or any IETF approved standard that replaces or extends this specification” after the bracketed reference to RFC 2732.

The EBNF constructs defined for the full grammar in this clause are not formally defined within the standard. Their use is explained and constrained by rules defined in the RELAX NG Tutorial, which is not a normatively referenced part of this standard (or of the OASIS RELAX NG specification). A brief explanation of the purpose of each of the elements permitted in the full model, and a full explanation of any rules that constrain their application, should be included in this clause.

 

Clause 7.2

This rule would seem to require that any xml:lang attribute associated with an element be removed as part of the simplification process. Why should this attribute, or any other attribute that needs to be added for the purpose of managing schema objects within specific applications, have to be removed?

 

Clause 7.5

The phrase “a type attribute is added with value token” can be misread. Suggest changing it to “a type attribute whose content is the token ‘value’.” (Alternatively, use fonts that distinguish type and value as the code to be entered, rather than having them in the same face as the adjacent text.)
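
Writing the rule out as code shows how much depends on which word is read as the literal. The sketch below is illustrative only and assumes the reading used in the OASIS RELAX NG specification: a value element without a type attribute is given an attribute named type whose value is the string token, and its datatypeLibrary attribute is set to the empty string. The function name, the sample schema and the namespace URI (the OASIS one, assumed here) are not taken from the draft.

```python
import xml.etree.ElementTree as ET

# Assumption: the OASIS RELAX NG namespace URI.
RNG_NS = "http://relaxng.org/ns/structure/1.0"


def add_default_value_type(root):
    """Give every <value> element that lacks a type attribute type="token"."""
    for value in root.iter(f"{{{RNG_NS}}}value"):
        if "type" not in value.attrib:
            value.set("type", "token")        # attribute name: type, value: token
            value.set("datatypeLibrary", "")  # built-in datatype library
    return root


# Hypothetical schema fragment: one <value> without a type, one with.
schema = ET.fromstring(
    f'<element name="size" xmlns="{RNG_NS}">'
    "<choice><value>small</value><value type='string'>large</value></choice>"
    "</element>"
)
add_default_value_type(schema)
values = schema.findall(f".//{{{RNG_NS}}}value")
print([(v.text, v.get("type")) for v in values])
```

Under this assumed reading, only the first value element is changed; the second keeps the type it already declares.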

 

Clause 7.8

The 2nd and 3rd sentences of the third paragraph contradict each other, as do the 4th and 5th. The second part of the 2nd sentence states that “the grammar element shall have a start component”, while the second part of the 3rd sentence reads “all start elements are removed from the grammar element”. A similar conflict occurs when the 5th sentence states that “all define components with the same name are removed from the grammar element”, which requires that there be no definition left for the name. Adding the word “other” after the occurrence of “all” in the 3rd sentence is likely to correct the first error, presuming that the purpose is to select the first start element as the valid one and discard all subsequent ones. However, it is unclear that this is the case for the define element covered by the 5th sentence, as the rules in 7.18 for combining define elements with the same name would seem not to be applicable if the rules in 7.8 have been applied.

 

Clause 7.9 and 7.10

In 7.10 the value of the ns attribute of a name is inherited from the nearest ancestor with an ns attribute (as you would expect in XML) but for some reason 7.9 forces the ns attribute to be empty for the name of an attribute definition. XML requires that attributes that are not assigned a namespace are assigned the namespace of their parent element. How do the rules in 7.9 ensure this?
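
The two rules as described in this comment can be restated as a short sketch. It only illustrates the behaviour the comment describes and does not settle the question raised. It assumes that names carry no prefix, that an attribute definition without its own ns attribute gets ns="" (7.9), and that every other name inherits ns from the nearest ancestor that has one, defaulting to the empty string (7.10). The function name and the sample schema are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the OASIS RELAX NG namespace URI.
RNG = "{http://relaxng.org/ns/structure/1.0}"


def resolve_ns(elem, inherited=""):
    """Return (kind, local name, ns) for each name attribute at or below elem."""
    # 7.10 as described: ns comes from the nearest ancestor-or-self with ns.
    ns_here = elem.get("ns", inherited)
    results = []
    if elem.tag == RNG + "element" and "name" in elem.attrib:
        results.append(("element", elem.get("name"), ns_here))
    elif elem.tag == RNG + "attribute" and "name" in elem.attrib:
        # 7.9 as described: no inheritance; a missing ns means the empty string.
        results.append(("attribute", elem.get("name"), elem.get("ns", "")))
    for child in elem:
        results.extend(resolve_ns(child, ns_here))
    return results


# Hypothetical schema: ns is declared once, on the outer element definition.
schema = ET.fromstring(
    '<element name="doc" ns="http://example.com/ns" '
    'xmlns="http://relaxng.org/ns/structure/1.0">'
    '<attribute name="id"><text/></attribute>'
    '<element name="title"><text/></element>'
    "</element>"
)
resolved = resolve_ns(schema)
for kind, name, ns in resolved:
    print(kind, name, repr(ns))
```

In this sketch the nested element definition picks up the ancestor's ns value, while the attribute definition ends up with the empty string. This is the asymmetry the comment asks about.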

 

Clause 7.13

In the 2nd paragraph remove the word “Similarly” and the following comma, and start a new paragraph at this point. (The need to use a different font for codes mentioned above is highlighted by the problems with the phrase “An element element”!)

 

Clause 7.19

It is unclear what the term “the in-scope grammar of the in-scope grammar” in the 2nd paragraph means. (Adding a decent definition of the term in-scope grammar to section 3 may help here, but the real problem is the explanation of how in-scope grammars nest, which is not explained anywhere in this specification.)

 

Clause 8

The definition of the simplified grammar element given here would allow the following to be a legal definition:

<grammar><start><notAllowed/></start></grammar>

There does not seem to be a constraint to prevent this option within 10.2.6, or an explanation as to when such a production might be valid.

NB: The production also allows <grammar><start><empty/></start></grammar>, but 10.2.6 prohibits the use of this.

 

Clause 9.1

The purpose of the “subscript” mentioned in the second paragraph is not defined.

 

Clause 9.2 and 9.3

The meaning of all productions in these sections should be explained textually, as occurs for the first two examples, but not for the rest. Specifically, the following should be explained:

1) The relationship between (name choice 1) and (name choice 2) in 9.2

2) The relationship between (choice 1) and (choice 2) in 9.3.1

3) The role of the optional [cx2] attribute in the definition of (value) in 9.3.8, and why this is always a valid attribute.

4) How (token equal) can determine the equality of unsorted token lists such as “a b c d” and “b d c a”
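
On point 4, a small sketch may help show what a textual explanation would need to say. It assumes that (token equal) is defined as in the OASIS RELAX NG specification, where two strings are equal for the built-in token datatype when they are identical after whitespace normalization. The function names are illustrative and do not come from the draft.

```python
import re

# XML whitespace characters: space, tab, carriage return, line feed.
XML_WS = re.compile(r"[ \t\r\n]+")


def normalize_white_space(s):
    """Strip leading/trailing whitespace and collapse inner runs to one space."""
    return " ".join(part for part in XML_WS.split(s) if part)


def token_equal(s1, s2):
    """Equality for the built-in token datatype, under the stated assumption."""
    return normalize_white_space(s1) == normalize_white_space(s2)


same_tokens_same_order = token_equal("a b c d", "  a\tb \n c d ")
same_tokens_other_order = token_equal("a b c d", "b d c a")
print(same_tokens_same_order, same_tokens_other_order)
```

Under this assumption the comparison is order-sensitive: only whitespace differences are ignored, so “a b c d” and “b d c a” would not be equal.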

 

Clause 10.2.1

Referring to both x/p as a path and to p on its own as a path is confusing. x/p is a path but p is really only a descendant which may or may not be a direct child of the parent (x). It might be better to use p for parent, c for child and d for descendants and refer to the paths p/c and p//d rather than x/p and x//p.

foo and foobar are not paths but names, one of which is undefined. While there is a case for leaving foo in the list, foobar should definitely be removed from the penultimate paragraph.

 

Clause 10.3

It is not clear whether the restrictions shown apply to the full syntax as well as the simplified syntax (partly because there are no constraints specified in the standard for the full syntax). If these restrictions only apply to the simple syntax this should be clearly stated.

Textual explanations of the meanings of each of the constraints should be provided for those without a mathematical background.

 

Annex A

It should be made clear whether the schema is defined using the full or the simplified syntax.

Doesn’t the reference to any within the definition of any (the last definition) conflict with the rules in 7.20 that restrict looping references?

 

Annex B

The example is unrealistic, and fails to show key features of the language. At least one of the elements should have an attribute other than a namespace attribute for which specific values have been defined as valid. At least one of the elements should have nested elements, and at least one of the nested elements should be allowed to contain data that is managed using a datatype.

 

Bibliography

The URLs assigned to the two referenced documents need to be switched to refer to the correct documents.

 

 

United States

 

The U.S. is concerned about potential confusion between 19757-2, Part 2 (RELAX NG) and TR 22250-1 (RELAX Core). We request that the name of 19757-2, Part 2, be changed to something different (and more appropriate to an International Standard), such as, for example, "International Standard Schema Language" (ISSL).