| TITLE: | Japanese Contribution on the Urgent TC for the Web |
| SOURCE: | SC18/WG8 Japan |
| PROJECT: | JTC1.18.15.1 |
| PROJECT EDITOR: | Charles F. Goldfarb |
| STATUS: | National Body Contribution for WG8 meeting in Barcelona, May 1997 |
| ACTION: | For discussion |
| DATE: | 5 May 1997 |
| DISTRIBUTION: | WG8 and Liaisons |
| REFER TO: | the draft text of "Urgent TC for the Web" (dated 15 March 1997) |
| REPLY TO: | Dr. James David Mason (ISO/IEC JTC1/SC18/WG8 Convenor), Lockheed Martin Energy Systems, Information Management Services, 1060 Commerce Park, M.S. 6480, Oak Ridge, TN 37831-6480, U.S.A. Telephone: +1 423 574-6973 Facsimile: +1 423 574-0004 Network: masonjd@ornl.gov http://www.ornl.gov/sgml/wg8/wg8home.htm ftp://ftp.ornl.gov/pub/sgml/wg8/ |
WG8/Japan met twice to discuss the "Urgent TC for the Web". As a result of these meetings, we can say that WG8/Japan basically supports the TC. However, we believe that the number of items and the extent of the changes proposed in the draft TC text are too great for an urgent technical corrigendum.
The technical changes made by the TC should be minimized so that they fulfill only the requirements explicitly stated in the XML specification. In other words, everything that is not necessary to make valid XML documents conforming SGML documents should be deferred to the next revision of SGML, after sufficient technical study and discussion.
> 2. All valid and well-formed XML documents will be conforming SGML
> documents.
XML does not require that well-formed (but not valid) XML documents be conforming SGML documents (see 1.2 of the XML spec). This objective should be deleted. (It would be better to move it to the next revision of SGML.)
> 3. XML will not be a "special case". SGML tools that support the TC
> will be able to process XML documents without special knowledge of
> XML. For this reason, the special XML PIs have been replaced with
> equivalent SGML markup.
We support the principle stated in the second sentence.
However, the objective stated in the first sentence seems very difficult to accomplish. For example, if an XML document contains a "LOCATOR" element with an "HREF" attribute whose value contains some "XPointer"s, how can these "XPointer"s be processed correctly without special knowledge of XML?
If the "processing" referred to in this item means only the correct "parsing" of XML documents, we support it.
> 4. A validating SGML parser that supports the TC will be able to
> validate documents for conformance to the SGML language aspects
> of the XML spec.
If this item requires checking for all violations of the XML-specific restrictions on SGML syntax, we believe that such validation should be done by XML processors, not by SGML parsers.
> 2R. It must be possible to infer the document type name from the
> document instance, so that subelements of a document type can
> be used as document instances without maintaining and downloading
> multiple variations of the same DTD.
>
> 2S. The keyword "#IMPLIED" is allowed as an alternative to the
> document type name in a DOCTYPE declaration. When used, the document
> type name is inferred from the start-tag of the document element. For
> example:
>
>     <!DOCTYPE #IMPLIED SYSTEM "some.dtd">
>     <docelem>
This extension looks useful, and we would support it. However, this feature is not necessary for XML (XML allows only Names in doctypedecl). This extension should be realized in the next revision of SGML.
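The inference described in 2S can be sketched as follows. This is only an illustration, not part of the draft TC: the function name `infer_doctype_name` and the regular expression are our assumptions, and the sketch does not handle a "<" followed by a letter inside a comment or processing instruction that precedes the document element.

```python
import re
from typing import Optional

# A start-tag opens with "<" followed directly by a name start character.
# Declarations ("<!...") and processing instructions ("<?...") do not match,
# so they are skipped without being parsed.
START_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9.\-]*)")

def infer_doctype_name(document: str) -> Optional[str]:
    """Return the generic identifier of the first start-tag, or None."""
    match = START_TAG.search(document)
    return match.group(1) if match else None

# The example from 2S of the draft TC.
example = '<!DOCTYPE #IMPLIED SYSTEM "some.dtd">\n<docelem>'
inferred = infer_doctype_name(example)
print(inferred)
```

Under these assumptions, the name "docelem" would take the place of "#IMPLIED" as the document type name.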
> 3R. It must be possible for well-formed documents:
>
> a) to have no DOCTYPE declaration;
> b) to be validated for well-formedness without one; and
> c) to be parsed with one, without one, or without respect to one if
> it exists.
>
> 3S. The SGML declaration provides a new minimization feature,
> well-formed documents (WELLFORM), with the values YES or NO (the
> default), plus several related facilities.
Well-formedness should not be treated as a "minimization" feature, because it will also affect many aspects of SGML other than minimization.
> 3S2. When WELLFORM YES is specified, OMITTAG NO and SHORTTAG ATTDFLT
> must also be specified.
We oppose this. No constraints should be introduced among minimization features.
Furthermore, it seems inconsistent with XML itself. In XML, defaulting of attribute values requires the existence and processing of a DTD (see 2.10 of the XML spec). Isn't this inconsistent with what is said in "3S3"?
> 3S3. When WELLFORM YES is specified, the document is well-formed and
> the DOCTYPE declaration is optional. If present, its document type
> declaration subset can be ignored, except when validating whether the
> document is a type-valid SGML document.
>
> An omitted or ignored DOCTYPE declaration causes the parser to behave
> as though it had parsed the document type declaration:
>
>     <!DOCTYPE #IMPLIED>
As mentioned above, well-formedness itself does not make the DOCTYPE declaration optional.
> 3S4. When WELLFORM YES is specified, ...
>
> When OMITXTRN or OMITALL is specified, but the document is parsed
> without omitting the subset(s), white space that occurs in element
> content will be so identified in the grove. It is a reportable markup
> error if different groves result from parsing a document with and
> without reference to an omissible subset, unless the only difference
> is in the identification of white space characters as described above.
We have a serious question about this item: is it possible? For example, if there are ATTLIST declarations in the omissible subset, the resulting grove must differ in the interpretation of attributes, depending on whether the subset has been ignored or not.
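The concern can be illustrated with an XML analogue. The sketch below is not an SGML grove builder: it uses Python's standard `xml.etree.ElementTree` parser as a stand-in, and the document, the element name "doc", and the attribute "status" are hypothetical. It parses the same instance once with an internal subset that declares a default attribute value and once with the subset omitted, and then compares the attributes seen on the document element.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal subset declares a default for "status".
WITH_SUBSET = '<!DOCTYPE doc [<!ATTLIST doc status CDATA "draft">]><doc/>'
# The same instance with the declaration subset omitted.
WITHOUT_SUBSET = '<doc/>'

def root_attributes(document: str) -> dict:
    """Parse the document and return the attributes of its root element."""
    return dict(ET.fromstring(document).attrib)

with_subset = root_attributes(WITH_SUBSET)
without_subset = root_attributes(WITHOUT_SUBSET)

# The two parses disagree whenever the subset supplied a default value.
print(with_subset, without_subset, with_subset == without_subset)
```

In this sketch the defaulted attribute appears only when the subset is processed, so the two results differ in more than the identification of white space.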
> 3S5. When WELLFORM YES is specified, a validating parser shall be
> capable of validating that the document is a conforming SGML document,
> and that it is type-valid, well-formed, or both.
We oppose this. This requirement makes a validating SGML parser more difficult to implement.
> 4R. Correct the definition of comment declaration (maintaining
> backward compatibility) so that hyphens will not cause an error.
>
> 4S. Production [91] is replaced with the following productions:
>
> [91] comment declaration =
>     empty comment declaration |
>     multiple comment declaration |
>     single comment declaration
>
> [91.1] empty comment declaration = mdo,mdc
>
> [91.2] single comment declaration =
>     mdo,scomo,SGML character*,scomc,mdc
>
> [91.3] multiple comment declaration =
>     [same as existing 91]
>
> Two new delimiters are added: "single comment open" (SCOMO) and
> "single comment close" (SCOMC); for example, "--*" and "*--". SCOMO
> cannot be the same as COM. A concrete syntax need not define SCOMO and
> SCOMC, but both must be specified if either is. SCOMO begins SCOM
> mode, in which the only delimiter recognized is SCOMC when followed by
> MDC.
A new kind of comment declaration is not required to support XML, so this item should be dealt with in the next revision of SGML.
Furthermore, the constructs proposed in this draft TC make the syntax of SGML more complex than that of ISO 8879:1986. We believe that changes to SGML syntax should aim to simplify it. For the syntax of the comment declaration, we think that the proposal shown below would be better.
[91] comment declaration =
CDO,
SGML character*,
CDC
This change to comment declaration will not affect any documents conforming to ISO 8879:1986.
> 5R. A simplified version of the markup is needed for online use.
>
> 5S. The SGML declaration provides a new feature, "markup level"
> (MARKUP), with the values SIMPLE or FULL, the default.
> Specifying MARKUP SIMPLE changes the following aspects of the
> markup:
We oppose introducing such an "XML-specific" markup mode, because:
> 7S1) ...
> An entity can be physically stored in multiple storage
> objects, and an SGML header must always begin at the beginning of a
> storage object. When an SGML header occurs at the start of the first
> storage object of an SGML document entity, the referenced SGML
> declaration body is bracketed with "<!SGML " and ">" and presented to
> the parser as an SGML declaration
We oppose this. This requirement makes existing SGML documents non-conforming if they are stored in multiple storage objects.
> 7S2) An optional, profile-defined, system information (SYSINFO)
> parameter is added as an optional 1st parameter of an SGML
> declaration, in both the full and header forms. It consists of the
> keyword SYSINFO followed by a single SPACE followed by a minimum
> literal. It is separated from "<!SGML" by a single SPACE. For example:
>
>     <!SGML SYSINFO "Shift-JIS" ...>
>
> The interpretation and use of the SYSINFO literal must be explained in
> comments in the SGML declaration body.
Is a comment (to explain the SYSINFO parameter) allowed in an SGML header?
> 7S3) The SGML declaration provides a new feature, "multiple headers"
> (MULTIHDR), with the values YES or NO, which is the default. If YES is
> specified, an SGML header is permitted at the start of all storage
> objects in which SGML entities are stored. For example:
>
>     <!SGML SYSINFO "Shift-JIS" "IDN//XML.ORG//SD XML 3.0//EN">
>
> The header is stripped and processed by the entity manager, as
> described earlier.
This seems inconsistent with what is said in "7S1)". According to the text of "7S1)", an SGML header must be present at the beginning of every storage object. Which solution is correct?