| TITLE: | Extended Naming Rules External Syntax |
| SOURCE: | Rick Jelliffe |
| PROJECT: | |
| PROJECT EDITOR: | |
| STATUS: | WG8 approved statement |
| ACTION: | For adoption as national standards by national bodies as appropriate |
| Summary of major points: | This document gives the syntax recommended by WG8 for external syntax declarations, to support SGML names in non-Latin scripts. Though the syntax given is extra-standard, an SGML document can validly refer, by means of a public identifier, to a syntax declaration that uses a non-standard syntax. |
| DATE: | 24 May 1996 |
| DISTRIBUTION: | WG8 and Liaisons |
| REFER TO: | WG8 N1861 |
| REPLY TO: | Dr. James D. Mason (ISO/IEC JTC1/SC18/WG8 Convenor), Oak Ridge National Laboratory, Information Management Services, Bldg. 2506, M.S. 6302, P.O. Box 2008, Oak Ridge, TN 37831-6302, U.S.A. Telephone: +1 423 574-6973 Facsimile: +1 423 574-6983 Network: masonjd@ornl.gov http://www.ornl.gov/sgml/wg8/wg8home.htm ftp://ftp.ornl.€ |
This document describes a recommended extension of SGML known as the "Extended Naming Rules". The extension should be used only in SGML documents for which the normal naming rules are unsuitable (usually because of the size of the natural language character set). An SGML system need not support these Extended Naming Rules in order to be a conforming SGML system.
This recommendation is phrased in terms of revisions to be made to the body of the International Standard ISO 8879:1986. However, these revisions apply only in an entity that is referred to by means of the public identifier in the Syntax parameter of the SGML Declaration. This variant syntax for the SGML syntax declaration has the public identifier 'ISO/IEC JTC1/SC18/WG8 N1854//NOTATION Extended Naming Rules//EN'.
For many languages the distinction made in production [189] between uppercase and lowercase is not relevant. It is, therefore, necessary to modify clause 13.4.5 to allow for both an extended character set and for the use of character sets that do not have different cases. The changes required, in the order of their occurrence in 13.4.5, are:
[189] naming rules =
"NAMING", ps+,
"LCNMSTRT", (ps+, extended naming value)+,
"UCNMSTRT", (ps+, extended naming value)+,
("NAMESTRT", (ps+, extended naming value)+)?,
"LCNMCHAR", (ps+, extended naming value)+,
"UCNMCHAR", (ps+, extended naming value)+,
("NAMECHAR", (ps+, extended naming value)+)?,
"NAMECASE", ps+,
"GENERAL", ps+, ("NO"| "YES"), ps+,
"ENTITY", ps+, ("NO"| "YES")
[189.1] extended naming value = parameter literal | character number | character range
A character number may be used to specify a character that is defined in the syntax-reference character set but is not permitted in an SGML declaration.
[189.2] character range = character number, ps*, "-", ps*, character number
Specifying a character range is equivalent to specifying every character number from (and including) the character number that starts the range to (and including) the character number that ends the range.
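The equivalence stated for production [189.2] can be illustrated with a minimal sketch. The function name `expand_character_range` is hypothetical, plain whitespace stands in for `ps*` (a real `ps` separator may also contain comments, which are not handled here), and rejecting a range whose start exceeds its end is an assumption that this recommendation does not state.

```python
import re
from typing import List

# Hypothetical pattern for "character number, ps*, '-', ps*, character number".
# Whitespace approximates ps*; SGML comments inside ps are not handled.
_RANGE = re.compile(r"^\s*([0-9]+)\s*-\s*([0-9]+)\s*$")


def expand_character_range(text: str) -> List[int]:
    """Return every character number covered by an inclusive range."""
    match = _RANGE.match(text)
    if match is None:
        raise ValueError(f"not a character range: {text!r}")
    start, end = int(match.group(1)), int(match.group(2))
    # Assumption: a descending range is treated as an error.
    if start > end:
        raise ValueError(f"range start {start} exceeds range end {end}")
    # Both end points are included, as the text above requires.
    return list(range(start, end + 1))


if __name__ == "__main__":
    print(expand_character_range("192 - 195"))
```

Under these assumptions, writing `192-195` after a keyword such as "LCNMCHAR" would have the same effect as writing the four character numbers 192, 193, 194 and 195 individually.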
The following statement occurs in clause 0.2 of ISO 8879:1986:
The characters used for names can be augmented by any special national characters.
This is contradicted by the restriction, in production [189] of the current specification, that only a single parameter literal, whose length may not exceed 240 characters, can be used to specify name characters. This means that, for characters outside the ISO 646 character set which have to be specified using numeric character references, no more than 40 additional name characters can be specified. Clearly this is insufficient to support most languages, especially those with large character sets such as Japanese, Chinese and Korean.
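The figure of 40 can be checked with a short calculation. The sketch below assumes that each numeric character reference is written in the form `&#nnn;`, so that a three-digit character number occupies six characters of the literal. The names `LITERAL_LIMIT` and `max_referenced_characters` are hypothetical.

```python
# Maximum parameter literal length cited in the text above.
LITERAL_LIMIT = 240


def max_referenced_characters(character_number: int,
                              limit: int = LITERAL_LIMIT) -> int:
    """Count how many references of this width fit in one literal.

    Assumption: every reference has the same width as the one built
    for character_number, in the form "&#" + digits + ";".
    """
    reference = f"&#{character_number};"
    return limit // len(reference)


if __name__ == "__main__":
    # Three-digit character numbers: "&#255;" is six characters long.
    print(max_referenced_characters(255))
    # Five-digit character numbers: "&#12354;" is eight characters long.
    print(max_referenced_characters(12354))
```

With three-digit character numbers the result is 240 / 6 = 40, which matches the limit stated above. With the five-digit character numbers typical of large character sets, each reference is longer and the same literal holds only 30.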