Date: 1996-03-22
To: ISO/IEC JTC1/WG8
From: US National Body
Subject: ISO 8879 revision cycle
Status: US National Body contribution
The US National Body requests that WG8 consider the following
when revising ISO 8879:
1. Allow a form of SGML declaration in which parameters for
human consumption (particularly the document character set)
are separated from parameters that are machine readable.
Also allow parameters that require a private agreement
among users of a document (again, the document character set)
to be omitted from the SGML declaration.
2. Clause 6.2.3 needs to be clarified. The user community has
widely interpreted it to mean that the SGML declaration is
optional, although particular software may require it. Is there
any other possible meaning?
3. The word "number" should not be used when "numeral" is clearly
intended. For example, 4.36 describes "a number that represents
[a] base-10 integer". There are base-10 numerals, but numbers
are inherently baseless. The first note in 4.43 shows the
same confusion.
4. Clause 4 defines several terms related to character sets. Are
they all needed? Are they all used in other clauses?
5. The note in 4.147 states that graphic characters normally have
a visual representation when a document is presented. When
do graphic characters not have a visual representation? Are
white space characters considered graphic characters? Is the
tab character really a control character?
6. Definitions 4.332 and 4.141 use a mix of very specific and
very general terminology.
7. 4.298 should clarify that significant characters must be
defined in the document character set.
8. In 4.98 what does "initially (at least)" mean? Does it mean
that the first data character in the document must be in the
document character set?
The note should be reworded as follows:
    When ..., the user at the receiving end is responsible
    for translating to the receiving system's character set
    if it differs from the sending character set.
9. Is the term "system character set" (4.311) used?
10. Inert function characters are defined only for use in named
character references. Can inert function characters be dedicated
data characters?
11. The discussions of ISO 2022 and associated definitions should be
greatly reduced in this standard.
12. Consider generalizing markup suppression characters to
strings.
13. In the second note in 13.1, the conclusions are
counter-intuitive. It is unexpected that external communication
is needed if a standard or registered character set is used,
but not otherwise. Also, in b), "will provide" should be
"should provide".
14. In note 1 in clause 13, remove "(in printed form!)".
15. The reserved name "UNUSED" identifies a character as a non-SGML
character. There is no need for users to learn the two
terms "UNUSED" and "NONSGML". One should be replaced with the
other.
16. Clarify the first note (second paragraph) in 13.1.2. What is
the relationship between control characters and unused
characters?
17. In 13.4.2, motivate the discussion of shunned character codes.
Why does this construct exist? Make it very clear that it is
the codes and not the characters that are shunned.