Date: 1996-03-22

To: ISO/IEC JTC1/WG8

From: US National Body

Subject: ISO 8879 revision cycle

Status: US National Body contribution

The US National Body requests that WG8 consider the following

when revising ISO 8879:

1. Allow a form of SGML declaration in which parameters for

human consumption (particularly the document character set)

are separated from parameters that are machine readable.

Also allow for parameters that require a private agreement

among users of a document (again the document character set)

to be omitted from the SGML declaration.

2. Clause 6.2.3 needs to be clarified. The user community has

widely interpreted it to mean that the SGML declaration is

optional, although particular software may require it. Is there

any other possible meaning?

3. The word "number" should not be used when "numeral" is clearly

intended. For example, 4.36 describes "a number that represents

[a] base-10 integer". There are base-10 numerals, but numbers

are inherently baseless. The first note in 4.43 shows the

same confusion.

4. Clause 4 defines several terms related to character sets. Are

they all needed? Are they all used in other clauses?

5. The note in 4.147 states that graphic characters normally have

a visual representation when a document is presented. When

do graphic characters not have a visual representation? Are

white space characters considered graphic characters? Is the

tab character really a control character?

6. Definitions 4.332 and 4.141 use a mix of very specific and

very general terminology.

7. 4.298 should clarify that significant characters must be

defined in the document character set.

8. In 4.98, what does "initially (at least)" mean? Does it mean

that the first data character in the document must be in the

document character set?

The note should be reworded:

When ..., the user at the receiving end is responsible

for translating to the receiving system's character set

if it differs from the sending character set.

9. Is the term "system character set" (4.311) used anywhere in

the standard?

10. Inert function characters are defined only for use in named

character references. Can inert function characters be dedicated

data characters?

11. The discussions of ISO 2022 and associated definitions should be

greatly reduced in this standard.

12. Consider generalizing markup suppression characters to strings.

13. In the second note in 13.1, the conclusions are

counter-intuitive. It is unexpected that external communication

is needed if a standard or registered character set is used,

but not otherwise. Also, in b), "will provide" should be

"should provide".

14. In note 1 in clause 13, remove "(in printed form!)".

15. The reserved name "UNUSED" identifies a character as a non-SGML

character. There is no need for users to learn the two

terms "UNUSED" and "NONSGML". One should be replaced with the

other.

16. Clarify the first note (second paragraph) in 13.1.2. What is

the relationship between control characters and unused

characters?

17. In 13.4.2, motivate the discussion of shunned character codes.

Why does this construct exist? Make it very clear that it is

the codes and not the characters that are shunned.