ISO/IEC JTC 1/WG4 N1982

ISO/IEC JTC 1/WG4

Information Technology ---

Document Description Languages

TITLE: Discussions Around the Module Feature
SOURCE: Toru Takahashi
PROJECT: JTC1.18.15.1
PROJECT EDITOR: Charles Goldfarb
STATUS: Paper for discussion: This paper reflects the discussions of modules early in the Paris meeting of WG4.
ACTION: For information
DATE: 14 May 1998
DISTRIBUTION: WG4 and Liaisons
REFER TO: WG8 N1873
REPLY TO: Dr. James David Mason
(ISO/IEC JTC1/WG4 Convenor)
Lockheed Martin Energy Systems
Information Management Services
1060 Commerce Park, M.S. 6480
Oak Ridge, TN 37831-6480 U.S.A.
Telephone: +1 423 574-6973
Facsimile: +1 423 574-0004
Network: masonjd@ornl.gov
http://www.ornl.gov/sgml/wg4/
ftp://ftp.ornl.gov/pub/sgml/wg4/

Discussions Around the Module Feature

May 14, 1998
Toru Takahashi, WG4 Japan


This memorandum records the discussions of the SGML module feature and related issues at the WG4 Paris meeting, together with my current thoughts on them. In this document, "current proposal" refers to my document 'A Proposal to Introduce "Module" Structures into SGML' dated May 12, 1998.

Namespaces for Parameter Entities

The current proposal says that a parameter entity defined outside a module cannot be referred to from inside the module, and that a parameter entity defined inside cannot be referred to from outside. This isolation of namespaces for parameter entity names is suitable in many cases, but it has some disadvantages.

I now think that the mechanism already used for element type names (name qualification) should also be permitted for accessing parameter entities defined in a module from outside it. With this in place, a designer can build a module that offers its users a set of useful parameter entities as components for constructing their own element types.

For example, suppose that a module contains definitions of element types commonly used in the content of paragraphs.

<!ELEMENT em      - -  (#PCDATA) -- emphasized phrase -->
<!ELEMENT q       - -  (#PCDATA) -- quoted phrase -->
<!ELEMENT fnref   - -  (#PCDATA) -- reference to a footnote -->
<!ELEMENT bibref  - -  (#PCDATA) -- reference to a bibliography -->

Under the current proposal, you have to define a parameter entity yourself in order to use all of them as a single component in your DTD. (Suppose that the module "phrase" contains the definitions shown above.)

<!ENTITY % phrase  SYSTEM "some location" MODULE>
%phrase;

<!ENTITY % ph   "phrase:em|phrase:q|phrase:fnref|phrase:bibref">
<!ELEMENT para  - -  (#PCDATA|%ph;)*>
...

If name qualification is permitted for parameter entity names, the module "phrase" can provide the parameter entity "ph" itself.

DTD:
<!ENTITY % phrase  SYSTEM "some location" MODULE>
%phrase;

<!ELEMENT para  - -  (#PCDATA|%phrase:ph;)*>
...

module "phrase":
<!ENTITY % ph     "em|q|fnref|bibref">
<!ELEMENT em      - -  (#PCDATA) -- emphasized phrase -->
<!ELEMENT q       - -  (#PCDATA) -- quoted phrase -->
<!ELEMENT fnref   - -  (#PCDATA) -- reference to a footnote -->
<!ELEMENT bibref  - -  (#PCDATA) -- reference to a bibliography -->

This is not only a saving of typing effort. If the parameter entity "ph" is provided by the module itself, you do not have to modify your DTD even if the module adds one more element type or changes an element type name. This feature can improve the reusability of modules.
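
As an illustration of this point, suppose the module "phrase" is later revised to add one more element type. The name "abbr" below is hypothetical and is used only for this sketch; the syntax is that of the example above.

module "phrase" (revised):
<!ENTITY % ph     "em|q|fnref|bibref|abbr">
<!ELEMENT em      - -  (#PCDATA) -- emphasized phrase -->
<!ELEMENT q       - -  (#PCDATA) -- quoted phrase -->
<!ELEMENT fnref   - -  (#PCDATA) -- reference to a footnote -->
<!ELEMENT bibref  - -  (#PCDATA) -- reference to a bibliography -->
<!ELEMENT abbr    - -  (#PCDATA) -- abbreviation (hypothetical addition) -->

DTD (unchanged):
<!ENTITY % phrase  SYSTEM "some location" MODULE>
%phrase;

<!ELEMENT para  - -  (#PCDATA|%phrase:ph;)*>

Because "para" refers only to "phrase:ph", its content model picks up the new element type without any edit to the DTD. A DTD that had spelled out "phrase:em|phrase:q|phrase:fnref|phrase:bibref" itself would have to be edited by hand.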

Reference to the Outside

From a module's point of view, the outside (the parent module, the grandparent module, ...) defines the environment for the module. In some cases, it seems convenient if some means is provided to refer to the names defined outside of the module.

Ordinary programming languages provide the means to access names defined in the environment. For example, in the programming language C, if a variable name used in a function is not declared as a local variable, it is interpreted as the name of a "global" variable defined somewhere outside the function. This is a convenient way to pass information implicitly to a function, but it is also dangerous: it makes the function dependent on the surrounding environment, so the behavior of the function changes according to the context in which it is invoked.

Given the purpose of the module feature, I cannot find any benefit of outside reference that compensates for the risk it may introduce. If an entity refers to names defined outside it and depends on them, that entity should be declared as an ordinary external parameter entity, not as a module entity. If you have to pass some information to a module, it should be passed explicitly as parameter strings.

Syntax for Parameter Passing

The syntax for parameter passing used in the current proposal has some problems.

  1. All parameters must be specified every time the module is referred to. There is no defaulting mechanism that allows a parameter to be omitted when its desired value is equal to its default value.
  2. In a module, you have to identify a parameter by a number that gives its position in the list of parameter strings. This is inconvenient if the module requires many parameters.
  3. The current syntax requires a new delimiter role (npro: numbered parameter reference open) and two new productions ("numbered parameter reference" and "reference parameter group"). Also, to allow a numbered parameter reference anywhere a parameter entity reference is allowed, several productions are modified. These syntax changes may make implementation and user acceptance difficult.

To solve these problems, a new syntax for parameter passing has been proposed. It uses the production for data attribute specification to pass parameters to a module. The two examples below compare them: the first uses the syntax of the current proposal, and the second uses the new syntax.

DTD:
<!ENTITY % module1  SYSTEM "some location" MODULE>
%module1("a,b", "c");

module1:
<!ELEMENT foo  - O  ($1;)>
<!ELEMENT bar  - O  (#PCDATA|$2;)*>

DTD:
<!ENTITY % module1  SYSTEM "some location" MODULE
  [ foomodel="a,b" barcompo=c ]>
%module1;

module1:
<!PARAMLIST  -- defines types and defaults for parameters --
   foomodel "#PCDATA" #REQUIRED
   barcompo NAME      x
>
<!ELEMENT foo  - O  (%foomodel;)>
<!ELEMENT bar  - O  (#PCDATA|%barcompo;)*>

Referring to a parameter by name and allowing a default value for each parameter are desirable, and I have no objection to either. But I feel that giving each parameter a type is over-specification. Passed parameters are used in almost the same way as ordinary parameter entities defined in the module. (The only difference is the namespace used to interpret the names in the referenced string.) I believe that passed parameters need not be typed if ordinary parameter entities need not be typed. Here is my revised proposal for parameter passing.

DTD:
<!ENTITY % module1  SYSTEM "some location" MODULE
  [ foomodel="a,b" barcompo=c ]>
%module1;

module1:
<!ENTITY % foomodel "#PCDATA">
<!ENTITY % barcompo "x">
<!ELEMENT foo  - O  (%foomodel;)>
<!ELEMENT bar  - O  (#PCDATA|%barcompo;)*>

A parameter string specified in a module entity declaration overrides the entity text of the corresponding parameter entity declared in the module. If a parameter string is omitted, the entity text defined in the module remains in effect; it is the default value of the parameter. If no default value is wanted, you simply refer to the parameter entity in the module without defining its entity text. In that case, omitting the parameter string causes a markup error.
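
To illustrate the defaulting rule, here is a sketch in which the DTD passes only "foomodel" and omits "barcompo". It reuses module1 exactly as declared above. The "effective declarations" listing is not proposed syntax; it only shows the entity text that results under the rule just described, and it leaves aside the question of which namespace the names "a", "b" and "x" are interpreted in.

DTD:
<!ENTITY % module1  SYSTEM "some location" MODULE
  [ foomodel="a,b" ]>
%module1;

effective declarations in module1:
<!ELEMENT foo  - O  (a,b)>
<!ELEMENT bar  - O  (#PCDATA|x)*>

The passed string "a,b" replaces the entity text "#PCDATA" of "foomodel", while "barcompo" keeps its declared text "x" because no parameter string was given for it.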

In this approach, passed parameters and ordinary parameter entities cannot be distinguished syntactically. Here is an alternative that makes the distinction clear.

DTD:
<!ENTITY % module1  SYSTEM "some location" MODULE
  [ foomodel="a,b" barcompo=c ]>
%module1;

module1:
<!ENTITY % foomodel "#PCDATA" PARAMETER>
<!ENTITY % barcompo "x"       PARAMETER>
<!ELEMENT foo  - O  (%foomodel;)>
<!ELEMENT bar  - O  (#PCDATA|%barcompo;)*>

In this approach, only a parameter entity that is declared as a parameter can have its entity text replaced by the value of a passed parameter.
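
As a sketch of what the PARAMETER keyword protects, consider a variant of module1 that also declares an ordinary parameter entity. The entity name "inline" and its text "x|y" are hypothetical and appear only in this illustration.

DTD:
<!ENTITY % module1  SYSTEM "some location" MODULE
  [ foomodel="a,b" ]>
%module1;

module1:
<!ENTITY % foomodel "#PCDATA" PARAMETER>
<!ENTITY % inline   "x|y">
<!ELEMENT foo  - O  (%foomodel;)>
<!ELEMENT bar  - O  (#PCDATA|%inline;)*>

Here "foomodel" takes the passed value "a,b", whereas "inline" is internal to the module: a parameter string named "inline" in the module entity declaration would not replace its entity text. Whether such a string should be reported as a markup error is not settled in this memorandum.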

Handling of Attribute Names

*TBD*

Handling of Short Reference Map Names

*TBD*

Name Qualification and Minimization Feature

*TBD*

Parameter Passing for Entities other than Module Entities

*TBD*

Harmonization with Namespaces in XML

*TBD*