Copyright © 1992, 1997 International Organization for Standardization. All rights reserved. This electronic document is for use during development and review of International Standards. Official printed copies of International Standards can be purchased from the ISO and the national standards organization of your country.
A.4 Property Set Definition Requirements (PSDR)
A "notation processor" operates on source data that is represented in the notation known to it, recognizing the information described by the source data.
NOTE 441 For example, SGML parsers and HyTime engines are notation processors for their respective notations, ISO 8879 (SGML) and ISO/IEC 10744 (HyTime).
A "property set" defines the types of information that can be returned by a notation processor for use by DSSSL and HyTime applications. The property set for a notation that defines every type of information that a processor for that notation can recognize is known as a "complete property set".
NOTE 442 The SGML property set and the HyTime property set defined in this International Standard are complete property sets. Whether complete property sets can be defined for other notations depends on how closely the notations resemble SGML and HyTime.
A "grove construction process" uses a notation processor to recognize instances of "classes" and their "properties" as defined in a property set, and represents the recognized instances as "nodes" in a graph structure known as a "grove".
NOTE 443 Grove is an acronym for "Graph Representation Of property ValuEs".
NOTE 444 For example, an SGML parser is a process that operates on source data that is represented in the SGML notation defined in ISO 8879. An SGML grove construction process uses an SGML parser to recognize instances of the classes and properties defined in the SGML property set, and then represents the recognized instances as nodes in an SGML grove.
There are two types of grove construction process: "primary" and "auxiliary". A primary grove construction process operates on data represented in a particular notation and produces a grove known as a "primary grove". An auxiliary grove construction process operates on nodes in other groves and produces a grove known as an "auxiliary grove". Property sets used by primary grove construction processes are called "primary property sets"; property sets used by auxiliary grove construction processes are called "auxiliary property sets".
A grove construction process need not recognize and include in groves all instances of all classes and properties defined in the result grove's property set. A "grove plan" is used to specify which of the classes and properties within a property set are to be included in (or excluded from) a grove.
A grove plan that includes all of the classes and properties in a property set is known as a "complete grove plan". A grove constructed in accordance with a complete grove plan is known as a "complete grove".
NOTE 445 The completeness of a grove is in relation to its property set; the completeness of a property set is in relation to its notation.
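As an informal illustration of the relationship between a property set and a grove plan, the following sketch checks whether a plan is complete. The dictionary representation and the class and property names are assumptions made for illustration only; this clause does not define such a representation.

```python
# Hypothetical representation: a property set maps each class name to the
# names of the properties it provides. A grove plan has the same shape but
# lists only the classes and properties to be included.
def is_complete_plan(property_set, plan):
    """Return True if the plan includes every class and property of the set."""
    return all(
        cls in plan and set(props) <= set(plan[cls])
        for cls, props in property_set.items()
    )


# Illustrative names only; they are not taken from any real property set.
property_set = {
    "element": ["gi", "attributes", "content"],
    "attribute": ["name", "value"],
}
full_plan = {
    "element": ["gi", "attributes", "content"],
    "attribute": ["name", "value"],
}
partial_plan = {"element": ["gi", "content"]}
```

Under this sketch, a grove built from `full_plan` would be a complete grove, while one built from `partial_plan` would not.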
A property set is defined in a "property set definition document", which is an SGML document that conforms to the "property set definition architecture" defined in this clause. The property set definition architecture meta-DTD is identified by the public identifier "ISO/IEC 10744:1997//NOTATION AFDR ARCBASE Property Set Definition Architecture//EN".
Note that, although they are technically architectural forms, the element types and attribute definitions that appear in this clause are normally used directly as the document type of property set definition documents. They are therefore referred to as element types in this document.
A property set consists of "property set components": classes, properties, enumerated values (of properties), and normalization rules (for comparing string property values). Some property set components are deemed to occur in all property sets; they are known as "intrinsic components". The property set definition architecture defines an element type for each component type; elements conforming to those types define components of the corresponding types.
The components of a property set can be organized into one or more separate "modules". Modules allow classes and properties to be grouped to reflect important distinctions among them that might be a basis for including or excluding them from grove plans.
NOTE 446 In the SGML property set, these distinctions include:
the applicable specification document (that is, SGML, DSSSL, or HyTime);
whether the property exists only when an optional facility is used (e.g. implicit link); and
whether the property relates to the source data (e.g. general delimiters), to the abstraction described by the source data (e.g. elements and attributes), or to both (e.g. data characters).
A class identifies a type of information, giving it a name and "providing" an ordered set of properties. Each property is also given a name, as well as a "property number".
Class names are unique within the property set that defines them; property names are unique within the class that provides them.
Property numbers are assigned to the properties provided by a class according to the order of their definition within the property set definition document. The first property provided by a class is assigned the number 1; the number of each subsequently defined property is determined by adding one to the number of the nearest previously defined property provided by the same class.
NOTE 447 For the purpose of determining the property number, it is immaterial in which module a property definition occurs, or whether it occurs in the content of a class definition or independently.
NOTE 448 Because intrinsic property definitions are considered present at the beginning of each property set definition document, the first few properties of every class are intrinsic properties; this has the effect that the property numbers assigned to the intrinsic properties are the same for all classes.
Property numbers remain constant for all groves, regardless of which properties are included in any given grove plan.
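The numbering rule can be sketched as follows. This is a minimal illustration, not a normative algorithm: the input is assumed to be a flat list of (class name, property name) pairs in document order, and the intrinsic property names shown are hypothetical placeholders.

```python
def assign_property_numbers(definitions, intrinsic=()):
    """Return {class_name: {property_name: number}}.

    `definitions` is a sequence of (class_name, property_name) pairs in the
    order in which they are defined. Modules and nesting are ignored because
    they do not affect numbering. `intrinsic` names the properties treated as
    defined at the start of the document for every class.
    """
    numbers = {}
    for class_name, prop_name in definitions:
        per_class = numbers.get(class_name)
        if per_class is None:
            # Intrinsic properties come first, so they receive the same
            # numbers in every class.
            per_class = {name: i for i, name in enumerate(intrinsic, start=1)}
            numbers[class_name] = per_class
        if prop_name in per_class:
            raise ValueError(f"duplicate property {prop_name!r} in {class_name!r}")
        # One more than the nearest previously defined property of this class.
        per_class[prop_name] = len(per_class) + 1
    return numbers


# Illustrative definitions; the names are assumptions, not normative.
definitions = [
    ("element", "gi"),
    ("attribute", "name"),
    ("element", "attributes"),
    ("element", "content"),
]
numbers = assign_property_numbers(
    definitions, intrinsic=("class-name", "grove-root")
)
```

Because the numbers depend only on definition order, they do not change when a grove plan later excludes some of the properties.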
A property has a "declared datatype" which specifies the type of value nodes are permitted to exhibit for the property. A declared datatype is one of the following:
node: A node.
nodelist: An ordered list of zero or more nodes.
nmndlist: An ordered list of zero or more nodes, in which each node exhibits a "name property" whose value uniquely identifies the node within the list. A named node list is also called a "name space". The datatype of the name property must be string or node, and all of the name properties within a single named node list must have the same datatype.
Note that in effect, there are two varieties of named node list, "string-named" and "node-named". In the first, each node is identified by a string; in the second, each node is identified by another node. Collecting all the values of the "name" property of the nodes in a string-named node list yields a list of names that are unique within the list. Similarly, collecting all the values of the name properties of the nodes in a node-named node list yields a list of nodes that are unique within the list.
enum: A value which represents one of an enumerated set of values; an enumerator.
char: An abstract character. The concrete representations of "char" values in groves must allow any "char" value to be distinguished from all other "char" values, and for the semantics of each "char" to be determined. The concrete representation may or may not rely on the DCS, UNICODE, and/or other character sets and repertoires.
string: An ordered list of zero or more abstract characters.
strlist: An ordered list of zero or more strings.
integer: An integer.
intlist: An ordered list of zero or more integers.
boolean: A boolean value.
compname: A property set component name.
cnmlist: An ordered list of zero or more component names.
The node, enum, character, string, integer, and compname datatypes are known as "primitive datatypes"; the nodelist, strlist, intlist, and cnmlist datatypes are known as "list datatypes".
NOTE 449 The string datatype, though described as a list of characters, is still considered a primitive datatype.
A property whose declared datatype is node, nodelist, or nmndlist is said to be "nodal". Properties with other declared datatypes are "non-nodal".
A property whose declared datatype is string may have an associated "normalization rule", which should be consulted when making comparisons involving values exhibited for the property, including looking up a node by its name within a string-named node list.
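A string-named node list with a normalization rule can be pictured as in the sketch below. The `NamedNodeList` class, the use of dictionaries as nodes, and the choice of upper-casing as the normalization rule are all assumptions made for illustration; a property set defines its own normalization rules.

```python
class NamedNodeList:
    """Hypothetical string-named node list with lookup by normalized name."""

    def __init__(self, nodes, name_property, normalize=None):
        # With no rule supplied, names are compared exactly as given.
        self._normalize = normalize or (lambda s: s)
        self._nodes = list(nodes)  # list order is preserved
        self._index = {}
        for node in self._nodes:
            key = self._normalize(node[name_property])
            if key in self._index:
                # Names must identify nodes uniquely within the list.
                raise ValueError(f"duplicate name {key!r}")
            self._index[key] = node

    def lookup(self, name):
        """Return the node with the given name, or None if there is none."""
        return self._index.get(self._normalize(name))


# Nodes are modelled as plain dictionaries for this sketch.
para = {"name": "PARA"}
note = {"name": "NOTE"}
elements = NamedNodeList([para, note], "name", normalize=str.upper)
```

Here `elements.lookup("para")` finds the node named "PARA" because both names are normalized before they are compared.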
A property definition can specify conditions under which the property will not apply to an instance of the class that provides it. A node for which a property is inapplicable exhibits a "null value" for the property. (Note that this is not the same as exhibiting an empty value for a list property.)
A node is an ordered set of "property assignments", each of which associates a property name with a "value". The property names are the names of the properties provided by the node's class, the ordering of which determines the ordering of the property assignments.
A node is said to "exhibit" a value "for" a property if that node has a property assignment associating a value with the name of that property. Informally, the "properties of a node" are the properties for which the node exhibits a value. A node is said to be the "owner" of such properties.
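The following sketch models a node as an ordered set of property assignments and shows how a null value differs from an empty list value. The `NULL` sentinel, the `Node` class, and the property names are hypothetical and serve only to illustrate the definitions above.

```python
NULL = object()  # hypothetical sentinel standing in for the "null value"


class Node:
    """A node as an ordered set of property assignments (illustrative)."""

    def __init__(self, class_name, property_order, values):
        unknown = set(values) - set(property_order)
        if unknown:
            raise KeyError(f"not provided by class {class_name!r}: {sorted(unknown)}")
        self.class_name = class_name
        # Assignments follow the order of the class's properties, not the
        # order in which the values were supplied.
        self.assignments = {
            name: values[name] for name in property_order if name in values
        }

    def exhibits(self, name):
        """True if the node has a property assignment for `name`."""
        return name in self.assignments

    def value(self, name):
        return self.assignments[name]


# Illustrative class with three properties; "id" is inapplicable here, so
# the node exhibits a null value for it, while "content" is an empty list.
node = Node(
    "element",
    property_order=("gi", "id", "content"),
    values={"content": [], "gi": "P", "id": NULL},
)
```

In this model a node still exhibits a value for an inapplicable property; the value is simply null, which is distinct from an empty list.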
Nodes exist in a directed graph in which each node may be connected to other nodes by labeled arcs. For each node occurring in the value exhibited by a node for a nodal property, there is one such arc from the exhibiting node to the value node, the label for which is the name of the nodal property.
NOTE 450 Informally, a node "points at" another node by having the node occur in the value of one of its properties.
The nodes occurring in the value of a nodal property are related to the node that exhibits the property by one of three possible "node relationship types": "subnode", "irefnode" (internal-reference node), or "urefnode" (unrestricted-reference node). Nodal properties can be qualified by their associated relationship type as "subnode properties", "irefnode properties", or "urefnode properties"; likewise, the arcs in the graph can be qualified by the relationship type assigned to the property to which the arc corresponds.
A node occurring in the value of a subnode property is called a "subordinate node" (or subnode) with respect to the node exhibiting the value, and the node exhibiting the value is called the "origin" of the subnode. Each node may occur in the value of only one subnode property. The "siblings" of a node occurring in the value of a subnode property are the other nodes, if any, occurring in the value of the same subnode property. The subnode arcs of the graph connect nodes together into "subnode trees".
Irefnode, or "internal referenced node", arcs further connect nodes within one subnode tree. Such arcs may cause the graph to contain cycles and convergences.
A grove is a set of nodes connected together as a subnode tree and further connected by the irefnode arcs between the nodes. Each grove has exactly one node that is the root of the subnode tree; this node is the "grove root", and it is the only node within the grove that has no origin.
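The subnode relationships described above (origin, siblings, and grove root) can be sketched as follows. The `Node` class and its attribute names are assumptions made for illustration, and irefnode arcs are omitted.

```python
class Node:
    """Illustrative node that tracks only its subnode relationships."""

    def __init__(self, label):
        self.label = label
        self.origin = None        # node exhibiting the subnode property
        self.origin_prop = None   # name of that subnode property
        self.subnode_props = {}   # property name -> list of subnodes

    def add_subnodes(self, prop, nodes):
        for node in nodes:
            if node.origin is not None:
                # A node may occur in the value of only one subnode property.
                raise ValueError(f"{node.label!r} already has an origin")
            node.origin = self
            node.origin_prop = prop
        self.subnode_props.setdefault(prop, []).extend(nodes)


def grove_root(node):
    """Follow origins upward to the only node that has no origin."""
    while node.origin is not None:
        node = node.origin
    return node


def siblings(node):
    """Other nodes in the value of the same subnode property, if any."""
    if node.origin is None:
        return []
    return [n for n in node.origin.subnode_props[node.origin_prop] if n is not node]


doc, a, b, c = Node("doc"), Node("a"), Node("b"), Node("c")
doc.add_subnodes("content", [a, b])
a.add_subnodes("attributes", [c])
```

Because each node has at most one origin, following origins from any node always ends at the same grove root.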
The remaining category of nodal properties (and therefore arcs) is urefnode, or "unrestricted referenced node". A urefnode arc connects a node to nodes in the same or other groves. A set of groves thus connected is a "hypergrove".
NOTE 451 A hypergrove should not be confused with a hyperdocument or the hypergrove that may result from the processing of a hyperdocument. Any set of groves connected by urefnode arcs is a hypergrove, not just those groves resulting from the processing of hyperdocuments.
At most one property of a node is designated the "content property" of the node. The content property of a node must be either a subnode property or a string or character property.
If the content property of a node is a subnode property, the property is also the "children property" of the node. The nodes occurring in the value exhibited for the children property of a node are called the "children" of the node, and the node itself is called the "parent" of the children. A node that has children but does not have a parent is a "content tree root". The set of nodes reachable from a content tree root through children properties forms an ordered tree called a "content tree".
NOTE 452 If a node has a parent, then the parent is also the node's origin.
NOTE 453 The fact that a grove could contain a collection of disjoint content trees is the reason why it is called a grove.
NOTE 454 Tree addressing of a grove is normally based on content trees rather than on the subnode tree. That is because subnode tree ordering depends on the order of property definitions while content tree ordering depends on the order of node lists in property values -- that is, on the real information represented by the grove. For this reason, when referring to a grove the unmodified term "tree" means "content tree".
If the content property of a node is a character or string property, the property is also known as the "data property" of the node. The data of the node is the value of the data property. The data of a node with a children property is the data of each of its children, separated by the value of the node's "data separator property", if it has one.
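The recursive definition of a node's data can be sketched as below. Nodes are modelled as dictionaries with hypothetical keys "data", "children", and "separator"; treating a node with no content property as having empty data is an assumption of this sketch, not something stated in this clause.

```python
def data_of(node):
    """Return the data of a node as a string (illustrative model)."""
    if "data" in node:
        # The content property is a data property: its value is the data.
        return node["data"]
    if "children" in node:
        # The content property is a children property: join the data of
        # the children with the data separator, if the node has one.
        separator = node.get("separator", "")
        return separator.join(data_of(child) for child in node["children"])
    return ""  # assumption: no content property means no data


para = {
    "children": [{"data": "Hello"}, {"data": "world"}],
    "separator": " ",
}
doc = {"children": [para, {"data": "!"}]}
```

In this example the paragraph supplies a space as its data separator, while the document node has none, so its children's data are concatenated directly.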
For any given processing context, a grove plan may be used to specify what classes of node and what properties are significant within that context. This information can be used to prevent a process from wasting resources on inconsequential data while constructing a grove, or may be used as a mask over an existing grove, hiding those parts that are not of interest.
NOTE 455 The syntax used to specify a grove plan may vary. For instance, HyTime and DSSSL each define ways to specify grove plans (see 7.1.4.1 Grove Plan).
The effect of applying a grove plan is to remove, from the complete grove theoretically constructible by a given grove construction process with a given grove source, all instances of classes and properties not included by the grove plan.
When an instance of a class (that is, a node) is removed from a grove:
The node is removed from the value exhibited by any node for any nodal property in which it appears. If the property is a node property (that is not a list property), a null value is then exhibited for the property.
If the node has a children property, and the reason the node is being removed from the grove is only that its class has been excluded from the grove plan:
If the class specifies that content trees rooted at nodes of the class are to be "pruned" from the grove, each node in the value of the children property of the node is removed from the grove.
If the class specifies that content trees rooted at nodes of the class are not to be pruned from the grove, the node or list of nodes exhibited as the value of the node's children property are inserted into the subnode property value in which the node occurred, at the point at which it occurred.
If the node has a children property, and is being removed for a reason other than that its class has been excluded by the grove plan, each node in the value of the children property is removed from the grove.
Each node occurring in the value exhibited by the node for non-children subnode properties is removed from the grove.
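The treatment of children when a class is excluded from a grove plan can be sketched as follows. This illustration covers only the content tree: references from other nodal properties and non-children subnode properties are not modelled, and the dictionary keys and class names are hypothetical.

```python
def apply_plan(children, included, pruned):
    """Return the children list after applying a grove plan.

    `included` is the set of class names kept by the plan. `pruned` is the
    set of class names whose content trees are pruned when excluded.
    """
    kept = []
    for node in children:
        if node["cls"] in included:
            copy = dict(node)
            if "children" in node:
                copy["children"] = apply_plan(node["children"], included, pruned)
            kept.append(copy)
        elif node["cls"] in pruned:
            # Excluded and pruned: the whole content tree is removed.
            continue
        else:
            # Excluded but not pruned: the surviving children take the
            # node's place, at the point where the node occurred.
            kept.extend(apply_plan(node.get("children", []), included, pruned))
    return kept


tree = [{
    "cls": "doc",
    "children": [
        {"cls": "section", "children": [
            {"cls": "para", "id": "a"},
            {"cls": "para", "id": "b"},
        ]},
        {"cls": "comment", "children": [{"cls": "para", "id": "c"}]},
        {"cls": "para", "id": "d"},
    ],
}]
result = apply_plan(tree, included={"doc", "para"}, pruned={"comment"})
```

In this example the "section" node is excluded but not pruned, so its paragraphs are promoted into the document's children, while the paragraph inside the pruned "comment" node is removed with it.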
When an instance of a property (that is, a property assignment) is removed from a node:
The name of the property is removed from the value exhibited by the node for the "all property names" intrinsic property.
If the property is a subnode property, the name of the property is removed from the value exhibited by the node for the "subnode property names" intrinsic property.
If the property is the node's children property, the value exhibited by the node for the "child property name" intrinsic property is replaced by a null value.
If the property is the node's data property, the value exhibited by the node for the "data property name" intrinsic property is replaced by a null value.
If the property is the node's data separator property, the value exhibited by the node for the "data separator property name" intrinsic property is replaced by a null value.
If the property is a subnode property, the nodes occurring in the value exhibited for the property are removed from the grove.
If the property is a non-nodal property, the property is removed from the list of properties of the relevant nodes.
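The bookkeeping performed when a property assignment is removed can be sketched as follows. The node is modelled as a dictionary keyed by the intrinsic property names used in the text, with `None` standing in for the null value; this layout is an assumption made for illustration.

```python
SINGLE_NAME_INTRINSICS = (
    "child property name",
    "data property name",
    "data separator property name",
)


def remove_property(node, name):
    """Remove a property assignment and update the intrinsic properties.

    Returns the subnodes that must in turn be removed from the grove.
    """
    value = node["values"].pop(name)
    node["all property names"].remove(name)
    removed_subnodes = []
    if name in node["subnode property names"]:
        node["subnode property names"].remove(name)
        if value is not None:
            removed_subnodes = list(value) if isinstance(value, list) else [value]
    for intrinsic in SINGLE_NAME_INTRINSICS:
        if node[intrinsic] == name:
            node[intrinsic] = None  # replaced by a null value
    return removed_subnodes


# Illustrative node; subnodes are represented by placeholder strings.
node = {
    "all property names": ["gi", "attributes", "content"],
    "subnode property names": ["attributes", "content"],
    "child property name": "content",
    "data property name": None,
    "data separator property name": None,
    "values": {"gi": "P", "attributes": ["attr1"], "content": ["text1", "text2"]},
}
removed = remove_property(node, "content")
```

The returned list identifies the nodes that would then be removed from the grove under the node removal rules given earlier.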
HTML generated from the original SGML source using a DSSSL style specification and the SGML output back-end of the JADE DSSSL engine.