Copyright © 1992, 1997 International Organization for Standardization. All rights reserved.

This electronic document is for use during development and review of International Standards. Official printed copies of International Standards can be purchased from the ISO and the national standards organization of your country.


6 Base module

6.1 Concepts and definitions


Some key concepts relating to the facilities of the base module are described in this sub-clause.

6.1.1 Object representation

SGML is used for source object representation in a HyTime document. Groves are used for the representation of abstract objects derived from source objects.

NOTE 70 While the source of HyTime documents is represented in SGML, all HyTime semantics are defined in terms of operations on nodes in groves, that is, on the abstract data objects described by SGML documents (or any other type of data for which a suitable grove representation can be provided).

This sub-clause describes the key SGML constructs on which HyTime relies.

NOTE 71 Components of a HyTime hyperdocument that are not HyTime documents need not be represented in SGML. A document that is used as a hub document of a HyTime hyperdocument must be a HyTime document, and therefore must be represented in SGML.

SGML is a conceptual tool for the modeling of information structures intended for human perception (called "documents"), as well as a notation for representing them. Documents are analyzed and represented in terms of three main constructs:

element

A structural building block that can be defined to contain data, subordinate elements, or both. In typical documents, examples of elements would be paragraphs, chapters, headings, figures, and tables. In hypermedia documents, some examples are hyperlinks of various types, event schedules, events, and objects encoded in a variety of data notations.

attribute

A property, associated with elements of a given type, whose value describes the element but is not part of its content; for example, the owner of a chapter or the revision date of a table. Many element types have an ID attribute, which provides a label for each element of that type so that it can be referenced explicitly from any place in the document. The references are made by elements that have an ID reference attribute whose value is the label of the element referenced (as defined by that element's ID attribute).

entity

A unit of virtual information storage that contains part of a document (sometimes all of one or even more than one). An entity can be referenced by a name from one or more places in a document, thereby causing its information to be included in the document at the points of reference (see 6.1.1.1 Entity structure).

Entities are independent of elements. For example, an entity could contain part of a picture element or one and one-half paragraph elements.
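The relationship between elements, attributes, and ID references can be illustrated with a minimal sketch. It is not part of SGML or HyTime: the `Element` class, the `index_ids` helper, and the attribute names `id`, `refid`, and `owner` are assumptions chosen for illustration, and a real document would declare its own attribute names in its DTD.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A structural building block: a type name, attributes, and content."""
    gi: str                                          # element type name
    attributes: dict = field(default_factory=dict)   # describe the element, not its content
    content: list = field(default_factory=list)      # data strings and/or child Elements


def index_ids(root, id_attr="id"):
    """Walk the element tree and map each ID value to the element that bears it."""
    table = {}
    stack = [root]
    while stack:
        el = stack.pop()
        label = el.attributes.get(id_attr)
        if label is not None:
            if label in table:
                raise ValueError(f"duplicate ID: {label}")
            table[label] = el
        stack.extend(c for c in el.content if isinstance(c, Element))
    return table


# A chapter containing a paragraph that refers to a figure by its ID.
fig = Element("figure", {"id": "fig1"}, ["A diagram"])
ref = Element("figref", {"refid": "fig1"})
para = Element("para", {}, ["See ", ref, "."])
chapter = Element("chapter", {"id": "ch1", "owner": "editor"}, [para, fig])

ids = index_ids(chapter)
target = ids[ref.attributes["refid"]]   # resolve the ID reference
```

Here the `owner` attribute describes the chapter without being part of its content, and the reference is resolved through the label alone, wherever the figure happens to sit in the document.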

The particular instances of those constructs that are permitted in a given document are declared in a "document type definition" (DTD) to which the document conforms. The designer of a DTD can choose to allow parts of it to be extended or modified (typically by redefining entities that are referenced in the DTD). This technique provides the flexibility necessary to accommodate a wide variety of applications.

Processors that act on the semantic objects and their data content do not operate directly on the source representation. Instead, they must first create an abstract, "in-memory" representation of the structures and data in the source. In the case of SGML, an SGML parser processes the source document to recognize markup and data and provides information about what it found to a processing application. The processing application then operates on its particular abstraction derived from the result of parsing the document.

The HyTime standard uses a standard form of abstract structure representation called "groves". All HyTime-defined processing operates on groves (rather than on the unparsed source data from which groves are constructed). Groves consist of "nodes", which exhibit "properties". Each node is of a particular node class. The node classes and their properties are defined in "property sets".
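The following is a rough sketch of the relationship between node classes, properties, and property sets. The `PROPERTY_SET` table and the `Node` class are assumptions made for illustration; they are far simpler than any property set defined under the PSDR and do not reproduce the SGML property set.

```python
# Hypothetical property set: node class name -> names of the properties it exhibits.
PROPERTY_SET = {
    "element": ("gi", "id", "content"),
    "datachar": ("char",),
}


class Node:
    """A grove node: it belongs to one class and exhibits that class's properties."""

    def __init__(self, node_class, **properties):
        if node_class not in PROPERTY_SET:
            raise ValueError(f"unknown node class: {node_class}")
        allowed = PROPERTY_SET[node_class]
        unknown = sorted(set(properties) - set(allowed))
        if unknown:
            raise ValueError(f"{node_class} does not exhibit: {unknown}")
        self.node_class = node_class
        # Properties not supplied are still exhibited, with a null value.
        self.properties = {name: properties.get(name) for name in allowed}

    def __getitem__(self, name):
        return self.properties[name]


# A tiny grove: one element node whose content is a list of data character nodes.
chars = [Node("datachar", char=c) for c in "Hi"]
para = Node("element", gi="para", id="p1", content=chars)
text = "".join(node["char"] for node in para["content"])
```

The point of the sketch is that processing code asks nodes for properties named in the property set; it never inspects the source markup from which the grove was built.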

The combination of property sets and groves provides a standard for the representation of application-specific data views of SGML documents (and, potentially, other data notations). It enables interoperation of applications by providing, at a minimum, a common design formalism with which applications can define the data objects on which they operate and the results of those operations, irrespective of how they actually represent or process data objects internally. In other words, groves and property sets make it possible to discuss the processing of structured data without first defining the implementation of the processors themselves. Groves are discussed in detail in 7.1.4 Groves and Location Addressing and in A.4 Property Set Definition Requirements (PSDR).

6.1.1.1 Entity structure

SGML includes a virtual storage model called the "entity structure" which allows a user to divide a document arbitrarily for ease of management. The entity structure is "virtual" because the mapping from entities to real storage is implementation-dependent, and there need not be a one-to-one relationship between entities and storage objects.

The entity structure is not typically reflected in the part of the document type definition that defines the structure that the application processing is concerned with -- the "element structure". The independence of the entity structure is one of the great strengths of SGML, and is essential for hypertext. For example, it relieves an application designer from the burden of having to predict whether chapters will span multiple storage objects or whether a single storage object will contain multiple chapters.
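The independence of the entity structure from the element structure can be sketched as follows. The `STORAGE` table, the entity names, and the `expand` function are hypothetical, and the sketch handles only simple `&name;` references; it is not an SGML entity manager and does no cycle detection.

```python
import re

# Hypothetical storage layer: each entity maps to one or more storage objects.
# "part1" spans two storage objects and ends in the middle of a paragraph.
STORAGE = {
    "part1": ["<para>One.</para>", "<para>Two and"],
    "part2": [" a half.</para>"],
}

ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*);")


def entity_text(name):
    """Return an entity's text, hiding how many storage objects hold it."""
    return "".join(STORAGE[name])


def expand(text):
    """Replace each entity reference with the (recursively expanded) entity text."""
    return ENTITY_REF.sub(lambda m: expand(entity_text(m.group(1))), text)


document = "<chapter>&part1;&part2;</chapter>"
expanded = expand(document)
```

After expansion the element structure is complete and well nested, even though neither entity boundary coincides with an element boundary.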

6.1.1.2 Data

The term "data" is used to distinguish the information in a document that is not part of the document structure. For example, the character text within a paragraph, or the raster information representing a photograph, is data. An external entity containing a document that uses a different document representation from the document referencing it is also data.

NOTE 72 Similarly, a recursive instance of an SGML document or subdocument is data with respect to the current instance, because, as it has its own DTD, it is not parsed in the current SGML parsing context.

Data is sometimes referred to informally as "content", but this usage is ambiguous because the content of an element is not restricted to data; it could also be subelements exclusively, or mixed subelements and data. Data can also occur outside the content of an element in attribute values or external data entities.

NOTE 73 In SGML, data is also defined as "that which is not markup", but the two definitions are consistent: In an SGML document, information that is considered data because it is not structural will also not be markup, and vice versa.

HyTime provides facilities for addressing all forms of data, either in terms of HyTime constructs, or by interfacing to "location addresses" that understand the data representation (see 7 Location address module).

6.1.2 Object identification and addressing

Hypertext linking and multimedia time synchronization are applications of the same basic function -- addressing.

A hyperlink may address its anchors by names that are unique within some name space or by position within a list or tree. Time synchronization uses coordinate addresses of events on a time axis: the address of event A may be expressed in terms of its position and size on the axis, or in terms of its relationship to the address of event B. Spatial alignment is similar, except that the axes are measured in spatial units, rather than temporal.

In HyTime, addressing is applied to nodes in groves and the result of any location address is the list of nodes addressed. Syntactically, the list of nodes addressed can be given a unique name by specifying a unique ID for the location address element that addresses the nodes. Whether it has a name or not, every such node list is uniquely addressable because it becomes the property of a node in the "HyTime semantic grove". The nodes exhibiting these properties can then be addressed by name, if they have one, or by other node addressing methods, if supported.

NOTE 74 If an application supports only ID-based addressing, every element that other elements depend on and use by reference must have an ID.

6.1.2.1 Name space addressing

In the grove model, nodes of any class may exhibit a "name" property that uniquely identifies each node within a "name space". In groves, a name space is a property whose value is a list of nodes, all of which share the same "name" property and each of which exhibits a unique value for its name. SGML defines two primary name spaces that are made directly accessible through the syntax of SGML documents: application-defined information components, called elements, and units of storage segmentation of information, called entities.

By and large, applications deal only with elements, while storage is managed transparently. Therefore, the fundamental form of name is one that is unique among the elements of a document, known as a unique identifier (ID). The fundamental means of specifying an object to reference is called an "ID reference" (IDREF). Like the ID, it is an attribute of an element.
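As a minimal sketch of the name space notion described in this sub-clause, the hypothetical `NameSpace` class below keeps a list of nodes in which every name is unique. The class, its method names, and the dictionary-based nodes are assumptions for illustration only.

```python
class NameSpace:
    """A list of nodes in which each node has a unique name."""

    def __init__(self, label):
        self.label = label
        self._by_name = {}

    def add(self, name, node):
        if name in self._by_name:
            raise ValueError(f"{name!r} is already defined in {self.label}")
        self._by_name[name] = node

    def resolve(self, name):
        """Return the single node that the name identifies."""
        return self._by_name[name]

    def nodes(self):
        return list(self._by_name.values())


# Elements and entities are separate name spaces, so the same name
# can be used once in each without conflict.
elements = NameSpace("elements")
entities = NameSpace("entities")
elements.add("intro", {"class": "element", "gi": "chapter"})
entities.add("intro", {"class": "entity", "text": "Introductory text"})

referenced = elements.resolve("intro")   # what an ID reference to "intro" yields
```

A name is therefore meaningful only together with the name space in which it is looked up, which is why an ID reference always resolves among elements.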

Conceptually, all location addresses begin as ID references.

NOTE 75 HyTime also provides "shortcut" forms of address in which the ID reference or references are implicit. For example, when coordinate addressing is supported, the reftype facility allows the direct use of coordinate addresses in place of ID references. However, for all such shortcuts, the equivalent reference can be constructed using ID references and other fundamental forms of addressing.

For locations outside the current document, HyTime uses a standardized system of identification, including public and private, local and global, unique identifiers (see ISO 9070) in order to address the entities that contain those locations.

6.1.2.2 Coordinate addressing

A coordinate system in HyTime is called a "finite coordinate space". It consists of a set of coordinate axes and a system for measuring along them.

Each axis is treated as an ordered set of "quanta". A coordinate address consists of a position (the first quantum of interest) and a specific number of subsequent contiguous quanta for each of the axes of the coordinate space. This combination of position and size is called an extent.

When the scheduling module is supported, occurrences of objects ("events") can be given extents in coordinate spaces. Events can be aligned with one another by defining their extents with reference to the extents of other events.
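An extent on a single axis, and the alignment of one event relative to another, might be sketched as below. The `Extent` class and the `after` and `overlaps` helpers are hypothetical, quanta are assumed to be counted from 1, and an extent in a multi-axis coordinate space would need one such pair per axis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    position: int   # first quantum of interest (assumed to count from 1)
    size: int       # number of contiguous quanta, including the first

    @property
    def last(self):
        """The last quantum covered by the extent."""
        return self.position + self.size - 1


def after(other, size, gap=0):
    """Define an extent relative to another: start `gap` quanta after it ends."""
    return Extent(other.last + 1 + gap, size)


def overlaps(a, b):
    """True if the two extents share at least one quantum."""
    return a.position <= b.last and b.position <= a.last


event_a = Extent(position=1, size=10)       # quanta 1 through 10
event_b = after(event_a, size=5)            # quanta 11 through 15
event_c = after(event_a, size=5, gap=-3)    # a negative gap starts before A ends
```

Because `event_b` is defined from `event_a`, moving or resizing A moves B as well, which is the sense in which events are aligned by reference to one another.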

When the location address module is supported, "location address" elements can be defined that associate an ID (directly or indirectly) with a coordinate address. This type of location address allows references to be made to objects that can be identified only by their position.

NOTE 76 For example, "the third word in the sentence".

It also allows references to arbitrary portions of an object.

NOTE 77 For example, "the second and third characters of the third word in the sentence".
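The two examples in the notes above can be worked through with a small sketch. The `select` function is hypothetical and assumes positions counted from 1; it treats words in a sentence, and characters in a word, as quanta on an axis.

```python
def select(quanta, position, size=1):
    """Return `size` contiguous quanta starting at the 1-based `position`."""
    if position < 1 or size < 1 or position + size - 1 > len(quanta):
        raise IndexError("extent lies outside the addressable range")
    start = position - 1
    return quanta[start:start + size]


sentence = "HyTime addresses arbitrary portions of objects"
words = sentence.split()

third_word = select(words, 3)[0]       # "the third word in the sentence"
chars = select(third_word, 2, 2)       # its second and third characters
```

The second address is expressed relative to the result of the first, so neither the word nor the characters need an identifier of their own.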

6.1.2.3 Semantic addressing

Any object, in any notation, can be represented in a HyTime hyperdocument. The object as a whole could be an element, and therefore could have an ID. The object could be included in an event, and thereby have a coordinate address.

When the location address module is used, it is also possible to address an object by a query against its properties. Nodes in any grove, regardless of the data notation from which they were derived, are addressable by HyTime location addresses. Because the properties of nodes in groves are formally defined in a property set definition, it is possible to have general query facilities that can operate on any grove. Groves can, potentially, be constructed from any data notation by a "grove construction process" that understands the syntax and semantics of the notation. However, the degree to which a grove can fully express the structure or content of a given notation depends on that notation's similarity to SGML: groves are neither intended nor guaranteed to be capable of fully representing every possible notation.

NOTE 78 In practical terms, "constructing a grove" may simply mean providing an interface that takes as input a HyTime location address and returns the data that would have been addressed had a literal grove been constructed. In other words, there is no requirement to literally build a grove; it is only necessary to behave as if one had been built.
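One way to picture the "as if" behaviour described in the note above is an interface that produces nodes on demand while answering a query, without ever storing a grove. The notation (plain lines of words), the node properties, and the `iter_nodes` and `query` functions are all assumptions made for this sketch.

```python
def iter_nodes(source):
    """Yield one node (a dict of properties) per word, produced on demand."""
    for line_no, line in enumerate(source.splitlines(), start=1):
        for word_no, word in enumerate(line.split(), start=1):
            yield {"class": "word", "line": line_no, "position": word_no, "text": word}


def query(source, **wanted):
    """Return the nodes whose properties equal every requested value."""
    return [
        node for node in iter_nodes(source)
        if all(node.get(name) == value for name, value in wanted.items())
    ]


source = "groves have nodes\nnodes have properties"
hits = query(source, text="nodes")
```

The caller receives the same node list it would have received from a fully built grove, while the interface only scans the source data and discards the nodes that do not match.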


HTML generated from the original SGML source using a DSSSL style specification and the SGML output back-end of the JADE DSSSL engine.