ISO/IEC JTC 1/WG4 N1942

ISO/IEC JTC 1/WG4

Information Technology ---

Document Description Languages

TITLE : Requirements for Standardizing Light Weight Page Description Specification

SOURCE : JTC1/WG4 Japan (Masato Toho, Fuji Xerox Co.,LTD.)
STATUS : Document for the JTC1/WG4 Discussion
REQUESTED ACTION : Discussion for preparation of NP
DATE : Dec. 1, 1997
DISTRIBUTION : JTC1/WG4




1. Background

Page Description Languages (PDLs) are used to describe formatted documents to be sent to printers. Many PDLs have been developed so far, and they are widely used in network printer systems. Examples include PostScript (Adobe), Interpress (Xerox), and PCL (HP). PostScript is the most widely used PDL, especially in Japan. In the U.S. and Asian countries, PCL is also widely used. Printer makers also provide their own proprietary PDLs, such as LIPS (Canon) and ART (Fuji Xerox).

Printer makers need to support others' proprietary PDLs, such as PostScript and PCL, to maintain compatibility. They also maintain their own proprietary PDLs for many reasons, such as greater efficiency and flexibility on their own printers, compatibility with current products, and cost. Since a royalty is required in many cases to use others' proprietary PDLs such as PostScript, a printer maker's own PDL costs less.

Supporting multiple PDLs raises development costs and lengthens development periods for printer makers. The existence of many PDLs also causes incompatibilities among printers, even those from the same manufacturer. Customers need to choose printers based on the PDL used in their environment, in addition to basic printer functionality such as color capability or printing speed. Choosing a PDL is not a fruitful task, since the main purpose of a PDL, describing formatted pages, is the same for all PDLs, and their drawing capabilities are almost the same.

After considerable effort, the Standard Page Description Language (SPDL) became the international standard for PDLs. It provides both powerful functionality, as PostScript does, and document structure, as Interpress does. It is therefore powerful enough to cover all types of formatted documents for all kinds of printers.

Unfortunately, SPDL has not yet come into use, so it has not solved these issues and the situation of PDLs is unchanged. SPDL is a very large and complicated standard, so it takes a great deal of effort to understand it, to implement a processing system, and to implement printer drivers. Printer makers seem to be too busy already to develop new printer systems based on this large standard.

Other issues with PDLs are the processing time on printers and the extensive computing resources needed to execute them. These issues have become more apparent recently because the number of color documents is increasing. Since color documents are far more complex than black and white documents, they require longer processing time and more computing resources.

PDLs were originally designed to be independent of printers and of document creators. For this purpose, they are general purpose programming languages that provide extensibility and flexibility. To process PDL files, programming language processors (interpreters) are implemented on printers.

Although PDLs were designed to be printer independent, current page description files are no longer printer independent; they depend heavily on the printers they target. This is because PDLs have many print instruction operations and printer dependent operations for specifying instructions to printers. Thus, page description files are now just temporary files, created whenever applications print documents and discarded as soon as printing is done. In addition, most PDL files are generated automatically by printer drivers, which are software provided by printer makers and resident on client systems. Thus, most page descriptions are stereotyped, and most high level programming capabilities go unused.

Even though most PDL files do not use most of the programming capability, interpreters on printers cannot rely on that fact, since interpreters must accept any valid description according to the language specification. Interpreting a PDL is a time consuming task and requires computing resources for execution.

Therefore, the typical steps for printing data from an application program on a personal computer are as follows. First, the application calls operating system interface functions with the drawing data, so that the appropriate entry point of a printer driver is called. In the printer driver, the data passed by the call are converted to a fragment of a PDL program. The printer driver sends the resulting PDL program to the printer through a network. On the printer, the PDL interpreter executes the program to extract the data to be drawn. Finally, after appropriate processing, the data are actually drawn on a page by the printer. Thus, the data are first converted to function calls, then to fragments of a program, and finally back to drawing data. These conversion steps are also time consuming.
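
The round trip described above can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the postfix mini-language, the operator names moveto and show, and the function names driver_emit and printer_interpret are hypothetical and do not come from PostScript, PCL, SPDL, or any real driver. The sketch also assumes that strings contain no parentheses.

```python
import re

# Toy model only: a made-up postfix mini-language stands in for a real PDL.

def driver_emit(draw_calls):
    """Printer driver side: convert drawing data into a fragment of program text."""
    return "\n".join(f"{x} {y} moveto ({s}) show" for x, y, s in draw_calls)

def printer_interpret(program):
    """Printer side: execute the program on an operand stack to recover the data."""
    stack, position, drawn = [], None, []
    # A token is either a parenthesized string or a run of non-space characters.
    for token in re.findall(r"\([^)]*\)|\S+", program):
        if token == "moveto":
            y, x = stack.pop(), stack.pop()
            position = (x, y)
        elif token == "show":
            drawn.append((position[0], position[1], stack.pop()))
        elif token.startswith("("):
            stack.append(token[1:-1])      # string literal
        else:
            stack.append(float(token))     # numeric literal
    return drawn

calls = [(72, 700, "Hello"), (72, 680, "light weight")]
program = driver_emit(calls)               # data -> program text
recovered = printer_interpret(program)     # program text -> data again
```

The printer ends up with the same drawing data the application started with; the program text in between exists only to be executed once and discarded.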

SPDL is not a solution to these issues, since SPDL is another interpretive programming language and is as complex as other PDLs. Processing SPDL files therefore requires a fair amount of time and complicated steps.

2. User Requirements

To solve the issues described above, we propose establishing a standard for a page description specification used to describe formatted documents for printing and distribution. The purpose of this standard is to provide compatibility across ordinary documents, such as web pages, word processing documents, spreadsheets, and presentations.

In order to cover such ordinary documents, it should provide sufficient drawing capability to fully describe them.

One of the purposes of this standard is to establish a widely used criterion for describing formatted documents, so the standard should be easy to understand and implement. It also needs to be processed easily, without extensive computing resources on printers. In other words, it should be a "light weight" specification. To be usable by many printers, the standard should be printer independent.

Since color documents have become fairly common with the increase in color printers in the field, it should have adequate color capability.

Independence among pages is useful for manipulating document pages.

Programming capability is not necessary to describe documents for most applications.

Once this kind of standard is established, most printers will be able to provide the same functionality to most common applications and client systems. Printer makers and application software vendors will be able to develop software for it easily. Even individuals will be able to write their own small programs to use it.

3. Light Weight Page Description Specification

3.1 Abstract

The Light Weight Page Description Specification (LPDS) has the drawing capability to cover ordinary documents, such as web pages and business documents. Although it has the drawing capabilities to describe most documents, it is a light weight specification compared to existing PDLs.

To keep descriptions simple, drawing objects are described directly in LPDS. This contrasts with PDLs, in which language interpreters produce drawing objects as side effects of executing page descriptions as computer programs. Direct description of objects shortens processing time and eliminates the need for expensive language interpreters.

LPDS files will simply consist of a list of the drawing objects on each page.

Drawing objects in LPDS are chosen to cover most ordinary documents. Since most pages consist of three kinds of objects, namely text strings, raster images, and two dimensional graphics, LPDS provides these three kinds of objects.
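
As a rough illustration of direct description, a page can be modeled as a plain list of the three object kinds, which a processor walks once without executing any program. This is only a sketch: the class names, the field names, and the representation of a trajectory as a list of points are assumptions made here, not part of the proposed specification.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

Point = Tuple[float, float]

@dataclass
class Text:
    string: str
    location: Point

@dataclass
class Image:
    corner_a: Point          # the image location as two corners of a rectangle
    corner_b: Point
    pixels: bytes

@dataclass
class Graphics:
    trajectory: List[Point]  # assumed representation of a path
    filled: bool             # True for a filled area, False for a stroked figure

DrawingObject = Union[Text, Image, Graphics]

@dataclass
class Page:
    objects: List[DrawingObject] = field(default_factory=list)

def draw_page(page: Page) -> List[str]:
    """Walk the object list once; no operand stack and no program execution."""
    operations = []
    for obj in page.objects:
        if isinstance(obj, Text):
            operations.append(f"text {obj.string!r} at {obj.location}")
        elif isinstance(obj, Image):
            operations.append(
                f"image {len(obj.pixels)} bytes in {obj.corner_a}-{obj.corner_b}")
        elif isinstance(obj, Graphics):
            kind = "fill" if obj.filled else "stroke"
            operations.append(f"{kind} path with {len(obj.trajectory)} points")
        else:
            raise TypeError(f"unknown drawing object: {obj!r}")
    return operations

page = Page([
    Text("Hi", (72, 700)),
    Graphics([(0, 0), (10, 0), (10, 10)], filled=True),
    Image((0, 0), (4, 4), b"\x00" * 16),
])
operations = draw_page(page)
```

The strings returned by draw_page stand in for whatever a real device would do with each object; the point is that the work per object is a single dispatch on its kind.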

As a standard, LPDS is as independent of particular printers as possible.

For easy processing and manipulation of the pages in a document, pages in LPDS are basically independent of each other.
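
One consequence of page independence is that pages can be extracted or reordered without looking at any other page. The following sketch assumes a hypothetical in-memory model in which each page carries its own header; the names Page, render_page, and extract_pages and the header key "size" are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Page:
    header: Dict[str, str]   # per-page settings; nothing is inherited from other pages
    objects: List[str]       # placeholder for the page's drawing objects

def render_page(page: Page) -> List[str]:
    """The result depends only on the page itself, not on earlier pages."""
    size = page.header["size"]
    return [f"{size}: {obj}" for obj in page.objects]

def extract_pages(pages: List[Page], wanted: List[int]) -> List[Page]:
    """Pick pages by 1-based number, in the requested order."""
    return [pages[n - 1] for n in wanted]

document = [
    Page({"size": "A4"}, ["title"]),
    Page({"size": "A4"}, ["body"]),
    Page({"size": "A3"}, ["chart"]),
]
subset = extract_pages(document, [3, 1])
rendered = [render_page(p) for p in subset]
```

Rendering the extracted pages gives the same per-page result as rendering them in place, which would not hold if a page relied on state set up by an earlier page.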

3.2 Relationship with other data formats

The main purpose of LPDS is to describe pages to be printed by printers. From this point of view, LPDS has the same kind of functionality as PDLs and print commands. LPDS focuses on describing ordinary documents, so it covers most common documents, except for the most complicated ones. Unlike print commands, LPDS is printer independent, as PDLs are. Its drawing capability is also almost as good as that of PDLs.

SPDL and other PDLs are powerful enough to describe any kind of document for any type of printer. They also have the programming capability to describe potentially any kind of printer control. PDLs are therefore used for high end network printers that provide programming capability for controlling printers from client systems. They may also be used for future printers that are more complicated and programmable.

Another possible application of LPDS is as a format for distributing documents just for browsing. It may also serve as an exchange format among various printing systems and publishing systems.

4. Schedule

  1. WP: 1998/02
  2. CD: 1998/08
  3. DIS: 1999/05

Annex
An excerpt from syntax of light weight page description specification

Document	= Document-Header [Page]*
Document-Header	= lpds Version;
Version		= Numeric

Page		= Start-Page Page-Header [Object]* End-Page
Start-Page	= {
End-Page	= }
Page-Header	= [ User-Coordinate ] Default-Attributes
Object		= Text | Graphics | Image
Graphics		= Area | Figure
Area		= area Trajectory [Area-Attributes] [Color];
Figure		= figure Trajectory [Figure-Attributes] [Color];
Text		= text Character-String Text-Location [Text-Attributes] [Color];
Image		= image Image-Location [Image-Attributes] Pixel-Data;

Area-Attributes	= < Winding-Rule, Flatness >
Figure-Attributes	= < Width, Join, Cap, Miter-Limit, Flatness >
Text-Attributes	= < Font-Matrix, Font-Name, Stroke-Type, Figure-Attributes>
Image-Attributes	= < Width, Height, Size, Interleave, Depth, ColorSpace >

User-Coordinate	= Numeric-Array
Font-Matrix	= Numeric-Array
Font-Name	= Character-String
Text-Location	= Point
Image-Location	= Rectangle

Rectangle	= Point ~ Point
Point		= Numeric @ Numeric
Numeric-Array	= [ [Numeric]* ]
Character-String	= "String"
Numeric		= Real | Integer
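
To show how the productions above compose, the following sketch emits text for those that the excerpt defines completely. It rests on several assumptions that the excerpt does not settle: tokens are separated by single spaces, numbers are written in Python's default notation, and strings contain no double quote characters because no escaping rule is given. Trajectory, Color, and Default-Attributes are not defined in the excerpt, so graphics objects and whole pages are left out. All function names are hypothetical.

```python
def numeric(value):
    # Numeric = Real | Integer  (lexical form assumed: Python's repr)
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError("Numeric must be an int or a float")
    return repr(value)

def point(x, y):
    # Point = Numeric @ Numeric
    return f"{numeric(x)} @ {numeric(y)}"

def rectangle(p1, p2):
    # Rectangle = Point ~ Point
    return f"{point(*p1)} ~ {point(*p2)}"

def numeric_array(values):
    # Numeric-Array = [ [Numeric]* ]
    return " ".join(["["] + [numeric(v) for v in values] + ["]"])

def character_string(s):
    # Character-String = "String"  (no escaping rule in the excerpt)
    if '"' in s:
        raise ValueError("escaping of double quotes is not defined in the excerpt")
    return f'"{s}"'

def text_object(string, x, y):
    # Text = text Character-String Text-Location [Text-Attributes] [Color];
    # The optional attributes and color are omitted in this sketch.
    return f"text {character_string(string)} {point(x, y)};"

def document_header(version):
    # Document-Header = lpds Version;
    return f"lpds {numeric(version)};"

header_line = document_header(1)
text_line = text_object("Hello", 72, 700)
matrix = numeric_array([1, 0, 0, 1, 0, 0])
box = rectangle((0, 0), (100, 50))
```

Under these assumptions, text_line is the string text "Hello" 72 @ 700; and box is 0 @ 0 ~ 100 @ 50.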