Markup languages have been around since 1969, when three IBM researchers created the Generalized Markup Language. That was the grandfather of Hypertext Markup Language (HTML), which makes the Web work, and of Extensible Markup Language (XML), which has become the primary means of defining, storing and formatting data in a multitude of areas, including documents, forms and databases.

At the heart of these languages is a system called tagging, where text or data is marked by indicators enclosed in angled brackets, always at the beginning <tag> and often at the end </tag>.



HTML pages use standardized, predefined tags. For example, <p> means a paragraph, <h1> means a top-level heading and <b> followed by </b> means the enclosed text is to be bold. Web browsers interpret these tags and format the text accordingly when they display the pages on-screen.

With XML, however, programmers can make up tags, and browsers have no built-in way of knowing what the tags mean or what to do about them. Further complicating matters, we can use tags to describe data itself (content) or to give formatting instructions (how to display or arrange an element).

For instance, <table> could refer to a matrixlike arrangement of items on an HTML page, or it could signify a piece of furniture. This flexibility makes XML powerful, but it confuses the distinction between content and format.
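The ambiguity is easy to see in a short sketch. The two fragments below are invented for illustration; both use <table> as the root tag, and a generic XML parser (here Python's standard-library ElementTree) accepts them equally, attaching no meaning to the name.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents that both use <table> as the root tag.
layout_table = ET.fromstring("<table><tr><td>Q1</td><td>Q2</td></tr></table>")
furniture_table = ET.fromstring("<table><material>oak</material><legs>4</legs></table>")

# The parser sees the same tag name in both cases...
same_tag = layout_table.tag == furniture_table.tag

# ...and only the child elements hint at what each author meant.
child_names = ([c.tag for c in layout_table], [c.tag for c in furniture_table])

print(same_tag)
print(child_names)
```

Nothing in either fragment tells software whether "table" is a layout instruction or a piece of content; that meaning has to be supplied from outside the document.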

In order to display XML documents usefully, we need a mechanism that identifies and describes the meaning of formatting tags and shows how they affect other parts of the document. Earlier mechanisms include the Document Style Semantics and Specification Language (DSSSL) and Cascading Style Sheets (CSS). Extensible Stylesheet Language (XSL), which the World Wide Web Consortium (W3C) made a recommendation in 2001, builds on and extends both.

XSL provides a comprehensive model and vocabulary for writing stylesheets using XML syntax. It is used to define how to transform an XML file into a format (such as HTML) that a browser can recognize and understand.

XSL can add elements to the output file or remove or ignore existing elements. It can rearrange and sort the elements, test and make decisions about which elements to display, and a lot more.
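As a rough illustration of those capabilities, here is a small stylesheet that adds HTML elements to the output, filters with a test and sorts the results. The catalog, book, title and instock names are made up for this sketch and do not come from any real schema. Because a stylesheet is itself XML, the Python below simply parses it with the standard library and lists its template rules; it does not run the transformation, which would require a separate XSLT processor.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Hypothetical stylesheet: turns a <catalog> of <book> elements into an
# HTML list, keeping only in-stock books and sorting them by title.
STYLESHEET = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="/catalog">
    <html>
      <body>
        <h1>Books</h1>
        <ul>
          <xsl:apply-templates select="book[@instock='yes']">
            <xsl:sort select="title"/>
          </xsl:apply-templates>
        </ul>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="book">
    <li><xsl:value-of select="title"/></li>
  </xsl:template>
</xsl:stylesheet>
"""

# The stylesheet parses like any other XML document.
root = ET.fromstring(STYLESHEET)

# Collect the match pattern of each top-level template rule.
templates = root.findall(f"{{{XSL_NS}}}template")
patterns = [t.get("match") for t in templates]

print(root.tag)
print(patterns)
```

The first template supplies the surrounding HTML page, and the second says what to emit for each book that survives the filter.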

Components of XSL
XSL is actually a family of three tools produced by the W3C's XSL Working Group: XPath, XSLT and XSL-FO.

  • XPath, or XML Path Language, is used to specify the parts of an XML document that will be transformed by XSL Transformations (XSLT). XPath interprets an XML document as a hierarchical tree of nodes, which can include elements, attributes or text. The hierarchical tree is called the source-node tree.
  • XSLT describes how to filter or convert (transform) XML documents into other types of XML documents, including XSL Formatting Object (XSL-FO) files. An XSLT stylesheet contains a set of template rules for transforming a source tree by matching a pattern against elements in the source tree. When a match is found, the rules are used to create a new node in the result tree. The result tree's structure can be completely different from that of the source tree because elements can be filtered and reordered and arbitrary structure added. An XSLT stylesheet is like a sophisticated search-and-replace routine.
  • XSL-FOs are instructions that define exactly how a document will be formatted for a specific medium or device. For a document to be printed, formatting objects can include characters, blocks of text, images, tables, borders, master pages and the like.

    XSL-FO specifies various layout rules (e.g., where page breaks can occur) and requirements (e.g., placement of footnotes), but the XSL-FO file itself doesn't determine exactly where each element is positioned. That's done by a separate formatting engine that interprets the file.

    XSL-FO isn't restricted to printed pages and on-screen appearance; it can also specify audio reproduction, for example. Confusingly, XSL-FO is sometimes referred to as XSL.
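To make the source-tree and result-tree ideas concrete, the sketch below uses Python's standard-library ElementTree, which supports only a limited subset of XPath. The sample catalog and its element names are invented for illustration, and the last step imitates a transformation by hand rather than running a real XSLT processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document; the parser turns it into a tree of nodes.
SOURCE = """
<catalog>
  <book instock="yes"><title>XML Basics</title><price>20</price></book>
  <book instock="no"><title>Advanced XSLT</title><price>35</price></book>
  <book instock="yes"><title>Formatting Objects</title><price>28</price></book>
</catalog>
"""

root = ET.fromstring(SOURCE)

# Path expression selecting element nodes: every <title> under a <book>.
all_titles = [t.text for t in root.findall("./book/title")]

# Path expression with an attribute test: only books marked as in stock.
in_stock = [b.findtext("title") for b in root.findall("./book[@instock='yes']")]

# Build a separate result tree (an HTML list) from the selected nodes,
# filtered and reordered relative to the source tree.
ul = ET.Element("ul")
for title in sorted(in_stock):
    ET.SubElement(ul, "li").text = title
result = ET.tostring(ul, encoding="unicode")

print(all_titles)
print(in_stock)
print(result)
```

The result tree has a different shape from the source tree: prices are dropped, one book is filtered out and the remaining titles appear in sorted order.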

Why XSL?
XSL is designed for repetitive situations where documents are dynamically generated and formatted on demand, not for documents that require a creative professional to modify the layout, content and typography to get an acceptable (albeit static) result. XSL is thus an ideal fit for documents that have to be output in a variety of formats and on many different types of devices, ranging from printers and computer screens to handhelds and phones.

Kay is a Computerworld contributing writer in Worcester, Mass. Contact him at russkay@charter.net.

[Figure: XSL in Action]


Copyright © 2004 IDG Communications, Inc.
