2008-10-22
Tyng-Ruey Chuang
trc@iis.sinica.edu.tw
Institute of Information Science
Academia Sinica, Taipei, Taiwan
We shall study XSL Transformations (XSLT), Version 1.0,
which is a W3C Recommendation published on November 16, 1999.
Link:
http://www.w3.org/TR/xslt
We shall show how to use the Xalan-Java XSLT processor.
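As a rough illustration of what driving an XSLT processor from Java can look like, the sketch below uses the standard JAXP API (javax.xml.transform). With Xalan-Java on the classpath, TransformerFactory.newInstance() would normally pick it up; otherwise the JDK's bundled processor is used. The class name, the tiny input document, and the stylesheet are assumptions made up for this sketch and are not part of the course material.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ApplyStylesheet {
    // Hypothetical input document, kept free of whitespace so the output is easy to read.
    static final String XML =
        "<people><person><name>Alan Turing</name></person>"
        + "<person><name>Richard Feynman</name></person></people>";

    // Hypothetical stylesheet: one list item per person.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='people'><ul><xsl:apply-templates/></ul></xsl:template>"
        + "<xsl:template match='person'><li><xsl:value-of select='name'/></li></xsl:template>"
        + "</xsl:stylesheet>";

    // Compiles the stylesheet, applies it to the input, and returns the result as a string.
    static String transform(String xml, String xslt) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSLT));
    }
}
```

Reading the stylesheet and the input from files instead only requires passing a java.io.File to StreamSource.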
Useful resources on the Web:
This set of slides mostly follows Chapter 8 of the reference book XML in a Nutshell (3rd edition), as well as the tutorial of Paul Grosso and Norman Walsh listed above.
The Extensible Stylesheet Language (XSL) has two parts: XSL Transformations (XSLT) and XSL Formatting Objects (XSL-FO). We will only talk about XSLT for now.
XSLT is an XML application for specifying rules by which one XML document is transformed into another XML document. First, a few things about XSLT:
An XSLT program is in itself an XML document. This document is called an XSLT stylesheet, and can be referred to in a processing instruction in another XML document to specify how that XML document shall be transformed.
An XSLT program contains template rules. Each template rule has a pattern and a template. An XSLT processor compares the nodes in an input XML document against the pattern of each rule. When a node matches, it is transformed into nodes in the output document by instantiating the rule's template.
XPath is used to specify patterns in the template rules.
One can think of XSLT as a language for tree-to-tree transformation for XML.
<?xml version="1.0"?>
<people>
  <person born="1912" died="1954">
    <name>
      <first_name>Alan</first_name>
      <last_name>Turing</last_name>
    </name>
    <profession>computer scientist</profession>
    <profession>mathematician</profession>
    <profession>cryptographer</profession>
  </person>
  <person born="1918" died="1988" id="p4567">
    <name>
      <first_name>Richard</first_name>
      <middle_initial>P</middle_initial>
      <last_name>Feynman</last_name>
    </name>
    <profession>physicist</profession>
    <hobby>Playing the bongoes</hobby>
  </person>
</people>
Note: The above XML document and the examples that follow in this set of slides are taken from Chapter 8 of XML in a Nutshell (3rd edition), by Elliotte Rusty Harold & W. Scott Means.
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
</xsl:stylesheet>
The output document that will be produced by the above stylesheet when applied to the example XML document:
<?xml version="1.0" encoding="utf-8"?> Alan Turing computer scientist mathematician cryptographer Richard P Feynman physicist Playing the bongoes
Each template rule is represented by an xsl:template
element.
This element has a match
attribute which is an XPath expression
for identifying nodes to match. The matched nodes are selected and output
by instantiating the template enclosed in the xsl:template
element.
Example:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="person">
    <p>A Person</p>
  </xsl:template>
</xsl:stylesheet>
The output XML document after applying the above stylesheet to the example XML document:
<?xml version="1.0" encoding="utf-8"?> <p>A Person</p> <p>A Person</p>
xsl:value-of
The xsl:value-of
element calculates the string value of an XPath
expression, and replaces itself with this string value in the template.
The string value of an element is the text content of the element
after all the tags have been removed and entity and character references have been resolved.
The element whose value is taken is identified by a select
attribute
containing the XPath expression. Example:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="person">
    <p> <xsl:value-of select="name"/> </p>
  </xsl:template>
</xsl:stylesheet>
The output XML document after applying the above stylesheet to the example XML document:
<?xml version="1.0" encoding="utf-8"?> <p> Alan Turing </p> <p> Richard P Feynman </p>
xsl:apply-templates
By default, nodes in the input XML document are matched from top to bottom in a preorder tree traversal. Template rules are activated whenever the nodes encountered during the traversal match the patterns specified in the rules.
However, one can change the order of tree traversal by specifying
which elements should be visited next in a template rule.
The xsl:apply-templates
element makes the processing order explicit.
Its select
attribute contains an XPath expression telling
the XSLT processor which nodes to process at that point.
Example:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="name">
    <p> <xsl:value-of select="last_name"/>, <xsl:value-of select="first_name"/> </p>
  </xsl:template>
  <xsl:template match="person">
    <xsl:apply-templates select="name"/>
  </xsl:template>
</xsl:stylesheet>
The output XML document after applying the above stylesheet to the example XML document:
<?xml version="1.0" encoding="utf-8"?> <p> Turing, Alan </p> <p> Feynman, Richard </p>
Note that the order of the template rules in the stylesheet doesn't matter. It is only the order of element traversal that matters.
xsl:apply-templates
Even if you don't need to reorder the elements,
xsl:apply-templates
is still useful when
you want to specify where in a template
the results of processing the child elements of the current element should be placed.
Also, if you would like to apply templates to all
types of children of the current element,
you can omit the select
attribute in the
xsl:apply-templates
element.
Example:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="people">
    <html>
      <head><title>Famous Scientists</title></head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="name">
    <p> <xsl:value-of select="last_name"/>, <xsl:value-of select="first_name"/> </p>
  </xsl:template>
  <xsl:template match="person">
    <xsl:apply-templates select="name"/>
  </xsl:template>
</xsl:stylesheet>
The output XML document after applying the above stylesheet to the example XML document:
<html> <head> <title>Famous Scientists</title> </head> <body> <p>Turing, Alan</p> <p>Feynman, Richard</p> </body> </html>
XSLT provides a built-in default template rule for each of the seven kinds of nodes in an XML document: root, element, attribute, text, comment, processing instruction, and namespace.
Defaults for text and attribute nodes:
<xsl:template match="text()|@*"> <xsl:value-of select="."/> </xsl:template>
Defaults for element and root nodes:
<xsl:template match="*|/"> <xsl:apply-templates/> </xsl:template>
Defaults for comment and processing instruction nodes:
<xsl:template match="processing-instruction()|comment()"/>
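The effect of the built-in rules can be checked with a small Java sketch based on the standard JAXP API. It assumes a made-up input document and two made-up stylesheets: one with no template rules at all, and one whose single rule selects the born attribute explicitly. Attributes are not children of their element, so the built-in rule for attributes only fires when a rule selects them.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class BuiltInRules {
    // Hypothetical input containing an attribute, a comment, and a processing instruction.
    static final String XML =
        "<people><person born='1912'><name>Alan Turing</name>"
        + "<!-- a comment --><?note skip me?></person></people>";

    // Wraps the given template rules in a stylesheet that produces plain text.
    static String stylesheet(String rules) {
        return "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='text'/>" + rules + "</xsl:stylesheet>";
    }

    // No template rules: only the built-in rules apply.
    static final String EMPTY = stylesheet("");

    // One rule that explicitly selects the attribute, then the children.
    static final String WITH_ATTRIBUTE = stylesheet(
        "<xsl:template match='person'>"
        + "<xsl:apply-templates select='@born'/>"
        + "<xsl:text>: </xsl:text>"
        + "<xsl:apply-templates/>"
        + "</xsl:template>");

    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Text is copied; the attribute, comment, and processing instruction are not.
        System.out.println(transform(XML, EMPTY));          // Alan Turing
        // The attribute value appears once a rule selects it.
        System.out.println(transform(XML, WITH_ATTRIBUTE)); // 1912: Alan Turing
    }
}
```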
We can include known attribute values in the output document as literals. Example:
<xsl:template match="person"> <span class="person"><xsl:apply-templates/></span> </xsl:template>
We can insert attribute values that are calculated from the input document into the output document using attribute value templates. Example:
<xsl:template match="name"> <name first="{first_name}" initial="{middle_initial}" last="{last_name}" /> </xsl:template>
When applied to the example document, the output will contain, for example:
<name first="Richard" initial="P" last="Feynman"/>
Sometimes the same input content needs to appear multiple times in the
output document, transformed into different formats in different contexts.
Both xsl:apply-templates
and xsl:template
elements can have optional mode
attributes that
connect different rules to different contexts. Example:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="people">
    <html>
      <head><title>Famous Scientists</title></head>
      <body>
        <ul><xsl:apply-templates select="person" mode="toc"/></ul>
        <xsl:apply-templates select="person"/>
      </body>
    </html>
  </xsl:template>
  <!-- Table of Contents Mode Templates -->
  <xsl:template match="person" mode="toc">
    <xsl:apply-templates select="name" mode="toc"/>
  </xsl:template>
  <xsl:template match="name" mode="toc">
    <li><xsl:value-of select="last_name"/>, <xsl:value-of select="first_name"/></li>
  </xsl:template>
  <!-- Normal Mode Templates -->
  <xsl:template match="person">
    <p><xsl:apply-templates/></p>
  </xsl:template>
</xsl:stylesheet>
When applied to the example document, the output will contain, for example:
<html> <head> <title>Famous Scientists</title> </head> <body> <ul> <li>Turing, Alan</li> <li>Feynman, Richard</li> </ul> <p> Alan Turing computer scientist mathematician cryptographer </p> <p> Richard P Feynman physicist Playing the bongoes </p> </body> </html>
We can also choose to process nodes in a procedural style: a series of templates is written so that each template explicitly selects and processes the elements it needs. These procedures are often carried out with the following XSLT elements:
xsl:for-each
xsl:if
xsl:choose
xsl:when
xsl:otherwise
xsl:sort
xsl:for-each
Example: The following XML document:
<?xml version='1.0'?> <table> <row><entry>a1</entry><entry>a2</entry></row> <row><entry>b1</entry><entry>b2</entry></row> <row><entry>c1</entry><entry>c2</entry></row> </table>
is transformed by the following XSLT program:
<?xml version='1.0'?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="table">
    <table>
      <xsl:for-each select="row">
        <tr>
          <xsl:for-each select="entry">
            <td><xsl:apply-templates/></td>
          </xsl:for-each>
        </tr>
      </xsl:for-each>
    </table>
  </xsl:template>
</xsl:stylesheet>
into the following XML document:
<table> <tr> <td>a1</td><td>a2</td> </tr> <tr> <td>b1</td><td>b2</td> </tr> <tr> <td>c1</td><td>c2</td> </tr> </table>
For details about xsl:for-each
, see Section
8 Repetition
in XSL Transformations (XSLT), Version 1.0.
xsl:if
and xsl:when
Examples: Simple conditional (no "else")
<xsl:if test="$somecondition"> <xsl:text>this text only gets used if $somecondition is true()</xsl:text> </xsl:if>
Select among alternatives with <xsl:when> and <xsl:otherwise>
<xsl:choose> <xsl:when test="$count > 2"><xsl:text>, and </xsl:text></xsl:when> <xsl:when test="$count > 1"><xsl:text> and </xsl:text></xsl:when> <xsl:otherwise><xsl:text> </xsl:text></xsl:otherwise> </xsl:choose>
For details about xsl:if and xsl:choose
, see Section
9 Conditional Processing
in XSL Transformations (XSLT), Version 1.0.
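To make the xsl:choose fragment concrete, here is a self-contained Java sketch (standard JAXP API) that joins a person's professions into an English list. It is an adaptation rather than the original example: the $count variable is replaced by position() and last(), and the class name, input documents, and stylesheet are assumptions introduced for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class JoinWithChoose {
    // Hypothetical inputs with three and two professions.
    static final String THREE =
        "<person><profession>computer scientist</profession>"
        + "<profession>mathematician</profession>"
        + "<profession>cryptographer</profession></person>";
    static final String TWO =
        "<person><profession>physicist</profession>"
        + "<profession>teacher</profession></person>";

    // For each profession, xsl:choose picks the separator to emit before it.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:for-each select='person/profession'>"
        + "<xsl:choose>"
        // First item: no separator.
        + "<xsl:when test='position() = 1'/>"
        // Last of three or more items.
        + "<xsl:when test='position() = last() and last() > 2'>"
        + "<xsl:text>, and </xsl:text></xsl:when>"
        // Last of exactly two items.
        + "<xsl:when test='position() = last()'>"
        + "<xsl:text> and </xsl:text></xsl:when>"
        // Any item in the middle.
        + "<xsl:otherwise><xsl:text>, </xsl:text></xsl:otherwise>"
        + "</xsl:choose>"
        + "<xsl:value-of select='.'/>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(THREE, XSLT)); // computer scientist, mathematician, and cryptographer
        System.out.println(transform(TWO, XSLT));   // physicist and teacher
    }
}
```

The xsl:when branches are tested in document order and only the first one whose test is true is instantiated, which is why the more specific test for three or more items comes before the plain position() = last() test.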
xsl:sort
Example: The following XML document:
<doc> <para>Here's a table of sales:</para> <table> <row><cell>3000</cell><cell>Widgets 'R' Us</cell></row> <row><cell>2400</cell><cell>Widget Design and Implementation</cell></row> <row><cell>10000</cell><cell>Widgets for Dummies</cell></row> <row><cell>101</cell><cell>101 Uses for a Dead Widget</cell></row> </table> </doc>
is transformed by the following XSLT program:
<?xml version='1.0'?> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"> <xsl:import href="element.xsl"/> <xsl:template match="table"> <table> <xsl:apply-templates select="row"> <xsl:sort data-type="number" select="./cell[1]"/> </xsl:apply-templates> </table> </xsl:template> </xsl:stylesheet>
into the following XML document:
<table> <tr> <td>101</td><td>101 Uses for a Dead Widget</td> </tr> <tr> <td>2400</td><td>Widget Design and Implementation</td> </tr> <tr> <td>3000</td><td>Widgets 'R' Us</td> </tr> <tr> <td>10000</td><td>Widgets for Dummies</td> </tr> </table>
For details about xsl:sort
, see Section
10 Sorting
in XSL Transformations (XSLT), Version 1.0.
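The data-type attribute matters because xsl:sort compares keys as text unless told otherwise. The Java sketch below (standard JAXP API) sorts the sales figures from the example both ways. The class name, the cut-down input, and the stylesheet-building helper are assumptions made for this illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class SortDataType {
    // The first cell of each row of the sales table, in the original order.
    static final String XML =
        "<table><row><cell>3000</cell></row><row><cell>2400</cell></row>"
        + "<row><cell>10000</cell></row><row><cell>101</cell></row></table>";

    // Builds a stylesheet that lists the sorted keys separated by single spaces.
    // dataType is either "text" or "number".
    static String stylesheet(String dataType) {
        return "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='text'/>"
            + "<xsl:template match='/'>"
            + "<xsl:for-each select='table/row'>"
            + "<xsl:sort select='cell[1]' data-type='" + dataType + "'/>"
            + "<xsl:value-of select='cell[1]'/>"
            // position() and last() refer to the sorted order here.
            + "<xsl:if test='position() != last()'><xsl:text> </xsl:text></xsl:if>"
            + "</xsl:for-each>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";
    }

    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Numeric comparison gives the order shown in the example output.
        System.out.println(transform(XML, stylesheet("number"))); // 101 2400 3000 10000
        // Text comparison goes character by character, so 10000 sorts before 101.
        System.out.println(transform(XML, stylesheet("text")));   // 10000 101 2400 3000
    }
}
```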
The <xsl:number> element performs two functions:
It evaluates a numeric expression and converts the result into a formatted string:
<xsl:number value="3" format="A. "/> <xsl:number value="count(listitem)" format="01"/>
It counts elements in the source tree and converts the result into a formatted string:
<xsl:number count="listitem" format="i. "/> <xsl:number count="chapter" from="book" level="any" format="1. "/> <xsl:number count="h1|h2|h3" level="multiple" from="chapter|appendix" format="1."/>
The details of number-to-string conversion are spelled out in Section 7.7.1 Number to String Conversion Attributes in XSL Transformations (XSLT), Version 1.0.
The problem of multiple patterns that match is handled by conflict resolution:
Matching templates from imported modules are not considered if there is a matching template in the current module
Matching templates with a lower priority are not considered. The default priority is determined as follows:
Unqualified child or attribute names have a priority of 0.
Processing-instructions with a target have a priority of 0.
A namespace-qualified "*" child or attribute name has a priority of -0.25.
An unqualified "*" has a priority of -0.5
Any other template has a default priority of 0.5
Template priority may be specified explicitly with the priority attribute on <xsl:template>
"emphasis", "html:p", and "@foo" have a priority of 0
"html:*" has a priority of -0.25
"*" has a priority of -0.5
"para/emphasis" has a priority of 0.5
"emphasis/emphasis" has a priority of 0.5
"emphasis[@role]" has a priority of 0.5
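It is technically an error if the conflict resolution process still leaves more than one matching template. An XSLT processor may signal the error; if it does not, it must recover by choosing the matching template that occurs last in the stylesheet.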
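The default priorities listed above can be observed with a small Java sketch using the standard JAXP API. Each rule in the made-up stylesheet emits a label naming the kind of pattern that won, so the output shows which rule was chosen for each element. The class name, input document, and labels are assumptions for this illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TemplatePriority {
    // Hypothetical input: one top-level emphasis, one emphasis inside para, one other element.
    static final String XML =
        "<doc><emphasis>a</emphasis><para><emphasis>b</emphasis></para>"
        + "<other>c</other></doc>";

    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        // Container elements simply pass control to their children.
        + "<xsl:template match='doc|para'><xsl:apply-templates/></xsl:template>"
        // Default priority -0.5: matches any element.
        + "<xsl:template match='*'>star;</xsl:template>"
        // Default priority 0: an unqualified element name.
        + "<xsl:template match='emphasis'>name;</xsl:template>"
        // Default priority 0.5: a pattern with more than one step.
        + "<xsl:template match='para/emphasis'>path;</xsl:template>"
        + "</xsl:stylesheet>";

    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Top-level emphasis: "emphasis" beats "*".
        // Nested emphasis: "para/emphasis" beats both "emphasis" and "*".
        // other: only "*" matches.
        System.out.println(transform(XML, XSLT)); // name;path;star;
    }
}
```

Adding priority="1" to the rule for "*" would make it win for every element, including doc and para, which is the effect of the explicit priority attribute mentioned above.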
The following two rules would number title
elements. This is
intended for a document that contains a sequence of chapters followed by a
sequence of appendices, where both chapters and appendices contain sections,
which in turn contain subsections. Chapters are numbered 1, 2, 3; appendices
are numbered A, B, C; sections in chapters are numbered 1.1, 1.2, 1.3;
sections in appendices are numbered A.1, A.2, A.3.
<xsl:template match="title"> <fo:block> <xsl:number level="multiple" count="chapter|section|subsection" format="1.1 "/> <xsl:apply-templates/> </fo:block> </xsl:template> <xsl:template match="appendix//title" priority="1"> <fo:block> <xsl:number level="multiple" count="appendix|section|subsection" format="A.1 "/> <xsl:apply-templates/> </fo:block> </xsl:template>
The details of numbering are spelled out in Section 7.7 Numbering in XSL Transformations (XSLT), Version 1.0.