XSLT Programmer's Reference, Second Edition
Michael Kay

© 2000 Wrox Press

All rights reserved. No part of this book may be reproduced, stored in a retrieval system or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embodied in critical articles or reviews.

The authors and publisher have made every effort in the preparation of this book to ensure the accuracy of the information. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, Wrox Press nor its dealers or distributors will be held liable for any damages caused or alleged to be caused either directly or indirectly by this book.

First Published: April 2000
Latest Reprint: March 2003

Published by Wrox Press Ltd, Arden House, 1102 Warwick Road, Acock's Green, Birmingham B27 6BH, UK
Printed in Canada
ISBN 1-861005-06-7

Trademark Acknowledgements

Wrox has endeavored to provide trademark information about all the companies and products mentioned in this book by the appropriate use of capitals. However, Wrox cannot guarantee the accuracy of this information.

Credits

Author: Michael Kay
Technical Reviewers: David Carlisle, Robert Chang, Michael Corning, Jim MacIntosh, Craig McQueen, Gary L Peskin, Paul Tchistopolskii, Linda van den Brink, Dmitry E Voytenko, Dan Wahlin
Category Manager: Dave Galloway
Technical Architect: Dianne Parker
Technical Editor: Simon Mackie
Author Agent: Marsha Collins
Project Managers: Avril Corbin, Beckie Stones
Production Manager: Simon Hardware
Production Coordinator: Mark Burdett
Figures: Shabnam Hussein
Cover Design: Shelly Frasier
Proofreader: Ian Allen

About the Author

Michael Kay has recently joined the systems architecture team at Software AG, working on the standards and interfaces for their XML product line, centred around the Tamino database. He also represents Software AG on the W3C XSL Working Group. Until then, he had spent most of his
career as a software designer and systems architect with ICL, the IT services supplier. His background (and Ph.D., from the University of Cambridge) is in database technology. He has worked on the design of network, relational, and object-oriented database software products as well as a text search engine. In the XML world he is known as the developer of the open source Saxon product, the first fully-conformant implementation of the XSLT standard.

Michael lives in Reading, Berkshire with his wife and daughter. His hobbies, as you might guess from the examples in this book, include genealogy and choral singing.

Acknowledgements

Firstly, I'd like to acknowledge the work of the W3C XSL Working Group, who created the XSLT language. Without their efforts there would have been no language and no book. I wrote the first edition of this book from an outsider's perspective before I joined the group, and I've tried to keep that flavor in the second edition, despite the fact that I now have to take my share of responsibility for the spec being the way it is.

More specifically, I'm grateful to James Clark, the editor of the XSLT and XPath specifications, who responded courteously and promptly to a great many enquiries.

I've learnt a great deal of what I know about XSLT from the people on the XSL-List; not only from the experts who answer so many of the questions, but also from the many beginners who ask them. Many of the new techniques and explanations in the second edition were prompted by ideas first aired on this list.

I owe a debt both to ICL and to Software AG, my employers during the life of this project, who both offered me every encouragement and support.

My editors at Wrox Press, and the technical reviewers, made an invaluable contribution by pointing out the many places where improvements were needed.

And finally, I'm once again grateful for the support of Penny and Pippa, who took the news that I was planning a second edition with little more than a sigh of resignation.
XSLT Programmer's Reference, Second Edition, by Michael Kay, Wrox Press 2001

Chapter - Writing Extension Functions
Calling Extension Functions

Extension functions are always called from within an XPath expression, and I explained the syntax for this earlier in the book. A typical function call looks like this:

    my:function($arg1, 23, string(title))

The name of an extension function will always contain a namespace prefix and a colon. The prefix («my» in this example) must be declared in a namespace declaration on some containing element in the stylesheet, in the usual way.

The function may take any number of arguments (zero or more), and the parentheses are needed even if there are no arguments. The arguments can be any XPath expressions; in our example, the first argument is a variable reference, the second is a number, and the third is a function call. The arguments are passed to the function by value, which means that the function can never modify the values of the arguments. The function always returns a result. We'll have a lot more to say about the data types of the arguments, and the data type of the result, in due course.

Chapter - XSLT Elements

xsl:if

The <xsl:if> instruction encloses a template body that will be instantiated only if a specified condition is true. <xsl:if> is analogous to the if statement found in many programming languages.

Defined in

XSLT section 9.1

Format

    <xsl:if test=Expression>
      template body
    </xsl:if>

Position

<xsl:if> is an instruction. It is always used within a template body.

Attributes

    Name              Value       Meaning
    test (mandatory)  Expression  The Boolean condition to be tested

Content

A template body.
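As an illustration of the instruction described here, the following is a minimal sketch of a complete stylesheet that uses <xsl:if>. It is not one of the book's own listings: the element name «item» and the attributes «name» and «note» are hypothetical, chosen only for this sketch.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- "item", "name" and "note" are hypothetical names used only here -->
  <xsl:template match="item">
    <xsl:value-of select="@name"/>
    <!-- The template body inside xsl:if is instantiated only when the
         test is true; the node-set @note is true if the attribute exists -->
    <xsl:if test="@note">
      <xsl:text> (</xsl:text>
      <xsl:value-of select="@note"/>
      <xsl:text>)</xsl:text>
    </xsl:if>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

The test attribute is the only attribute, and there is no "else" branch: when the test is false the instruction simply produces nothing.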
Effect

The test expression is evaluated and the result is converted if necessary to a Boolean using the rules defined for the boolean() function. If the result is true, the contained template body is instantiated; otherwise, no action is taken.

Any XPath value may be converted to a Boolean. In brief, the rules are:

- If the expression is a node-set, it is treated as true if the node-set contains at least one node. (This means that a reference to a temporary tree is always treated as true.)
- If the expression is a string, it is treated as true if the string is not empty.
- If the expression is a number, it is treated as true if the number is non-zero and is not NaN.

Usage

The <xsl:if> instruction is useful where an action is to be performed conditionally. It performs the functions of the if-then construct found in other programming languages. If there are two or more alternative actions (the equivalent of an if-then-else or switch or Select Case in other languages), use <xsl:choose> instead.

One common use of <xsl:if> is to test for error conditions. In this case it is often used with <xsl:message>.

Try to avoid using <xsl:if> immediately within <xsl:for-each>. It's better to use a predicate instead, because that gives the processor more scope for optimization. For example, an <xsl:for-each> whose body consists solely of an <xsl:if> can usually be rewritten by moving the test into a predicate on the select expression: the form «select="X"» with «test="C"» becomes «select="X[C]"».

Examples

One typical example outputs an element after processing the last of a sequence of elements, which can be detected with the test «position()=last()».

Another example reports an error if the percent attribute of the current element is not a number between 0 and 100. The test expression returns true if:

- the percent attribute does not exist, or
- the value cannot be interpreted as a number (so that «number(@percent)» is NaN), or
- the numeric value is less than zero, or
- the numeric value is greater than 100.

The error message in this example reads: «percent attribute must be a number between 0 and 100».

The following example formats a list of names, using <xsl:if> to produce punctuation that depends on the position of each name in the list.

Example: Formatting a List of Names

Source

The source file authors.xml contains a single element with a list of authors:

    Design Patterns
    Erich Gamma
    Richard Helm
    Ralph Johnson
    John Vlissides

Stylesheet

The stylesheet authors.xsl processes the list of authors, adding punctuation depending on the position of each author in the list.

Output

    Design Patterns by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides

See also

<xsl:choose>

Introduction

Tell Us What You Think

The author and the Wrox team have worked hard to make this book a pleasure to read as well as being useful and educational, so we'd like to know what you think. Wrox are always keen to hear what you liked best and what improvements you think are possible. We appreciate feedback on our efforts and take both criticism and praise on board in our future editorial efforts. When necessary, we'll forward comments and queries to the author. If you've anything to say, let us know by sending an e-mail to: feedback@wrox.com

xsl:include

<xsl:include> is a top-level element used to include the contents of one stylesheet module within another. The definitions in the
included stylesheet module have the same import precedence as those in the including module, so the effect is exactly as if these definitions were textually included at the point in the including module where the <xsl:include> element appears.

Defined in

XSLT section 2.6.1

Format

    <xsl:include href=uri />

Position

<xsl:include> is a top-level element, which means that it must appear as a child of the <xsl:stylesheet> element. There are no constraints on its ordering relative to other top-level elements in the stylesheet.

Attributes

    Name              Value  Meaning
    href (mandatory)  URI    The URI of the stylesheet to be included

Content

None; the element is always empty.

Effect

The URI contained in the href attribute may be an absolute URI or a relative URI. If relative, it is interpreted relative to the base URI of the XML document or external entity containing the <xsl:include> element. For example, if a file main.xsl contains the element «<xsl:include href="date.xsl"/>» then by default the system will look for date.xsl in the same directory as main.xsl. With XSLT 1.1 you can change this behavior by using the xml:base attribute to specify a base URI explicitly, as described on page 62.

The URI must identify an XML document that is a valid XSLT stylesheet. The top-level elements of this stylesheet are logically inserted into the including stylesheet module at the point where the <xsl:include> element appears. However:

- These elements retain their base URI, so anything that involves referencing a relative URI is done relative to the original URI of the included stylesheet. This rule applies, for example, when expanding further <xsl:include> and <xsl:import> elements, or when using relative URIs as arguments to the document() function.
- When a namespace prefix is used (typically within a QName, but it also applies to freestanding
prefixes such as those in the xsl:exclude-result-prefixes attribute of a literal result element) it is interpreted using only the namespace declarations in the original stylesheet module in which the QName occurred. An included stylesheet module does not inherit namespace declarations from the module that includes it. This even applies to QNames constructed at execution time as the result of evaluating an expression, for example an expression used within an attribute value template for the name or namespace attribute of <xsl:element>.
- The values of the version, extension-element-prefixes, and exclude-result-prefixes attributes that apply to an element in the included stylesheet module, as well as xml:lang and xml:space, are those that were defined on its own <xsl:stylesheet> element, not those on the <xsl:stylesheet> element of the including stylesheet module.
- An exception is made for <xsl:import> elements in the included stylesheet module. <xsl:import> elements must come before any other top-level elements, so instead of placing them in their natural sequence in the including module, they are promoted so they appear after any <xsl:import> elements, but before any other top-level elements, in the including stylesheet module. This is relevant to situations where there are duplicate definitions and the XSLT processor is allowed to choose the one that comes last.

The included stylesheet module may use the simplified (literal-result-element-as-stylesheet) stylesheet syntax, described elsewhere in this book. This allows an entire stylesheet module to be defined as the content of an element such as <html>. It is then treated as if it were a module containing a single template, whose match pattern is «/» and whose content is the literal result element.

The included stylesheet module may contain <xsl:include> statements to include further stylesheets, or <xsl:import> statements to import them. A stylesheet must not directly or indirectly include itself.

It is not an error to include the same stylesheet module more than once, either directly or indirectly, but it is not a useful thing to do. It may well cause errors due
to the presence of duplicate declarations; in fact, if the stylesheet contains definitions of global variables or named templates, and is included more than once at the same import precedence, such errors are inevitable. In some other situations it is implementation-defined whether an XSLT processor will report duplicate declarations as an error, so the behavior may vary from one product to another.

Usage

<xsl:include> provides a simple textual inclusion facility analogous to the #include directive in C. It is purely a way of writing a stylesheet in a modular way so that commonly used definitions can be held in a library and used wherever they are needed.

Chapter - Expressions

NodeType

A NodeType represents a constraint on the type of a node.

    Expression  Syntax
    NodeType    «comment» | «text» | «processing-instruction» | «node»

A NodeType is a token, so it may be preceded and followed by whitespace, but it cannot contain whitespace within the name.

Note that the four NodeType names cannot be used as function names, but apart from this, they are not reserved words. It is quite possible to have elements or attributes called «text» or «node» in your source XML document, and therefore you can use «text» or «node» as ordinary names in XPath. This is why the names are flagged in a NodeTest by the following parentheses, for example «text()».

Defined in

XPath section 3.7 (Lexical Rules), rule 38

Used in

NodeTest

Usage

A NodeType can be used within a NodeTest (which in turn is used within a Step) to restrict a
Step to return nodes of a particular type.

The keywords «comment», «text», and «processing-instruction» are self-explanatory: they restrict the selection to nodes of that particular type.

The keyword «node» selects nodes of any type, and is useful because a Step has to include some kind of NodeTest, so if you want all the nodes on the axis, you can specify «node()». For example, if you want all child nodes, specify «child::node()».

Note that there is no way of referring to the other four node types, namely root, element, attribute, and namespace:

- In the case of the root node, this is because if you only want the root node, you don't need to find it using an axis: just use the special expression «/».
- In the case of the attribute and namespace nodes, it is because these types of node are exclusive to the attribute and namespace axes: you can only find these nodes by using the axis of the same name, and all the nodes on that axis will be nodes of the appropriate type.
- In the case of element nodes, all the axes that can contain elements have element as their principal node type, and you can select the nodes of the principal node type using the special NameTest «*».

Examples in Context

parent::node()
    Selects the parent of the context node, whether this is an element node or the root node. This differs from «parent::*», which selects the parent node only if it is an element. The expression «parent::node()» is usually abbreviated to «..».

//comment()
    Selects all comment nodes in the document.

child::text()
    Selects all text node children of the context node. This is usually abbreviated to «text()».

@comment()
    A strange but legal way of getting an empty node-set: it looks for all comment nodes on the attribute axis, and of course finds none.
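The expressions listed above can be tried out with a small stylesheet. The following is a sketch only, not a listing from the book: it assumes nothing about the source document beyond its being well-formed XML, and simply counts the nodes that each expression selects, starting from the root node.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- comment() restricts the step to comment nodes -->
    <xsl:text>comments in document: </xsl:text>
    <xsl:value-of select="count(//comment())"/>

    <!-- node() accepts every node on the child axis -->
    <xsl:text>&#10;child nodes of outermost element: </xsl:text>
    <xsl:value-of select="count(*/child::node())"/>

    <!-- text() is the abbreviated form of child::text() -->
    <xsl:text>&#10;text children of outermost element: </xsl:text>
    <xsl:value-of select="count(*/text())"/>

    <xsl:text>&#10;processing instructions in document: </xsl:text>
    <xsl:value-of select="count(//processing-instruction())"/>

    <!-- the attribute axis never holds comment nodes, so this is empty -->
    <xsl:text>&#10;nodes selected by @comment(): </xsl:text>
    <xsl:value-of select="count(*/@comment())"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Comparing the count for «child::node()» with the count for «text()» on the same element shows how much of the child axis consists of non-text nodes (elements, comments, and processing instructions).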
Chapter - Functions

namespace-uri

The namespace-uri() function returns a string that represents the URI of the namespace in the expanded name of a node. Typically this will be a URI used in a namespace declaration, that is, the value of an xmlns or xmlns:* attribute.

For example, if you apply this function to the outermost element of the stylesheet by writing the expression «namespace-uri(document('')/*)», the result will be the string «http://www.w3.org/1999/XSL/Transform».

Defined in

XPath section 4.1

Format

    namespace-uri() → string
    namespace-uri(node) → string

Arguments

    Argument         Data type  Meaning
    node (optional)  node-set   Identifies the node whose namespace URI is required.

If the node-set contains more than one node, the target node is the one that comes first in document order. If the node-set is empty, the function returns an empty string. If the argument is omitted, the target node is the context node. It is an error if the argument supplied is not a node-set.

Result

A string value: the namespace URI of the expanded name of the target node.

Rules

The namespace URI of a node depends on the node type, as follows:

root: None; an empty string is returned.
element: If the element name as given in the source XML contained a colon, the value will be the URI from the namespace declaration corresponding to the element's prefix. Otherwise, the value will be the URI of the default namespace. If this is null, the result will be an empty string.
attribute: If the attribute name as given in the source XML contained a colon, the value will be the URI from the namespace declaration corresponding to the attribute's prefix. Otherwise, the namespace URI will be an empty string.
text: None; an empty string is returned.
processing instruction: None; an empty string is returned.
comment: None; an empty string is returned.
namespace: None; an empty string is returned.

Except for element and attribute nodes, namespace-uri() returns an empty string.

Usage

Let's start with some situations where you don't need this function.

If you want to test whether the current node belongs to a particular namespace, the best way to achieve this is using a NameTest of the form «prefix:*». For example, to test whether the current element belongs to the «http://ibm.com/ebiz» namespace, write a test such as «self::ebiz:*», with the prefix «ebiz» (any prefix will do) bound to that namespace URI in the stylesheet.

If you want to find the namespace URI corresponding to a given prefix, the best solution is to use namespace nodes. You might need to do this if namespace prefixes are used in attribute values: the XSLT standard itself uses this technique in attributes such as extension-element-prefixes, and there is no reason why other XML document types should not do the same. If you have an attribute «@value» which you know takes the form of a namespace-qualified name (a QName), you can get the associated namespace URI with an expression such as «namespace::*[name()=substring-before(../@value, ':')]», evaluated with the element that carries the attribute as the context node.

The namespace-uri() function, by contrast, is useful in display contexts, where you just want to display the namespace URI of the current node, and also if you want to do more elaborate tests. For example, you may know that there is a whole family of namespaces whose URIs all begin with urn:schemas.biztalk, and you may want to test whether the current node is in any one of these. You can achieve this with a test such as «starts-with(namespace-uri(), 'urn:schemas.biztalk')».
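A test of this kind can be sketched as a complete stylesheet, as follows. This is a minimal illustration rather than the book's original listing: the decision to visit every element and the wording of the text output are assumptions made for the example.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Visit every element and report those whose namespace URI
       belongs to the urn:schemas.biztalk family -->
  <xsl:template match="*">
    <xsl:if test="starts-with(namespace-uri(), 'urn:schemas.biztalk')">
      <xsl:value-of select="name()"/>
      <xsl:text> is in namespace </xsl:text>
      <xsl:value-of select="namespace-uri()"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:if>
    <!-- Recurse into child elements only, so text nodes are not copied -->
    <xsl:apply-templates select="*"/>
  </xsl:template>

</xsl:stylesheet>
```

Because the comparison is made on the string returned by namespace-uri(), no prefix for the namespaces concerned needs to be declared in the stylesheet, which is what makes this approach suitable for an open-ended family of namespace URIs.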