XML in a Nutshell: A Desktop Quick Reference
Overview
- Quick-reference syntax rules and usage examples for the core XML technologies, including XML, DTDs, XPath, XSLT, SAX, and DOM
- Develop an understanding of well-formed XML, DTDs, namespaces, Unicode, and W3C XML Schema
- Gain a working knowledge of key technologies used for narrative XML documents such as web pages, books, and articles, including XSLT, XPath, XLink, XPointer, CSS, and XSL-FO
- Build data-intensive XML applications
- Understand the tools and APIs necessary to build data-intensive XML applications and process XML documents, including the event-based Simple API for XML (SAX2) and the tree-oriented Document Object Model (DOM)
Product Details
| ISBN-13: | 9780596007645 |
|---|---|
| Publisher: | O'Reilly Media, Incorporated |
| Publication date: | 09/15/2004 |
| Series: | In a Nutshell (O'Reilly) |
| Edition description: | Third Edition |
| Pages: | 712 |
| Product dimensions: | 6.00(w) x 9.00(h) x 1.60(d) |
About the Author
W. Scott Means has been a professional software developer since 1988, when he joined Microsoft Corporation at the age of 17. He was one of the original developers of OS/2 1.1 and Windows NT, and did some of the early work on the Microsoft Network for the Microsoft Advanced Technology and Business Development group. Since then he has written software for everything from multiplayer casino games to railroad geometry measurement equipment. For Scott's latest projects and musings on software development, visit his blog at smeans.com.
Read an Excerpt
Chapter 9: XPath
XPath is a non-XML language used to identify particular parts of XML documents. XPath lets you write expressions that refer to the document's first person element, the seventh child element of the third person element, the ID attribute of the first person element whose contents are the string "Fred Jones," all xml-stylesheet processing instructions in the document's prolog, and so forth. XPath indicates nodes by position, relative position, type, content, and several other criteria. XSLT uses XPath expressions to match and select particular elements in the input document for copying into the output document or further processing. XPointer uses XPath expressions to identify the particular point in or part of an XML document that an XLink links to.
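Some of the selections described above can be approximated with Python's standard xml.etree.ElementTree module, which implements only a limited subset of XPath. The following is a minimal sketch, not the book's own example: the sample document, the names in it, and the ID values are all hypothetical, and the [.='text'] predicate assumes Python 3.7 or later.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document (not the book's Example 9-1).
DOC = """<people>
  <person ID="p1">Alan Turing</person>
  <person ID="p2">Fred Jones</person>
  <person ID="p3">Grace Hopper</person>
</people>"""

root = ET.fromstring(DOC)

# The document's first person element (XPath positions start at 1).
first_person = root.find("person[1]")

# The first person element whose contents are the string "Fred Jones".
fred = root.find("person[.='Fred Jones']")

# The ID attribute of each selected element.
print(first_person.get("ID"))
print(fred.get("ID"))
```

ElementTree returns element objects rather than attribute nodes, so the attribute is read with get() after the element has been selected. A full XPath 1.0 engine could select the attribute node directly.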
XPath expressions can also represent numbers, strings, or Booleans, so XSLT stylesheets can carry out simple arithmetic for numbering and cross-referencing figures, tables, and equations. String manipulation in XPath lets XSLT perform tasks like making the title of a chapter uppercase in a headline, but mixed case in a reference in the body text.
The Tree Structure of an XML Document
An XML document is a tree made up of nodes. Some nodes contain other nodes. One root node ultimately contains all other nodes. XPath is a language for picking nodes and sets of nodes out of this tree. From the perspective of XPath, there are seven kinds of nodes:
- The root node
- Element nodes
- Text nodes
- Attribute nodes
- Comment nodes
- Processing instruction nodes
- Namespace nodes
Note the constructs not included in this list: CDATA sections, entity references, and document type declarations. XPath operates on an XML document after these items have been merged into the document. For instance, XPath cannot identify the first CDATA section in a document or tell whether a particular attribute value was included directly in the source element start tag or merely defaulted from the declaration of the attribute in the DTD.
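The merging described here can be observed with Python's standard xml.etree.ElementTree parser, whose tree resembles the XPath data model in this one respect. This is a sketch under that assumption; the note element and its content are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: a CDATA section followed by an entity
# reference and a character reference.
SOURCE = "<note><![CDATA[1 < 2]]> &amp; 3 &#62; 2</note>"

note = ET.fromstring(SOURCE)

# The parser hands back one merged string. Nothing records which
# characters came from the CDATA section or from the references.
print(note.text)
```

Once parsed, the text is a single string with no markers showing where the CDATA section began or ended.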
Consider the document in Example 9-1. This document exhibits all seven types of nodes. Figure 9-1 is a diagram of this document's tree structure....
...The XPath data model has several non-obvious features. First, the tree's root node is not the same as its root element. The tree's root node contains the entire document, including the root element and comments and processing instructions that occur before the root element start tag or after the root element end tag. In Example 9-1, the root node contains the xml-stylesheet processing instruction and the root element people.
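The distinction between root node and root element can be illustrated with Python's standard xml.dom.minidom module. DOM is a different model from XPath, but its Document node plays a comparable role to the XPath root node. The document below is hypothetical, including the people.xsl stylesheet name.

```python
from xml.dom import minidom

# Hypothetical document with a processing instruction and a comment
# in the prolog, before the root element.
SOURCE = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="people.xsl"?>
<!-- a comment before the root element -->
<people><person>Fred Jones</person></people>"""

doc = minidom.parseString(SOURCE)

# The Document node stands in for the XPath root node here. Its
# children include the prolog items as well as the root element.
top_level = [child.nodeName for child in doc.childNodes]
print(top_level)

# The root element is only one of the root node's children.
print(doc.documentElement.nodeName)
```

The XML declaration does not appear among the children, while the processing instruction and the comment do.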
The XPath data model does not include everything in the document. In particular, the XML declaration and DTD are not addressable via XPath. However, if the DTD provides default values for any attributes, then XPath recognizes those attributes. The homepage element has an xlink:type attribute supplied by the DTD. Similarly, any references to parsed entities are resolved. Entity references, character references, and CDATA sections are not individually identifiable, though any data they contain is addressable. For example, XSLT does not enable you to make all text in CDATA sections bold because XPath doesn't know what text is and isn't part of a CDATA section.
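Attribute defaulting from a DTD can be sketched with Python's standard xml.etree.ElementTree module. This assumes CPython's default expat-based parser, which applies defaults declared in the internal DTD subset. The kind attribute and the URL are hypothetical stand-ins, not the xlink:type declaration from the book's example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal DTD subset declares a default
# value for the "kind" attribute, which the start tag omits.
SOURCE = """<!DOCTYPE homepage [
  <!ATTLIST homepage kind CDATA "simple">
]>
<homepage href="http://www.example.com/"/>"""

homepage = ET.fromstring(SOURCE)

# The defaulted attribute is reported alongside the one that was
# written in the start tag; the tree does not say which is which.
print(homepage.get("href"))
print(homepage.get("kind"))
```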
Finally, xmlns attributes are reported as namespace nodes. They are not considered attribute nodes, though a non-namespace-aware parser will see them as such. Furthermore, these nodes are attached to every element and attribute node for which that declaration has scope. They are not just attached to the single element where the namespace is declared.
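A namespace-aware API treats these declarations in a similar way. As a rough illustration, Python's standard xml.etree.ElementTree does not expose namespace nodes at all, but it also does not report xmlns declarations as ordinary attributes. The document and URL below are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the XLink namespace is declared once, on the root.
SOURCE = """<people xmlns:xlink="http://www.w3.org/1999/xlink">
  <homepage xlink:href="http://www.example.com/"/>
</people>"""

people = ET.fromstring(SOURCE)
homepage = people.find("homepage")

# The xmlns:xlink declaration is not reported as an attribute of people.
print(people.attrib)

# The declaration is still in scope on the descendant element: its
# attribute name is expanded with the namespace URI.
print(homepage.attrib)
```

The declaration made on people affects how the attribute on homepage is named, which reflects the scoping rule described above.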
Location Paths
The most useful XPath expression is a location path. A location path uses at least one location step to identify a set of nodes in a document. This set may be empty, contain a single node, or contain several nodes. These nodes can be element, attribute, namespace, text, comment, processing instruction, root nodes, or any combination of them.
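The three possible outcomes (several nodes, one node, or none) can be sketched with Python's standard xml.etree.ElementTree module. Its findall() method accepts a limited subset of XPath and returns only element nodes, so this covers just part of what a location path can select. The sample document and element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
people = ET.fromstring(
    "<people>"
    "<person>Alan Turing</person>"
    "<person>Fred Jones</person>"
    "<homepage/>"
    "</people>"
)

several = people.findall("person")    # two matching elements
single = people.findall("homepage")   # exactly one matching element
empty = people.findall("address")     # no matches: an empty list, not an error

print(len(several), len(single), len(empty))
```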
The Root Location Path
The simplest location path is the one that selects the document's root node. This path is simply the forward slash /. (You'll notice that a lot of XPath syntax was deliberately chosen to be similar to the syntax used by the Unix shell. Here / is the root of a Unix filesystem and / is the root node of an XML document.) For example, this XSLT template uses the XPath pattern / to match the entire input document tree and wrap it in an html element...