XML in a Nutshell: A Desktop Quick Reference

by Elliotte Rusty Harold, W. Scott Means

Paperback (Third Edition)

$44.99 

Overview

If you're a developer working with XML, you know there's a lot to know about XML, and the XML space is evolving almost moment by moment. But you don't need to commit every XML syntax, API, or XSLT transformation to memory; you only need to know where to find it. And if it's a detail that has to do with XML or its companion standards, you'll find it—clear, concise, useful, and well-organized—in the updated third edition of XML in a Nutshell. With XML in a Nutshell beside your keyboard, you'll be able to:
  • Quick-reference syntax rules and usage examples for the core XML technologies, including XML, DTDs, XPath, XSLT, SAX, and DOM
  • Develop an understanding of well-formed XML, DTDs, namespaces, Unicode, and W3C XML Schema
  • Gain a working knowledge of key technologies used for narrative XML documents such as web pages, books, and articles: technologies like XSLT, XPath, XLink, XPointer, CSS, and XSL-FO
  • Build data-intensive XML applications
  • Understand the tools and APIs necessary to build data-intensive XML applications and process XML documents, including the event-based Simple API for XML (SAX2) and the tree-oriented Document Object Model (DOM)
This powerful new edition is the comprehensive XML reference. Serious users of XML will find coverage of just about everything they need, from fundamental syntax rules, to details of DTD and XML Schema creation, to XSLT transformations, to APIs used for processing XML documents. XML in a Nutshell also covers XML 1.1 and includes updated coverage of SAX2 and DOM Level 3. If you need an explanation of how a technology works, or just need to quickly find the precise syntax for a particular piece, XML in a Nutshell puts the information at your fingertips. Simply put, XML in a Nutshell is the critical, must-have reference for any XML developer.

Product Details

ISBN-13: 9780596007645
Publisher: O'Reilly Media, Incorporated
Publication date: 09/15/2004
Series: In a Nutshell (O'Reilly)
Edition description: Third Edition
Pages: 712
Product dimensions: 6.00(w) x 9.00(h) x 1.60(d)

About the Author

Elliotte Rusty Harold is originally from New Orleans, to which he returns periodically in search of a decent bowl of gumbo. However, he currently resides in the Prospect Heights neighborhood of Brooklyn with his wife Beth, dog Shayna, and cat Marjorie (named after his mother-in-law). He's a frequent speaker at industry conferences including Software Development, Dr. Dobb's Architecture & Design World, SD Best Practices, Extreme Markup Languages, and too many user groups to count. His open source projects include the XOM Library for processing XML with Java and the Amateur media player.

W. Scott Means has been a professional software developer since 1988, when he joined Microsoft Corporation at the age of 17. He was one of the original developers of OS/2 1.1 and Windows NT, and did some of the early work on the Microsoft Network for the Microsoft Advanced Technology and Business Development group. Since then he has written software for everything from multiplayer casino games to railroad geometry measurement equipment. For Scott's latest projects and musings on software development, visit his blog at smeans.com.

Read an Excerpt

Chapter 9: XPath

XPath is a non-XML language used to identify particular parts of XML documents. XPath lets you write expressions that refer to the document's first person element, the seventh child element of the third person element, the ID attribute of the first person element whose contents are the string "Fred Jones," all xml-stylesheet processing instructions in the document's prolog, and so forth. XPath indicates nodes by position, relative position, type, content, and several other criteria. XSLT uses XPath expressions to match and select particular elements in the input document for copying into the output document or further processing. XPointer uses XPath expressions to identify the particular point in or part of an XML document that an XLink links to.
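As a rough illustration of selecting by position and by content, here is a minimal Python sketch. It assumes the standard library's xml.etree.ElementTree module, which implements only a limited subset of XPath, and a hypothetical sample document whose element names, id values, and person names (other than "Fred Jones") are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; names and id values are illustrative only.
DOC = """<people>
  <person id="p1"><name>Alan Turing</name></person>
  <person id="p2"><name>Fred Jones</name></person>
  <person id="p3"><name>Grace Hopper</name></person>
</people>"""

root = ET.fromstring(DOC)

# Select by position: the first person child of the root element.
first = root.find("person[1]")

# Select by content: the person whose name child has the text "Fred Jones".
fred = root.find("person[name='Fred Jones']")

# ElementTree returns elements, not attribute nodes, so the attribute
# value is read from the selected element afterwards.
print(first.get("id"))
print(fred.get("id"))
```

A full XPath 1.0 engine could select the attribute node directly; ElementTree's subset stops at the element, which is why the attribute is read in a second step here.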

XPath expressions can also represent numbers, strings, or Booleans, so XSLT stylesheets can carry out simple arithmetic for numbering and cross-referencing figures, tables, and equations. String manipulation in XPath lets XSLT perform tasks like making the title of a chapter uppercase in a headline, but mixed case in a reference in the body text.

The Tree Structure of an XML Document

An XML document is a tree made up of nodes. Some nodes contain other nodes. One root node ultimately contains all other nodes. XPath is a language for picking nodes and sets of nodes out of this tree. From the perspective of XPath, there are seven kinds of nodes:

  • The root node

  • Element nodes

  • Text nodes

  • Attribute nodes

  • Comment nodes

  • Processing instruction nodes

  • Namespace nodes

Note the constructs not included in this list: CDATA sections, entity references, and document type declarations. XPath operates on an XML document after these items have been merged into the document. For instance, XPath cannot identify the first CDATA section in a document or tell whether a particular attribute value was included directly in the source element start tag or merely defaulted from the declaration of the attribute in the DTD.

Consider the document in Example 9-1. This document exhibits all seven types of nodes. Figure 9-1 is a diagram of this document's tree structure....

...The XPath data model has several nonobvious features. First, the tree's root node is not the same as its root element. The tree's root node contains the entire document, including the root element and comments and processing instructions that occur before the root element start tag or after the root element end tag. In Example 9-1, the root node contains the xml-stylesheet processing instruction and the root element people.
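The distinction between root node and root element can be seen in a DOM tree, which is close to (though not identical with) the XPath data model. The sketch below assumes Python's standard xml.dom.minidom module and a small hypothetical document, not Example 9-1 itself; the stylesheet href and the comments are invented for illustration.

```python
from xml.dom import minidom

# Hypothetical document with a processing instruction and comments
# outside the root element.
DOC = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="people.css"?>
<!-- a comment before the root element -->
<people><person>Fred Jones</person></people>
<!-- a comment after the root element -->"""

doc = minidom.parseString(DOC)

# The Document object plays the role of the root node.
# The root element is only one of its children.
root_element = doc.documentElement

# Children of the root node: the processing instruction, both comments,
# and the root element. The XML declaration is not among them.
child_names = [node.nodeName for node in doc.childNodes]
print(child_names)
```

In DOM terms a processing instruction reports its target as its node name and a comment reports "#comment", so the list makes it easy to see what sits beside the people element at the top of the tree.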

The XPath data model does not include everything in the document. In particular, the XML declaration and DTD are not addressable via XPath. However, if the DTD provides default values for any attributes, then XPath recognizes those attributes. The homepage element has an xlink:type attribute supplied by the DTD. Similarly, any references to parsed entities are resolved. Entity references, character references, and CDATA sections are not individually identifiable, though any data they contain is addressable. For example, XSLT does not enable you to make all text in CDATA sections bold because XPath doesn't know what text is and isn't part of a CDATA section.
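The merging described above can be observed with most parsers. The following sketch assumes Python's xml.etree.ElementTree (backed by the expat parser) and a hypothetical document; the unprefixed type attribute is a simplified stand-in for the xlink:type attribute mentioned in the text, and the note element is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal DTD subset supplies a default
# attribute, and the note mixes an entity reference, a CDATA section,
# and a character reference.
DOC = """<!DOCTYPE people [
  <!ATTLIST homepage type CDATA "simple">
]>
<people>
  <homepage href="http://www.example.com/"/>
  <note>A &amp; B <![CDATA[<raw> text]]> &#169;</note>
</people>"""

root = ET.fromstring(DOC)
homepage = root.find("homepage")
note = root.find("note")

# The defaulted attribute looks exactly like one written in the start tag.
print(homepage.get("type"))

# References and the CDATA section are resolved into one run of text.
# Nothing records which characters came from which construct.
print(note.text)
```

Once parsing is done, there is no way to ask which part of the note text came from the CDATA section, which mirrors the limitation the paragraph describes for XPath and XSLT.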

Finally, xmlns attributes are reported as namespace nodes. They are not considered attribute nodes, though a non-namespace-aware parser will see them as such. Furthermore, these nodes are attached to every element and attribute node for which that declaration has scope. They are not just attached to the single element where the namespace is declared.
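A namespace-aware parser shows the first half of this point. The sketch below assumes Python's xml.etree.ElementTree and a hypothetical document. ElementTree does not expose namespace nodes at all, so it illustrates only that xmlns declarations are not treated as ordinary attributes, not how namespace nodes attach to elements in scope.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one namespace declaration and one ordinary
# attribute on the root element.
DOC = """<people xmlns:xlink="http://www.w3.org/1999/xlink" id="all">
  <homepage xlink:type="simple" xlink:href="http://www.example.com/"/>
</people>"""

root = ET.fromstring(DOC)
homepage = root.find("homepage")

# The xmlns:xlink declaration is consumed by the parser.
# Only the id attribute is reported as an attribute.
print(root.attrib)

# Prefixed attributes are reported by namespace URI rather than prefix,
# so the declaration is in effect on the child element as well.
print(sorted(homepage.attrib))
```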

Location Paths

The most useful XPath expression is a location path. A location path uses at least one location step to identify a set of nodes in a document. This set may be empty, contain a single node, or contain several nodes. These nodes can be element, attribute, namespace, text, comment, processing instruction, root nodes, or any combination of them.
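To make the three possible result sizes concrete, here is a small sketch, again assuming Python's xml.etree.ElementTree and its limited path syntax. The document and the element names robot and alien are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document used only to show result sizes.
DOC = """<people>
  <person id="p1"/>
  <person id="p2"/>
  <robot id="r1"/>
</people>"""

root = ET.fromstring(DOC)

several = root.findall("person")  # more than one matching node
single = root.findall("robot")    # exactly one matching node
empty = root.findall("alien")     # no match: an empty result, not an error

print(len(several), len(single), len(empty))
```

An empty result is a normal outcome of a location path, so code that consumes the result should handle a zero-length list rather than expect an exception.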

The Root Location Path

The simplest location path is the one that selects the document's root node. This path is simply the forward slash /. (You'll notice that a lot of XPath syntax was deliberately chosen to be similar to the syntax used by the Unix shell. Here / is the root of a Unix filesystem and / is the root node of an XML document.) For example, this XSLT template uses the XPath pattern / to match the entire input document tree and wrap it in an html element...

Table of Contents

Preface; What This Book Covers; What's New in the Third Edition; Organization of the Book; Conventions Used in This Book; Request for Comments; Acknowledgments

Part I: XML Concepts
Chapter 1: Introducing XML; 1.1 The Benefits of XML; 1.2 What XML Is Not; 1.3 Portable Data; 1.4 How XML Works; 1.5 The Evolution of XML
Chapter 2: XML Fundamentals; 2.1 XML Documents and XML Files; 2.2 Elements, Tags, and Character Data; 2.3 Attributes; 2.4 XML Names; 2.5 References; 2.6 CDATA Sections; 2.7 Comments; 2.8 Processing Instructions; 2.9 The XML Declaration; 2.10 Checking Documents for Well-Formedness
Chapter 3: Document Type Definitions (DTDs); 3.1 Validation; 3.2 Element Declarations; 3.3 Attribute Declarations; 3.4 General Entity Declarations; 3.5 External Parsed General Entities; 3.6 External Unparsed Entities and Notations; 3.7 Parameter Entities; 3.8 Conditional Inclusion; 3.9 Two DTD Examples; 3.10 Locating Standard DTDs
Chapter 4: Namespaces; 4.1 The Need for Namespaces; 4.2 Namespace Syntax; 4.3 How Parsers Handle Namespaces; 4.4 Namespaces and DTDs
Chapter 5: Internationalization; 5.1 Character-Set Metadata; 5.2 The Encoding Declaration; 5.3 Text Declarations; 5.4 XML-Defined Character Sets; 5.5 Unicode; 5.6 ISO Character Sets; 5.7 Platform-Dependent Character Sets; 5.8 Converting Between Character Sets; 5.9 The Default Character Set for XML Documents; 5.10 Character References; 5.11 xml:lang

Part II: Narrative-Like Documents
Chapter 6: XML as a Document Format; 6.1 SGML's Legacy; 6.2 Narrative Document Structures; 6.3 TEI; 6.4 DocBook; 6.5 OpenOffice; 6.6 WordprocessingML; 6.7 Document Permanence; 6.8 Transformation and Presentation
Chapter 7: XML on the Web; 7.1 XHTML; 7.2 Direct Display of XML in Browsers; 7.3 Authoring Compound Documents with Modular XHTML; 7.4 Prospects for Improved Web Search Methods
Chapter 8: XSL Transformations (XSLT); 8.1 An Example Input Document; 8.2 xsl:stylesheet and xsl:transform; 8.3 Stylesheet Processors; 8.4 Templates and Template Rules; 8.5 Calculating the Value of an Element with xsl:value-of; 8.6 Applying Templates with xsl:apply-templates; 8.7 The Built-in Template Rules; 8.8 Modes; 8.9 Attribute Value Templates; 8.10 XSLT and Namespaces; 8.11 Other XSLT Elements
Chapter 9: XPath; 9.1 The Tree Structure of an XML Document; 9.2 Location Paths; 9.3 Compound Location Paths; 9.4 Predicates; 9.5 Unabbreviated Location Paths; 9.6 General XPath Expressions; 9.7 XPath Functions
Chapter 10: XLinks; 10.1 Simple Links; 10.2 Link Behavior; 10.3 Link Semantics; 10.4 Extended Links; 10.5 Linkbases; 10.6 DTDs for XLinks; 10.7 Base URIs
Chapter 11: XPointers; 11.1 XPointers on URLs; 11.2 XPointers in Links; 11.3 Shorthand Pointers; 11.4 Child Sequences; 11.5 Namespaces; 11.6 Points; 11.7 Ranges
Chapter 12: XInclude; 12.1 The include Element; 12.2 Including Text Files; 12.3 Content Negotiation; 12.4 Fallbacks; 12.5 XPointers
Chapter 13: Cascading Style Sheets (CSS); 13.1 The Levels of CSS; 13.2 CSS Syntax; 13.3 Associating Stylesheets with XML Documents; 13.4 Selectors; 13.5 The Display Property; 13.6 Pixels, Points, Picas, and Other Units of Length; 13.7 Font Properties; 13.8 Text Properties; 13.9 Colors
Chapter 14: XSL Formatting Objects (XSL-FO); 14.1 XSL Formatting Objects; 14.2 The Structure of an XSL-FO Document; 14.3 Laying Out the Master Pages; 14.4 XSL-FO Properties; 14.5 Choosing Between CSS and XSL-FO
Chapter 15: Resource Directory Description Language (RDDL); 15.1 What's at the End of a Namespace URL?; 15.2 RDDL Syntax; 15.3 Natures; 15.4 Purposes

Part III: Record-Like Documents
Chapter 16: XML as a Data Format; 16.1 Why Use XML for Data?; 16.2 Developing Record-Like XML Formats; 16.3 Sharing Your XML Format
Chapter 17: XML Schemas; 17.1 Overview; 17.2 Schema Basics; 17.3 Working with Namespaces; 17.4 Complex Types; 17.5 Empty Elements; 17.6 Simple Content; 17.7 Mixed Content; 17.8 Allowing Any Content; 17.9 Controlling Type Derivation
Chapter 18: Programming Models; 18.1 Common XML Processing Models; 18.2 Common XML Processing Issues; 18.3 Generating XML Documents
Chapter 19: Document Object Model (DOM); 19.1 DOM Foundations; 19.2 Structure of the DOM Core; 19.3 Node and Other Generic Interfaces; 19.4 Specific Node-Type Interfaces; 19.5 The DOMImplementation Interface; 19.6 DOM Level 3 Interfaces; 19.7 Parsing a Document with DOM; 19.8 A Simple DOM Application
Chapter 20: Simple API for XML (SAX); 20.1 The ContentHandler Interface; 20.2 Features and Properties; 20.3 Filters

Part IV: Reference
Chapter 21: XML Reference; 21.1 How to Use This Reference; 21.2 Annotated Sample Documents; 21.3 XML Syntax; 21.4 Constraints; 21.5 XML 1.0 Document Grammar; 21.6 XML 1.1 Document Grammar
Chapter 22: Schemas Reference; 22.1 The Schema Namespaces; 22.2 Schema Elements; 22.3 Built-in Types; 22.4 Instance Document Attributes
Chapter 23: XPath Reference; 23.1 The XPath Data Model; 23.2 Data Types; 23.3 Location Paths; 23.4 Predicates; 23.5 XPath Functions
Chapter 24: XSLT Reference; 24.1 The XSLT Namespace; 24.2 XSLT Elements; 24.3 XSLT Functions; 24.4 TrAX
Chapter 25: DOM Reference; 25.1 Object Hierarchy; 25.2 Object Reference
Chapter 26: SAX Reference; 26.1 The org.xml.sax Package; 26.2 The org.xml.sax.helpers Package; 26.3 SAX Features and Properties; 26.4 The org.xml.sax.ext Package
Chapter 27: Character Sets; 27.1 Character Tables; 27.2 HTML4 Entity Sets; 27.3 Other Unicode Blocks

Colophon