
CSIT600:
Introduction to XML
Dickson K.W. Chiu
PhD, SMIEEE
Thanks to Prof. SC Cheung (HKUST), Prof. Francis Lau (HKU)
Reference: XML How To Program, Deitel, Prentice Hall 2001
J2EE 1.4 tutorial
General HTML Problems
The Web has changed everything

Except the need for such as:




Data integrity
Process repeatability
Competitive cost structure
HTML fails to meet these critical business
needs



High development and maintenance costs
Internet and server bottlenecks
Interoperability
Dickson Chiu 2004
CSIT600b 01-2
Specific HTML Problems




HTML mixes up structure and style
Many wanted personalized tags
non-standard HTML web pages on the Internet
nowadays
Want to put other data into HTML



mathematics, database entries, literary text, poems, purchase
orders, graphic layouts ….
Different conceptions for the language
Software processing



Server management of data (library Web site, any large site)
Client data processing (machine--machine communication)
But -- HTML is so ill-formed, this is hard!
Web Page Processing
[Figure: a Web server engine combines HTML chunks into an HTML page; Web software (from somewhere on the Web) takes the HTML data and feeds it into a database or other tool.]
Case Study: Price Comparison
Scenario - compares prices of books
For example, a user enters a book title, and your
page displays the price at bn.com, amazon.com,
bestbuy.com, etc. User can choose the cheapest
price.
SGML








Standard Generalized Markup Language
See: http://www.w3.org/MarkUp/SGML/
Developed in the 1970s
Used by big organizations: IBM, US DoD
A meta-language for defining languages
Focuses on content structure, not look and feel
HTML is defined using SGML
Complex, sophisticated, powerful





Information model of freedom and extensibility
Write once, reuse many times
Future-proof, platform-proof
Validation for completeness and correctness
Infinite possibilities for expressing information (user-defined
tag set)
Problem of SGML


Too complicated
Rules too strict



Not good in a distributed environment
Can’t mix different data together





Can’t distribute ‘muddle-able’, loosely formatted text (like
HTML)
Can’t add arbitrary tags
No mainstream browser support
Unlimited options, which complicates the tools
Not much support for styles
Limited vendor support
eXtensible Markup Language
XML to the Rescue








Well-behaved subset of SGML designed to enable delivery
over the Web
A structured meta-language in plain-text (Unicode) format
SGML--, not HTML++
Designed by the World Wide Web Consortium (W3C)
Overwhelming vendor support
Can use XML to define new languages
Distributes easily on the Web
Can mix different types of data together




can easily add new tags, and tell a browser what to do with them
(more or less....)
Tools are easier to build
Mainstream browsers (IE 5 and Netscape 6) support XML
However! Reuse, interchange and automation still require
data analysis and enforcement of rules
XML History and Pointers








XML is an official standard of the World Wide Web Consortium
(W3C)
Official information is available at:
http://www.w3.org/XML/
Version:1.0 (2nd edition: 6 October 2000)
New version 1.1
http://www.w3.org/TR/2004/REC-xml11-20040204/
The official spec is available at: http://www.w3.org/TR/2000/REC-xml-20001006
The Official XML FAQ:
http://www.ucc.ie/xml/
Popular reference sites:
http://www.xml.com/ http://www.xml.org/
Reference Book: XML – How to Program
Deitel, Deitel, Nieto, Lin & Sadhu (Prentice Hall 2000)
XML Family of Technologies (partial)








DTD / Schema - defining XML documents, elements and attributes
DOM - manipulating an XML (or HTML) file from a programming language
XPath - addressing parts of an XML document
XLink - adding hyperlinks to an XML file
XPointer - pointing to parts of an XML document
CSS - as applicable to XML as it is to HTML
XSL - an advanced language for expressing style sheets (XML represents data but not how it looks…)
XSLT - transforming XML to other formats
Namespaces - differentiating elements of different XML documents
Official (W3C) Design Goal of XML
1. XML shall be straightforwardly usable over the Internet.
2. XML shall support a wide variety of applications.
3. XML shall be compatible with SGML.
4. It shall be easy to write programs which process XML documents.
5. The number of optional features in XML is to be kept to the absolute minimum, ideally zero.
6. XML documents should be human-legible and reasonably clear.
7. The XML design should be prepared quickly.
8. The design of XML shall be formal and concise.
9. XML documents shall be easy to create.
10. Terseness in XML markup is of minimal importance.
Examples of Hot XML Applications
E-publishing
  Web (intranet, extranet, Internet)
  CD-ROM
  Print
E-commerce
  Electronic commerce (business-to-consumer)
  Electronic Data Interchange -- EDI (business-to-business)
Software applications
  Data exchange between applications and databases
  Application integration
Standard data formats for industries
  MathML -- for mathematics
  SpeechML -- for synthesised voices
XML Advantages for Web Delivery
Over SGML:




Faster download
Supported by
mainstream browsers
Standard linking
Standard stylesheet
Over HTML:




Interchangeable
Reusable
Enables automation
Searchable
The XML Family Tree
[Figure: family tree rooted at SGML, with XML beneath it and languages such as SMIL, XHTML, HTML, SpeechML, WML, MathML, RDF and TEI as leaves.]
HTML vs. XML – a Quick Example
HTML:
<html>
  <body>
    <p>333 MHz Pentium II with 256K internal cache, 512K external cache, 32MB standard RAM, 512MB max. RAM</p>
  </body>
</html>
XML:
<pcinfo>
  <processor>
    <type>Pentium II</type>
    <speed>333</speed>
    <intcache>256</intcache>
  </processor>
  <extcache>512</extcache>
  <ram>
    <standard>32</standard>
    <max>512</max>
  </ram>
</pcinfo>
MathML Example


Designed to express layout of maths
Also can express semantics
Cut & paste into Maple, Mathematica
Example: x² + 4x + 4 = 0
<mrow>
  <mrow>
    <msup> <mi>x</mi> <mn>2</mn> </msup>
    <mo>+</mo>
    <mrow>
      <mn>4</mn>
      <mo>&InvisibleTimes;</mo>
      <mi>x</mi>
    </mrow>
    <mo>+</mo>
    <mn>4</mn>
  </mrow>
  <mo>=</mo>
  <mn>0</mn>
</mrow>
XML Markup
<?xml version = "1.0" encoding="utf-8" ?>
<!-- Fig. 5.1 : intro.xml -->
<!-- Simple introduction to XML markup -->
<myMessage>
<message>Welcome to XML!</message>
</myMessage>






Declaration; version 1.0
Encoding specification, e.g., UTF-8 Unicode
(Unicode Transformation Format-8: www.utf-8.com)
Comments
A tree of elements
One root element per document, e.g., <myMessage>
Child elements

<message> which contains the text Welcome to XML!
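To make the tree structure concrete, here is a minimal sketch that parses the intro.xml markup with Python's standard `xml.etree.ElementTree` module and inspects the root and its child. The variable names are illustrative assumptions, and the encoding attribute is left out of the declaration to keep the string literal simple.

```python
import xml.etree.ElementTree as ET

# The document from Fig. 5.1, held in a string for this sketch.
INTRO_XML = """<?xml version="1.0"?>
<!-- Simple introduction to XML markup -->
<myMessage>
   <message>Welcome to XML!</message>
</myMessage>"""

root = ET.fromstring(INTRO_XML)          # the single root element
child_tags = [child.tag for child in root]
message_text = root.find("message").text

print(root.tag)       # name of the root element
print(child_tags)     # names of its child elements
print(message_text)   # text content of <message>
```

Comments are skipped by this parser by default, so only the element tree is visible here.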
XML Markup Syntax

Tags written as in HTML, but ...






Only 1 root element in an XML document
Tag names are case-sensitive
Always need end tags
Special empty-element tags
<img src = "img.gif" />
or
<img src = "img.gif"></img>
(<img src = "img.gif"> is not well-formed)
Always quote attribute values
Proper nesting for XML elements:
<x><y>hello</x></y> is an error
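The rules above can be checked mechanically. The following sketch assumes Python's standard `xml.etree.ElementTree` parser, which rejects any document that is not well-formed; the helper name `is_well_formed` is hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts text as a well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

# One sample per syntax rule on this slide.
samples = [
    '<x><y>hello</y></x>',      # properly nested
    '<x><y>hello</x></y>',      # improper nesting
    '<img src="img.gif"/>',     # empty-element tag
    '<img src="img.gif">',      # missing end tag
    '<img src=img.gif />',      # unquoted attribute value
    '<Note></note>',            # tag names are case-sensitive
    '<a/><b/>',                 # more than one root element
]

for sample in samples:
    print(sample, "->", is_well_formed(sample))
```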
XML Characters

Unicode characters (http://www.unicode.org)
  ASCII a small subset
  Most languages in the world
  E.g., &#1583;&#1575;&#1610;&#1578;&#1614;
Reserved characters: & < > ' "
Entity reference
  Definition: <!ENTITY myName "Dickson Chiu">
  Using them: &myName;
  Built-in entity: &amp; &lt; &gt; &apos; &quot;
  &lt;hello&gt; displayed as <hello>
By default, an application may collapse consecutive white space, tabs and blank lines into a single space. To override:
<myCProgram xml:space = "preserve">
if ( x &lt;= 0)
   x = 5;
</myCProgram>
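As a small illustration of the built-in entities and character references, the sketch below uses `escape` from Python's standard `xml.sax.saxutils` to protect reserved characters and then parses the result back. The variable names are assumptions made for this example.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

code = "if ( x <= 0 ) x = 5;"

# escape() replaces the reserved characters &, < and > with built-in entities.
escaped = escape(code)
print(escaped)

# Once escaped, the text can be embedded in an element and parsed back intact.
element = ET.fromstring("<myCProgram>" + escaped + "</myCProgram>")
round_tripped = element.text

# A numeric character reference denotes a Unicode code point (1583 decimal).
arabic = ET.fromstring("<t>&#1583;</t>").text
print(round_tripped, arabic == chr(1583))
```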
Why Use Attributes?

Elements define structure, attributes describe
elements




Many debates – why pollute the language with two
ways of doing the same thing?
“Attributes can provide metadata that may not be
relevant to most applications dealing with XML”



<car doors="4"/> or
<car>
  <doors type="4"/>
</car>?
Metadata is data about data (i.e., description)
Attributes save bandwidth?
Personal preference
CDATA


Character data not parsed by the parser (good for code)
IE5 displays CDATA as is, including whitespace
<?xml version = "1.0"?>
<!-- Fig. 5.7 : cdata.xml -->
<book title = "C++ How to Program" edition = "3">
<sample>
// C++ comment
if ( this-&gt;getX() &lt; 5 &amp;&amp; value[ 0 ] != 3 )
cerr &lt;&lt; this-&gt;displayError();
</sample>
<sample>
<![CDATA[
// C++ comment
if ( this->getX() < 5 && value[ 0 ] != 3 )
cerr << this->displayError();
]]>
</sample>
C++ How to Program by Deitel &amp; Deitel
</book>
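The two `<sample>` elements carry the same character data; only the markup differs. A minimal sketch, assuming Python's standard `xml.etree.ElementTree` and a one-line version of the C++ snippet, shows that a parser hands the application identical text either way.

```python
import xml.etree.ElementTree as ET

# Same C++ line written twice: once with entity references, once in CDATA.
ESCAPED = ("<sample>if ( this-&gt;getX() &lt; 5 &amp;&amp; "
           "value[ 0 ] != 3 )</sample>")
CDATA = ("<sample><![CDATA[if ( this->getX() < 5 && "
         "value[ 0 ] != 3 )]]></sample>")

escaped_text = ET.fromstring(ESCAPED).text
cdata_text = ET.fromstring(CDATA).text

print(escaped_text)
print(escaped_text == cdata_text)
```

The CDATA form is simply easier to read and write when the content is full of `<` and `&`.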
XML Namespaces



To avoid name collisions (same name for different elements)
A namespace is tied to a uniform resource identifier (URI)
A common practice is to use a URL
<?xml version = "1.0"?>
<!-- Fig. 5.9 : defaultnamespace.xml -->
<directory xmlns = "urn:deitel:textInfo"
xmlns:image = "urn:deitel:imageInfo">
<file filename = "book.xml">
<!-- default ns -->
<description>A book list</description>
</file>
<image:file filename = "funny.jpg">
<image:description>A funny picture
</image:description>
<image:size width = "200" height = "100"/>
</image:file>
</directory>
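To see how a parser keeps the two `file` elements apart, here is a sketch using Python's standard `xml.etree.ElementTree`, which expands each name to `{namespace-URI}local-name`. The prefix `text` in the lookup table is an assumption local to this example; the document itself uses the default namespace for those elements.

```python
import xml.etree.ElementTree as ET

DIRECTORY_XML = """<directory xmlns="urn:deitel:textInfo"
           xmlns:image="urn:deitel:imageInfo">
   <file filename="book.xml">
      <description>A book list</description>
   </file>
   <image:file filename="funny.jpg">
      <image:description>A funny picture</image:description>
      <image:size width="200" height="100"/>
   </image:file>
</directory>"""

root = ET.fromstring(DIRECTORY_XML)

# Prefixes used for searching; they need not match those in the document.
ns = {"text": "urn:deitel:textInfo", "image": "urn:deitel:imageInfo"}

text_files = [f.get("filename") for f in root.findall("text:file", ns)]
image_files = [f.get("filename") for f in root.findall("image:file", ns)]

print(root.tag)      # expanded name of the root element
print(text_files)    # files in the default (text) namespace
print(image_files)   # files in the image namespace
```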
Document Type Definition





DTD (Document Type Definition) to define a
document’s structure – what tags/attributes
are permitted, and the “grammar”
Validity = conformance to some DTD
(“grammatically correct”)
Well-formedness – required;
validity – optional
DTD recommended, esp. for B2B transactions
DTDs are defined using EBNF (Extended
Backus-Naur Form), not XML
XML Parsers


“Parse”: to separate a sentence into its parts [Webster]
XML parser: a program/function that reads the XML document
  To check its syntax
  To allow programmatic access (DOM or SAX) to the contents
An XML document is well-formed if it is syntactically correct
  One root element
  Start and end tag for each element
  Proper nesting, etc.
Validity implies well-formedness; the reverse is not true
All XML parsers check for well-formedness; validating parsers check also for validity
Many Free XML Parsers


Apache’s Xerces, Sun’s JAXP, IBM’s XML4J, etc.
IE5 has one built in, msxml





It uses a default style sheet
With style sheets such as CSS or XSL, the data can
be displayed in any desired format
msxml is a validating parser
But the validation feature needs to be turned on in IE5
Current version msxml4, see:
http://msdn.microsoft.com/xml
DTD <!DOCTYPE … >


DTDs are specified using <!DOCTYPE … >
Internal DTD:


<!DOCTYPE myMessage [
<!ELEMENT myMessage ( #PCDATA )>
]>
External DTD:

<!DOCTYPE myMessage SYSTEM "myDTD.dtd">
DTD Example
<?xml version = "1.0"?>
<!-- Fig. 6.8: IDExample.xml -->
<!DOCTYPE bookstore [
<!ELEMENT bookstore ( shipping+, book+ )>
<!ELEMENT shipping ( duration )>
<!ATTLIST shipping shipID ID #REQUIRED>
<!ELEMENT book ( #PCDATA )>
<!ATTLIST book shippedBy IDREF #IMPLIED>
<!ELEMENT duration ( #PCDATA )>
]>
<bookstore>
<shipping shipID = "s1">
<duration>2 to 4 days</duration>
</shipping>
<shipping shipID = "s2">
<duration>1 day</duration>
</shipping>
<book shippedBy = "s2">
Java How to Program 3rd edition.
</book>
<book shippedBy = "s2">
C How to Program 3rd edition.
</book>
</bookstore>
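The ID / IDREF pairing in this DTD says that every `shippedBy` value must name an existing `shipID`. Python's standard library has no validating parser, so the sketch below is a hand-written check of that one constraint, not DTD validation; the function name `dangling_idrefs` is hypothetical.

```python
import xml.etree.ElementTree as ET

BOOKSTORE_XML = """<?xml version="1.0"?>
<!DOCTYPE bookstore [
   <!ELEMENT bookstore ( shipping+, book+ )>
   <!ELEMENT shipping ( duration )>
   <!ATTLIST shipping shipID ID #REQUIRED>
   <!ELEMENT book ( #PCDATA )>
   <!ATTLIST book shippedBy IDREF #IMPLIED>
   <!ELEMENT duration ( #PCDATA )>
]>
<bookstore>
   <shipping shipID="s1"><duration>2 to 4 days</duration></shipping>
   <shipping shipID="s2"><duration>1 day</duration></shipping>
   <book shippedBy="s2">Java How to Program 3rd edition.</book>
   <book shippedBy="s2">C How to Program 3rd edition.</book>
</bookstore>"""

def dangling_idrefs(root):
    """List shippedBy values that do not match any shipping shipID."""
    ids = {s.get("shipID") for s in root.iter("shipping")}
    refs = [b.get("shippedBy") for b in root.iter("book")]
    return [r for r in refs if r is not None and r not in ids]

root = ET.fromstring(BOOKSTORE_XML)
ok_result = dangling_idrefs(root)

# Break one reference on purpose to show what a validating parser would reject.
root.find("book").set("shippedBy", "s9")
broken_result = dangling_idrefs(root)

print(ok_result, broken_result)
```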
Something better?


Schemas are the answer
“Schema”, a term originating in databases, means the organization or structure of a database
  Naming of data items
  Constraints to be applied to data (e.g., data typing)
  Relationships between data items
W3C schemas (May 2001) – http://www.w3c.org/XML/Schema
XML-Data Reduced (XDR) - Microsoft’s non-W3C-compliant implementation
See tutorial: http://zvon.org/xxl/XMLSchemaTutorial/Output/index.html
XML Schema – Simple Types

Elements that do not contain other elements or
attributes are of type simpleType.
<xsd:element name="STAFFNO" type="xsd:string"/>
<xsd:element name="DOB" type="xsd:date"/>
<xsd:element name="SALARY" type="xsd:decimal"/>

Attributes must be defined last:
<xsd:attribute name="branchNo" type="xsd:string"/>
XML Schema – Enumeration Types
<xsd:schema
xmlns:xsd="http://www.w3.org/2001/XMLSchema" >
<xsd:element name="root">
<xsd:simpleType>
<xsd:restriction base="xsd:string">
<xsd:enumeration value="N/A"/>
<xsd:enumeration value="#REF!"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
</xsd:schema>
XML Schema – Range Restrictions

Element "root" to be from the range 0-100 or 300-400
(including the border values).
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" >
<xsd:element name="root">
<xsd:simpleType>
<xsd:union>
<xsd:simpleType>
<xsd:restriction base="xsd:integer">
<xsd:minInclusive value="0"/>
<xsd:maxInclusive value="100"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType>
<xsd:restriction base="xsd:integer">
<xsd:minInclusive value="300"/>
<xsd:maxInclusive value="400"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:union>
</xsd:simpleType>
</xsd:element>
</xsd:schema>
XML Schema – Complex Types


Elements that contain other elements are of type complexType.
The list of children of a complex type is described by
  all – must all appear, any order
  sequence – must all appear according to specified sequence
  choice – exactly one of the listed children appears
<xsd:element name = "STAFFLIST">
  <xsd:complexType>
    <xsd:sequence>
      <!-- children defined here -->
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
Cardinality


Cardinality of an element can be represented
using attributes minOccurs and maxOccurs
(default 1).
To represent an optional element, set minOccurs
to 0; to indicate there is no maximum number of
occurrences, set maxOccurs to “unbounded”.
<xsd:element name="DOB" type="xsd:date" minOccurs="0"/>
<xsd:element name="NOK" type="xsd:string" minOccurs="0" maxOccurs="3"/>
References

Can use references to elements and attribute
definitions.
<xsd:element name="STAFFNO" type="xsd:string"/>
….
<xsd:element ref="STAFFNO"/>

If there are many references to STAFFNO, use of
references will place definition in one place and
improve the maintainability of the schema.
Defining New Types

Can also define new data types to create elements and attributes.
<xsd:simpleType name="STAFFNOTYPE">
<xsd:restriction base="xsd:string">
<xsd:maxLength value="5"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="myNumber">
<xsd:restriction base="xsd:decimal">
<xsd:totalDigits value="5"/>
<xsd:fractionDigits value="2"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="myString">
<xsd:restriction base="xsd:string">
<xsd:pattern value="[^@]+@[^.]+\..+"/>
</xsd:restriction>
</xsd:simpleType>
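To get a feel for what two of these restrictions accept, here is a rough sketch in plain Python. It is not a schema validator: it re-implements the `maxLength` facet of STAFFNOTYPE and the `pattern` facet of myString by hand, on the assumption that this particular pattern means the same thing in Python's `re` syntax. An XSD pattern must match the whole value, hence `fullmatch`. The function names are hypothetical.

```python
import re

# Pattern copied from the myString type; XSD patterns are implicitly anchored.
MY_STRING_PATTERN = re.compile(r"[^@]+@[^.]+\..+")

def is_my_string(value):
    """Mimic the pattern facet of myString."""
    return MY_STRING_PATTERN.fullmatch(value) is not None

def is_staff_no(value):
    """Mimic the maxLength facet of STAFFNOTYPE (at most 5 characters)."""
    return len(value) <= 5

for candidate in ["user@example.com", "user@example", "no-at-sign.com"]:
    print(candidate, is_my_string(candidate))

for candidate in ["SG37", "SG12345"]:
    print(candidate, is_staff_no(candidate))
```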
XMI – Designing XML and Schema

XML Metadata Interchange (XMI)




Created by the OMG
As a standard for exchanging metamodels and
models
Provides a standard method for mapping object
models and instances to XML
Mapping UML models to XML is only one specific
subset of how XMI can be applied
Applying XMI on UML

Mapping UML models to XML schemas and
documents
[Figure: the UML Meta Model, Our Model and Our Model Instance are linked from top to bottom by "instance of". An XML DTD / Schema is produced from Our Model according to XMI; an XML Document is translated from Our Model Instance according to XMI and is validated by the XML DTD / Schema.]
Simplified Product Catalog Example
[UML class diagram, flattened here as text]
Catalog: name : String, expirationDate : Date
  +item 0..* -> CatalogItem
CatalogItem: name : String, description : String, sku : String, listPrice : Money, keyword [0..*] : String
  * -> 1 +supplier Organization
Product (a kind of CatalogItem): photoURL : String, units : UnitOfMeasure
Service (a kind of CatalogItem): units : UnitOfTime
Organization: name : String, address : String, city : String, state : String, zip : String
Money: currency : String, amount : double
<<enumeration>> UnitOfMeasure: each, dozen, meter, kilogram
<<enumeration>> UnitOfTime: hour, day, week, month, year
Disassembling UML Objects Into XML

UML Classes to Elements
  Each instance of a UML class produces one XML element
Inheritance
  Copy-down inheritance for attributes, association references, and compositions
UML Attributes to XML Elements
  Each data value as a child element
UML Attributes to XML Attributes
  Only possible for primitive data type or an enumeration
Object diagram used in the examples:
  :Product (name = Wizard 4000, description = 400 MHz notebook…, sku = p123, photoURL = images/p123.gif)
  listPrice -> :Money (currency = USD, amount = 1899.95)
Disassembling UML Objects Into XML

UML Attributes to XML Elements
[Figure callouts on the XML below: Class; Inheritance; UML Attribute to XML Element]
<Product>
  <CatalogItem.name>Wizard 4000</CatalogItem.name>
  <CatalogItem.sku>p123</CatalogItem.sku>
  <CatalogItem.description>
    400 MHz notebook computer with Win98
  </CatalogItem.description>
  <CatalogItem.listPrice>
    <Money>
      <Money.currency>USD</Money.currency>
      <Money.amount>1899.95</Money.amount>
    </Money>
  </CatalogItem.listPrice>
  <Product.photoURL>images/p123.gif</Product.photoURL>
</Product>
Disassembling UML Objects Into XML

UML Attributes to XML Attributes
[Figure callout on the XML below: UML Attribute to XML Attribute]
<Product sku="p123" photoURL="images/p123.gif">
  <CatalogItem.name>Wizard 4000</CatalogItem.name>
  <CatalogItem.description>
    400 MHz notebook computer with Win98
  </CatalogItem.description>
  <CatalogItem.listPrice>
    <Money currency="USD" amount="1899.95" />
  </CatalogItem.listPrice>
</Product>
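The attribute-style mapping can be sketched in code. The example below is only an illustration of the naming convention shown on these slides (inherited features prefixed with `CatalogItem.`, primitive values as XML attributes); it is not an XMI implementation, and the Python classes and the function `product_to_xml` are assumptions invented for this sketch.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET

@dataclass
class Money:
    currency: str
    amount: str

@dataclass
class Product:
    name: str          # inherited from CatalogItem in the UML model
    sku: str           # inherited from CatalogItem
    list_price: Money  # inherited from CatalogItem
    photo_url: str     # declared on Product itself

def product_to_xml(product):
    """Map one Product object to an XML element, primitives as attributes."""
    elem = ET.Element("Product",
                      {"sku": product.sku, "photoURL": product.photo_url})
    ET.SubElement(elem, "CatalogItem.name").text = product.name
    price = ET.SubElement(elem, "CatalogItem.listPrice")
    ET.SubElement(price, "Money",
                  {"currency": product.list_price.currency,
                   "amount": product.list_price.amount})
    return elem

wizard = Product("Wizard 4000", "p123", Money("USD", "1899.95"),
                 "images/p123.gif")
product_elem = product_to_xml(wizard)
print(ET.tostring(product_elem, encoding="unicode"))
```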
Disassembling UML Objects Into XML

Mapping UML Compositions
  The role name on the composition end is mapped onto an XML element
  The composite is mapped as a child element
Object diagram: a :Catalog composed, via role item, of
  :Product (sku = p123, name = Wizard 4000)
  :Product (sku = p456, name = Wizard 5500)
  :Service (sku = p789, name = 3 yrs on-site repair)
Disassembling UML Objects Into XML
[Figure callouts on the XML below: role name (<Catalog.item>); composite (the child elements)]
<Catalog>
  <Catalog.item>
    <Product sku="p123">
      <CatalogItem.name>Wizard 4000</CatalogItem.name>
    </Product>
    <Product sku="p456">
      <CatalogItem.name>Wizard 5500</CatalogItem.name>
    </Product>
    <Service sku="p789">
      <CatalogItem.name>3 yrs on-site repair</CatalogItem.name>
    </Service>
  </Catalog.item>
</Catalog>
Disassembling UML Objects Into XML

Mapping UML Associations
  The associated object as a separate element with an xmi.id attribute
  The role name of the association end is mapped to an XML element or attribute that refers to the associated element
Object diagram: a :Catalog with two items,
  :Product (sku = p123, name = Wizard 4000)
  :Product (sku = p456, name = Wizard 5500)
both linked via role supplier to
  :Organization (xmi.id = s1, name = Cheap PCs, Inc)
Disassembling UML Objects Into XML
[Figure callouts on the XML below: role name; attribute containing the reference; associated element]
<Catalog>
  <Catalog.item>
    <Product sku="p123">
      <CatalogItem.name>Wizard 4000</CatalogItem.name>
      <CatalogItem.supplier>
        <Organization xmi.idref="s1" xmi.label="Wizard PCs, Inc." />
      </CatalogItem.supplier>
    </Product>
    <Product sku="p456" supplier="s1">
      <CatalogItem.name>Wizard 5500</CatalogItem.name>
    </Product>
  </Catalog.item>
</Catalog>
<Organization xmi.id="s1">
  <Organization.name>Wizard PCs, Inc.</Organization.name>
</Organization>
XML Schema Mapping Example
[UML fragment being mapped: Catalog (name : String, startDate : Date) has +item 0..* CatalogItem (name : String, description : String); Product (photoURL : String, units : UnitOfMeasure) specializes CatalogItem; <<enumeration>> UnitOfMeasure has literals each, dozen.]
[Figure callouts on the schema below: Attribute (Catalog.name, Catalog.startDate); Association (Catalog.item)]
<xsd:element name="Catalog" type="Catalog"/>
<xsd:complexType name="Catalog">
  <xsd:all>
    <xsd:element name="Catalog.name" type="xsd:string"/>
    <xsd:element name="Catalog.startDate" type="xsd:date"/>
    <xsd:element name="Catalog.item" minOccurs="0">
      <xsd:complexType>
        <xsd:choice minOccurs="0" maxOccurs="unbounded">
          <xsd:element ref="CatalogItem"/>
          <xsd:element ref="Product"/>
        </xsd:choice>
      </xsd:complexType>
    </xsd:element>
  </xsd:all>
</xsd:complexType>
XML Schema Mapping Example
[Figure callout on the schema below: Class (CatalogItem)]
<xsd:element name="CatalogItem" type="CatalogItem"/>
<xsd:complexType name="CatalogItem">
  <xsd:all>
    <xsd:element name="CatalogItem.name" type="xsd:string"/>
    <xsd:element name="CatalogItem.description" type="xsd:string"/>
  </xsd:all>
</xsd:complexType>
XML Schema Mapping Example
[Figure callouts on the schema below: Inheritance (xsd:extension); Enumeration (type UnitOfMeasure)]
<xsd:element name="Product" type="Product"/>
<xsd:complexType name="Product">
  <xsd:complexContent>
    <xsd:extension base="CatalogItem">
      <xsd:all>
        <xsd:element name="Product.photoURL" type="xsd:string"/>
        <xsd:element name="Product.units" type="UnitOfMeasure"/>
      </xsd:all>
    </xsd:extension>
  </xsd:complexContent>
</xsd:complexType>
XML Schema Mapping Example
[Figure callout on the schema below: Enumeration (UnitOfMeasure: each, dozen)]
<xsd:simpleType name="UnitOfMeasure">
  <xsd:restriction base="xsd:string">
    <xsd:enumeration value="each"/>
    <xsd:enumeration value="dozen"/>
  </xsd:restriction>
</xsd:simpleType>
What’s DOM (W3C)

“… a platform- and language-neutral
interface that will allow programs and scripts
to dynamically access and update the content,
structure and style of documents.”
[Figure: layered stack, top to bottom: Application, DOM API, XML parser, XML document.]
Public Releases


The DOM specification is being built level by level.
NN3.0/IE3.0 – “Level 0”
  HTML DOM
Level 1 – a W3C recommendation, 10/1998
  DOM Core, HTML DOM, XML DOM
Level 2 – a W3C recommendation, 11/2000
  Extends Level 1 by an events model, a style sheet model, support for XML namespaces, etc.
Level 3 – working draft
http://www.w3c.org/DOM/
HTML/XML as Tree
<TABLE>
<TBODY>
<TR> <TD>Shady Grove</TD> <TD>Aeolian</TD> </TR>
<TR> <TD>Over the River, Charlie</TD> <TD>Dorian</TD></TR>
</TBODY>
</TABLE>
XML Tree and Nodes

Seven types of nodes







root nodes
element nodes
text nodes
attribute nodes
namespace nodes
processing instruction nodes
comment nodes
A DOM Example (Fig. 11.2)
<?xml version = "1.0"?>
<!-- Fig. 11.1 : simple.xml -->
<!-- Simple XML document -->
<book title = "C++ How to Program" edition = "3">
  <sample>
    <![CDATA[
      // C++ comment
      if ( this->getX() < 5 && value[ 0 ] != 3 )
        cerr << this->displayError();
    ]]>
  </sample>
  C++ How to Program by Deitel &amp; Deitel
</book>
Resulting DOM tree:
Root
  Comment: Fig. 11.1 : simple.xml
  Comment: Simple XML document
  Element: book
    Attribute: title = C++ How to Program
    Attribute: edition = 3
    Element: sample
      Text: // C++ comment if ( this->getX() < 5 && value[ 0 ] != 3 ) cerr << this->displayError();
    Text: C++ How to Program by Deitel & Deitel
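A tree like this can be walked through the DOM API. The sketch below assumes Python's standard `xml.dom.minidom`, one of many DOM implementations, and a shortened copy of simple.xml; the variable names are illustrative.

```python
from xml.dom import minidom

SIMPLE_XML = """<?xml version="1.0"?>
<!-- Fig. 11.1 : simple.xml -->
<!-- Simple XML document -->
<book title="C++ How to Program" edition="3">
   <sample><![CDATA[if ( this->getX() < 5 && value[ 0 ] != 3 )]]></sample>
   C++ How to Program by Deitel &amp; Deitel
</book>"""

doc = minidom.parseString(SIMPLE_XML)
book = doc.documentElement                     # the <book> element node

# Attribute nodes hang off the element rather than among its children.
title = book.getAttribute("title")
edition = book.getAttribute("edition")

# Character data inside <sample>, whatever node type carries it.
sample = book.getElementsByTagName("sample")[0]
sample_text = "".join(node.data for node in sample.childNodes)

# Text directly inside <book>, ignoring whitespace-only nodes.
book_text = [node.data.strip() for node in book.childNodes
             if node.nodeType == node.TEXT_NODE and node.data.strip()]

print(book.tagName, title, edition)
print(sample_text)
print(book_text)
```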
XPath Introduction


The primary purpose of XPath is to address parts of an XML document
Resources
  http://www.w3c.org/TR/xpath
  http://www.vbxml.com/xsl/XPathRef.asp
Used by XSLT and XPointer
  XSLT to transform XML documents into other formats
  XPointer for “pointing” to a document’s contents
Non-XML syntax to facilitate use with URIs and attribute values
A path notation like that of URL for navigation
  catalog/cd/title
XPath 1.0


XPath 1.0 is a W3C recommendation (16 November
1999)
The recommendation consists of





Location paths - for getting around
Expressions - the primary syntax constructs
Core function library - every implementation must include it
Data model - an XML document is viewed conceptually as a tree of nodes
See Tutorial:
http://zvon.org/xxl/XPathTutorial/Output/index.html
XPath Example
<books>
  <book>
    <title>Java How to Program</title>
    <translation edition = "1">Spanish</translation>
    <translation edition = "1">Chinese</translation>
    <translation edition = "1">Japanese</translation>
    <translation edition = "2">French</translation>
    <translation edition = "2">Japanese</translation>
  </book>
  <book>
    <title>C++ How to Program</title>
    <translation edition = "1">Korean</translation>
    <translation edition = "2">French</translation>
    <translation edition = "2">Spanish</translation>
    <translation edition = "3">Italian</translation>
    <translation edition = "3">Japanese</translation>
  </book>
</books>
XPath expressions:
/books/book/translation[. = 'Japanese']/../title
/books/book/translation[. = 'Japanese']/@edition
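The two queries can be approximated with the limited XPath subset in Python's standard `xml.etree.ElementTree` (the `[.='text']` predicate needs Python 3.7 or later). This is a sketch, not a full XPath 1.0 engine: the title query is rewritten as an equivalent predicate on `book`, and attribute values are read with `get()` because this subset cannot select attribute nodes.

```python
import xml.etree.ElementTree as ET

BOOKS_XML = """<books>
  <book>
    <title>Java How to Program</title>
    <translation edition="1">Spanish</translation>
    <translation edition="1">Chinese</translation>
    <translation edition="1">Japanese</translation>
    <translation edition="2">French</translation>
    <translation edition="2">Japanese</translation>
  </book>
  <book>
    <title>C++ How to Program</title>
    <translation edition="1">Korean</translation>
    <translation edition="2">French</translation>
    <translation edition="2">Spanish</translation>
    <translation edition="3">Italian</translation>
    <translation edition="3">Japanese</translation>
  </book>
</books>"""

root = ET.fromstring(BOOKS_XML)   # root is the <books> element

# Titles of books that have a Japanese translation.
titles = [t.text for t in root.findall("book[translation='Japanese']/title")]

# Editions of every Japanese translation.
editions = [t.get("edition")
            for t in root.findall("book/translation[.='Japanese']")]

print(titles)
print(editions)
```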
The Stock XSLT Example
Source document:
<?xml version = "1.0"?>
<?xml-stylesheet type = "text/xsl" href = "stocks.xsl"?>
<stocks>
  <stock symbol = "INTC">
    <name>Intel Corporation</name>
  </stock>
  <stock symbol = "CSCO">
    <name>Cisco Systems</name>
  </stock>
  <stock symbol = "CMGI">
    <name>CMGI, Inc.</name>
  </stock>
</stocks>
Stylesheet:
<?xml version = "1.0"?>
<!-- Fig. 11.14 : stocks.xsl -->
<xsl:stylesheet version = "1.0" xmlns:xsl = "http://www.w3.org/1999/XSL/Transform">
  <xsl:template match = "/stocks">
    <html>
      <body>
        <ul>
          <xsl:for-each select = "stock">
            <xsl:if test = "starts-with(@symbol, 'C')">
              <li>
                <xsl:value-of select = "concat(@symbol,' - ', name)"/>
              </li>
            </xsl:if>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
Output:
<!-- Output of stocks.xsl -->
<html>
  <body>
    <ul>
      <li>CSCO - Cisco Systems</li>
      <li>CMGI - CMGI, Inc.</li>
    </ul>
  </body>
</html>
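For readers more at home in a procedural language, the stylesheet's logic (loop over `stock`, keep symbols starting with C, emit `symbol - name`) can be mirrored by hand. Python's standard library has no XSLT processor, so this is a plain-Python equivalent of this one stylesheet, not an XSLT engine; the function name `stocks_to_html` is hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

STOCKS_XML = """<stocks>
  <stock symbol="INTC"><name>Intel Corporation</name></stock>
  <stock symbol="CSCO"><name>Cisco Systems</name></stock>
  <stock symbol="CMGI"><name>CMGI, Inc.</name></stock>
</stocks>"""

def stocks_to_html(xml_text):
    """Hand-written counterpart of the template matching /stocks."""
    root = ET.fromstring(xml_text)
    items = []
    for stock in root.findall("stock"):              # xsl:for-each
        symbol = stock.get("symbol", "")
        if symbol.startswith("C"):                   # xsl:if with starts-with()
            label = symbol + " - " + stock.findtext("name", "")  # concat()
            items.append("<li>" + escape(label) + "</li>")
    return "<html><body><ul>" + "".join(items) + "</ul></body></html>"

html_output = stocks_to_html(STOCKS_XML)
print(html_output)
```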
XSL - Extensible Stylesheet Language



Transforms an input document (source tree) to an output
document (result tree)
By matching XPath expression – data driven
(cf. Prolog)
Two (language) components


XSLT, T for Transformation
XSL FO, FO for Formatting Objects

XPath was the query language part of the original XSLT proposal
XSLT 1.0 is a W3C recommendation, so is XPath 1.0;
(now working on 2.0)

XSL 1.0 (with FO) recommendation Oct 15, 2001.




http://www.w3.org/Style/XSL/
http://www.w3.org/Style/XSL/XSL-FO/
Tutorial: http://www.zvon.org/xxl/XSLTutorial/Output/index.html
XSLT for E-Commerce






Companies sharing information
Company A sends purchase order to
Company B
A and B need different sets of information
related to the purchase
Using XSLT, the same (master) data can be
transformed to suit the needs of A and B
The companies can use XSL FO to further
modify the transformed data for display
Compare views in databases…
XSLT/XPath Processors

IE
  IE5 needs the latest MSXML (3.0) installed in “replace mode”
  IE6 – with MSXML 3.0
  Latest version: MSXML 4.0 SP1…
Apache Xalan
  Needs to work with a Java-based XML parser such as Xerces
XT
  By one of the champions of the W3C XSL effort
  Has a Win32 executable version
Many others
<xsl:template>
XSLT stylesheets are simply a collection of templates
  To apply to an input document to create an output document
A template specifies
  The section of the source tree to which the template applies, using the match attribute (see: zvon.org tutorial p6 p8 p9)
  The output that will be inserted into the result tree – everything between <xsl:template> and </xsl:template>
Other attributes and parameters:
  name
  param – parameter for call-template, declared with a child <xsl:param> element (see: zvon.org tutorial p34)
  priority – (numeric) order upon multiple template match (see: zvon.org tutorial p10)
  mode – for advanced programming: a template whose match fits the context but whose mode value does not is skipped (see: zvon.org tutorial p11)
<xsl:apply-templates>




Used within a template, to call other templates
Recursive templates allowed!
Uses the select attribute to select the nodes to process
If select is omitted, the children of the current node are processed, as in the following default template (see: zvon.org tutorial p7)
<xsl:template match="/|*">
  <xsl:apply-templates/>
</xsl:template>
Note difference from subroutine call – more than 1 match possible (cf. Prolog)
Example
<xsl:template match="/">
  <Members><xsl:apply-templates select="/Team/Member"/></Members>
</xsl:template>
<xsl:value-of>


To do something useful with the information
in the source tree
Searches the context node for the value
specified in the select attribute, and inserts it
into the result tree


<xsl:value-of select="."/>
<xsl:value-of select="customer/@id" />
<xsl:element> <xsl:attribute>


What if we don’t know ahead of time what elements
we are creating?
The following element creation depends on the
contents of a source element





<xsl:template match="name">
  <xsl:element name="{.}">Hello</xsl:element>
</xsl:template>
produces
<Tommy>Hello</Tommy>
when run against
<name>Tommy</name>
in the source tree
See: zvon.org tutorial p22
You may fill in attributes too. (See: zvon.org tutorial
p23)
<xsl:if> and <xsl:choose>

<xsl:if> evaluates the expression in the test attribute; if true, the contents of <xsl:if> are evaluated (See: zvon.org tutorial p26)
<xsl:if test="name">Hello</xsl:if>
would insert the text Hello into the result tree if <name> is a child of the context node
Interesting example: zvon.org tutorial p28
<xsl:choose> is more flexible (See: zvon.org tutorial p27)
<xsl:choose>
  <xsl:when test="grade[. &gt; 70]">good</xsl:when>
  <xsl:when test="grade[. &gt; 50]">ok</xsl:when>
  <xsl:otherwise>hmmm</xsl:otherwise>
</xsl:choose>
<xsl:for-each>



Often we need to process a number of nodes
in series
<xsl:apply-templates> + a new template
would work
<xsl:for-each> is a (better) alternative, which forms a template within a template
<xsl:for-each select="name">
  Hello, <xsl:value-of select="."/>!
</xsl:for-each>
See: zvon.org tutorial p18
<xsl:copy-of> and <xsl:copy>

<xsl:copy-of> takes sections (sub-structure) of the
source tree and copies them to the result tree (See:
zvon.org tutorial p63)



vs. manually using a series of <xsl:value-of>’s
Uses the select attribute to indicate what to copy
<xsl:copy> is more flexible (can set attributes at the
same time, see: p64)

<xsl:template match="name">
  <xsl:copy>
    <xsl:value-of select="."/>
  </xsl:copy>
</xsl:template>
<xsl:sort>



To apply to <xsl:apply-templates> and <xsl:for-each>,
using the select attribute to choose the key
See: zvon.org tutorial p19
Other attributes include data-type (text* / number,
see: p20), order (ascending* / descending), and
case-order (upper-first* / lower-first, see: p21)


<xsl:for-each select="/people/name">
  <xsl:sort select="last"/>
  <xsl:sort select="first"/>
</xsl:for-each>
* means default
Variables and Constants

As in programming languages
<xsl:variable name="pi">3.14</xsl:variable>
which defines a constant to be denoted by $pi
<xsl:variable name="who" select="/people/name"/>
which defines a variable, $who
BUT: no variable update – you can only put an updated value into a new variable
See: zvon.org tutorial p32
<xsl:variable> can contain XML markup (see: p35)
<xsl:variable name="who">
<name><xsl:value-of select="/name/last"/>
<xsl:value-of select="/name/first"/></name>
</xsl:variable>
…
<xsl:copy-of select="$who"/>
Other functions

String functions (see: zvon.org tutorial p47-52)
concat(), starts-with(), substring-before(), substring-after(), string-length(), normalize-space(), translate() …
Node set functions
count() (p54), position(), last() (p53)
Numeric calculations (p38-42)
ceiling(), floor(), round(), /, div …
Test for number (p42): <xsl:if test="string(number(.))='NaN'"> is not a number</xsl:if>
Boolean functions (e.g., p44)
<xsl:template match="car[not(@checked)]">
Output Coding (p58-62)
<xsl:output method="html" encoding="UTF-8"/>
XSLT vs. CSS

CSS pushes, XSLT pulls
CSS is content-blind and order-sensitive
CSS is not a programming language, XSLT is like one
But CSS provides a rich set of capabilities for content presentation; XSLT has nothing on styling (XSL FO has)
CSS should still be considered an integral part of using XML
at least until XSL FO becomes stable and widely available; even then, CSS is an easier alternative
Main Discussion Ends Here …

The rest of the slides are just for reference
XML API


Two types: tree-based and event-based
A tree-based API (e.g., DOM) compiles an XML document into an internal tree structure
Uses much physical memory
An event-based API (e.g., SAX) reports parsing events (such as the start and end of elements) directly to the application
Consider the task of locating the record element containing the phrase “Hong Kong” in a 20MB (or even just 2MB) document
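The search task can be done without building the whole tree. The sketch below uses ElementTree's iterparse, a streaming interface in the same spirit as SAX, and discards each record once it has been checked. The element names records and record, the sample data, and the helper find_record are assumptions for illustration.

```python
import io
import xml.etree.ElementTree as ET


def find_record(source, phrase):
    """Return the text of the first <record> containing phrase, or None."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "record":
            text = "".join(elem.itertext())
            if phrase in text:
                return text.strip()
            elem.clear()  # drop the finished record's content to save memory
    return None


# Hypothetical two-record document standing in for the large file.
SAMPLE = (
    b"<records><record>Tokyo office</record>"
    b"<record>Hong Kong office</record></records>"
)
print(find_record(io.BytesIO(SAMPLE), "Hong Kong"))
```

With a real file, a path or an open file object would be passed instead of the in-memory buffer.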
Event-Based API

For the document
<doc>
<para>Hello, world!</para>
</doc>
An event-based interface will generate the series of events:
start document
start element: doc
start element: para
characters: Hello, world!
end element: para
end element: doc
end document
A “tell me when something happens” approach
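The event series can be reproduced with Python's xml.sax module. This is only a sketch: the handler class EventLogger is a made-up name, and the document is written on one line so that no whitespace-only character events appear. A parser may in general deliver character data in several chunks.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Record each parsing event as a line of text."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("start document")

    def endDocument(self):
        self.events.append("end document")

    def startElement(self, name, attrs):
        self.events.append(f"start element: {name}")

    def endElement(self, name):
        self.events.append(f"end element: {name}")

    def characters(self, content):
        if content.strip():  # skip whitespace-only text
            self.events.append(f"characters: {content}")


handler = EventLogger()
xml.sax.parseString(b"<doc><para>Hello, world!</para></doc>", handler)
print("\n".join(handler.events))
```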
DOM vs. SAX

SAX = Simple API for XML
[Figure, DOM: XML document -> XML parser -> (parse) -> DOM -> (get information) -> Application]
[Figure, SAX: XML document -> XML parser -> (parse) -> Event handlers -> (information) -> Application]
DOM with JavaScript (Reference)
(MS XML Parser) (Deitel Sec 8.3)
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<!-- Fig. 8.3 : DOMExample.html -->
<head> <title>A DOM Example</title> </head>
<body>
<script type = "text/javascript" language = "JavaScript">
var xmlDocument = new ActiveXObject( "Microsoft.XMLDOM" );
xmlDocument.load( "article.xml" );
// get the root element
var element = xmlDocument.documentElement;
document.writeln("<p>Here is the root node of the document:" );
document.writeln("<strong>" + element.nodeName+"</strong>" );
document.writeln("<br>The following are its child elements:" );
document.writeln( "</p><ul>" );
DOM with JavaScript (2)
// traverse all child nodes of root element
for ( var i = 0; i < element.childNodes.length; i++ ) {
var curNode = element.childNodes.item( i );
// print node name of each child element
document.writeln( "<li><strong>" + curNode.nodeName
+ "</strong></li>" );
}
document.writeln( "</ul>" );
// get the first child node of root element
var currentNode = element.firstChild; // firstChild = childNodes.item(0)
document.writeln( "<p>The first child of root node is:" );
document.writeln( "<strong>" + currentNode.nodeName+ "</strong>");
document.writeln( "<br>whose next sibling is:" );
// get the next sibling of first child
var nextSib = currentNode.nextSibling;
document.writeln( "<strong>" + nextSib.nodeName+ "</strong>." );
document.writeln( "<br>Value of <strong>" + nextSib.nodeName
+ "</strong> element is:" );
DOM with JavaScript (3)
var value = nextSib.firstChild;
// print the text value of the sibling
document.writeln( "<em>" + value.nodeValue + "</em>" );
document.writeln( "<br>Parent node of " );
document.writeln( "<strong>" + nextSib.nodeName+ "</strong> is:" );
document.writeln( "<strong>" + nextSib.parentNode.nodeName
+ "</strong>.</p>" );
</script></body></html>
<?xml version = "1.0"?>
<!-- Fig. 8.2: article.xml -->
<article>
<title>Simple XML</title>
<date>December 6, 2000</date>
<author>
<fname>Tem</fname>
<lname>Nieto</lname>
</author>
<summary>XML is pretty easy.</summary>
<content>Once you have mastered HTML, XML is easily
learned. You must remember that XML is not for
displaying information but for managing information.
</content>
</article>
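For readers without the MS XML parser, the same DOM navigation (documentElement, childNodes, firstChild, nextSibling, parentNode) can be tried with Python's xml.dom.minidom. This is a sketch on an abbreviated copy of article.xml, not the Deitel code. One difference to watch: minidom keeps whitespace-only text nodes as children, whereas the JavaScript above appears to assume they are not reported, so the sketch filters for element nodes.

```python
from xml.dom import minidom

# Abbreviated copy of article.xml (the <content> element is omitted).
ARTICLE_XML = """<?xml version="1.0"?>
<article>
  <title>Simple XML</title>
  <date>December 6, 2000</date>
  <author><fname>Tem</fname><lname>Nieto</lname></author>
  <summary>XML is pretty easy.</summary>
</article>"""

doc = minidom.parseString(ARTICLE_XML)
root = doc.documentElement

# Keep only element children; indentation shows up as text nodes.
children = [n for n in root.childNodes if n.nodeType == n.ELEMENT_NODE]
child_names = [n.nodeName for n in children]

first = children[0]                     # first child element of the root
next_sib = children[1]                  # its next element sibling
value = next_sib.firstChild.nodeValue   # text inside that sibling

print(root.nodeName, child_names)
print(first.nodeName, next_sib.nodeName, value, next_sib.parentNode.nodeName)
```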
Case Study: EDI



EDI (Electronic Data Interchange) aims at eliminating the use of paper for business data exchange
Single point of information capture, electronic delivery, low storage and retrieval costs
Statistics show that only the top 10,000 companies on a global scale are using EDI
In the rest of the business world, only 5% use EDI; all others use paper
The Case of a Small Business

Arthur runs a music wholesaling business
He buys CDs from publishers using EDI or by fax
He sells CDs to shops, taking orders by mail, phone, fax, or over the Web
His ordering using EDI is actually worse than fax
Big business (the record company) benefits, the small guy (Arthur) suffers
His suppliers all use different EDI standards
Arthur has to use four PCs, one for each supplier, running some expensive software to produce EDI orders and accept EDI invoices
Worse, none of the systems links to his accounting system
Problems of EDI

Information coded in EDI is not self-describing
A message might look like: “Wing Discspinner Music”, “Music distributor”, “Wing Discspinner”, [email protected]
Systems must be 100% compatible in the message structures they understand:
Imagine what happens when adding a new field: “Arthur Discspinner Music”, “Music distributor”, “Arthur Discspinner”, “0118 912 3456”, [email protected].
In general, the EDI system will report an error
So companies need to band together to define standards
EDI by XML

Using XML, the same info will be coded as
<Company>Arthur Discspinner Music</Company>
<MarketSector>Music distributor</MarketSector>
<Contact>
<Name>Arthur Discspinner</Name>
<Phone>0118 912 3456</Phone>
<Email>[email protected]</Email>
</Contact>

The software will access the data by element
name
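Because each value is found by element name rather than by position, a missing or extra field does not shift the others. A minimal sketch with Python's ElementTree follows. The <Supplier> wrapper element and the Fax lookup are assumptions added for illustration (a parser needs a single root element), and the Email element is left out.

```python
import xml.etree.ElementTree as ET

# <Supplier> is a hypothetical root wrapping the fragment from the slide.
RECORD = """
<Supplier>
  <Company>Arthur Discspinner Music</Company>
  <MarketSector>Music distributor</MarketSector>
  <Contact>
    <Name>Arthur Discspinner</Name>
    <Phone>0118 912 3456</Phone>
  </Contact>
</Supplier>
"""

root = ET.fromstring(RECORD)

company = root.findtext("Company")
phone = root.findtext("Contact/Phone")
# A field the sender did not supply is simply absent, not an error.
fax = root.findtext("Contact/Fax", default="(none)")

print(company, phone, fax)
```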
New Vocabularies for E-Business

What Arthur is/we are looking for
A system that can link accounting systems over the Web or by email
A “many-to-many” solution
“Flexible interoperability”
XML can achieve all this
To use XML to define vocabularies for business relationships and transactions
An example: ebXML (http://www.ebxml.org/)
Element Type Declarations

The <!ELEMENT …> line is an element type declaration (ETD):
<!DOCTYPE myMessage [
<!ELEMENT myMessage ( #PCDATA )>
]>
#PCDATA means “parsed character data”: the text will be parsed, and hence characters such as <, >, &, etc. are treated specially
EMPTY – no content allowed
ANY – anything allowed (poor design)
Deitel Fig. 6.1 & 6.2 (intro.xml and intro.dtd)
intro.dtd
<!ELEMENT myMessage ( message )>
<!ELEMENT message ( #PCDATA )>
intro.xml
<!DOCTYPE myMessage SYSTEM "intro.dtd">
<myMessage>
<message>Welcome to XML!</message>
</myMessage>
,|+*?
,   Sequence       <!ELEMENT classroom (teacher, student)>
|   Choice         < … (iceCream | pastry)>
+   One or more    < … (songTitle, duration)+ … >
*   Zero or more   < … (book*)>
?   Zero or one    < … (person?)>
Examples



<!ELEMENT class
( number, instructor, demtors+,
( assignment+ | project ), test*, exam,
( credit | noCredit ) )>
<!ELEMENT farm
( farmer+, ( dog* | cat? ), pig*,
( goat | cow )?, ( chicken+ | duck* ) )>
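A content model reads much like a regular expression over the sequence of child element names. As a rough illustration only (a real validating parser does this from the DTD itself), the farm model is transcribed by hand into a Python regular expression below. The helper conforms and the trick of ending every name with a space are assumptions of this sketch.

```python
import re

# Hand transcription of:
# ( farmer+, ( dog* | cat? ), pig*, ( goat | cow )?, ( chicken+ | duck* ) )
# Each child name is followed by one space so names cannot run together.
FARM_MODEL = re.compile(
    r"(farmer )+"
    r"((dog )*|(cat )?)"
    r"(pig )*"
    r"((goat )|(cow ))?"
    r"((chicken )+|(duck )*)"
)


def conforms(children):
    """Check a list of child element names against the farm content model."""
    sequence = "".join(name + " " for name in children)
    return FARM_MODEL.fullmatch(sequence) is not None


print(conforms(["farmer", "dog", "dog", "pig", "cow", "chicken"]))
print(conforms(["farmer", "dog", "cat"]))  # dogs and a cat cannot be mixed
print(conforms(["pig"]))                   # at least one farmer is required
```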
Fig. 6.5 (mixed.xml)
<!DOCTYPE format [
<!ELEMENT format ( #PCDATA | bold | italic )*>
<!ELEMENT bold ( #PCDATA )>
<!ELEMENT italic ( #PCDATA )>
]>
<format>
This is a simple formatted sentence.
<bold>I have tried bold.</bold>
<italic>I have tried italic.</italic>
Now what?
</format>
Attribute Declarations





<!ELEMENT car EMPTY>
<!ATTLIST car doors CDATA #REQUIRED>
<!ELEMENT point EMPTY>
<!ATTLIST point
x CDATA #REQUIRED
y CDATA #REQUIRED >
Equivalently:
<!ELEMENT point EMPTY>
<!ATTLIST point x CDATA #REQUIRED >
<!ATTLIST point y CDATA #REQUIRED >
CDATA: non-parsed character data, except that <, >, &, ' and " must be escaped
#REQUIRED: attribute must be provided
<car doors="4"/>
#IMPLIED: the attribute is optional; the application can derive its value if the attribute does not appear
#FIXED: only one possible value, as specified, if the attribute is present
<!ATTLIST po ...
confirmed CDATA #FIXED "yes">
Attribute Types



ID – a key that uniquely identifies an element
IDREF – points to an element with a matching ID attribute
Enumerated attribute types (with default values)
<!ATTLIST person gender ( M | F ) "F">
<?xml version = "1.0"?>
<!-- Fig. 6.8: IDExample.xml -->
<!DOCTYPE bookstore [
<!ELEMENT bookstore ( shipping+, book+ )>
<!ELEMENT shipping ( duration )>
<!ATTLIST shipping shipID ID #REQUIRED>
<!ELEMENT book ( #PCDATA )>
<!ATTLIST book shippedBy
IDREF #IMPLIED>
<!ELEMENT duration ( #PCDATA )>
]>
<bookstore>
<shipping shipID = "s1">
<duration>2 to 4 days</duration>
</shipping>
<shipping shipID = "s2">
<duration>1 day</duration>
</shipping>
<book shippedBy = "s2">
Java How to Program 3rd edition.
</book>
<book shippedBy = "s2">
C How to Program 3rd edition.
</book>
</bookstore>
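An application follows an IDREF by looking up the element whose ID attribute has the same value. The sketch below does this by hand with Python's ElementTree, which does not validate against the DTD, so the uniqueness of shipID is assumed rather than checked. The DOCTYPE is omitted and the dictionary by_id is a made-up helper.

```python
import xml.etree.ElementTree as ET

# Instance part of IDExample.xml, written compactly.
BOOKSTORE = """
<bookstore>
  <shipping shipID="s1"><duration>2 to 4 days</duration></shipping>
  <shipping shipID="s2"><duration>1 day</duration></shipping>
  <book shippedBy="s2">Java How to Program 3rd edition.</book>
  <book shippedBy="s2">C How to Program 3rd edition.</book>
</bookstore>
"""

root = ET.fromstring(BOOKSTORE)

# Index the elements that carry the ID attribute.
by_id = {s.get("shipID"): s for s in root.iter("shipping")}

# Resolve each book's IDREF to the shipping element it points to.
shipped = [
    (book.text.strip(), by_id[book.get("shippedBy")].findtext("duration"))
    for book in root.iter("book")
]
for title, duration in shipped:
    print(title, "->", duration)
```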
More Attribute Types

NMTOKEN / NMTOKENS – name token / tokens, each consisting of only letters, digits, periods, underscores, hyphens and colons
<!ELEMENT born EMPTY>
<!ATTLIST born year NMTOKEN #REQUIRED>
Conforming element: <born year="1934" />
The attribute value need not start with a letter
ENTITY – the attribute value must be a declared entity referring to an external unparsed entity
<!ENTITY city … >
<!ENTITY boat … >
<!ENTITY train … >
<!ATTLIST company tour ENTITY #REQUIRED>
Conforming element: <company tour = "city">
ENTITIES – one or more of the above ENTITY
<!ATTLIST company tourset ENTITIES #REQUIRED>
Conforming element: <company tourset = "city boat train">
(Assume city, boat, train are declared entities.)
Limitations of DTD of XML 1.0










DTD not extensible
Only one DTD per document
Limited support of namespaces
Weak data typing
No inheritance
Document can override an external DTD
Non-XML syntax
No (direct) DOM support
Limited tools
Cannot specify cardinality
Schema vs. DTD

DTD is weak in data typing
e.g., a DTD cannot reject <quantity>hello</quantity>
Schemas are XML documents which can be manipulated like other XML documents
Valid schemas conform to DTDs
Schemas have more detailed and robust content models
Schemas are extensible
Dynamic schemas – can be modified at runtime
Location Paths


An expression to specify how to navigate
from one node to another
A series of location steps, leading to the
target node(s)


The starting point is a context node
A location step has an axis, a node test, and
an optional predicate; for example

child::text()[position()=1]
child is the axis, text() the node test, and
[position()=1] a predicate
String-value & Expanded-name

Every node has a string representation – its
string value



Used in comparison
Some string-values are concatenation of all the
string-values of descendant nodes
Some types of node also have an expanded-name


A pair consisting of a local part (a string) and a
namespace URI (could be null)
Used to locate specific nodes in the tree
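For an element node, the string-value is the concatenation of all descendant text, in document order. ElementTree's itertext() walks the text in that order, so joining its output gives a close approximation. The small author fragment below is illustrative.

```python
import xml.etree.ElementTree as ET

author = ET.fromstring(
    "<author><fname>Tem</fname> <lname>Nieto</lname></author>"
)

# Concatenate every piece of text inside <author>, in document order.
string_value = "".join(author.itertext())
print(string_value)  # Tem Nieto
```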
XPath Axes
Axis Name            Ordering   Description
self                 none       The context node itself.
parent               reverse    The context node’s parent, if one exists.
child                forward    The context node’s children, if they exist.
ancestor             reverse    The context node’s ancestors, if they exist.
ancestor-or-self     reverse    The context node’s ancestors and also itself.
descendant           forward    The context node’s descendants.
descendant-or-self   forward    The context node’s descendants and also itself.
following            forward    The nodes in the XML document following the context node, not including descendants.
following-sibling    forward    The sibling nodes following the context node.
preceding            reverse    The nodes in the XML document preceding the context node, not including ancestors.
preceding-sibling    reverse    The sibling nodes preceding the context node.
attribute            forward    The attribute nodes of the context node.
namespace            forward    The namespace nodes of the context node.
Some Node Tests


To refine the set of nodes selected by an axis
Select by node name or type
Node Test                  Description
*                          Selects all nodes of the same principal node type.
node()                     Selects all nodes, regardless of their type.
text()                     Selects all text nodes.
comment()                  Selects all comment nodes.
processing-instruction()   Selects all processing-instruction nodes.
node name                  Selects all nodes with the specified node name.
Examples:
child::*
child::text()
child::*/child::text()
Abbreviations


body = child::body
//body = /descendant-or-self::node()/child::body
Location Path                  Description
child::                        This axis is used by default if no axis is supplied and may therefore be omitted.
attribute::                    The attribute axis may be abbreviated as @.
/descendant-or-self::node()/   This location path is abbreviated as two slashes (//).
self::node()                   The context node is abbreviated with a period (.).
parent::node()                 The context node’s parent is abbreviated with two periods (..).
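The abbreviated forms can be tried out with Python's ElementTree, which supports a subset of XPath in its abbreviated syntax only. It is a sketch on a made-up page, with each comment giving the unabbreviated path. Paths are written relative to the root element because ElementTree does not accept absolute paths on an element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
page = ET.fromstring(
    "<html><head><title>Demo</title></head>"
    "<body><p id='intro'>one</p><div><p>two</p></div></body></html>"
)

# body    abbreviates  child::body
bodies = [e.tag for e in page.findall("body")]

# .//p    abbreviates  self::node()/descendant-or-self::node()/child::p
paragraphs = [e.text for e in page.findall(".//p")]

# p[@id]  abbreviates  child::p[attribute::id]
with_id = [e.get("id") for e in page.findall("body/p[@id]")]

# ..      abbreviates  parent::node()
parents = sorted({e.tag for e in page.findall(".//p/..")})

print(bodies, paragraphs, with_id, parents)
```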
Node-set Operators
Node-set Operator    Description
pipe (|)             Performs the union of two node-sets.
slash (/)            Separates location steps.
double-slash (//)    Abbreviation for the location path /descendant-or-self::node()/
Examples:
head|body
//book
Node-set Functions
Node-set Function           Description
last()                      Returns the number of nodes in the node-set.
position()                  Returns the position number of the current node in the node-set being tested.
count( node-set )           Returns the number of nodes in node-set.
id( string )                Returns the element node whose ID attribute matches the value specified by argument string.
local-name( node-set )      Returns the local part of the expanded-name for the first node in node-set.
namespace-uri( node-set )   Returns the namespace URI of the expanded-name for the first node in node-set.
name( node-set )            Returns the qualified name for the first node in node-set.
Examples:
head/title[last()]
book[position()=3]
book[3]
count(*)
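Positional predicates can be checked quickly with ElementTree. It is a sketch on a made-up library document. ElementTree accepts the short forms book[3] and book[last()] but not the spelled-out book[position()=3], and it has no count() function, so the length of the result list stands in for count(*).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
library = ET.fromstring(
    "<library>"
    "<book>A</book><book>B</book><book>C</book><cd>D</cd>"
    "</library>"
)

third = library.find("book[3]")           # same node as book[position()=3]
last_book = library.find("book[last()]")  # last book child
child_count = len(library.findall("*"))   # stands in for count(*)

print(third.text, last_book.text, child_count)
```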