Transcript Title Slide

CS 253: Topics in Database
Systems: C2
Dr. Alexandra I. Cristea
http://www.dcs.warwick.ac.uk/~acristea/
• Previously we looked at:
– XML
– XSL
– XSLT
• Next:
– XPath
– XQuery
2
XPath
3
XPath
• XPath is a syntax for defining parts of an XML
document
• XPath uses path expressions to navigate in
XML documents
• XPath contains a library of standard functions
• XPath is a major element in XSLT
• XPath is a W3C recommendation, thus a
standard (XPath 1.0: 16 November 1999)
4
XPath Path Expressions
• Uses path expressions to select nodes
or node-sets in an XML document.
– These path expressions look very much
like the expressions you see when you
work with a traditional computer file
system.
5
XPath Standard Functions
• XPath 2.0 provides over 100 built-in functions for:
– string values,
– numeric values,
– date and time comparison,
– node and QName manipulation,
– sequence manipulation,
– Boolean values,
– and more.
6
XPath Terminology
• Nodes
• Atomic values
• Items (atomic values or nodes)
• Relationships of nodes
– Parent
– Children
– Siblings
– Ancestors
– Descendants
7
XPath Nodes
• 7 kinds of nodes:
1. element,
2. attribute,
3. text,
4. namespace,
5. processing-instruction,
6. comment, and
7. document (root) nodes.
• XML documents are treated as trees of nodes. The root
of the tree is called the document node (or root node).
8
Nodes Examples
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
<book>
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
</bookstore>
• Document (root) node: the top of the tree; its child <bookstore> is the root element
• Element node: e.g. <author>J K. Rowling</author>
• Attribute node: lang="en"
9
Atomic values Examples*
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
<book>
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
</bookstore>
*nodes with no children or parent
10
Selecting nodes
Expression   Description
nodename     Selects all child elements named nodename of the current node
/            Selects from the root node
//           Selects nodes in the document from the current node down
             that match the selection, no matter where they are
.            Selects the current node
..           Selects the parent of the current node
@            Selects attributes
11
Examples of selecting nodes
Path Expression    Result
bookstore          Selects all bookstore elements that are children of the
                   current node
/bookstore         Selects the root element bookstore
                   Note: If the path starts with a slash ( / ) it always
                   represents an absolute path to an element!
bookstore/book     Selects all book elements that are children of bookstore
//book             Selects all book elements no matter where they are in
                   the document
bookstore//book    Selects all book elements that are descendants of the
                   bookstore element, no matter where they are under the
                   bookstore element
//@lang            Selects all attributes that are named lang
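The path expressions above can be tried out with Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath. This is a minimal sketch under assumptions: the sample data is made up (modeled on the bookstore example), paths are written relative to the root element (so "book" stands in for bookstore/book and ".//book" for //book), and attribute values are read in Python because ElementTree cannot return attribute nodes.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data modeled on the bookstore example in the slides.
BOOKS_XML = """
<bookstore>
  <book>
    <title lang="en">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="en">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>
"""

root = ET.fromstring(BOOKS_XML)  # root is the <bookstore> element

# bookstore/book: book elements that are children of bookstore.
child_books = root.findall("book")

# //book: book elements anywhere below the starting point.
all_books = root.findall(".//book")

# //@lang: select the elements that carry the attribute,
# then read the attribute value in Python.
langs = [e.get("lang") for e in root.findall(".//*[@lang]")]

print(len(child_books), len(all_books), langs)
```

A full XPath processor would accept the absolute forms (/bookstore/book, //@lang) directly; the relative spelling here is a limitation of ElementTree, not of XPath.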
12
Predicates
• Predicates are used to find a specific
node or a node that contains a specific
value.
• Predicates are always embedded in
square brackets.
13
Example predicates
Path Expression                  Result
/bookstore/book[1]               Selects the first book element that is the
                                 child of the bookstore element
/bookstore/book[last()]          Selects the last book element that is the
                                 child of the bookstore element
/bookstore/book[last()-1]        Selects the last but one book element that
                                 is the child of the bookstore element
/bookstore/book[position()<3]    Selects the first two book elements that
                                 are children of the bookstore element
14
Example predicates – cont.
Path Expression                       Result
//title[@lang]                        Selects all the title elements that
                                      have an attribute named lang
//title[@lang='eng']                  Selects all the title elements that
                                      have an attribute named lang with a
                                      value of 'eng'
/bookstore/book[price>35.00]          Selects all the book elements of the
                                      bookstore element that have a price
                                      element with a value greater than 35.00
/bookstore/book[price>35.00]/title    Selects all the title elements of the
                                      book elements of the bookstore
                                      element that have a price element
                                      with a value greater than 35.00
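The predicates above can be sketched with Python's xml.etree.ElementTree. The assumptions are labeled: the sample data is hypothetical, paths are relative to the root element, and because ElementTree's XPath subset has no numeric comparison such as price>35.00, that filter is done in plain Python instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the last title deliberately has no lang attribute.
BOOKS_XML = """
<bookstore>
  <book><title lang="en">Everyday Italian</title><price>30.00</price></book>
  <book><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book><title lang="en">XQuery Kick Start</title><price>49.99</price></book>
  <book><title>Learning XML</title><price>39.95</price></book>
</bookstore>
"""
root = ET.fromstring(BOOKS_XML)


def titles(books):
    """Return the title text of each book element."""
    return [b.findtext("title") for b in books]


first = titles(root.findall("book[1]"))                # book[1]
last = titles(root.findall("book[last()]"))            # book[last()]
last_but_one = titles(root.findall("book[last()-1]"))  # book[last()-1]

# //title[@lang] and //title[@lang='en']
with_lang = [t.text for t in root.findall(".//title[@lang]")]
lang_en = [t.text for t in root.findall(".//title[@lang='en']")]

# book[price>35.00]/title, with the comparison done in Python.
expensive = [
    b.findtext("title")
    for b in root.findall("book")
    if float(b.findtext("price")) > 35.00
]

print(first, last, last_but_one, expensive)
```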
15
Selecting Unknown Nodes
Wildcard    Description
*           Matches any element node
@*          Matches any attribute node
node()      Matches any node of any kind
16
Example: selecting several paths
Path Expression                    Result
//book/title | //book/price        Selects all the title AND price
                                   elements of all book elements
//title | //price                  Selects all the title AND price
                                   elements in the document
/bookstore/book/title | //price    Selects all the title elements of the
                                   book element of the bookstore
                                   element AND all the price elements
                                   in the document
17
Location Path Expression
• A location path can be absolute or
relative.
• An absolute location path: /step/step/...
• A relative location path: step/step/...
• Location step:
axisname::nodetest[predicate]
18
XPath Axes
self
child
parent
ancestor
descendant
ancestor-or-self
descendant-or-self
preceding-sibling
following-sibling
preceding
following
attribute
namespace
19
AxisName              Result
ancestor              Selects all ancestors (parent, grandparent, etc.) of the current node
ancestor-or-self      Selects all ancestors (parent, grandparent, etc.) of the current node
                      and the current node itself
attribute             Selects all attributes of the current node
child                 Selects all children of the current node
descendant            Selects all descendants (children, grandchildren, etc.) of the current node
descendant-or-self    Selects all descendants (children, grandchildren, etc.) of the
                      current node and the current node itself
following             Selects everything in the document after the closing tag of the current node
following-sibling     Selects all siblings after the current node
namespace             Selects all namespace nodes of the current node
parent                Selects the parent of the current node
preceding             Selects all nodes that appear before the current node in the document,
                      except ancestors, attribute nodes and namespace nodes
preceding-sibling     Selects all siblings before the current node
self                  Selects the current node
20
axisname::nodetest[predicate]
• //DDD/parent::*
<AAA>
<BBB>
<DDD>
</DDD>
</BBB>
</AAA>
21
axisname::nodetest[predicate]
• //BBB/child::*
<AAA>
<BBB>
<DDD>
</DDD>
</BBB>
</AAA>
Note: /AAA is equivalent to /child::AAA
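The two axis examples above can be checked with a small sketch in Python's xml.etree.ElementTree. One assumption to note: ElementTree does not understand the full axisname:: syntax, so the abbreviated forms are used instead, with ".." standing in for parent::* and "*" for child::*.

```python
import xml.etree.ElementTree as ET

# The AAA/BBB/DDD document from the slides.
root = ET.fromstring("<AAA><BBB><DDD></DDD></BBB></AAA>")

# //DDD/parent::*  written in abbreviated form as .//DDD/..
parents_of_ddd = [e.tag for e in root.findall(".//DDD/..")]

# //BBB/child::*   written in abbreviated form as .//BBB/*
children_of_bbb = [e.tag for e in root.findall(".//BBB/*")]

print(parents_of_ddd, children_of_bbb)
```

So //DDD/parent::* selects the BBB element, and //BBB/child::* selects the DDD element.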
22
More examples
• http://www.zvon.org/xxl/XPathTutorial/General/examples.html
– Check basics, //, *, predicates, attributes, functions
(new ones: count, name, normalize-space, starts-with,
contains, string-length, floor, ceiling), axes,
operators (mod)
– Note: The ancestor, descendant, following,
preceding and self axes partition a document
(ignoring attribute and namespace nodes): they do
not overlap and together they contain all the
nodes in the document. (see example)
23
XPath Conclusion
• We have learned:
– XPath definition
– Path expressions
– Standard functions
– Terminology
– Predicates
– Location paths
– Axes
– Some operators
24
• Before we go on, one more thing about
XML:
• XML Namespaces
25
Naming ambiguity
26
The Idea to Solve it
• Assign a URI (~ URL) to every sublanguage:
– E.g., for XHTML 1.0:
http://www.w3.org/1999/xhtml
• Qualify element names with URIs:
– {http://www.w3.org/1999/xhtml}head
Web Naming and Addressing Overview (URIs, URLs, ...)
27
The actual solution
• Namespace declarations bind URIs to
prefixes:
• Default namespace (no prefix) declared
with: xmlns="…"
• Lexical Scope
• Attribute names can also be prefixed
28
Applying namespaces
29
• Next we look at how to query XML
• This can be done, to some extent, as
we have seen, within XSLT,
• but the main language developed for
this purpose is …
30
XQuery
31
What is XQuery?
• XQuery is the language for querying XML data
• XQuery for XML is like SQL for databases
• XQuery is built on XPath expressions
• XQuery is defined by the W3C
• XQuery is supported by all the major database
engines (IBM, Oracle, Microsoft, etc.)
• XQuery is a W3C recommendation (Jan 2007)
thus a standard
32
Maturity Levels Towards W3C
Recommendation
• Working Draft (WD)
• Candidate Recommendation (CR)
• Proposed Recommendation (PR)
• W3C Recommendation (REC)
33
XQuery and XPath
• XQuery 1.0 and XPath 2.0 share the
same data model and support the same
functions and operators.
34
XQuery - Examples of Use
• Extract information to use in a Web Service
• Generate summary reports
• Transform XML data to XHTML
• Search Web documents for relevant
information
35
Usage Scenario:
Document-Oriented
• Queries could be used
– To retrieve parts of documents
– To provide dynamic indexes
– To perform context-sensitive searching
– To generate new documents as
combinations of existing ones
36
Usage Scenario: Programming
• Queries could be used to automatically
generate documentation
37
Usage Scenario: Hybrid
• Queries could be used to data mine
hybrid data, such as patient records
38
XQuery compared to XPath
• XQuery 1.0 is a strict superset of XPath 2.0
• Every XPath 2.0 expression is directly an XQuery
1.0 expression (a query)
• The extra expressive power is the ability to:
– Join information from different sources and
– Generate new XML fragments
39
Relationship to XSLT
• XQuery, XSLT: both domain-specific
languages for combining and transforming
data from multiple sources
• different in design - historical reasons
– XQuery: designed from scratch
– XSLT: intellectual descendant of CSS
• technically, they may emulate each other
40
XQuery query makeup
• Prolog
– Like XPath, XQuery expressions are
evaluated relative to a context
– explicitly provided by a prolog (header)
~ header with definitions
• Body
– The actual query
41
XQuery Ex.: Prolog + Query
42
XQuery Prolog (i.e., header(s))
• Settings define various parameters for the XQuery processor language,
such as:
xquery version "1.0";
module namespace math = "http://example.org/mathfunctions";
declare base-uri "http://example.org";
declare default element namespace
"http://example.org/names";
declare namespace xs = "http://www.w3.org/2001/XMLSchema";
import module "http://www.w3.org/2003/05/xpathfunctions" at "logo.xq";
declare variable $x as xs:integer := 7;
declare function addLogo($root as node()) as node()*{ };
(: etc :)
43
XQuery body:
XQuery capabilities
• Generate
• Join
• Select
44
Generate: constructors
• XQuery expressions may compute new XML nodes
• Expressions may denote:
– element, character data, comment and processing
instruction nodes
• Each node is created with a unique node identity
• Constructors may be either
– direct or
– computed
45
Direct constructors in XQuery
<XMLfragment>my fragment </XMLfragment>
• Evaluates to the given XML fragment
• Try out at*:
• http://support.x-hive.com/xquery/index.html
46
Explicit, computed constructors
47
Variable bindings
(implicit constructors)
<employee empid="{$id}">
<name>{$name}</name>
{$job} <deptno>{$deptno}</deptno>
<salary>{$SGMLspecialist+100000}</salary>
</employee>
48
How to Select Nodes with XQuery?
• Functions
– XQuery uses functions to extract data from XML
documents.
• (X)Path Expressions
– XQuery uses path expressions to navigate
through elements in an XML document.
• Predicates
– XQuery uses predicates to limit the extracted
data from XML documents.
49
Functions
• doc()
– function to open a file
• Example:
– doc("books.xml")
• Note: A call to a function can appear
where an expression may appear.
50
Path Expressions
• Example:
select all the title elements in the "books.xml"
file:
doc("books.xml")/bookstore/book/title
51
Predicates
• Example:
select all the book elements under the
bookstore element that have a price
element with a value that is less than 30 :
doc("books.xml")/bookstore/book[price<30]
52
At a glance: function, path, predicate
53
FLWOR
• For, Let, Where, Order by, Return
= main engine
~ SQL syntax (SFWH)
~ programs and function calls
54
FLWOR by comparison with
Path expressions
• select all the title elements under the book elements that are
under the bookstore element that have a price element with
a value that is higher than 30.
• Path expression:
doc("books.xml")/bookstore/book[price>30]/title
• FLWOR expression:
for $x in doc("books.xml")/bookstore/book
where $x/price>30
return $x/title
55
Sorting in FLWOR
• for $x in
doc("books.xml")/bookstore/book
where $x/price>30
order by $x/title
return $x/title
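For readers without an XQuery processor at hand, the for / where / order by / return pipeline above can be mirrored step by step in Python. This is an analogy rather than XQuery: the sample data is hypothetical and xml.etree.ElementTree stands in for doc("books.xml").

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for books.xml.
BOOKS_XML = """
<bookstore>
  <book><title lang="en">Everyday Italian</title><price>30.00</price></book>
  <book><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book><title lang="en">XQuery Kick Start</title><price>49.99</price></book>
  <book><title lang="en">Learning XML</title><price>39.95</price></book>
</bookstore>
"""
root = ET.fromstring(BOOKS_XML)

# for $x in doc("books.xml")/bookstore/book
books = root.findall("book")

# where $x/price > 30
selected = [x for x in books if float(x.findtext("price")) > 30]

# order by $x/title
selected.sort(key=lambda x: x.findtext("title"))

# return $x/title
result = [x.find("title") for x in selected]
titles = [t.text for t in result]

print(titles)
```

Each clause maps to one statement, which is the sense in which FLWOR resembles both SQL and ordinary program code.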
56
Present the Result In an HTML List
<ul>
{
for $x in
doc("books.xml")/bookstore/book/title
order by $x
return <li>{$x}</li>
}
</ul>
57
Result HTML List
<ul>
<li><title lang="en">Everyday Italian</title></li>
<li><title lang="en">Harry Potter</title></li>
<li><title lang="en">Learning XML</title></li>
<li><title lang="en">XQuery Kick Start</title></li>
</ul>
58
Eliminate element (here: title)
<ul>
{
for $x in doc("books.xml")/bookstore/book/title
order by $x
return <li>{data($x)}</li> (: also text() :)
}
</ul>
59
New result HTML List
<ul>
<li>Everyday Italian</li>
<li>Harry Potter</li>
<li>Learning XML</li>
<li>XQuery Kick Start</li>
</ul>
60
Another FLWOR Expression
61
The Difference between for and let
62
The Difference between for and let
:=
in
63
The Difference between for and let
64
The Difference between for and let
65
FLWOR Basic Building Blocks
66
General rules
• for and let may be used many times in
any order
• only one where is allowed
• many different sorting criteria can be
specified (descending, ascending, etc.)
67
Reversing order
• Reverses the order of a sequence, for
nodes or atomic values
• reverse((1, 2, 3))
-> (3, 2, 1)
68
Joining documents
for $p in doc("www.irs.gov/taxpayers.xml")//person
for $n in doc("neighbors.xml")//neighbor[ssn = $p/ssn]
return
<person>
<ssn> { $p/ssn } </ssn>
{ $n/name }
<income> { $p/income } </income>
</person>
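The join above is a nested loop over two documents with a matching condition on ssn. A rough Python equivalent, with made-up inline data in place of the two files and a hypothetical helper name join_people, might look like this:

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for taxpayers.xml and neighbors.xml.
TAXPAYERS = """
<taxpayers>
  <person><ssn>111</ssn><income>50000</income></person>
  <person><ssn>222</ssn><income>70000</income></person>
</taxpayers>
"""
NEIGHBORS = """
<neighbors>
  <neighbor><ssn>222</ssn><name>Ann</name></neighbor>
</neighbors>
"""


def join_people(taxpayers_xml, neighbors_xml):
    """Nested-loop join on ssn, building one new <person> per match."""
    taxpayers = ET.fromstring(taxpayers_xml)
    neighbors = ET.fromstring(neighbors_xml)
    result = []
    for p in taxpayers.iter("person"):          # for $p in ...//person
        for n in neighbors.iter("neighbor"):    # for $n in ...//neighbor
            if n.findtext("ssn") == p.findtext("ssn"):   # [ssn = $p/ssn]
                person = ET.Element("person")   # element constructor
                ET.SubElement(person, "ssn").text = p.findtext("ssn")
                ET.SubElement(person, "name").text = n.findtext("name")
                ET.SubElement(person, "income").text = p.findtext("income")
                result.append(person)
    return result


joined = join_people(TAXPAYERS, NEIGHBORS)
joined_xml = [ET.tostring(p, encoding="unicode") for p in joined]
print(joined_xml)
```

Unlike the query, this sketch copies only the text of ssn and income into the new elements instead of embedding the original nodes.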
69
Two-way join in a where Clause
for $item in doc("ord.xml")//item,
$product in doc("cat.xml")//product
where $item/@num = $product/number
return
<item num="{$item/@num}"
name="{$product/name}"
quan="{$item/@quantity}" />
70
Aggregating
• Make summary calculations on grouped
data
• Functions:
– sum, avg, max, min, count
71
Conditionals
for $b in doc("bib.xml")/book
return
<short>
{$b/title}
<author>
{if ( count($b/author) < 3 )
then $b/author
else
( $b/author[1], <author>and others</author>)
}
</author>
</short>
72
Nesting Conditional Expressions
• Conditional expressions can be nested
• ‘else if’ functionality is provided
• if ( count($b/author) = 1 )
then $b/author
else if ( count($b/author) = 2 ) then (: ... :)
else ( $b/author[1], <author>and others</author> )
73
Logical Expressions
• and, or operators:
– and has precedence over or
– Parentheses can change precedence
if ($isDiscounted and ($discount
> 5 or $discount < 0 ) ) then 5
else $discount
• not function for negations:
if (not($isDiscounted)) then 0
else $discount
74
XQuery Built-in Functions
The XQuery function namespace URI is:
http://www.w3.org/2005/xpath-functions
default prefix: fn:
• E.g.: fn:string().
• Because fn: is the default prefix of the namespace, function
names do not need to be prefixed when called.
75
Built-in Functions
• String-related
– substring, contains, matches, concat, normalize-space, tokenize
• Date-related
– current-date, month-from-date, adjust-time-to-timezone
• Number-related
– round, avg, sum, ceiling
• Sequence-related
– index-of, insert-before, reverse, subsequence,
distinct-values
76
Built-in Functions (2)
• Node-related
– data, empty, exists, id, idref
• Name-related
– local-name, in-scope-prefixes, QName, resolve-QName
• Error handling and trapping
– error, trace, exactly-one
• Document and URI-related
– collection, doc, root, base-uri
77
Function calls
doc("books.xml")//book[substring(title,1,5)='Harry']
let $name := (substring($booktitle,1,4))
<name>{upper-case($booktitle)}</name>
78
• doc("http://www.dcs.warwick.ac.uk/~acristea/courses/CS253/books.xml")//book[substring(title,1,5)='Harry']
79
• for $x in doc("http://www.dcs.warwick.ac.uk/~acristea/courses/CS253/books.xml")//book/title
for $y in data($x)
for $name in (substring($y,1,4))
return $name
80
User Defined Functions
declare function
prefix:function_name($parameter as datatype)
as returnDatatype
{
(: ...function code here... :) };
81
User-defined Functions (Ex. 1)
• declare function local:depth($e as node()) as xs:integer
{
if (empty($e/*)) then 1
else max(for $c in $e/* return local:depth($c)) + 1
};
• for $b in doc("bib.xml")/book
return local:depth($b)
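The recursion in the depth function may be easier to follow in a general-purpose language. The following Python sketch mirrors it on an ElementTree element; the sample book is made up.

```python
import xml.etree.ElementTree as ET


def depth(e):
    """Depth of the element tree rooted at e; a leaf element has depth 1."""
    children = list(e)          # corresponds to $e/*
    if not children:            # corresponds to empty($e/*)
        return 1
    return max(depth(c) for c in children) + 1


# Hypothetical book: book -> author -> last is the longest chain.
book = ET.fromstring(
    "<book><title>T</title><author><last>Ullman</last></author></book>"
)
book_depth = depth(book)
print(book_depth)  # 3
```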
82
User Defined Functions (ex. 2)
declare function local:minPrice(
$price as xs:decimal?,
$discount as xs:decimal?)
as xs:decimal?
{
let $disc := ($price * $discount) div 100
return ($price - $disc)
};
(: example of how to call the function above:)
<minPrice>{local:minPrice($book/price,$book/discount)}
</minPrice>
83
Existential and Universal Quantifiers
• Return books where at least one author is "Ullman":
for $b in doc("bib.xml")/book
where some $author in $b/author
satisfies $author/text() = "Ullman"
return $b
• Return books where all authors are "Ullman":
for $b in doc("bib.xml")/book
where every $author in $b/author
satisfies $author/text() = "Ullman"
return $b
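The some and every quantifiers behave like Python's any() and all(). The sketch below uses a made-up bibliography to show the difference, including the edge case of a book without authors, for which every (like all()) is vacuously true.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bib.xml.
BIB_XML = """
<bib>
  <book id="A"><author>Ullman</author><author>Widom</author></book>
  <book id="B"><author>Ullman</author></book>
  <book id="C"/>
</bib>
"""
books = ET.fromstring(BIB_XML).findall("book")


def authors(book):
    """Text of every author child of the given book element."""
    return [a.text for a in book.findall("author")]


# where some $author in $b/author satisfies $author/text() = "Ullman"
some_ullman = [b.get("id") for b in books
               if any(a == "Ullman" for a in authors(b))]

# where every $author in $b/author satisfies $author/text() = "Ullman"
every_ullman = [b.get("id") for b in books
                if all(a == "Ullman" for a in authors(b))]

print(some_ullman, every_ullman)
```

Book C appears in the second result although it has no Ullman author at all, so a query meant to exclude such books needs an additional exists($b/author) style check.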
84
• for $b in doc("http://www.dcs.warwick.ac.uk/~acristea/courses/CS253/books.xml")//book
where some $author in $b/author
satisfies $author/text() = "Kurt Cagle"
return $b
85
Comments
86
Comparisons
• Value comparisons
eq, ne, lt, le, gt, ge
Used to compare individual values
Each operand must be a single atomic value (or a
node containing a single atomic value)
• General comparisons
=, !=, <, <=, >, >=
Can be used with sequences of multiple items
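The practical difference between the two families is that a general comparison is true as soon as any pair of items from the two sequences satisfies it. The following Python model is deliberately simplified (sequences are lists, and atomization and type promotion are ignored); the function names are invented for illustration.

```python
def general_eq(left, right):
    """Model of `=`: true if some item on the left equals some item on the right."""
    return any(a == b for a in left for b in right)


def general_ne(left, right):
    """Model of `!=`: true if some pair of items differs."""
    return any(a != b for a in left for b in right)


def value_eq(left, right):
    """Model of `eq`: operands must hold at most one item each."""
    if len(left) > 1 or len(right) > 1:
        raise TypeError("eq requires single values, not longer sequences")
    if not left or not right:
        return None  # an empty operand yields the empty sequence
    return left[0] == right[0]


print(general_eq([1, 2], [2, 3]))   # True: 2 occurs on both sides
print(general_ne([1, 2], [1, 2]))   # True: 1 differs from 2
print(value_eq([2], [2]))           # True
```

One consequence is that = and != can both be true for the same pair of sequences, which is why value comparisons are the safer choice when single values are expected.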
87
Example
88
XQuery on Distributed Sources
89
XQuery Syntax
• Declarative, functional language
~ SQL
• Nested expressions
• Case sensitive
• White spaces:
– Tabs, space, CR, LF
– Ignored between language constructs
– Significant in quoted strings
• No special EOL character
95
Keywords and names
• Keywords and operators
– Case-sensitive, generally lower case
– May have several meanings depending on the
context
• E.g. “*” or “in”
– No reserved words
• All names must be valid XML names
– For variables, functions, elements, attributes
– Can be associated with a namespace
96
XQuery gives you a choice:
• Path Expressions:
– If you just want to copy certain elements
and attributes as is
• FLWOR Expressions:
– Allow sorting
– Allow adding elements/attributes
– Verbose, but can be clearer
97
XQuery tools
• Stylus Studio 2007
http://www.stylusstudio.com/xml_download.html (free trial version)
– See also short XQuery intro at:
http://www.stylusstudio.com/xquery_primer.html
98
XML and programming
• XSLT, XPath and XQuery provide tools for
specialized tasks.
• But many applications are not covered:
– domain-specific tools for concrete XML
languages
– general tools that nobody has thought of yet
99
XML in general-purpose
programming languages
• parse XML documents into XML trees
• navigate through XML trees
• construct XML trees
• output XML trees as XML documents
• DOM and SAX are corresponding APIs that
are language independent and supported
by numerous languages. JDOM is an API
that is tailored to Java.
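The four tasks listed above can be seen in a few lines of Python. Note the assumption: this sketch uses xml.etree.ElementTree, a Python-specific tree API comparable in spirit to JDOM for Java, and not DOM or SAX themselves (Python also ships xml.dom and xml.sax for those).

```python
import xml.etree.ElementTree as ET

# 1. Parse an XML document into a tree (hypothetical sample data).
root = ET.fromstring(
    "<bookstore><book><title>Harry Potter</title></book></bookstore>"
)

# 2. Navigate through the tree.
first_title = root.find("book/title").text

# 3. Construct new nodes and attach them to the tree.
new_book = ET.SubElement(root, "book")
ET.SubElement(new_book, "title").text = "Learning XML"

# 4. Output the tree as an XML document again.
output = ET.tostring(root, encoding="unicode")

print(first_title)
print(output)
```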
100
XQuery Conclusion
• We have learned:
– XQuery definition
– Usage scenarios
– Comparison w. XSLT and XPath
– Capabilities
– Functions, path expressions and predicates
– FLWOR
– Extensions for generic programming with XML
101