Fig. 5.1 Simple XML document containing a message.

Chapter 5 – Creating Markup with XML
Outline
5.1 Introduction
5.2 Introduction to XML Markup
5.3 Parsers and Well-formed XML Documents
5.4 Parsing an XML Document with msxml
5.5 Characters
5.5.1 Character Set
5.5.2 Characters vs. Markup
5.5.3 White Space, Entity References and Built-in Entities
5.5.4 Using Unicode in an XML Document
5.6 Markup
5.7 CDATA Sections
5.8 XML Namespaces
5.9 Case Study: A Day Planner Application
© 2002 Prentice Hall, Inc. All rights reserved.
5.1 Introduction
• XML
– Technology for creating markup languages
– Enables document authors to describe data of any type
– Allows creating new tags
• HTML limits document authors to fixed tag set
5.2 Introduction to XML Markup
• XML document (intro.xml)
– Marks up message as XML
– Commonly stored in text files
• Extension .xml
Fig. 5.1 Simple XML document containing a message.
1 <?xml version = "1.0"?>
2
3 <!-- Fig. 5.1 : intro.xml -->
4 <!-- Simple introduction to XML markup -->
5
6 <myMessage>
7    <message>Welcome to XML!</message>
8 </myMessage>
• Line numbers are not part of XML document; we include them for clarity
• Document begins with declaration that specifies XML version 1.0
• Lines beginning with <!-- are comments
• Element message is child element of root element myMessage
5.2 Introduction to XML Markup (cont.)
• XML documents
– Must contain exactly one root element
• Attempting to create more than one root element is erroneous
– Elements must be nested properly
• Incorrect: <x><y>hello</x></y>
• Correct: <x><y>hello</y></x>
5.3 Parsers and Well-formed XML Documents
• XML parser
– Processes XML document
• Reads XML document
• Checks syntax
• Reports errors (if any)
• Allows programmatic access to document's contents
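The four parser tasks listed above can be sketched with Python's standard xml.etree.ElementTree module. This is only an illustration under assumptions: the slides themselves use msxml, and the helper names read_message and syntax_error are hypothetical.

```python
import xml.etree.ElementTree as ET

# The document from Fig. 5.1, without the illustrative line numbers.
INTRO_XML = """<?xml version = "1.0"?>
<!-- Fig. 5.1 : intro.xml -->
<!-- Simple introduction to XML markup -->
<myMessage>
   <message>Welcome to XML!</message>
</myMessage>"""


def read_message(xml_text):
    """Read and parse the document, then return the text of element message."""
    root = ET.fromstring(xml_text)      # raises ET.ParseError on a syntax error
    return root.find("message").text    # programmatic access to the contents


def syntax_error(xml_text):
    """Return the parser's error report, or None if the syntax is correct."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None


print(read_message(INTRO_XML))
print(syntax_error("<x><y>hello</x></y>"))  # improperly nested, so an error is reported
```

The same calls work on a file by replacing ET.fromstring(text) with ET.parse(path).getroot().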
5.3 Parsers and Well-formed XML Documents (cont.)
• XML document syntax
– Considered well formed if syntactically correct
• Single root element
• Each element has start tag and end tag
• Tags properly nested
• Attribute (discussed later) values in quotes
• Proper capitalization
– Case sensitive
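Each well-formedness rule above can be checked by handing a parser a document that breaks it. The sketch below uses Python's standard xml.etree.ElementTree parser as a stand-in; the helper is_well_formed and the sample strings are hypothetical illustrations, not part of the chapter.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on any syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One deliberately broken document per rule.
VIOLATIONS = {
    "single root element": "<a/><b/>",
    "start tag and end tag": "<a><b></a>",
    "tags properly nested": "<x><y>hello</x></y>",
    "attribute values in quotes": "<car doors = 4/>",
    "proper capitalization": "<Message>hi</message>",
}

for rule, text in VIOLATIONS.items():
    print(rule, "->", is_well_formed(text))

# A document that follows all five rules.
print(is_well_formed('<car doors = "4"><seat/></car>'))
```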
5.3 Parsers and Well-formed XML Documents (cont.)
• XML parsers support
– Document Object Model (DOM)
• Builds tree structure containing document data in memory
– Simple API for XML (SAX)
• Generates events when tags, comments, etc. are encountered
– (Events are notifications to the application)
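The difference between the two models can be seen by processing the same small document both ways. The sketch below uses Python's standard xml.dom.minidom and xml.sax modules, which are one DOM and one SAX implementation among many; the class name EventLogger is hypothetical.

```python
import xml.dom.minidom
import xml.sax

DOC = "<myMessage><message>Welcome to XML!</message></myMessage>"

# DOM: the whole document is built as a tree in memory, then navigated.
dom = xml.dom.minidom.parseString(DOC)
message_node = dom.documentElement.getElementsByTagName("message")[0]
dom_text = message_node.firstChild.data


# SAX: the parser notifies the application as it encounters each item.
class EventLogger(xml.sax.ContentHandler):
    """Record every start-tag, character-data and end-tag event."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def characters(self, content):
        self.events.append(("text", content))

    def endElement(self, name):
        self.events.append(("end", name))


logger = EventLogger()
xml.sax.parseString(DOC.encode("utf-8"), logger)

print(dom_text)
for event in logger.events:
    print(event)
```

DOM suits documents that must be revisited or modified; SAX keeps no tree, so it uses less memory on large inputs.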
5.4 Parsing an XML Document with msxml
• XML document
– Contains data
– Does not contain formatting information
– Load XML document into Internet Explorer 5.0
• Document is parsed by msxml.
• Places plus (+) or minus (-) signs next to container elements
– Plus sign indicates that all child elements are hidden
– Clicking plus sign expands container element
» Displays children
– Minus sign indicates that all child elements are visible
– Clicking minus sign collapses container element
» Hides children
• Error generated, if document is not well formed
Fig. 5.2 XML document shown in IE5.
Fig. 5.3 Error message for a missing end tag.
5.5 Characters
• Character set
– Characters that may be represented in XML document
• e.g., ASCII character set
– Letters of English alphabet
– Digits (0-9)
– Punctuation characters, such as !, - and ?
5.5.1 Character Set
• XML documents may contain
– Carriage returns
– Line feeds
– Unicode characters (Section 5.5.4)
• Enables computers to process characters for several languages
5.5.2 Characters vs. Markup
• XML must differentiate between
– Markup text
• Enclosed in angle brackets (< and >)
– e.g., Child elements
– Character data
• Text between start tag and end tag
– e.g., Fig. 5.1, line 7: Welcome to XML!
5.5.3 White Space, Entity References and Built-in Entities
• Whitespace characters
– Spaces, tabs, line feeds and carriage returns
• Significant (preserved by application)
• Insignificant (not preserved by application)
– Normalization
» Whitespace collapsed into single whitespace character
» Sometimes whitespace removed entirely
<markup>This is character
data</markup>
after normalization, becomes
<markup>This is character data</markup>
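Normalization as described above can be sketched in a few lines of Python. This is an assumption about what an application might do with character data it receives; the function name normalize_whitespace is hypothetical, and an XML parser does not necessarily apply this step itself.

```python
import re

# The four whitespace characters named above: space, tab, carriage return, line feed.
XML_WHITESPACE = re.compile(r"[ \t\r\n]+")


def normalize_whitespace(text):
    """Collapse each whitespace run into one space and drop leading/trailing runs."""
    return XML_WHITESPACE.sub(" ", text).strip(" ")


raw = "This   is\n   character\r\n\tdata"
print(normalize_whitespace(raw))
```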
5.5.3 White Space, Entity References and Built-in Entities (cont.)
• XML-reserved characters
– Ampersand (&)
– Left-angle bracket (<)
– Right-angle bracket (>)
– Apostrophe (')
– Double quote (")
• Entity references
– Allow XML-reserved characters to be used in character data
• Begin with ampersand (&) and end with semicolon (;)
– Prevent character data from being misinterpreted as markup
5.5.3 White Space, Entity References and Built-in Entities (cont.)
• Built-in entities
– Ampersand (&amp;)
– Left-angle bracket (&lt;)
– Right-angle bracket (&gt;)
– Apostrophe (&apos;)
– Quotation mark (&quot;)
– Mark up characters "<>&" in element message
<message>&lt;&gt;&amp;</message>
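The round trip between reserved characters and the built-in entities can be sketched with Python's standard library: xml.sax.saxutils.escape writes the entity references, and a parser turns them back into characters. The helper name mark_up is hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# escape() handles &, < and > by default; the two quote characters are
# supplied explicitly so all five built-in entities are covered.
QUOTES = {"'": "&apos;", '"': "&quot;"}


def mark_up(text):
    """Replace each XML-reserved character with its built-in entity reference."""
    return escape(text, QUOTES)


escaped = mark_up("<>&")
element = ET.fromstring("<message>" + escaped + "</message>")

print(escaped)        # what is written into the document
print(element.text)   # what the application sees after parsing
```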
5.5.4 Using Unicode in an XML Document
• XML Unicode support
– e.g., Fig. 5.4 displays Arabic words
• Arabic characters
– represented by entity references for Unicode characters
Fig. 5.4 XML document that contains Arabic words.
<?xml version = "1.0"?>

<!-- Fig. 5.4 : lang.xml -->
<!-- Demonstrating Unicode -->

<!DOCTYPE welcome SYSTEM "lang.dtd">

<welcome>
   <from>

      <!-- Deitel and Associates -->
      &#1583;&#1575;&#1610;&#1578;&#1614;&#1604;
      &#1571;&#1606;&#1583;

      <!-- entity -->
      &assoc;
   </from>

   <subject>

      <!-- Welcome to the world of Unicode -->
      &#1571;&#1607;&#1604;&#1575;&#1611;
      &#1576;&#1603;&#1605;
      &#1601;&#1610;&#1616;
      &#1593;&#1575;&#1604;&#1605;

      <!-- entity -->
      &text;
   </subject>
</welcome>
• Document type definition (DTD) defines document structure and entities
• Root element welcome contains child elements from and subject
• Sequence of entity references for Unicode characters in Arabic alphabet
• lang.dtd defines entities assoc and text
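How a parser resolves the numeric references in Fig. 5.4 can be sketched in Python with the standard xml.etree.ElementTree module. Only the first word of element from is used here; the entities assoc and text are left out because they depend on lang.dtd, which is not shown in this chapter.

```python
import xml.etree.ElementTree as ET

# First word of element from in Fig. 5.4, written as decimal references.
FROM_XML = "<from>&#1583;&#1575;&#1610;&#1578;&#1614;&#1604;</from>"

text = ET.fromstring(FROM_XML).text

# Each reference &#N; becomes the single Unicode character with code point N.
code_points = [ord(ch) for ch in text]
print(code_points)
print(["U+%04X" % cp for cp in code_points])
```

The parsed text is the same string that would result from typing the Arabic characters directly into a document saved in a Unicode encoding.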
5.6 Markup
• XML element markup
– Consists of
• Start tag
• Content
• End tag
– All elements must have corresponding end tag
<img src = "img.gif">
is correct in HTML, but not XML
– XML requires end tag or forward slash (/) for termination
<img src = "img.gif"></img>
or
<img src = "img.gif"/>
is correct XML syntax
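The three img forms above can be compared by parsing them. The sketch below uses Python's standard xml.etree.ElementTree parser as a stand-in for any XML parser; the helper name parse_or_none is hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_or_none(xml_text):
    """Return the parsed element, or None if the text is not well formed."""
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError:
        return None


html_style = parse_or_none('<img src = "img.gif">')        # no end tag
with_end_tag = parse_or_none('<img src = "img.gif"></img>')
empty_element = parse_or_none('<img src = "img.gif"/>')

print(html_style)                                   # rejected by the parser
print(with_end_tag.tag, with_end_tag.attrib)
print(empty_element.tag, empty_element.attrib)
```

The two accepted forms produce elements with the same tag, the same attribute and no children, so an application cannot tell them apart.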
5.6 Markup (cont.)
• Elements
– Define structure
– May (or may not) contain content
• Child elements, character data, etc.
• Attributes
– Describe elements
• Elements may have associated attributes
– Placed within element’s start tag
– Values are enclosed in quotes
• Element car contains attribute doors, which has value "4"
<car doors = "4"/>
5.6 Markup (cont.)
• Processing instruction (PI)
– Passed to application using XML document
– Provides application-specific document information
– Delimited by <? and ?>
Fig. 5.5 XML document that marks up information about a fictitious book.
<?xml version = "1.0"?>

<!-- Fig. 5.5 : usage.xml -->
<!-- Usage of elements and attributes -->

<?xml:stylesheet type = "text/xsl" href = "usage.xsl"?>

<book isbn = "999-99999-9-X">
   <title>Deitel&apos;s XML Primer</title>

   <author>
      <firstName>Paul</firstName>
      <lastName>Deitel</lastName>
   </author>

   <chapters>
      <preface num = "1" pages = "2">Welcome</preface>
      <chapter num = "1" pages = "4">Easy XML</chapter>
      <chapter num = "2" pages = "2">XML Elements?</chapter>
      <appendix num = "1" pages = "9">Entities</appendix>
   </chapters>

   <media type = "CD"/>
</book>
• Processing instruction specifies stylesheet (discussed in Chapter 12)
• Root element book contains child elements title, author, chapters and media
• Element book contains attribute isbn, which has value of 999-99999-9-X
• Element chapters contains four child elements, each of which contains two attributes
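Reading the elements and attributes of Fig. 5.5 from a program can be sketched with Python's standard xml.etree.ElementTree module. The document below is an abridged copy of the figure: the processing instruction and the title element are omitted for brevity, and the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

USAGE_XML = """<book isbn = "999-99999-9-X">
   <author>
      <firstName>Paul</firstName>
      <lastName>Deitel</lastName>
   </author>
   <chapters>
      <preface num = "1" pages = "2">Welcome</preface>
      <chapter num = "1" pages = "4">Easy XML</chapter>
      <chapter num = "2" pages = "2">XML Elements?</chapter>
      <appendix num = "1" pages = "9">Entities</appendix>
   </chapters>
   <media type = "CD"/>
</book>"""

book = ET.fromstring(USAGE_XML)

isbn = book.get("isbn")                          # attribute of the root element
author = book.findtext("author/firstName") + " " + book.findtext("author/lastName")
parts = [(part.tag, part.get("num"), part.text) for part in book.find("chapters")]
total_pages = sum(int(part.get("pages")) for part in book.find("chapters"))
media_type = book.find("media").get("type")      # empty element, attribute only

print(isbn, author, media_type, total_pages)
for part in parts:
    print(part)
```

Attribute values are always strings, which is why pages is converted with int before summing.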
Fig. 5.6 XML document that marks up a letter.
<?xml version = "1.0"?>

<!-- Fig. 5.6: letter.xml -->
<!-- Business letter formatted with XML -->

<letter>

   <contact type = "from">
      <name>Jane Doe</name>
      <address1>Box 12345</address1>
      <address2>15 Any Ave.</address2>
      <city>Othertown</city>
      <state>Otherstate</state>
      <zip>67890</zip>
      <phone>555-4321</phone>
      <flag gender = "F"/>
   </contact>

   <contact type = "to">
      <name>Jane Doe</name>
      <address1>123 Main St.</address1>
      <address2></address2>
      <city>Anytown</city>
      <state>Anystate</state>
      <zip>12345</zip>
      <phone>555-1234</phone>
      <flag gender = "M"/>
   </contact>

   <salutation>Dear Sir:</salutation>

   <paragraph>It is our privilege to inform you about our new
      database managed with <bold>XML</bold>. This new system
      allows you to reduce the load on your inventory list
      server by having the client machine perform the work of
      sorting and filtering the data.</paragraph>

   <paragraph>The data in an XML element is normalized, so
      plain-text diagrams such as
         /---\
         |   |
         \---/
      will become gibberish.</paragraph>

   <closing>Sincerely</closing>
   <signature>Ms. Doe</signature>

</letter>
5.7 CDATA Sections
• CDATA sections
– May contain text, reserved characters and whitespace
• Reserved characters need not be replaced by entity references
– Contents not interpreted as markup by XML parser
– Commonly used for scripting code (e.g., JavaScript)
– Begin with <![CDATA[
– Terminate with ]]>
Fig. 5.7 Using a CDATA section.
<?xml version = "1.0"?>

<!-- Fig. 5.7 : cdata.xml -->
<!-- CDATA section containing C++ code -->

<book title = "C++ How to Program" edition = "3">

   <sample>
         // C++ comment
         if ( this-&gt;getX() &lt; 5 &amp;&amp; value[ 0 ] != 3 )
            cerr &lt;&lt; this-&gt;displayError();
   </sample>

   <sample>
      <![CDATA[

         // C++ comment
         if ( this->getX() < 5 && value[ 0 ] != 3 )
            cerr << this->displayError();
      ]]>
   </sample>

   C++ How to Program by Deitel &amp; Deitel
</book>
• Entity references required if not in CDATA section
• XML does not process CDATA section
• Note the simplicity offered by CDATA section
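That the two sample elements of Fig. 5.7 carry the same character data can be checked by parsing one line of each. The sketch below uses Python's standard xml.etree.ElementTree module; the constant names are illustrative.

```python
import xml.etree.ElementTree as ET

# The same C++ condition, first with entity references, then inside CDATA.
ESCAPED = ("<sample>if ( this-&gt;getX() &lt; 5 &amp;&amp; "
           "value[ 0 ] != 3 )</sample>")
IN_CDATA = ("<sample><![CDATA[if ( this->getX() < 5 && "
            "value[ 0 ] != 3 )]]></sample>")

escaped_text = ET.fromstring(ESCAPED).text
cdata_text = ET.fromstring(IN_CDATA).text

print(escaped_text)
print(cdata_text)
print(escaped_text == cdata_text)
```

The CDATA delimiters themselves are not part of the character data handed to the application.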
5.8 XML Namespaces
• Naming collisions
– Two different elements have same name
<subject>Math</subject>
<subject>Thrombosis</subject>
• Namespaces
– Differentiate elements that have same name
<school:subject>Math</school:subject>
<medical:subject>Thrombosis</medical:subject>
• school and medical are namespace prefixes
– Prepended to element and attribute names
– Tied to uniform resource identifier (URI)
» Series of characters for differentiating names
5.8 XML Namespaces (cont.)
• Creating namespaces
– Use xmlns attribute
xmlns:text = "urn:deitel:textInfo"
xmlns:image = "urn:deitel:imageInfo"
• Creates two namespace prefixes text and image
• urn:deitel:textInfo is URI for prefix text
• urn:deitel:imageInfo is URI for prefix image
– Default namespaces
• Child elements of this namespace do not need prefix
xmlns = "urn:deitel:textInfo"
Fig. 5.8 Listing for namespace.xml.
<?xml version = "1.0"?>

<!-- Fig. 5.8 : namespace.xml -->
<!-- Namespaces -->

<directory xmlns:text = "urn:deitel:textInfo"
   xmlns:image = "urn:deitel:imageInfo">

   <text:file filename = "book.xml">
      <text:description>A book list</text:description>
   </text:file>

   <image:file filename = "funny.jpg">
      <image:description>A funny picture</image:description>
      <image:size width = "200" height = "100"/>
   </image:file>

</directory>
• Element directory contains two namespace prefixes
• Use prefix text to describe elements file and description
• Apply prefix image to describe elements file, description and size
Fig. 5.9 Using default namespaces.
<?xml version = "1.0"?>

<!-- Fig. 5.9 : defaultnamespace.xml -->
<!-- Using Default Namespaces -->

<directory xmlns = "urn:deitel:textInfo"
   xmlns:image = "urn:deitel:imageInfo">

   <file filename = "book.xml">
      <description>A book list</description>
   </file>

   <image:file filename = "funny.jpg">
      <image:description>A funny picture</image:description>
      <image:size width = "200" height = "100"/>
   </image:file>

</directory>
• urn:deitel:textInfo is default namespace
• Element file is in default namespace
• Specify namespace (prefix image)
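How a namespace-aware parser treats the prefixed and default forms of Figs. 5.8 and 5.9 can be sketched with Python's standard xml.etree.ElementTree module, which reports each element name as {URI}localname. The documents below are abridged copies of the two figures; the constant and variable names are illustrative.

```python
import xml.etree.ElementTree as ET

PREFIXED = """<directory xmlns:text = "urn:deitel:textInfo"
   xmlns:image = "urn:deitel:imageInfo">
   <text:file filename = "book.xml"/>
   <image:file filename = "funny.jpg"/>
</directory>"""

DEFAULTED = """<directory xmlns = "urn:deitel:textInfo"
   xmlns:image = "urn:deitel:imageInfo">
   <file filename = "book.xml"/>
   <image:file filename = "funny.jpg"/>
</directory>"""

prefixed_root = ET.fromstring(PREFIXED)
defaulted_root = ET.fromstring(DEFAULTED)

# The prefix is only shorthand; the URI is what distinguishes the two file elements.
prefixed_tags = [child.tag for child in prefixed_root]
defaulted_tags = [child.tag for child in defaulted_root]
print(prefixed_tags)
print(defaulted_tags)

# The root differs: with xmlns = "..." directory itself joins the default namespace.
print(prefixed_root.tag, defaulted_root.tag)

# Searching by prefix requires a prefix-to-URI mapping chosen by the caller.
picture = defaulted_root.find("image:file", {"image": "urn:deitel:imageInfo"})
print(picture.get("filename"))
```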
5.9 Case Study: A Day Planner Application
• Markup for Day-Planner application
– Scheduling appointments and tasks
• Date
• Time
• Appointment type
Fig. 5.10 Day planner XML document planner.xml.
<?xml version = "1.0"?>

<!-- Fig. 5.10 : planner.xml -->
<!-- Day Planner XML document -->

<planner>

   <year value = "2000">

      <date month = "7" day = "15">
         <note time = "1430">Doctor&apos;s appointment</note>
         <note time = "1620">Physics class at BH291C</note>
      </date>

      <date month = "7" day = "4">
         <note>Independence Day</note>
      </date>

      <date month = "7" day = "20">
         <note time = "0900">General Meeting in room 32-A</note>
      </date>

      <date month = "7" day = "20">
         <note time = "1900">Party at Joe&apos;s</note>
      </date>

      <date month = "7" day = "20">
         <note time = "1300">Financial Meeting in room 14-C</note>
      </date>

   </year>

</planner>
• Root element planner holds all appointments
• date elements store specific dates with attributes month and day
• note elements mark up appointments
Fig. 5.11 Application that uses the day planner.
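A query over the planner markup can be sketched in Python with the standard xml.etree.ElementTree module. This is not the application shown in Fig. 5.11; the function notes_for is a hypothetical helper, and the document is an abridged copy of Fig. 5.10.

```python
import xml.etree.ElementTree as ET

PLANNER_XML = """<planner>
   <year value = "2000">
      <date month = "7" day = "15">
         <note time = "1430">Doctor&apos;s appointment</note>
         <note time = "1620">Physics class at BH291C</note>
      </date>
      <date month = "7" day = "4">
         <note>Independence Day</note>
      </date>
      <date month = "7" day = "20">
         <note time = "0900">General Meeting in room 32-A</note>
      </date>
   </year>
</planner>"""


def notes_for(planner, year, month, day):
    """Return (time, text) pairs for one date; time is None for untimed notes."""
    results = []
    for year_el in planner.findall("year"):
        if year_el.get("value") != str(year):
            continue
        for date_el in year_el.findall("date"):
            # Attribute values are strings, so compare against str(month), str(day).
            if date_el.get("month") == str(month) and date_el.get("day") == str(day):
                results.extend((note.get("time"), note.text)
                               for note in date_el.findall("note"))
    return results


planner = ET.fromstring(PLANNER_XML)
for time, text in notes_for(planner, 2000, 7, 15):
    print(time, text)
print(notes_for(planner, 2000, 7, 4))
```

Because several date elements may describe the same day (as with July 20 in the full figure), the helper collects notes from every matching date element instead of stopping at the first.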