Extensible Markup Language (XML)
What is a Markup Language
A syntax and procedure for embedding in text documents
tags that control formatting when the documents are viewed
by a special application.
<b> Hi </b>
A set of codes or tags that surrounds content and tells a
person or program what that content is (its structure)
and/or what it should look like (its format). Markup tags
have a distinct syntax that sets them apart from the content
that they surround.
History of Markup Languages
1967: GenCode
1970s, 1980s: TeX
1980s: Scribe
Early 80s: SGML
1991: HTML
Late 90s: XML
[Diagram: SGML, HTML, XML]
Motivation For XML.
XML is an attempt to package up the important virtues and most-
used features of SGML in a compact, easily-implemented package
that is optimized for delivery on the WWW.
(Bray)
XML started as a simplified subset of the Standard Generalized
Markup Language (SGML) and is designed to be relatively human-legible.
By adding semantic constraints, application languages can be
implemented in XML.
Data Storage in an organized way.
Fast and easy exchange of Data.
XML Syntax
Header: <?xml version="1.0" encoding="utf-8"?>
Comments: <!-- This is a comment -->
Nesting: <a> <b> </b> </a> <!-- OK -->
Empty elements: <a value="123"></a> or <a value="123"/>
Exactly one root element!
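The nesting and single-root rules can be checked mechanically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the helper name `is_well_formed` is an assumption made for illustration and is not part of the slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Properly nested elements with a single root.
print(is_well_formed('<a> <b> </b> </a>'))                            # True
# Both spellings of an empty element are accepted.
print(is_well_formed('<r><a value="123"></a><a value="123"/></r>'))   # True
# Overlapping tags are not properly nested.
print(is_well_formed('<a> <b> </a> </b>'))                            # False
# Two top-level elements violate the one-root rule.
print(is_well_formed('<a/><b/>'))                                     # False
```

A parser that only checks these rules says nothing about which tags are allowed; that is the job of a DTD or schema.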
Some Standards that use XML
SVG
MathML
HL7 V. 3.0 and Medical Markup Language (MML)
XML Example
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="book.xsl"?>
<dblp>
<book mdate="2004-03-08" key="books/acm/Kim95">
<editor>Won Kim</editor>
<title>Modern Database Systems: The Object Model, Interoperability, and Beyond.</title>
<booktitle>Modern Database Systems</booktitle>
<publisher>ACM Press and Addison-Wesley</publisher>
<year>1995</year>
<isbn>0-201-59098-0</isbn>
<url>db/books/collections/kim95.html</url>
</book>
<book mdate="2002-01-03" key="books/aw/AbiteboulHV95">
<author>Serge Abiteboul</author>
<author>Richard Hull</author>
<author>Victor Vianu</author>
<title>Foundations of Databases.</title>
<publisher>Addison-Wesley</publisher>
<year>1995</year>
<isbn>0-201-53771-0</isbn>
<url>db/books/dbtext/abiteboul95.html</url>
</book>
</dblp>
Defining an XML Language
i.e. which <tags> are allowed, and in which order
Not strictly necessary
o You can parse/produce XML without formally defining the structure
of the language
o Such a document only has to be well-formed
DTDs (“Document Type Definitions”)
o Simple, limited
o A document that conforms to its DTD is called valid
XML Schema
o (Too?) complex and expressive (includes inheritance, restricted
datatypes, ranges)
Data binding
o Define (e.g.) Java classes + a mapping from the object tree to the
XML document
o JAXB, Castor, JSX, NeuroML
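The data-binding idea can be sketched without any of the tools listed above. This is a hedged illustration in Python rather than Java: the `Book` class and the functions `book_to_xml` and `book_from_xml` are hypothetical names, and the mapping is written by hand, whereas a tool such as JAXB would derive it from annotations or a schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class Book:
    """Hypothetical application class bound to a <book> element."""
    key: str
    title: str
    year: int
    authors: List[str] = field(default_factory=list)


def book_to_xml(book: Book) -> ET.Element:
    """Map the object to an element tree (object -> XML)."""
    elem = ET.Element("book", {"key": book.key})
    for name in book.authors:
        ET.SubElement(elem, "author").text = name
    ET.SubElement(elem, "title").text = book.title
    ET.SubElement(elem, "year").text = str(book.year)
    return elem


def book_from_xml(elem: ET.Element) -> Book:
    """Map an element tree back to the object (XML -> object)."""
    return Book(
        key=elem.get("key"),
        title=elem.findtext("title"),
        year=int(elem.findtext("year")),
        authors=[a.text for a in elem.findall("author")],
    )


original = Book("books/aw/AbiteboulHV95", "Foundations of Databases.", 1995,
                ["Serge Abiteboul", "Richard Hull", "Victor Vianu"])
round_tripped = book_from_xml(book_to_xml(original))
print(round_tripped == original)  # True
```

The point of the round trip is that application code works with typed objects (`year` is an `int`) and never touches tags directly.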
DTD
<!ELEMENT dblp (article|inproceedings|proceedings|book|incollection|
phdthesis|mastersthesis|www)*>
<!ENTITY % field
"author|editor|title|booktitle|pages|year|address|journal|volume|number|month|url|ee
|cdrom|cite|publisher|note|crossref|isbn|series|school|chapter">
…………………..
<!ELEMENT book (%field;)*>
<!ATTLIST book key CDATA #REQUIRED mdate CDATA #IMPLIED
……………………….
<!ELEMENT author (#PCDATA)>
<!ELEMENT editor (#PCDATA)>
<!ELEMENT address (#PCDATA)>
<!ENTITY % titlecontents "#PCDATA|sub|sup|i|tt|ref">
<!ELEMENT title (%titlecontents;)*>
<!ELEMENT booktitle (#PCDATA)>
Databases Overview
Data
- Relational Data
  - RDBMS such as MySQL, MS SQL Server, Oracle, DB2
- XML
  - Secondary storage: DBMS such as Sedna
  - Main memory: tools such as SAX, DOM, LINQ
Tools to read/write an XML file
SAX
DOM
LINQ
Why do we need such tools?
- To make sure that the file is valid, or at least
well-formed.
- To read the document in terms of elements and
attributes.
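As a rough illustration of reading and writing a document in terms of elements and attributes, here is a sketch with Python's standard `xml.etree.ElementTree`. It is not one of the three tools named above; the inline document is a cut-down copy of the dblp example, and the added `<publisher>` element is only there to show writing.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the dblp example, kept inline so the sketch is self-contained.
DOC = """<dblp>
  <book mdate="2002-01-03" key="books/aw/AbiteboulHV95">
    <author>Serge Abiteboul</author>
    <author>Richard Hull</author>
    <author>Victor Vianu</author>
    <title>Foundations of Databases.</title>
    <year>1995</year>
  </book>
</dblp>"""

root = ET.fromstring(DOC)          # raises ParseError if the text is not well-formed
book = root.find("book")

# Read: attributes come from .get(), element text from .text / .findtext().
key = book.get("key")
authors = [a.text for a in book.findall("author")]
title = book.findtext("title")

# Write: add a new child element and serialize the tree back to text.
ET.SubElement(book, "publisher").text = "Addison-Wesley"
output = ET.tostring(root, encoding="unicode")

print(key)
print(authors)
print(title)
print("<publisher>Addison-Wesley</publisher>" in output)
```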
SAX (Simple API for XML)
Event based.
You provide startElement(), characters(), and endElement()
methods.
You have to keep track of where you are
in the tree/document yourself.
Fast, but a bit painful to code.
Mainly adopted in Java.
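The three callbacks and the position tracking can be sketched with Python's standard `xml.sax` module, which follows the same event model as the Java API. The handler class `TitleHandler` and its `path` stack are assumptions made for this illustration.

```python
import xml.sax

DOC = b"""<dblp>
  <book key="books/acm/Kim95">
    <editor>Won Kim</editor>
    <title>Modern Database Systems</title>
  </book>
  <book key="books/aw/AbiteboulHV95">
    <author>Serge Abiteboul</author>
    <title>Foundations of Databases.</title>
  </book>
</dblp>"""


class TitleHandler(xml.sax.ContentHandler):
    """Collects the text of every <title> that sits directly inside a <book>."""

    def __init__(self):
        super().__init__()
        self.path = []      # stack of open element names: our position in the tree
        self.buffer = []    # text chunks of the <title> currently being read
        self.titles = []

    def startElement(self, name, attrs):
        self.path.append(name)
        if self.path[-2:] == ["book", "title"]:
            self.buffer = []

    def characters(self, content):
        # The parser may deliver one text node in several chunks, so accumulate.
        if self.path[-2:] == ["book", "title"]:
            self.buffer.append(content)

    def endElement(self, name):
        if self.path[-2:] == ["book", "title"]:
            self.titles.append("".join(self.buffer))
        self.path.pop()


handler = TitleHandler()
xml.sax.parseString(DOC, handler)
print(handler.titles)  # ['Modern Database Systems', 'Foundations of Databases.']
```

The parser never builds a tree, which is why the handler has to maintain the `path` stack itself.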
DOM (Document Object Model)
You get a tree of objects of type “Element” and “Attribute” +
methods to navigate the tree.
Contents are all strings, so you have to do the data conversion
yourself to get ints, floats, or your own object types.
Mainly adopted in MS .NET.
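A small sketch of the tree navigation and the manual type conversion, using Python's standard `xml.dom.minidom` (a lightweight DOM implementation, chosen here only so the example runs without extra packages):

```python
from xml.dom import minidom

DOC = """<dblp>
  <book mdate="2004-03-08" key="books/acm/Kim95">
    <title>Modern Database Systems</title>
    <year>1995</year>
  </book>
</dblp>"""

dom = minidom.parseString(DOC)
book = dom.getElementsByTagName("book")[0]

# Attribute values and text nodes are always strings.
key = book.getAttribute("key")
year_text = book.getElementsByTagName("year")[0].firstChild.data

# Converting to a number is the caller's job.
year = int(year_text)
print(key, type(year_text).__name__, year + 1)  # books/acm/Kim95 str 1996
```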
LINQ (Language Integrated Query)
- A proprietary Microsoft technology.
- Introduced in November 2007 as part of .NET Framework
3.5.
- A fast, efficient library for querying relational
databases, XML files, or even in-memory arrays.
- Queries are written in an SQL-like syntax.
How to Query These Data?
Querying Data
[Diagram: Data is either Relational Data or XML (in secondary storage or main memory). Labels: Translator, Structured Query Language (SQL), XQuery.]
XQuery
An XML query language
W3C Recommendation since 23 January 2007
e.g. /dblp/book[author="John Smith"]
The result of a query is a set of XML elements
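Python's standard library has no XQuery engine, but the sample query is also a valid XPath expression, and `xml.etree.ElementTree` supports that kind of predicate in its limited XPath subset. The sketch below assumes a cut-down dblp document and searches for "Richard Hull" instead of "John Smith" so that the query has a match.

```python
import xml.etree.ElementTree as ET

DOC = """<dblp>
  <book key="books/acm/Kim95">
    <editor>Won Kim</editor>
    <title>Modern Database Systems</title>
  </book>
  <book key="books/aw/AbiteboulHV95">
    <author>Serge Abiteboul</author>
    <author>Richard Hull</author>
    <title>Foundations of Databases.</title>
  </book>
</dblp>"""

root = ET.fromstring(DOC)  # root is the <dblp> element

# Relative form of /dblp/book[author="Richard Hull"], evaluated from the root.
matches = root.findall("./book[author='Richard Hull']")

# The result is a list of <book> elements, not plain strings.
print([book.get("key") for book in matches])  # ['books/aw/AbiteboulHV95']
```

A full XQuery processor would additionally allow constructing new XML from the matches (FLWOR expressions), which this subset cannot do.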
Query Data (Once Again)
[Diagram: Data is either Relational Data or XML (in secondary storage or main memory). Labels: Translator (shown twice), Structured Query Language (SQL), XQuery.]
Related Technologies
XPath
XSLT
XPath
XPath: a way to refer to a specific subset of elements /
attributes in a document
A method for navigating within a file, not a full query language.
"/dblp" -- the root element
"/dblp/book[1]" -- first book element
"//book" -- all <book> elements
"//@title" -- all title= attributes
XPath is also used in XSLT for pattern matching
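These path expressions can be tried out with Python's standard `xml.etree.ElementTree`, which implements only a subset of XPath: paths are relative to an element (hence the leading `.`), and attribute nodes cannot be selected directly. The sample document uses a `key` attribute, since the dblp example has no `title` attribute.

```python
import xml.etree.ElementTree as ET

DOC = """<dblp>
  <book key="books/acm/Kim95"><title>Modern Database Systems</title></book>
  <book key="books/aw/AbiteboulHV95"><title>Foundations of Databases.</title></book>
</dblp>"""

root = ET.fromstring(DOC)

# "/dblp": the root element
print(root.tag)

# "/dblp/book[1]": first book element (positions start at 1)
first = root.find("./book[1]")
print(first.findtext("title"))

# "//book": all <book> elements at any depth
all_books = root.findall(".//book")
print(len(all_books))

# "//@key" is not supported here, so select the elements that carry
# the attribute and read its value from each of them.
keys = [e.get("key") for e in root.findall(".//*[@key]")]
print(keys)
```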
How to Display These Data in a User-Friendly Way?
XSLT (Extensible Stylesheet Language Transformations)
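<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<h2>Currently Incollection</h2>
<table border="1">
<tr bgcolor="#9acd32">
<th align="left">Title</th>
<th align="left">Book Title</th>
<th align="left">Author</th>
<th align="left">Pages</th>
<th align="left">Year</th>
</tr>
<!-- One table row per <incollection> element; cells follow the header order -->
<xsl:for-each select="dblp/incollection">
<tr>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="booktitle"/></td>
<td><xsl:value-of select="author"/></td>
<td><xsl:value-of select="pages"/></td>
<td><xsl:value-of select="year"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>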
XSLT (continued)
Add this line to the original XML file, right after the XML declaration:
<?xml-stylesheet type="text/xsl" href="book.xsl"?>
Questions