Transcript JAXP

JAXB
Java Architecture for XML Binding
26-Jul-16
What is JAXB?


JAXB is Java Architecture for XML Binding
SAX and DOM are generic XML parsers
  They will parse any well-formed XML
JAXB creates a parser that is specific to your DTD
  A JAXB parser will parse only valid XML (as defined by your DTD)
DOM and JAXB both produce a tree in memory
  DOM produces a generic tree; everything is a Node
  JAXB produces a tree of Objects with names and attributes as described by your DTD
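The difference between a generic tree and a bound tree can be sketched with the JDK's own DOM parser. This is only an illustration: the Book class below is a hand-written stand-in for a JAXB-generated class, and the names GenericVersusBound, titleViaDom, and bind are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class GenericVersusBound {
    static final String XML = "<book><title>Walden</title></book>";

    // Hand-written stand-in for a generated class (not real JAXB output).
    static class Book {
        private String title;
        public String getTitle() { return title; }
        public void setTitle(String x) { title = x; }
    }

    // Generic route: every item in the tree is a Node, found by position.
    static String titleViaDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node book = doc.getDocumentElement();   // the <book> element, as a Node
            Node title = book.getFirstChild();      // the <title> element, also a Node
            return title.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    // Bound route: copy the data into a typed object once, then use getters.
    static Book bind(String xml) {
        Book b = new Book();
        b.setTitle(titleViaDom(xml));
        return b;
    }

    public static void main(String[] args) {
        System.out.println(titleViaDom(XML));
        System.out.println(bind(XML).getTitle());
    }
}
```

Application code that works with Book never has to know where in the tree the title lives; that knowledge stays in the binding step.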
Advantages and disadvantages

Advantages:
  JAXB requires a DTD
    Using JAXB ensures the validity of your XML
  A JAXB parser is actually faster than a generic SAX parser
  A tree created by JAXB is smaller than a DOM tree
  It’s much easier to use a JAXB tree for application-specific code
  You can modify the tree and save it as XML
Disadvantages:
  JAXB requires a DTD
    Hence, you cannot use JAXB to process generic XML (for example, if you are writing an XML editor or other tool)
  You must do additional work up front to tell JAXB what kind of tree you want it to construct
    But this more than pays for itself by simplifying your application
  JAXB is new: Version 1.0 dates from Q4 (fourth quarter) 2002
How JAXB works

JAXB takes as input two files: your DTD and a binding schema (which you also write)
  A binding schema is an XML document written in a “binding language” defined by JAXB (with extension .xjs)
  A binding schema is used to customize the JAXB output
  Your binding schema can be very simple or quite complex
JAXB produces as output Java source code which you compile and add to your program
  Your program will use the specific classes generated by JAXB
  Your program can then read and write XML files
JAXB also provides an API for working directly with XML
Some examples in this lecture are taken from the JAXB User’s guide, http://java.sun.com/xml/jaxb/docs.html
A first example



The DTD:
<!ELEMENT book (title, author, chapter+) >
<!ELEMENT title (#PCDATA) >
<!ELEMENT author (#PCDATA)>
<!ELEMENT chapter (#PCDATA) >
The schema:
<xml-java-binding-schema>
  <element name="book" type="class" root="true" />
</xml-java-binding-schema>
The results:
public Book(); // constructor
public String getTitle();
public void setTitle(String x);
public String getAuthor();
public void setAuthor(String x);
public List getChapter();
public void deleteChapter();
public void emptyChapter();
Note 1: In these slides we only show the class outline, but JAXB creates a complete class for you
Note 2: JAXB constructs names based on yours, with good capitalization style
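To make the outline concrete, here is a hand-written stand-in with the same method names. It is not JAXB output. The slides do not say what deleteChapter and emptyChapter do, so the sketch assumes that "delete" drops the list and "empty" replaces it with an empty one. The wrapper class FirstExample and the use of List<String> (the slides show a raw List) are also assumptions.

```java
import java.util.ArrayList;
import java.util.List;

public class FirstExample {

    // Stand-in for the generated Book class; real generated code would also
    // carry the marshalling and validation logic, which is omitted here.
    static class Book {
        private String title;
        private String author;
        private List<String> chapter;

        public Book() { }

        public String getTitle() { return title; }
        public void setTitle(String x) { title = x; }

        public String getAuthor() { return author; }
        public void setAuthor(String x) { author = x; }

        // Returns the live list, creating it on first use.
        public List<String> getChapter() {
            if (chapter == null) {
                chapter = new ArrayList<>();
            }
            return chapter;
        }

        // ASSUMPTION: "delete" discards the list itself.
        public void deleteChapter() { chapter = null; }

        // ASSUMPTION: "empty" leaves an empty list in place.
        public void emptyChapter() { chapter = new ArrayList<>(); }
    }

    public static void main(String[] args) {
        Book book = new Book();
        book.setTitle("Walden");
        book.setAuthor("Henry David Thoreau");
        book.getChapter().add("Economy");
        book.getChapter().add("Reading");
        System.out.println(book.getTitle() + ": " + book.getChapter().size() + " chapters");
    }
}
```

Note how the repeated element (chapter+) becomes a single list-valued property rather than one setter per chapter.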
Adding complexity

Adding a choice can reduce the usefulness of the parser



<!ELEMENT book (title, author, (prologue | preface), chapter+)>
<!ELEMENT prologue (#PCDATA) >
<!ELEMENT preface (#PCDATA) >
With the same binding schema, this gives:
 public Book();
public List getContent();
public void deleteContent();
public void emptyContent();
An improved binding schema can give better results
Improving the binding schema


<xml-java-binding-schema>
<element name="book" type="class" root="true">
<content>
<element-ref name="title" />
<element-ref name="author" />
<choice property="prologue-or-preface" />
</content>
</element>
</xml-java-binding-schema>
Result is same as the original, plus methods for the choice:

public Book(); // constructor
...
public void emptyChapter();
public MarshallableObject getPrologueOrPreface();
public void setPrologueOrPreface(MarshallableObject x);
Marshalling





marshal, v.t.: to place or arrange in order
marshalling: the process of producing an XML
document from Java objects
unmarshalling: the process of producing a content tree
from an XML document
JAXB only allows you to unmarshal valid XML
documents
JAXB only allows you to marshal valid content trees
into XML
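The two directions can be sketched by hand with JDK classes only. This is not the JAXB API: the class RoundTrip and the methods marshal, unmarshal, text, and escape are hypothetical, and the validity check is a simplified reading of the book DTD from the first example (a title, an author, and at least one chapter).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RoundTrip {

    static class Book {
        String title;
        String author;
        List<String> chapters = new ArrayList<>();
    }

    // Marshalling: content tree -> XML text. An invalid tree is refused.
    static String marshal(Book b) {
        if (b.title == null || b.author == null || b.chapters.isEmpty()) {
            throw new IllegalStateException("content tree is not valid");
        }
        StringBuilder sb = new StringBuilder("<book>");
        sb.append("<title>").append(escape(b.title)).append("</title>");
        sb.append("<author>").append(escape(b.author)).append("</author>");
        for (String c : b.chapters) {
            sb.append("<chapter>").append(escape(c)).append("</chapter>");
        }
        return sb.append("</book>").toString();
    }

    // Unmarshalling: XML text -> content tree.
    static Book unmarshal(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            Book b = new Book();
            b.title = text(root, "title");
            b.author = text(root, "author");
            NodeList list = root.getElementsByTagName("chapter");
            for (int i = 0; i < list.getLength(); i++) {
                b.chapters.add(list.item(i).getTextContent());
            }
            return b;
        } catch (Exception e) {
            throw new IllegalStateException("could not unmarshal", e);
        }
    }

    // Text of the first element with the given name, or null if absent.
    static String text(Element root, String name) {
        NodeList list = root.getElementsByTagName(name);
        return list.getLength() == 0 ? null : list.item(0).getTextContent();
    }

    // Minimal escaping for element content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        String xml = "<book><title>Walden</title><author>Thoreau</author>"
                + "<chapter>Economy</chapter></book>";
        System.out.println(marshal(unmarshal(xml)));
    }
}
```

A binding tool generates this kind of code for you from the DTD, which is the "additional work up front" paying for itself.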
Limitations of JAXB

JAXB only supports DTDs and a subset of XML
Schemas


Later versions may support more schema languages
JAXB does not support the following legal DTD
constructs:




Internal subsets
NOTATIONs
ENTITY and ENTITIES
Enumerated NOTATION types
A minimal binding schema


A JAXB binding schema is itself in XML
Start with: <xml-java-binding-schema version="1.0ea">



The version is optional
“ea” stands for “early access,” that is, not yet released
Put in:
<element name="rootName" type="class" root="true" />
for each possible root element





An XML document can have only one root
However, the DTD does not say what that root must be
Any top-level element defined by the DTD may be a root
The value of name must match exactly with the name in the DTD
End with: </xml-java-binding-schema>
More complex schemata


JAXB requires that you supply a binding schema
As noted on the previous slide, the minimal binding schema would be:
<xml-java-binding-schema version="1.0ea">
<element name="rootName" type="class" root="true" />
</xml-java-binding-schema>

With this binding schema, JAXB uses its default rule
set to generate your “bindings”


A binding is an association between an XML element and
the Java code used to process that element
By adding to this schema, you can customize the
bindings and thus the generated Java code
Default bindings, I

A “simple element” is one that has no attributes and only
character contents:


<!ELEMENT elementName (#PCDATA) >
For simple elements, JAXB assumes:
<element name="elementName" type="value"/>
 JAXB will treat this element as an instance variable of the class
for its enclosing element
 This is the default binding, that is, this is what JAXB will assume
unless you tell it otherwise



For example, you could write this yourself, but set type="class"
For simple elements, JAXB will generate these methods in the
class of the enclosing element:
void setElementName(String x);
String getElementName();
We will see later how to convert the #PCDATA into some type
other than String
Default bindings, II





If an element is not simple, JAXB will treat it as a class
Attributes and simple subelements are treated as instance variables
DTD:
<!ELEMENT elementName (subElement1, subElement2) >
<!ATTLIST elementName attributeName CDATA #IMPLIED>
Binding:
<element name="elementName" type="class">
  <attribute name="attributeName"/>
  <content>
    <element-ref name="subElement1" /> <!-- simple element -->
    <element-ref name="subElement2" /> <!-- complex element -->
  </content>
</element>
Java:
class ElementName extends MarshallableObject {
  void setAttributeName(String x);
  String getAttributeName();
  String getSubElement1();
  void setSubElement1(String x);
  // Non-simple subElement2 is described on the next slide
}
Default bindings, III


If an element contains a subelement that is defined by a class, the
code generated will be different
<element name="elementName" type="class">
<content>
<element-ref name="subElement2" />
<!-- Note that "element-ref" means this is a reference to an
element that is defined elsewhere, not the element itself -->
</content>
</element>
 Results in:
class ElementName extends MarshallableObject {
SubElement2 getSubElement2();
void setSubElement2(SubElement2 x);
...}
 Elsewhere, the DTD definition for subElement2 will result in:
class SubElement2 extends MarshallableObject { ... }
Default bindings, IV

A simple sequence is just a list of contents, in order, with no
+ or * repetitions



Example: <!ELEMENT html (head, body) >
For an element defined with a simple sequence, setters and getters are
created for each item in the sequence
If an element’s definition isn’t simple, or if it contains
repetitions, JAXB basically “gives up” and says “it’s got
some kind of content, but I don’t know what”


Example: <!ELEMENT book (title, forward, chapter*)>
Result:
public Book(); // constructor
public List getContent(); // "general content"--not too useful!
public void deleteContent();
public void emptyContent();
Customizing the binding schema

You won’t actually see these default bindings anywhere; they are just assumed



If a default binding is OK with you, don’t do anything
If you don’t like a default binding, just write your own
Here’s the minimal binding you must write:
<xml-java-binding-schema>
<element name="rootElement" type="class" root="true" />
</xml-java-binding-schema>

Start by “opening up” the root element:
<xml-java-binding-schema>
<element name="rootElement" type="class" root="true" >
</element>
</xml-java-binding-schema>

Now you have somewhere to put your customizations
Primitive attributes

By default, attributes are assumed to be Strings



<!ATTLIST someElement someAttribute CDATA #IMPLIED>
class SomeElement extends MarshallableObject {
void setSomeAttribute(String x);
String getSomeAttribute();
You can define your own binding and use the convert attribute to
force the defined attribute to be a primitive, such as an int:


<element name="someElement" type="class" >
<attribute name="someAttribute" convert="int" />
</element>
class SomeElement extends MarshallableObject {
void setSomeAttribute(int x);
int getSomeAttribute();
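Attribute values arrive as text, so a binding with convert="int" has to parse that text somewhere. A minimal sketch of the idea follows; it is hand-written rather than generated, and the wrapper class PrimitiveAttribute, the helper fromAttributeText, and the use of Integer.parseInt are assumptions about how such a conversion might look.

```java
public class PrimitiveAttribute {

    // Stand-in for the generated class; only the int-typed property is shown.
    static class SomeElement {
        private int someAttribute;
        void setSomeAttribute(int x) { someAttribute = x; }
        int getSomeAttribute() { return someAttribute; }
    }

    // Converts the raw attribute text once, at the XML boundary.
    // Throws NumberFormatException if the text is not an integer.
    static SomeElement fromAttributeText(String raw) {
        SomeElement e = new SomeElement();
        e.setSomeAttribute(Integer.parseInt(raw.trim()));
        return e;
    }

    public static void main(String[] args) {
        SomeElement e = fromAttributeText(" 42 ");
        System.out.println(e.getSomeAttribute() + 1);
    }
}
```

The rest of the program then does arithmetic on an int instead of re-parsing a String at every use.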
Conversions to Objects, I

At the top level (within <xml-java-binding-schema>), add a conversion declaration, such as:
<conversion name="BigDecimal" type="java.math.BigDecimal" />
  name is used in the binding schema
  type is the actual class to be used
Add a convert attribute where you need it:
<element name="name" type="value" convert="BigDecimal" />
The result should be:
public java.math.BigDecimal getName();
public void setName(java.math.BigDecimal x);
This works for BigDecimal because it has a constructor that
takes a String as its argument
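The String constructor is what makes the conversion possible in both directions. The sketch below shows the idea with a hand-written class; the names BigDecimalConversion, Item, fromText, and toText are hypothetical, and using toString() for the reverse direction is an assumption.

```java
import java.math.BigDecimal;

public class BigDecimalConversion {

    // Stand-in for the enclosing element's class; "name" is the property
    // from the slide's <element name="name" ... convert="BigDecimal" />.
    static class Item {
        private BigDecimal name;
        public BigDecimal getName() { return name; }
        public void setName(BigDecimal x) { name = x; }
    }

    // Reading: element text -> object, via the String constructor.
    static Item fromText(String pcdata) {
        Item item = new Item();
        item.setName(new BigDecimal(pcdata.trim()));
        return item;
    }

    // Writing: object -> element text (assumed to use toString()).
    static String toText(Item item) {
        return item.getName().toString();
    }

    public static void main(String[] args) {
        Item item = fromText("19.95");
        System.out.println(toText(item));
    }
}
```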
Conversions to Objects, II

There is a constructor for Date that takes a String as its
one argument, but this constructor is deprecated




This is because there are many ways to write dates
For an object like this, you need to supply methods to “parse”
and “print”
<conversion name="MyDate" type="java.util.Date"
parse="MyDate.parseDate" print="MyDate.printDate"/>
Your class, MyDate, would extend Date and provide
parseDate and printDate methods
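A possible shape for MyDate is sketched below. The slides name the class and its two methods but not their signatures, so the static String-to-Date and Date-to-String forms are assumptions, as is the yyyy-MM-dd pattern chosen here.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class MyDate extends Date {
    // ASSUMPTION: one fixed, unambiguous date format for the whole document.
    private static final String PATTERN = "yyyy-MM-dd";

    // "parse": XML text -> Date. Rejects text that does not match the pattern.
    public static Date parseDate(String text) {
        SimpleDateFormat format = new SimpleDateFormat(PATTERN);
        format.setLenient(false);   // do not accept dates such as 2016-02-31
        try {
            return format.parse(text.trim());
        } catch (ParseException e) {
            throw new IllegalArgumentException("not a " + PATTERN + " date: " + text, e);
        }
    }

    // "print": Date -> XML text, using the same pattern as parseDate.
    public static String printDate(Date date) {
        return new SimpleDateFormat(PATTERN).format(date);
    }

    public static void main(String[] args) {
        Date d = parseDate("2016-07-26");
        System.out.println(printDate(d));
    }
}
```

A new SimpleDateFormat is created per call because that class is not thread-safe. Using the same pattern in both methods lets a date survive a read and write cycle unchanged.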
Creating enumerations




<!ATTLIST shirt size (small | medium | large) #IMPLIED> defines an attribute
of shirt that can take on one of a predefined set of values
A typesafe enum is a class whose instances are a predefined set of values
To create a typesafe enum for size:
<enumeration name="shirtSize" members="small medium large" />
 <element name="shirt" ...>
<attribute name="size" convert="shirtSize" />
</element>
You get:
 public final class ShirtSize {
public final static ShirtSize SMALL;
public final static ShirtSize MEDIUM;
public final static ShirtSize LARGE;
public static ShirtSize parse(String x);
public String toString();
}
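For readers who have not met the typesafe enum pattern, here is one way the outline above could be filled in by hand. The private constructor, the value field, and the exception thrown by parse are assumptions; the wrapper class TypesafeEnumDemo exists only to make the example self-contained.

```java
public class TypesafeEnumDemo {

    static final class ShirtSize {
        public static final ShirtSize SMALL = new ShirtSize("small");
        public static final ShirtSize MEDIUM = new ShirtSize("medium");
        public static final ShirtSize LARGE = new ShirtSize("large");

        private final String value;

        // Private constructor: the three constants above are the only instances.
        private ShirtSize(String value) {
            this.value = value;
        }

        // Maps the attribute text from the XML to one of the constants.
        public static ShirtSize parse(String x) {
            if (SMALL.value.equals(x)) return SMALL;
            if (MEDIUM.value.equals(x)) return MEDIUM;
            if (LARGE.value.equals(x)) return LARGE;
            throw new IllegalArgumentException("not a shirt size: " + x);
        }

        // Returns the text to write back into the XML attribute.
        @Override
        public String toString() {
            return value;
        }
    }

    public static void main(String[] args) {
        ShirtSize size = ShirtSize.parse("medium");
        System.out.println(size == ShirtSize.MEDIUM);
        System.out.println(size);
    }
}
```

Because no other instances can exist, values can be compared with == and a misspelled size is caught at parse time instead of travelling through the program as a String.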
Content models

The <content> tag describes one of two kinds of content
models:


A general-content property binds a single property
 You’ve seen this before:
<content property="my-content" />
 Gives: public List getMyContent();
public void deleteMyContent();
public void emptyMyContent();
A model-based content property can contain four types of
declarations:
 element-ref says that this element contains another element
 choice says that there are alternative contents
 sequence says that contents must be in a particular order
 rest can be used to specify any kind of content
Using JAXB


JAXB is not currently a part of the standard Java distributions
The steps involved in using JAXB are:





Download, install, and configure JAXB
Write a JAXB schema to describe the bindings you want for your XML
Use JAXB to read the JAXB schema and the XML DTD (or XML
Schema) and produce Java code
Add the Java code to your program and compile it
Use the resultant program to:




Read and validate XML input files
Modify the XML tree
Optionally validate and output the modified XML
Note: Validation is optional and can be performed during unmarshalling
or any time thereafter
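The "read and validate" step can be illustrated with the JDK's validating DOM parser rather than JAXB itself. In this sketch the DTD is placed in an internal subset only to keep the example in one file (recall that JAXB does not support internal subsets). The class ValidateOnRead and the method isValid are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class ValidateOnRead {
    // The book DTD from the first example, as an internal subset.
    static final String DTD =
            "<!DOCTYPE book ["
            + "<!ELEMENT book (title, author, chapter+)>"
            + "<!ELEMENT title (#PCDATA)>"
            + "<!ELEMENT author (#PCDATA)>"
            + "<!ELEMENT chapter (#PCDATA)>"
            + "]>";

    // Returns true only if the document is well-formed AND valid for the DTD.
    static boolean isValid(String body) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);            // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Validity problems are reported to error(); rethrow so they are not ignored.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXParseException { throw e; }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(DTD + body)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String good = "<book><title>T</title><author>A</author><chapter>C</chapter></book>";
        String noAuthor = "<book><title>T</title><chapter>C</chapter></book>";
        System.out.println(isValid(good));
        System.out.println(isValid(noAuthor));
    }
}
```

The second document is well-formed, so a generic non-validating parser would accept it; it fails here only because the DTD requires an author.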
The End