Normalization with XSL and a file splitter for greater control

Normalization with XSLT
Charles Draper
Curtis Thacker
Brigham Young University
IGeLU 2014
What is XSLT?
A language for transforming an XML document
into another XML document.
Simple XSLT Example
----Source XML Document----
<record id="12345678">
  <title>A Very Simple Record</title>
</record>

----XSLT----
<xsl:stylesheet
    version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="record">
    <record>
      <control>
        <sourcerecordid>
          <xsl:value-of select="@id" />
        </sourcerecordid>
      </control>
      <display>
        <xsl:copy-of select="title" />
      </display>
    </record>
  </xsl:template>
</xsl:stylesheet>

----Output XML Document----
<record>
  <control>
    <sourcerecordid>12345678</sourcerecordid>
  </control>
  <display>
    <title>A Very Simple Record</title>
  </display>
</record>
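
To try the example transformation outside Primo, a minimal Java sketch along these lines may help. It is an illustration, not the file splitter's own code. It assumes the JDK's built-in XSLT 1.0 processor (javax.xml.transform) rather than Saxon, so the stylesheet is declared as version 1.0; the example uses no 2.0-only features. The class name SimpleRecordTransform and its members are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, for illustration only.
class SimpleRecordTransform {
    // The stylesheet from the slide, declared as version 1.0 (assumption)
    // so that the JDK's built-in XSLT 1.0 processor can run it.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:template match=\"record\">"
      + "<record><control><sourcerecordid>"
      + "<xsl:value-of select=\"@id\"/>"
      + "</sourcerecordid></control>"
      + "<display><xsl:copy-of select=\"title\"/></display>"
      + "</record></xsl:template></xsl:stylesheet>";

    // The source record from the slide.
    static final String SOURCE =
        "<record id=\"12345678\"><title>A Very Simple Record</title></record>";

    // Applies the given stylesheet to the given XML and returns the result as a string.
    static String transform(String xslt, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XSLT, SOURCE));
    }
}
```

The printed record should carry the source id inside control/sourcerecordid and a copy of the title element inside display, matching the output document shown on the slide.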
Why XSLT?
1. XSLT is a programming language and therefore gives
added flexibility and power when creating the PNX.
2. Only one copy of the rules is needed when working with
multiple institutions. Use conditional logic (xsl:choose, or XPath 2.0
if-then-else expressions) for differences between institutions.
3. Some normalization rules can be extremely lengthy and
very complex. XSLT can greatly simplify that logic.
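
As a rough illustration of point 2, the sketch below keeps a single stylesheet and switches behavior on a stylesheet parameter. Everything specific here is an assumption made for the example: the parameter name institution, the codes INST_A and INST_B, and the "a-" prefix rule are hypothetical and do not come from the BYU rules. It again uses the JDK's built-in XSLT 1.0 processor so that it runs without extra jars.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, for illustration only.
class InstitutionRules {
    // One shared stylesheet. The hypothetical "institution" parameter selects
    // the branch: INST_A prefixes the record id with "a-", all others keep it as is.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:param name=\"institution\"/>"
      + "<xsl:template match=\"record\">"
      + "<record><control><sourcerecordid>"
      + "<xsl:choose>"
      + "<xsl:when test=\"$institution = 'INST_A'\">"
      + "<xsl:value-of select=\"concat('a-', @id)\"/>"
      + "</xsl:when>"
      + "<xsl:otherwise><xsl:value-of select=\"@id\"/></xsl:otherwise>"
      + "</xsl:choose>"
      + "</sourcerecordid></control></record>"
      + "</xsl:template></xsl:stylesheet>";

    static final String SOURCE =
        "<record id=\"12345678\"><title>A Very Simple Record</title></record>";

    // Runs the shared stylesheet for one institution code.
    static String transform(String institution, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            t.setParameter("institution", institution);
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("INST_A", SOURCE));
        System.out.println(transform("INST_B", SOURCE));
    }
}
```

Passing the institution as a parameter keeps the differences in one place. With an XSLT 2.0 processor such as Saxon, the same branch could instead be written as an XPath if-then-else expression.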
XSLT File Splitter
An extension of the built-in XML File Splitter
that performs XSLT transformations at harvest
time.
Overcoming Obstacles
Performance – Transformation performance is greatly enhanced by
using Michael Kay’s XSLT processor, Saxon.
Memory consumption – The transformation takes place on individual
records after the file has been split, not on the whole collection.
Complexity – XSLT is a programming language, so there's no getting
around that; however, XSLT rules can be easier to maintain than
normalization rules.
Flexibility – Using a more advanced XSLT processor like Saxon allows
one to use XSLT 2.0 (in the free Saxon-HE edition) or even XSLT 3.0
(in the paid editions). These technologies give an enormous amount
of flexibility.
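
The performance and memory points can be sketched roughly as follows: compile the stylesheet once, then transform each record on its own, so that only one record is held in memory at a time. This is not the splitter's actual implementation. The class PerRecordTransform is hypothetical, the records are assumed to have been split into separate strings already, and the JDK's built-in processor stands in for Saxon.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, for illustration only.
class PerRecordTransform {
    // A cut-down version of the slide's stylesheet (version 1.0 is an assumption).
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:template match=\"record\">"
      + "<record><control><sourcerecordid>"
      + "<xsl:value-of select=\"@id\"/>"
      + "</sourcerecordid></control></record>"
      + "</xsl:template></xsl:stylesheet>";

    // Transforms each already-split record separately, reusing one compiled stylesheet.
    static List<String> transformEach(List<String> records) {
        try {
            // Compile once; a Templates object can be reused for many transformations.
            Templates compiled = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(XSLT)));
            List<String> results = new ArrayList<>();
            for (String record : records) {
                // A fresh, cheap Transformer per record; only this record is parsed.
                Transformer t = compiled.newTransformer();
                t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
                StringWriter out = new StringWriter();
                t.transform(new StreamSource(new StringReader(record)),
                            new StreamResult(out));
                results.add(out.toString());
            }
            return results;
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        List<String> records = new ArrayList<>();
        records.add("<record id=\"1\"><title>First</title></record>");
        records.add("<record id=\"2\"><title>Second</title></record>");
        for (String result : transformEach(records)) {
            System.out.println(result);
        }
    }
}
```

With Saxon-HE on the classpath, the same JAXP calls should generally pick up Saxon through TransformerFactory.newInstance(), though the lookup depends on how the classpath is configured.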
Getting Started
1. Download the jar file for the splitter from
https://bitbucket.org/byuhbll/lib-java-xsltfilesplitter/src
2. Download Saxon-HE (saxon9he.jar) from
http://saxon.sourceforge.net/
3. Copy both jars into the following directory on your Primo server:
/exlibris/primo/p#_#/ng/primo/home/profile/publish/publish/production/conf/fileSplitter/lib
4. Copy the sample marc folder of XSL files from
https://bitbucket.org/byuhbll/lib-java-xsltfilesplitter/src to
a location on your Primo server
5. Customize the XSLT for your institution
6. Prepare your XML for harvest as usual
7. Add the file splitter to the mapping tables
8. Configure the file splitter
9. Create a 1:1_XML normalization mapping set
10. Set up your data source and pipe to use the XSLT File
Splitter
Thank You