Using Quadstone’s Data Build Manager
Thursday, September 15, 2005
9am Pacific, 12pm Eastern, 5pm UK/Ireland
Friday, September 16, 2005
2pm UK/Ireland, 3pm Central European, 9am Eastern US
Please join the teleconference call now; if you have
any difficulty, contact [email protected].
How to ask questions
Use Q&A (not Chat please):
• Click on the Q&A Panel icon at the
bottom-right of your screen:
• Type in your question:
© 2005 Quadstone
Using Data Build Manager
• Presenter: Patrick Surry, VP Customer Services
• Overview: The Data Build Manager (also known as
qsbuild) is a powerful tool to manage all of the
interdependent steps in real-world data preparation,
including parameterization for automated scheduling
and the ability to run tasks in parallel.
• Audience: Existing Quadstone data architects
looking to improve processing speed and their
productivity in creating customer analysis datasets.
• Format:
• A live demo with slides for sign-posting
• Downloadable exercises in the form of a workbook and
dataset
• Duration: 1 hour, including Q&A
A simple data preparation process
[Flow diagram. Inputs: Customer data, Transaction data. Steps shown: SORT, DERIVE, SORT, JOIN, MEASURE, DERIVE. Other labels: Customer IDs, To be filled. Output: Measurement table.]
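As a rough illustration of the kind of process this diagram describes, here is a minimal Python sketch. It is not Quadstone code: the field names (`customer_id`, `amount`), the sample records, and the exact order of the steps are assumptions made for illustration. In the Quadstone System these steps would be carried out by data-build commands operating on foci.

```python
from collections import defaultdict

# Hypothetical sample data; field names are illustrative only.
customers = [
    {"customer_id": 3, "name": "Cho"},
    {"customer_id": 1, "name": "Ames"},
    {"customer_id": 2, "name": "Bell"},
]
transactions = [
    {"customer_id": 2, "amount": 40.0},
    {"customer_id": 1, "amount": 15.0},
    {"customer_id": 2, "amount": 10.0},
]

def measure(rows):
    """MEASURE: aggregate transactions to one row per customer."""
    totals = defaultdict(lambda: {"n_trans": 0, "total_spend": 0.0})
    for row in rows:
        agg = totals[row["customer_id"]]
        agg["n_trans"] += 1
        agg["total_spend"] += row["amount"]
    return totals

def build_measurement_table(customers, transactions):
    # SORT: order customers by the join key.
    ordered = sorted(customers, key=lambda c: c["customer_id"])
    aggregates = measure(transactions)
    table = []
    for cust in ordered:
        # JOIN: attach aggregates; customers with no transactions get zeros.
        agg = aggregates.get(cust["customer_id"], {"n_trans": 0, "total_spend": 0.0})
        row = {**cust, **agg}
        # DERIVE: compute a new field from existing ones.
        row["avg_spend"] = row["total_spend"] / row["n_trans"] if row["n_trans"] else 0.0
        table.append(row)
    return table

measurement_table = build_measurement_table(customers, transactions)
```

The result has one row per customer, which is the shape a measurement table for customer analysis typically needs.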
Quadstone data preparation tools
[Diagram labels: XML Build Plan; utilities Sort, Join, Measure, Enhance, etc.; FOCUS (shown twice); RDBMS1, RDBMS2, Flat files and Third-party (each shown twice).]
• Efficient modular utilities operating primarily on foci
• Run via Quadstone System Explorer, the command
line, or an XML build plan
Data-build commands
qsbuild
IMPORTING
qsgenfdd
qsimportflat
qsimportstat
ENHANCING
qssort
qsrenamefields
qsselect
qsderive
qsmeasure
qstrack
qsimportmetadata
qsupdate
[qsinterp]
[qsexportmetadata]
MANAGING
COMBINING
qsimportfocus
qsjoin
qsappendfields
qsmerge
qscopy
qslink
qsmove
qsremove
[qsremoveflat]
qstml
EXPORTING
qsdbcreatetable
qsdbinsert
qsdbupdate
FOCUS
FOCUS
qsdbaccess
qsimportdb
TRANSFORMING
qsexportflat
qsexportstat
REPORTING
qsdescribe
qsdescribestat
qsaudit
qsdtsnapshot
qsscsnapshot
qsxt
qsxt2spec
qsmapgen
[qsinfo]
See Quadstone System data-build command and TML reference
What does Data Build Manager do?
• Flexible environment for implementing data-builds
• Keep simple builds simple but support advanced requirements
• XML build plan; qsbuild DBC, point & click build execution
• Key features:
• Simple & robust – simple structure with many different tasks
• Complete – everything in one place, including inline
TML/FDL/SQL (if desired), and/or non-Quadstone tasks
• Modular & portable – structure, reuse and move builds easily
• Parameters – no code changes for similar builds
• Incremental builds – failure recovery, only do what’s needed
• Concurrency – run multiple jobs at the same time
• Logging – various ways to track build status and performance
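To make the concurrency and status-tracking ideas above concrete, here is a minimal Python sketch of running independent jobs at the same time and recording how each one finished. It is not how qsbuild is implemented; the job names and the `run_job` function are hypothetical stand-ins for real build tasks.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_job(name, seconds):
    """Hypothetical stand-in for a build task such as a sort or an import."""
    time.sleep(seconds)  # simulate work
    return f"{name} done"

def run_concurrently(jobs, max_workers=2):
    """Run independent jobs in parallel; return {job name: status}."""
    status = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(run_job, name, secs): name for name, secs in jobs}
        for future in as_completed(futures):
            name = futures[future]
            try:
                status[name] = future.result()
            except Exception as exc:  # record the failure instead of stopping
                status[name] = f"failed: {exc}"
    return status

# Two jobs with no dependency on each other, so they can overlap.
jobs = [("sort_customers", 0.05), ("sort_transactions", 0.05)]
status = run_concurrently(jobs)
```

Only jobs that do not depend on each other's outputs are safe to run this way; dependent steps still have to run in order.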
How do I launch it?
• Double-click a .qsb file (Build Plan) in the Quadstone Explorer
What does a build plan look like?
• Right-click a .qsb file in the QSE and choose View or Edit
What is XML?
Borrowed from: http://www.w3.org/XML/1999/XML-in-10-points
1. XML is for structuring data
Structured data includes things like spreadsheets, address books, configuration
parameters, financial transactions, and technical drawings. … XML makes it
easy for a computer to generate data, read data, and ensure that the data
structure is unambiguous. …
2. XML looks a bit like HTML
Like HTML, XML makes use of tags (words bracketed by '<' and '>') and
attributes (of the form name="value"). While HTML specifies what each tag and
attribute means, and often how the text between them will look in a browser,
XML uses the tags only to delimit pieces of data, and leaves the interpretation of
the data completely to the application that reads it. …
3. XML is text, but isn't meant to be read
… One advantage of a text format is that it allows people, if necessary, to look
at the data without the program that produced it; in a pinch, you can read a text
format with your favorite text editor. Text formats also allow developers to more
easily debug applications. Like HTML, XML files are text files that people
shouldn't have to read, but may when the need arises. …
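The tags-and-attributes structure described in point 2 can be seen by parsing a small document with Python's standard library. The element and attribute names below (`build`, `target`, `task`, `name`, `command`) are invented for illustration and are not the actual build-plan schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML; element and attribute names are illustrative only.
doc = """
<build name="example">
  <target name="prepare">
    <task command="sort"/>
    <task command="join"/>
  </target>
</build>
"""

root = ET.fromstring(doc)

# Tags delimit pieces of data; attributes carry name="value" pairs.
# The application (here, this script) decides what they mean.
commands = [task.get("command") for task in root.iter("task")]
target_names = [t.get("name") for t in root.findall("target")]
```

The parser only recovers the structure; giving meaning to a `task` or a `command` is left to whatever program reads the file, as point 2 says.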
How can I change it?
• Extend a target with new steps (tasks)
• Cut & paste examples from documentation
• Cut & paste from command-line or focus history
• Create new targets for logical separation
• Note build’s default target (and initial, final); dependencies
• Nest targets if desired
• Increase efficiency
• Conditional execution with ‘unless’ to avoid rework
• Temporary outputs to avoid clutter
• Inline or external scripts (TML, FDL, SQL, …)
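The ideas on this slide (targets with dependencies, and conditional execution to avoid rework) can be sketched in Python. This illustrates the concept only and is not qsbuild's implementation; the target names, the `unless` callables, and the `outputs` set standing in for datasets on disk are all hypothetical.

```python
def run_build(targets, goal, done=None, log=None):
    """Run `goal` after its dependencies, skipping any target whose
    `unless` condition is already true (e.g. its output already exists).
    No cycle detection: this sketch assumes the dependencies are acyclic."""
    done = set() if done is None else done
    log = [] if log is None else log
    if goal in done:
        return log
    spec = targets[goal]
    for dep in spec.get("depends", []):
        run_build(targets, dep, done, log)
    unless = spec.get("unless")
    if unless is not None and unless():
        log.append(f"skip {goal}")
    else:
        spec["action"]()
        log.append(f"run {goal}")
    done.add(goal)
    return log

# Hypothetical stand-in for outputs that already exist from an earlier run.
outputs = {"sorted_customers"}

targets = {
    "sort": {
        "action": lambda: outputs.add("sorted_customers"),
        "unless": lambda: "sorted_customers" in outputs,
    },
    "join": {
        "depends": ["sort"],
        "action": lambda: outputs.add("joined"),
        "unless": lambda: "joined" in outputs,
    },
    "measure": {
        "depends": ["join"],
        "action": lambda: outputs.add("measurement_table"),
    },
}

log = run_build(targets, "measure")
```

Because the sorted output already exists, the sort step is skipped and only the join and measure steps run. This is the rework-avoiding behaviour that conditional execution is meant to give.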
Making it reusable
• Use properties to avoid repetitive changes
• Like variables but can’t change once set
• Tasks to set and manipulate in many ways
• Parameters are user-visible properties
• E.g. User selects build snapshot date
• E.g. User selects full or sample datasets
• Example:
• Parameterize build with target month
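A minimal Python sketch of the "can’t change once set" behaviour described above. The `Properties` class, the property names, and the `${name}` reference syntax are hypothetical. The first-value-wins rule is modelled on Ant-style properties and is an assumption, not a statement of qsbuild's exact semantics.

```python
class Properties:
    """Write-once key/value store: the first value set for a name wins."""

    def __init__(self):
        self._values = {}

    def set(self, name, value):
        # Later attempts to set an existing property are ignored.
        if name not in self._values:
            self._values[name] = value
        return self._values[name]

    def get(self, name):
        return self._values[name]

    def expand(self, text):
        """Replace ${name} references with property values."""
        for name, value in self._values.items():
            text = text.replace("${" + name + "}", str(value))
        return text

props = Properties()
props.set("target.month", "2005-09")   # parameter chosen by the user
props.set("target.month", "2005-01")   # default in the plan; ignored
output_name = props.expand("measures_${target.month}")
```

With this rule a value supplied up front overrides a default defined later, so the same plan can be rerun for a different month without editing it.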
More flexible ways of editing
• Some editors know XML (& schema!): very helpful
• See documentation for how to set it up, e.g. jEdit
Going further
• RTFM – good overview of capabilities
• Concurrency
• Default values for common attributes
• Date manipulation via qsdateproperty, e.g. today
• Debugging techniques
• Running from the command-line
• Good practices: tips & traps
• Other resources on XML, Ant, etc.
Where to get help
• Start>All Programs>Quadstone
• Using Data Build Manager
• Also see: Data-build command and TML reference;
Data-build command Tutorial
• Latest at support.quadstone.com/documentation
• Quadstone System Support:
• Web Site: support.quadstone.com/
• Email: [email protected]
• Tel: US 1-800-335-3860; UK 0131 240 3140; All +44 131 240 3140
After the webinar
• These slides, a workbook and data are available via
www.quadstone.com/training/webinars/
• Audio and video recordings of this webinar are
available via the same site
• Any problems or questions, please contact
[email protected]
• For more in-depth training (our ½-day Automating
Data Preparation course), contact
[email protected]
Questions and answers
Upcoming webinars
Ideas:
• Based around a real-life scenario (possibly Uplift
Analysis)?
• Decision trees, scorecards and quality measures: the
gory math internals?
• What's new in 5.2
• Cluster Builder?
• More on TML?
• Re-run previous webinars
See www.quadstone.com/training/webinars/.
If there’s a webinar topic you’d like to
see, please let us know via
[email protected].
Your feedback
Please email [email protected]