Uplift Modeling

Download Report

Transcript Uplift Modeling

The Wonders of
Field Derivation Language (FDL)
Wednesday, May 18, 2005
2pm UK/Ireland, 3pm Central European, 9am Eastern US
Thursday, May 19, 2005
9am Pacific, 12pm Eastern, 5pm UK/Ireland
Please join the teleconference call now; if you have
any difficulty, contact [email protected].
Starting
Starting
Starting
in
in10
15
5 now
2
minutes
minutes
How to ask questions
• Return to WebEx Event Manager:
• Use Q&A (not Chat):
• You can return to full-screen view:
© 2005 Quadstone
The Wonders of Field Derivation Language (FDL)
• Presenter: Eileen MacGregor, Support Manager
• Overview: Field Derivation Language (FDL) is one of
•
•
•
the most powerful features of the Quadstone System.
It allows you to transform or combine fields to
produce more useful data for analysis. However, the
programmatic nature of FDL can be off-putting. This
webinar aims to provide practical examples and
useful tips for creating your own derived fields.
Audience: Anyone who wants to enhance their
analysis with derived fields.
Format: A live demo, with slides for an introduction
Duration: 90 minutes
© 2005 Quadstone
Overview
• How do I derive a field?
• Tips and guidelines
• Working with numbers, strings, dates and fields in
•
•
FDL
Practical applications
• Flagging segments
• Dealing with Nulls
• Converting between data types
• Adding reference information
• Random selections
• Matching patterns in text
Gotchas!
© 2005 Quadstone
Colour coding in examples
•
•
•
•
FDL functions in red
Field names in blue
Literals (number/date/string) purple
All other FDL syntax in black
© 2005 Quadstone
What do derivations do?
• Most derivations manipulate existing data,
but derivations do not need to refer to any
field names — they can also use only
functions:
Age + 10;
countwholeyears( DOB, today() );
if ( rndUniform() < 0.4 ) then "Y" else "N";
today();
© 2005 Quadstone
FDL “laws”
• You can do virtually anything in FDL
• There is always more than one way to do
•
•
•
everything
A good chance you’ll get it wrong first time —
use the Help
( ) and ; are your friends
You won’t break anything by trying
© 2005 Quadstone
How to derive a field
In Decisionhouse, derivations are done in the
Table Viewer, by clicking on the New Field
button:
© 2005 Quadstone
How to derive a field
By default, a single line derivation textbox
appears:
© 2005 Quadstone
How to derive a field — tip
A multi-line textbox can be accessed by
clicking on
© 2005 Quadstone
How to derive a field — tip
• All built-in functions can be accessed by
clicking on
© 2005 Quadstone
How to derive a field — tip
• Function and field names can be entered
partially, then completed using TABcompletion
• Useful when there are lots of similarly-named
fields
© 2005 Quadstone
How to derive a field — tip
• Use the Quadstone System Help, which can
be accessed by pressing F1
© 2005 Quadstone
Derivations — guidelines
• Field names must be SQL-compliant (no
reserved words and with only alphanumeric
characters and the underscore _ character)
© 2005 Quadstone
Derivations — guidelines
• Working with fields, strings and dates
• Fieldnames can be single-quoted or have no quotes:
'SpendInCents' / 100
is identical to
SpendInCents / 100
• Auto-generated fields from DTB and SCB use single
quotes
© 2005 Quadstone
Derivations — guidelines
• Working with fields, strings and dates
• String values must be enclosed in double quotes
Gender = "Male"
© 2005 Quadstone
Derivations — guidelines
• Working with fields, strings and dates
• Date values must be preceded by a hash
DOB < #01/01/1984
FirstTrans > #2000/06/30
© 2005 Quadstone
Derivations — what if I get it wrong?
If there is a mistake in the derivation, an error
will be displayed:
© 2005 Quadstone
Derivations — what if I get it wrong?
If you derive a field, but use the wrong input
fields/functions etc.
• Either delete the field using
and start again
• Or simply correct the input FDL and press
<Return>
© 2005 Quadstone
Flagging a segment
• To flag a segment of the population using a
logical expression:
if Gender = "F" and Income > 35000
then "Target" else "No Target"
© 2005 Quadstone
Handling Nulls in data
• Missing data (Nulls) can be transformed:
if isnull(Response) then 0 else
Response
nvl(Response, 0)
© 2005 Quadstone
Handling Nulls in data
• Special functions handle Null data:
minnonnull(CallsQ1, CallsQ2, CallsQ3)
sumnonnull(Spend1, Spend2, Spend3)
© 2005 Quadstone
Converting between types
There are three datatypes for fields in the
Quadstone System:
• Numeric
• String
• Date
© 2005 Quadstone
tointeger() / toreal()
tostring()
todate()
Converting to numeric
• Integer → Real
• Real → Integer
• Note anything after the decimal point will be
truncated
• Dates and suitable strings can be converted
to numeric field types
• Nulls will be generated for unsuitable strings
© 2005 Quadstone
Converting to numeric
• toreal(TotalTax)
• tointeger(CurrentBalance)
• tointeger(DateField)
• tointeger( substr(StateCode, 2, 4) )
© 2005 Quadstone
Converting to string
• Numerics and dates can be converted to
•
string values
Why convert to string? In order to use other
string functions.
© 2005 Quadstone
Converting to string
tostring( today() )
concat(tostring(HouseNo), " ",StreetName)
© 2005 Quadstone
Converting to string
Example
How do I convert a date field into a string field
in the format “YYYY-MM” ?
© 2005 Quadstone
Converting to string
The answer initially given:
M1 := month(Mydate);
Yr := tostring(year(Mydate));
M2 := if (M1 < 10) then concat("0", tostring(M1))
else tostring(M1);
concat(Yr, "-", M2)
© 2005 Quadstone
Converting to string
The solution I came up with yesterday:
x := tostring( Mydate );
concat( substr(x, 6, 9), "-", substr(x, 3, 4) )
© 2005 Quadstone
Converting to date
• Before the type-conversion functions, only
individual numeric date components could be
used to create a date field, using:
• adddays, addmonths, addyears
© 2005 Quadstone
Converting to date
addyears(addmonths(adddays(#01/01/2000,
MyDay - 1), MyMonth - 1), MyYear - 2000)
© 2005 Quadstone
Converting to date
• Much easier now
• todate(20050515) — integer is in YMD
format
• todate("15/05/2005") — string is in ‘read’
format set in preferences, e.g. European,
American
• todate("15-May-2005", "%d-%b-%Y") —
specified date format
© 2005 Quadstone
Using lookup tables
• Reference information can be added from
lookup tables, using ‘key’ values in focus
• Lookup tables generally small
• Data does not have to be sorted
© 2005 Quadstone
Using lookup tables
© 2005 Quadstone
Using lookup tables
I have a StateCode field in my focus, and I want to
add the average income value per state. This data is
held in another table
© 2005 Quadstone
Using lookup tables
Use the dblookup() function
dblookup("focus:D:\Ref\USIncome.ftr",
"US_State", "MeanIncome",
substr(StateCode, 0, 1))
© 2005 Quadstone
Selecting records randomly
• Datasets can be split into segments, for
•
•
example Test/Training:
• numericTestTrainSplit()
Equal-sized populations:
• sampleEqualSize()
Stratified sampling:
• sampleStratified()
© 2005 Quadstone
Selecting records randomly
• To split the population into a 40% Test
segment and a 60% Training segment:
numericTestTrainSplit(0.4)
© 2005 Quadstone
Selecting records randomly
• To sample 2500 records from a population
comprising females:
sampleExactNumber(2500)
© 2005 Quadstone
Selecting records randomly
• To sample 2500 records from a segment
containing 17219 records, where the
segment comprises females, and 5000
records from the remaining population of
82781 records:
© 2005 Quadstone
Selecting records randomly
sampleStratified(2500, 17219, 5000, 82781,
Gender = 2)
© 2005 Quadstone
Regular expression matching
• It is possible to ‘find’ patterns within string
fields:
• match()
• And ‘replace’ patterns in strings:
• replaceall() , replacefirst()
© 2005 Quadstone
Regular expression matching
• To look for StateCode values which begin
with an alphabetic character:
match("^[A-Z]" , StateCode )
© 2005 Quadstone
Regular expression matching
• To look for StateCode values which only
contain alphabetic characters:
match("^[A-Z]\+$" , StateCode )
© 2005 Quadstone
Removing part of text in a string
How can I remove the title information from a
name field?
replacefirst("^Mr & Mrs ", "", FullName)
© 2005 Quadstone
Removing part of text in a string
To handle all the different titles in one derivation:
case
match
("^Mr
& Mrs ", FullName)
:
replacefirst("^Mr & Mrs ", "", FullName)
match
("^Mrs ", FullName) :
replacefirst("^Mrs ", "", FullName)
match
("^Mr ", FullName) :
replacefirst("^Mr ", "", FullName)
…
© 2005 Quadstone
;
;
;
Removing part of text in a string
To handle all the different titles in one derivation:
replacefirst( "^Mr & Mrs \|^Miss \|^Mrs \|^Ms
\|^Mr \|^Rev \|^Dr " , "", FullName)
© 2005 Quadstone
Regular expression matching
One customer’s use:
• To determine all possible entries for a product
from a free-text field:
match( "123", ProductCode )
© 2005 Quadstone
Global variables
• Global or accumulator variables allow values
to be computed over entire sets of records
For example, to create an index field:
global i := 0;
i := i + 1;
© 2005 Quadstone
Global variables
• The most common use by customers is to
‘de-duplicate’ a dataset:
© 2005 Quadstone
Global variables
• The most common use by customers is to
‘de-duplicate’ a dataset:
© 2005 Quadstone
Global variables
global lastID := "";
flag1st := (CustID != lastID);
lastID := CustID;
flag1st;
© 2005 Quadstone
Derivations — gotchas
• Derived fields are created in the focus
currently being used (so beware if subfoci
exist — they don’t propagate)
© 2005 Quadstone
Derivations — gotchas
• Beware of rederiving fields with different
derivation text which may cause the output
data type to change
• The type of a derived field is fixed
• You won’t get what you expect
© 2005 Quadstone
Derivations — gotchas
• Results are generated for all records in the
focus, irrespective of current selections
• isselected()
© 2005 Quadstone
Derivations — gotchas
• Field names are case-sensitive
© 2005 Quadstone
Where to find out more
• Quadstone System Help
• Field Derivation Language (FDL) reference
• Syntax and examples
• Quadstone System Support website:
http://support.quadstone.com/
© 2005 Quadstone
Questions and answers
© 2005 Quadstone
After the webinar
• These slides are available via
•
www.quadstone.com/training/webinars/
Any problems or questions, please contact
[email protected]
© 2005 Quadstone
Upcoming webinars
What's new in the Quadstone System
Version 5.1?
June 23, 2005 9am Pacific
11am Central
12 noon Eastern
What's new in the Quadstone System
Version 5.1?
June 24, 2005 14:00 UK/Ireland
15:00 Central European
Getting Started with the Quadstone
System
July 27, 2005
9am Pacific
11am Central
12 noon Eastern
Getting Started with the Quadstone
System
July 28, 2005
14:00 UK/Ireland
15:00 Central European
See www.quadstone.com/training/webinars/.
If there’s a webinar topic you’d like to see, please let us know via
[email protected].
© 2005 Quadstone
Your feedback
Please email [email protected]
© 2005 Quadstone