Transcript Slide 1

©2014 Experian Information Solutions, Inc. All rights reserved. Experian Confidential.
New Approaches
to Scorecard Optimization
Julian Yarkony
Data Scientist
©2014 Experian Information Solutions, Inc. All rights reserved. Experian and the marks used herein
are service marks or registered trademarks of Experian Information Solutions, Inc. Other product
and company names mentioned herein are the trademarks of their respective owners. No part of this
copyrighted work may be reproduced, modified, or distributed in any form or manner without the
prior written permission of Experian. Experian Confidential.
Overview
• Predictive model training is generally purely data driven
  ► It considers only accuracy, not explainability
• For FCRA reasons and general interpretability, we want models that are not only accurate but also easily explainable
  ► For example, the more delinquencies, the worse the score
• Explainability must be accounted for either manually or automatically
  ► Such a capability is not readily available
  ► Common practice in the industry is trial and error
Constrained Optimization
• The lab developed an algorithm to automatically train scorecards under these constraints
[Diagram: two workflows. First workflow: define modeling data set → build most accurate model → satisfy rules? → if no, adjust settings and rebuild. Second workflow: define modeling data set → define the rules → build the most accurate model that satisfies the rules.]
• Constrained optimization enables us to
  ► Include business knowledge
  ► Meet compliance requirements
  ► Remove unnecessary human intervention
A scorecard is an additive model
• To represent a scorecard as an optimization problem, we convert it to the same form as a regression
  ► To do this we convert our bins into binary indicator functions…
Original variable          Bin range   Parameter   Score value   New indicator variable
Utilization                0-25        W(1,1)      +50           isUtilLessThan25
Utilization                25-75       W(1,2)      0             isUtilBetween25and75
Utilization                75+         W(1,3)      -50           isUtilGreaterThan75
Number of Delinquencies    0           W(2,1)      +100          is0Delinq
Number of Delinquencies    1           W(2,2)      -75           is1Delinq
Number of Delinquencies    2+          W(2,3)      -50           is2orMoreDelinq
  ► …and then express the model as a weighted sum of those indicator functions
$$G(y \mid x, w) = \sum_{a,b} w(a,b)\, X(a,b)$$
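The conversion from bins to indicator variables can be sketched as follows. This is a minimal illustration rather than the lab's implementation: the `SCORECARD` structure, the function names, and the handling of bin edges (for example, which bin a utilization of exactly 25 falls into) are assumptions; the indicator names and score values are taken from the table on this slide.

```python
# Hypothetical scorecard layout: variable -> list of (indicator name, bin test, score).
# Bin-edge handling (25 and 75 go to the higher bin) is an assumption.
SCORECARD = {
    "utilization": [
        ("isUtilLessThan25", lambda u: u < 25, 50),
        ("isUtilBetween25and75", lambda u: 25 <= u < 75, 0),
        ("isUtilGreaterThan75", lambda u: u >= 75, -50),
    ],
    "delinquencies": [
        ("is0Delinq", lambda d: d == 0, 100),
        ("is1Delinq", lambda d: d == 1, -75),
        ("is2orMoreDelinq", lambda d: d >= 2, -50),
    ],
}


def indicators(record):
    """Map raw variable values to the binary indicators X(a, b)."""
    x = {}
    for variable, bins in SCORECARD.items():
        for name, in_bin, _ in bins:
            x[name] = 1 if in_bin(record[variable]) else 0
    return x


def score(record):
    """Weighted sum of indicators: sum over (a, b) of w(a, b) * X(a, b)."""
    x = indicators(record)
    return sum(
        weight * x[name]
        for bins in SCORECARD.values()
        for name, _, weight in bins
    )


low_risk = {"utilization": 10, "delinquencies": 0}
high_risk = {"utilization": 80, "delinquencies": 1}
print(score(low_risk), score(high_risk))
```

Exactly one indicator per variable is active for any record, so the score is the sum of one bin weight per variable.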
Framing the problem mathematically
• Once the problem has been converted to an optimization problem, it is easier to see how to express the constraints mathematically
$$\min_{w} F(w) = \sum_{X} E(X \mid w)$$
Subject to:
$$W(a,1) \le W(a,2), \quad W(a,2) \le W(a,3)$$
$$W(b,1) \le W(b,2), \quad W(b,2) \le W(b,3), \quad W(b,3) \le W(b,4)$$
Callout: the score of each bin of the first variable must increase (variable is monotonic)
$$W(a,1) \le 50, \quad W(b,3) \le 50$$
Callout: the score of each bin of the second variable must not exceed 50 (limit contribution)
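As a rough illustration of how such rules can be checked mechanically, the sketch below tests a list of bin weights against monotonicity and an optional upper bound. The function names and the message format are hypothetical; only the two kinds of constraint (adjacent bins non-decreasing, weights capped at a bound such as 50) come from the slide.

```python
def monotonic_pairs(n_bins):
    """Index pairs (i, j) encoding the constraint w[i] <= w[j] for adjacent bins."""
    return [(i, i + 1) for i in range(n_bins - 1)]


def violations(weights, upper_bound=None):
    """List the constraints that a vector of bin weights breaks (empty if legal)."""
    problems = []
    for i, j in monotonic_pairs(len(weights)):
        if weights[i] > weights[j]:
            problems.append(f"bin {i + 1} > bin {j + 1}")
    if upper_bound is not None:
        for i, w in enumerate(weights):
            if w > upper_bound:
                problems.append(f"bin {i + 1} exceeds {upper_bound}")
    return problems


print(violations([-50, 0, 50], upper_bound=50))   # legal weights
print(violations([100, -75, -50]))                # breaks monotonicity once
```

Every constraint here is a linear inequality in the weights, which is why the feasible region is a convex polytope and linear-programming tools apply.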
Benefits of Current Algorithm
Correct trends in variables are ensured
Relative contribution of variable can be controlled
Bounds can be set to limit the impact of individual variables
Variables are implicitly selected
Can be applied to other types of models such as Linear/Logistic regression
Scales to large problems?
Challenges and Improvements
[Chart: "# of Variables vs. Runtime". Runtime in hours (axis 0 to 250) against # of variables (axis 0 to 1000); series: Existing Solution, New Solution.]
• We developed a new algorithm to address this challenge
• Lagrangian Learning/Constraint Splitting (LLCS) achieved a significant improvement in speed
• 380K examples in the training dataset
Formulation
$$\max_{w} F(w) \quad \text{s.t. all constraints are met}$$
(F(w) is the "profit function")
$$\max_{w} F(w) + D(w)$$
(D(w) is −∞ if any of the constraints is not met)
F(w): learning problem with many observations
D(w): finding a solution in a bounding box* on the weights, for example
• Solving the learning problem F(w) can be done with gradient descent.
• Projecting the weights onto the feasible set encoded by D(w) can be done using an LP (linear program)**
… Solving both at the same time is hard
*Actually a convex polytope
**Simple constraints don't need an LP and can be handled in constant time
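For the simple bounding-box case mentioned in the footnote, the penalty D(w) and the projection can be sketched as below. The names `D` and `project_box` are illustrative, and returning 0 when every constraint holds is an assumption (the slide only states the −∞ case). General polytope constraints would need an LP solver, which this sketch does not cover.

```python
import math


def D(w, lo, hi):
    """Constraint term: -inf if any weight leaves the box, else 0 (the 0 is assumed)."""
    return 0.0 if all(lo <= x <= hi for x in w) else -math.inf


def project_box(w, lo, hi):
    """Nearest point inside the box; each weight is clipped independently."""
    return [min(max(x, lo), hi) for x in w]


w = [-3.0, 0.5, 7.0]
print(D(w, 0.0, 1.0))                        # illegal point
print(project_box(w, 0.0, 1.0))              # clipped to the box
print(D(project_box(w, 0.0, 1.0), 0.0, 1.0)) # legal after projection
```

Because the box constraints are independent per weight, the projection costs a constant amount of work per weight and no LP is required.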
Idea
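Alternating between solving F(w) and D(w)
$$\max_{\mathbf{w}_1, w_2, w_3} F(\mathbf{w}_1) + D(w_3) - \lambda \|\mathbf{w}_1 - w_2\|_2 - \varphi \|w_2 - w_3\|_1$$
and
$$\max_{w_1, \mathbf{w}_2, \mathbf{w}_3} F(w_1) + D(\mathbf{w}_3) - \lambda \|w_1 - \mathbf{w}_2\|_2 - \varphi \|\mathbf{w}_2 - \mathbf{w}_3\|_1$$
while trying to make w1 = w2 = w3 (bold marks the variables optimized in each step)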
• Since we know F(w) and D(w) can each be solved easily on their own, why try to solve them together?
• We break the optimization into three separate problems, which are encouraged to take on the same state by the Lagrange multipliers λ, φ
Idea
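Alternating between solving F(w) and D(w)
$$\max_{\mathbf{w}_1, w_2, w_3} F(\mathbf{w}_1) + D(w_3) - \lambda \|\mathbf{w}_1 - w_2\|_2 - \varphi \|w_2 - w_3\|_1$$
and
$$\max_{w_1, \mathbf{w}_2, \mathbf{w}_3} F(w_1) + D(\mathbf{w}_3) - \lambda \|w_1 - \mathbf{w}_2\|_2 - \varphi \|\mathbf{w}_2 - \mathbf{w}_3\|_1$$
while trying to make w1 = w2 = w3 (bold marks the variables optimized in each step)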
• Optimizing over w1 is a learning problem.
  ► It is regularized by an L2 norm drawing it toward w2.
• Optimizing over w3 is an LP or a projection
  ► Curiously, jointly optimizing over w2 and w3 is also an LP or a projection
• Optimizing over w2 has a closed-form solution.
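One way to see why the w2 step has a closed form: if the λ term is taken to be a squared L2 distance (an assumption; the slides do not spell out whether the norm is squared), the problem separates per coordinate and the minimizer of λ(w1 − w2)² + φ|w2 − w3| is a soft-thresholding step. The sketch below shows that case only; the function names are hypothetical.

```python
def soft_threshold(x, tau):
    """Shrink x toward zero by tau, returning 0 when |x| <= tau."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def update_w2(w1, w3, lam, phi):
    """Per-coordinate minimizer of lam*(w1 - w2)**2 + phi*abs(w2 - w3).

    Assumes a squared L2 penalty; tau = phi / (2 * lam) follows from setting
    the derivative with respect to w2 to zero.
    """
    tau = phi / (2.0 * lam)
    return [c + soft_threshold(a - c, tau) for a, c in zip(w1, w3)]


# Far from w3: w2 moves part of the way. Close to w3: w2 snaps onto w3.
print(update_w2([3.0, 1.2], [1.0, 1.0], lam=1.0, phi=1.0))
```

The L1 term is what lets w2 land exactly on w3 once the two are close, while the L2 term keeps w2 tied smoothly to w1.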
Idea
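Alternating between solving F(w) and D(w)
$$\max_{\mathbf{w}_1, w_2, w_3} F(\mathbf{w}_1) + D(w_3) - \lambda \|\mathbf{w}_1 - w_2\|_2 - \varphi \|w_2 - w_3\|_1$$
and
$$\max_{w_1, \mathbf{w}_2, \mathbf{w}_3} F(w_1) + D(\mathbf{w}_3) - \lambda \|w_1 - \mathbf{w}_2\|_2 - \varphi \|\mathbf{w}_2 - \mathbf{w}_3\|_1$$
while trying to make w1 = w2 = w3 (bold marks the variables optimized in each step)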
• The design forces F(w3) to be as good as F(w1) while keeping F(w1) a good score.
• The Lagrange multipliers λ, φ are increased on a special schedule
  ► The idea is to slowly force w1 and w3 together in a principled way
Progression of the algorithm
[Figure sequence across five slides: the points w1, w2, and w3 start apart and move closer from frame to frame; in the last frame their labels overlap.]
Why the L2 and L1 norms
[Figure: L2 norm versus L1 norm]
• We use an L1 norm to solve the constrained optimization
  ► An L2 norm results in a quadratic program, which is a lot harder
• By tying the two optimization problems together with w2, we get the best of both worlds
Why gradually increase 𝜑, 𝜆
• If we initially set φ, λ to infinity, then w1, w2, w3 all take on the original value of w3
  ► Meaning you get some arbitrarily bad (but legal!) solution
• Gradually increasing φ, λ according to a principled schedule allows learning to occur
• When φ, λ become large, w1, w2, w3 will be similar
  ► Meaning the solution w3 is both good, as measured by F(w3), and legal
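The effect of the schedule can be seen on a toy problem. The sketch below is not the LLCS implementation: it assumes a made-up profit function F(w) = −Σ(w − target)², box constraints, a squared L2 coupling term, and a plain geometric growth of λ and φ in place of the schedule described in the talk. All names are hypothetical.

```python
def soft_threshold(x, tau):
    """Shrink x toward zero by tau, returning 0 when |x| <= tau."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def llcs_toy(target, lo, hi, lam=0.1, phi=0.1, growth=1.05, iters=400):
    """Alternate the w1 step and the joint (w2, w3) step while growing lam, phi."""
    start = min(max(0.0, lo), hi)          # a legal but arbitrary starting point
    w1 = [start] * len(target)
    w2 = list(w1)
    w3 = list(w1)
    for _ in range(iters):
        # w1 step: maximize F(w1) - lam*(w1 - w2)^2 for the toy quadratic F.
        w1 = [(t + lam * b) / (1.0 + lam) for t, b in zip(target, w2)]
        # Joint step: w3 is the projection of w1 onto the box, then w2 is
        # pulled from w1 toward w3 by soft thresholding.
        w3 = [min(max(a, lo), hi) for a in w1]
        tau = phi / (2.0 * lam)
        w2 = [c + soft_threshold(a - c, tau) for a, c in zip(w1, w3)]
        lam *= growth
        phi *= growth
    return w1, w2, w3


target = [3.0, 0.5, -2.0]
good = llcs_toy(target, 0.0, 1.0)                       # gradual increase
stuck = llcs_toy(target, 0.0, 1.0, lam=1e12, phi=1e12,  # huge from the start
                 growth=1.0, iters=50)
print([round(x, 4) for x in good[2]])
print([round(x, 4) for x in stuck[2]])
```

With gradual growth, w3 ends up at the best legal point for this toy F (the target clipped to the box); with multipliers that are huge from the first iteration, all three copies stay pinned near the arbitrary legal starting point.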
Updates on 𝜑, 𝜆
The formula for the updates is below:
• $D = \|w_1 - w_3\|_2$
• Increase φ, λ to force w2 to lie in the middle of w1 and w3 to ensure rapid convergence
• Choose a so that $\lambda (D/2)^2 = \varphi a (D/2)$, i.e. $a = \frac{\lambda (D/2)}{\varphi}$
• If a > 1: φ ← φa
• If a < 1: λ ← λ/a
• Afterward we multiply each of φ, λ by 1.01; this is sufficient to ensure that w1, w2, w3 converge to the same value
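The update rule above can be sketched in a few lines. The function name and the `bump` parameter are illustrative, and skipping the rebalancing step when w1 and w3 already coincide (D = 0) is an assumption added to avoid dividing by zero; the slide does not cover that case.

```python
import math


def update_multipliers(w1, w3, lam, phi, bump=1.01):
    """Rebalance lam and phi so that lam*(D/2)^2 == phi*(D/2), then grow both."""
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(w1, w3)))  # D = ||w1 - w3||_2
    if d > 0.0:  # assumption: nothing to rebalance once w1 == w3
        a = lam * (d / 2.0) / phi
        if a > 1.0:
            phi *= a      # the L1 pull is too weak: raise phi
        elif a < 1.0:
            lam /= a      # the L2 pull is too weak: raise lam
    return lam * bump, phi * bump


lam, phi = update_multipliers([4.0, 0.0], [0.0, 0.0], lam=1.0, phi=1.0)
print(lam, phi)
```

Either branch only ever increases a multiplier, so together with the final 1.01 factor both λ and φ grow on every call.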
Conclusions and Expansions
• The LLCS optimization approach allows
  ► A vast expansion in the number of attributes
  ► The ability to optimize complex linear constraints over many variables
• We dramatically improved speed without sacrificing model accuracy
  ► We even observed small performance improvements
• By improving scalability, we can now solve a broader set of problems more efficiently
Name: Julian Yarkony
Title: Scientist
Company: Experian Datalabs