Transcript Slide 1
©2014 Experian Information Solutions, Inc. All rights reserved. Experian Confidential.

New Approaches to Scorecard Optimization
Julian Yarkony, Data Scientist

©2014 Experian Information Solutions, Inc. All rights reserved. Experian and the marks used herein are service marks or registered trademarks of Experian Information Solutions, Inc. Other product and company names mentioned herein are the trademarks of their respective owners. No part of this copyrighted work may be reproduced, modified, or distributed in any form or manner without the prior written permission of Experian.

Slide 3
Overview
Predictive model training is generally purely data driven
► Only considers accuracy, not explainability
For FCRA reasons and general interpretability, we want models that are not only accurate but also easily explainable
► For example, the more delinquencies, the worse the score
Explainability must be accounted for either manually or automatically
► Such a capability is not readily available
► Common practice in the industry is trial and error

Slide 4
Constrained Optimization
The lab developed an algorithm to automatically train scorecards with these constraints
[Flowchart. Box labels: Define modeling data set; Define the rules; Build most accurate model; Satisfy the rules?; No; Adjusting settings; Build most accurate model that satisfies the rules]
Constrained optimization enables us to
► Include business knowledge
► Meet compliance requirements
► Remove unnecessary human intervention

Slide 5
A scorecard is an additive model
In order to represent scorecards as an optimization problem, we convert the problem to the same form as a regression
► To do this we convert our bins into binary indicator functions…

Original variable       | Bin range | Parameter | Score value | New indicator variable
Utilization             | 0-25      | W(1,1)    | +50         | isUtilLessThan25
Utilization             | 25-75     | W(1,2)    | 0           | isUtilBetween25and75
Utilization             | 75+       | W(1,3)    | -50         | isUtilGreaterThan75
Number of Delinquencies | 0         | W(2,1)    | +100        | is0Delinq
Number of Delinquencies | 1         | W(2,2)    | -75         | is1Delinq
Number of Delinquencies | 2+        | W(2,3)    | -50         | is2orMoreDelinq

► …and then express the model as a weighted sum of those indicator functions
$G(y \mid x, w) = \sum_{a,b} w(a,b)\, X(a,b)$

Slide 6
Framing the problem mathematically
Once the problem has been converted to an optimization problem, it is easier to see how to express the constraints mathematically
$\min_{w} F(w) = \sum_{X} E(X \mid w)$
Subject to:
$W(a,1) \le W(a,2)$, $W(a,2) \le W(a,3)$, $W(b,1) \le W(b,2)$, $W(b,2) \le W(b,3)$, $W(b,3) \le W(b,4)$
► The score of each bin of the first variable must increase (variable is monotonic)
$W(a,1) \le 50$, $W(b,3) \le 50$
► The score of each bin of the second variable must not exceed 50 (limit contribution)

Slide 7
Benefits of Current Algorithm
Correct trends in variables are ensured
Relative contribution of each variable can be controlled
Bounds can be set to limit the impact of individual variables
Variables are implicitly selected
Can be applied to other types of models such as linear/logistic regression
Scales to large problems?

Slide 8
Challenges and Improvements
[Figure: # of Variables vs. Runtime. x-axis: # of variables (0 to 1000); y-axis: Runtime (hours) (0 to 250); series: Existing Solution, New Solution]
We developed a new algorithm to address this challenge
Lagrangian Learning/Constraint Splitting (LLCS) achieved significant improvement in speed
380K examples in the training dataset

Slide 9
Formulation
$\max_{w} F(w)$ s.t. all constraints are met ($F(w)$ is the "profit function")
$\max_{w} F(w) + D(w)$ ($D(w)$ is $-\infty$ if any of the constraints is not met)
F(w): learning problem with many observations
D(w): finding a solution in a bounding box* on the weights, for example
Solving the learning problem F(w) can be done with gradient descent.
Projecting onto the space defined by D(w) can be done using an LP (linear program)**
… Solving both at the same time is hard
*Actually a convex polytope
**Simple constraints don't need an LP and can be solved in constant time

Slide 10
Idea: Alternating between solving F(w) and D(w)
$\max_{\mathbf{w}_1, w_2, w_3} F(\mathbf{w}_1) + D(w_3) - \lambda \lVert \mathbf{w}_1 - w_2 \rVert_2 - \varphi \lVert w_2 - w_3 \rVert_1$, and,
$\max_{w_1, \mathbf{w}_2, \mathbf{w}_3} F(w_1) + D(\mathbf{w}_3) - \lambda \lVert w_1 - \mathbf{w}_2 \rVert_2 - \varphi \lVert \mathbf{w}_2 - \mathbf{w}_3 \rVert_1$
while trying to make w1 = w2 = w3
Since we know F(w) and D(w) can each be solved easily on their own, why try to solve them together?
We break the optimization into three separate problems, which are encouraged to take on the same state by the Lagrange multipliers λ, φ

Slide 11
Idea: Alternating between solving F(w) and D(w)
Optimizing over w1 is a learning problem.
► It is regularized by an L2 norm drawing it to w2.
Optimizing over w3 is an LP or projection
► Curiously, jointly optimizing over w2 and w3 is also an LP or projection
Optimizing w2 has a closed-form solution.
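
The three sub-problems listed above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation behind the slides: it assumes the λ term is a squared L2 distance (the exponent is ambiguous in the transcript), that the constraints form a simple box so the w3 step is a coordinate-wise clip rather than an LP, and that F is supplied through its gradient. The function names `soft_threshold`, `update_w1`, `update_w2`, and `update_w3`, and the step-size settings, are hypothetical.

```python
def soft_threshold(x, t):
    """Shrink x toward zero by t; returns 0.0 when |x| <= t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def update_w1(grad_F, w1, w2, lam, lr=0.05, steps=500):
    """Gradient ascent on F(w1) - lam * ||w1 - w2||^2 with w2 held fixed.

    grad_F is a callable returning the gradient of F (assumed interface).
    """
    w = list(w1)
    for _ in range(steps):
        g = grad_F(w)
        w = [wi + lr * (gi - 2.0 * lam * (wi - bi))
             for wi, gi, bi in zip(w, g, w2)]
    return w


def update_w2(w1, w3, lam, phi):
    """Closed form for min over w2 of lam*||w1 - w2||^2 + phi*||w2 - w3||_1."""
    t = phi / (2.0 * lam)
    return [c + soft_threshold(a - c, t) for a, c in zip(w1, w3)]


def update_w3(w2, lo, hi):
    """Point of the box [lo, hi] closest to w2 (coordinate-wise clip)."""
    return [min(max(b, l), h) for b, l, h in zip(w2, lo, hi)]


if __name__ == "__main__":
    # Toy learning problem (assumption): F(w) = -||w - target||^2.
    target = [3.0, -1.0]
    grad_F = lambda w: [-2.0 * (wi - ti) for wi, ti in zip(w, target)]

    w1 = update_w1(grad_F, [0.0, 0.0], [0.0, 0.0], lam=1.0)
    w2 = update_w2(w1, [0.0, 0.0], lam=1.0, phi=1.0)
    w3 = update_w3(w2, lo=[0.0, 0.0], hi=[0.5, 0.5])
    print(w1, w2, w3)
```

The L1 term is what makes the w2 step a soft-threshold and the w3 step a clip for box constraints; for general linear constraints the w3 step would need an LP solver instead.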

Slide 12
Idea: Alternating between solving F(w) and D(w)
$\max_{\mathbf{w}_1, w_2, w_3} F(\mathbf{w}_1) + D(w_3) - \lambda \lVert \mathbf{w}_1 - w_2 \rVert_2 - \varphi \lVert w_2 - w_3 \rVert_1$, and,
$\max_{w_1, \mathbf{w}_2, \mathbf{w}_3} F(w_1) + D(\mathbf{w}_3) - \lambda \lVert w_1 - \mathbf{w}_2 \rVert_2 - \varphi \lVert \mathbf{w}_2 - \mathbf{w}_3 \rVert_1$
while trying to make w1 = w2 = w3
The design forces F(w3) to be as good as F(w1) while keeping F(w1) a good score.
The Lagrange multipliers λ, φ are increased on a special schedule
► The idea is to slowly force w1 and w3 together in a principled way

Slides 13 to 17
Progression of the algorithm
[Figure sequence: positions of w1, w2, and w3 at successive stages of the algorithm]

Slide 18
Why the L2 and L1 norms
[Figure comparing the two norms]
We use an L1 norm to solve the constrained optimization
► An L2 norm results in a quadratic program, which is a lot harder
By tying the two optimization problems together with w2, we get the best of both worlds

Slide 19
Why gradually increase φ, λ
If we initially set φ, λ to infinity, then w1, w2, w3 take on the original value of w3
► Meaning you get some arbitrarily bad (but legal!) solution
Gradually increasing φ, λ according to a principled schedule allows learning to occur
When φ, λ become large, w1, w2, w3 will be similar
► Meaning the solution to F(w3) is both good and legal
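
The effect described on the "Why gradually increase φ, λ" slide can be illustrated on a one-dimensional toy problem. This is a sketch under assumptions, not the algorithm from the slides: F(w) = -(w - 3)^2, the constraint is the box [0, 1], the λ term is taken to be a squared L2 distance, and the multipliers follow a plain geometric schedule (`growth`) rather than the schedule the slides describe. The function names `soft_threshold` and `run` are hypothetical.

```python
def soft_threshold(x, t):
    """Shrink x toward zero by t; returns 0.0 when |x| <= t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def run(lam, phi, growth, iters=80, target=3.0, lo=0.0, hi=1.0):
    """Alternate the three updates on the 1-D toy problem.

    Starts from w1 = w2 = w3 = lo, which is legal but a poor fit.
    """
    w1 = w2 = w3 = lo
    for _ in range(iters):
        # Exact maximizer of -(w1 - target)^2 - lam * (w1 - w2)^2.
        w1 = (target + lam * w2) / (1.0 + lam)
        # Closed form for lam * (w1 - w2)^2 + phi * |w2 - w3|.
        w2 = w3 + soft_threshold(w1 - w3, phi / (2.0 * lam))
        # Projection onto the box [lo, hi].
        w3 = min(max(w2, lo), hi)
        # Simplified schedule (assumption): scale both multipliers together.
        lam *= growth
        phi *= growth
    return w1, w2, w3


if __name__ == "__main__":
    stuck = run(lam=1e9, phi=1e9, growth=1.0)       # multipliers huge from the start
    gradual = run(lam=0.01, phi=0.01, growth=1.5)   # multipliers grow step by step
    print("huge from the start:", stuck)
    print("gradually increased:", gradual)
```

With these settings the first run never leaves the starting point w3 = 0, while the second ends at w3 = 1, the best legal value for this toy problem, with w1 pulled next to it.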

Slide 20
Updates on φ, λ
The formula for the updates is below:
$D = \lVert w_1 - w_3 \rVert_2$
Increase φ, λ to force w2 to lie in the middle of w1 and w3 to ensure rapid convergence
$\lambda \left(\tfrac{D}{2}\right)^2 = \varphi\, a \left(\tfrac{D}{2}\right)$, that is, $a = \dfrac{\lambda D / 2}{\varphi}$
If $a > 1$: $\varphi \leftarrow \varphi a$
If $a < 1$: $\lambda \leftarrow \lambda / a$
Afterward we multiply each of φ, λ by 1.01; this is sufficient to ensure that w1, w2, w3 converge to the same value

Slide 21
Conclusions and Expansions
The LLCS optimization approach allows
► Vast expansion in the number of attributes
► Ability to optimize complex linear constraints over many variables
We dramatically improved speed without sacrificing model accuracy
► We even observed small performance improvements
By improving the scalability we can now more efficiently solve a broader set of problems

Name: Julian Yarkony
Title: Scientist
Company: Experian Datalabs
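
The multiplier update from the "Updates on φ, λ" slide can be written as a small function. This is a hedged sketch rather than the authors' code: it reads the slide's ratio as a = λ(D/2)/φ, which is what solving the slide's balance equation for a gives, and it adds a guard for D = 0 that the slides do not mention. The function name `rebalance` and the `growth` parameter name are hypothetical; the value 1.01 is the factor stated on the slide.

```python
import math


def rebalance(w1, w3, lam, phi, growth=1.01):
    """Return updated (lam, phi) after one balancing step.

    The step raises whichever multiplier is too small so that
    lam * (D/2)^2 equals phi * (D/2), then scales both by `growth`.
    """
    D = math.sqrt(sum((a - c) ** 2 for a, c in zip(w1, w3)))
    if D > 0.0:  # guard added for this sketch: skip balancing when w1 == w3
        a = lam * (D / 2.0) / phi
        if a > 1.0:
            phi = phi * a      # quadratic penalty dominates: raise phi
        elif a < 1.0:
            lam = lam / a      # L1 penalty dominates: raise lam
    return lam * growth, phi * growth


if __name__ == "__main__":
    lam, phi = rebalance([2.0], [0.0], lam=4.0, phi=1.0)
    print(lam, phi)
```

Both branches only ever increase a multiplier, which matches the slide's description of forcing w1, w2, and w3 together over time.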