Massive Choice Data Co-Chairs: Prasad Naik and Michel Wedel 7th Triennial Choice Symposium
Massive Choice Data
Co-Chairs: Prasad Naik and Michel Wedel
7th Triennial Choice Symposium
Wharton Business School
June 13-17, 2007

Impetus for “Massive” Data?
  Technological advances (Internet, RFID)
  Computing advances
  Methodological advances
  Detailed data
    Large sample, N
    Many variables, p
    Long time-series, T
    Several products and SKUs, K

Different Types of Massive Data
  Structured Data
    Scanner panel, Loyalty card, CRM, Click-stream
  Unstructured Data
    Text data (e.g., product reviews, blogs, complaints)
    Images, Music
  Emerging Data Types
    RFID, Video, social networks, recommendations, auctions, games, eye tracking, semantic Web 2.0

Is the data set just getting bigger? What is the qualitative difference?
  Sometimes Nothing
    Just a scale-up problem
    But the bigger size makes it harder to analyze in real time
  Sometimes Everything
    Empty space phenomenon
    Statistical inference, diagnostics, sparseness
    Visualization becomes tricky when p > 10

Managers and Models
  Managers need
    real-time computation
    decision optimization
  Man-Machine engagement
    managerial inputs plus data analyses
  Models need to be both
    Simple for quick computation (real-time decisions)
    Complex for realism in assumptions
  How?
    The notion of “Workbench”
    Model averaging, forecast combination

Estimation and Computation
  Estimation methods
    Identified promising approaches for massive data analysis
      Inverse regression methods
      Regularization techniques (e.g., Lasso)
      Particle filters
      Logistic regression or Support Vector Machines
  Computation power
    Grid computing is needed
      waiting for a fast computer is not an option
    Gap between industry and practice
      Google has 2 million processors

Directions and Action Points
  Incentives for academics?
  Industry-Academic partnerships
  Cross-disciplinary collaborations

Thank you for this forum to share ideas!

Credits
  Lynd Bacon (LBA Inc)
  Anand Bodapati (UCLA)
  Wagner Kamakura (Duke)
  Jeffrey Kreulen (IBM Research)
  Peter Lenk (Michigan)
  David Madigan (Rutgers)
  Alan Montgomery (CMU)
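
Illustration (not part of the original slides): the deck lists regularization techniques such as the Lasso among the promising approaches when the number of variables, p, is large. The sketch below is one minimal way to see the idea, assuming the common objective (1/(2n)) * ||y - Xb||^2 + lam * ||b||_1 solved by cyclic coordinate descent. The function names, the simulated data, and the penalty value are all hypothetical choices for illustration, not anything the presenters specified.

```python
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Approximately minimize (1/(2n))*||y - X b||^2 + lam*||b||_1.

    X is a list of n rows, each a list of p floats; y is a list of n floats.
    Uses cyclic coordinate descent with a fixed number of sweeps.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta; beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Covariance of column j with the residual that excludes j's own fit
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def simulate(n, p, seed=0):
    """Hypothetical data: only the first three of p variables affect y."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    true_beta = [3.0, -2.0, 1.5] + [0.0] * (p - 3)
    y = [sum(x * b for x, b in zip(row, true_beta)) + rng.gauss(0.0, 0.1) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = simulate(n=40, p=60)  # more variables than observations
    beta = lasso_coordinate_descent(X, y, lam=0.3)
    kept = [(j, round(b, 2)) for j, b in enumerate(beta) if b != 0.0]
    print(f"{len(kept)} of {len(beta)} coefficients are nonzero:", kept)
```

The L1 penalty sets many coefficients exactly to zero, which is why it remains usable when p exceeds N and ordinary least squares has no unique solution. A larger `lam` keeps fewer variables at the cost of more shrinkage in the ones that remain.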