IBM BigInsights Integration Case Study

IBM BigInsights
Integration Case Study
BI Meetup – March 4th, 2015
Agenda
• Introduction
• Enterprise Data Warehouse Augmentation
• Big Fish’s Goals and Approach
• Big Fish’s Use Cases
• Advice for Going Forward
Introduction
• Overview
• Speakers
– David Darden
– Don Smith
Important Terms
• IBM BigInsights
• IBM PureData System for Analytics (PDA, a.k.a. Netezza)
• Big Data
– Volume
– Velocity
– Variety
– Veracity
– Value
Enterprise Data Warehouse Augmentation
What is EDW Augmentation? Why and when do you need it?
– Data warehouse offloading
– Queryable archive
– Schema on the fly
– Real time and data in motion
– Deep analytics leveraging structured, semi-structured, and unstructured data sources
Augmentation – Top Business Use Cases
• Landing Zone (i.e., Data Lake, Data Hub)
• Queryable Archive
• EDW Offload
• Data Exploration
IBM BigInsights Solutions
• Landing Zone / Data Lake
– Open source tools – Sqoop, Flume, MapReduce, Oozie
– BigInsights tools – Big SQL, Jaql
• EDW Offload
– Adaptors
– Federation
• Queryable Archive
– Hive
– Big SQL
– Jaql
• Data Exploration
– AQL
– BigSheets
– Big R, SystemML
Big Fish’s Goals and Approach
Who are we?
• Big Fish Business Intelligence Team
• World's largest producer and distributor of casual games
• Big Fish has distributed more than 2.5 billion games to customers in 150 countries
• Small, agile, business focused
• Owners of the Enterprise Data Warehouse
What are our High Level Goals?
• Provide the right data to the right people at the right time
• Deliver business value fast
• Give people the tools they need to do their job
• Minimize reliance on engineering
Business Intelligence is the ability of an organization or business to
reason, plan, predict, solve problems, think abstractly, comprehend,
innovate, and learn in ways that increase organizational knowledge,
inform decision processes, enable effective actions, and help to
establish and achieve business goals.
- David Wells
Why is this hard?
Business
• Evolves rapidly
• Expands in scope
• Transitions from prototype to mission critical
Technically
• Combines a variety of sources
• Integrates a number of systems and workflows
• Requires a diverse mix of technology
• Needs to be highly maintainable
What is our general platform approach?
What is our general technical approach?
Development Environment
• Biml / Mist
• Framework
• Extensions
• Accelerators
Languages
• NZ SQL / Big SQL / HiveQL / Pig / Jaql
• Bash / PowerShell
• Perl / C# / Python / R
Integration Engine
• SQL Server Integration Services (SSIS)
• Pushdown to massively parallel processing (MPP) platforms
Big Fish’s Use Cases
Data Ingest – Use Raw Data
• Netezza is great for data analysis
– If you know the structure of the data already
• Minimize effort for analysis
– Reduce time to determine if data has value
– Reduce time to derive value from new data
– Reduce developer involvement
• Integration with ELT Framework
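To make "reduce time to determine if data has value" concrete, here is a minimal sketch of profiling raw records before anyone designs a table for them. It assumes the raw feed is newline-delimited JSON; the field names, the sample records, and the `profile_fields` helper are hypothetical and do not come from the talk.

```python
import json
from collections import Counter, defaultdict


def profile_fields(lines):
    """Report, per field, how often it appears and which JSON types it takes."""
    seen = Counter()
    types = defaultdict(set)
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines; raw data is rarely clean
        if not isinstance(record, dict):
            continue
        total += 1
        for key, value in record.items():
            seen[key] += 1
            types[key].add(type(value).__name__)
    # Coverage is the share of valid records that carry the field.
    return {
        key: {"coverage": seen[key] / total, "types": sorted(types[key])}
        for key in seen
    }


# Hypothetical sample events, for illustration only.
sample = [
    '{"user_id": 1, "game": "demo_game", "ts": "2015-03-04T10:15:00"}',
    '{"user_id": 2, "game": "demo_game"}',
    "not json",
]
profile = profile_fields(sample)
```

A profile like this gives an analyst a quick read on which fields are populated often enough to be worth modeling, without developer involvement.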
Data Ingest – Process and Query
• Process the data / Add value
– Parse, extract elements, etc.
– Adding/converting dimensions such as time, date, game, etc.
• Performance
– Hive
– Big SQL MPP
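The "parse, extract, add dimensions" step can be sketched as follows. In the stack described here this work would presumably run in Hive or Big SQL; Python is used only to illustrate the logic. The pipe-delimited layout, the field names, and the `enrich_event` helper are assumptions made for the example.

```python
from datetime import datetime


def enrich_event(raw_line):
    """Split a pipe-delimited event and derive date and time dimension values."""
    ts_text, user_id, game = raw_line.strip().split("|")
    ts = datetime.strptime(ts_text, "%Y-%m-%d %H:%M:%S")
    return {
        "user_id": int(user_id),
        "game": game.strip().lower(),              # normalized game name
        "date_key": int(ts.strftime("%Y%m%d")),    # e.g. 20150304
        "hour_of_day": ts.hour,
    }


# Hypothetical raw event, for illustration only.
row = enrich_event("2015-03-04 18:30:05|42|Demo Game")
```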
Data Ingest – Process, Import/Export to Netezza
Moving data in/out of Netezza
Other Benefits
• Migration of ODS database
– Minimally processed source data
– Used for early analysis
– Used in downstream ELT processes
• Warm Archive
– Offload historical data from Netezza while keeping it queryable.
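As a rough illustration of the warm-archive idea, the sketch below splits rows by a cutoff date so that older rows can be moved out of the warehouse while recent rows stay. The row layout, the cutoff, and the `split_for_archive` helper are hypothetical; the talk does not describe how the split is actually done.

```python
from datetime import date


def split_for_archive(rows, cutoff):
    """Partition rows into (keep_in_warehouse, move_to_archive) by event date."""
    keep, archive = [], []
    for row in rows:
        # Rows strictly older than the cutoff go to the archive.
        (archive if row["event_date"] < cutoff else keep).append(row)
    return keep, archive


# Hypothetical rows, for illustration only.
rows = [
    {"id": 1, "event_date": date(2012, 6, 1)},
    {"id": 2, "event_date": date(2015, 1, 15)},
]
keep, archive = split_for_archive(rows, cutoff=date(2014, 1, 1))
```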
Demos
Advice for Going Forward
How did we get here?
Business
• Identified the problem we were trying to solve
• Collaborated heavily and continuously across the
organization
• Planned for success
Technically
• Established key goals and vision
• Iterated towards a solution
• Built with flexibility in mind
Questions?
Thank You!
• David Darden
• Don Smith