Data Warehousing Case Study Akamai Technologies, Inc.

Data Warehousing
Case Study
Akamai Technologies, Inc.
Background
• In 1997, Tom Leighton (MIT Professor of Applied Mathematics) and
Danny Lewin (MIT Graduate Student), along with others,
developed mathematical algorithms to handle the dynamic
routing of web content.
• In 1998, the group entered the annual MIT $50K
Entrepreneurship Competition, where the company's business
proposition was selected as one of 6 finalists among 100 entries.
• In April of 1999, Akamai launched its commercial service,
FreeFlow, for Yahoo!, Akamai’s first and charter customer.
Akamai Today
• Today, Akamai has over 1,000 customers in countries all over the
world.
• Akamai's intelligent edge platform for content, streaming media,
and application delivery comprises more than 11,600 servers
within over 820 networks in 62 countries.
Reporting @ Akamai
• Company Growth of over 50% per Quarter from 1999 to 2001.
• Assets (Servers, Switches, etc.) in hundreds of Networks around
the World.
• Increased Product Lines from 1 Product (FreeFlow) to more than
a dozen Products (FreeFlow Streaming, Edgesuite, FirstPoint,
etc.).
• Internal Growth from one hundred employees to thousands in
one year.
• Internet Growth (dot.com) explosive through 2000.
Reporting @ Akamai, cont.
• Internet Bubble bursts in March 2001, causing a backlash against
the Companies that serviced dot.coms
• Customer churn (cancellation) increases rapidly.
• Revenue collected from bankrupt customers declines.
• Accurate and Comprehensive Data on which to base Management
Decisions becomes CRITICAL.
• Management Reporting Initiative (MRI) is born.
MRI Organization
MRI Team (name, title, project role)
• Tim Weller (CFO) – Executive Project Sponsor
• Todd Sewards (Director, Internal Applications) – Technical Project Manager
• Sara Stonner (Project Manager, Data Warehousing) – Technical Team Leader
• Lan Fang (Applications Developer) – ETL Specialist
• Contractor B (Contractor) – ETL Developer
• Contractor C (Contractor) – ETL Developer
• Contractor A (Contractor) – ETL Developer
• Chris Fiello (Director, Revenue Operations) – Business Project Manager
• Daniel Kim (Senior Business Systems Analyst) – Senior Business Systems Analyst
• Feng Tsang (DBA) – Oracle DBA
• Lauren Cherkas (Manager, Contracts Management) – Subject Matter Expert (SME)
• Mike DePrizio (Senior Applications Developer) – Infrastructure Specialist
• Stephanie Callini (Manager, Revenue Analysis) – Subject Matter Expert
Where do you start??
• Prioritization Process
– Identify pain
– Determine readiness
– Data maturity
– Size
– Complexity
• In the end, who do you choose?
Requirements Gathering
• The Requirements Gathering Team, composed of the Technical Leader
(myself) and a Business Systems Analyst, began a 2-month
process of gathering requirements
– Identified key verticals within the company
– Identified single points of contact (SPOC) within each vertical
– Identified subject matter experts (SME) within organization
– Identified key stakeholders within organization
– Conducted interviews, JAD sessions and working sessions
with individuals and groups as appropriate.
• Compiled 100+ pages of Requirements from the Business
Community.
Scope and Project Charter
• Defined Scope based on Requirements (Scope Creep!!!)
• Developed Project Charter defining
– Project Scope
– Project Organization
– Critical Success Factors
– Assumptions and Constraints
– Risks
– Issues
• Sign off from Executive Management and Project Sponsors
Technical Architecture
• Vendor selection
– ETL: Informatica PowerMart 4.7
– Front-end: Brio.Insight 6.3
– Middleware: Brio OnDemand Server 6.3
– Database: Oracle 8.1.7
– Database Design: ERwin 3.52
• Software/Hardware Procurement and Implementation
– 3 SPARC servers running Solaris 2.7
– 750 GB Storage Area Network (SAN)
Project Plan
• Battle between the Technical Team and the Executive Sponsors
– Executive Sponsors couldn’t understand why it would take so
long to launch this new Enterprise Data Warehouse
– Technical Team was not proficient in the new technology, nor
were they staffed to accommodate the requested timeline (2
months from requirements to rollout)
• Result = $$$ to hire Contractors
– Contractors require detailed ETL documentation
– Law of diminishing returns
– Knowledge transfer from Contractors to DW Team Members
Project Plan, cont.
• 2-1/2 months to complete, from April 30th (begin requirements gathering) to
July 16th (rollout)
– Project Definition – 1 week
– Requirements – 1 month
– Technical Analysis – 0 days
– Technical Design and Infrastructure Implementation – 2 months
– Data Model – 2 weeks
– Source to Target Mapping Document – 2 weeks
– ETL Coding – 4 weeks
– System and Unit Testing – 2 weeks
– UAT – 3 weeks
– Rollout – 1 week
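The gap between the 2-1/2 month window and the phase estimates above can be checked with a little arithmetic. The sketch below is illustrative only. It assumes the year is 2001 (taken from the rollout date reported under Project Execution), approximates a "month" as 4 weeks, and assumes the phases run strictly one after another. The slide does not say whether phases overlapped.

```python
from datetime import date

# Planned window from the slide; the year 2001 is an assumption based on
# the actual rollout date reported later in the deck.
start = date(2001, 4, 30)
planned_rollout = date(2001, 7, 16)
window_weeks = (planned_rollout - start).days / 7

# Phase estimates in weeks. Assumption: one "month" is treated as 4 weeks.
WEEKS_PER_MONTH = 4
phases = {
    "Project Definition": 1,
    "Requirements": 1 * WEEKS_PER_MONTH,
    "Technical Analysis": 0,
    "Technical Design and Infrastructure Implementation": 2 * WEEKS_PER_MONTH,
    "Data Model": 2,
    "Source to Target Mapping Document": 2,
    "ETL Coding": 4,
    "System and Unit Testing": 2,
    "UAT": 3,
    "Rollout": 1,
}

# Assumption: phases are executed back to back, with no overlap.
sequential_weeks = sum(phases.values())

print(f"Calendar window:        {window_weeks:.0f} weeks")
print(f"Sum of phase estimates: {sequential_weeks} weeks")
print(f"Overlap needed:         {sequential_weeks - window_weeks:.0f} weeks")
```

Under these assumptions the phases add up to well over twice the available window, so the plan only works if most phases run in parallel.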
Project Discrepancies
• Support???
• Bug Fixes???
• Enhancement Requests???
• Security Review???
• Issue Resolution???
• Dirty Data???
• Broken and Undefined Business Processes…
Data Model
Source to Target Mapping Document
• 50+ Pages of “instructions” on HOW to code the Data Mart
• Constantly changing
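One way to picture what such a document contains is a list of column-level rules, each naming a source field, a target field, and a transformation. The sketch below is a hypothetical illustration, not the project's actual mapping: the table names, column names, and rules are invented, and the real mappings were implemented in Informatica rather than Python.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class ColumnMapping:
    """One row of a source-to-target mapping document (hypothetical layout)."""
    source_table: str
    source_column: str
    target_table: str
    target_column: str
    rule: str                          # human-readable instruction for the ETL developer
    transform: Callable[[Any], Any]    # executable form of the rule


def apply_mappings(source_row: Dict[str, Any],
                   mappings: List[ColumnMapping]) -> Dict[str, Any]:
    """Build one target row by applying each mapping to the source row."""
    return {m.target_column: m.transform(source_row[m.source_column])
            for m in mappings}


# Invented example mappings for an invented customer dimension.
MAPPINGS = [
    ColumnMapping("CUSTOMER", "CUST_NM", "DIM_CUSTOMER", "CUSTOMER_NAME",
                  "trim leading and trailing whitespace", str.strip),
    ColumnMapping("CUSTOMER", "MRR_USD", "DIM_CUSTOMER", "MONTHLY_REVENUE",
                  "convert text amount to a number", float),
]

source_row = {"CUST_NM": "  Example Co ", "MRR_USD": "1200.50"}
target_row = apply_mappings(source_row, MAPPINGS)
print(target_row)
```

Keeping the rules as data rather than as prose makes a constantly changing mapping easier to review, since each change is a one-line diff.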
Project Execution
• 3 months for Development and Unit Testing
• 2 months (and counting) for User Acceptance Testing
• Rolled out to User Community August 6th, 2001 (three weeks after
the planned July 16th rollout)
• Report Development is on-going, with a dozen reports published
and more coming in each day
• Bug queue is manageable
• Enhancement requests continue to pile up
Lessons Learned
• Allow the majority of the Project Plan to be consumed by:
– Requirements Analysis
– QA
• Maintain scope at all costs
• Never assume the data is correct or clean
• Understand that when users describe a “Process”, that
“Process” was not always in place
• Determine from the beginning how much historical data will be
included in the data mart
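The advice never to assume the data is clean can be put into practice by profiling source extracts before any ETL code is written. The following is a minimal sketch under stated assumptions: the field names (`contract_id`, `customer`, `start_date`), the date format, and the three checks are all hypothetical and are not taken from the project.

```python
from collections import Counter
from datetime import datetime


def profile_rows(rows, key, required, date_fields, date_format="%Y-%m-%d"):
    """Count basic data-quality problems in a list of dict records."""
    issues = Counter()
    seen_keys = set()
    for row in rows:
        # Duplicate business keys
        k = row.get(key)
        if k in seen_keys:
            issues["duplicate_key"] += 1
        seen_keys.add(k)
        # Required fields that are missing or empty
        for field in required:
            if row.get(field) in (None, ""):
                issues[f"missing_{field}"] += 1
        # Dates that do not parse (wrong format or impossible date)
        for field in date_fields:
            value = row.get(field)
            if value in (None, ""):
                continue
            try:
                datetime.strptime(value, date_format)
            except ValueError:
                issues[f"bad_date_{field}"] += 1
    return dict(issues)


# Invented sample records with deliberate problems.
rows = [
    {"contract_id": "C1", "customer": "Example Co", "start_date": "2001-01-15"},
    {"contract_id": "C1", "customer": "Example Co", "start_date": "2001-01-15"},
    {"contract_id": "C2", "customer": "", "start_date": "2001-02-30"},
]

report = profile_rows(rows, key="contract_id",
                      required=["customer"], date_fields=["start_date"])
print(report)
```

A report like this, run against each source system early, gives the team evidence to take back to the business before the dirty data surfaces in User Acceptance Testing.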
Lessons Learned, cont.
• Write down the goals of the Data Mart and pin them on the wall –
look at them EVERY day
• Write down EVERYTHING
• Know your team
• NEVER use a Data Warehouse to “smoke out broken or
undefined Business Processes”
• NEVER code for the Exception