Building a smarter planet: Financial Services
Exploding Demands for Big Data, Analytics, Risk
Management, Ultra-low Latency and Compute Power
Require Optimized HPC Infrastructures
Robert Brinkman
Infrastructure Architect for Banking and
Financial Markets
IBM Banking Center of Excellence
Let’s build a smarter planet
Panel
• Vikram Mehta, Vice President, IBM System Networking, IBM Corp.
• Emile Werr, VP, Global Data Services; Global Head of Enterprise Data Architecture, NYSE Euronext
• Dino Vitale, Director, Cross Technology Services, Morgan Stanley
• Nick Werstiuk, Product Line Executive, IBM Platform Computing
© 2012 IBM Corporation
Workload Optimized Stacks
Applications (IBM provided, ISVs, partners, and custom) sit on messaging and security services, over five workload-optimized stacks:
• Cloud Stack
• Transaction Stack
• Low Latency Stack
• Grid Stack
• Appliances & Packages
Workload characteristics addressed across the stacks: discrete components or applications; high transaction rates; specialized workloads; high message rates; big data; variable workload; complex data models; packaged hardware and software; data value decay; big compute.
Financial Markets Industry Imperatives
• Re-engineer for profitable growth: renewed focus on the customer, near-real-time analytics
• Improve the trade life cycle: cloud and business process outsourcing
• Optimize enterprise risk management: data-driven transformation and common industry services
Dino Vitale
Director, Cross Technology Services
Morgan Stanley
Morgan Stanley: Road to Compute as a Service

Trends
• Maximize efficiency of compute infrastructure
• Cost / run-rate
• Utilization: more with less, linear scale, sharing
• Operational normalization

Challenges
• Dynamic provisioning and on-demand scaling of resources to applications according to varying business needs and SLAs
• Multi-tenant workload protection
• Application design and dependency management
• Utility charge-back model options: pay-per-use, fixed allocation, hybrid approach
• Sharing resources based on workload supply and demand
• BCP (business continuity planning)
• Convergence opportunities with "Big Data"

Phasing
• Increasing data volumes
• Adaptive/real-time scheduling
• Resource management
• Metrics / data mining
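The three utility charge-back options named in the challenges above can be sketched as simple billing functions. All rates, names, and figures here are illustrative assumptions, not Morgan Stanley's actual model.

```python
def pay_per_use(core_hours_used: float, rate: float) -> float:
    """Bill only for core-hours actually consumed."""
    return core_hours_used * rate

def fixed_allocation(cores_reserved: int, hours: float, rate: float) -> float:
    """Bill for the reserved capacity whether or not it is used."""
    return cores_reserved * hours * rate

def hybrid(cores_reserved: int, hours: float, core_hours_used: float,
           base_rate: float, burst_rate: float) -> float:
    """Fixed charge for the baseline reservation plus metered bursting above it."""
    baseline = cores_reserved * hours * base_rate
    burst = max(0.0, core_hours_used - cores_reserved * hours) * burst_rate
    return baseline + burst

# Example: a desk reserves 100 cores for a 10-hour trading day but consumes
# 1,500 core-hours during a volatile session.
print(pay_per_use(1500, 0.05))            # 75.0
print(fixed_allocation(100, 10, 0.05))    # 50.0
print(hybrid(100, 10, 1500, 0.04, 0.06))  # 40.0 baseline + 30.0 burst = 70.0
```

The hybrid model splits the difference: predictable cost for baseline capacity, with the incentive structure of metering for bursts.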
On-Demand Data in a High-Performance Environment
Emile Werr, VP, Global Data Services
Global Head of Enterprise Data Architecture & Identity Management
Technology Challenges
• Big Data: billions of transactions and multiple terabytes captured daily
• Speed and business agility are essential to our business
• Different viewpoints and data patterns need to be analyzed
• Data coming out of a trading plant is not user-friendly
• Correlating and integrating disparate data
• Moving large data around is expensive and complex
• System capacity must efficiently handle 5x our average daily volume
• Data spikes: the day after the Flash Crash, volume peaked at over 18.4 billion transactions for the NYSE Classic matching engine (excluding Options and other markets such as Arca, Amex, Liffe, Euronext, etc.)
• Transaction volume growth is sustained year over year
• Data must be readily available for a minimum of 7 years for compliance, yet it is too expensive to keep it all online
• Change is constant
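The 5x headroom rule above reduces to simple sizing arithmetic. A small sketch; only the 18.4 billion transaction peak comes from the slide, and the assumed average daily volume is illustrative.

```python
# Capacity headroom check: the system must handle 5x the average daily volume.
AVG_DAILY_TXNS = 4.0e9        # assumed average daily transactions (illustrative)
HEADROOM_FACTOR = 5           # sizing rule from the slide
FLASH_CRASH_PEAK = 18.4e9     # observed peak the day after the Flash Crash

required_capacity = AVG_DAILY_TXNS * HEADROOM_FACTOR   # 2.0e10 transactions/day
print(required_capacity >= FLASH_CRASH_PEAK)           # True: 5x headroom covered the peak
```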
Global Data Services
USE CASE: Market Reconstruction for Trading Surveillance
• The Electronic Book (NYSE DBK) and market depth must be reconstructed and made accessible via a fast database
• "Who traded ahead?" and interpositioning questions can be answered by a database query
• The price level is used to calculate Shares Ahead and Shares Available (example order arrival: BUY 10 @ 20.09)
• Built on the Data Architecture Practice, with financial services, regulatory, and compliance expertise
• Trading systems generate vast transaction volumes at high speeds
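A minimal sketch of the Shares Ahead query over a reconstructed book, using the slide's example order (BUY 10 @ 20.09). The schema, timestamps, and resting orders are hypothetical, and a production system would run this on an MPP or appliance database rather than SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE book (
    order_id INTEGER, side TEXT, price REAL, shares INTEGER, ts INTEGER)""")
# Resting buy interest before the example order (BUY 10 @ 20.09) arrives at ts=100.
conn.executemany("INSERT INTO book VALUES (?, ?, ?, ?, ?)", [
    (1, "BUY", 20.10, 300, 50),   # better-priced, earlier: ahead
    (2, "BUY", 20.09, 200, 60),   # same price, earlier time priority: ahead
    (3, "BUY", 20.09, 100, 120),  # same price, later: not ahead
    (4, "BUY", 20.05, 500, 40),   # worse price: not ahead
])

# Shares ahead of a BUY at 20.09 arriving at ts=100: better-priced buys,
# plus same-priced buys with earlier time priority.
(shares_ahead,) = conn.execute("""
    SELECT COALESCE(SUM(shares), 0) FROM book
    WHERE side = 'BUY'
      AND (price > 20.09 OR (price = 20.09 AND ts < 100))
""").fetchone()
print(shares_ahead)  # 500  (300 better-priced + 200 same-price, earlier)
```

Once the book is in a queryable form, "who traded ahead" becomes a join between this view and the execution stream rather than a manual reconstruction.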
• The GRID is used to transform, normalize, and enrich the time-series data with massively parallel computing, run as end-of-day (EOD) or intra-day batch processing
• Date-level table scans (queries) also need massively parallel processing (MPP)
• Appropriate technologies must be used: 10 Gb networking, virtualized CPU/memory, appliance databases, and scalable storage pools
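The EOD batch step above can be sketched with date partitions transformed in parallel. The field names and the normalization rule (integer prices scaled to dollars) are illustrative assumptions; a real grid would dispatch through a scheduler such as Platform Symphony rather than a local process pool.

```python
from multiprocessing import Pool

def normalize_partition(partition):
    """Normalize one date partition: scale integer prices to dollars, tag the venue."""
    date, ticks = partition
    return date, [{"ts": t["ts"], "px": t["px_int"] / 10000.0, "venue": "NYSE"}
                  for t in ticks]

partitions = [
    ("2012-05-07", [{"ts": 1, "px_int": 200900}, {"ts": 2, "px_int": 201000}]),
    ("2012-05-08", [{"ts": 1, "px_int": 200500}]),
]

if __name__ == "__main__":
    # Each date partition is independent, so the map parallelizes cleanly.
    with Pool(2) as pool:
        results = dict(pool.map(normalize_partition, partitions))
    print(results["2012-05-07"][0]["px"])  # 20.09
```

Because partitions share nothing, this pattern scales out to as many grid workers as there are dates to process.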
Data Lifecycle Management Methodology
• Data Capture: trading data, market data, reference data, and user-generated data from enterprise systems
• Data Transformation & Archive: transform, normalize, enrich; partition, compress, and archive in storage pools; create metadata (mappings)
• On-Demand Data (ODD): secure data access and navigation; load, extract, stream, filter, transform, purge; user-driven data mart provisioning ("sandboxing"); schema change capture ("data structure lineage")
• User Analytics ("Business Intelligence"): utilize MPP databases and HDFS; integrate reporting tools; facilitate user collaboration; capture knowledge (KM); automate data archive and purge
• End-User Workflow
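The "partition, compress and archive in storage pools" step with its metadata mapping can be sketched with gzip and JSON. The paths, the partitioning key (trade date), and the mapping fields are illustrative assumptions.

```python
import gzip, json, os, tempfile

def archive_partition(records, trade_date, pool_dir):
    """Compress one date partition into a storage pool and record its mapping."""
    os.makedirs(pool_dir, exist_ok=True)
    path = os.path.join(pool_dir, f"trades_{trade_date}.json.gz")
    with gzip.open(path, "wt") as f:
        json.dump(records, f)
    # Metadata mapping: where each partition lives and how many rows it holds.
    return {"date": trade_date, "path": path, "rows": len(records)}

pool = tempfile.mkdtemp()
meta = archive_partition([{"px": 20.09, "qty": 10}], "2012-05-07", pool)
print(meta["rows"])  # 1

# Restore on demand by following the metadata mapping, not by scanning storage.
with gzip.open(meta["path"], "rt") as f:
    restored = json.load(f)
print(restored[0]["px"])  # 20.09
```

The metadata record is what makes data "on demand": consumers resolve a date to a location instead of knowing the storage layout.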
Managed Data Services & Data Flow Automation
Data flows from feed handlers and files through a message bus and data pump (continuous flow, "trickle batch") into data capture, then across a scale-out grid fabric (distributed CPU/MEM) for transformation and archive into storage pools (Hadoop, Netezza analytics data warehouse). A data virtualization and abstraction layer provisions data and tools to the business demand side: apps, admins, analysts, data scientists, and researchers.

Data services characteristics:
• Fast processing and data movement
• Simplified access and administration
• Common secured access
• Scalable file and database virtualization
• Automation and workflow
• Reliable
• Standardization and consistency
• Agile framework: metadata-driven
• Metering, monitoring, and tracking
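The "continuous flow (trickle batch)" pump can be sketched as a buffer that flushes downstream in small batches, triggered by size or age. The thresholds and message shapes are illustrative assumptions.

```python
import time

class TrickleBatchPump:
    """Buffer trickling messages; flush a batch when full or when it gets stale."""
    def __init__(self, flush_size=3, max_age_s=1.0, sink=None):
        self.buf, self.first_ts = [], None
        self.flush_size, self.max_age_s = flush_size, max_age_s
        self.sink = sink if sink is not None else []   # downstream batches land here

    def push(self, msg, now=None):
        now = time.monotonic() if now is None else now
        if not self.buf:
            self.first_ts = now
        self.buf.append(msg)
        # Flush when the batch is full or the oldest buffered message is too old.
        if len(self.buf) >= self.flush_size or now - self.first_ts >= self.max_age_s:
            self.sink.append(list(self.buf))
            self.buf.clear()

pump = TrickleBatchPump()
for i in range(7):
    pump.push({"seq": i}, now=float(i) * 0.01)
print(len(pump.sink))  # 2 full batches flushed
print(len(pump.buf))   # 1 message still buffered, awaiting size or age trigger
```

The age trigger is what keeps this "continuous": a slow feed still drains within `max_age_s` instead of waiting for a full batch.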
Vikram Mehta
Vice President, IBM System Networking
IBM Corporation
Nick Werstiuk
Product Line Executive
IBM Platform Computing
Convergence of Compute and Data

Workload: Compute Intensive
• Data type: structured (RDBMS, fixed records)
• Use cases: CEP, streaming, trading
• Characteristics: "real time"
• Infrastructure: dedicated servers, appliances, FPGAs

Workload: Compute and Data Intensive
• Data type: all, structured + unstructured
• Use cases: risk analytics, gaming simulation, pricing
• Characteristics: intraday
• Infrastructure: compute grid, data caches, in-memory grid, shared services with CPU + GPU

Workload: Data Intensive
• Data type: unstructured (video, e-mail, web)
• Use cases: sentiment analysis/CRM, genomics, AML/fraud; BI, reporting, ETL
• Characteristics: daily, monthly, quarterly
• Infrastructure: commodity processors + storage; disk and tape, SMP and mainframe, SAN/NAS infrastructure, data warehouses
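The workload taxonomy above amounts to a routing decision: latency class and data volume select an infrastructure tier. The mapping mirrors the slide; the routing function itself is an illustrative assumption, not an IBM API.

```python
INFRASTRUCTURE = {
    "compute_intensive": "dedicated servers, appliances, FPGAs",
    "data_intensive": "commodity processors + storage, disk & tape, "
                      "SMP & mainframe, SAN/NAS, data warehouses",
    "compute_and_data_intensive": "compute grid, data caches, "
                                  "in-memory grid, shared CPU + GPU services",
}

def route(latency_class: str, data_heavy: bool) -> str:
    """Pick an infrastructure tier from latency needs and data volume."""
    if latency_class == "real_time":
        return INFRASTRUCTURE["compute_intensive"]       # CEP, streaming, trading
    if data_heavy and latency_class == "intraday":
        return INFRASTRUCTURE["compute_and_data_intensive"]  # risk, pricing
    return INFRASTRUCTURE["data_intensive"]              # BI, reporting, ETL, AML

print(route("real_time", False))
print(route("intraday", True))
```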
Support for Diverse Workloads & Platforms
Four classes of data-intensive applications share a common resource pool under a workload manager and resource orchestration layer:
• A: Geo-spatial integration, name classification
• B: Signal processing
• C: Metadata generation, file classification, batch analysis
• D: Search, analysis, concept recognition
[Diagram: cluster slots dynamically reallocated among workloads A-D by the Workload Manager and Resource Orchestration]
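What the orchestration layer in the figure does can be sketched as dividing a fixed pool of slots among workloads A-D in proportion to demand. The proportional-share policy with largest-remainder rounding is an illustrative assumption, not Platform Computing's actual scheduling algorithm.

```python
def allocate_slots(total_slots, demand):
    """Proportional share of slots with largest-remainder rounding."""
    total_demand = sum(demand.values())
    shares = {w: total_slots * d / total_demand for w, d in demand.items()}
    alloc = {w: int(s) for w, s in shares.items()}
    leftover = total_slots - sum(alloc.values())
    # Hand remaining slots to the workloads with the largest fractional remainders.
    for w in sorted(shares, key=lambda w: shares[w] - alloc[w], reverse=True)[:leftover]:
        alloc[w] += 1
    return alloc

print(allocate_slots(48, {"A": 16, "B": 12, "C": 12, "D": 8}))
# {'A': 16, 'B': 12, 'C': 12, 'D': 8}
```

Rerunning the allocation as demand shifts is what makes the pool "dynamic": slots flow from idle workloads to busy ones without repartitioning the cluster.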
Why IBM Platform Symphony is faster and more scalable
• Other grid servers: inefficient scheduling, a polling model, and heavy-weight transport protocols limit scalability, so latency grows with scale.
• Symphony: with a zero-wait-time "push model" and efficient binary protocols, Symphony scales until the "wire" is saturated.
[Chart: latency versus scale for Symphony and other grid servers]
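The contrast between the two dispatch styles can be sketched with a toy model: a polling worker wakes periodically to look for tasks, while a push-style worker blocks until the scheduler hands it one. This illustrates the latency argument only; it is not Symphony's implementation.

```python
import queue, threading, time

def polling_worker(tasks, results, poll_interval=0.05):
    """Poll model: wake up, check for work, sleep again if there is none."""
    while True:
        try:
            task = tasks.get_nowait()
        except queue.Empty:
            time.sleep(poll_interval)   # adds up to poll_interval of idle latency
            continue
        if task is None:
            return
        results.append(task)

def push_worker(tasks, results):
    """Push model: block on the queue; woken the moment work arrives."""
    while True:
        task = tasks.get()
        if task is None:
            return
        results.append(task)

tasks, results = queue.Queue(), []
t = threading.Thread(target=push_worker, args=(tasks, results))
t.start()
tasks.put("price_portfolio")   # delivered immediately, no polling delay
tasks.put(None)                # sentinel to stop the worker
t.join()
print(results)  # ['price_portfolio']
```

In the poll model, average dispatch latency is roughly half the poll interval per hop and grows with fleet size; the push model's latency is bounded by wakeup and transport cost alone.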
HPC Cloud: Multiple Approaches and Paths to Value

Infrastructure Management
• Cluster consolidation into an HPC cloud
• Self-service cluster provisioning and management
• Workload-driven dynamic clusters
Build out a more dynamic HPC infrastructure as their HPC cloud.

HPC "In the Cloud"
• "Bursting" to cloud providers
• Hosted HPC in the cloud
• Enabling HPC cloud service providers
Leverage the public cloud opportunity, either to tap into additional resources or to offer their own HPC cloud services.
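A simple "burst to cloud" policy can be sketched: when pending work exceeds what the on-premise cluster can finish within an SLA window, request overflow cores from a cloud provider. All thresholds and numbers are illustrative assumptions.

```python
def burst_cores_needed(pending_core_hours, onprem_cores, sla_hours):
    """Extra cloud cores required to clear the backlog within the SLA window."""
    onprem_capacity = onprem_cores * sla_hours          # core-hours available locally
    overflow = max(0.0, pending_core_hours - onprem_capacity)
    return int(-(-overflow // sla_hours))               # ceiling division

# Example: 5,000 core-hours queued, 1,000 on-prem cores, 4-hour SLA window.
print(burst_cores_needed(5000, 1000, 4))  # 250 extra cloud cores
```

Running this check per scheduling cycle turns bursting into a policy decision rather than a manual capacity escalation.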
Building a smarter planet: Financial Services
Questions