Building a Smarter Planet: Financial Services
Exploding Demands for Big Data, Analytics, Risk Management, Ultra-Low Latency, and Compute Power Require Optimized HPC Infrastructures
Robert Brinkman, Infrastructure Architect for Banking and Financial Markets, IBM Banking Center of Excellence
Let's build a smarter planet
© 2012 IBM Corporation

Panel
• Vikram Mehta, Vice President, IBM System Networking, IBM Corp.
• Emile Werr, VP, Global Data Services, and Global Head of Enterprise Data Architecture, NYSE Euronext
• Dino Vitale, Director, Cross Technology Services, Morgan Stanley
• Nick Werstiuk, Product Line Executive, IBM Platform Computing

Workload Optimized Stacks
• Applications: IBM provided, ISVs, partners, custom
• Messaging and security
• Stacks: cloud stack, transaction stack, low-latency stack, grid stack, appliances & packages; discrete components or applications; packaged hardware and software
• Workload drivers: high transaction rates, specialized workloads, high message rates, big data, variable workload, complex data models, data value decay, big compute

Financial Markets Industry Imperatives
• Re-engineer for profitable growth: renewed focus on the customer, near-real-time analytics
• Improve the trade life cycle: cloud and business process outsourcing
• Optimize enterprise risk management: data-driven transformation and common industry services

Dino Vitale, Director, Cross Technology Services, Morgan Stanley

Morgan Stanley: Road to Compute as a Service

Trends
• Maximize efficiency of compute infrastructure
• Cost / run rate
• Utilization: more with less, linear scale, sharing
• Operational normalization

Challenges
• Dynamic provisioning and on-demand scaling of resources to applications according to varying business needs and SLAs
• Multi-tenant workload protection
• Application design and dependency management
• Utility charge-back model options: pay-per-use, fixed allocation, hybrid approach
• Sharing resources based on workload supply and demand
• BCP (business continuity planning)
• Phasing

Convergence opportunities with "Big Data"
• Increasing data volumes
• Adaptive/real-time scheduling
• Resource management
• Metrics / data mining

On-Demand Data in a High-Performance Environment
Emile Werr, VP, Global Data Services, and Global Head of Enterprise Data Architecture & Identity Management

Technology Challenges
• Big data: billions of transactions and multiple terabytes captured daily
• Speed and business agility are essential to the business
• Different viewpoints and data patterns need to be analyzed
• Data coming out of a trading plant is not user-friendly
• Correlating disparate data and integrating it is hard; moving large data around is expensive and complex
• System capacity must efficiently handle 5x the average daily volume
• Data spikes: the day after the Flash Crash, volume peaked over 18.4 billion transactions for the NYSE Classic matching engine (excluding Options and other markets such as Arca, Amex, Liffe, and Euronext)
• Transaction volume growth is sustained year over year
• Data must be readily available for a minimum of 7 years for compliance, yet it is too expensive to keep it all online
• Change is constant

Use Case: Market Reconstruction for Trading Surveillance
• The electronic book (NYSE DBK) and market depth need to be reconstructed and made accessible via a fast database
• "Who traded ahead?" and interpositioning questions can then be answered by a database query
• The price level is used to calculate shares ahead and shares available for an incoming order (e.g., an order arrives: BUY 10 @ 20.09)
• Combines data architecture practice with financial services, regulatory, and compliance expertise
• Trading systems generate vast transaction volumes at high speeds
• The grid is utilized to transform, normalize, and enrich the time-series data using massively parallel computing, run as end-of-day or intraday batch processing
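The shares-ahead calculation in this use case can be sketched with a minimal limit-order-book model. This is an illustrative sketch only: the `Order` fields and `shares_ahead` function are hypothetical, not NYSE's actual market-reconstruction schema.

```python
from dataclasses import dataclass

# Illustrative sketch: names and fields are hypothetical, not NYSE's schema.
@dataclass
class Order:
    seq: int        # arrival sequence number (time priority)
    side: str       # "BUY" or "SELL"
    price: float
    qty: int

def shares_ahead(book, incoming):
    """Resting same-side shares with price/time priority over `incoming`.

    For a BUY order, any resting BUY at a higher price, or at the same
    price with an earlier sequence number, is 'ahead' in the queue.
    """
    total = 0
    for o in book:
        if o.side != incoming.side:
            continue
        if incoming.side == "BUY":
            better = o.price > incoming.price
        else:
            better = o.price < incoming.price
        same_price_earlier = o.price == incoming.price and o.seq < incoming.seq
        if better or same_price_earlier:
            total += o.qty
    return total

# Example from the slide: an order arrives to BUY 10 @ 20.09.
book = [
    Order(seq=1, side="BUY", price=20.10, qty=300),  # better price: ahead
    Order(seq=2, side="BUY", price=20.09, qty=200),  # same price, earlier: ahead
    Order(seq=3, side="BUY", price=20.08, qty=500),  # worse price: behind
]
incoming = Order(seq=4, side="BUY", price=20.09, qty=10)
print(shares_ahead(book, incoming))  # → 500
```

In production this scan would run as a parallel query over the reconstructed book in an MPP database rather than an in-memory loop, but the priority logic is the same.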
• Date-level table scans (queries) also need massively parallel processing (MPP)
• Appropriate technologies need to be utilized: 10 GbE networking, virtualized CPU/memory, appliance databases, scalable storage pools

Data Lifecycle Management Methodology
• Data capture: trading data, market data, reference data, and user-generated data from enterprise systems
• Data transformation & archive: transform, normalize, and enrich; partition, compress, and archive in storage pools; create metadata (mappings); automate data archive and purge
• On-demand data (ODD): load, extract, stream, filter, transform, purge; user-driven data mart provisioning ("sandboxing"); schema change capture ("data structure lineage")
• End-user workflow: secure data access and navigation
• User analytics ("business intelligence"): utilize MPP databases and HDFS; integrate reporting tools; facilitate user collaboration; capture knowledge (KM)

Managed Data Services & Data Flow Automation
• Data capture: message bus, feed handlers, files, data pump; continuous flow ("trickle batch")
• Transformation & archive: scale-out grid fabric with distributed CPU/memory; storage pools
• Data virtualization & abstraction: Hadoop, Netezza, analytics data warehouse
• Business demand: apps, admins, analysts, data scientists, researchers; data provisioning and data tools
• Data services: fast processing and data movement; simplified access and administration; common secured access; scalable file and database virtualization; reliable automation and workflow; standardization and consistency; an agile, metadata-driven framework; metering, monitoring, and tracking

Vikram Mehta, Vice President, IBM System Networking, IBM Corporation

Nick Werstiuk, Product Line Executive, IBM Platform Computing

Convergence of Compute and Data
• Compute intensive: structured data (RDBMS, fixed records); use cases: CEP, streaming, trading; cadence: "real time"; infrastructure: dedicated servers, appliances, FPGAs
• Compute and data intensive: unstructured data; use cases: risk analytics, gaming, simulation, pricing; cadence: intraday; infrastructure: compute grid, data caches, in-memory grid, shared services, CPU + GPU
• Data intensive: all data types (structured + unstructured: video, e-mail, web); use cases: sentiment analysis/CRM, genomics, AML/fraud; cadence: daily to monthly; infrastructure: commodity processors + storage
• Data warehouses: BI reporting and ETL; cadence: quarterly; infrastructure: disk & tape, SMP & mainframe, SAN/NAS

Support for Diverse Workloads & Platforms
• Example data-intensive application classes include geo-spatial integration and name classification; signal processing; metadata generation, file classification, and batch analysis; and search, analysis, and concept recognition
• A workload manager schedules these mixed workloads across shared resources via resource orchestration

Why IBM Platform Symphony Is Faster and More Scalable
• Other grid servers: inefficient scheduling, a polling model, and heavyweight transport protocols limit scalability, so latency grows with scale
• Symphony: with a zero-wait-time "push model" and efficient binary protocols, Symphony scales until the "wire" is saturated

HPC Cloud: Multiple Approaches and Paths to Value
• Infrastructure management: cluster consolidation into an HPC cloud; self-service cluster provisioning and management; workload-driven dynamic clusters. Firms build out a more dynamic HPC infrastructure as their HPC cloud.
• HPC "in the cloud": "bursting" to cloud providers; hosted HPC in the cloud; enabling HPC cloud service providers. Firms leverage the public cloud either to tap additional resources or to offer their own HPC cloud services.

Building a Smarter Planet: Financial Services
Questions
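The polling-versus-push scheduling contrast discussed for Platform Symphony can be illustrated with a toy latency model. This is not a Symphony benchmark; `poll_interval` and `network_delay` are assumed parameters. The point is simply that a worker polling every `poll_interval` seconds adds, on average, half that interval of dispatch latency per task, while a push dispatcher is bounded only by transport delay.

```python
import random

# Toy model, not a Symphony measurement: average dispatch latency
# of a polling worker vs. a push-based dispatcher.

def polling_latency(arrivals, poll_interval):
    """Each task waits until the next poll tick after its arrival time."""
    waits = []
    for t in arrivals:
        next_tick = (int(t // poll_interval) + 1) * poll_interval
        waits.append(next_tick - t)
    return sum(waits) / len(waits)

def push_latency(network_delay=0.0001):
    """Push model: the task is dispatched on arrival; only transport delay applies."""
    return network_delay

random.seed(0)
arrivals = [random.uniform(0, 100) for _ in range(100_000)]
avg_poll = polling_latency(arrivals, poll_interval=0.5)  # worker polls every 500 ms
print(f"polling adds ~{avg_poll:.3f}s per task vs ~{push_latency():.4f}s for push")
```

With uniform arrivals the polling penalty converges to `poll_interval / 2` (about 0.25 s here), which is why a zero-wait push model dominates as task counts grow.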