InfoSphere Streams for Real Time Analytics in Financial Services Industry

Krishna Mamidipaka

Roger Rea

Housekeeping

We value your feedback - don't forget to complete your evaluation for each session you attend and hand it to the room monitors at the end of each session

Overall Conference Evaluation will be provided at the General Session on Friday

Visit the Expo Solutions Centre

Please remember this is a 'non-smoking' venue!

Please switch off your mobile phones

Please remember to wear your badge at all times

Disclaimer

The information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.

Agenda

Financial Markets Business Challenges

Industry Technical Challenges

InfoSphere Streams

Trend Calculator

Financial Toolkit

Data Mining in Real Time

InfoSphere Streams Directions

Firms Must Capitalize on Drivers of Change

Drivers
• Markets becoming electronic
• Real-time data pressures
• Information availability
• Transaction cost pressures

Implications
• Speed as source of Alpha
• Volume is a barrier
• Transparency is required

Actions
• Accelerate the end-to-end marketplace connectivity and execution
• Increase capacity to handle current and forecasted volumes
• Store, retrieve and distribute comprehensive time series data in a timely manner
• Detailed analysis of trading process
• Access to broader markets by accessing multiple markets

Real time data pressures

• We are in a technology arms race
• Latency reductions with a clear business value or cost associated
• Exponential increases in volumes

For a US equity electronic trading brokerage, 1 millisecond = $4M in annual revenue. Source: Tabb Group

The volume, complexity and semantic depth of data to be analysed will increase significantly

Structured data: Historical Trade Data, Risk Analytics Data, Market Data

Analytics & Insight

Tomorrow? Structured & unstructured data: Risk Analytics Data, Historical Trade Data, Market Data, Internal Message Bus, Government Statistics, Real World Sensors, Blogs & Commentary, Video News Feeds, Corporate Press Reports, Weather Data, Web Pages, RSS Feeds + Other Feeds

Analytics & Insight

Information overload

The Transaction Life Cycle or latency loop – end to end latency is the key to success and there are no prizes for coming second

Latency measurement is a competitive advantage to deliver Alpha

Stages of the loop: Investment / trading goals, Market Data, Trading Decision (what to buy/sell), Execution Algorithm (VWAP), Order Routing Decision, Matching, Transaction Cost Analysis

Supporting technologies: WAN Connectivity, Middleware, CEP Engines, OMS/EMS, Exchanges

Speed: current approaches, based on x86 and networking technologies, are reaching their limits

The Manycore programming challenge

Programmers cannot cope with thousands of threads and complex data flows using existing programming models

• Yesterday: single core, single thread, 100% serial programming
• Today: multicore (2-16), multithread (10s), 80/20 serial/parallel programming
• Tomorrow: manycore (32-100s), 20/80 serial/parallel programming; the threading model breaks as complexity exceeds programmer capability

Options for exposing parallelism in a programming model

Parallelism Fully Exposed
• Full exposure of machine details
• Only usable by experts
• High performance, low productivity

Partial Exposure
• Limits exposure to machine details
• Expands programmer community
• High performance, higher productivity for C/C++ class programmers (bounds checks, pointer checks, strong typing, etc.)

Parallelism Implicit
• No exposure of machine details, e.g., Hadoop/map reduce, IBM Streams Processing Language
• Usable by larger number of programmers
• High performance, high productivity

Time is ripe for a new era of computing

Emerging trends create need for new languages

• Scientific programming → Fortran
• Business programming → Cobol
• Systems programming at higher level → C
• Increased productivity → C++
• Web programming → Java
• Streaming data sources and multicore architectures → Streams Processing Language

Delivering ‘Continuous Intelligence’ with Powerful Analytics

Automated Options Market Making:

Peak throughput of 10 million messages per second

Mean latency under 100 microseconds across 28 dual quad-core x86 blades

• Real time delivery
• Powerful Analytics
• Millions of events per second
• Traditional / Non-traditional data sources
• Microsecond Latency

IBM InfoSphere Streams v1.2

Development Environment
• Eclipse IDE
• StreamSight
• Stream Debugger

Runtime Environment
• RHEL v5.3 or v5.4
• x86 multicore hardware
• InfiniBand support
• Up to 125 servers

Toolkits & Adapters
• Front Office 3.0
• Connectors to data sources
• Operator Library
• Financial Toolkit
• Mining Toolkit

Scalable stream processing

InfoSphere Streams provides:
• A programming model and IDE for defining data sources and software analytic modules called operators, which are fused into process execution units (PEs)
• Infrastructure to support the composition of scalable stream processing applications from these components
• Deployment and operation of these applications across distributed x86 processing nodes, when scaled processing is required
• Stream connectivity between data sources and PEs of a stream processing application
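
The composition idea above (data sources feeding reusable operators that pass tuples downstream) can be illustrated with a minimal Python sketch. This is not Streams Processing Language and not the Streams API; the function names (`source`, `filter_op`, `functor_op`, `sink`, `run_pipeline`) and the tuple fields are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: models operators as Python generators.
# All names and tuple fields here are assumptions, not Streams APIs.

def source(rows):
    """Data source: emits one tuple (modelled as a dict) per input row."""
    for row in rows:
        yield dict(row)

def filter_op(stream, predicate):
    """Operator: passes through only the tuples that satisfy the predicate."""
    for tup in stream:
        if predicate(tup):
            yield tup

def functor_op(stream, fn):
    """Operator: transforms each tuple into a new tuple."""
    for tup in stream:
        yield fn(tup)

def sink(stream):
    """Sink: drains the stream and collects the results."""
    return list(stream)

def run_pipeline(rows):
    """Composes source -> filter -> functor -> sink into one application."""
    trades = source(rows)
    large = filter_op(trades, lambda t: t["volume"] >= 100)
    priced = functor_op(
        large, lambda t: {**t, "notional": t["price"] * t["volume"]}
    )
    return sink(priced)
```

Because each stage only consumes and produces tuples, stages can be rearranged or replaced without touching their neighbours, which is the property that lets a runtime decide how to group operators into processes.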

Trend Calculator Example

Inputs: Trend File 1 playback, Trend File 2 playback, Trend File 3 playback; Algo Parameters Per Symbol; Symbols to be output

Output: Up/down trend for requested symbols
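
The slide does not state which algorithm the Trend Calculator uses. As a rough sketch of the general shape (per-symbol state, a list of requested symbols, an up/down result), the Python below compares each new price with the average of the previous few prices for that symbol. The class name `TrendCalculator`, the `window` parameter and the "flat" outcome are assumptions made for illustration.

```python
from collections import defaultdict, deque

class TrendCalculator:
    """Hypothetical per-symbol trend detector (not the actual Streams example).

    A price above the average of the previous `window` prices is reported
    as "up", below as "down", and equal as "flat".
    """

    def __init__(self, requested_symbols, window=3):
        self.requested = set(requested_symbols)
        self.window = window
        # One bounded price history per symbol.
        self.history = defaultdict(lambda: deque(maxlen=window))

    def on_trade(self, symbol, price):
        """Consumes one trade; returns a trend or None if none is reported."""
        prices = self.history[symbol]
        trend = None
        # Report only for requested symbols, and only once the window is full.
        if symbol in self.requested and len(prices) == self.window:
            average = sum(prices) / self.window
            if price > average:
                trend = "up"
            elif price < average:
                trend = "down"
            else:
                trend = "flat"
        prices.append(price)
        return trend
```

Keeping the state keyed by symbol means trades for different symbols never interact, so the work could be partitioned by symbol across processes.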

Streams offers tremendous deployment flexibility

With only a simple re-compile of the application:
• All on one machine, fused into one multi-threaded process
• All on one machine; each operator in its own process
• Each operator in its own process, each process on its own machine

Financial Services Toolkit

Speeds development of Streams financial domain applications
• Adapters layer used by top two layers and user-written apps
• Functions layer used by top layer and user-written apps
• Solution Frameworks are “starter” applications that target a particular use case

Adapters, Functions, Utilities

• Financial Information Exchange (FIX) Adapters
  - fixInitiator Operator, fixAcceptor Operator, FixMessageToStream Operator, StreamToFixMessage Operator
• WebSphere Front Office for Financial Markets (WFO) Adapters
  - WFOSource Operator, WFOSink Operator
• WebSphere MQ Low-Latency Messaging (LLM) Adapters
  - MQRmmSink Operator
• Functions:
  - Coefficient of Correlation
  - “The Greeks” (Put/Call values, Delta, Theta, Rho, Charm, DualDelta, etc.)
• Operators:
  - Wrap the QuantLib open source financial analytics package
  - Provide operators to compute the theoretical value of an option:
    • EuropeanOptionValue Operator: 11 different analytic pricing engines, e.g. Black Scholes, Integral, Finite Differences, Binomial, Monte Carlo, etc.
    • AmericanOptionValue Operator: 11 different analytic pricing engines, e.g. Barone Adesi Whaley, Bjerksund Stensland, Additive Equiprobabilities, etc.
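
To make "theoretical value of an option" concrete, here is a minimal Python sketch of the textbook Black-Scholes formula for a European option on a non-dividend-paying stock, together with Delta. It is not the toolkit's EuropeanOptionValue operator and does not use QuantLib; the function names and parameter names are assumptions for illustration.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes(spot, strike, rate, vol, years):
    """Textbook Black-Scholes values for a European call and put.

    Assumes no dividends, a constant risk-free rate and constant
    volatility; `rate` and `vol` are annualised, `years` is time to expiry.
    """
    sqrt_t = math.sqrt(years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * years) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    discount = math.exp(-rate * years)
    call = spot * norm_cdf(d1) - strike * discount * norm_cdf(d2)
    put = strike * discount * norm_cdf(-d2) - spot * norm_cdf(-d1)
    return {
        "call": call,
        "put": put,
        "call_delta": norm_cdf(d1),        # sensitivity of call value to spot
        "put_delta": norm_cdf(d1) - 1.0,   # sensitivity of put value to spot
    }
```

A closed-form engine like this is cheap enough to evaluate per incoming tick; the numerical engines named on the slide (finite differences, binomial, Monte Carlo) trade more computation for fewer modelling restrictions.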

Equities Trading “Starter Application”

• Modular design
• Components are plug-replaceable: extend these or substitute your own
• Demonstrates how trading strategies may be swapped out at runtime, without stopping the rest of the application

TradingStrategy module looks for opportunities that have specific quality values and trends

OpportunityFinder module looks for opportunities and computes quality metrics

SimpleVWAPCalculator module computes a running volume-weighted average price metric
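
A running volume-weighted average price (VWAP) is the cumulative traded value divided by the cumulative traded volume. The Python sketch below shows that calculation in isolation; it is not the SimpleVWAPCalculator module itself, and the class name `RunningVWAP` is an assumption for illustration.

```python
class RunningVWAP:
    """Keeps cumulative totals so each update costs constant time."""

    def __init__(self):
        self.notional = 0.0  # running sum of price * volume
        self.volume = 0      # running sum of volume

    def update(self, price, volume):
        """Adds one trade and returns the VWAP so far (None before any volume)."""
        self.notional += price * volume
        self.volume += volume
        if self.volume == 0:
            return None
        return self.notional / self.volume
```

For example, after 100 shares at 10.0 and 300 shares at 20.0 the running VWAP is (1000 + 6000) / 400 = 17.5. A production version would typically keep one such accumulator per symbol and reset it per trading session or window.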

Options Trading “Starter Application”

DataSources module consumes incoming data; formats and maps for later use

Pricing module computes theoretical put and call values

Decision module matches theoretical values against incoming market values to identify buying opportunities

Diagram: Option Price, Stock Price, Stock Information and Risk Free Rate → DataSources (Data Filtering and Preparation) → Pricing (Theoretical Price Computation) → Decision (Identification of Buying Opportunities) → Data Sinks. Streams shown: Stock, RiskFreeRate, OptionsPriceFeedData, OptionsValue
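
The matching step performed by the Decision module can be sketched as a comparison between each quoted price and the corresponding theoretical value. The slide does not say what rule the starter application applies, so the rule below (buy when the theoretical value exceeds the ask by at least a threshold) is an assumption, as are the function name, the `min_edge` parameter and the field names.

```python
def find_buying_opportunities(quotes, theoretical, min_edge=0.05):
    """Flags options whose market ask is below the theoretical value.

    quotes:      iterable of dicts with "option_id" and "ask" (hypothetical fields)
    theoretical: dict mapping option_id -> theoretical value
    min_edge:    minimum (theoretical - ask) required to flag an opportunity
    """
    opportunities = []
    for quote in quotes:
        theo = theoretical.get(quote["option_id"])
        if theo is None:
            continue  # no theoretical value computed for this option yet
        edge = theo - quote["ask"]
        if edge >= min_edge:
            opportunities.append(
                {
                    "option_id": quote["option_id"],
                    "ask": quote["ask"],
                    "theoretical": theo,
                    "edge": edge,
                }
            )
    return opportunities
```

Keeping pricing and decision separate mirrors the modular design described above: the threshold rule can be replaced without touching how theoretical values are computed.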

Multinational Mutual Funds Manager and Broker

High speed market trend calculation system that can provide instant insights into the market behavior

Reduced the development time for adding new features to the trend calculation system from days to hours by using the Streams programming model

Customizable to run on one server or distributed across many servers to garner more compute power

Visualization tools for effective live trade monitoring and risk assessment

Transforming the Information Supply Chain to reduce the time to action!

Elapsed Time to Action

Diagram labels: Analytical Modeling & Information, Dashboards, Planning, Scorecarding, Operational Reports, Business Process & Event Management, Reports, Ad-hoc Queries; Sources, Data Integration, Operational Data Stores, Warehouse, Datamarts

Stream Computing:
• Reduces Time to Action
• Widens the aperture
• Reduces costs

More context

Market Surveillance & Fraud applications

Diagram labels: Solution User Interface (Rule Parameters, Alerts); Market Feeds and Trade Data; Enrichment; Real time analysis processing; Existing business rules; Additional sophisticated analytics; Collected results; Historical; PMML Model Scoring

What are key advantages of Streams?

Language built for Streaming applications:

Reusable operators

Rapid application development

Continuous “pipeline” processing

Compiling groups of operators into single processes enables:

Efficient use of cores

Distributed execution

Very fast data exchange

• Can be automatic or tuned
• Can be scaled with the push of a button

Use the data that gives you a competitive advantage:

Can handle virtually any data type

Use data that is too expensive and time sensitive for other approaches

Easy to extend:

Built-in adapters

Extend with C++ and Java

Extend running applications

Extremely flexible and high performance transport:

Very low latency

High data rates

IBM InfoSphere Streams directions

Tools
• Streams Studio enhancements
• Video/audio analytics
• Text/unstructured analytics
• Streams Processing Language improvements
• Native XML support

Runtime
• High Availability
• Expanded platform support
• Performance improvements

Adapters
• WebSphere MQ
• RSS feeds
• Mashup Hub
• WebSphere Business Events
• Oracle
• SQL Server
• MySQL

Diagram labels: Cognos 8 BI, WebSphere Business Events, InfoSphere Warehouse, Front Office, IBM Mashup Hub; Data in motion (Millions of events per second, Millisecond Latency); Existing business information

All statements regarding IBM's plans, directions, and intent are subject to change or withdrawal without notice. Any reliance on these statements is at the relying party's sole risk and will not create any liability or obligation for IBM.

InfoSphere Streams sessions

Time: Session Title. Location

Thursday May 20, 10:45 AM - 11:35 AM: 3666A InfoSphere Streams for Real Time Analytics in Financial Services Industry. Marriott Park Hotel, Room 14

Friday May 21, 09:00 AM - 09:50 AM: 3661A InfoSphere Streams helps Stockholm build Ver 2.0 Traffic Control System. Marriott Park Hotel, Room 13

Friday May 21, 11:30 AM - 12:30 PM: 3692A InfoSphere Streams at Marine Institute of Ireland: Deep Dive. Marriott Park Hotel, IOD Mini Theatre 3

Demo Room (Wednesday 10AM - 6PM, Thursday 10AM - 5PM, Friday 9AM - 2PM): InfoSphere Streams Demonstrations. Marriott Park Hotel, IOD Demo Room Station 19

Mini Theater on Expo Floor (Marriott Park Hotel, InfoSphere Mini Theater, Expo Floor):
• Wednesday 10:30 - 11:30: InfoSphere Streams in Telco
• Thursday 12:30 - 13:00: InfoSphere Streams Business Insight
• Thursday 16:30 - 17:00: Leverage Warehouse, SPSS with Streams