Information Management


Smart Data Analysis for IoT (Internet of Things) Applications

Kun-Lung Wu, Ph.D.
Manager, Data-Intensive Systems & Analytics Group (IBM T. J. Watson Research Center)
InfoSphere Streams Language & Research (IBM SWG)


© 2014 IBM Corporation

As IoT applications become more pervasive, there is a real-time big data explosion

Internet of Things (Everything)

– Almost anything can be equipped and connected to the Internet
– They can generate, in real time, streams and streams of data

Real-Time Big Data Explosion

– Real-time data analysis is an integral part of many IoT applications

Examples of IoT Applications

• Smart cities
– Traffic control, emergency management, etc.
• Health care
– Aiding the elderly, ICU alert management, health monitoring via wearable devices, etc.
• Agriculture & food
– Precision farming, cold chain management, etc.
• Industrial applications
– Manufacturing process monitoring, engine monitoring, etc.
• Environmental monitoring
– Water, waste, air quality, etc.
• Retail applications

What is different in IoT data?

There are many extremes:

• Volume: there are greater amounts of data
• Velocity: process and act on data more quickly, in real time
• Variety: use more types of data
• Veracity: use uncertain data

Traditional versus IoT Big Data

Leverage more of the data being captured
– Traditional approach: analyze small subsets of the available information
– IoT big data approach: analyze ALL available information

Reduce effort required to leverage data
– Traditional approach: carefully cleanse information before any analysis (a small amount of carefully cleansed information)
– IoT big data approach: analyze information as is, cleanse as needed (a very large amount of messy information)

Leverage data as it is captured
– Traditional approach: analyze data AFTER it has been processed and landed in a warehouse or mart
– IoT big data approach: analyze data IN MOTION as it is generated, in real time

Standard assumptions | Re-think for IoT data analysis

Clean and correct data | Take advantage of and tolerate uncertainty
Transactional guarantees | Good enough
Normalized, structured data | Store data in elemental form
Explicit relationships kept | Relationships found at query
ACID properties | Relaxed constraints
Centrally managed storage | Loosely distributed data
Store-and-process | Process in motion
Reliable hardware | Built with full expectation of failures
Query, insert, delete with SQL | Query, operators, analytics at point of data
Reference/context data on disk | Reference and context data in memory

From data at rest to data in motion

[Figure: data at rest versus data in motion]

IBM InfoSphere Streams Delivers Real-Time Analytics for Big Data in Motion

• Volume: terabytes per second, petabytes per day
• Variety: all kinds of data, all kinds of analytics
• Velocity: insights in microseconds (millions of events per second, microsecond latency)

Traditional / non-traditional data sources. Example streaming data sources: video, audio, networks, social media

Powerful analytics, real-time delivery: ICU monitoring, environment monitoring, algorithmic trading, cyber security, government / law enforcement, telco churn prediction, smart grid

Big Data in Real Time with Stream Processing

– Filter / Sample
– Modify
– Annotate
– Analyze
– Fuse
– Classify
– Score
– Windowed Aggregates
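Two of the stages named above, a filter and a windowed aggregate, can be sketched in a few lines of plain Python. This is only an illustration of the processing pattern. It is not SPL and does not use any InfoSphere Streams API. The `Reading` type, the function names, the value bounds, and the 10-second window are all assumptions made for the example.

```python
from collections import namedtuple

# Hypothetical tuple type for one sensor reading (not from the slides).
Reading = namedtuple("Reading", ["sensor_id", "timestamp", "value"])


def filter_valid(readings, low, high):
    """Filter stage: pass through only readings whose value is in [low, high]."""
    for r in readings:
        if low <= r.value <= high:
            yield r


def tumbling_average(readings, window_seconds):
    """Windowed aggregate stage: yield (window_start, mean value) for each
    non-overlapping window. Assumes readings arrive in timestamp order."""
    current_window = None
    total, count = 0.0, 0
    for r in readings:
        window = r.timestamp - (r.timestamp % window_seconds)
        if current_window is not None and window != current_window:
            # The reading belongs to a later window: emit the finished one.
            yield (current_window, total / count)
            total, count = 0.0, 0
        current_window = window
        total += r.value
        count += 1
    if count:
        # Flush the last, possibly partial, window at end of input.
        yield (current_window, total / count)


# Synthetic input; a real source would be unbounded.
stream = [
    Reading("t1", 0, 20.0),
    Reading("t1", 4, 22.0),
    Reading("t1", 7, 999.0),  # glitch, dropped by the filter
    Reading("t1", 11, 24.0),
    Reading("t1", 15, 26.0),
]

averages = list(tumbling_average(filter_valid(stream, -50.0, 150.0), window_seconds=10))
print(averages)  # [(0, 21.0), (10, 25.0)]
```

Because both stages are generators, each reading flows through the whole pipeline as it arrives and nothing is stored beyond the running sum for the current window, which is the essence of processing data in motion.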

InfoSphere Streams: For superior real time analytic processing

Streams Processing Language (SPL) built for streaming applications:
– Reusable operators
– Rapid application development
– Continuous “pipeline” processing

Use the data that gives you a competitive advantage:
– Can handle virtually any data type
– Use data that is too expensive and time sensitive for traditional approaches

Compile groups of operators into single processes:
– Efficient use of cores
– Distributed execution
– Very fast data exchange
– Can be automatic or tuned
– Scaled with push of a button

Easy to extend:
– Built-in adaptors
– Users add capability with familiar C++ and Java

Easy to manage:
– Automatic placement
– Extend applications incrementally without downtime
– Multi-user / multiple applications

Flexible and high performance transport:
– Very low latency
– High data rates

Dynamic analysis:
– Programmatically change topology at runtime
– Create new subscriptions
– Create new port properties

What Are People Doing With Streams?

Telephony
– CDR processing
– Social analysis
– Churn prediction
– Geomapping

Transportation
– Intelligent traffic management

Stock market
– Impact of weather on securities prices
– Analyze market data at ultra-low latencies

Law Enforcement, Defense & Cyber-Security
– Real-time multimodal surveillance
– Situational awareness
– Cyber security detection

Smart Grid & Energy
– Transactive control
– Phasor Monitoring Unit

Health & Life Sciences
– Neonatal ICU monitoring
– Epidemic early warning system
– Remote healthcare monitoring

Natural Systems
– Wildfire management
– Water management

Fraud prevention
– Detecting multi-party fraud
– Real-time fraud prevention

e-Science
– Space weather prediction
– Detection of transient events
– Synchrotron atomic research

Other
– Manufacturing
– Text Analysis
– Who’s Talking to Whom?
– ERP for Commodities
– FPGA Acceleration

Asian telco reduces billing costs and improves customer satisfaction

Problem: Call volume increased to the point that batch processing in a warehouse no longer worked: 1) too expensive, 2) too slow, and 3) no capacity left for BI

Solution: Real-time mediation and analysis of 8B CDRs per day
– Data processing time reduced from 12 hrs to 1 sec
– Hardware cost reduced to 1/8th

Further enabled: Proactively addressing issues impacting customer satisfaction, real-time offers based on usage
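To put the quoted figures in perspective, the short calculation below converts them into a sustained rate and a latency ratio. It is back-of-the-envelope arithmetic only. It assumes CDRs arrive evenly over the day, which the slide does not claim, so real peak rates would be higher than this average.

```python
SECONDS_PER_DAY = 24 * 60 * 60

cdrs_per_day = 8_000_000_000  # "8B CDRs per day" from the slide
average_cdrs_per_second = cdrs_per_day / SECONDS_PER_DAY  # assumes uniform arrival

batch_seconds = 12 * 60 * 60  # 12 hours of batch processing
streaming_seconds = 1         # 1 second with stream processing
latency_reduction_factor = batch_seconds / streaming_seconds

print(f"{average_cdrs_per_second:,.0f} CDRs per second on average")  # 92,593
print(f"{latency_reduction_factor:,.0f}x shorter processing time")   # 43,200x
```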

Harnessing the Largest Predictive Focus Group in the World

Purpose
– Understand public sentiment towards an event: movie trailers
– Deeply understand the potential customer profile: gender, occupation, intent to watch
– Alter marketing launch plans based on insight

Background
– 1.1 billion tweets analyzed
– 5.7 million blogs/forum posts
– 3.5 million messages
– Also: Facebook, Google+, Tumblr, Flickr

University of Ontario Institute of Technology (UOIT) Detects Neonatal Patient Symptoms Sooner

“Helps detect life threatening conditions up to 24 hours sooner”

• Performing real-time analytics using physiological data from neonatal babies
• Continuously correlates data from medical monitors to detect subtle changes and alert hospital staff sooner
• Early warning gives caregivers the ability to proactively deal with complications
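The general idea of watching a continuous signal for changes relative to its own recent baseline can be sketched as follows. This is not the UOIT system or its algorithm, which the slides do not describe, and it has no clinical validity. It watches a single stream rather than correlating several monitors. The class name, window size, threshold, and heart-rate values are all hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


class RollingDeviationDetector:
    """Flags a value that deviates from the mean of the last `window_size`
    values by more than `threshold` standard deviations."""

    def __init__(self, window_size=30, threshold=3.0):
        self.history = deque(maxlen=window_size)
        self.threshold = threshold

    def update(self, value):
        """Feed one new value; return True if it should raise an alert."""
        alert = False
        # Only judge new values once a full baseline window is available.
        if len(self.history) == self.history.maxlen:
            baseline = mean(self.history)
            spread = pstdev(self.history)
            if spread > 0 and abs(value - baseline) > self.threshold * spread:
                alert = True
        self.history.append(value)
        return alert


# Synthetic values, one per arriving sample.
detector = RollingDeviationDetector(window_size=5, threshold=3.0)
heart_rates = [140, 142, 141, 139, 140, 141, 175, 140]
alerts = [hr for hr in heart_rates if detector.update(hr)]
print(alerts)  # [175]
```

The detector keeps only a fixed-size window in memory and decides on each value as it arrives, so an alert is available immediately rather than after a batch job.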

Challenges and opportunities

• Approach overload
– Is there a convergence of approaches?
– Is there a “write once, use any technology” approach across tool types?
• Skills to apply techniques
– Reduce the skill required?
– More people who can be data scientists, developers, and business/domain savvy?
• Uncertain data
– Confidence levels need to follow data and decisions
• New analytic algorithms
– Real time learning and adaptation?
– More automation
• Availability
– What does it mean for in-memory systems?
– How should disaster recovery work?
• Cloud
– Security of data
– Data movement
• Data governance, security, and privacy
• What new problems can we solve?

To Learn More

Resources
– Streams: streamsDev
– IBM Big Data: ibm.com/bigdata
– IBMBigDataHub.com
– BigDataUniversity.com
– Books / analyst papers

Try Stream Processing

http://Ibm.co/streamsqs

2 download options!
