Platform Overlays: Enabling In-Network Stream Processing

Addressing Data Compatibility on Programmable Network Platforms
Ada Gavrilovska, Karsten Schwan
College of Computing, Georgia Tech
advanced network services:
Delta AirLines: An Operational Information System

[Figure: OIS architecture. Airplane data traffic feeds real-time information transport (capture, display, transport, filter, transform) and real-time information processing on high-performance computing clusters (simulation, optimization), with recovery and replay, real-time decision tools, real-time situation assessment, FAA flight data, passenger paging and response, and equipment inspection. Data travels over wide-area transport and airport LANs to gate readers, visualization, storage, operational flight displays, baggage displays, crew and equipment status, baggage status, and security systems: distributed networked infrastructures requiring scalable, robust services and service quality guarantees.]

– transformation for interoperability with external partners/heterogeneous clients
– data integration from multiple sources
– distribution to/specialization for multiple sinks
Need for Efficient Data Exchanges
• Data Exchanges:
– across distributed components, from heterogeneous sources to a variety of clients
– discrepancies among the data representations used by sources, clients, or intermediate application components
• (e.g., due to natural mismatches or due to dynamic component evolution)
– requirements to route, combine, or otherwise manipulate data as it is being transferred
– efficiency == perceived service quality, honoring performance guarantees (SLAs…)
• expressed in the context of the application
Network-near service execution
• Existing middleware-level approaches enable ‘horizontally’ flexible service deployment solutions:
– on nodes along the data path
• Our approach: enable ‘vertical’ flexibility by permitting applications to push service execution closer to the network
• Our assumption: nodes have multiple execution contexts
– smart, programmable NICs
– dedicated/specialized cores for communication
– attached network processors (Intel IXP NPs)
‘In-transit’ Data Transformations
• by default, deliver data to/from application components on a general-purpose processing context
– kernel or application (OS-bypass capabilities)
• enable execution of middleware/application-level processing actions jointly with communications
– use metadata to describe application data, processing actions & requirements, platform state… (see the metadata sketch below)
– offload computation from the host CPU, enable direct data placement of needed data, benefit from specialized hardware…
• configure paths dynamically
– application needs, context capabilities…
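As a rough illustration of the metadata idea above, the sketch below shows the kind of per-message descriptor an application might attach; the field names and layout are assumptions for illustration, not the actual format used by the system.

```c
/* Hypothetical per-message metadata describing application-level data and
 * the in-transit processing to apply; field names are illustrative only. */
#include <stdint.h>

struct app_metadata {
    uint32_t format_id;     /* identifies the application-level data format      */
    uint32_t stream_tag;    /* classification tag (network + application fields) */
    uint16_t handler_chain; /* which chain of processing actions to apply        */
    uint16_t deadline_ms;   /* application-level latency requirement             */
    uint32_t payload_len;   /* length of the application payload that follows    */
};
```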
Enabling In-Transit Data Morphing
Represent application-level data
– reliable UDP in the NP
– application meta-data for format description
Classification
– flexible classification based on a tag consisting of network- and application-level fields
Handlers
– stream handlers: computational units applied to application data, can be executed jointly with the fast path at well-defined application points (see the handler-chain sketch below)
– may be chained -> handler chains
Operations
– support individual data manipulations as well as merging and splitting
Reconfiguration
– modify the data path through the platform, change parameters, or deploy new code
Execution environment
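A minimal sketch of the stream-handler and handler-chain abstractions described above; the types and signatures are assumptions for illustration, not the project's actual interfaces.

```c
/* Sketch of stream handlers and handler chains (hypothetical API). */
#include <stddef.h>

/* A stream handler is a computational unit applied to application data
 * as it moves along the fast path. */
typedef int (*stream_handler_t)(void *data, size_t len, void *state);

/* Handlers may be chained; each handler in the chain runs in turn on the
 * same data item. */
struct handler_chain {
    stream_handler_t handlers[8];
    void            *state[8];
    int              count;
};

static int run_chain(struct handler_chain *c, void *data, size_t len)
{
    for (int i = 0; i < c->count; i++)
        if (c->handlers[i](data, len, c->state[i]) < 0)
            return -1;   /* a handler may drop or terminate the item */
    return 0;
}
```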
Meeting application-level quality
• every 3s deliver complete, or at least partial, updates (see the deadline sketch below)
– if latency increases, immediately drop the remainder of the data item
– e.g., data reformatting…
• under heavy loads, maintain acceptable service levels for critical data/customers only
– discard all other data, deploy specialized ‘handlers’ for critical data
– e.g., filter + downsample…
• deploy a service to the processing context better suited for its execution
– service implementation and performance profile differ based on the context/resources available
– e.g., multicast…
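A sketch of the deadline-driven behavior in the first bullet, assuming a POSIX monotonic clock as the timing source: when the 3-second update window is exceeded, whatever portion of the data item has been assembled is delivered and the remainder is dropped. Names and the timing mechanism are illustrative.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define UPDATE_WINDOW_MS 3000   /* "every 3s deliver complete or partial updates" */

static uint64_t now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000 + (uint64_t)ts.tv_nsec / 1000000;
}

/* Returns the number of bytes to forward; the rest of the item is discarded. */
static size_t deadline_handler(uint64_t item_start_ms, size_t assembled, size_t total)
{
    if (now_ms() - item_start_ms >= UPDATE_WINDOW_MS)
        return assembled;   /* deadline reached: ship the partial update */
    return total;           /* still within the window: deliver the full item */
}
```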
Physical Testbed
• plus IXP2850 as alternate switch
Handler chaining: feasibility and complexity

[Figure: throughput (Mbps, 0-100) vs. number of rule chains (1-3) for two event streams, ‘caterer’ and ‘blocked seats’.]
Benefit from specialized hardware
• In-transit data morphing
– merging data from multiple clients, distributing subsets of it, reformatting… (see the merge sketch below)
– varied merge criteria; performance dominated by the merge operation, not by hash-table occupancy in our case
• Other services
– data reformatting, multicast/mirroring, filtering…
– ‘database-operands’

[Figure: processing time (cycles) of the merging handler by location and fan-in: host with fan-in 2; IXP with fan-in 2, 3, and 4.]
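An illustrative sketch of a merging handler that combines records arriving from several client streams (the fan-in) keyed on a shared field; the names, fixed-size slots, and hash-table layout are assumptions for illustration, not the actual handler, which runs on the IXP microengines.

```c
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 256

struct merged_record {
    uint32_t key;          /* e.g., a flight identifier shared across streams  */
    uint32_t seen_mask;    /* which of the fan-in streams have contributed     */
    uint8_t  payload[64];  /* merged application payload (16 bytes per stream) */
};

static struct merged_record table[TABLE_SIZE];

/* Merge one 16-byte fragment from stream 'src' (0..fan_in-1); returns 1 when
 * every stream has contributed and the merged record can be forwarded. */
static int merge(uint32_t key, unsigned src, unsigned fan_in, const uint8_t *data)
{
    struct merged_record *r = &table[key % TABLE_SIZE];
    if (r->key != key) {                        /* keep it simple: overwrite on collision */
        memset(r, 0, sizeof(*r));
        r->key = key;
    }
    memcpy(&r->payload[src * 16], data, 16);    /* each stream owns its own slot */
    r->seen_mask |= 1u << src;
    return r->seen_mask == (1u << fan_in) - 1u; /* complete once all streams arrived */
}
```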
Benefit from appropriate service placement

[Figure: output throughput (Mbps, 0-800) of a generic vs. a specialized handler for input stream rates of 200, 350, 495, 695, and 830 Mbps.]

• Performance advantage:
– offload and overlap communication/computation
– deploy specialized actions
Conclusions and future work
• Programmable networking platforms are suitable for efficient execution of higher-level services
• Select classes of services benefit from the parallelism and specialized hardware components available
• Flexible reconfiguration is needed to address dynamics in application interests and operating conditions
• Understanding of handler resource requirements, efficient monitoring of platform resource capabilities, and compiler tools are needed
• Currently integrating the runtime environment underneath an existing event-based middleware system
• Considering future (heterogeneous) multicore platforms
• Other services – e.g., virtualization
• www.cercs.gatech.edu/projects/npg
Query Performance
Scalable Data Distribution (contd.)
• Graphs…