Storage Systems in HPC
John A. Chandy
Department of Electrical and Computer Engineering
University of Connecticut
Research Summary
• Storage Systems
– Active Storage
– Parallel File Systems
– Reliable Data Storage
– Active Storage Networks
Storage Systems
• Parallel Computing
– Building parallel file systems to support HPC
– Computation at the storage node
– Data organization methods to improve performance
• Reliable Data Storage
– Customizable and extensible storage for reliability
– Backup strategies using personal storage devices
– Data security, trust, and reliability in the cloud
Parallel File Systems
• Network Attached Storage
– Put the storage on the network with a computer (server) acting as the go-between
[Figure label: Network]
Parallel File Systems
• Separate the metadata from the storage
[Figure labels: Metadata, Network]
Parallel File Systems
• How do you improve metadata performance?
– Distribute metadata services on data nodes
– Use active storage and object services
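One way to read "distribute metadata services on data nodes" is hash-based placement: every client hashes the file path and so agrees on which data node owns that file's metadata, with no central metadata server. The sketch below is only an illustration under that assumption. The node names, `metadata_owner`, and `MetadataShard` are hypothetical and are not taken from the talk.

```python
import hashlib

# Hypothetical data node names; a real system would discover these.
DATA_NODES = ["node0", "node1", "node2", "node3"]


def metadata_owner(path, nodes=DATA_NODES):
    """Pick the data node that holds the metadata for `path`.

    Hashing the path gives every client the same answer without
    consulting a central metadata server.
    """
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


class MetadataShard:
    """In-memory stand-in for the metadata service on one data node."""

    def __init__(self):
        self.entries = {}

    def create(self, path, size=0):
        self.entries[path] = {"size": size}

    def lookup(self, path):
        return self.entries[path]


# One shard per data node.
shards = {name: MetadataShard() for name in DATA_NODES}


def create_file(path, size=0):
    shards[metadata_owner(path)].create(path, size)


def stat_file(path):
    return shards[metadata_owner(path)].lookup(path)
```

Simple modulo placement remaps most paths when the node count changes; a production design would more likely use consistent hashing or a directory-based scheme.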
Active Storage
• Allows us to run applications on storage nodes
• Can dramatically reduce data traffic
– Eliminate large network latencies
• Take advantage of fast RAID arrays and SSDs
– Drives are otherwise bottlenecked by slow networks
• Run applications in parallel across multiple nodes
• Make use of unused processor time
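The traffic reduction can be illustrated with a toy comparison: count occurrences of a pattern either by shipping all the data to the client, or by running the count at the storage node and returning only the result. Both function names and the 8-byte reply format are assumptions made for this sketch.

```python
def count_on_client(storage_node_data, needle):
    """Conventional path: ship every byte to the client, then scan.

    Returns (result, bytes_moved_over_network).
    """
    shipped = bytes(storage_node_data)  # whole object crosses the network
    return shipped.count(needle), len(shipped)


def count_on_storage_node(storage_node_data, needle):
    """Active storage path: scan in place, ship only the answer.

    Returns (result, bytes_moved_over_network).
    """
    result = storage_node_data.count(needle)
    reply = result.to_bytes(8, "big")  # assumed fixed-size reply
    return int.from_bytes(reply, "big"), len(reply)
```

For a 3000-byte object both paths give the same count, but the first moves 3000 bytes and the second moves 8.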
Programming Model
• Based on object storage
• RPC based
– Executable objects
– RPC calls have full access to all object functions: read, write, create, set attribute, etc.
• Functions can be synchronous or asynchronous
• Supports multiple languages (C, Java, Python)
Programming Model
• Builds on work by Acharya and Riedel, which was stream based
• Our model is Remote Procedure Call (RPC) based
– Uses executable objects
– Adds a command to begin execution
– Allows full access to all OSD (object storage device) functions
• Functions can run synchronously or asynchronously
– Due to the 30-second iSCSI timeout
– Working to allow queries for async functions
• Async allows parallel execution
• Supports multiple languages (C, Java, Python)
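The RPC-style model can be sketched with a toy in-process object store. This is not the authors' API: `ToyOSD`, `execute`, and the `checksum` function are hypothetical names, and the "executable object" here is simply a Python callable stored as an object. The async path returns a future immediately, which is one way a long-running function could avoid a transport timeout.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyOSD:
    """Hypothetical stand-in for an object storage device."""

    def __init__(self):
        self.objects = {}
        self.attrs = {}
        self.pool = ThreadPoolExecutor(max_workers=4)

    # Ordinary object functions, all available to executable objects.
    def create(self, oid):
        self.objects[oid] = b""
        self.attrs[oid] = {}

    def write(self, oid, data):
        self.objects[oid] = data

    def read(self, oid):
        return self.objects[oid]

    def set_attr(self, oid, key, value):
        self.attrs[oid][key] = value

    # The added "begin execution" command.
    def execute(self, func_oid, *args, sync=True):
        func = self.objects[func_oid]  # executable object (a callable here)
        if sync:
            return func(self, *args)   # caller blocks until done
        return self.pool.submit(func, self, *args)  # returns a Future


def checksum(osd, src_oid, dst_oid):
    """Example executable object: sum the bytes of one object into another."""
    total = sum(osd.read(src_oid)) % 65536
    osd.create(dst_oid)
    osd.write(dst_oid, total.to_bytes(2, "big"))
    osd.set_attr(dst_oid, "source", src_oid)
    return total
```

Submitting the same function asynchronously to several such stores and collecting the futures afterwards gives the parallel execution mentioned above.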
Security
• Multiprocess implementation
– Limits AS functions from directly accessing objects
– Limits access to the object services library
– Enforces use of object security mechanisms
• chroot sandboxing
– C/Java engines run in a chroot directory
– Allows a limited set of system libraries, e.g. libc
Security
• Multiprocess implementation
– Limits AS functions from directly accessing objects
– Limits access to the OSD services library, forcing the use of RPC
– Enforces the use of OSD security mechanisms
• chroot sandboxing
– Applied to engines
– Confines each engine to a single directory
– Allows limiting of libraries; AS versions of libraries are possible
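The idea of forcing every object access through a checked RPC path can be sketched as a stub that validates each call against a capability. This is a simplified, single-process illustration: `RpcStub`, `CapabilityError`, and the dictionary capability format are hypothetical, and in the design described above the real isolation comes from the separate process and the chroot, which this sketch does not reproduce.

```python
class CapabilityError(PermissionError):
    """Raised when a call is not covered by the caller's capability."""


class ObjectStore:
    """Stand-in for the object services library (not handed to functions)."""

    def __init__(self):
        self._objects = {}

    def read(self, oid):
        return self._objects[oid]

    def write(self, oid, data):
        self._objects[oid] = data


class RpcStub:
    """The only handle an active storage function receives.

    `capability` maps object IDs to the set of permitted operations,
    e.g. {"input": {"read"}, "output": {"read", "write"}}.
    """

    def __init__(self, store, capability):
        self._store = store
        self._capability = capability

    def _check(self, oid, op):
        if op not in self._capability.get(oid, set()):
            raise CapabilityError(f"{op} on {oid!r} not permitted")

    def read(self, oid):
        self._check(oid, "read")
        return self._store.read(oid)

    def write(self, oid, data):
        self._check(oid, "write")
        self._store.write(oid, data)
```

A function given a stub with read-only access to one object can read it, while any write, or any access to another object, raises `CapabilityError`.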
Active Storage Code Example
Results: AES Local vs. Active Storage
Results: Scaling with Multiple OSDs
Results: C vs. Java
High Performance Computing
• Active storage network
– Computing in the network
– SIMD-like processing of data in motion
– Adaptive computing network elements
– Application optimizations for database queries, scientific applications, data mining, sort, etc.
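Processing data in motion can be pictured as a network element that applies the same operation to every record as it streams from storage to the client, for example a database-style selection. The generator sketch below is an analogy only; `storage_stream` and `network_filter` are hypothetical names, and a real network element would run on adaptive hardware rather than in Python.

```python
def storage_stream(records):
    """Stand-in for records leaving a storage node one at a time."""
    for record in records:
        yield record


def network_filter(stream, predicate):
    """Stand-in for a network element: apply one operation to each
    record in flight and forward only the records that match."""
    for record in stream:
        if predicate(record):
            yield record


rows = [
    {"id": 1, "temp": 25},
    {"id": 2, "temp": 31},
    {"id": 3, "temp": 40},
]

# Only matching rows reach the client; nothing is buffered in the network.
hot_rows = list(network_filter(storage_stream(rows), lambda r: r["temp"] > 30))
```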
Active Storage Networks
Data Sort
BECAT Collaboration
• Large Data Problems
• Parallel File Systems Implementation