CSCE569 Parallel Computing
Lecture 1
TTH 03:30PM-04:45PM
Dr. Jianjun Hu
http://mleg.cse.sc.edu/edu/csce569/
University of South Carolina
Department of Computer Science and Engineering
CSCE569 Course Information
Meet time: TTH 03:30PM-04:45PM, Swearingen 2A21
4 homework assignments
  Use the CSE turn-in system to submit your homework (https://dropbox.cse.sc.edu)
  Deadline policy
1 midterm exam (conceptual understanding)
1 final project (deliverable to your future employer!)
  Teamwork
  Implementation project/research project
TA: No TA.
CSCE569 Course Information
Textbook and references
  Parallel Programming: for Multicore and Cluster Systems, by Thomas Rauber and Gudula Rünger. Springer, 1st edition (March 10, 2010)
  Good reference book: Parallel Programming in C with MPI and OpenMP, by Michael J. Quinn
  Most important information source: slides
Grading policy
  4 homework assignments, 1 midterm, 1 final project, in-class participation
About Your Instructor
Dr. Jianjun Hu ([email protected])
Office hours: TTH 2:30-3:20PM, or drop by any time
Office phone: 803-777-7304
Office: 3A66 SWNG
Background:
  Mechanical Engineering/CAD
  Machine learning/Computational intelligence/Genetic Algorithms/Genetic Programming (PhD)
  Bioinformatics and Genomics (Postdoc)
  Multidisciplinary, just like parallel computing applications
Outline
Motivation
Modern scientific method
Evolution of supercomputing
Modern parallel computers
Seeking concurrency
Data clustering case study
Programming parallel computers
Why Are You Here?
Solve BIG problems
Use supercomputers
Write parallel programs
Why Faster Computers?
Solve compute-intensive problems faster
Make infeasible problems feasible
Reduce design time
Solve larger problems in the same amount of time
Improve the precision of answers
Gain competitive advantage
Why Parallel Computing?
The massively parallel architecture of GPUs, coming from its graphics heritage, is now delivering transformative results for scientists and researchers all over the world.
For some of the world’s most challenging problems in medical research, drug discovery, weather modeling, and seismic exploration, computation is the ultimate tool. Without it, research would still be confined to trial and error-based physical experiments and observation.
What Problems Need Parallel Computing?
Parallel computing in the real world:
  Engineering
  Science
  Business
  Games
  Cloud computing
What Can This Course Do for You?
Understand parallel computer architectures
Develop parallel programs for both clusters and shared-memory multi-core systems (MPI/OpenMP)
Know the basics of CUDA programming
Learn to do performance analysis of parallel programs
Definitions
Parallel computing: using a parallel computer to solve single problems faster
Parallel computer: a multiple-processor/core system supporting parallel programming
Parallel programming: programming in a language that supports concurrency explicitly
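The "single problem, solved faster" idea can be sketched as splitting one computation into independent pieces and combining the partial results. The Python snippet below is only an illustration of that decomposition pattern: the names `partial_sum` and `parallel_sum` and the default worker count are assumptions made for this example, not course code, and CPython threads do not actually speed up CPU-bound work. The course itself uses MPI and OpenMP.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(bounds):
    """Sum the integers in the half-open range [lo, hi)."""
    lo, hi = bounds
    return sum(range(lo, hi))


def parallel_sum(n, workers=4):
    """Sum 0..n-1 by splitting the range into contiguous chunks.

    `workers` is a hypothetical tuning knob for this sketch.
    """
    # Ceiling division so the chunks cover [0, n); at least 1 to keep range() valid.
    step = max(1, (n + workers - 1) // workers)
    chunks = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each chunk is summed independently; the partial sums are then combined.
        return sum(pool.map(partial_sum, chunks))
```

For example, `parallel_sum(1000)` gives the same result as `sum(range(1000))`; only the way the work is divided differs.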
Classical Science
Physical Experimentation Nature Observation Theory
Modern Scientific Method
Nature Observation Numerical Simulation Physical Experimentation Theory
Evolution of Supercomputing
World War II
  Hand-computed artillery tables
  Need to speed computations
  ENIAC
Cold War
  Nuclear weapon design
  Intelligence gathering
  Code-breaking
Supercomputer
General-purpose computer
Solves individual problems at high speeds, compared with contemporary systems
Typically costs $10 million or more
Traditionally found in government labs
Commercial Supercomputing
Started in capital-intensive industries
  Petroleum exploration
  Automobile manufacturing
Other companies followed suit
  Pharmaceutical design
  Consumer products
CPUs 1 Million Times Faster
Faster clock speeds
Greater system concurrency
  Multiple functional units
  Concurrent instruction execution
  Speculative instruction execution
Systems 1 Billion Times Faster
Processors are 1 million times faster
Combine thousands of processors
Parallel computer
  Multiple processors
  Supports parallel programming
Parallel computing = using a parallel computer to execute a program faster
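The "1 billion" figure follows from multiplying the two factors on this slide. A minimal sketch of that arithmetic, assuming ideal linear scaling with no communication overhead (an assumption made here for illustration; the function name `ideal_system_speedup` is hypothetical):

```python
def ideal_system_speedup(per_processor_speedup, processor_count):
    """Best-case system speedup if every processor contributes fully."""
    return per_processor_speedup * processor_count


# 1 million times faster processors, combined a thousand at a time.
speedup = ideal_system_speedup(1_000_000, 1_000)
print(speedup == 10**9)
```

Real systems fall short of this bound, which is why the course covers performance analysis.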
Beowulf Concept
NASA (Sterling and Becker)
Commodity processors
Commodity interconnect
Linux operating system
Message Passing Interface (MPI) library
High performance/$ for certain applications
Computing speed of supercomputers
Projected Computing speed of supercomputers
Top 10 Supercomputers 2010.11
GPU
What you can use
Hardware
  Multicore chips (2011: mostly 2 cores and 4 cores, but doubling) (cores = processors)
  Servers (often 2 or 4 multicores sharing memory)
  Clusters (often several, to tens, and many more servers not sharing memory)
Supercomputer at USC CEC
Supercomputers at USC CEC
76 Compute Nodes w/ dual 3.4 GHz 64 Nodes: Dual CPU
Supercomputers at USC CEC
SGI Altix 4700 Shared-memory system
Hardware
128 Itanium cores @ 1.6 GHz / 8 MB cache
256 GB RAM
8 TB storage
NUMAlink interconnect fabric
Software
SUSE10 w/SGI PROPACK
Intel C/C++ and Fortran compilers
VASP
PBSPro scheduling software
Message Passing Toolkit
Intel Math Kernel Library
GNU Scientific Library
Boost library
Some historical machines
Earth Simulator was #1
Some interesting hardware
Nvidia
Cell Processor
Sicortex: “Teraflops from Milliwatts” http://www.sicortex.com/products/sc648
http://www.gizmag.com/mit-cycling-human-powered-computation/8503/
GPU-based supercomputing+CUDA
Topic1: Hardware Architecture of parallel computing system
Topic2: Programming/Software
Common parallel computing methods
PBS: job scheduling system
MPI: the Message Passing Interface
  Low-level “lowest common denominator” language that the world has stuck with for nearly 20 years
  Can get performance, but can be a hindrance as well
Pthreads: multi-core shared-memory parallel programming
CUDA: GPU programming
MapReduce: Google-style high-performance computing
Why MPI?
MPI = “Message Passing Interface”
Standard specification for message-passing libraries
Libraries available on virtually all parallel computers
Free libraries also available for networks of workstations or commodity clusters
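To give a feel for the message-passing style, here is a small Python stand-in. It does not use MPI: it mimics the pattern with threads that exchange data only through `queue.Queue` objects, which play the role of send and receive. The names `worker` and `scatter_and_reduce` are hypothetical, and threads in one process do share memory, so this shows only the communication structure, not a real distributed program.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Receive one chunk of numbers, sum it, and send the result back."""
    chunk = inbox.get()              # blocking receive
    outbox.put((rank, sum(chunk)))   # send (rank, partial result)


def scatter_and_reduce(data, num_workers=3):
    """Distribute `data` to workers by message and combine their replies."""
    inboxes = [queue.Queue() for _ in range(num_workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(num_workers)
    ]
    for t in threads:
        t.start()
    # "Scatter": each worker gets every num_workers-th element.
    for rank in range(num_workers):
        inboxes[rank].put(data[rank::num_workers])
    # "Reduce": collect one reply per worker and add them up.
    partials = [results.get() for _ in range(num_workers)]
    for t in threads:
        t.join()
    return sum(value for _, value in partials)
```

Calling `scatter_and_reduce(list(range(10)))` returns the same total as `sum(range(10))`; the point is that workers see only what they are sent.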
Why OpenMP?
OpenMP is an application programming interface (API) for shared-memory systems
Supports higher-performance parallel programming of symmetrical multiprocessors
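The shared-memory style can be sketched in Python as well, again as an illustration rather than OpenMP itself. All threads read the same `data` list directly, and each writes only to its own slot of a shared `partials` list, loosely resembling a parallel loop with a sum reduction. The function name `shared_memory_sum` and the thread count are assumptions for this example.

```python
import threading


def shared_memory_sum(data, num_threads=4):
    """Sum `data` using threads that all see the same memory."""
    partials = [0] * num_threads  # shared array, one slot per thread

    def work(tid):
        # Read the shared input directly; write only this thread's own slot,
        # so no lock is needed.
        partials[tid] = sum(data[tid::num_threads])

    threads = [threading.Thread(target=work, args=(tid,)) for tid in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partials)
```

No data is copied or sent between workers here, which is the key contrast with message passing.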
Topic3: Performance
Single-processor speeds are, for now, no longer growing.
Moore’s law still allows for more real estate per core (transistor counts double nearly every two years) http://www.intel.com/technology/mooreslaw/index.htm
People want performance, but it is hard to get
Slowdowns seen before speedups
Flops (floating point ops / second)
  Gigaflops (10^9), Teraflops (10^12), Petaflops (10^15)
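The flops prefixes above can be made concrete with a small helper. This is a minimal sketch; the table `FLOPS_UNITS` and the function `format_flops` are names invented for this example.

```python
# Unit names and scales as listed on the slide, largest first.
FLOPS_UNITS = [
    ("Petaflops", 10**15),
    ("Teraflops", 10**12),
    ("Gigaflops", 10**9),
]


def format_flops(flops):
    """Express a raw flops figure in the largest unit that fits."""
    for name, scale in FLOPS_UNITS:
        if flops >= scale:
            return f"{flops / scale:g} {name}"
    return f"{flops:g} flops"


print(format_flops(2.5e12))  # 2.5 Teraflops
```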
Summary (1/2)
High performance computing
  U.S. government
  Capital-intensive industries
  Many companies and research labs
Parallel computers
  Commercial systems
  Commodity-based systems
Summary (2/2)
Power of CPUs keeps growing exponentially
Parallel programming environments changing very slowly
Two standards have emerged
  MPI library, for processes that do not share memory
  OpenMP directives, for processes that do share memory
Places to Look
Best current news: http://www.hpcwire.com/
Huge conference: http://sc09.supercomputing.org/
http://www.interactivesupercomputing.com
Top500.org