AMMPI - Summary
• Active Messages–2 (AM) implementation over MPI version 1.1
  – Porting is trivial: works on virtually any platform that has MPI 1.1
  – Often provides very high performance, because vendors tune their MPI well
  – Linux/Myrinet, MPICH, IBM SP3, Origin 2000, Cray T3E, many others…
• Based on the AMUDP code base, same cool features
  – Robust, clear error reporting for ease of debugging
  – SPMD bootstrapping library (but we use site-specific mpirun)
  – Network performance/utilization monitoring API
• MPI Interface
  – Non-blocking sends, non-blocking receives
  – Uses MPI communicators to co-exist happily with other MPI-aware layers
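As a rough illustration of the active-message model that AMMPI implements, the sketch below shows a message that names a handler by index and carries its arguments; the receiver runs that handler when it polls, and the handler may send a reply. This is a toy in-process model in Python, not the AMMPI or AM-2 API: the names `Endpoint`, `register`, `request`, and `poll` are hypothetical, and a deque stands in for the MPI transport.

```python
from collections import deque


class Endpoint:
    """Toy in-process endpoint; a deque stands in for the network (assumption)."""

    def __init__(self, name):
        self.name = name
        self.handlers = {}    # handler index -> callable
        self.inbox = deque()  # messages waiting to be polled

    def register(self, index, fn):
        self.handlers[index] = fn

    def request(self, dest, index, *args):
        # "Send" without blocking: enqueue (source, handler index, args) at dest.
        dest.inbox.append((self, index, args))

    def poll(self):
        # Drain the inbox, running the named handler for each message.
        handled = 0
        while self.inbox:
            src, index, args = self.inbox.popleft()
            self.handlers[index](self, src, *args)
            handled += 1
        return handled


log = []


def ping_handler(ep, src, value):
    # Request handler: record the message, then reply to the sender via handler 1.
    log.append((ep.name, "ping", value))
    ep.request(src, 1, value + 1)


def pong_handler(ep, src, value):
    # Reply handler: just record the message.
    log.append((ep.name, "pong", value))


a, b = Endpoint("a"), Endpoint("b")
for ep in (a, b):
    ep.register(0, ping_handler)
    ep.register(1, pong_handler)

a.request(b, 0, 41)  # a asks b to run handler 0
b.poll()             # b runs ping_handler, which queues a reply for a
a.poll()             # a runs pong_handler
print(log)
```

In a real AM layer the `request` call would hand the message to the network (MPI, in AMMPI's case) and `poll` would check for arrivals; only the dispatch-by-handler-index idea carries over from this sketch.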
AMMPI – Latency Performance
[Bar chart: AMMPI Round-trip Latency, in microseconds, one bar per platform; lower is better]
Minimal small message, round-trip time measured from the application.
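Round-trip time measured from the application is commonly obtained with a ping-pong loop: send a tiny message, block until the reply returns, repeat, and divide the elapsed time by the iteration count. The sketch below is a minimal version of that idea and is not the benchmark used for these slides. Two threads and two `queue.Queue` objects stand in for two nodes and the network, the name `measure_round_trip_us` and the iteration count are hypothetical, and the numbers it prints say nothing about the platforms listed here.

```python
import queue
import threading
import time


def measure_round_trip_us(iterations=1000):
    """Average round-trip time of a tiny message, in microseconds.

    Two threads and two queues stand in for two nodes and the network
    (an assumption for illustration; AMMPI itself runs over MPI).
    """
    to_peer, to_self = queue.Queue(), queue.Queue()

    def echo_peer():
        # Echo every message back until the None sentinel arrives.
        while True:
            msg = to_peer.get()
            if msg is None:
                break
            to_self.put(msg)

    peer = threading.Thread(target=echo_peer)
    peer.start()

    start = time.perf_counter()
    for i in range(iterations):
        to_peer.put(i)         # request
        reply = to_self.get()  # block until the reply comes back
        assert reply == i
    elapsed = time.perf_counter() - start

    to_peer.put(None)          # shut the peer down
    peer.join()
    return elapsed / iterations * 1e6


print(f"average round trip: {measure_round_trip_us():.1f} us")
```

Averaging over many iterations smooths out timer resolution and scheduling noise, which matters when a single round trip takes only tens of microseconds.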
AMMPI – Bandwidth Performance
[Bar chart: AMMPI Bandwidth, in MB/sec, one bar per platform; higher is better]
With 64 KB messages == MAX_MEDIUM == MAX_LONG.
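To relate a bandwidth figure to per-message cost, the sketch below converts MB/sec into the time needed to move one 64 KB message. It assumes 1 MB = 10^6 bytes and 1 KB = 1024 bytes, which the slides do not state, and the function name `transfer_time_us` is hypothetical. The three bandwidths are sample values from the reported measurements.

```python
def transfer_time_us(bandwidth_mb_per_s, message_bytes=64 * 1024):
    """Time to move one message at the given bandwidth, in microseconds.

    Assumes 1 MB = 10**6 bytes; the slides do not say which convention they use.
    """
    bytes_per_us = bandwidth_mb_per_s  # 1 MB/s equals 1 byte/us under this assumption
    return message_bytes / bytes_per_us


for bw in (20, 112, 267):  # sample bandwidths from the reported measurements
    print(f"{bw:>3} MB/sec -> {transfer_time_us(bw):7.1f} us per 64 KB message")
```

At these message sizes the transfer time is hundreds of microseconds or more, so it dominates the tens of microseconds of small-message latency on most of the platforms measured.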
AMMPI – Raw Performance Data

Platform                               | Round-trip Latency (us) | Pipelined Inverse Throughput (us) | Bandwidth (MB/s) - 64 KB msgs
---------------------------------------+-------------------------+-----------------------------------+------------------------------
ROCKS MPI-Myrinet (naïve)              | 35                      |                                   |
ROCKS MPI-Myrinet (pre-post recv)      | 33                      |                                   |
ROCKS MPI-Myrinet (full)               | 28                      | 17                                | 112
Millennium MPI-TCP (Gigabit Ethernet)  | 320                     | 229                               | 20
Millennium *AMUDP* (Gigabit Ethernet)  | 120                     | 27                                | 60
NOW MPI-AM-Myrinet                     | 115                     | 49                                | 40
Cray MPI-shmem                         | 36                      | 32                                | 93
Origin 2000 - NCSA                     | 46                      | 27                                | 80
SP3 - Blue Horizon (within node)       | 31                      | 18                                | 267
SP3 - Seaborg (within node)            | 32                      | 20                                | 240
SP3 - Blue Horizon (between nodes)     | 56                      | 51                                | 170
SP3 - Seaborg (between nodes - us)     | 58                      | 27                                | 167
SP3 - Seaborg (between nodes - ip)     | 273                     | 152                               | 47

Notes
– ROCKS MPI-Myrinet (naïve): buffered send, probe/block recv
– ROCKS MPI-Myrinet (pre-post recv): buffered send, pre-posted non-blocking recv
– Millennium (Gigabit Ethernet): latency highly variable, no MPI-Myrinet yet (Gigabit upgrade - used to get 12 MB/sec and 250-300 us on 100 Mbit)
– NOW MPI-AM-Myrinet: AM-over-MPI-over-AM-over-Myrinet (native AM is 20-30 us RT latency, hardware B/W 40MB/sec)
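Two quantities can be derived from the raw numbers with simple arithmetic, sketched below. The sketch assumes that "pipelined inverse throughput" means the average time per message when many messages are in flight, and that a one-way latency can be estimated as half the round trip (both directions costing the same); the slides state neither. The function names are hypothetical. The inputs, 17 us and 28 us, are the smallest inverse-throughput and round-trip values in the data.

```python
def messages_per_second(inverse_throughput_us):
    """Sustained message rate implied by a per-message pipelined cost."""
    return 1_000_000 / inverse_throughput_us


def one_way_latency_us(round_trip_us):
    """Half the round trip; assumes both directions cost the same."""
    return round_trip_us / 2


print(f"{messages_per_second(17):,.0f} messages/sec at 17 us per message")
print(f"{one_way_latency_us(28):.1f} us one-way for a 28 us round trip")
```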