Grids Challenged by a Web 2.0 and Multicore Sandwich
CCGrid 2007, Windsor Barra Hotel, Rio de Janeiro, Brazil, May 15 2007
Geoffrey Fox
Computer Science, Informatics, Physics
Pervasive Technology Laboratories, Indiana University, Bloomington IN 47401
[email protected] http://www.infomall.org

Abstract

Grids provide managed support for distributed Internet-scale services, and although this is clearly a broadly important capability, adoption of Grids has been slower than perhaps expected. Two important trends, Web 2.0 and Multicore, have tremendous momentum and large natural communities, and both overlap in important ways with Grids. Web 2.0 has services but does not require some of the strict protocols needed by Grids or even Web Services. Web 2.0 offers new approaches to composition and portals with a rich set of user-oriented services. Multicore anticipates hundreds of cores per chip and the ability, and the need, to "build a Grid on a chip". This will use functional parallelism that is likely to derive its technologies from parallel computing and not the Grid realm. We discuss a Grid future that embraces Web 2.0 and multicore and suggest how Grids might need to change. Virtual machines (virtualization) are another important development, one that has attracted more interest than Grids on the Enterprise scene.

e-moreorlessanything is an Application

"e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it", in the words of its inventor John Taylor, Director General of Research Councils UK, Office of Science and Technology. e-Science is about developing tools and technologies that allow scientists to do "faster, better or different" research. Similarly, e-Business captures an emerging view of corporations as dynamic virtual organizations linking employees, customers and stakeholders across the world. This generalizes to e-moreorlessanything. A deluge of data of unprecedented and inevitable size must be managed and understood.
People (see Web 2.0), computers, data and instruments must be linked. On-demand assignment of experts, computers, networks and storage resources must be supported.

Role of Cyberinfrastructure

Cyberinfrastructure supports distributed science: data, people, computers. It exploits Internet technology (Web 2.0), adding (via Grid technology) management, security, supercomputers etc. It has two aspects: parallel, with low latency (microseconds) between nodes, and distributed, with higher latency (milliseconds) between nodes. The parallel aspect is needed to get high performance on individual 3D simulations, data analysis etc.; one must decompose the problem. The distributed aspect integrates already distinct components. Cyberinfrastructure is in general a distributed collection of parallel systems. Cyberinfrastructure is made of services (often Web services) that are "just" programs or data sources packaged for distributed access.

Not-so-controversial Ideas

Distributed software systems are being "revolutionized" by developments from e-commerce, e-Science and the consumer Internet.
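The point that services are "just" programs packaged for distributed access can be made concrete. The sketch below wraps an ordinary function behind an HTTP interface using only the Python standard library; the function name, port choice and JSON shape are illustrative, not from the talk.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# A hypothetical "analysis program": any ordinary function.
def word_count(text):
    return len(text.split())

class ServiceHandler(BaseHTTPRequestHandler):
    # Package the program for distributed access: POST text, get JSON back.
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        text = self.rfile.read(length).decode("utf-8")
        body = json.dumps({"words": word_count(text)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), ServiceHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# Any remote client could now invoke the "program" by message exchange.
url = "http://127.0.0.1:%d/" % server.server_address[1]
req = urllib.request.Request(url, data=b"a deluge of data", method="POST")
with urllib.request.urlopen(req) as resp:
    result = json.load(resp)
print(result)  # {'words': 4}
server.shutdown()
```

Nothing about the function changed; only the packaging did, which is the sense in which Cyberinfrastructure services are "just" programs.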
There is rapid progress in technology families termed "Web services", "Grids" and "Web 2.0". The emerging distributed-system picture is of distributed services with advertised interfaces but opaque implementations, communicating by streams of messages over a variety of protocols. Complete systems are built by combining either services or predefined/preexisting collections of services to achieve new capabilities. Note that messaging (as in MPI and some thread systems) is also interesting in parallel computing, where it supports either "safe concurrency without side effects" or distributed memory. We can use the term Grids strictly (Narrow Grids, or even more strictly OGSA Grids) or call any collection of services a "Broad Grid", which is actually quite often done; in this talk Grid means a Narrow or Web service Grid.

Web 2.0 and Web Services I

Web Services have clearly defined protocols (SOAP) and a well-defined mechanism (WSDL) to define service interfaces. There is good .NET and Java support. The so-called WS-* specifications provide a rich, sophisticated but complicated standard set of capabilities for security, fault tolerance, metadata, discovery, notification etc.
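The idea of "safe concurrency without side effects" via messaging can be sketched in a few lines: two components share no mutable state and interact only through message queues, which is the same discipline MPI and message-passing thread systems enforce. This is an illustrative sketch, not MPI itself.

```python
import threading
import queue

# Two "services" with opaque implementations, linked only by message streams.
requests = queue.Queue()
replies = queue.Queue()

def squarer():
    # Reads requests, emits replies; owns no shared mutable state.
    while True:
        n = requests.get()
        if n is None:          # sentinel: shut down
            break
        replies.put(n * n)

worker = threading.Thread(target=squarer)
worker.start()

for n in range(5):
    requests.put(n)
requests.put(None)

results = [replies.get() for _ in range(5)]
worker.join()
print(results)  # [0, 1, 4, 9, 16]
```

Because the only interaction is through the queues, there are no data races to reason about, whether the two ends are threads on one chip or processes on two continents.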
"Narrow Grids" build on Web Services and provide a robust managed environment with growing adoption in Enterprise systems and distributed science (so-called e-Science). Web 2.0 supports a similar architecture to Web services but has developed in a more chaotic yet remarkably successful fashion, with a service architecture using a variety of protocols including those of Web and Grid services. Over 400 interfaces are defined at http://www.programmableweb.com/apis. Web 2.0 also has many well-known capabilities, with Google Maps and Amazon Compute/Storage services of clear general relevance. There are also Web 2.0 services supporting novel collaboration modes and user interaction with the web, as seen in social networking sites and portals such as MySpace and YouTube.

Web 2.0 and Web Services II

I once thought Web Services were inevitable, but this is no longer clear to me. Web services are complicated, slow and often non-functional. WS-Security is unnecessarily slow and pedantic (canonicalization of XML). WS-RM (Reliable Messaging) seems to have poor adoption and doesn't work well in collaboration. WSDM (distributed management) specifies a lot. There are de facto standards like Google Maps and powerful suppliers like Google which "define the rules". One can easily combine SOAP (Web Service) based services/systems with HTTP messages, but the "lowest common denominator" argument suggests the additional structure/complexity of SOAP will not easily survive.

Applications, Infrastructure, Technologies

The discussion is confused by inconsistent use of terminology; this is what I mean. Multicore, Narrow and Broad Grids, and Web 2.0 (Enterprise 2.0) are technologies. These technologies combine and compete to build infrastructures termed e-infrastructure or Cyberinfrastructure. Although multicore can and will support "standalone" clients, probably the most important client and server applications of the future will be Internet enhanced/enabled, so a key aspect of multicore is its role and integration in e-infrastructure.
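The "additional structure/complexity of SOAP" point can be made concrete by writing the same logical call both ways. The endpoint, operation and parameter names below are invented for illustration; the SOAP envelope is minimal, before any WS-Security or WS-RM headers are added.

```python
from urllib.parse import urlencode

# The same logical call expressed two ways (all names here are illustrative).
params = {"operation": "getTemperature", "city": "Rio de Janeiro"}

# REST style: the request is just an HTTP GET URL.
rest_request = "http://example.org/weather?" + urlencode(params)

# SOAP style: the same call wrapped in an XML envelope.
soap_request = (
    '<?xml version="1.0"?>'
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soap:Body>"
    '<getTemperature xmlns="http://example.org/weather">'
    "<city>Rio de Janeiro</city>"
    "</getTemperature>"
    "</soap:Body>"
    "</soap:Envelope>"
)

# The envelope is already several times larger than the GET URL,
# and real WS-* deployments add security and reliability headers on top.
print(len(rest_request), len(soap_request))
```

Both messages carry the same information; the question the talk raises is whether the extra envelope structure earns its keep once the "lowest common denominator" takes over.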
e-moreorlessanything is an emerging application area of broad importance that is hosted on these infrastructures (e-infrastructure or Cyberinfrastructure).

Attack of the Killer Multicores

Today commodity Intel systems are sold with 8 cores spread over two processors. Specialized chips such as GPUs and the IBM Cell processor have substantially more cores. Moore's Law now implies, and will be satisfied by, an exponentially increasing number of cores, doubling every 1.5-3 years, with only a modest increase in clock speed. Intel has already prototyped an 80-core server chip; ready in 2011? There is huge activity in parallel computing programming (recycled from the past?), and some programming models and application styles are similar to Grids. We will have a Grid on a chip.

IBM Cell Processor

The Cell supports pipelined (through 8 cores) or data-parallel operations distributed over its 8 SPEs. Applications running well on Cell or AMD GPUs should run scalably on future mainline multicore chips. A focus on memory bandwidth is key (dataflow not deltaflow).

Grids meet Multicore Systems

The expected rapid growth in the number of cores per chip has important implications for Grids. With 16-128 cores on a single commodity system 5 years from now, one will both be able to build a Grid-like application on a chip and indeed must build such an application to get the Moore's-law performance increase; otherwise you will "waste" cores.
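The "16-128 cores in 5 years" figure follows directly from the doubling assumption stated above. A back-of-envelope check, starting from the 8 cores of a 2007 commodity system:

```python
# Project core counts from 8 cores (2007), doubling every 1.5 to 3 years.
def projected_cores(start_cores, years, doubling_period):
    return start_cores * 2 ** (years / doubling_period)

slow = projected_cores(8, 5, 3.0)   # doubling every 3 years
fast = projected_cores(8, 5, 1.5)   # doubling every 1.5 years
print(round(slow), round(fast))  # 25 81
```

Both endpoints land inside the 16-128 bracket quoted in the talk, so the claim is just the doubling law applied over five years with the slow and fast doubling periods.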
One will not want to reprogram as an application moves from a 64-node cluster or a transcontinental implementation to a single-chip Grid. However, multicore chips have a very different architecture from Grids: shared rather than distributed memory, and latencies measured in microseconds rather than milliseconds. Thus Grid and multicore technologies will need to "converge", and the converged technology model will have different requirements from current Grid assumptions.

Grid versus Multicore Applications

It seems likely that future multicore applications will involve a loosely coupled mix of multiple modules that fall into three classes: data access/query/store; analysis and/or simulation; and user visualization and interaction. This is precisely the mix that Grids support, though Grids of course involve distributed modules. Grids and Web 2.0 use service-oriented architectures to describe systems at the module level; is this an appropriate model for multicore programming? Where do multicore systems get their data from?

RMS: Recognition Mining Synthesis

Intel has probably the most sophisticated analysis of future "killer" multicore applications (Pradeep K. Dubey, [email protected]):
Recognition ("What is …?") concerns the model itself. Today: model-less. Tomorrow: model-based multimodal recognition.
Mining ("Is it …?") finds a model instance. Today: real-time streaming and transactions on static, structured datasets. Tomorrow: real-time analytics on dynamic, unstructured, multimodal datasets.
Synthesis ("What if …?") creates a model instance. Today: very limited realism. Tomorrow: photo-realism and physics-based animation.

In medical imaging terms: Recognition asks "What is a tumor?", Mining asks "Is there a tumor here?", and Synthesis asks "What if the tumor progresses?". It is all about dealing efficiently with complex multimodal datasets. (Images courtesy http://splweb.bwh.harvard.edu:8000/pages/images_movies.html; Pradeep K. Dubey, [email protected].)

[Figure: Intel's Application Stack]

Role of Data in Grid/Multicore I

One is typically told to place compute (analysis) at the data, but most of the computing power is in multicore clients on the edge. These multicore clients can get data from the Internet, i.e. from distributed sources. This could be data on the personal interests of the client, used by the client to help the user interact with the world; it could be cached or copied; and the computation could be standalone or part of a distributed coordinated computation (SETI@home). Or clients could get data from a set of local sensors (videocams and environmental sensors) naturally stored on, or locally to, the client.

Role of Data in Grid/Multicore II

Note that as you increase the sophistication of data analysis, you increase the ratio of compute to I/O. A typical modern datamining approach like the Support Vector Machine is sophisticated (dense) matrix algebra and not just text matching (see http://grids.ucs.indiana.edu/ptliupages/presentations/PC2007/PC07BYOPA.ppt). The time complexity of sophisticated data analysis will make it more attractive to fetch data from the Internet and cache/store it on the client; it will also help with memory bandwidth problems in multicore chips. In this vision, the Grid "just" acts as a source of data and the Grid application runs locally.

Three styles of Multicore "Jobs" (A1 A2 A3 A4, B, C, D1 D2, E, F)

• Totally independent or nearly so (B, C, E, F). This used to be called embarrassingly parallel and is now pleasingly so. It is the preserve of the job scheduling community, and one gets efficiency by statistical mechanisms with (fair) assignment of jobs to cores. "Parameter searches" generate this class, though they are often not the optimal way to search for "best parameters". "Multiple users" of a server are an important class of this type. There are no significant synchronization and/or communication latency constraints.
• Loosely coupled (D) is a "metaproblem" with several components orchestrated with pipeline, dataflow
or not very tight constraints. This is the preserve of Grid workflow or mashups, with synchronization and/or communication latencies in the millisecond-to-second or longer range.
• Tightly coupled (A) is the classic parallel computing program, with components synchronizing often and with tight timing constraints: synchronization and/or communication latencies around a microsecond.

Multicore Programming Paradigms

At a very high level, there are three broad classes of parallelism:
• Coarse-grain functional parallelism, typified by workflow and often used to build composite "metaproblems" whose parts are also parallel. This area has several good solutions that are getting better. Pleasingly parallel applications can be considered special cases of functional parallelism.
• Large-scale loosely synchronous data parallelism, where dynamic irregular work has clear synchronization points, as in most large-scale scientific and engineering problems.
• Fine-grain thread parallelism, as used in search algorithms, which are often data parallel (over choices) but don't have universal synchronization points. Discrete event simulations are either a fourth class or a variant of thread parallelism.

Programming Models

The fine-grain thread parallelism and large-scale loosely synchronous data parallelism styles are distinctive to parallel computing, while the coarse-grain functional parallelism of multicore overlaps with workflows from Grids and mashups from Web 2.0. It seems plausible that a more uniform approach will evolve for the coarse-grain case, although this is the least constrained of the programming styles, as latency issues are typically not critical; multicore would have the strongest performance constraints, and Web 2.0 and multicore the most important usability constraints. A possible model for broad use of multicores is that the difficult parallel algorithms are coded as libraries (in the fine-grain thread parallelism and large-scale loosely synchronous data parallelism
styles) while the general user composes with visual interfaces, scripting and systems like Google MapReduce.

Google MapReduce: Simplified Data Processing on Large Clusters

See http://labs.google.com/papers/mapreduce.html. This is a dataflow model between services where services can do useful document-oriented data-parallel applications including reductions. The decomposition of services onto cluster engines is automated. The large I/O requirements of the datasets change the efficiency analysis in favor of dataflow. The services (which count words in the canonical example) can obviously be extended to general parallel applications. There are many alternatives to the language expressing either dataflow and/or parallel operations, and indeed one should support multiple languages in the spirit of services.

Old and New (Web 2.0) Community Tools

E-mail and list-serves are the oldest and best used tools. Kazaa, Instant Messengers, Skype, Napster and BitTorrent support P2P collaboration: text, audio-video conferencing, files. del.icio.us, Connotea, Citeulike, Bibsonomy and Biolicious manage shared bookmarks. MySpace, YouTube, Bebo, Hotornot, Facebook and similar sites allow you to create (upload) community resources and share them; Friendster and LinkedIn create networks (see http://en.wikipedia.org/wiki/List_of_social_networking_websites). Writely, Wikis and Blogs are powerful specialized shared-document systems. ConferenceXP and WebEx share general applications. Google Scholar tells you who has cited your papers, while publisher sites tell you about co-authors; Windows Live Academic Search has similar goals. Note that sharing resources creates (implicit) communities; social network tools study graphs to both define communities and extract their properties.

"Best Web 2.0 Sites" (2006, extracted from http://web2.wsj2.com/): Social Networking, Start Pages, Social Bookmarking, Peer Production News, Social Media Sharing, Online Storage (Computing).

Web 2.0 Systems are Portals, Services, Resources

Web 2.0 captures the incredible
development of interactive Web sites enabling people to create and collaborate.

Mashups v Workflow?

Mashup tools are reviewed at http://blogs.zdnet.com/Hinchcliffe/?p=63; workflow tools are reviewed by Gannon and Fox at http://grids.ucs.indiana.edu/ptliupages/publications/Workflow-overview.pdf. Both include scripting in PHP, Python, sh etc., as both implement distributed programming at the level of services. Mashups use all types of service interfaces, typically "pure" HTTP (REST), and do not have the potential robustness (security) of the Grid service approach.

Grid Workflow Datamining in Earth Science

In NASA GPS work with the Scripps Institute, Grid services controlled by workflow process real-time data from ~70 GPS sensors in Southern California: earthquake streaming-data support, archival, transformations, data checking, Hidden Markov datamining (JPL) and real-time display (GIS).

Web 2.0 uses all types of Services

Here a Gadget mashup uses a 3-service workflow with a JavaScript Gadget client.

Web 2.0 APIs

http://www.programmableweb.com/apis had (May 14 2007) 431 Web 2.0 APIs, with Google Maps the most often used in mashups. This site acts as a "UDDI" for Web 2.0. In the list of Web 2.0 APIs, each site has an API and its features, divided into broad categories; only a few are used a lot (42 APIs are used in more than 10 mashups), and there is an RSS feed of new APIs. Amazon S3 is growing in popularity.

[Figure: APIs and mashups per protocol. Frequently used APIs include Google Maps, del.icio.us, Virtual Earth, 411sync, Yahoo! Search, Yahoo! Geocoding, Technorati, Netvibes, Yahoo! Images, Trynt, Yahoo! Local, Amazon ECS, Google Search, Flickr, eBay, YouTube, Amazon S3 and live.com; protocols span REST, SOAP, XML-RPC, JS and others.]

About 4 more mashups appear each day, with a growing number of commercial mashup tools, for a total of 1906 on April 17 2007 (4.0 a day over the last month). Note that ClearForest runs Semantic Web Services mashup competitions (not workflow competitions). Some mashup types: aggregators, search aggregators, visualizers, mobile, maps, games. The Mash Planet Web 2.0 architecture (http://www.imagine-it.org/mashplanet) has a display too large to be a Gadget; the example shown searched on Transit/Transportation.

The Indiana Map: a "Grid" Workflow (built in Java!)

One must provide adapters for each map-server type (e.g. Marion County's ESRI ArcIMS server, Hamilton County's AutoDesk server, Cass County's OGC Web Map Server). A Tile Server requests map tiles at all zoom levels with all layers; these are converted to a uniform projection, indexed, and stored, and overlapping images are combined. The browser client fetches image tiles for the bounding box using the Google Map API, and the Cache Server fulfills Google Maps calls with cached tiles that fill the requested bounding box. The system uses Google Maps clients and server together with non-Google map APIs. This Grid workflow/mashup is a GIS Grid of the "Indiana Map" and ~10 Indiana counties with accessible Map (Feature) Servers from different vendors. Grids federate different data repositories (cf. the Astronomy VO federating different observatory collections).

Now to Portals

In the Grid-style portal as used in the Earthquake Grid, the portal is built from portlets, providing user-interface fragments for each service that are composed into the full interface; it uses OGCE technology, as does the planetary-science VLAB portal with the University of Minnesota. Note the many competitions powering Web 2.0 mashup development.

Portlets v.
Google Gadgets

Portals for Grid systems are built using portlets, with software like GridSphere integrating these on the server side into a single web page. Google (at least) offers the Google sidebar and Google home page, which support Web 2.0 services and do not use a server-side aggregator. Google is more user friendly! The many Web 2.0 competitions are an interesting model for promoting development in the worldwide distributed collection of Web 2.0 developers. I guess the Web 2.0 model will win!

Typical Google Gadget Structure

Google Gadgets are an example of Start Page technology; see http://blogs.zdnet.com/Hinchcliffe/?p=8. A gadget is an XML Module element whose Content section holds lots of HTML and JavaScript. Portlets build user interfaces by combining fragments in a standalone Java server; Google Gadgets build user interfaces by combining fragments with JavaScript on the client.

Web 2.0 v Narrow Grid I

Web 2.0 allows people to nurture the Internet Cloud, and such people got Time's Person of the Year award, whereas Narrow Grids support Internet-scale distributed services with a similar architecture. Maybe Narrow Grids focus on the (number of) services (there aren't many scientists) and Web 2.0 focuses on the number of people. Both agree on service-oriented architectures but have different emphases. Narrow Grids have a strong emphasis on standards and structure; Web 2.0 lets a thousand flowers (protocols) and a million developers bloom, and focuses on functionality, broad usability and simplicity. The Semantic Web/Grid has structure to allow reasoning; annotation in sites like del.icio.us and uploading to MySpace/YouTube is unstructured, and free-text search replaces structured ontologies.

Web 2.0 v Narrow Grid II

Web 2.0 has a set of major services like Google Maps or Flickr, but the world is composing mashups that make new composite services. End-point standards are set by end-point owners, with many different protocols covering a variety of de facto standards. Narrow Grids have a set of major software systems like Condor and Globus, and
a different world is extending these with custom services and linking them with workflow. Popular Web 2.0 technologies are PHP, JavaScript, JSON, AJAX and REST with "Start Page" (e.g. Google Gadgets) interfaces; popular Narrow Grid technologies are Apache Axis, BPEL, WSDL and SOAP with portlet interfaces. Is the robustness of Grids demanded by the Enterprise? It is not so clear that Web 2.0 won't eventually dominate other application areas, and with Enterprise 2.0 it is invading Grids. The world does itself in large numbers!

Implications for Grid Technology of Multicore and Web 2.0 I

Web 2.0 and Grids are addressing a similar application class, although Web 2.0 has focused on user interactions, so the technology has similar requirements. Multicore differs significantly from Grids in component location, and this seems particularly significant for data; it is therefore not clear how similar the applications will be, though the Intel RMS multicore application class is pretty similar to Grids. Multicore has more stringent software requirements than Grids, as the latter have intrinsic network overhead.

Implications for Grid Technology of Multicore and Web 2.0 II

Multicore chips require low-overhead protocols to exploit low latency, which suggests simplicity. We need to simplify MPI AND Grids!
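Google MapReduce, introduced earlier as a popular approach to dataflow, is itself an example of such simplification: the user writes only a map and a reduce function, and the runtime handles decomposition. A single-process sketch of the programming model (not the distributed runtime) for the canonical word-count service:

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    # Map: emit a (word, 1) pair for every word in the document.
    return [(word, 1) for word in document.split()]

def shuffle(pairs):
    # Shuffle: group intermediate values by key (the runtime's job in MapReduce).
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts for each word.
    return {word: sum(values) for word, values in groups.items()}

documents = ["the grid on a chip", "the web and the grid"]
pairs = chain.from_iterable(map_phase(d) for d in documents)
counts = reduce_phase(shuffle(pairs))
print(counts["the"], counts["grid"])  # 3 2
```

Everything outside `map_phase` and `reduce_phase` is what the MapReduce runtime automates across a cluster, which is exactly the division of labor the talk attributes to it.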
Web 2.0 chooses simplicity (REST rather than SOAP) to lower the barrier to everyone participating. Web 2.0 and Multicore tend to use traditional (possibly visual) scripting languages for the equivalent of workflow, whereas Grids use a visual interface with a backend recorded in BPEL; Google MapReduce illustrates a popular Web 2.0 and Multicore approach to dataflow.

Implications for Grid Technology of Multicore and Web 2.0 III

Web 2.0 and Grids both use SOA (Service Oriented Architectures). It seems likely that Multicore will also adopt it, although a more conventional object-oriented approach is also possible. Services should help multicore applications integrate modules from different sources; multicore will use fine-grain objects but coarse-grain services. "System of Systems": Grids, Web 2.0 and Multicore are likely to build systems hierarchically out of smaller systems, so we need to support Grids of Grids, Webs of Grids, Grids of Multicores etc., i.e. systems of systems of all sorts.

Implications for Grid Technology of Multicore and Web 2.0 IV

Portals are likely to feature both Web and "desktop client" technology, although it is possible that the Web approach will be adopted more or less uniformly. Web 2.0 has a very active portal activity with a similar architecture to Grids: a page has multiple user-interface fragments. Web 2.0 user-interface integration is typically client-side, using Gadgets, AJAX and JavaScript, while Grids are in a special JSR 168 portal on the server side using Portlets, WSRP and Java. Multicore doesn't put special constraints on portal technology, but it could tend to favor non-browser clients or client-side, Web browser-integrated portals.

The "Momentum" Effects

Web 2.0 has momentum as it is driven by the success of social web sites and by user-friendly protocols attracting many developers of mashups. Grid momentum is driven by the success of eScience and by the commercial web service thrusts largely aimed at the Enterprise, though the Enterprise software area is not quite as dominant as in the past. Grid
technical requirements are a bit soft and could be compromised if sandwiched by Web 2.0 and Multicore. Will commercial interest in Web Services survive? Multicore is driven by the expectation that all servers and clients will have many cores, and multicore latency requirements imply one cannot compromise in some technology choices. Simplicity, supporting many developers and stringent multicore requirements are the forces pressuring Grids!

The Ten Areas Covered by the 60 Core WS-* Specifications

Typical Grid/Web Service examples per area:
1: Core Service Model: XML, WSDL, SOAP
2: Service Internet: WS-Addressing, WS-MessageDelivery; Reliable Messaging WSRM; Efficient Messaging MOTM
3: Notification: WS-Notification, WS-Eventing (Publish-Subscribe)
4: Workflow and Transactions: BPEL, WS-Choreography, WS-Coordination
5: Security: WS-Security, WS-Trust, WS-Federation, SAML, WS-SecureConversation
6: Service Discovery: UDDI, WS-Discovery
7: System Metadata and State: WSRF, WS-MetadataExchange, WS-Context
8: Management: WSDM, WS-Management, WS-Transfer
9: Policy and Agreements: WS-Policy, WS-Agreement
10: Portals and User Interfaces: WSRP (Remote Portlets)

WS-* Areas and Web 2.0

The Web 2.0 approach per area:
1: Core Service Model: XML becomes optional but still useful; SOAP becomes JSON, RSS, ATOM; WSDL becomes REST with the API as GET, PUT etc.; Axis becomes XmlHttpRequest
2: Service Internet: no special QoS; use JMS or equivalent?
3: Notification: hard with HTTP without polling; JMS perhaps?
4: Workflow and Transactions (no transactions in Web 2.0): mashups, Google MapReduce, scripting with PHP, JavaScript etc.
5: Security: SSL, HTTP authentication/authorization; OpenID is Web 2.0 single sign-on
6: Service Discovery: http://www.programmableweb.com
7: System Metadata and State: processed by the application; no system state; Microformats are a universal metadata approach
8: Management == Interaction: WS-Transfer style protocols, GET, PUT etc.
9: Policy and Agreements: service dependent; processed by the application
10: Portals and User Interfaces: Start Pages, AJAX and Widgets (Netvibes), Gadgets

WS-* Areas and Multicore

The multicore analogue per area:
1: Core Service Model: fine-grain Java, C#, C++ objects and coarse-grain services as in DSS; information passed explicitly or by handles; MPI needs to be updated to handle non-scientific applications, as in CCR
2: Service Internet: not so important intrachip
3: Notification: Publish-Subscribe for events and interrupts
4: Workflow and Transactions: many approaches; scripting languages popular
5: Security: not so important intrachip
6: Service Discovery: use libraries
7: System Metadata and State: environment variables
8: Management == Interaction: interaction between objects is a key issue in parallel programming, trading off efficiency versus performance
9: Policy and Agreements: handled by the application
10: Portals and User Interfaces: Web 2.0 technology popular

CCR as an Example of a Cross-Paradigm Run Time

CCR naturally supports fine-grain thread switching with message passing, with around 4 microseconds latency for 4 threads switching to 4 others on an AMD PC with C# (threads spawned, no rendezvous). It has around 50 microseconds latency for coarse-grain service interactions with the DSS extension, which supports Web 2.0 style messaging. MPI collectives (Shift and Exchange) vary from 10 to 20 microseconds latency in rendezvous mode. This is not as good as the best MPIs, but it is managed code and supports Grids.

Web 2.0 and Parallel Computing: Microsoft CCR

CCR supports exchange of messages between threads using named ports:
• FromHandler: spawn threads without reading ports.
• Receive: each handler reads one item from a single port.
• MultipleItemReceive: each handler reads a prescribed number of items of a given type from a given port. Note that items in a port can be general structures, but all must have the same type.
• MultiplePortReceive: each handler reads one item of a given type from multiple ports.
• JoinedReceive: each handler reads one item from each of two ports; the items can be of different types.
• Choice: execute a choice of two or more port-handler pairings.
• Interleave: consists of a set of arbiters (port-handler pairs) of 3 types: Concurrent, Exclusive or Teardown (called at the end for clean-up). Concurrent arbiters are run concurrently, but exclusive handlers are run one at a time.
See http://msdn.microsoft.com/robotics/

[Figure: Overhead (latency) of an AMD 4-core PC with 4 execution threads on MPI-style rendezvous messaging for Shift and Exchange, implemented either as two shifts or as a custom CCR pattern, plotted against the number of stages (up to 10 million). Compute time is 10 seconds divided by the number of stages.]

[Figure: Overhead (latency) of an Intel 8-core PC with 8 execution threads on MPI-style rendezvous messaging for Shift and Exchange, implemented either as two shifts or as a custom CCR pattern, plotted against the number of stages (up to one million). Compute time is 15 seconds divided by the number of stages.]

[Figure: DSS service measurements: average round-trip time (microseconds) on an HP Opteron multicore as a function of the number of simultaneous two-way service messages processed (November 2006 DSS release). CGL measurements of Axis 2 show about 500 microseconds; DSS is 10 times better.]
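The CCR primitives listed above can be mimicked in outline with queues as named ports: a Receive arbiter attaches a handler that consumes one item from one port, and a JoinedReceive handler fires only once it has one item from each of its ports. This Python sketch (CCR itself is a C# library; these function names are my own) shows the programming model, not the performance.

```python
import threading
import queue

class Port(queue.Queue):
    """A named port: a message queue that handlers read from."""

def receive(port, handler):
    # Receive: spawn a handler that consumes one item from a single port.
    def run():
        handler(port.get())
    t = threading.Thread(target=run)
    t.start()
    return t

def joined_receive(port_a, port_b, handler):
    # JoinedReceive: handler fires once it has one item from each of two ports.
    def run():
        handler(port_a.get(), port_b.get())
    t = threading.Thread(target=run)
    t.start()
    return t

results = queue.Queue()
p1, p2, p3 = Port(), Port(), Port()

t1 = receive(p1, lambda x: results.put(x * 10))
t2 = joined_receive(p2, p3, lambda a, b: results.put(a + b))

p1.put(4)            # wakes the receive handler
p2.put(1); p3.put(2) # together they wake the joined handler
t1.join(); t2.join()

out = sorted(results.get() for _ in range(2))
print(out)  # [3, 40]
```

Handlers are spawned rather than rendezvoused, matching the CCR style described above; a real implementation adds the arbiter machinery (Choice, Interleave) and the microsecond-scale scheduling that the measurements report.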