Computation of High-Resolution Global Ocean Model using Earth Simulator
By Norikazu Nakashiki (CRIEPI), Yoshikatsu Yoshida (CRIEPI), Takaki Tsubono (CRIEPI), Dong-Hoon Kim (CRIEPI), Frank O. Bryan (NCAR), Richard D. Smith (LANL), Mathew E. Maltrud (LANL), Julie L. McClean (NPS)

Parallel Ocean Program (POP)
1. Designed for massively parallel computers
   -> shared memory, massively parallel computing
2. Free-surface boundary condition
   -> no island problem
   -> unsmoothed bottom topography
   -> prognostic sea-surface height
3. General orthogonal coordinates
   -> displaced-pole grid (singularity-free Arctic Ocean)
4. Vertical mixing parameterization
   1) simple constant mixing
   2) Richardson-number-dependent mixing
   3) KPP mixing parameterization
5. Convective adjustment
   1) convective adjustment
   2) large mixing coefficient
6. Horizontal mixing
   1) Laplacian
   2) bi-harmonic
   3) Gent-McWilliams isopycnal tracer diffusion
   4) anisotropic viscosity
7. Equation of state
   1) UNESCO eq. (based on potential temperature)
   2) full UNESCO eq. (polynomial fit)
   3) linear EOS
8. Topographic stress
   1) Holloway's topographic stress parameterization

POP (Parallel Ocean Program)
1) High Resolution Global Ocean Model
   Resolution : 0.1 x 0.1 deg. x 40L (3600 x 2400 x 40) (pole on North America)
   Horizontal : Bi-harmonic mixing for momentum & tracer
   Vertical : KPP mixing
   Time step : 220/day (≈ 6.5 min)
2) Global Model for CCSM2
   Resolution : 1 x 1 deg. x 40L (320 x 384 x 40) (pole on Greenland)
   Horizontal : Anisotropic mixing for momentum, GM mixing for tracer
   Vertical : KPP mixing
   Time step : 23/day (≈ 63 min)
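As a rough check on the two configurations listed above, the grid dimensions and full steps per day can be turned into a cell count and a step length. This is only a sketch: the function names are illustrative and not part of POP, and the only inputs are the numbers quoted on the slide (which gives the step lengths approximately).

```python
SECONDS_PER_DAY = 86_400

# (label, nx, ny, nz, full steps per simulated day), all taken from the slide
CONFIGS = [
    ("0.1 degree", 3600, 2400, 40, 220),
    ("1 degree", 320, 384, 40, 23),
]


def step_minutes(steps_per_day):
    """Length of one full time step in minutes."""
    return SECONDS_PER_DAY / steps_per_day / 60.0


def grid_points(nx, ny, nz):
    """Total number of grid cells, land cells included."""
    return nx * ny * nz


for label, nx, ny, nz, steps in CONFIGS:
    print(f"{label}: {grid_points(nx, ny, nz):,} cells, "
          f"dt = {step_minutes(steps):.1f} min")
```

The 0.1 degree grid has roughly 70 times as many cells as the 1 degree grid and takes about ten times as many steps per simulated day, which is why its cost per simulated day is so much larger.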
Computational Grid of POP x0.1
[Figure: horizontal mesh and vertical mesh]

POP timing measurement on ES
• 1 degree model
  – 320 x 384 x 40 grid division, 23 full-step/day
  – KPP vertical mixing scheme
  – GM horizontal mixing for tracer
  – Anisotropic viscosity parameterization
  – 3rd-order upwind tracer advection
• 0.1 degree model
  – 3600 x 2400 x 40 grid division, 220 full-step/day
  – KPP vertical mixing scheme
  – Bi-harmonic horizontal mixing for tracer and momentum
• No history output. No forcing data input.

Cost distribution in POP
[Figure: baroclinic and barotropic cost, w/o and w/ optimization, for (a) 20 PEs, (b) 160 PEs, (c) 960 PEs; resolution x0.1 deg; axis: wallclock seconds per simulated day]

Scalability in baroclinic/barotropic mode
[Figure: wallclock seconds per simulated day vs. # of processors and # of nodes, baroclinic and barotropic, w/o and w/ optimization; (a) 1 degree, (b) 0.1 degree]
• Significant improvement in barotropic mode
• Scalability wall around 2 nodes (1 deg) and 80 nodes (0.1 deg)
• Slight speedup in baroclinic mode

POP performance on ES
[Figure: (a) wallclock days per simulated century vs. # of PEs; (b) parallel efficiency vs. # of PEs; 1 degree and 0.1 degree, w/o and w/ optimization]
• 1 degree (4 nodes): 1.64 day/century, 70.2 Gflop/s
• 0.1 degree (120 nodes): 27.1 day/century, 1.60 Tflop/s

Parallel Efficiency of POP x1 (Relation with Vertical Resolution)
[Figure: parallel efficiency (%) and wall clock time (sec) for 2 days integration vs. num. of PE; vertical resolution 40L, 80L, 160L, 200L]
Parallel efficiency ≥ 50 % (10 nodes) on ES center

Further optimizations for POP code
• POP version 1
  – Distributed parallel I/O w/ horizontal data decomposition (J. Ueno, will be completed in March)
  – Tests of NEC's new MPI library (incl. all-reduce)
  – Merge CRIEPI version and CRAY version into one
• POP version 2
  – POP2 beta2 code ported to ES
  – Vector optimization
  – Timing measurement in progress (H. Komatsu, J. Ueno)

Some problems w/ OpenMP
• NEC's compiler supports OpenMP 1.1, not OpenMP 2
• Some features of f90 cannot be used w/ OpenMP 1.1

10-year spin-up of POP (x0.1)
10-year computation (10 years * 1 cycle)
Atmospheric boundary conditions: NCEP, etc. (1990-2000)
(1) Wind : Daily
(2) Surface Heat Flux : Daily
(3) Surface Fresh Water Flux : Monthly
Surface boundary condition -> POP x0.1 on Earth Simulator, 40 nodes (320 PEs)
Initial data from LANL/NPS

Global Diagnostics
[Figure: Global Mean KE, Kinetic Energy at Surface, Global Mean SAL, Global Mean PT]

Annual Mean Sea Surface Temp.
[Figure: Levitus, POP x1 (2000), POP x0.1 (2000)]

Annual Mean Sea Surface Sal.
[Figure: Levitus, POP x1 (2000), POP x0.1 (2000)]

Kuroshio
[Figure: CCSM2 (for climate simulation), x1 deg. (100 km × 100 km), and High Resolution Model, x0.1 deg. (10 km × 10 km)]

Equatorial Current
[Figure: CCSM2 (for climate simulation), x1 deg. (100 km × 100 km), and High Resolution Model, x0.1 deg. (10 km × 10 km)]

Sea Surface Temp.
[Figure: Global, 1990-2000 monthly SST]

Kuroshio, Gulf Stream
[Figure: 1990-2000 monthly velocity]
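The throughput figures quoted on the "POP performance on ES" slide (wallclock days per simulated century, sustained flop rate, node count) can be converted into the units used on the other timing slides. The sketch below assumes a 365-day model year, which the slides do not state, and takes 8 PEs per node from "40 node (320 PE)". The function names are illustrative, and the efficiency formula is the usual strong-scaling definition rather than one confirmed by the source.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365   # assumption: no-leap calendar, not stated in the slides
PES_PER_NODE = 8      # consistent with "40 node (320 PE)" in the slides


def seconds_per_simulated_day(wallclock_days_per_century):
    """Convert wallclock days per simulated century to seconds per simulated day."""
    simulated_days = 100 * DAYS_PER_YEAR
    return wallclock_days_per_century * SECONDS_PER_DAY / simulated_days


def gflops_per_pe(total_gflops, nodes):
    """Sustained Gflop/s per processing element."""
    return total_gflops / (nodes * PES_PER_NODE)


def parallel_efficiency(t_ref, p_ref, t, p):
    """Speedup relative to a reference run divided by the growth in PE count."""
    return (t_ref * p_ref) / (t * p)


# Figures quoted in the slides: (label, days/century, Gflop/s, nodes)
for label, days, gflops, nodes in [("1 degree", 1.64, 70.2, 4),
                                   ("0.1 degree", 27.1, 1600.0, 120)]:
    print(f"{label}: {seconds_per_simulated_day(days):.2f} s per simulated day, "
          f"{gflops_per_pe(gflops, nodes):.2f} Gflop/s per PE")

# Hypothetical timings, for illustration only: 100 s on 8 PEs, 20 s on 80 PEs
print(parallel_efficiency(100.0, 8, 20.0, 80))
```

With the hypothetical timings in the last line, a 5x speedup on 10x the PEs gives an efficiency of 0.5, the threshold quoted for the ES center.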
[Figure: 1990-1991 daily SST]

SSH & Volume Transport Section
[Figure: sections at Soya, Tsushima, Tsugaru, Kuroshio, Ohsumi, Tokara]

Volume Transport
[Figure: volume transport through each Kuroshio section (Sv) vs. year, 1990-2000; series: Kuroshio (Tokara), Kuroshio (Izu)]
Kuroshio (Izu) : 30-60 Sv
Tokara : 13 Sv

Japan Sea
[Figure: volume transport through each section in the Japan Sea (Sv) vs. year, 1990-2000; series: Tsushima, Tsugaru, Soya]
Tsushima : 2.2 Sv
Tsugaru : 1.5 Sv
Soya : 0.7 Sv

Sensitivity Analysis of POP x0.1
To improve Gulf Stream & Kuroshio, etc. → change strength of horizontal mixing
Viscosity & diffusivity of bi-harmonic mixing
case 01a: am = -2.7e18, ah = -9.0e17 (basic, same horizontal mixing)
case 01b: x1/2
case 01c: x1/3
Surface forcing : Monthly climatology

Global Diagnostics
[Figure: Global Mean KE, Global Mean PT, Global Mean SAL for case 01a (basic), case 01b (x1/2), case 01c (x1/3)]

SSH
[Figure: SSH in the 9th and 10th years for case 01a and case 01c (x1/3)]

Viscosity – SSH (Kuroshio)
[Figure: Global Mean KE for 01a (basic), 01b (x1/2), 01c (x1/3); SSH panels for basic, x1/2, x1/3; color scale Low to High]

Volume Transport (Sv)
[Figure: sections at Soya, Tsushima, Tsugaru; series Es_01a, Es_01b, Es_01c]
            Es_01a   Es_01b   Es_01c
Tsushima    2.6      2.74     3.0
Tsugaru     2.22     2.40     2.61
Soya        0.38     0.34     0.38
Remaining values, placement unclear in the source: 0.14, 0.16, 0.18, 0.21, labeled (Korea); 2.7, 1.1, 1.6, probably from the Obs. column.

[Figure: Tsushima, Tsugaru, Tokara, Kuroshio; case_01c, year 12, Jan., Mar., and May]
New sections are planned for checking volume transport.

Volume Transport (Kuroshio)
[Figure: case 01a (basic), case 01b (x1/2), case 01c (x1/3)]

Viscosity – SSH (Gulf Stream)
[Figure: Global Mean KE for 01a (basic), 01b (x1/2), 01c (x1/3); SSH panels for basic, x1/2, x1/3]

Sensitivity Analysis of POP x0.1
To improve Gulf Stream, etc. → change restoring condition, topography
Viscosity & diffusivity of bi-harmonic mixing
case 01e: viscosity & diffusivity x1/3 + w/o restoring
case 01f: viscosity & diffusivity x1/3 + w/ topography change

SSH at 6th year
[Figure: case 01e (x1/3, w/o restoring) and case 01f (x1/3, w/ topo. change)]

Global Diagnostics
[Figure: Global Mean KE, Global Mean SAL, Global Mean PT]
case 01a: Same (basic)
case 01b: x1/2
case 01c: x1/3
case 01d: GM scheme
case 01e: x1/3 w/o restoring
case 01f: x1/3 w/ topo. change

Global Diagnostics
[Figure: Global Mean KE, Global Mean SAL, Global Mean PT]
case 20d: (basic)
case 00a: GM mixing
With NCEP daily forcing

Future Research Plan
1) POP x0.1 deg.
   * Improvement of the model
     Sensitivity analysis : horizontal & vertical mixing, etc.
     Vertical resolution : 40 layers -> 106 layers
     Active ice model
2) POP x1 deg.
   * Tuning for CCSM2 computation

Research Plan in FY2003
1) POP x0.1 deg.
   * Improvement of the model
     Sensitivity analysis : horizontal & vertical mixing, etc.
     Vertical resolution : 40 layers -> 106 layers
     Active ice model ?
   * Analyze the results and write a paper
2) POP x1 deg.
   * Tuning for CCSM2 computation
3) Regional nesting model
   Porting to ES Center
   Nesting to POP x1 deg.
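For the horizontal-mixing sensitivity cases 01a to 01c, the slides give the basic bi-harmonic coefficients (am = -2.7e18, ah = -9.0e17) and a scale factor per case. A minimal sketch of the resulting coefficients follows. It assumes that the factor applies equally to viscosity and diffusivity, as the slide heading suggests. Units are not stated in the source, and the names used here are illustrative.

```python
# Basic bi-harmonic coefficients of case 01a, as quoted in the slides
# (units are not stated in the source).
BASE_AM = -2.7e18   # viscosity
BASE_AH = -9.0e17   # diffusivity

# Divisor applied in each case: 01a basic, 01b x1/2, 01c x1/3.
# Assumption: the same divisor applies to both coefficients.
DIVISORS = {"01a": 1, "01b": 2, "01c": 3}


def scaled_coefficients(case):
    """Return (am, ah) for one of the sensitivity cases."""
    divisor = DIVISORS[case]
    return BASE_AM / divisor, BASE_AH / divisor


for case in DIVISORS:
    am, ah = scaled_coefficients(case)
    print(f"case {case}: am = {am:.2e}, ah = {ah:.2e}")
```

Under that assumption, the viscosity in case 01c has the same magnitude as the diffusivity in the basic case.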