Computation of High-Resolution Global Ocean Model using Earth Simulator
By
Norikazu Nakashiki (CRIEPI)
Yoshikatsu Yoshida (CRIEPI)
Takaki Tsubono (CRIEPI)
Dong-Hoon Kim (CRIEPI)
Frank O. Bryan (NCAR)
Richard D. Smith (LANL)
Mathew E. Maltrud (LANL)
Julie L. McClean (NPS)
Parallel Ocean Program (POP)
1. Designed for Massively Parallel Computers
-> shared memory, massively parallel computing
2. Free-surface boundary condition
-> no island problem
-> unsmoothed bottom topography
-> prognostic sea-surface height
3. General Orthogonal Coordinate
-> displaced-pole grid (singularity free Arctic Ocean)
4. Vertical mixing parameterization
1) simple constant mixing
2) Richardson-number dependent mixing
3) KPP mixing parameterization
5. Convective Adjustment
1) convective adjustment
2) large mixing coefficient
6. Horizontal mixing
1) Laplacian
2) bi-harmonic
3) Gent-McWilliams isopycnal tracer diffusion
4) anisotropic viscosity
7. Equation of State
1) UNESCO eq. (based on potential temperature)
2) full UNESCO eq. (polynomial fit)
3) linear eos
8. Topographic stress
1) Holloway’s topographic stress parameterization
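Item 7 above lists a linear equation of state as the simplest option. As a minimal sketch of what such an option means (not POP's actual code), density can be written as a linear function of temperature and salinity about a reference state. The reference values and the expansion and contraction coefficients below are illustrative assumptions, not values taken from POP or from these slides.

```python
# Illustrative linear equation of state (all constants are assumptions):
#   rho = RHO0 * (1 - ALPHA * (T - T0) + BETA * (S - S0))
RHO0 = 1025.0    # reference density, kg/m^3
T0 = 10.0        # reference temperature, deg C
S0 = 35.0        # reference salinity, psu
ALPHA = 2.0e-4   # thermal expansion coefficient, 1/K
BETA = 7.6e-4    # haline contraction coefficient, 1/psu


def linear_eos(temp_c, salt_psu):
    """Return density in kg/m^3 from a linear fit about (T0, S0)."""
    return RHO0 * (1.0 - ALPHA * (temp_c - T0) + BETA * (salt_psu - S0))


if __name__ == "__main__":
    # Warmer water is lighter, saltier water is denser.
    print(linear_eos(T0, S0), linear_eos(20.0, S0), linear_eos(T0, 36.0))
```

A linear form is cheap to evaluate, which is why it is usually reserved for idealized tests; the UNESCO-based options in the list are the ones intended for realistic runs.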
POP (Parallel Ocean Program)
1) High Resolution Global Ocean model
Resolution : 0.1x0.1x40L (3600x2400x40)
(pole on North America)
Horizontal : Bi-harmonic Mixing for Momentum & Tracer
Vertical : KPP Mixing
Time step : 220/day (≒ 6 min.)
2) Global Model for CCSM2
Resolution : 1x1x40L (320x384x40)
(pole on Greenland)
Horizontal : Anisotropic Mixing for Momentum, GM Mixing for Tracer
Vertical : KPP Mixing
Time step : 23/day (≒ 60 min.)
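The step counts and grid sizes of the two configurations can be turned into time-step lengths and total grid-point counts with simple arithmetic. The sketch below only restates the numbers given on the slide; the function and dictionary names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def step_length_minutes(steps_per_day):
    """Length of one model time step in minutes."""
    return SECONDS_PER_DAY / steps_per_day / 60.0


def grid_points(nx, ny, nz):
    """Total number of grid points in an nx x ny x nz mesh."""
    return nx * ny * nz


# Values copied from the slide.
configs = {
    "0.1 degree": {"grid": (3600, 2400, 40), "steps_per_day": 220},
    "1 degree": {"grid": (320, 384, 40), "steps_per_day": 23},
}

if __name__ == "__main__":
    for name, cfg in configs.items():
        dt = step_length_minutes(cfg["steps_per_day"])
        n = grid_points(*cfg["grid"])
        print(f"{name}: dt = {dt:.2f} min, {n:,} grid points")
```

This gives about 6.5 minutes and 345.6 million grid points for the 0.1 degree model, and about 62.6 minutes and 4.9 million grid points for the 1 degree model, consistent with the rounded values of roughly 6 and 60 minutes quoted on the slide.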
Computational Grid of POP x0.1
[Figure: Horizontal Mesh and Vertical Mesh]
POP timing measurement on ES
• 1 degree model
– 320 x 384 x 40 grid division, 23 full-step/day
– KPP vertical mixing scheme
– GM horizontal mixing for tracer
– Anisotropic viscosity parameterization
– 3rd-order upwind tracer advection
• 0.1 degree model
– 3600 x 2400 x 40 grid division, 220 full-step/day
– KPP vertical mixing scheme
– Bi-harmonic horizontal mixing for tracer and momentum
• No history output. No forcing data input.
Cost distribution in POP
[Figure: wallclock seconds per simulated day spent in the baroclinic and barotropic modes, w/ and w/o optimization, at resolution x0.1 deg; panels (a) 20 PEs, (b) 160 PEs, (c) 960 PEs]
Scalability in baroclinic/barotropic mode
[Figure: wallclock seconds per simulated day vs. # of processors (and # of nodes) for the baroclinic and barotropic modes, w/ and w/o optimization; panels (a) 1 degree, (b) 0.1 degree]
• Significant improvement in barotropic mode
• Scalability wall around 2 nodes (1 deg) and 80 nodes (0.1 deg)
• Slight speedup in baroclinic mode
POP performance on ES
[Figure: (a) wallclock days per simulated century vs. # of PEs; (b) parallel efficiency vs. # of PEs; curves for 1 degree and 0.1 degree, w/ and w/o optimization]
• 1.64 day/century, 70.2 Gflop/s, at 1 degree (4 nodes)
• 27.1 day/century, 1.60 Tflop/s, at 0.1 degree (120 nodes)
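The two headline results (1.64 days per simulated century at 70.2 Gflop/s on 4 nodes for the 1 degree model, and 27.1 days per century at 1.60 Tflop/s on 120 nodes for the 0.1 degree model) can be converted into throughput and into a fraction of peak performance. The sketch below assumes 8 PEs per node, which matches the "40 node (320 PE)" figure used elsewhere in these slides, and a peak of 8 Gflop/s per PE, which is the published Earth Simulator specification and is not stated in the slides.

```python
PES_PER_NODE = 8            # consistent with "40 node (320 PE)" in the slides
PEAK_GFLOPS_PER_PE = 8.0    # assumption: published ES peak per processor


def simulated_years_per_wallclock_day(days_per_century):
    """Throughput in simulated years per wallclock day."""
    return 100.0 / days_per_century


def fraction_of_peak(sustained_gflops, nodes):
    """Sustained performance divided by the assumed peak of the nodes used."""
    peak = nodes * PES_PER_NODE * PEAK_GFLOPS_PER_PE
    return sustained_gflops / peak


if __name__ == "__main__":
    # (label, days per century, sustained Gflop/s, nodes) from the slide.
    runs = [("1 degree", 1.64, 70.2, 4), ("0.1 degree", 27.1, 1600.0, 120)]
    for label, days, gflops, nodes in runs:
        print(label,
              round(simulated_years_per_wallclock_day(days), 2),
              round(fraction_of_peak(gflops, nodes), 3))
```

Under these assumptions the 1 degree run delivers about 61 simulated years per wallclock day at roughly 27% of peak, and the 0.1 degree run about 3.7 simulated years per day at roughly 21% of peak.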
Parallel Efficiency of POP x1
(Relation with Vertical Resolution)
[Figure: Parallel Efficiency (%) and Wall Clock Time (sec) for 2 Days Integration vs. Num. of PE]
Vertical Resolution -> 40L, 80L, 160L, 200L
Parallel efficiency ≥ 50 % (10 nodes) at the ES Center
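The slides do not state how parallel efficiency is computed. A common definition, assumed here, compares the processor-seconds of a run against those of a smaller baseline run. The timings in the usage example are hypothetical and serve only to show the arithmetic; they are not measurements from the Earth Simulator.

```python
def parallel_efficiency(base_pes, base_seconds, pes, seconds):
    """Efficiency relative to a baseline run.

    Equals 1.0 when the wallclock time shrinks in exact proportion
    to the increase in processor count.
    """
    return (base_pes * base_seconds) / (pes * seconds)


if __name__ == "__main__":
    # Hypothetical timings: 8 PEs take 1000 s, 80 PEs take 180 s.
    eff = parallel_efficiency(8, 1000.0, 80, 180.0)
    print(f"efficiency = {eff:.1%}")
```

With these made-up numbers the 80-PE run is 5.6 times faster on 10 times the processors, which is an efficiency of about 56%.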
Further optimizations for POP code
• POP version 1
– Distributed parallel I/O w/ horizontal data decomposition (J. Ueno, will be completed in March)
– Tests of NEC’s new MPI library (incl. all-reduce)
– Merge CRIEPI version and CRAY version into one
• POP version 2
– POP2 beta2 code ported to ES
– Vector optimization
– Timing measurement in progress (H. Komatsu, J. Ueno)
Some problems w/ OpenMP
• NEC’s compiler supports OpenMP 1.1, not OpenMP 2
• Some features of Fortran 90 cannot be used w/ OpenMP 1.1
10-year Spin-up of POP (x0.1)
10-year computation (10 years × 1 cycle)
Atmospheric boundary conditions: NCEP, etc. (1990-2000)
(1) Wind: Daily
(2) Surface Heat Flux: Daily
(3) Surface Fresh Water Flux: Monthly
Surface boundary condition -> POP x0.1 on Earth Simulator, 40 nodes (320 PEs)
Initial data from LANL/NPS
Global Diagnostics
[Figure: Global Mean KE, Kinetic Energy at Surface, Global Mean SAL, Global Mean PT]
Annual Mean Sea Surface Temp. and Annual Mean Sea Surface Sal.
[Figure: Levitus, POP x1 (2000), POP x0.1 (2000)]
Kuroshio
[Figure: CCSM2 (for climate simulation), x1 deg. (100km×100km), compared with High Resolution Model, x0.1 deg. (10km×10km)]
Equatorial Current
[Figure: CCSM2 (for climate simulation), x1 deg. (100km×100km), compared with High Resolution Model, x0.1 deg. (10km×10km)]
Sea Surface Temp.
[Figures: Global 1990-2000 Monthly SST; Kuroshio; Gulf Stream; 1990-2000 Monthly Vel.; 1990-1991 Daily SST]
SSH & Volume Transport Section
[Figure: SSH map with transport sections: Soya, Tsushima, Tsugaru, Kuroshio, Ohsumi, Tokara]
Volume Transport
[Figure: volume transport through each Kuroshio section (Sv) vs. year, 1990-2000; curves for Kuroshio (Tokara) and Kuroshio (Izu)]
Kuroshio (Izu): 30-60 Sv
Tokara: 13 Sv
Japan Sea
[Figure: volume transport through each section in the Japan Sea (Sv) vs. year, 1990-2000; curves for Tsushima, Tsugaru, Soya]
Tsushima: 2.2 Sv
Tsugaru: 1.5 Sv
Soya: 0.7 Sv
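Volume transport through a section is the velocity normal to the section integrated over its area, usually quoted in Sverdrups (1 Sv = 10^6 m^3/s). The sketch below shows that integration for a section discretized into cells. The function name, the array layout, and the sample section are hypothetical and are not taken from the POP diagnostics or from the model output.

```python
M3_PER_S_PER_SV = 1.0e6  # 1 Sverdrup = 1e6 m^3/s


def volume_transport_sv(normal_velocity, dx, dz):
    """Integrate normal velocity over a vertical section.

    normal_velocity[k][i]: velocity normal to the section in m/s
                           at level k and horizontal cell i
    dx[i]: horizontal cell widths in m
    dz[k]: layer thicknesses in m
    Returns the transport in Sv.
    """
    total = 0.0
    for k, row in enumerate(normal_velocity):
        for i, v in enumerate(row):
            total += v * dx[i] * dz[k]
    return total / M3_PER_S_PER_SV


if __name__ == "__main__":
    # Hypothetical section: 100 km wide, 400 m deep, uniform 0.5 m/s flow.
    dx = [10_000.0] * 10
    dz = [100.0] * 4
    velocity = [[0.5] * 10 for _ in dz]
    print(volume_transport_sv(velocity, dx, dz), "Sv")
```

For the uniform sample flow this is simply 0.5 m/s × 100 km × 400 m, i.e. 20 Sv, which is the same order as the Kuroshio values quoted above.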
Sensitivity Analysis of POP x0.1
To improve Gulf Stream & Kuroshio, etc.
→ Change Strength of Horizontal Mixing
Viscosity & Diffusivity of Bi-harmonic Mixing
case 01a: Same Horizontal Mixing (basic), am = -2.7e18, ah = -9.0e17
case 01b: x1/2
case 01c: x1/3
Surface Forcing : Monthly Climatology
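The coefficients implied by the three cases follow from scaling the basic values. The sketch below assumes that the x1/2 and x1/3 factors apply to both the viscosity (am) and the diffusivity (ah), as the slide heading suggests; the slides give no units, so none are attached here, and the names are illustrative only.

```python
# Basic bi-harmonic coefficients from case 01a (units not given in the slides).
BASE_AM = -2.7e18   # viscosity
BASE_AH = -9.0e17   # diffusivity

# Assumption: each factor scales both am and ah.
SCALE = {"01a": 1.0, "01b": 1.0 / 2.0, "01c": 1.0 / 3.0}


def scaled_coefficients(case):
    """Return (am, ah) for a sensitivity case."""
    factor = SCALE[case]
    return BASE_AM * factor, BASE_AH * factor


if __name__ == "__main__":
    for case in SCALE:
        am, ah = scaled_coefficients(case)
        print(f"case {case}: am = {am:.2e}, ah = {ah:.2e}")
```

Under that assumption case 01b uses am = -1.35e18 and ah = -4.5e17, and case 01c uses am = -9.0e17 and ah = -3.0e17.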
Global Diagnostics
case 01a: Same (basic)
case 01b: x1/2
case 01c: x1/3
[Figure: Global Mean KE, Global Mean PT, Global Mean SAL]
SSH
[Figure: case 01a (basic), 9th and 10th year]
SSH
[Figure: case 01c x1/3, 9th and 10th year]
Viscosity – SSH (Kuroshio)
[Figure: SSH (Low to High) for basic, x1/2, and x1/3 viscosity, with Global Mean KE for 01a basic, 01b x1/2, 01c x1/3]
Soya
Tsushima
Tsugaru
Es_01b
Es_01a
Es_01c
Volume Transport
Es_01a
0.14
Es_01b Es_01c
0.16
Obs.
Tsushima 2.6
2.74
3.0
(Korea)
0.18
0.21
Tsugaru 2.22
2.40
2.61
2.7
Soya
1.1
0.38
0.34
0.38
Soya
1.6
(Sv)
Tsushima
Tsugaru
Tokara
Kuroshio
case_01c, year 12: Jan., Mar., and May
[Figure panels: Jan, Mar, May]
New sections are being planned for checking volume transport
Volume Transport (Kuroshio)
[Figure: case 01a Same (basic), case 01b x1/2, case 01c x1/3]
Viscosity – SSH (Gulf Stream)
[Figure: SSH for basic, x1/2, and x1/3 viscosity, with Global Mean KE for 01a basic, 01b x1/2, 01c x1/3]
Sensitivity Analysis of POP x0.1
To improve Gulf Stream, etc.
→ Change Restoring condition, Topography
Viscosity & Diffusivity of Bi-harmonic Mixing
case 01e: viscosity & diffusivity x1/3 + w/o restoring
case 01f: viscosity & diffusivity x1/3 + w/ topography change
case 01f w/ topography change
SSH at 6th year
[Figure panels: case 01e x1/3 w/o restoring; case 01f x1/3 w/ topo. change]
Global Diagnostics
case 01a: Same (basic)
case 01b: x1/2
case 01c: x1/3
case 01d: GM scheme
case 01e: x1/3 w/o restoring
case 01f: x1/3 w/ topo. change
[Figure: Global Mean KE, Global Mean SAL, Global Mean PT]
Global Diagnostics
With NCEP daily forcing
case 20d: (basic)
case 00a: GM mixing
[Figure: Global Mean KE, Global Mean SAL, Global Mean PT]
Future Research Plan
1) POP x0.1 deg.
* Improvement of the Model
Sensitivity Analysis : Horizontal & Vertical Mixing etc.
Vertical Resolution : 40 Layer -> 106 Layer
Active Ice Model
2) POP x1 deg.
* Tuning for CCSM2 Computation
Research Plan in FY2003
1) POP x0.1 deg.
* Improvement of the Model
Sensitivity Analysis : Horizontal & Vertical Mixing etc.
Vertical Resolution : 40 Layer -> 106 Layer
Active Ice Model ?
* Analyze the Results and Write a Paper
2) POP x1 deg.
* Tuning for CCSM2 Computation
3) Regional Nesting model
Porting to ES Center
Nesting to POP x1 deg.