Adoption and field tests of the M.I.T. General Circulation Model (MITgcm) with ESMF
Chris Hill
ESMF Community Meeting
MIT, July 2005
Outline
• MITgcm – a very quick overview
– algorithmic features
– software characteristics
• Adopting ESMF
– strategy
– steps
• Field test applications
– MITgcm coupling with everything (including itself!): interoperating with NCAR, GFDL and UCLA atmosphere models; an intermediate-complexity coupled system.
– high-end parameterization as a coupled problem.
• Next steps
MITgcm algorithmic characteristics
• General orthogonal curvilinear coordinate finite-volume dynamical kernel.
• Flexible, scalable domain decomposition, from 1 CPU to 2000+ CPUs.
• Can apply to a wide range of scales, hydrostatic to non-hydrostatic.
• Pressure-height isomorphism allows the kernel to apply to ocean or atmosphere.
• Many optional packages spanning biogeochemistry, atmospheric physics, boundary layers, sea-ice etc.
• Adjoints to most parts for assimilation/state-estimation and sensitivity analysis.
• and more… see http://mitgcm.org
[Figure: range of applicable scales – non-hydrostatic from ~20 m to ~1 km, hydrostatic from ~10 km to ~1000 km]
MITgcm software characteristics
• Fortran (what else?)
• Approx. 170K executable statements.
• Generic driver code (superstructure), coupling code, computational kernel code, and parallelism, I/O etc. support code (infrastructure) are modularized, which aligns with ESMF’s “sandwich” architecture.
• Target hardware: from my laptop to the largest supercomputers (Columbia, Blue Genes) – it tries to be portable!
• OSes: Linux, HPUX, Solaris, AIX etc.
• Parallel: MPI parallelism binding, threads parallelism binding (dormant), platform-specific parallelism library support, e.g. active messages, shmem (dormant).
• Distributed openly on the web. Supported through a user+developer mailing list and website. Users all over the world.
Outline
• MITgcm – a very quick overview
– algorithmic features
– software characteristics
• Adopting ESMF
– strategy
– steps
• Field test applications
– MITgcm coupling with everything (including itself!): interoperating with NCAR, GFDL and UCLA atmosphere models; an intermediate-complexity coupled system.
– high-end parameterization as a coupled problem.
• Next steps
Adoption strategy
• Currently only in-house (i.e. ESMF binding not part
of default distribution). Practical consideration as
many MITgcm user systems do not have ESMF
installed.
• A set of ESMF experiments is maintained in the MITgcm CVS source repository and kept up to date with the latest ESMF (with a one-to-two week lag).
• These experiments use
– the ESMF component model ( init(), run(), finalize() )
– clocks, configuration attributes, field communications
– primarily sequential-mode component execution (more on this later)
A minimal component skeleton is sketched below.
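To make the component-model bullet concrete, here is a minimal, hypothetical sketch of an ESMF gridded component in Fortran. It is written against the present-day ESMF Fortran API (module name, ESMF_METHOD_* constants and call signatures may differ from the 2005-era ESMF used in this talk), and the module and routine names are illustrative rather than actual MITgcm code.

module mitgcm_esmf_comp
  ! Minimal ESMF gridded-component skeleton (hypothetical names) showing the
  ! init()/run()/finalize() entry points and the clock argument mentioned above.
  use ESMF
  implicit none
  public :: SetServices
contains

  subroutine SetServices(gcomp, rc)
    type(ESMF_GridComp)  :: gcomp
    integer, intent(out) :: rc
    ! Register the three standard phases with ESMF.
    call ESMF_GridCompSetEntryPoint(gcomp, ESMF_METHOD_INITIALIZE, &
                                    userRoutine=ocn_init, rc=rc)
    call ESMF_GridCompSetEntryPoint(gcomp, ESMF_METHOD_RUN, &
                                    userRoutine=ocn_run, rc=rc)
    call ESMF_GridCompSetEntryPoint(gcomp, ESMF_METHOD_FINALIZE, &
                                    userRoutine=ocn_final, rc=rc)
  end subroutine SetServices

  subroutine ocn_init(gcomp, importState, exportState, clock, rc)
    type(ESMF_GridComp)  :: gcomp
    type(ESMF_State)     :: importState, exportState
    type(ESMF_Clock)     :: clock
    integer, intent(out) :: rc
    ! Here the model would read its configuration, build its ESMF grid and
    ! add export fields (e.g. SST) to exportState.
    rc = ESMF_SUCCESS
  end subroutine ocn_init

  subroutine ocn_run(gcomp, importState, exportState, clock, rc)
    type(ESMF_GridComp)  :: gcomp
    type(ESMF_State)     :: importState, exportState
    type(ESMF_Clock)     :: clock
    integer, intent(out) :: rc
    ! Advance the model over the clock's current timestep and refresh the
    ! fields held in exportState.
    rc = ESMF_SUCCESS
  end subroutine ocn_run

  subroutine ocn_final(gcomp, importState, exportState, clock, rc)
    type(ESMF_GridComp)  :: gcomp
    type(ESMF_State)     :: importState, exportState
    type(ESMF_Clock)     :: clock
    integer, intent(out) :: rc
    rc = ESMF_SUCCESS
  end subroutine ocn_final

end module mitgcm_esmf_comp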
Adoption steps – top level
• Introduction of internal init(), run(), finalize().
• Development of couplers (and stub components to test against)
– coupler_init(), coupler_run()
• Development of drivers
– driver_init(), driver_run()
• The code can be seen in the CVS repository at mitgcm.org, under “MITgcm_contrib/ESMF”. A sketch of how a driver sequences these pieces follows below.
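The following is a hypothetical sketch of what such a driver might orchestrate: create gridded and coupler components, register their services, and run them in sequential mode under one ESMF clock. It uses the current ESMF Fortran API and invented component names; it is not the MITgcm_contrib/ESMF driver itself.

program toy_driver
  ! Hypothetical driver/coupler layering: ocean and (stub) atmosphere gridded
  ! components plus one coupler, run sequentially under a shared clock.
  use ESMF
  use mitgcm_esmf_comp, only: ocnSetServices => SetServices
  use atm_esmf_comp,    only: atmSetServices => SetServices   ! hypothetical stub
  use ocn2atm_coupler,  only: cplSetServices => SetServices   ! hypothetical coupler
  implicit none

  type(ESMF_GridComp)     :: ocn, atm
  type(ESMF_CplComp)      :: cpl
  type(ESMF_State)        :: ocnExp, atmImp
  type(ESMF_Clock)        :: clock
  type(ESMF_TimeInterval) :: dt
  type(ESMF_Time)         :: t0, t1
  integer                 :: rc

  call ESMF_Initialize(defaultCalKind=ESMF_CALKIND_GREGORIAN, rc=rc)

  ! driver_init(): components, states and the shared clock.
  ocn = ESMF_GridCompCreate(name="MITgcm ocean",     rc=rc)
  atm = ESMF_GridCompCreate(name="atmosphere stub",  rc=rc)
  cpl = ESMF_CplCompCreate(name="ocn-to-atm coupler", rc=rc)
  call ESMF_GridCompSetServices(ocn, userRoutine=ocnSetServices, rc=rc)
  call ESMF_GridCompSetServices(atm, userRoutine=atmSetServices, rc=rc)
  call ESMF_CplCompSetServices(cpl, userRoutine=cplSetServices, rc=rc)

  ocnExp = ESMF_StateCreate(name="ocean export", rc=rc)
  atmImp = ESMF_StateCreate(name="atmos import", rc=rc)

  call ESMF_TimeIntervalSet(dt, h=1, rc=rc)
  call ESMF_TimeSet(t0, yy=2005, mm=7, dd=1, rc=rc)
  call ESMF_TimeSet(t1, yy=2005, mm=7, dd=2, rc=rc)
  clock = ESMF_ClockCreate(timeStep=dt, startTime=t0, stopTime=t1, rc=rc)

  call ESMF_GridCompInitialize(ocn, exportState=ocnExp, clock=clock, rc=rc)
  call ESMF_GridCompInitialize(atm, importState=atmImp, clock=clock, rc=rc)
  call ESMF_CplCompInitialize(cpl, importState=ocnExp, exportState=atmImp, &
                              clock=clock, rc=rc)

  ! driver_run(): sequential-mode execution of ocean, coupler, atmosphere.
  do while (.not. ESMF_ClockIsStopTime(clock, rc=rc))
    call ESMF_GridCompRun(ocn, exportState=ocnExp, clock=clock, rc=rc)
    call ESMF_CplCompRun(cpl, importState=ocnExp, exportState=atmImp, &
                         clock=clock, rc=rc)
    call ESMF_GridCompRun(atm, importState=atmImp, clock=clock, rc=rc)
    call ESMF_ClockAdvance(clock, rc=rc)
  end do

  call ESMF_Finalize(rc=rc)
end program toy_driver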
Outline
• MITgcm – a very quick overview
– algorithmic features
– software characteristics
• Adopting ESMF
– strategy
– steps
• Field test applications
– MITgcm coupling with everything (including itself!): interoperating with NCAR, GFDL and UCLA atmosphere models.
– high-end parameterization as a coupled problem.
• Next steps
Field test: M.I.T. General Circulation Model (MITgcm)
to NCAR Community Atmospheric Model (CAM).
Versions of CAM and MITgcm were adapted to
a. have init(), run(), finalize() interfaces
b. accept, encode and decode ESMF_State variables
A coupler component that maps the MITgcm grid to the CAM grid was written. (Kluzek, Hill)
Runtime steps:
1. MITgcm prepares an export state (128x64 grid on 1x16 PEs).
2. The export state passes through the parent to the coupler.
3. The coupler returns a CAM-gridded SST array, which is passed as an import state to the CAM gridded component (180x90 grid on 1x16 PEs).
• Uses the ESMF_GridComp, ESMF_CplComp and ESMF_Regrid sets of functions. A hedged sketch of the coupler's regrid step follows below.
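The coupler's regrid step could, for example, look like the following hypothetical sketch: the interpolation route handle is precomputed once, then applied on every coupling step. It uses the present-day ESMF_FieldRegridStore/ESMF_FieldRegrid interfaces, which differ in detail from the 2005 ESMF_Regrid API referenced on this slide, and the field name "SST" and the saved-route-handle idiom are assumptions, not the actual coupler code.

subroutine cpl_run(cplcomp, importState, exportState, clock, rc)
  ! Hypothetical coupler run method: take the SST field exported on the
  ! MITgcm 128x64 grid and regrid it onto the CAM 180x90 grid.
  use ESMF
  implicit none
  type(ESMF_CplComp)   :: cplcomp
  type(ESMF_State)     :: importState, exportState
  type(ESMF_Clock)     :: clock
  integer, intent(out) :: rc

  type(ESMF_Field)              :: sstOcn, sstAtm
  type(ESMF_RouteHandle), save  :: rh
  logical, save                 :: firstCall = .true.

  call ESMF_StateGet(importState, itemName="SST", field=sstOcn, rc=rc)
  call ESMF_StateGet(exportState, itemName="SST", field=sstAtm, rc=rc)

  if (firstCall) then
    ! Precompute the interpolation weights and communication pattern once.
    call ESMF_FieldRegridStore(srcField=sstOcn, dstField=sstAtm, &
                               routehandle=rh, &
                               regridmethod=ESMF_REGRIDMETHOD_BILINEAR, rc=rc)
    firstCall = .false.
  end if

  ! Apply the precomputed regrid at every coupling step.
  call ESMF_FieldRegrid(sstOcn, sstAtm, routehandle=rh, rc=rc)
end subroutine cpl_run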
Field test: M.I.T. General Circulation Model (MITgcm)
to GFDL Atmosphere/Land/Ice (ALI).
Versions of MOM and MITgcm were adapted as components to
a. work within init(), run(), finalize() interfaces
b. accept, encode and decode ESMF_State variables
A coupler component that maps the MITgcm grid to the ALI grid was written. The MITgcm component is substituted for the MOM component with the MITgcm-ALI coupler. (Smithline, Zhou, Hill)
Runtime steps:
1. MITgcm prepares an export state (128x60 grid on 1x16 PEs).
2. The export state passes through the parent to the coupler.
3. The coupler returns an ALI-gridded SST array, which is passed to ALI (144x90 grid on 16x1 PEs).
• Uses the ESMF_GridComp, ESMF_CplComp and ESMF_Regrid sets of functions.
SI experiment: M.I.T. General Circulation Model (MITgcm) ECCO assimilation ocean and POP to UCLA atmosphere.
[Figure: observational analysis feeding two 3-month forecasts, A and B]
• Uses the ESMF_GridComp, ESMF_CplComp and ESMF_Regrid sets of functions.
New app: High-end resolution embedding as a coupled
problem.
For a climate-related ocean simulation, domain decomposition is limited in the number of processors it can usefully scale to. For a ~1° model there may be no scaling beyond ~64 CPUs. This limit arises because parallelism costs (communication overhead, overlap computations) exceed parallelism benefits.
Question: Are there other things besides ensembles of runs we can do with a thousand+ processor system? Increasing resolution is hard because explicit-scheme timesteps drop with resolution – not good for millennial simulations.
[Figure: coarse-resolution domain decomposed across processors 0-7]
New app: High-end resolution embedding as a coupled
problem.
What about embedding local sub-models, running concurrently on separate processors but coupled to the coarse-resolution run?
[Figure: coarse grid on processors 0-7, with embedded sub-models on processors 65, 66, 67, 68, … 319]
Implementation with ESMF
• ESMF provides nice tools for developing this embedded system:
– the component model abstraction for managing the different pieces
– parallel regrid/redist provides a great tool for N-to-M coupling
regrid()/redist() precompute data flows at initialization. At each timestep, resolving data transport between ~300-400 components is about 15 lines of user code (sketched below).
[Figure: component hierarchy – top component, sub-components, sub-sub-components – distributed over processors 0 to 319]
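As an illustration of how small that per-timestep user code can be, here is a hypothetical sketch of the coupling loop over embedded sub-models, with all route handles assumed to have been precomputed at initialization. The routine and variable names are invented, and it uses the current ESMF field-regrid interface rather than the 2005-era one.

module embedded_coupling
  ! Hypothetical per-timestep coupling for the embedded sub-models: route
  ! handles are assumed precomputed at initialization (e.g. via
  ! ESMF_FieldRegridStore), so each step is one regrid call per sub-model.
  use ESMF
  implicit none
contains
  subroutine couple_embedded_models(coarse2fine, fine2coarse, coarseFld, fineFld, rc)
    type(ESMF_RouteHandle), intent(inout) :: coarse2fine(:), fine2coarse(:)
    type(ESMF_Field),       intent(inout) :: coarseFld, fineFld(:)
    integer, intent(out)                  :: rc
    integer :: i
    do i = 1, size(fineFld)
      ! Push the coarse-grid state down to embedded sub-model i ...
      call ESMF_FieldRegrid(coarseFld, fineFld(i), routehandle=coarse2fine(i), rc=rc)
      ! ... and return sub-model i's feedback to the coarse grid.
      call ESMF_FieldRegrid(fineFld(i), coarseFld, routehandle=fine2coarse(i), rc=rc)
    end do
  end subroutine couple_embedded_models
end module embedded_coupling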
MITgcm with ESMF next steps
• Continued work in-house. Directions:
– Embedding with dynamic balancing
– High-resolution ocean and coupled work
• ESMF in the default MITgcm distribution
– Most MITgcm user systems do not have ESMF installed yet. This will take time to change – how long?
– Hopeful that this will evolve over the next year.
Summary
• ESMF implementation functionality has grown significantly over the last year:
– optimized regrid/redist scaling
– concurrent components
• Performance is always within a factor of 2 of custom code at the infrastructure level; at the superstructure level (code driver, coupling), ESMF overhead is comparable to our own code.