The ROMS Model and Its Use on NYMPHEA, Patrick Marchesiello

Centre IRD de Bretagne
Brest, 13 January 2005
ROMS History
• Descendant of SPEM & SCRUM (relative of POM)
(Song & Haidvogel 1994; Barnier et al., 1998)
• UCLA: more like developer’s code (Shchepetkin et al., 1998, 2003, 2004;
Marchesiello et al., 2001, 2003 … )
http://www.atmos.ucla.edu/cesr/ROMS_page.html
• Rutgers: larger user community & support
http://marine.rutgers.edu/po/index.php?model=roms
• IRD Brest & UCLA & INRIA
http://www.brest.ird.fr/Roms_tools
- AGRIF: Adaptive Grid Refinement In Fortran (Debreu 1999)
- Pre-processing tools (Penven, Marchesiello)
Collaborators and Users
FRANCE
• IRD Brest: Penven, Marchesiello et al.
• LMC Grenoble: Debreu et al.
• LPO Brest: Le Gentil et al.
USA
• UCLA: McWilliams, Shchepetkin, et al.
• JPL: Chao et al.
• Rutgers U.: Arango et al.
USERS
• France: Brest, Paris, Toulouse, Noumea
• Europe: Germany (U. Bremerhaven), Italy (JRC), Portugal (IPIMAR),
Spain (AZTI)
• Africa: Morocco (INRH), Senegal (LPA), South Africa (U. Cape Town)
• America: California, Peru (IMARPE), Chile (U. Concepción), Brazil
ROMS Main features
• Hydrostatic, Boussinesq Primitive Equations
• Free surface
• Generalized vertical s-coordinate
• Horizontal curvilinear coordinates
• High order, low dispersion numerics
• Embedded domains: AGRIF
• Open boundary conditions
• Boundary layer parameterizations
• Parallelization: OpenMP, MPI
• Domain partitioning
• Optimized for vector computers
• Fortran 95
• UNIX/Linux
• C preprocessor
• NetCDF library, used for all I/O
Numerics: Motivation
Kantha and Clayson (2000) after Durran (1991)
Numerics: Strategy
High-order accurate methods:
Sanderson (1998): the optimal choice (lowest cost for a
given accuracy) for general ocean circulation models
is 3RD- OR 4TH-ORDER accurate methods
With special attention to:
• Numerical dispersion
• Pressure gradient
• Mode splitting
• Combination of methods
Numerics in ROMS
(Shchepetkin & McWilliams, 1998, 2003, 2004)
• Horizontal (“C”) and vertical staggered grids
• Time stepping
– Split-explicit barotropic and baroclinic modes with 2-way time filter
– Predictor-corrector Leapfrog/Adams-Moulton 3rd-order scheme
with feedback between momentum & tracer equations
– Non-uniform density in barotropic mode
– Conservative & constancy-preserving advection for tracers
• Advection
– 3rd-order upstream-biased (QUICK)
• Vertical terms
– Parabolic spline reconstruction for horizontal pressure gradient and
advection terms (equivalent to 8th order)
– Implicit Crank-Nicolson scheme for vertical mixing terms
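As an illustration of the upstream-biased advection idea listed above, here is a minimal 1-D sketch of the textbook QUICK face interpolation on a uniform grid. It is not taken from the ROMS source: the function names `quick_face_value` and `quick_flux` are hypothetical, and ROMS applies its own flux form on a curvilinear staggered grid.

```python
def quick_face_value(phi_far_up, phi_up, phi_down):
    """Quadratic upstream-biased interpolation to a cell face.

    phi_up and phi_down are the values on either side of the face
    (upstream and downstream); phi_far_up is one point further upstream.
    """
    return (-phi_far_up + 6.0 * phi_up + 3.0 * phi_down) / 8.0


def quick_flux(phi, u_face, i):
    """Advective flux u*phi at the face between points i and i+1.

    Valid for interior faces only (needs i-1 and i+2 to exist).
    The stencil is shifted toward the side the flow comes from.
    """
    if u_face >= 0.0:
        value = quick_face_value(phi[i - 1], phi[i], phi[i + 1])
    else:
        value = quick_face_value(phi[i + 2], phi[i + 1], phi[i])
    return u_face * value


# Samples of phi(x) = x**2 at x = -1, 0, 1, 2; the face between
# x = 0 and x = 1 sits at x = 0.5.
phi = [1.0, 0.0, 1.0, 4.0]
print(quick_flux(phi, 1.0, 1), quick_flux(phi, -2.0, 1))
```

Because the interpolation fits a parabola through three points, it reproduces quadratic profiles exactly at the face, whichever direction the flow comes from.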
Numerics: Performance
[Figure: two panels, "POG - 0.25 deg" and "ROMS - 0.25 deg"; Cape Blanc (C. Blanc) marked in each]
ROMS_AGRIF
The same model (executable) runs on grids with different space/time resolutions
• Each domain has its own input/output files
• Grid locations are specified in AGRIF_FixedGrids.in
• Works with OpenMP/MPI
• Forcings and initial conditions are generated with an interactive
MATLAB tool: "nesting gui"
Example AGRIF_FixedGrids.in:
    2
    20 45 34 59 3 3 3
    30 55 70 89 3 3 2
    0
    1
    10 30 20 40 5 3 5
    0
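The numeric listing on this slide can be read as a nested grid hierarchy. The sketch below parses such a file under an assumed layout: a count of child grids, then one line per child giving imin imax jmin jmax followed by three refinement factors (x, y, time), then the same structure repeated for each child in turn. That layout is inferred from the example and should be checked against the AGRIF documentation; `parse_fixed_grids` and the dictionary keys are hypothetical names.

```python
EXAMPLE = """
2
20 45 34 59 3 3 3
30 55 70 89 3 3 2
0
1
10 30 20 40 5 3 5
0
"""

FIELDS = ("imin", "imax", "jmin", "jmax", "ref_x", "ref_y", "ref_t")


def parse_fixed_grids(text):
    """Return the grid hierarchy as a list of nested dictionaries."""
    tokens = iter(text.split())

    def read_children():
        count = int(next(tokens))
        grids = []
        # All sibling grids are listed first ...
        for _ in range(count):
            values = [int(next(tokens)) for _ in FIELDS]
            grid = dict(zip(FIELDS, values))
            grid["children"] = []
            grids.append(grid)
        # ... then each sibling's own children follow, in order.
        for grid in grids:
            grid["children"] = read_children()
        return grids

    return read_children()


grids = parse_fixed_grids(EXAMPLE)
for g in grids:
    print(g["imin"], g["imax"], g["jmin"], g["jmax"], len(g["children"]))
```

Under this reading, the root grid has two children, and the second child carries one further nested grid with a refinement factor of 5 in x and in time.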
Nymphea Implementation
• Compilation
– Software required: Fortran 95, Unix, C preprocessor, NetCDF library
– Compilation interface in ROMS which defines machine-dependent
options (Tru64 UNIX)
• Parallelization
– OpenMP: 1 node of 4 processors
– MPI: for process studies (S. Le Gentil); needs work for realistic
applications
• Applications
– Realistic: coastal regions of West Africa (Morocco and Senegal),
Iroise Sea, Bay of Brest
– Process studies at high resolution
ROMS_AGRIF for West Africa
• Parent grid: W. Africa, 25 km
• Child grid: Sahara, 5 km (242*252*32 points, dt=720s)
[Figure: panels labeled Mercator, Levitus, and Clipper; Cape Blanc (C. Blanc) and Cape Verde (C. Vert) marked]
PERFORMANCES: COST
CONFIGURATION
• 2 embedded grids with refinement coef=5
• Size (child grid): 242*252*32 points with dt=720s
• Duration of simulation: 10 model years
• Processors: 1 node of 4 Alpha EV68 processors (1 GHz)
• Parallelization with OpenMP
• Partitioning: 4*8
Cost: c = 6 × 10^-6 CPU seconds / grid point / time step
(Total run time = 15 days)
Comparisons:
• PC Xeon 2.8 GHz: c = 1 × 10^-5
• SGI/CRAY Origin2000: c = 8 × 10^-5
• Earth Simulator (NEC SX): c = 5 × 10^-7
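The quoted run time can be cross-checked from the cost per grid point and time step. The sketch below is a back-of-the-envelope estimate under three assumptions that are not stated on the slide: only the child grid is counted, a model year has 365 days, and the work divides evenly over the 4 processors. The function name `estimate_wall_days` is hypothetical.

```python
def estimate_wall_days(n_points, dt_seconds, model_years, cost,
                       n_procs, days_per_year=365):
    """Estimate wall-clock days from a cost in CPU s / grid point / step."""
    seconds_per_day = 86400
    n_steps = model_years * days_per_year * seconds_per_day / dt_seconds
    cpu_seconds = cost * n_points * n_steps
    # Assumption: ideal sharing of the work between processors.
    return cpu_seconds / n_procs / seconds_per_day


child_points = 242 * 252 * 32
wall_days = estimate_wall_days(child_points, dt_seconds=720,
                               model_years=10, cost=6e-6, n_procs=4)
print(f"estimated wall-clock time: {wall_days:.1f} days")
```

The estimate comes out close to the 15 days quoted for the run; the parent grid and imperfect scaling would add to it slightly.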
PERFORMANCES: SCALABILITY
• Nymphea: 95% for 1-4 proc.
• SGI/CRAY Origin2000: 95% with saturation above 128 proc.
• Earth Simulator: 95-60% for 1-512 proc.
[Figure: scalability curves; legend: OMP opt. part., MPI (1 sub/proc), OMP (1 sub/proc)]
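Assuming the percentages on this slide are parallel efficiencies in the usual sense (serial time divided by processor count times parallel time), they convert to speedups as sketched below. Both function names are hypothetical and the timings in the first call are made up for illustration.

```python
def parallel_efficiency(t_serial, t_parallel, n_procs):
    """Efficiency = T(1) / (p * T(p)); 1.0 means perfect scaling."""
    return t_serial / (n_procs * t_parallel)


def speedup_from_efficiency(efficiency, n_procs):
    """Speedup relative to one processor implied by an efficiency."""
    return efficiency * n_procs


# Illustrative timings only: 100 s on 1 processor, 25 s on 4.
print(parallel_efficiency(100.0, 25.0, 4))

# Efficiencies quoted on the slide, converted to speedups.
print(speedup_from_efficiency(0.95, 4))     # Nymphea, 4 processors
print(speedup_from_efficiency(0.60, 512))   # Earth Simulator, 512 processors
```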
Partitioning
Senegal ideal case on Nymphea (P. Estrade)
• Domain: 150*500*40 with dt=480s
• Partitioning 1*1: cost = 7.5 × 10^-6
• Partitioning 1*64: cost = 6 × 10^-6
(units = CPU s / grid point / time step)
25% gain due to optimal cache use
(figure label: NSUB_E)
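To give a feel for why finer partitioning helps the cache, the sketch below splits the Senegal domain into tiles and compares the working set of one tile with the whole grid. The even split is a generic illustration, not the ROMS tiling routine, and it assumes that 150 is the first (NSUB_X) direction and 500 the second (NSUB_E) direction. The function name `tile_bounds` is hypothetical.

```python
def tile_bounds(n, nsub):
    """Split indices 0..n-1 into nsub contiguous, nearly equal ranges."""
    base, extra = divmod(n, nsub)
    bounds = []
    start = 0
    for k in range(nsub):
        size = base + (1 if k < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


nx, ny, nz = 150, 500, 40      # Senegal ideal case
nsub_x, nsub_e = 1, 64         # partitioning 1*64

x_tiles = tile_bounds(nx, nsub_x)
e_tiles = tile_bounds(ny, nsub_e)
largest_tile = (max(b - a for a, b in x_tiles)
                * max(b - a for a, b in e_tiles) * nz)
print("points in whole grid:", nx * ny * nz)
print("points in largest tile:", largest_tile)

# Speed gain implied by the two costs quoted on the slide.
gain = 7.5e-6 / 6e-6 - 1.0
print(f"gain from 1*1 to 1*64: {gain:.0%}")
```

Each tile holds only a small fraction of the grid, so its arrays are far more likely to stay in cache while a processor works through them.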
New Caledonia region on PC (J. Lefêvre)
• Domain: 159*171*20 with dt=480s
• 100% gain due to optimal cache use
[Figure: "Effect of partitioning", Tiki, Pentium III (dual processor); y-axis: seconds per iteration (0 to 30); x-axis: partition of the domain in latitude and longitude (NSUB_X)]
CONCLUSION
• ROMS is well optimized (code and methods)
and well suited to Nymphea, which makes it possible to
perform large runs in a reasonable time
without excessive queuing time
• The model is ready for faster, more
numerous processors (provided AGRIF is
fully tested with MPI)
• More storage would be welcome