CCOS Seasonal Modeling:
The Computing Environment
S.Tonse, N.J.Brown & R. Harley
Lawrence Berkeley National Laboratory
University Of California at Berkeley
Overview
• Modeling Approach
• Linux Cluster: OS & hardware
• Parallel computing & performance
• CMAQ flavours and features
Modeling Approach
MM5 model provides 3D gridded spatial/temporal wind
vectors, temperature fields, cloud cover etc. MM5 code
run by ETL group of NOAA (J.Wilczak et al)
CMAQ (Community Multiscale Air Quality) Model (US
EPA) incorporates meteorology results from MM5 with
emissions. Fortran 90. Runs on Linux platforms with
Pentiums or Athlons. 3D time-stepping grid simulation
includes atmospheric chemistry and advective and
diffusive transport. Provides concentration for each 4km
x 4km grid cell at every hour.
Modeling Approach
Meteorological Model (MM5) simulation of week-long ozone episode
Scenarios of Central Utility Generator emissions
Other emissions: motor vehicle, biogenic, major point sources, and area
Photochemical AQM (CMAQ): Ozone Transport, Chemistry, Sources and Sinks
SARMAP Grid inside CCOS Grid
CCOS: 190 x 190
SARMAP: 96 x 117
The Mariah Linux Cluster:
Hardware and OS
• Purchased with DOE funds
• Maintained by LBNL under the Scientific Cluster Support
  Program.
• 24 nodes, each has 2 Athlon processors, 2GB RAM, see
  eetd.lbl.gov/AQ/stonse/mariah/ for more pictures.
• CentOS Linux (similar to Red Hat Linux), run in a
  Beowulf-cluster configuration
Parallel Computing on
Cluster (1)
• Typical input file sizes for a 5-day run:
  Meteorological files: 2.7 GB, Emission file: 3 GB.
• Typical output file sizes: Hourly output for all (or
  selected) grid cells, for (1) Concentration file of
  selected species: ~2 GB (2) Process analysis
  files, i.e. analysis of the individual contributions
  from CMAQ’s several physical processes and the
  SAPRC99 mechanism’s numerous chemical
  reactions: 2-4 GB
Parallel Computing on
Cluster (2)
• Typically split the 96x117 grid 3 ways in each direction at
  beginning of run and use 9 processor elements (PE’s)
• Message Passing Interface (MPI) sends data between
  PE’s to handle the needs of a PE for data from a
  neighbouring sub-domain. MPI subroutine calls
  embedded in CMAQ code
• A 5-day run takes about 5 days with a stiff Gear solver
• Takes about 1 day with backward Euler solver (MEBI)
  hard-coded for calculations of the SAPRC99 mechanism
• Often we have to use the Gear solver. Can we use more
  PE’s to speed up?
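The 3 x 3 split described above can be sketched in a few lines of Python. This is only an illustration of a horizontal domain decomposition, not CMAQ's own partitioning code: the function names `split_extent` and `decompose` are hypothetical, and treating 96 as the column count and 117 as the row count is an assumption.

```python
def split_extent(n_cells, n_parts):
    """Split n_cells into n_parts contiguous (start, stop) ranges of near-equal size."""
    base, extra = divmod(n_cells, n_parts)
    ranges = []
    start = 0
    for p in range(n_parts):
        # The first `extra` parts take one additional cell when the split is uneven.
        size = base + (1 if p < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def decompose(n_cols, n_rows, parts_x, parts_y):
    """Return one (column_range, row_range) pair per processor element."""
    return [(cols, rows)
            for rows in split_extent(n_rows, parts_y)
            for cols in split_extent(n_cols, parts_x)]


# Assumption: 96 columns by 117 rows, split 3 ways in each direction.
subdomains = decompose(96, 117, 3, 3)
for pe, (cols, rows) in enumerate(subdomains):
    print(f"PE {pe}: columns {cols}, rows {rows}")
```

With these dimensions each of the 9 PE's receives a 32 x 39 block of cells, so the sub-domains are equal in size even though, as later slides note, they need not be equal in computational cost.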
Parallel Computing on
Cluster (3)
Performance of CMAQ as number of processors increases:

No. PE’s   Time (s)   Scalability   Effective no. PE’s
1          4500       100%          1
2          2431       92%           1.84
6          1019       74%           4.44
12         739        50%           6
24         539        35%           8.4
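The scalability figures can be recomputed from the timings alone. The sketch below assumes that "Scalability" means parallel efficiency (speedup divided by number of PE's) and that "Effective no. PE's" corresponds to the speedup relative to the 1-PE run; the function name `scaling` is hypothetical.

```python
# Wall-clock times in seconds, taken from the table above.
timings = {1: 4500, 2: 2431, 6: 1019, 12: 739, 24: 539}


def scaling(timings):
    """Return (n_pe, time, speedup, efficiency) rows relative to the 1-PE run."""
    t1 = timings[1]
    rows = []
    for n_pe, t in sorted(timings.items()):
        speedup = t1 / t             # how many times faster than one PE
        efficiency = speedup / n_pe  # fraction of ideal linear speedup
        rows.append((n_pe, t, speedup, efficiency))
    return rows


for n_pe, t, speedup, eff in scaling(timings):
    print(f"{n_pe:>3} PEs  {t:>5} s  speedup {speedup:5.2f}  efficiency {eff:4.0%}")
```

Under these assumptions the computed efficiencies agree with the tabulated percentages to within about one percentage point, which suggests the slide values were rounded by hand.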
Parallel Computing on
Cluster (4)
• Computational load is not balanced between PE’s, as
  geographical locations with more expensive chemistry
  use more CPU time.
• Code only runs as fast as PE with greatest load. Others
  must wait for it at “barriers”.
• As no. of PE’s increases, probability of larger
  discrepancies increases.
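The effect of a barrier on an unbalanced run can be illustrated with a toy calculation. The per-PE loads below are hypothetical numbers chosen for illustration, not measurements from CMAQ, and the function names are likewise made up for this sketch.

```python
def step_time(loads):
    """Wall-clock time of one step: every PE waits at the barrier for the slowest."""
    return max(loads)


def idle_time(loads):
    """Total CPU time that PEs spend waiting at the barrier."""
    slowest = max(loads)
    return sum(slowest - load for load in loads)


def balance_efficiency(loads):
    """Useful work divided by the CPU time actually occupied (1.0 = perfect balance)."""
    return sum(loads) / (len(loads) * max(loads))


# Hypothetical per-PE compute times in seconds; one sub-domain has costlier chemistry.
loads = [10.0, 10.0, 10.0, 16.0]

print("step time:", step_time(loads))
print("idle time:", idle_time(loads))
print("balance efficiency:", balance_efficiency(loads))
```

In this made-up case a single PE with 60% more work holds the other three at the barrier, so the step takes 16 s instead of the 11.5 s that a perfectly even split of the same work would need.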
Parallel Computing on
Cluster (5)
More PE’s → sub-domains decrease in size →
relatively more expense toward inter-processor
communication
Key parameter is the ratio of a sub-domain’s
Perimeter to its Area.
Cost of scientific calculation depends on Area,
which increases as N^2
Cost of communication depends on Perimeter,
which increases as N
(N = number of cells along one side of a sub-domain)
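A short calculation shows how the Perimeter/Area ratio grows as the grid is divided more finely. This sketch uses the 96 x 117 grid from the earlier slides and assumes the perimeter cell count is a fair proxy for communication cost; it ignores vertical layers and the fact that edges on the outer domain boundary need no inter-processor exchange. The function name `perimeter_to_area` is hypothetical.

```python
def perimeter_to_area(n_cols, n_rows):
    """Ratio of boundary length to cell count for a rectangular sub-domain."""
    perimeter = 2 * (n_cols + n_rows)  # proxy for communication cost
    area = n_cols * n_rows             # proxy for computation cost
    return perimeter / area


# Split the 96 x 117 grid the same number of ways in each direction.
for parts in (1, 2, 3, 4, 6):
    cols, rows = 96 / parts, 117 / parts
    ratio = perimeter_to_area(cols, rows)
    print(f"{parts} x {parts} split: sub-domain {cols:.1f} x {rows:.1f}, "
          f"perimeter/area = {ratio:.3f}")
```

Because perimeter scales as N and area as N^2, splitting k ways in each direction multiplies the ratio by k, so the 3 x 3 split has three times the relative communication burden of the undivided grid.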
CMAQ Flavours (1)
Sensitivity Analysis, via Decoupled Direct
Method CMAQ DDM4.3, for sensitivity of any output to:
•Emissions (from all or part of the grid)
•Boundary Conditions
•Initial Conditions
•Reaction rates (not implemented yet)
•Temperature (not implemented yet)
Process Analysis: Analyse internal goings-on during the
course of a run, e.g. change in O3 in a cell due to vertical
diffusion or to a particular chemical reaction.
Not all versions of CMAQ can do all of these things.
CMAQ Flavours (2)
CMAQ Version   Solver   Sensitivity   Process Analysis   Comments
4.3            GEAR     No            Yes                Slow
4.3            MEBI     No            No                 Fast. Inaccurate for minor species
DDM 4.3        MEBI     Yes           No                 Fast. Inaccurate for minor species
4.4            GEAR     No            Yes                Slow
4.4            EBI      No            No                 Fast
X.x            ???      Yes           Yes                Fast