MFE Simulation Data Management SLAC DMW 2004 March 16, 2004 W. W. Lee and S.


MFE Simulation Data Management SLAC DMW 2004

March 16, 2004 W. W. Lee and S. Klasky Princeton Plasma Physics Laboratory Princeton, NJ

Spatial & Temporal Scales Present Major Challenge to Theory & Simulations

Huge range of spatial and temporal scales.

Overlap in scales often means strong (simplified) ordering not possible

Different codes/theory for different scales.

5+ years: Integration of physics into Fusion Simulation Project

[Figure: ranges of spatial and temporal scales]

Spatial Scales (m), axis 10^-6 to 10^2: atomic mfp, skin depth, electron-ion mfp, system size, tearing length, ion gyroradius, Debye length, electron gyroradius

Temporal Scales (s), axis 10^-10 to 10^5: inverse ion plasma frequency, pulse length, current diffusion, inverse electron plasma frequency, confinement, ion gyroperiod, electron gyroperiod, ion collision, electron collision

Major Fusion Codes

Data Rates of Major Fusion Codes

Code        Data now/5yr (GB)   Runtime now/5yr (hr)   Processors now/5yr   Rate now/5yr (Mb/s)
GTC         4,000 / 100,000     300 / 150              2048                 80 / 1600
Gyro        10 / 100            30 / 30                512 / 2048           .8 / 8
GS2         10 / 100            30 / 30                512 / 2048           .8 / 8
Degas2      .1                  1                      10                   .2
Transp      .05                 3                      1                    .04
Nimrod      5 / 50              20 / 20                128                  .6 / 6
M3D         10 / 100            20 / 20                128                  1.1 / 11
NSTX        .25/shot; 1 / 4     0.25 * 40                                   9, 36
Total (TB)  4.3 / 101
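The data rates quoted for each code follow roughly from data volume divided by run time. A minimal sketch of that arithmetic, assuming decimal units (1 GB = 8000 Mb) and a constant output rate over the whole run; the function name is illustrative and not from the talk:

```python
def sustained_rate_mbps(volume_gb: float, runtime_hr: float) -> float:
    """Average output rate in Mb/s for a run writing volume_gb over runtime_hr."""
    megabits = volume_gb * 8000.0   # assumption: 1 GB = 1000 MB = 8000 Mb
    seconds = runtime_hr * 3600.0
    return megabits / seconds


# (code, data volume now in GB, runtime now in hr), as quoted above
rows = [("Gyro", 10, 30), ("Nimrod", 5, 20), ("M3D", 10, 20), ("Degas2", 0.1, 1)]
for name, gb, hr in rows:
    print(f"{name}: {sustained_rate_mbps(gb, hr):.2f} Mb/s")
```

For these rows the computed averages land close to the quoted rates (.8, .6, 1.1 and .2 Mb/s), which suggests the rate figures are run-averaged rather than peak values.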

Plasma Turbulence Simulation

• Gyrokinetic Particle-In-Cell Simulation -- Reduced Vlasov-Maxwell Equations

• Simulations on MPP Platforms -- Cray T3E & IBM SP (NERSC), Cray-X1 (ORNL), SX6 (Earth Simulator, Japan)

• Simulation of Burning Plasmas -- International Thermonuclear Experimental Reactor (ITER)

• Integrated Fusion Simulation Project (MFE)

• Visualization -- turbulence evolution & particle orbits

Gyrokinetic Approximation

• Gyromotion

• Polarization provides quasineutrality [W. W. Lee, PF ‘83; JCP ‘87]

[Figure; surviving labels: Earth Simulator, 18%, 10 (Ethier)]

Ion Temperature Gradient Driven Turbulence

[Movies: Electrostatic Potential; Particle Trajectories]

Data Management challenges

• GTC is producing TBs of data

– Data rates: 80 Mb/s now, 1.6 Gb/s in 5 years.

– Need QoS to stream data.

• This data needs to be post-processed

– Essential to parallelize the post-processing routines to handle our larger datasets.

– We need a cluster to post process this data.

• M (supercomputer processors) x N (cluster processors) problem.

• QoS becomes more important to sustain this post-processing.

• The post-processed data needs to be shared among collaborators

– Different sections of the post-processed data may go to different users.

– Post-processed data, along with other metadata, should be archived into a relational database.
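The M x N problem above amounts to working out which slice of the data held by each of the M simulation processors must go to each of the N post-processing processors. A minimal sketch, assuming a 1D array split into contiguous, nearly equal blocks on both sides; the function names and the block decomposition are illustrative assumptions, not GTC's actual data layout:

```python
def block_range(rank: int, nranks: int, length: int) -> tuple[int, int]:
    """Half-open index range [lo, hi) owned by rank under a block decomposition."""
    base, extra = divmod(length, nranks)
    lo = rank * base + min(rank, extra)
    hi = lo + base + (1 if rank < extra else 0)
    return lo, hi


def mxn_schedule(m: int, n: int, length: int) -> list[tuple[int, int, int, int]]:
    """Transfers (src_rank, dst_rank, lo, hi) that move every element exactly once."""
    transfers = []
    for src in range(m):
        s_lo, s_hi = block_range(src, m, length)
        for dst in range(n):
            d_lo, d_hi = block_range(dst, n, length)
            # The overlap of the two owned ranges is what src must send to dst.
            lo, hi = max(s_lo, d_lo), min(s_hi, d_hi)
            if lo < hi:
                transfers.append((src, dst, lo, hi))
    return transfers


# Hypothetical sizes: 4 simulation ranks, 3 cluster ranks, 12 elements.
for transfer in mxn_schedule(4, 3, 12):
    print(transfer)
```

Each sender talks only to the receivers whose ranges overlap its own, so the number of messages grows roughly like M + N rather than M x N for block layouts.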

Post processing of GTC Data.

• Particle Data – No compression possible.

– Sent to 1 cluster for visualization/analysis.

– Work being done with K. Ma, U.C. Davis: Visualize a million particles.

– Gain new insights into the theory.

• Field Data

– Geometric/Temporal compression of the data is possible.

– Data needs to be streamed to a local cluster at PPPL.

– Reduced subset needs to be sent to PPPL + collaborators.

• Use Logistical Networking. [Beck, UT-K]

• Data transfer needs to be automatic, and integrated into a dataflow/webflow for use with parallel analysis routines.

– We desire to see post-processed data during the simulation.
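The talk does not say which geometric or temporal compression scheme is intended for the field data. As a stand-in, the sketch below uses plain subsampling (keep every k-th grid point and every j-th time step) to show how a reduced subset could be cut from the full output before it is sent on. The function name, strides and sample values are assumptions made for illustration only:

```python
def reduce_field(frames, space_stride=2, time_stride=2):
    """Keep every time_stride-th frame and every space_stride-th sample in each frame."""
    return [frame[::space_stride] for frame in frames[::time_stride]]


# Placeholder data: 6 time steps of an 8-point 1D field (not real simulation output).
frames = [[float(t * 10 + i) for i in range(8)] for t in range(6)]
reduced = reduce_field(frames, space_stride=4, time_stride=3)
print(len(frames) * len(frames[0]), "values reduced to", len(reduced) * len(reduced[0]))
```

Subsampling is lossy and simple; a real pipeline would more likely choose the reduction based on which scales the analysis needs to resolve.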

After the analysis

• Post-processed data needs to be saved into a relational database

– How do we query this abstract data to compare it with experiments?

– 3D correlation functions

– Processing of TBs of data/run now, hundreds of TBs of data/run in 5 years.

– Data mining techniques will be necessary to understand this data.
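One way to picture saving post-processed data into a relational database and querying it against experiments is a single table holding derived quantities from both sources. A minimal sketch using Python's built-in sqlite3 module; the table name, column names, run labels and values are all hypothetical placeholders and do not come from the talk:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE diagnostics (
        run_id   TEXT,  -- hypothetical run or shot label
        source   TEXT,  -- 'simulation' or 'experiment'
        quantity TEXT,  -- name of the derived quantity
        time_s   REAL,
        value    REAL
    )
    """
)

# Made-up placeholder rows, only there to exercise the query.
rows = [
    ("run_a", "simulation", "heat_flux", 0.001, 1.0),
    ("run_a", "simulation", "heat_flux", 0.002, 3.0),
    ("shot_b", "experiment", "heat_flux", 0.001, 2.0),
]
conn.executemany("INSERT INTO diagnostics VALUES (?, ?, ?, ?, ?)", rows)


def mean_by_source(connection, quantity):
    """Average value of one quantity, grouped by simulation vs. experiment."""
    cursor = connection.execute(
        "SELECT source, AVG(value) FROM diagnostics "
        "WHERE quantity = ? GROUP BY source ORDER BY source",
        (quantity,),
    )
    return dict(cursor.fetchall())


print(mean_by_source(conn, "heat_flux"))
```

Keeping simulation and experimental results in the same schema lets one query compare them directly; bulk arrays such as 3D correlation functions would more plausibly live in files referenced from the table than in the rows themselves.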