Multiresolution Analysis


A Parallel Hierarchical Solver for the
Poisson Equation
http://wavelets.mit.edu/~darshan/18.337
Seung Lee
Department of Mechanical Engineering
[email protected]
R.Sudarshan
Department of Civil and Environmental Engineering
[email protected]
13th May 2003
Outline
• Introduction
– Recap of hierarchical basis methods
• Used for preconditioning and adaptivity
• Implementation
– Methodology
– Mesh subdivision
– Data Layout
– Assembly and Solution
• Results and Examples
– Comparison with finite element discretization
– Speed up with number of processors
• Conclusions and Further Work
Introduction
• Problem being considered: the Poisson equation, -∇²u = f on Ω, with u = 0 on ∂Ω
• Discretizing the operator using finite difference or finite element methods leads to stiffness matrices with large condition numbers
  – Need a good preconditioner (ILU, IC, circulant, domain decomposition)
  – Choosing such a preconditioner is “black magic”
• Why create an ill-conditioned matrix and then precondition it, when you can create a well-conditioned matrix to begin with!
• Instead of using a single-level nodal basis, use a multilevel (hierarchical) basis
Single Level Vs. Multilevel Approaches
[Figure: a single-level nodal basis, compared with a multilevel basis built from coarse-level functions plus finer-level details]
• The multilevel stiffness matrix is better conditioned than the single-level one
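The conditioning claim can be checked numerically on a 1D analogue. The project uses 2D bilinear elements; the sketch below instead uses piecewise-linear hat functions on a uniform 1D mesh (the function names are ours, for illustration), and compares the condition number of the nodal-basis stiffness matrix K with its hierarchical-basis counterpart SᵀKS:

```python
import numpy as np

def stiffness_1d(n):
    """Unscaled stiffness matrix of -u'' on n interior nodes of a uniform mesh."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def hb_synthesis(levels):
    """Matrix mapping hierarchical-basis coefficients to nodal values on a
    uniform 1D mesh with 2**levels elements (interior nodes only)."""
    N = 2 ** levels
    n = N - 1
    S = np.zeros((n, n))
    for j in range(n):
        c = np.zeros(N + 1)            # include the two boundary nodes (zero)
        c[j + 1] = 1.0
        h = N // 2
        while h >= 1:                  # coarse-to-fine interpolation sweep
            for i in range(h, N, 2 * h):
                c[i] += 0.5 * (c[i - h] + c[i + h])
            h //= 2
        S[:, j] = c[1:N]
    return S

levels = 6
K = stiffness_1d(2 ** levels - 1)                         # single-level basis
S = hb_synthesis(levels)
K_hb = S.T @ K @ S                                        # hierarchical basis
```

On this 1D model the hierarchical-basis matrix is far better conditioned; 1D is in fact a best case, since the levels decouple exactly there, while in 2D the condition number still grows, but only polylogarithmically in the mesh size.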
Formulation and Solution
• Discretize using piecewise bilinear hierarchical bases
• Solve the coarsest-level problem with PCG (and diagonal preconditioning); then determine the first-level details, the second-level details, etc.
• Big question: how well does this method parallelize?
  – The multiscale stiffness matrix is not as sparse
  – Are the savings over the single-level method really significant?
Implementation - I: The Big Picture
• Read input, subdivide the mesh, and distribute node and edge info to processors (done by the “Oracle”)
• For I = 1:N_levels (N_levels is known a priori), done in parallel:
  – Distribute elements
  – Assemble K
  – Solve the system of equations, using the solution from the previous mesh as a guess
  – Perform the inverse wavelet transform and consolidate the solution
• Once I > N_levels, we (hopefully) have the converged solution!
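The level loop above can be mimicked in serial on a 1D model problem (an illustration only; the actual code is 2D and parallel, and cg, solve_level, and interpolate are hypothetical helpers): solve the coarsest mesh, then repeatedly subdivide and re-solve, warm-starting each solve from the interpolated previous solution.

```python
import numpy as np

def cg(A, b, x0, tol=1e-10):
    """Plain conjugate gradients, warm-started from x0."""
    x, r = x0.copy(), b - A @ x0
    p = r.copy()
    for _ in range(10 * len(b)):       # safety cap on iterations
        if np.linalg.norm(r) <= tol:
            break
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x += alpha * p
        r_new = r - alpha * Ap
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return x

def solve_level(level, u_guess):
    """Solve -u'' = 1 on (0, 1), u(0) = u(1) = 0, with 2**level elements."""
    n, h = 2 ** level - 1, 1.0 / 2 ** level
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
    return cg(A, h * np.ones(n), u_guess)

def interpolate(u):
    """Transfer coarse nodal values to the next finer mesh."""
    v = np.zeros(2 * len(u) + 1)
    v[1::2] = u                        # old nodes keep their values
    w = np.concatenate(([0.0], u, [0.0]))
    v[0::2] = 0.5 * (w[:-1] + w[1:])   # new midpoints: average the neighbours
    return v

u = solve_level(1, np.zeros(1))        # coarsest mesh first
for level in range(2, 7):              # then refine, warm-starting each solve
    u = solve_level(level, interpolate(u))
```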
Implementation – II: Mesh Subdivision
• Level 0: 1 element; Level 1: 4 elements; Level 2: 16 elements
• Each parent element is subdivided into four child elements
• The number of elements and DoFs increases geometrically, and the solution converges with only a few subdivision levels
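The growth in these counts is easy to tabulate. For a single square subdivided uniformly, the short loop below reproduces the 1, 4, and 16 elements shown above, and the level-6 node count matches the 4225 DoFs reported later in the results:

```python
# Uniform quad subdivision of one square: each level quarters every element,
# so element counts grow 4x per level and nodes per side double.
for level in range(7):
    n_elems = 4 ** level
    n_dofs = (2 ** level + 1) ** 2     # bilinear nodes, boundary included
    print(level, n_elems, n_dofs)
```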
Implementation – III: Data Layout
• Degrees of freedom in the mesh are
distributed linearly
– Uses a naïve partitioning algorithm
– Each processor gets roughly
NDoF/NP dofs (the Update set)
– Each processor assembles the rows of
the stiffness matrix corresponding to
elements in its update set
• Each processor has info about all faces
connected to vertices in its update set and
all vertices connected to such faces
– Equivalent to “basis partitioning”
[Figure: square mesh split into four quadrants, I–IV, one per processor]
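A minimal sketch of such a naive linear partitioning (linear_partition is an illustrative name, not the project's routine):

```python
def linear_partition(n_dof, n_proc):
    """Give each processor a contiguous block of roughly n_dof / n_proc
    degrees of freedom: its Update set."""
    base, extra = divmod(n_dof, n_proc)
    sets, start = [], 0
    for p in range(n_proc):
        size = base + (1 if p < extra else 0)   # spread the remainder evenly
        sets.append(range(start, start + size))
        start += size
    return sets

update_sets = linear_partition(4225, 4)   # level-6 square mesh on 4 procs
```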
Implementation – IV: Assembly and Solution
• Stiffness matrices stored in the modified sparse row format
– Requires less storage than CSR or CSC formats
• Equations solved using Aztec
– Solves linear systems in parallel
– Comes with PCG with diagonal preconditioning
• Inverse wavelet transform (synthesis of the final solution)
– Implemented using Aztec as a parallel matrix-vector
multiply
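The storage saving of the modified sparse row (MSR) format comes from keeping the diagonal implicitly indexed: one value array holds the diagonal first, then the row-by-row off-diagonals, and a single integer array serves as both row pointers and column indices. A sketch of the layout (not Aztec's actual code, only the storage scheme it uses):

```python
import numpy as np

def dense_to_msr(A):
    """Pack a square matrix into MSR storage: val[0:n] is the diagonal,
    val[n] is unused, and row i's off-diagonal values and column indices
    sit in val[...] and bindx[...] between bindx[i] and bindx[i+1]."""
    n = A.shape[0]
    ptr, cols, offs = [n + 1] * (n + 1), [], []
    for i in range(n):
        for j in range(n):
            if i != j and A[i, j] != 0.0:
                cols.append(j)
                offs.append(A[i, j])
        ptr[i + 1] = n + 1 + len(cols)
    val = np.concatenate([A.diagonal(), [0.0], offs])
    bindx = np.array(ptr + cols)
    return val, bindx

def msr_matvec(val, bindx, x):
    """y = A x computed straight from MSR storage."""
    n = bindx[0] - 1
    y = val[:n] * x                            # diagonal part
    for i in range(n):
        for k in range(bindx[i], bindx[i + 1]):
            y[i] += val[k] * x[bindx[k]]       # off-diagonal part
    return y
```

Compared with CSR, which needs nnz values, nnz column indices, and n+1 row pointers, MSR needs nnz+1 values and nnz+1 integers: roughly n integers fewer, because diagonal positions carry no explicit index.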
Input File Format

A sample input file for the unit square (one element, four vertices, four edges):

7             Number of subdivisions
4             Number of vertices
4             Number of edges
1             Number of elements
 1.0  1.0  1  Coordinates (x, y) of the vertices, and boundary info
-1.0  1.0  1  (1 = constrained, 0 = free)
-1.0 -1.0  1
 1.0 -1.0  1
3 0 1         Edge definitions (vertex1, vertex2), and boundary info
1 0 1         (1 = constrained, 0 = free)
2 1 1
2 3 1
0 1 2 3       Element definition (edge1, edge2, edge3, edge4)

[Figure: the square domain with its vertices, edges, and element numbered]
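A reader for this layout can be sketched as follows (read_mesh is hypothetical, and the field order is taken from the annotations on this slide, so treat it as an assumption rather than the project's actual parser):

```python
def read_mesh(text):
    """Parse: header counts, then vertices (x, y, flag), edges (v1, v2, flag),
    and elements (four edge indices). Whitespace-delimited throughout."""
    tok = text.split()
    pos = 0
    def take(k):
        nonlocal pos
        out = tok[pos:pos + k]
        pos += k
        return out
    n_sub, n_vert, n_edge, n_elem = (int(t) for t in take(4))
    verts = [(float(x), float(y), int(b))
             for x, y, b in (take(3) for _ in range(n_vert))]
    edges = [tuple(int(t) for t in take(3)) for _ in range(n_edge)]
    elems = [tuple(int(t) for t in take(4)) for _ in range(n_elem)]
    return n_sub, verts, edges, elems

sample = """7  4  4  1
 1.0  1.0 1   -1.0  1.0 1   -1.0 -1.0 1    1.0 -1.0 1
3 0 1   1 0 1   2 1 1   2 3 1
0 1 2 3"""
n_sub, verts, edges, elems = read_mesh(sample)
```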
Results – I: Square Domain
[Figure: computed solutions on the level-1 and level-6 meshes]
[Plot: number of iterations vs. degrees of freedom, single-scale vs. hierarchical basis]
• Same order of convergence, but fewer iterations for larger problems
[Plot: solution time vs. number of processors for the square-domain solution]
• Coarsest mesh (level 1) – 9 DoFs, 1 iteration to solve, took 0.004 seconds on 4 procs
• Finest mesh (level 6) – 4225 DoFs, 43 iterations to solve, took 0.123 seconds on 4 procs
Results – II: “L” Domain
• Coarsest mesh (level 1) – 21 DoFs, 3 iterations to solve, took 0.0055 seconds on 4 procs
• Finest mesh (level 6) – 12545 DoFs, 94 iterations to solve, took 0.280 seconds on 4 procs
[Plot: number of iterations, single-level vs. hierarchical basis]
[Plot: solution time vs. number of processors]
Results – III: “MIT” Domain
• Coarsest mesh (level 1) – 132 DoFs, 15 iterations to solve, took 0.012 seconds on 4 procs
• Finest mesh (level 6) – 91520 DoFs, 219 iterations to solve, took 4.77 seconds on 8 procs
[Plot: number of iterations, single-level vs. hierarchical basis; the single-level method did not converge after 500 iterations]
[Plot: solution time vs. number of processors]
Conclusions and Further Work
• Hierarchical basis method parallelizes well
– Provides a cheap and effective parallel preconditioner
– Scales well with number of processors
• With the right libraries, parallelism is easy!
– With Aztec, much of the work involved writing (and
debugging!) the mesh subdivision, mesh partitioning
and the matrix assembly routines
• Further work
– Parallelize parts of the oracle (e.g. mesh subdivision)
– Adaptive subdivision based on error estimation
– More efficient geometry partitioning
– More general element types (right now, we are
restricted to rectangular four-node elements)
The End
(Galerkin) Weak Form
• Formally: find u ∈ H¹₀(Ω) such that ∫_Ω ∇u · ∇v dΩ = ∫_Ω f v dΩ for all v ∈ H¹₀(Ω)
• Leads to a multilevel system of equations:
      [ K_cc  K_cf ] [ u_c ]   [ f_c ]
      [ K_fc  K_ff ] [ u_f ] = [ f_f ]
  – K_cc: coarse <=> coarse interactions
  – K_cf, K_fc: coarse <=> fine interactions
  – K_ff: fine <=> fine interactions
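The block structure can be checked numerically on a 1D, two-level analogue (an illustration with our own variable names, not the project's 2D code). A useful sanity check is the Galerkin property: the coarse-coarse block of the hierarchical stiffness matrix equals the stiffness matrix assembled on the coarse mesh alone.

```python
import numpy as np

def stiffness(n, h):
    """1D stiffness matrix of -u'' on n interior nodes with spacing h."""
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h

m = 8                                  # coarse mesh: m elements; fine: 2m
h, H = 1.0 / (2 * m), 1.0 / m
n = 2 * m - 1                          # interior nodes of the fine mesh
K_fine = stiffness(n, h)

# Two-level hierarchical change of basis: the first m-1 columns interpolate
# the coarse hat functions onto the fine mesh; the last m columns are the
# fine-only (midpoint) nodal functions.
S = np.zeros((n, n))
for I in range(m - 1):
    S[2 * I + 1, I] = 1.0              # coarse node sits at fine node 2I+1
    S[2 * I, I] = 0.5
    S[2 * I + 2, I] = 0.5
for J in range(m):
    S[2 * J, m - 1 + J] = 1.0

K_hb = S.T @ K_fine @ S
K_cc = K_hb[:m - 1, :m - 1]            # coarse <=> coarse block
K_cf = K_hb[:m - 1, m - 1:]            # coarse <=> fine block
```

Here K_cc equals stiffness(m-1, H) exactly, and K_cf happens to vanish because 1D hierarchical hats are energy-orthogonal across levels; in 2D the coupling blocks are nonzero, which is exactly why the multiscale stiffness matrix is less sparse.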