4.2.1 Dune and PETSc
The solver was implemented in the Distributed and Unified Numerics Environment (Dune), which is a grid-based C++ toolbox for solving partial differential equations. Dune includes the discretisation module Dune-Fem, which allows implementations of finite element solvers for parallel computers. It provides functions to implement local grid adaptivity, dynamic load balancing and higher order discretisation schemes (Dedner et al., 2010). Apart from native implementations of conjugate gradient solvers, it also provides interfaces to the solvers and preconditioners of the Dune-Istl module, to UMFPACK (Davis, 2004) for unsymmetric problems and to PETSc (Balay et al., 2015), which has an extensive collection of solvers and preconditioners. Dune-Fem supports two types of parallelism, MPI and pthread. Dune is licensed under the GNU General Public Licence version 2.0 and is thus free to use for everyone.
Dune-Fem was chosen because it is a fast, template-based and thus versatile C++ library that allowed the implementation of the complete electrode model (CEM), which has an uncommon weak formulation and is thus not easily implementable in most finite element libraries. Furthermore, Dune-Fem provides an interface to many different preconditioner and solver libraries and supports tetrahedral elements. The module is still in development and was flexibly adjusted to the requirements of this project.
PETSc is a C library providing data structures and routines for the solution of systems obtained from discretisation of partial differential equations. Its focus is on providing scalable parallel routines supporting MPI, pthread and GPU parallelism. Dune interfaces to PETSc in order to use the linear system solvers included in PETSc:
• Krylov methods: PETSc has more than a dozen Krylov solvers available, such as CG, Bi-CG, Bi-CG-stab and GMRES.
• Preconditioners: Many preconditioners are provided, some from external packages. For non-parallel applications there are implementations of incomplete LU decomposition and successive over-relaxation. Among the preconditioners that support parallelism, the most interesting are the 'classic' algebraic multigrid from Hypre (Falgout, 2015) and the smoothed aggregation AMG from Trilinos (Gee et al., 2006).
• Direct solvers: Some direct solvers are available, e.g. Mumps (Amestoy et al., 2006).
4.2.2 Complete Electrode Model and Jacobian Matrix
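The derivation of the weak formulation of the CEM, which was used in Peits, is given in section 2.2.4. It does not directly solve for $(u, U) \in (H^1(\Omega) \oplus \mathbb{R}^M)/\mathbb{R}$, but instead first solves for the internal potential $u$ and then computes the electrode potentials $U_m$ in a second step
$$\int_\Omega \sigma \nabla v \cdot \nabla u + \sum_{m=1}^{M} \frac{1}{z_m} \int_{\Gamma_m} v u - \sum_{m=1}^{M} \frac{1}{z_m |\Gamma_m|} \int_{\Gamma_m} v \int_{\Gamma_m} u = \sum_{m=1}^{M} \frac{1}{|\Gamma_m|} \int_{\Gamma_m} v I_m. \tag{4.1}$$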
One important observation in this weak formulation is the uncommon third term of type $\int_{\Gamma_m} u \int_{\Gamma_m} v$ [...] between processes, it has to be ensured that each electrode is not split onto different partitions (see Section 4.3.1 for the implementation).
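To illustrate why the product of two electrode integrals couples all degrees of freedom on an electrode, the following is a minimal sketch, not the Peits implementation. It assumes that the potential is expanded in nodal basis functions $\varphi_i$, so that the term becomes a rank-one update with the vector $b_i = \int_{\Gamma_m} \varphi_i$. The names `DenseMatrix` and `addElectrodeCouplingTerm` are hypothetical, and a dense matrix is used only to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Small dense row-major matrix; a real solver would use a sparse format.
struct DenseMatrix {
    std::size_t n;
    std::vector<double> a;
    explicit DenseMatrix(std::size_t size) : n(size), a(size * size, 0.0) {}
    double& operator()(std::size_t i, std::size_t j) { return a[i * n + j]; }
    double operator()(std::size_t i, std::size_t j) const { return a[i * n + j]; }
};

// Adds -(1 / (z * area)) * b_i * b_j to A for one electrode.
// b[i] is the integral of basis function i over the electrode surface
// (zero for every node that does not touch the electrode),
// z is the contact impedance and area the electrode surface area.
void addElectrodeCouplingTerm(DenseMatrix& A, const std::vector<double>& b,
                              double z, double area)
{
    assert(b.size() == A.n);
    assert(z > 0.0 && area > 0.0);
    const double scale = 1.0 / (z * area);
    for (std::size_t i = 0; i < A.n; ++i) {
        if (b[i] == 0.0) continue;      // node i is not on the electrode
        for (std::size_t j = 0; j < A.n; ++j) {
            if (b[j] == 0.0) continue;  // node j is not on the electrode
            A(i, j) -= scale * b[i] * b[j];
        }
    }
}
```

Every pair of electrode nodes receives an entry, even if the two nodes share no element. If the nodes of one electrode were owned by different processes, assembling these entries would require the surface integrals of the other processes.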
Different grounding conditions can be applied. One option would be to set the average surface potential to zero (2.10c), but for ease of implementation it was decided to set one surface node to 0 V by applying a Dirichlet boundary condition. It is important to note that the traditional CEM grounding condition $\sum_{m=1}^{M} U_m = 0$ is not easily implemented here, since the electrode potentials are only computed in a later step.
Once the forward solutions are computed, most EIT inversion algorithms require the Jacobian matrix, which translates a change in conductivity to a change in measured voltages by linearisation at the simulated conductivity distribution. The lead (or adjoint) fields method (section 2.2.7) was used to compute the Jacobian
$$\delta V_{da} = -\int_\Omega \delta\sigma \, \nabla u(I_d) \cdot \nabla u(I_a) \, d\Omega, \tag{4.2}$$
where $u(I_d) \in H^1(\Omega)$ is the electric potential emerging when the drive current $I_d$ is applied to the electrodes and $u(I_a) \in H^1(\Omega)$ the electric potential when a unit current is applied to the two measurement electrodes. $\delta V_{da} \in \mathbb{R}$ is then the linearly approximated voltage change between the two measurement electrodes when the conductivity changes by $\delta\sigma \in L^\infty(\Omega)$.
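As an illustration of how the adjoint field formula can be evaluated numerically, the following sketch computes one Jacobian row. It assumes piecewise linear potentials on tetrahedra, so that each gradient is constant per element, and a piecewise constant conductivity change, which turns the integral into a sum over elements. The function names `jacobianRow` and `voltageChange` and the input layout are hypothetical and not taken from Peits.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Vec3 = std::array<double, 3>;

// Row entry for element k: -|T_k| * grad u(I_d) . grad u(I_a),
// where both gradients are constant on the element.
std::vector<double> jacobianRow(const std::vector<Vec3>& gradDrive,
                                const std::vector<Vec3>& gradMeasure,
                                const std::vector<double>& volume)
{
    assert(gradDrive.size() == volume.size());
    assert(gradMeasure.size() == volume.size());
    std::vector<double> row(volume.size());
    for (std::size_t k = 0; k < volume.size(); ++k) {
        const double dot = gradDrive[k][0] * gradMeasure[k][0]
                         + gradDrive[k][1] * gradMeasure[k][1]
                         + gradDrive[k][2] * gradMeasure[k][2];
        row[k] = -volume[k] * dot;
    }
    return row;
}

// Linearised voltage change: sum over elements of row[k] * deltaSigma[k].
double voltageChange(const std::vector<double>& row,
                     const std::vector<double>& deltaSigma)
{
    assert(row.size() == deltaSigma.size());
    double dV = 0.0;
    for (std::size_t k = 0; k < row.size(); ++k) {
        dV += row[k] * deltaSigma[k];
    }
    return dV;
}
```

Only two forward solutions enter each row, one for the drive current and one for the measurement current, so no additional linear systems have to be solved for the Jacobian.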
4.2.3 Methods
Unless otherwise noted, all run times were measured on a head mesh with different conductivities for the scalp, skull, cerebro-spinal fluid (CSF), white matter, grey matter and superior sagittal sinus (figure 4.1). The meshes were created from a CT and an MRI scan of the same patient's head as described in chapter 3. The more elements are part of an electrode, the slower the assembly of the system matrix becomes. To ensure that the results presented here can be compared to each other, the ratio of electrode elements to other elements was kept fixed by having a constant element size throughout the mesh. For real applications it is advantageous to refine the mesh around the electrodes and use larger elements towards the centre of the head.
To measure the parallel scalability of the code, the efficiency was calculated as follows for $p$ parallel processes
$$\text{strong scaling efficiency}(p) = \frac{\text{runtime}(1)}{\text{runtime}(p) \cdot p}. \tag{4.3}$$
This efficiency is a measure of the strong scaling. The weak scaling should show how many more elements can be computed in the same time by using more processors. The following definition for the efficiency of the weak scaling was used, where $x$ is a fixed number of elements and $\text{runtime}(px)$ means the time it takes to compute $p \cdot x$ elements on $p$ processors
$$\text{weak scaling efficiency}(p) = \frac{\text{runtime}(x)}{\text{runtime}(px)}.$$
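The two efficiency measures can be written as small helper functions. This is a sketch with hypothetical function names; the weak scaling function follows the verbal definition above, that is, the run time for $x$ elements on one process divided by the run time for $p \cdot x$ elements on $p$ processes.

```cpp
#include <cassert>

// Strong scaling: fixed problem size, p processes.
// A value of 1 means the run time drops by exactly the factor p.
double strongScalingEfficiency(double runtime1, double runtimeP, int p)
{
    assert(runtime1 > 0.0 && runtimeP > 0.0 && p > 0);
    return runtime1 / (runtimeP * p);
}

// Weak scaling: problem size grows with the number of processes.
// runtimeX:  time for x elements on one process
// runtimePX: time for p*x elements on p processes
// A value of 1 means the run time stays constant.
double weakScalingEfficiency(double runtimeX, double runtimePX)
{
    assert(runtimeX > 0.0 && runtimePX > 0.0);
    return runtimeX / runtimePX;
}
```

For example, a run that takes 100 s on one process and 50 s on four processes has a strong scaling efficiency of 0.5.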
Figure 4.1: Layered cut through a 5 million element head mesh. The mesh was created with CGAL from a segmentation of a CT and an MRI scan of the same person and contains the following tissues: scalp, skull, cerebro-spinal fluid, grey and white matter, and also parts of the superior sagittal sinus and air cavities, which are not visible in this image.
All run times were measured on a cluster with 5 nodes. Each node had two 6-core 2.40 GHz Intel Xeon processors with 12 MB cache and a total of 192 GB of memory. The nodes were connected by a dedicated 1 GB Ethernet switch. PETSc version 3.4.2 and Zoltan version 3.6 were used.
4.2.4 Overall Structure of Peits
The code is structured in different files that contain classes, structs and functions for specific tasks. The main file is dune_peits.cc, which performs the following important steps in this order:
• Loading the mesh and partitioning it. If the mesh was already partitioned before, those partitions are loaded by the parallel processes directly.
• The electrode positions are loaded into a struct. This struct has query functions that evaluate if a given element belongs to an electrode.
• The current protocol is disassembled into unique injections. The solution for each unique current injection is computed just once. This reduces the number of required forward solutions for a standard EIT protocol with around 1000 lines to around 60.
• The system matrix is assembled. The function which computes the system matrix entries is located in the file elliptic.hh.
• In a for-loop the following steps are performed for each unique current injection:
  – The right-hand side of the weak formulation is assembled using a function in rhs.hh.
  – The CG solver computes the resulting electric potentials and the result is stored in a vector.
  – If specified in the parameter file, the first solution is written to a VTK file for visual inspection.
• In a second for-loop the following steps are performed for each line in the current protocol:
  – Trace back which forward solutions correspond to the drive current and measurement current of this protocol line.
  – Compute the measured voltage and save it to a binary file.
  – Compute the row of the Jacobian matrix using the forward solutions for the drive and measurement current and save it to a binary file.
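The reduction of protocol lines to unique injections, and the later trace-back from a protocol line to its forward solutions, can be sketched as follows. This is an illustration under assumptions, not the code in dune_peits.cc: each protocol line is assumed to consist of an ordered drive electrode pair and an ordered measurement electrode pair, reversed pairs are treated as distinct, and the names `ProtocolLine`, `InjectionPlan` and `disassemble` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

using ElectrodePair = std::pair<int, int>;  // (source, sink)

struct ProtocolLine {
    ElectrodePair drive;    // electrodes carrying the drive current
    ElectrodePair measure;  // electrodes between which the voltage is measured
};

struct InjectionPlan {
    std::vector<ElectrodePair> unique;      // one forward solve per entry
    std::vector<std::size_t> driveIndex;    // per line: index into unique
    std::vector<std::size_t> measureIndex;  // per line: index into unique
};

// Collects every distinct electrode pair once and records, for each
// protocol line, which forward solutions it needs.
InjectionPlan disassemble(const std::vector<ProtocolLine>& protocol)
{
    InjectionPlan plan;
    std::map<ElectrodePair, std::size_t> seen;
    auto indexOf = [&](const ElectrodePair& pair) -> std::size_t {
        auto it = seen.find(pair);
        if (it != seen.end()) return it->second;  // already scheduled
        const std::size_t index = plan.unique.size();
        seen.emplace(pair, index);
        plan.unique.push_back(pair);
        return index;
    };
    for (const ProtocolLine& line : protocol) {
        plan.driveIndex.push_back(indexOf(line.drive));
        plan.measureIndex.push_back(indexOf(line.measure));
    }
    assert(plan.driveIndex.size() == protocol.size());
    return plan;
}
```

The first loop of the main file would then run over `plan.unique`, and the second loop would use `driveIndex` and `measureIndex` to look up the two stored solutions for each protocol line.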