A Comparison of Preconditioning Techniques for Parallelized PCG Solvers for the CCFD Problem
Main Authors: | Naff, Richard; Wilson, John |
---|---|
Format: | Proceeding |
Published: | 2006 |
DOI: | 10.4122/1.1000000386 |
Online Access: | https://zenodo.org/record/3535612 |

Abstract:
Parallel algorithms for solving the sparse symmetric matrix systems that can result
from the cell-centered finite difference (CCFD) scheme are compared. Parallelization
is based on partitioning the mass matrix so that each partition is controlled by a
separate process. These processes may then be distributed among a networked cluster
of processors using the standard Message-Passing Interface (MPI), which allows
multiple simultaneous processes to coordinate and exchange information for a common
purpose. In this study, the mass matrix is partitioned by decomposing the CCFD domain
into non-overlapping subdomains, each subdomain corresponding to a partition of the
mass matrix. The subdomain partitions are numbered alternately, using a red/black
numbering scheme, and are linked by the CCFD coefficients corresponding to cell edges
that coincide with subdomain boundaries internal to the domain. A major portion of
this work examines the best way to handle connectivity information between
partitions; a sketch of this partitioning follows.
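The abstract includes no code; the following is a minimal sketch of the red/black subdomain numbering and the exchange of boundary coefficients between adjacent partitions, assuming a rectangular px-by-py grid of subdomains with one MPI process per subdomain (px, py, and boundary_coeffs are illustrative names, not from the paper):

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

px, py = 2, 2                    # assumed process-grid dimensions
ix, iy = rank % px, rank // px   # this subdomain's position in the grid
color = (ix + iy) % 2            # checkerboard red/black numbering
print(f"rank {rank}: {'red' if color == 0 else 'black'} partition")

def neighbor(jx, jy):
    """Rank of an adjacent subdomain, or None at the outer domain boundary."""
    return jy * px + jx if 0 <= jx < px and 0 <= jy < py else None

neighbors = [n for n in (neighbor(ix - 1, iy), neighbor(ix + 1, iy),
                         neighbor(ix, iy - 1), neighbor(ix, iy + 1))
             if n is not None]

# Exchange the CCFD coefficients on shared cell edges; the placeholder
# values stand in for the coefficients linking adjacent partitions.
boundary_coeffs = {n: [1.0, 1.0] for n in neighbors}
received = {n: comm.sendrecv(boundary_coeffs[n], dest=n, source=n)
            for n in neighbors}
```

Run with, e.g., `mpiexec -n 4 python partition_sketch.py`; each pairwise sendrecv is deadlock-free because both sides post the send and the receive together.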
Algorithms considered in this study are based on the preconditioned conjugate
gradient (PCG) scheme and differ only in the preconditioning used. Parallelizing the
PCG solver entails running essentially identical conjugate-gradient loops on separate
processes, one per subdomain partition. These loops must exchange information
globally to calculate inner products and locally to share connectivity information
between partitions, as the sketch below illustrates.
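As an illustration of that structure, here is a hedged sketch of the per-process PCG loop; A_local, M_solve, and exchange_halo are assumed stand-ins for the paper's partition matvec, preconditioner application, and connectivity exchange, not the authors' actual routines:

```python
import numpy as np
from mpi4py import MPI

def parallel_pcg(A_local, b_local, M_solve, exchange_halo, comm,
                 tol=1e-8, maxit=500):
    """Identical loop on every process; only dot() and exchange_halo()
    communicate (globally and with neighbors, respectively)."""
    def dot(u, v):
        # local partial inner product, completed by a global reduction
        return comm.allreduce(float(u @ v), op=MPI.SUM)

    x = np.zeros_like(b_local)
    r = b_local.copy()
    z = M_solve(r)
    p = z.copy()
    rz = dot(r, z)
    for _ in range(maxit):
        Ap = A_local @ p + exchange_halo(p)   # local matvec + neighbor terms
        alpha = rz / dot(p, Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.sqrt(dot(r, r)) < tol:
            break
        z = M_solve(r)
        rz_new = dot(r, z)
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x
```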
The classic incomplete Cholesky preconditioner with zero fill, IC(0), is used as the
standard for comparison. Another preconditioning scheme considered is based
principally on an approximate block Gaussian (BG) iterative solution of the problem.
In both preconditioners, arrays containing connectivity information are passed
between processes corresponding to adjacent partitions. Because of this need to pass
connectivity information, the incomplete Cholesky preconditioner is limited, in
practice, to zero fill.
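For reference, a dense-storage sketch of the zero-fill factorization and its application (not the authors' code; a production solver would use a sparse format and guard against breakdown):

```python
import numpy as np
from scipy.linalg import solve_triangular

def ic0(A):
    """Zero-fill incomplete Cholesky: A ~= L @ L.T, where L keeps only the
    sparsity pattern of the lower triangle of A (no fill-in is created)."""
    n = A.shape[0]
    L = np.tril(A).astype(float)
    for k in range(n):
        L[k, k] = np.sqrt(L[k, k])        # assumes no breakdown (SPD input)
        col = L[k + 1:, k]
        col[col != 0.0] /= L[k, k]
        for j in range(k + 1, n):
            mask = L[j:, j] != 0.0        # update existing entries only
            L[j:, j][mask] -= L[j:, k][mask] * L[j, k]
    return L

def ic0_solve(L, r):
    """Apply the preconditioner z = (L L^T)^{-1} r via two triangular solves."""
    y = solve_triangular(L, r, lower=True)
    return solve_triangular(L.T, y, lower=False)
```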
This limitation can be partially alleviated by using a BG iteration as the
preconditioner, which requires approximate solves of the individual matrix
partitions. These approximate solves can be carried out with any number of
preconditioners, including IC(0); however, the BG iteration introduces an additional
level of approximation, with the result that BG iteration with approximate IC(0)
solves is less efficient than simple IC(0) preconditioning in the parallel PCG
algorithm.
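The abstract does not spell out the BG iteration; one plausible form, assuming the red/black splitting yields a two-by-two block system with diagonal blocks A_rr and A_bb and coupling block A_br (illustrative names), is a sweep that solves the red partitions first and then the black partitions using the just-updated red values. For CG, a symmetric forward/backward sweep would be used in practice so the preconditioner remains symmetric positive definite.

```python
import numpy as np

def bg_sweep(A_br, solve_red, solve_black, r_red, r_black):
    """One red/black block Gaussian sweep as a preconditioner application;
    solve_red and solve_black are approximate per-partition solves, e.g.
    built from ic0()/ic0_solve() above."""
    z_red = solve_red(r_red)                       # approximate red solves
    z_black = solve_black(r_black - A_br @ z_red)  # black sees updated red
    return z_red, z_black
```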
Preconditioners based on the Jacobi scheme are also considered, as they require no
connectivity information; they trade lower communication costs for increased work to
obtain convergence (see the sketch below).
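Such a Jacobi preconditioner reduces to a diagonal scaling built entirely from locally stored entries, for example:

```python
import numpy as np

def jacobi_solve(A_local, r):
    """z = D^{-1} r with D = diag(A_local); no neighbor communication."""
    return r / np.diag(A_local)
```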
We are presently exploring variants of these methods for use as preconditioners in
the parallelized conjugate gradient scheme and expect to report the results of this
work.

Presenter: Naff, Richard (U.S. Geological Survey)