preconditioned conjugate gradient

data:The data structure with all necessary CG vectors preallocated
max_iter:The maximum number of iterations to perform, even if tolerance is not met.
normr:The 2-norm of the residual vector after the last iteration.
normr0:The 2-norm of the residual vector before the first iteration.
times:The 7-element vector of the timing information accumulated during all of the iterations.
doPreconditioning:The flag to indicate whether the preconditioner should be invoked at each iteration.

ComputeWAXPBY

compute the update of a vector with the sum of two
scaled vectors where: w = alpha x + beta y

参数形式（n，alpha，x，beta，y，w）

对于alpha或者beta等于1的情况，可以特殊考虑

#ifndef HPCG_NO_OPENMP
   #pragma omp parallel for
#endif
   for (local_int_t i=0; i<n; i++)
    wv[i] = alpha * xv[i] + beta * yv[i];
 }

ComputeDotProduct

time_allreduce：the time it took to perform the communication between processes通信所花费时间

参数形式（n，x，y，result，time_allreduce)

若x，y相等，特殊处理
利用openMP中的reduction
#pragma omp parallel for reduction (+:local_result)
并且利用MPI’s reduce function来collect all partial sums
MPI_Allreduce(&local_result, &global_result, 1, MPI_DOUBLE, MPI_SUM,MPI_COMM_WORLD);

迭代过程

首先计算r0=b-Ax0

k=1； k<=max_iter && normr/normr0 > tolerance; k++

preconditioning:
ComputeMG_ref(A, r, z)
Z <- M^-1 * r

no preconditioning:
copy r to z

if(k==1)
p=z
rtz = r * z (s)

else
older rtz = rtz  
new rtz = r * z
beta = new rtz / older rtz
p = beta * p + z

SPMV:Ap = A p
Dot: pAp = p Ap
alpha = rtz/ pAp
WAXPBY: x = x + alpha p
WAXPBY: r = r - alpha Ap
Dot: normr = sqrt(r * r)

break条件：normr/normr0 <= tolerance || k > max_iter

time record

times[1] += t1; // dot product time
times[2] += t2; // WAXPBY time
times[3] += t3; // SPMV time
times[4] += t4; // AllReduce time
times[5] += t5; // preconditioner apply time
times[6] += t6; // exchange halo time

preconditioned conjugate gradient

ComputeWAXPBY

ComputeDotProduct

迭代过程

time record

近期文章

近期评论

标签

热门

文章归档

分类目录

功能