- data:The data structure with all necessary CG vectors preallocated
- max_iter:The maximum number of iterations to perform, even if tolerance is not met.
- normr:The 2-norm of the residual vector after the last iteration.
- normr0:The 2-norm of the residual vector before the first iteration.
- times:The 7-element vector of the timing information accumulated during all of the iterations.
- doPreconditioning:The flag to indicate whether the preconditioner should be invoked at each iteration.
ComputeWAXPBY
compute the update of a vector with the sum of two
scaled vectors where: w = alpha x + beta y
参数形式(n,alpha,x,beta,y,w)
对于alpha或者beta等于1的情况,可以特殊考虑
1 |
#ifndef HPCG_NO_OPENMP |
ComputeDotProduct
time_allreduce:the time it took to perform the communication between processes通信所花费时间
参数形式(n,x,y,result,time_allreduce)
若x,y相等,特殊处理
利用openMP中的reduction
#pragma omp parallel for reduction (+:local_result)
并且利用MPI’s reduce function来collect all partial sumsMPI_Allreduce(&local_result, &global_result, 1, MPI_DOUBLE, MPI_SUM,MPI_COMM_WORLD);
迭代过程
首先计算r0=b-Ax0
k=1; k<=max_iter && normr/normr0 > tolerance; k++
preconditioning:
ComputeMG_ref(A, r, z)
Z <- M^-1 * r
no preconditioning:
copy r to z
1 |
if(k==1) |
SPMV:Ap = A p
Dot: pAp = p Ap
alpha = rtz/ pAp
WAXPBY: x = x + alpha p
WAXPBY: r = r - alpha Ap
Dot: normr = sqrt(r * r)
break条件:normr/normr0 <= tolerance || k > max_iter
time record
- times[1] += t1; // dot product time
- times[2] += t2; // WAXPBY time
- times[3] += t3; // SPMV time
- times[4] += t4; // AllReduce time
- times[5] += t5; // preconditioner apply time
- times[6] += t6; // exchange halo time
近期评论