Commit 9ca299e0 authored by Alexis SALZMAN's avatar Alexis SALZMAN

[xLinAlg] add reducing mechanism to mumps distributed interface

When using the mumps distributed interface with a large number of cores,
if the size of the problem is small, poor performance is obtained with
the mumps library. Presenting the matrix in distributed format prevents
some internal optimizations (available with the centralized matrix
format), and mumps then has difficulty, among other things, estimating
its memory needs. This leads to an underestimated memory evaluation
during the analysis phase, which stops the computation during
factorization with a -9 error. Even if this memory issue is bypassed,
CPU time increases with the number of cores when more cores than needed
are used.

In this commit an extra parameter, ratio_reduce_comm_, is added to the
connectMatrix method of the xLinearSystemSolverMumpsDistributed class.
It corresponds, somehow, to an "ideal" ratio between the number of cores
to use and a given problem size "n": mx = ratio_reduce_comm * n gives a
rough estimate of the maximum number of cores needed by the mumps
computation for a linear system of size "n". If the communicator given
to the mumps interface is larger than this mx estimate, only mx cores
will be used with mumps. In this case the interface allocates its own
memory to store matrix terms, and the user must call reduceMatrices()
whenever he updates values in the matrix storage connected to the
interface, because the interface groups matrix terms on the reduced set
of cores participating in the mumps computation. Thus some communication
must be done every time terms change in the connected matrix. This
somewhat breaks the nice unique memory space shared between the
connected matrix and mumps.

When the communicator given to the mumps interface is smaller than this
mx estimate, the interface behaves as before. The same holds if
ratio_reduce_comm_ is null.

This solution represents an intermediate between the centralized and
fully distributed matrix formats for mumps. If the ratio leads to mx=1,
an even cleaner implementation would be to switch to the centralized
matrix format for mumps. TODO.

For now memory consumption is not optimized, as regrouped terms on the
processes that hold the matrix may be duplicated (i.e. a term (i,j) may
appear many times due to its presence on many processes). This is not a
problem for mumps, as it will sum them, but it costs memory. Reducing
this consumption is not easy to do: the communication buffers have to
transfer all those terms, so their size may be large anyway. TODO.
parent b65dc7d7
@@ -142,13 +142,11 @@ class xLinearSystemSolverMumpsBase
int icntl[XLINEARSOLVERMUMPS_ICNTL_MX_SIZE]; // control parameter to store settings permanently
bool cntl[15]; // control parameter to store settings permanently (ugly, as 15 may evolve from version to version :-( simplest for now)
double
rcntl[15]; // Here a arbitrary choice is made. Whatever T is we feed rcntl and id_cntl with a double. This will
// make no diference regaring real or complex arithmetic chosed as this array is alwayse of type REAL
// but this REAL type differs from float to double depending on arimethic (float for s,c and double for d,z)
// if using d or z no problem
// if using s or c compiler may complaine about a possible loss of data as double value will be casted
// to float. About To check/test
double rcntl[15]; // Here an arbitrary choice is made. Whatever T is, we feed rcntl and id_cntl with a double. This
// makes no difference regarding the real or complex arithmetic chosen, as this array is always of
// type REAL, but this REAL type differs from float to double depending on the arithmetic (float for
// s,c and double for d,z). If using d or z: no problem. If using s or c, the compiler may complain
// about a possible loss of data as the double value will be cast to float. To check/test.
char write_problem[256];
void *mumps_struct; // real type is SMUMPS_STRUC_C/DMUMPS_STRUC_C/CMUMPS_STRUC_C/ZMUMPS_STRUC_C
@@ -319,7 +317,7 @@ class xLinearSystemSolverMumpsDistributed : public xLinearSystemSolverMumpsBase
/// Connecting a Matrix to the solver
template <typename M>
void connectMatrix(M &matrix);
void connectMatrix(M &matrix, double ratio_reduce_comm_ = 0.);
/// Connecting a Matrix with Shur complement to the solver
template <typename M>
@@ -352,16 +350,32 @@ class xLinearSystemSolverMumpsDistributed : public xLinearSystemSolverMumpsBase
/// Reset Mumps instance. This mainly removes factors already computed, if any.
//! This disconnects the solver from the already connected Matrix, if any.
//! It also cleans the reducing containers
void reset();
/// Reduce connected Matrix
void reduceMatrices();
protected:
// initialisation methodes
template <typename M>
void init();
void init(int n);
void initDefaultParameter();
bool schur_connected;
// reducing members
bool reduce_comm;
int procid_reduced;
double ratio_reduce_comm;
MPI_Comm reduced_comm;
MPI_Comm gathering_comm;
T *unreduced_data;
int nnz_loc;
std::vector<int> nnz_loc_per_proc;
std::vector<int> nnz_loc_disp;
// instance status
char status;
// some rhs type need this extra location to gather on master proc
@@ -433,7 +447,7 @@ class xLinearSystemSolverMumpsException : public std::exception
/////////////////////////////////////// End xLinearSystemSolverMumpsException class /////////////////////////////////////////////
//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
} // end namespace
} // namespace xlinalg
#include "xLinearSystemSolverMumps_imp.h"