    [xLinAlg] add reducing mechanism to mumps distributed interface · 9ca299e0
    Alexis SALZMAN authored
    When the mumps distributed interface is used on a large number of
    cores and the problem is small, the mumps library performs poorly.
    Presenting the matrix in distributed format prevents some internal
    optimizations (available with the centralized matrix format) and,
    among other things, makes it hard for mumps to estimate its memory
    needs. The analysis phase then underestimates the memory required,
    and the computation stops during factorization with a -9 error.
    Even if this memory issue is bypassed, CPU time increases with the
    number of cores when more cores than needed are used.
    
    This commit adds an extra parameter, ratio_reduce_comm_, to the
    connectMatrix method of the xLinearSystemSolverMumpsDistributed class.
    It corresponds, roughly, to an "ideal" ratio between the number of
    cores to use and the problem size "n": mx=ratio_reduce_comm_ * n
    gives a rough estimate of the maximum number of cores needed by the
    mumps computation for a linear system of size "n". If the
    communicator given to the mumps interface is larger than this mx
    estimate, only mx cores are used with mumps. In this case the
    interface allocates its own memory to store the matrix terms, and
    the user has to call reduceMatrices() whenever values are updated in
    the matrix storage connected to the interface, because the interface
    then groups the matrix terms on the reduced set of cores
    participating in the mumps computation. Some communication is thus
    needed every time terms change in the connected matrix. This partly
    gives up the convenient single memory space shared between the
    connected matrix and mumps.
    
    When the communicator given to the mumps interface is smaller than
    this mx estimate, the interface behaves as before. The same holds if
    ratio_reduce_comm_ is zero.
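
The selection rule described above can be sketched as follows. This is only an illustration, not the code of the commit: reducedCoreCount and needsReduction are hypothetical helpers, and the truncation of mx to an integer as well as the lower bound of one core are assumptions that the commit message does not state.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper, not part of xLinAlg: returns how many ranks of a
// communicator of size commSize would take part in the mumps computation
// for a linear system of size n.
// Assumptions: mx = ratio * n is truncated to an integer and at least one
// rank is always kept.
int reducedCoreCount(int commSize, std::size_t n, double ratio)
{
    assert(commSize >= 1);
    assert(ratio >= 0.0);
    // A zero ratio disables the reducing mechanism.
    if (ratio == 0.0) return commSize;
    // Rough estimate of the maximum number of cores useful for size n.
    const double mx = std::max(1.0, std::floor(ratio * static_cast<double>(n)));
    // Communicator not larger than the estimate: behave as before.
    if (static_cast<double>(commSize) <= mx) return commSize;
    // Communicator larger than the estimate: only mx ranks are used.
    return static_cast<int>(mx);
}

// True when matrix terms would have to be regrouped on a subset of ranks,
// i.e. when a call to reduceMatrices() becomes necessary after updates.
bool needsReduction(int commSize, std::size_t n, double ratio)
{
    return reducedCoreCount(commSize, n, ratio) < commSize;
}
```

With a ratio of 1/1024 and n = 8192, for instance, this sketch keeps 8 ranks out of a 64-rank communicator, while a 4-rank communicator is left untouched.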
    
    This solution is an intermediate between the centralized and the
    fully distributed matrix formats for mumps. If the ratio leads to
    mx=1, a cleaner implementation would be to switch to the centralized
    matrix format for mumps. TODO.
    
    For now memory consumption is not optimized: the terms regrouped on
    the processes that hold the matrix may be duplicated (i.e. a term i,j
    may appear many times because it is present on many processes). This
    is not a problem for mumps, which sums them, but it costs memory.
    Reducing this consumption is not easy. The communication buffers have
    to transfer all those terms anyway, so their size may be large. TODO.
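
To make the memory cost of duplicated terms concrete, here is a minimal sketch in coordinate (triplet) form. The Term struct and both functions are hypothetical names introduced for illustration. In the commit the duplicates are kept as they are and mumps performs the sum; the explicit summation below only shows which assembled values result and how many stored triplets are redundant.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// One matrix term in coordinate (triplet) form. Hypothetical type.
struct Term
{
    int i;
    int j;
    double value;
};

// Sums the terms sharing the same (i, j): this is the assembled value
// obtained when duplicated entries are added together.
std::map<std::pair<int, int>, double> sumDuplicates(const std::vector<Term>& terms)
{
    std::map<std::pair<int, int>, double> assembled;
    for (const Term& t : terms) assembled[std::make_pair(t.i, t.j)] += t.value;
    return assembled;
}

// Number of stored triplets that exist only because a term (i, j) was
// received from more than one process.
std::size_t redundantTerms(const std::vector<Term>& terms)
{
    const std::size_t unique = sumDuplicates(terms).size();
    assert(unique <= terms.size());
    return terms.size() - unique;
}
```

If a process gathers five triplets of which (1,1) and (1,2) each appear twice, only three distinct terms exist, so two of the five stored triplets are pure overhead.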
Name
Trellis
xAnalyticalSolution
xCrack
xCut
xDomainDecomp
xExport
xExt
xFEM
xFastMarching
xGeom
xGraph
xInterface
xLinAlg
xMapping
xMeshTool
xOctree
xPhysics
xQuadrature
xTLS
xTensor
xTool
xUtil
.clang-format
.gitignore
CMakeLists.txt
CONTRIBUTING.md
Ext_dependence.dot
LICENSE.md
README.md
Xfiles_dependence.dot