[Core][TrilinosApplication] Adding fallback linear solver #12309

loumalouomega · 2024-04-24T14:45:40Z

📝 Description

This PR adds FallbackLinearSolver, which makes linear system solving more robust by falling back across multiple solvers. It tries to solve the system with each solver in a predefined list, in order, until one succeeds. The intended use is to try an iterative solver first and switch to a direct solver if it fails.

NOTE: THIS REQUIRES THAT EACH SOLVER RETURNS FALSE WHEN IT FAILS, RATHER THAN THROWING AN ERROR.

Features:

- Sequential solver trials: Tries multiple solvers in a specified order, falling back to the next option if the current solver fails.
- Configurable solver list: Users can define the list of fallback solvers via configuration parameters, allowing for customized solver sequences tailored to specific needs.
- Automatic solver switching: Seamlessly switches between solvers during runtime without requiring user intervention, enhancing usability and efficiency.
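
The sequential trial described above can be sketched as follows. This is a minimal, standalone illustration and not the Kratos implementation: `SolveFunction` and `SolveWithFallback` are hypothetical names, and each solver is reduced to a callable that returns true on success and false on failure.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <vector>

// Hypothetical stand-in for a linear solver: returns true on success and
// false on failure (it must not throw to signal failure).
using SolveFunction = std::function<bool()>;

// Tries each solver in order and returns the index of the first one that
// succeeds, or std::nullopt if every solver in the list fails.
std::optional<std::size_t> SolveWithFallback(const std::vector<SolveFunction>& rSolvers)
{
    assert(!rSolvers.empty()); // a fallback chain needs at least one solver
    for (std::size_t i = 0; i < rSolvers.size(); ++i) {
        if (rSolvers[i]()) {
            return i; // this solver succeeded, stop here
        }
        // otherwise fall through to the next solver in the list
    }
    return std::nullopt;
}
```

A solver that throws instead of returning false would abort the loop, which is why the description requires failures to be reported through the return value.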

Example Parameters

A typical configuration for the FallbackLinearSolver might look like this in JSON format:

{
    "solver_type": "fallback_linear_solver",
    "solvers": [
        {
            "solver_type": "amgcl",
            "smoother_type": "ilu0",
            "krylov_type": "lgmres",
            "max_iteration": 1000,
            "tolerance": 1e-6,
            "verbosity": 1
        },
        {
            "solver_type": "skyline_lu_factorization",
            "scaling": false,
            "verbosity": 1
        }
    ],
    "reset_solver_index_each_try": false
}

Parameters Explained

solver_type: This should always be "fallback_linear_solver" to use this solver.
solvers: A list of solvers to be tried in sequence. Each entry is labeled by its index in the list (solver_0, solver_1, etc.) and contains its own settings:
- solver_type: The type of solver to use (e.g., "amgcl", "skyline_lu_factorization"). This must be compatible with the available solvers in Kratos.
- Other settings within each solver depend on the specific solver type and may include things like smoother_type, krylov_type, max_iteration, tolerance, and verbosity.
reset_solver_index_each_try: A boolean value (true or false) indicating whether to start from the first solver in the list for each new solve attempt or continue from where the last attempt left off. Setting this to true can be useful if you expect the nature of the linear systems to change in a way that might affect solver performance differently over time.
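
The effect of reset_solver_index_each_try can be illustrated with a small sketch. The class `ResettableFallback` and its members are hypothetical and only mirror the behaviour described above, with solvers again reduced to callables that return true on success.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sketch of the index handling, not the Kratos class.
class ResettableFallback
{
public:
    using SolveFunction = std::function<bool()>;

    ResettableFallback(std::vector<SolveFunction> Solvers, bool ResetSolverIndexEachTry)
        : mSolvers(std::move(Solvers)),
          mResetSolverIndexEachTry(ResetSolverIndexEachTry)
    {
        assert(!mSolvers.empty());
    }

    bool Solve()
    {
        // With the flag set, every solve starts again from the first solver.
        // Without it, the solve resumes from the solver that was last in use.
        if (mResetSolverIndexEachTry) {
            mCurrentSolverIndex = 0;
        }
        while (mCurrentSolverIndex < mSolvers.size()) {
            if (mSolvers[mCurrentSolverIndex]()) {
                return true;
            }
            ++mCurrentSolverIndex; // current solver failed, move to the next one
        }
        return false; // every remaining solver failed
    }

    std::size_t CurrentSolverIndex() const { return mCurrentSolverIndex; }

private:
    std::vector<SolveFunction> mSolvers;
    bool mResetSolverIndexEachTry;
    std::size_t mCurrentSolverIndex = 0;
};
```

With the flag set to false, a solver that failed once is not retried on later solves, which saves time when the first solver is unlikely to recover. With true, the first solver gets another chance on every solve.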

🆕 Changelog

The commit adds the `FallbackLinearSolver` class to the `TrilinosApplication` in order to provide a fallback option for linear solvers. This allows for more robust and flexible solving capabilities.

RiccardoRossi

this is ok to me ... however i admit beforehand that i did not have the time to look long into this, so if anyone is doing a deeper review i think it would be much better.

my only comment is that defaulting to skyline is VERY pessimistic!

loumalouomega · 2024-04-24T17:00:40Z

> this is ok to me ... however i admit beforehand that i did not have the time to look long into this, so if anyone is doing a deeper review i think it would be much better.
>
> my only comment is that defaulting to skyline is VERY pessimistic!

I pointed in the code your concern

philbucher · 2024-04-24T21:01:06Z

I would like to review. Can you please wait so that I can take a look by the weekend at the latest?

matekelemen · 2024-04-26T22:18:15Z

kratos/linear_solvers/fallback_linear_solver.h

+    FallbackLinearSolver(const FallbackLinearSolver& rOther)
+        : mSolvers(rOther.mSolvers),
+          mParameters(rOther.mParameters),
+          mCurrentSolverIndex(rOther.mCurrentSolverIndex)
+    {
+    }


this is a shallow copy. I'd either delete the copy constructor, or do a proper deep copy.

matekelemen · 2024-04-26T22:19:00Z

kratos/linear_solvers/fallback_linear_solver.h

+    FallbackLinearSolver& operator=(const FallbackLinearSolver& rOther)
+    {
+        mSolvers = rOther.mSolvers;
+        mParameters = rOther.mParameters;
+        mCurrentSolverIndex = rOther.mCurrentSolverIndex;
+        return *this;
+    }


same comment as for the copy constructor
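
The two options mentioned in this review (delete the copy operations, or deep-copy) can be sketched as below. The sketch assumes the solvers are held through shared pointers, which is what makes the member-wise copy shallow. `SolverStub`, `NonCopyableFallback` and `DeepCopyFallback` are hypothetical names. A deep copy of real polymorphic solvers would additionally need some clone mechanism, which is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

struct SolverStub { int state = 0; }; // hypothetical solver with some internal state

// Option 1: forbid copying, so two objects can never share solver instances.
class NonCopyableFallback
{
public:
    NonCopyableFallback() = default;
    NonCopyableFallback(const NonCopyableFallback&) = delete;
    NonCopyableFallback& operator=(const NonCopyableFallback&) = delete;

private:
    std::vector<std::shared_ptr<SolverStub>> mSolvers;
};

// Option 2: deep copy, so every copy owns its own solver instances.
class DeepCopyFallback
{
public:
    DeepCopyFallback() = default;

    DeepCopyFallback(const DeepCopyFallback& rOther)
    {
        for (const auto& p_solver : rOther.mSolvers) {
            mSolvers.push_back(std::make_shared<SolverStub>(*p_solver));
        }
    }

    DeepCopyFallback& operator=(const DeepCopyFallback& rOther)
    {
        if (this != &rOther) {
            DeepCopyFallback copy(rOther); // copy first, then swap in
            mSolvers.swap(copy.mSolvers);
        }
        return *this;
    }

    void AddSolver(std::shared_ptr<SolverStub> pSolver) { mSolvers.push_back(std::move(pSolver)); }

    SolverStub& GetSolver(std::size_t Index)
    {
        assert(Index < mSolvers.size());
        return *mSolvers[Index];
    }

private:
    std::vector<std::shared_ptr<SolverStub>> mSolvers;
};
```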

matekelemen · 2024-04-26T22:40:40Z

kratos/linear_solvers/fallback_linear_solver.h

+                // Empty in defaults. Should be filled with the solvers to try. For example:
+                // {
+                //     "solver_type": "amgcl"
+                // },
+                // {
+                //     "solver_type": "skyline_lu_factorization"
+                // }
+                // Label of the solver is solver_x, where x is the index in the list


comments are stripped from Parameters. Documentation should be written in the docstrings (preferably the class' docstring).

matekelemen · 2024-04-26T22:42:26Z

kratos/linear_solvers/fallback_linear_solver.h

+        if (mCurrentSolverIndex < mSolvers.size()) {
+            KRATOS_INFO("FallbackLinearSolver") << "Current solver " << GetCurrentSolver()->Info() << " failed with the following settings: " << mParameters["solvers"][mCurrentSolverIndex].PrettyPrintJsonString() << std::endl;
+        } else {
+            KRATOS_WARNING("FallbackLinearSolver") << "Current solver index is out of bounds." << std::endl;


this should be an exception

See my comments below

matekelemen · 2024-04-26T22:43:21Z

kratos/linear_solvers/fallback_linear_solver.h

+        if (mCurrentSolverIndex < mSolvers.size()) {
+            KRATOS_INFO("FallbackLinearSolver") << "Switching to new solver " << GetCurrentSolver()->Info() << " with the following settings: " << mParameters["solvers"][mCurrentSolverIndex].PrettyPrintJsonString() << std::endl;
+        } else {
+            KRATOS_WARNING("FallbackLinearSolver") << "New solver index is out of bounds." << std::endl;


again, just throw an error

Nope, with this it will simply continue, as is done now with the current standard linear solvers
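
The reviewers' suggestion, failing loudly instead of logging a warning when the index runs past the list, could look like the following sketch. `CheckedFallback` and `NamedSolver` are hypothetical names, and `std::out_of_range` stands in for whatever error mechanism the code base uses (a later commit in this PR is titled KRATOS_ERROR).

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

struct NamedSolver { std::string name; }; // hypothetical solver stand-in

class CheckedFallback
{
public:
    void AddSolver(std::shared_ptr<NamedSolver> pSolver)
    {
        assert(pSolver != nullptr);
        mSolvers.push_back(std::move(pSolver));
    }

    void AdvanceSolverIndex() { ++mCurrentSolverIndex; }

    // Throws instead of warning, so an exhausted list cannot go unnoticed.
    const NamedSolver& GetCurrentSolver() const
    {
        if (mCurrentSolverIndex >= mSolvers.size()) {
            throw std::out_of_range(
                "CheckedFallback: solver index " + std::to_string(mCurrentSolverIndex) +
                " is out of bounds, only " + std::to_string(mSolvers.size()) +
                " solver(s) are defined");
        }
        return *mSolvers[mCurrentSolverIndex];
    }

private:
    std::vector<std::shared_ptr<NamedSolver>> mSolvers;
    std::size_t mCurrentSolverIndex = 0;
};
```

Centralizing the check in one accessor also addresses the concern that the out-of-bounds case otherwise has to be handled in several places.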

matekelemen · 2024-04-26T22:44:39Z

kratos/linear_solvers/fallback_linear_solver.h

+     *       outlines a placeholder for enhanced future functionality, such as more comprehensive logging or
+     *       additional transition actions.
+     */
+    void UpdateCounterSolverIndex()


counter ... index

I think one of them is enough

matekelemen · 2024-04-26T22:50:42Z

btw shouldn't this be in the LinearSolversApplication?

loumalouomega · 2024-04-29T07:10:33Z

> btw shouldn't this be in the LinearSolversApplication?

Nope, that application is for linear solvers from external libraries. This is a black-box linear solver that takes two or more linear solvers as input and does not solve anything by itself.

matekelemen · 2024-04-29T07:36:49Z

> Nope, that application is for external libraries linear solvers.

If that really is the case, then I'd put it into core. There's nothing in this solver that would require Trilinos and it would be a shame to demand Trilinos just to get access to it.

loumalouomega · 2024-04-29T13:53:30Z

seems that you messed up a master-merge 🤔

loumalouomega · 2024-04-29T13:54:49Z

> seems that you messed up a master-merge 🤔

Fixed

philbucher · 2024-04-30T07:17:36Z

Lgtm
I will approve on Monday to give others some time to review

Pls ping me in case I forget

philbucher

hm, it seems that last time I looked I didn't see much code.

Now I have quite a few comments, mainly about why you added the low-level access functions

philbucher · 2024-05-05T21:46:41Z

kratos/linear_solvers/fallback_linear_solver.h

+        if (mCurrentSolverIndex < mSolvers.size()) {
+            KRATOS_INFO("FallbackLinearSolver") << "Switching to new solver " << GetCurrentSolver()->Info() << " with the following settings: " << mParameters["solvers"][mCurrentSolverIndex].PrettyPrintJsonString() << std::endl;
+        } else {
+            KRATOS_WARNING("FallbackLinearSolver") << "New solver index is out of bounds." << std::endl;


I would say that this method should not even be called when the last solver is reached. Seems too complicated, and you need to handle the out-of-bounds case in other places too

philbucher · 2024-05-05T21:47:21Z

kratos/linear_solvers/fallback_linear_solver.h

+            if constexpr (TSparseSpaceType::IsDistributed()) {
+                faster_direct_solvers = std::vector<std::string>({ // May need to be updated and reordered. In fact I think it depends of the size of the equation system
+                    "mumps2",         // Amesos2 (if compiled with MUMPS-support)
+                    "mumps",          // Amesos (if compiled with MUMPS-support)
+                    "super_lu_dist2", // Amesos2 SuperLUDist (if compiled with MPI-support)
+                    "super_lu_dist",  // Amesos SuperLUDist (if compiled with MPI-support)
+                    "amesos2",        // Amesos2
+                    "amesos",         // Amesos
+                    "klu2",           // Amesos2 KLU
+                    "klu",            // Amesos KLU
+                    "basker"          // Amesos2 Basker


this should be in the trilinos app, do you think it could be accommodated?

It could be added to the space, so that it depends on the space and avoids this issue
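
Moving the list into the space, as proposed in the reply, might look like the sketch below. All names here (`SerialSpaceSketch`, `DistributedSpaceSketch`, `FastestDirectSolverList`, `DefaultDirectSolverName`) are assumptions for illustration, and the list contents are only examples taken from this thread. The point is that generic code asks the space for its list and no longer needs to branch on `IsDistributed()` or know Trilinos solver names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical serial space: owns its own ordered list of direct solvers.
struct SerialSpaceSketch
{
    static constexpr bool IsDistributed() { return false; }
    static std::vector<std::string> FastestDirectSolverList()
    {
        return {"skyline_lu_factorization"};
    }
};

// Hypothetical distributed space: this list would live in the Trilinos app.
struct DistributedSpaceSketch
{
    static constexpr bool IsDistributed() { return true; }
    static std::vector<std::string> FastestDirectSolverList()
    {
        return {"mumps2", "mumps", "klu2", "klu"};
    }
};

// Generic code only asks the space for its preferred direct solver.
template <class TSparseSpaceType>
std::string DefaultDirectSolverName()
{
    const std::vector<std::string> candidates = TSparseSpaceType::FastestDirectSolverList();
    assert(!candidates.empty());
    return candidates.front();
}
```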

philbucher · 2024-05-05T21:49:30Z

kratos/python/add_linear_solvers_to_python.cpp

+    .def("AddSolver", [](FallbackLinearSolverType& rSelf, LinearSolverType::Pointer pSolver) {
+        rSelf.AddSolver(pSolver);
+    })
+    .def("AddSolver", [](FallbackLinearSolverType& rSelf, const Parameters ThisParameters) {
+        rSelf.AddSolver(ThisParameters);
+    })
+    .def("GetSolvers", &FallbackLinearSolverType::GetSolvers)
+    .def("SetSolvers", &FallbackLinearSolverType::SetSolvers)
+    .def("GetResetSolverEachTry", &FallbackLinearSolverType::GetResetSolverEachTry)
+    .def("SetResetSolverIndexEachTry", &FallbackLinearSolverType::SetResetSolverIndexEachTry)
+    .def("GetParameters", &FallbackLinearSolverType::GetParameters)
+    .def("GetCurrentSolverIndex", &FallbackLinearSolverType::GetCurrentSolverIndex)
+    .def("ClearCurrentSolverIndex", &FallbackLinearSolverType::ClearCurrentSolverIndex)


My main usage concern is the setter methods. Why do you need those? IMO they are only required if you want to access things at a very low level, which none of the usual functionalities like B&S or strategies do.

Same for the other methods: do you add them just because? I would prefer to keep such difficult things easy to use, and extend only if there is an actual need

Can you please explain the usecase?

For testing and benchmarking mainly

philbucher · 2024-05-05T21:51:49Z

applications/TrilinosApplication/custom_python/add_trilinos_linear_solvers_to_python.cpp

+        .def(py::init<Parameters>())
+        .def(py::init<TrilinosLinearSolverType::Pointer, TrilinosLinearSolverType::Pointer, Parameters>())
+        .def(py::init<const std::vector<TrilinosLinearSolverType::Pointer>&, Parameters>())
+        .def("AddSolver", [](TrilinosFallbackLinearSolverType& rSelf, TrilinosLinearSolverType::Pointer pSolver) {
+            rSelf.AddSolver(pSolver);
+        })
+        .def("AddSolver", [](TrilinosFallbackLinearSolverType& rSelf, const Parameters ThisParameters) {
+            rSelf.AddSolver(ThisParameters);
+        })
+        .def("GetSolvers", &TrilinosFallbackLinearSolverType::GetSolvers)
+        .def("SetSolvers", &TrilinosFallbackLinearSolverType::SetSolvers)
+        .def("GetResetSolverEachTry", &TrilinosFallbackLinearSolverType::GetResetSolverEachTry)
+        .def("SetResetSolverIndexEachTry", &TrilinosFallbackLinearSolverType::SetResetSolverIndexEachTry)
+        .def("GetParameters", &TrilinosFallbackLinearSolverType::GetParameters)
+        .def("GetCurrentSolverIndex", &TrilinosFallbackLinearSolverType::GetCurrentSolverIndex)
+        .def("ClearCurrentSolverIndex", &TrilinosFallbackLinearSolverType::ClearCurrentSolverIndex)
+        ;


(if we leave those functions after the discussion) you can refactor this into a function so that updates to the core will also be done here

loumalouomega · 2024-05-10T08:29:56Z

@matekelemen and @philbucher are you happy with the changes?

matekelemen · 2024-05-13T07:46:27Z

> are you happy with the changes?

As I said, I'd just throw an exception if any of the fallback solvers returned true for AdditionalPhysicalDataIsNeeded.

loumalouomega · 2024-05-13T07:48:54Z

> > are you happy with the changes?
>
> As I said, I'd just throw an exception if any of the fallback solvers returned true for AdditionalPhysicalDataIsNeeded.

But that will break the workflow if the linear solver requires it

matekelemen · 2024-05-13T08:14:17Z

> > > are you happy with the changes?
> >
> > As I said, I'd just throw an exception if any of the fallback solvers returned true for AdditionalPhysicalDataIsNeeded.
>
> But that will break the workflow if the linear solver requires it

Yes, that's the point.

With the current behavior, the solver would fail after moving on to the first fallback that needs extra data. Am I not being clear, or am I missing something?

loumalouomega · 2024-05-21T07:38:48Z

> > > > are you happy with the changes?
> > >
> > > As I said, I'd just throw an exception if any of the fallback solvers returned true for AdditionalPhysicalDataIsNeeded.
> >
> > But that will break the workflow if the linear solver requires it
>
> Yes, that's the point.
>
> With the current behavior, the solver would fail after moving on to the first fallback that needs extra data. Am I not being clear, or am I missing something?

Maybe the clever thing would be to call the corresponding functions when switching solver launch

matekelemen · 2024-05-23T07:51:24Z

> Maybe the clever thing would be to call the corresponding functions when switching solver launch

Well, you'd have to keep track of which solvers you called ProvideAdditionalPhysicalData for to prevent calling it multiple times if Solve is invoked repeatedly. Other than that, I'm not opposed to the idea.

P.S.: it would really help metasolvers like this one if we knew whether a LinearSolver manipulated its inputs (matrix, RHS). @KratosMultiphysics/technical-committee what do you think about extending the interface of LinearSolver with information about what the solver must mutate?
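
The bookkeeping discussed in this exchange can be sketched as follows: the wrapper reports that additional data is needed if any wrapped solver needs it (matching the commit titled "Using Or operation in AdditionalPhysicalDataIsNeeded"), and it provides the data to a solver only the first time that solver is switched to. The class names are hypothetical, and `ProvideAdditionalPhysicalData` is reduced to a call counter here. The real method takes the system matrix, vectors and further arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical solver stand-in that only records how often data was provided.
struct DataAwareSolver
{
    bool needs_data = false;
    int provide_calls = 0;

    bool AdditionalPhysicalDataIsNeeded() const { return needs_data; }
    void ProvideAdditionalPhysicalData() { ++provide_calls; } // arguments omitted in this sketch
};

class DataTrackingFallback
{
public:
    explicit DataTrackingFallback(std::vector<DataAwareSolver> Solvers)
        : mSolvers(std::move(Solvers)),
          mDataProvided(mSolvers.size(), false)
    {
        assert(!mSolvers.empty());
    }

    // Logical OR over all wrapped solvers.
    bool AdditionalPhysicalDataIsNeeded() const
    {
        for (const auto& r_solver : mSolvers) {
            if (r_solver.AdditionalPhysicalDataIsNeeded()) {
                return true;
            }
        }
        return false;
    }

    // Provides the data at most once per solver, even if Solve is invoked
    // repeatedly and the same fallback is selected again.
    void SwitchTo(std::size_t Index)
    {
        assert(Index < mSolvers.size());
        mCurrentSolverIndex = Index;
        if (mSolvers[Index].AdditionalPhysicalDataIsNeeded() && !mDataProvided[Index]) {
            mSolvers[Index].ProvideAdditionalPhysicalData();
            mDataProvided[Index] = true;
        }
    }

    int ProvideCalls(std::size_t Index) const { return mSolvers[Index].provide_calls; }

private:
    std::vector<DataAwareSolver> mSolvers; // declared first: its size initializes mDataProvided
    std::vector<bool> mDataProvided;
    std::size_t mCurrentSolverIndex = 0;
};
```

If the physical data can change between solves, the flags would need to be cleared at that point. The sketch leaves that out.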

loumalouomega · 2024-05-27T09:13:29Z

@matekelemen check if you agree with the change

loumalouomega · 2024-06-10T07:40:45Z

@KratosMultiphysics/technical-committee I think this is now up to you

RiccardoRossi · 2024-06-11T07:56:53Z

> > Maybe the clever thing would be to call the corresponding functions when switching solver launch
>
> Well, you'd have to keep track of which solvers you called ProvideAdditionalPhysicalData for to prevent calling it multiple times if Solve is invoked repeatedly. Other than that, I'm not opposed to the idea.
>
> P.S.: it would really help metasolvers like this one if we knew whether a LinearSolver manipulated its inputs (matrix, RHS). @KratosMultiphysics/technical-committee what do you think about extending the interface of LinearSolver with information about what the solver must mutate?

quite honestly i do not understand what you mean ...

loumalouomega · 2024-06-11T08:09:17Z

> > > Maybe the clever thing would be to call the corresponding functions when switching solver launch
> >
> > Well, you'd have to keep track of which solvers you called ProvideAdditionalPhysicalData for to prevent calling it multiple times if Solve is invoked repeatedly. Other than that, I'm not opposed to the idea.
> >
> > P.S.: it would really help metasolvers like this one if we knew whether a LinearSolver manipulated its inputs (matrix, RHS). @KratosMultiphysics/technical-committee what do you think about extending the interface of LinearSolver with information about what the solver must mutate?
>
> quite honestly i do not understand what you mean ...

He means linear solvers like the one I use in the contact app, which restructures all the components so that the LHS and RHS are purely displacement-based

loumalouomega added 7 commits April 24, 2024 16:38

feat: Add possibility to customize skin variable in CalculateNodalDis…

efe20ba

…tanceToSkinProcess

feat: Add possibility to customize skin variable in CalculateNodalDis…

d794f45

…tanceToSkinProcess

feat: Add FallbackLinearSolver to standard linear solver factory

b0b6940

feat: Add test for FallbackLinearSolver

e4d6135

feat: Add FallbackLinearSolver to TrilinosApplication

a73302b

The commit adds the `FallbackLinearSolver` class to the `TrilinosApplication` in order to provide a fallback option for linear solvers. This allows for more robust and flexible solving capabilities.

feat: Add FallbackLinearSolver to TrilinosApplication

9d80f42

feat: Add FallbackLinearSolver test for TrilinosApplication

8f268ef

loumalouomega added Enhancement Kratos Core Applications Testing Parallel-MPI Distributed memory parallelism for HPC / clusters Feature labels Apr 24, 2024

loumalouomega requested review from pooyan-dadvand and pablobecker April 24, 2024 14:45

loumalouomega requested review from a team as code owners April 24, 2024 14:45

RiccardoRossi reviewed Apr 24, 2024

matekelemen reviewed Apr 26, 2024

Explicit

483f09e

Co-authored-by: Máté Kelemen <44344022+matekelemen@users.noreply.github.com>

loumalouomega requested a review from a team as a code owner April 29, 2024 07:52

WPK4FEM and others added 3 commits April 29, 2024 09:52

[GeoMechanicsApplication] Incremental displacement variable, output a…

a925df1

…nd integration test. (#12288) * Incremental displacement variable, output and integration test. * Output enabled fhrough the C++ route too.

required changes

5f10a92

Merge branch 'master' into core/fallback-linear-solver

5731586

philbucher reviewed May 5, 2024

loumalouomega and others added 8 commits May 6, 2024 10:10

Simplify test

2abd9b8

Co-authored-by: Philipp Bucher <philipp.bucher@tum.de>

KRATOS_ERROR

a6dd1dd

Revert "Simplify test"

4a1f800

This reverts commit 2abd9b8.

Merge branch 'master' into core/fallback-linear-solver

c13d249

Using Or operation in AdditionalPhysicalDataIsNeeded

e9ad45a

Defining list into space

388deb5

Update solver

5bde144

Cleaning

a6ed943

loumalouomega mentioned this pull request May 6, 2024

[Core][TrilinosApplication] Defining list of fastest direct linear solvers. #12350

Merged

Merge branch 'master' into core/fallback-linear-solver

d2b4895

loumalouomega requested review from RiccardoRossi, matekelemen and philbucher May 15, 2024 10:26

loumalouomega added 2 commits May 27, 2024 10:13

Merge branch 'master' into core/fallback-linear-solver

77c1e35

Update fallback_linear_solver.h to include additional data provision …

fae7f7d

…in solver iterations
