Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core][TrilinosApplication] Adding fallback linear solver #12309

Open
wants to merge 33 commits into
base: master
Choose a base branch
from

Conversation

loumalouomega
Copy link
Member

@loumalouomega loumalouomega commented Apr 24, 2024

📝 Description

This PR adds FallbackLinearSolver. This solver enhances the robustness of linear system solving by implementing a fallback mechanism across multiple solver options. This solver attempts to solve linear systems using a predefined list of solvers sequentially until one succeeds. The idea would be to consider in first place an iterative solver and switch to a direct solver in case this fails.

REQUIRES THAT THE SOLVER RETURNS FALSE WHEN FAILING; NOT WITH ERROR

Features:

  • Sequential solver trials: Tries multiple solvers in a specified order, falling back to the next option if the current solver fails.
  • Configurable solver list: Users can define the list of fallback solvers via configuration parameters, allowing for customized solver sequences tailored to specific needs.
  • Automatic solver switching: Seamlessly switches between solvers during runtime without requiring user intervention, enhancing usability and efficiency.

Example Parameters

A typical configuration for the FallbackLinearSolver might look like this in JSON format:

{
    "solver_type": "fallback_linear_solver",
    "solvers": [
        {
            "solver_type": "amgcl",
            "smoother_type": "ilu0",
            "krylov_type": "lgmres",
            "max_iteration": 1000,
            "tolerance": 1e-6,
            "verbosity": 1
        },
        {
            "solver_type": "skyline_lu_factorization",
            "scaling": false,
            "verbosity": 1
        }
    ],
    "reset_solver_index_each_try": false
}

Parameters Explained

  • solver_type: This should always be "simple_fallback_linear_solver" for using this solver.
  • solvers: A collection of solvers to be tried in sequence. Each solver is identified by a key (solver_0, solver_1, etc.), and contains its own settings:
    • solver_type: The type of solver to use (e.g., "amgcl", "skyline_lu_factorization"). This must be compatible with the available solvers in Kratos.
    • Other settings within each solver depend on the specific solver type and may include things like smoother_type, krylov_type, max_iteration, tolerance, and verbosity.
  • reset_solver_index_each_try: A boolean value (true or false) indicating whether to start from the first solver in the list for each new solve attempt or continue from where the last attempt left off. Setting this to true can be useful if you expect the nature of the linear systems to change in a way that might affect solver performance differently over time.

🆕 Changelog

Copy link
Member

@RiccardoRossi RiccardoRossi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is ok to me ... however i admit beforehand that i did not have the time to look long into this, so if anyone is doing a deeper review i think it would be much better.

my only comment is thet defaulting to skyline is VERY pessimistic!

@loumalouomega
Copy link
Member Author

this is ok to me ... however i admit beforehand that i did not have the time to look long into this, so if anyone is doing a deeper review i think it would be much better.

my only comment is thet defaulting to skyline is VERY pessimistic!

I pointed in the code your concern

@philbucher
Copy link
Member

I would like to review, can you please wait so that I can take a look latest on the weekend?

kratos/linear_solvers/fallback_linear_solver.h Outdated Show resolved Hide resolved
Comment on lines 177 to 182
FallbackLinearSolver(const FallbackLinearSolver& rOther)
: mSolvers(rOther.mSolvers),
mParameters(rOther.mParameters),
mCurrentSolverIndex(rOther.mCurrentSolverIndex)
{
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a shallow copy. I'd either delete the copy constructor, or do a proper deep copy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huuuummm

Comment on lines 192 to 198
FallbackLinearSolver& operator=(const FallbackLinearSolver& rOther)
{
mSolvers = rOther.mSolvers;
mParameters = rOther.mParameters;
mCurrentSolverIndex = rOther.mCurrentSolverIndex;
return *this;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as for the copy constructor

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huuuuum

kratos/linear_solvers/fallback_linear_solver.h Outdated Show resolved Hide resolved
Comment on lines 757 to 764
// Empty in defaults. Should be filled with the solvers to try. For example:
// {
// "solver_type": "amgcl"
// },
// {
// "solver_type": "skyline_lu_factorization"
// }
// Label of the solver is solver_x, where x is the index in the list
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comments are stripped from Parameters. Documentation should be written in the docstrings (preferably the class' docstring).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay

if (mCurrentSolverIndex < mSolvers.size()) {
KRATOS_INFO("FallbackLinearSolver") << "Current solver " << GetCurrentSolver()->Info() << " failed with the following settings: " << mParameters["solvers"][mCurrentSolverIndex].PrettyPrintJsonString() << std::endl;
} else {
KRATOS_WARNING("FallbackLinearSolver") << "Current solver index is out of bounds." << std::endl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be an exception

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comments below

if (mCurrentSolverIndex < mSolvers.size()) {
KRATOS_INFO("FallbackLinearSolver") << "Switching to new solver " << GetCurrentSolver()->Info() << " with the following settings: " << mParameters["solvers"][mCurrentSolverIndex].PrettyPrintJsonString() << std::endl;
} else {
KRATOS_WARNING("FallbackLinearSolver") << "New solver index is out of bounds." << std::endl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, just throw an error

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, with this it will simply continues as it is done now with current standard linear solvers

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say that this method should not even be called when the last solver is reached. Seems too complicated, and you need to handle the out-of-bounds case in other places too

* outlines a placeholder for enhanced future functionality, such as more comprehensive logging or
* additional transition actions.
*/
void UpdateCounterSolverIndex()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

counter ... index

I think one of them is enough

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe...

@matekelemen
Copy link
Contributor

btw shouldn't this be in the LinearSolversApplication?

@loumalouomega
Copy link
Member Author

btw shouldn't this be in the LinearSolversApplication?

Nope, that application is for external libraries linear solvers. This is a blackbox linear solver that takes as input 2+ linear solvers, not solving anything by itself.

Co-authored-by: Máté Kelemen <44344022+matekelemen@users.noreply.github.com>
@matekelemen
Copy link
Contributor

Nope, that application is for external libraries linear solvers.

If that really is the case, then I'd put it into core. There's nothing in this solver that would require Trilinos and it would be a shame to demand Trilinos just to get access to it.

@loumalouomega loumalouomega requested a review from a team as a code owner April 29, 2024 07:52
WPK4FEM and others added 3 commits April 29, 2024 09:52
…nd integration test. (#12288)

* Incremental displacement variable, output and integration test.
* Output enabled fhrough the C++ route too.
…thermal element (#12303)

* Added thermal line element

* Fixed a bug

* Added test cases

* Added 3D thermal line element + test cases

* modifications based on review 3

Added README.md for the test cases

* fix for documentation

* Modifications based on review 3

* Modifications based on review 4

* improvements in README

* Fix in README.md for units

* A small fix
@loumalouomega
Copy link
Member Author

seems that you messed up a master-merge 🤔

@loumalouomega
Copy link
Member Author

seems that you messed up a master-merge 🤔

Fixed

@philbucher
Copy link
Member

Lgtm
I will approve on Monday to give others some time to review

Pls ping me in case I forget

Copy link
Member

@philbucher philbucher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm seems that last time I looked I didnt see much code.

Now I have quite a few comments, mainly why you add the low-level access functions

kratos/linear_solvers/fallback_linear_solver.h Outdated Show resolved Hide resolved
if (mCurrentSolverIndex < mSolvers.size()) {
KRATOS_INFO("FallbackLinearSolver") << "Switching to new solver " << GetCurrentSolver()->Info() << " with the following settings: " << mParameters["solvers"][mCurrentSolverIndex].PrettyPrintJsonString() << std::endl;
} else {
KRATOS_WARNING("FallbackLinearSolver") << "New solver index is out of bounds." << std::endl;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say that this method should not even be called when the last solver is reached. Seems too complicated, and you need to handle the out-of-bounds case in other places too

Comment on lines 679 to 689
if constexpr (TSparseSpaceType::IsDistributed()) {
faster_direct_solvers = std::vector<std::string>({ // May need to be updated and reordered. In fact I think it depends of the size of the equation system
"mumps2", // Amesos2 (if compiled with MUMPS-support)
"mumps", // Amesos (if compiled with MUMPS-support)
"super_lu_dist2", // Amesos2 SuperLUDist (if compiled with MPI-support)
"super_lu_dist", // Amesos SuperLUDist (if compiled with MPI-support)
"amesos2", // Amesos2
"amesos", // Amesos
"klu2", // Amesos2 KLU
"klu", // Amesos KLU
"basker" // Amesos2 Basker
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be in the trilinos app, do you think it could be accomodated?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be added to the space so it will depend on the space and avoid this issue

kratos/linear_solvers/fallback_linear_solver.h Outdated Show resolved Hide resolved
Comment on lines +246 to +258
.def("AddSolver", [](FallbackLinearSolverType& rSelf, LinearSolverType::Pointer pSolver) {
rSelf.AddSolver(pSolver);
})
.def("AddSolver", [](FallbackLinearSolverType& rSelf, const Parameters ThisParameters) {
rSelf.AddSolver(ThisParameters);
})
.def("GetSolvers", &FallbackLinearSolverType::GetSolvers)
.def("SetSolvers", &FallbackLinearSolverType::SetSolvers)
.def("GetResetSolverEachTry", &FallbackLinearSolverType::GetResetSolverEachTry)
.def("SetResetSolverIndexEachTry", &FallbackLinearSolverType::SetResetSolverIndexEachTry)
.def("GetParameters", &FallbackLinearSolverType::GetParameters)
.def("GetCurrentSolverIndex", &FallbackLinearSolverType::GetCurrentSolverIndex)
.def("ClearCurrentSolverIndex", &FallbackLinearSolverType::ClearCurrentSolverIndex)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main usage concern are the setter methods, why do you need those? IMO they are only required if you want to access stuff on a very low level, which none of the usual functionalities like B&S or strategies do.

Same for the other methods, do you add them just because? I would prefer to keep such difficult things easy to use, and extend if there is an actual need

Can you please explain the usecase?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For testing and benchmarking mainly

Comment on lines +78 to +94
.def(py::init<Parameters>())
.def(py::init<TrilinosLinearSolverType::Pointer, TrilinosLinearSolverType::Pointer, Parameters>())
.def(py::init<const std::vector<TrilinosLinearSolverType::Pointer>&, Parameters>())
.def("AddSolver", [](TrilinosFallbackLinearSolverType& rSelf, TrilinosLinearSolverType::Pointer pSolver) {
rSelf.AddSolver(pSolver);
})
.def("AddSolver", [](TrilinosFallbackLinearSolverType& rSelf, const Parameters ThisParameters) {
rSelf.AddSolver(ThisParameters);
})
.def("GetSolvers", &TrilinosFallbackLinearSolverType::GetSolvers)
.def("SetSolvers", &TrilinosFallbackLinearSolverType::SetSolvers)
.def("GetResetSolverEachTry", &TrilinosFallbackLinearSolverType::GetResetSolverEachTry)
.def("SetResetSolverIndexEachTry", &TrilinosFallbackLinearSolverType::SetResetSolverIndexEachTry)
.def("GetParameters", &TrilinosFallbackLinearSolverType::GetParameters)
.def("GetCurrentSolverIndex", &TrilinosFallbackLinearSolverType::GetCurrentSolverIndex)
.def("ClearCurrentSolverIndex", &TrilinosFallbackLinearSolverType::ClearCurrentSolverIndex)
;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(if we leave those functions after the discussion) you can refactor this into a function so that updates to the core will also be done here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay

@loumalouomega
Copy link
Member Author

@matekelemen and @philbucher are you happy with the changes?

@matekelemen
Copy link
Contributor

are you happy with the changes?

As I said, I'd just throw an exception if any of the fallback solvers returned true for AdditionalPhysicalDataIsNeeded.

@loumalouomega
Copy link
Member Author

are you happy with the changes?

As I said, I'd just throw an exception if any of the fallback solvers returned true for AdditionalPhysicalDataIsNeeded.

But that will break the worflow if the linear solver requires it

@matekelemen
Copy link
Contributor

are you happy with the changes?

As I said, I'd just throw an exception if any of the fallback solvers returned true for AdditionalPhysicalDataIsNeeded.

But that will break the worflow if the linear solver requires it

Yes, that's the point.

With the current behavior, the solver would fail after moving on to the first fallback that needs extra data. Am I not being clear, or am I missing something?

@loumalouomega
Copy link
Member Author

are you happy with the changes?

As I said, I'd just throw an exception if any of the fallback solvers returned true for AdditionalPhysicalDataIsNeeded.

But that will break the worflow if the linear solver requires it

Yes, that's the point.

With the current behavior, the solver would fail after moving on to the first fallback that needs extra data. Am I not being clear, or am I missing something?

Maybe the clever thing would be to call the corresponding functions when switching solver launch

@matekelemen
Copy link
Contributor

matekelemen commented May 23, 2024

Maybe the clever thing would be to call the corresponding functions when switching solver launch

Well, you'd have to keep track of which solvers you called ProvideAdditionalPhysicalData for to prevent calling it multiple times if Solve is invoked repeatedly. Other than that, I'm not opposed to the idea.

P.S.: it would really help metasolvers like this one if we knew whether a LinearSolver manipulated its inputs (matrix, RHS). @KratosMultiphysics/technical-committee what do you think about extending the interface of LinearSolver with information about what the solver must mutate?

@loumalouomega
Copy link
Member Author

@matekelemen check if you agree with the change

@loumalouomega
Copy link
Member Author

@KratosMultiphysics/technical-committee I think this is now up to you

@RiccardoRossi
Copy link
Member

Maybe the clever thing would be to call the corresponding functions when switching solver launch

Well, you'd have to keep track of which solvers you called ProvideAdditionalPhysicalData for to prevent calling it multiple times if Solve is invoked repeatedly. Other than that, I'm not opposed to the idea.

P.S.: it would really help metasolvers like this one if we knew whether a LinearSolver manipulated its inputs (matrix, RHS). @KratosMultiphysics/technical-committee what do you think about extending the interface of LinearSolver with information about what the solver must mutate?

quite honestly i do not understand what you mean ...

@loumalouomega
Copy link
Member Author

Maybe the clever thing would be to call the corresponding functions when switching solver launch

Well, you'd have to keep track of which solvers you called ProvideAdditionalPhysicalData for to prevent calling it multiple times if Solve is invoked repeatedly. Other than that, I'm not opposed to the idea.
P.S.: it would really help metasolvers like this one if we knew whether a LinearSolver manipulated its inputs (matrix, RHS). @KratosMultiphysics/technical-committee what do you think about extending the interface of LinearSolver with information about what the solver must mutate?

quite honestly i do not understand what you mean ...

He means linear solvers like the one I use in the contact app that restructures all the components so the LHS and RHS is purely displacement based

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants