Fix `SymmetricMatrixCoefficient::ProjectSymmetric` bug #4271

j-signorelli · 2024-04-26T20:07:16Z

The matrix defined for projection of SymmetricMatrixCoefficient onto a QuadratureFunction appears to be of incorrect size.

The current code creates a vdim x vdim matrix, where vdim is the dimension of the QF = height*(height+1)/2.

For wanting to output a symmetric 3x3 tensor (6 values), this then creates a 6x6 matrix instead of a 3x3 one. This PR fixes that.

PR	Author	Editor	Reviewers	Assignment	Approval	Merge
#4271	@j-signorelli	@tzanio	@najlkin + @paul-hilscher	5/11/24	5/16/24	⌛due 6/1/24

PR Checklist

Code builds.
Code passes make style.
Update CHANGELOG:
- Is this a new feature users need to be aware of? New or updated example or miniapp?
- Does it make sense to create a new section in the CHANGELOG to group with other related features?
Update INSTALL:
- Had a new optional library been added? If so, what range of versions of this library are required? (Make sure the external library is compatible with our BSD license, e.g. it is not licensed under GPL!)
- Have the version ranges for any required or optional libraries changed?
- Does make or cmake have a new target?
- Did the requirements or the installation process change? (rare)
Update continuous integration server configurations if necessary (e.g. with new version requirements for each of MFEM's dependencies)
- .github
- .appveyor.yml
Update .gitignore:
- Check if make distclean; git status shows any files that were generated from the source by the project (not an IDE) but we don't want to track in the repository.
- Add new patterns (just for the new files above) and re-run the above test.
New examples:
- All sample runs at the top of the example source file work.
- Update examples/makefile:
  - Add the example code to the appropriate SEQ_EXAMPLES and PAR_EXAMPLES variables.
  - Add any files generated by it to the clean target.
  - Add the example binary and any files generated by it to the top-level .gitignore file.
- Update examples/CMakeLists.txt:
  - Add the example code to the ALL_EXE_SRCS variable.
  - Make sure THIS_TEST_OPTIONS is set correctly for the new example.
- List the new example in doc/CodeDocumentation.dox.
- If new examples directory (e.g.examples/pumi), list it in doc/CodeDocumentation.conf.in
- Companion pull request for documentation in mfem/web repo:
  - Update or add example-specific documentation, see e.g. the src/examples.md.
  - Add the description, labels and screenshots in src/examples.md and src/img.
  - In examples.md, list the example under the appropriate categories, add new categories if necessary.
  - Add a short description of the example in the "Extensive Examples" section of features.md.
New miniapps:
- All sample runs at the top of the miniapp source file work.
- Update top-level makefile and makefile in corresponding miniapp directory.
- Add the miniapp binary and any files generated by it to the top-level .gitignore file.
- Update CMake build system:
  - Update the CMakeLists.txt file in the miniapps directory, if the new miniapp is in a new directory.
  - Add/update the CMakeLists.txt file in the new miniapp directory.
  - Consider adding a new test for the new miniapp.
- List the new miniapp in doc/CodeDocumentation.dox
- If new miniapps directory (e.g.miniapps/nurbs), add it to MINIAPP_SUBDIRS in the makefile.
- If new miniapps directory (e.g.miniapps/nurbs), list it in doc/CodeDocumentation.conf.in
- Companion pull request for documentation in mfem/web repo:
  - Update or add miniapp-specific documentation, see e.g. the src/meshing.md and src/electromagnetics.md files.
  - Add the description, labels and screenshots in src/examples.md and src/img.
  - The miniapps go at the end of the page, and are usually listed only under a specific "Application (PDE)" category.
  - Add a short description of the miniapp in the "Extensive Examples" section of features.md.
New capability:
- All new public, protected, and private classes, methods, data members, and functions have full Doxygen-style documentation in source comments. Documentation should include descriptions of member data, function arguments and return values, template parameters, and prerequisites for calling new functions.
- Pointer arguments and return values must specify whether ownership is being transferred or lent with the call.
- Any new functions should include descriptions of their intended use e.g. for internal use only, user-facing, etc., along with references to example code whenever possible/appropriate.
- Consider adding new sample runs in existing examples to highlight the new capability.
- Consider saving cool simulation pictures with the new capability in the Confluence gallery (LLNL only) or submitting them, via pull request, to the gallery section of the mfem/web repo.
- If this is a major new feature, consider mentioning it in the short summary inside README (rare).
- List major new classes in doc/CodeDocumentation.dox (rare).
Update this checklist, if the new pull request affects it.
Run make unittest to make sure all unit tests pass.
Run the tests in tests/scripts.
(LLNL only) After merging:
- Update internal tests to include the new features.

j-signorelli · 2024-04-26T21:27:16Z

I am not sure why the linker was not able to find symbols for arm64 for the checks -- if anyone has any suggestions, please let me know!

sebastiangrimberg · 2024-04-26T22:10:47Z

I am not sure why the linker was not able to find symbols for arm64 for the checks -- if anyone has any suggestions, please let me know!

GitHub actions recently started rolling out new runners for macOS jobs which use macOS 14 and Arm64 instances instead of x84_64 for macos-latest. Looks like this is what's causing the problem here -> The cached builds are being restored are from an older run on an x86_64 architecture, while the checks for this PR are running on arm64. I could be wrong, but think we need a PR to update the cache keys here which will force rebuilding of the Hypre and METIS dependencies in the cache for PR workflows.

tzanio · 2024-05-11T15:15:21Z

This PR is now under review (see the table in the PR description). To help with the review process, please do not force push to the branch.

najlkin

Yes, this is a bug for sure, there seems to be confusion between plain Vector or Memory wrappers and DenseSymmetricMatrix, which takes the size of the matrix as an argument of UseExternalData and not size of the data.

paul-hilscher · 2024-05-13T21:29:26Z

@j-signorelli ,

would it make sense to add a small unit test ? test_coefficient.cpp does cover MatrixCoefficient but it seems we do not have tests for SymmetricMatrixCoefficient / SymmetricMatrixConstantCoefficient (ideally including a test for ProjectSymmetry).

…t-fix

…SymmetricMatrix

…Matrix to Constant one

j-signorelli · 2024-05-14T19:56:45Z

@paul-hilscher:
Done, but it required additional changes + bug fixes that probably should be re-reviewed.

SymmetricMatrixConstantCoefficient was broken due to there not being any DenseSymmetricMatrix &DenseSymmetricMatrix::operator=(const DenseSymmetricMatrix &m) -- I defined this
mat was being used as an auxiliary variable in SymmetricMatrixCoefficient while derived classes SymmetricMatrixConstantCoefficient and SymmetricMatrixFunctionCoefficient each had their own mat variables. I renamed the auxiliary one mat_aux for clarification, and moved GetMatrix() to only be a member function of SymmetricMatrixConstantCoefficient, as it seems to have intended to be based on the documentation for it.
I removed the override of Print() for DenseSymmetricMatrix (did this as I was debugging, but it's more convenient to just use Matrix::Print() than not have anything implemented anyways)

fem/coefficient.hpp

tests/unit/fem/test_coefficient.cpp

linalg/symmat.cpp

fem/coefficient.hpp

paul-hilscher

Thanks for the fix and the tests! Great stuff.

j-signorelli · 2024-05-15T19:13:42Z

@paul-hilscher @najlkin Thank you for your feedback -- if there is anything else to adjust please let me know, otherwise this should be ready to merge into next @tzanio

tzanio

Thanks @j-signorelli !

tzanio · 2024-05-17T00:37:56Z

Merged in next for testing...

tzanio · 2024-05-17T17:33:27Z

There is a failure with this PR in our regression testing on Lassen (V100 machine) when building optimized + CUDA. The error is below, can you double-check the code?

INFO: Test filter: ~[Parallel] 
Filters: ~[Parallel]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
unit_tests is a Catch v2.13.9 host application.
Run with -? for options
-------------------------------------------------------------------------------
Symmetric Matrix Coefficient
-------------------------------------------------------------------------------
/usr/WS2/mfem/gitlab-runner/kolev/builds/zkQzXxmM/001/gitlab/mfem/mfem-autotest/tests/unit/fem/test_coefficient.cpp:307
...............................................................................
/usr/WS2/mfem/gitlab-runner/kolev/builds/zkQzXxmM/001/gitlab/mfem/mfem-autotest/tests/unit/fem/test_coefficient.cpp:332: FAILED:
  REQUIRE( qf.DistanceTo(values) == MFEM_Approx(0.0) )
with expansion:
  0.5017307833147915 == Approx( 0 )
===============================================================================
test cases:     297 |     296 passed | 1 failed
assertions: 2828545 | 2828544 passed | 1 failed
make[1]: *** [makefile:240: unit_tests-test-seq] Error 1

najlkin · 2024-05-17T17:58:02Z

I think the problem might be that DistanceTo sums totally different numbers, it would be better to sum the relative errors, i.e., instead of (x_i-y_i)^2 sum (x_i-y_i)^2/y_i^2 😉

sebastiangrimberg · 2024-05-17T19:25:59Z

There is a failure with this PR in our regression testing on Lassen (V100 machine) when building optimized + CUDA. The error is below, can you double-check the code?

I'm pretty sure this is because Vector::DistanceTo is assumes the vector is not using the device while the added test uses a QuadratureFunction which does use the device by default. The following patch to linalg/vector.hpp would hopefully fix this:

diff --git a/linalg/vector.hpp b/linalg/vector.hpp
index 00bb625f9..292ac6fec 100644
--- a/linalg/vector.hpp
+++ b/linalg/vector.hpp
@@ -678,24 +678,24 @@ inline real_t Distance(const Vector &x, const Vector &y)

 inline real_t Vector::DistanceSquaredTo(const real_t *p) const
 {
-   return DistanceSquared(data, p, size);
+   return DistanceSquared(HostRead(), p, size);
 }

 inline real_t Vector::DistanceSquaredTo(const Vector &p) const
 {
    MFEM_ASSERT(p.Size() == Size(), "Incompatible vector sizes.");
-   return DistanceSquared(data, p.data, size);
+   return DistanceSquared(HostRead(), p.HostRead(), size);
 }

 inline real_t Vector::DistanceTo(const real_t *p) const
 {
-   return Distance(data, p, size);
+   return Distance(HostRead(), p, size);
 }

 inline real_t Vector::DistanceTo(const Vector &p) const
 {
    MFEM_ASSERT(p.Size() == Size(), "Incompatible vector sizes.");
-   return Distance(data, p.data, size);
+   return Distance(HostRead(), p.HostRead(), size);
 }

 /// Returns the inner product of x and y

Probably for a debug GPU build one of the asserts in the type-casts in mem_manager.hpp would have failed to let us know of this.

najlkin · 2024-05-17T20:01:00Z

Yes, that would explain why it happens only in this context. I presumed these basic methods are safe, but MFEM can always surprise 😄 . It goes quite far beyond the scope of this PR, so it would be perhaps cleaner to just turn off the device processing and open a separate PR for this.
Besides that, I would still recommend the relative errors, but it is not critical and only makes the test slightly insensitive.

sebastiangrimberg · 2024-05-17T20:03:13Z

It's a (minor) bug fix so I would encourage adding the patch as part of this PR. But I will let the editor @tzanio decide.

v-dobrev · 2024-05-17T22:10:22Z

These "distance" methods were meant for small vectors on CPU where the overhead of HostRead (in terms of code and execution time) is not desirable. We can add an MFEM_ASSERT to check they are not used on device. For big vectors that can be on device, one can use subtract and Normlinf() or if one really wants $\ell_2$ (i.e. Eucledian) norm, Norml2().

tests/unit/fem/test_coefficient.cpp

v-dobrev · 2024-05-17T23:28:14Z

Hmm, I'm still not sure how but in my tests with a CUDA build, the unit tests pass. I'm using a debug build but that should not change the result unless it produces an error from a failed check which it does not.

v-dobrev · 2024-05-18T02:19:57Z

It turns out the unit test failure is due to a compiler issue: when using -O3, IBM XL C++ enables optimizations that can cause issues in some cases and it looks like this is one of those cases. Switching to -O2 optimization with the same compiler fixes the problem.

v-dobrev · 2024-05-18T03:34:28Z

I noticed a potential issue when SymmetricMatrixCoefficient::ProjectSymmetric is called with QuadratureFunction &qf that is valid on device. Since we are accessing its entries using aliases, and all of that is done on host, it is best to call

   qf.HostWrite();

sometime before the start of the loop over elements, here:

mfem/fem/coefficient.cpp

Line 812 in 13e1067

for (int iel = 0; iel < ne; ++iel)

tzanio · 2024-05-18T19:47:04Z

@j-signorelli what do you think about the discussion and suggestions above ☝️ ?

Fix matrix dimension

365b2a0

j-signorelli self-assigned this Apr 26, 2024

j-signorelli added bug fem ready-for-review labels Apr 26, 2024

Merge branch 'master' into symmatcoeff-project-fix

98e0f32

tzanio added in-review and removed ready-for-review labels May 11, 2024

tzanio assigned najlkin, paul-hilscher and tzanio May 11, 2024

tzanio requested review from najlkin and paul-hilscher May 11, 2024 15:15

tzanio added this to Review Now in Pull Requests via automation May 11, 2024

tzanio added this to the mfem-4.8 milestone May 11, 2024

najlkin approved these changes May 13, 2024

View reviewed changes

j-signorelli added 5 commits May 14, 2024 12:01

Merge branch 'master' of github.com:mfem/mfem into symmatcoeff-projec…

62a57c3

…t-fix

Do not override Matrix::Print with an un-implemented version in Dense…

d89ccea

…SymmetricMatrix

Add copy of DenseSymmetricMatrix = operator fxn

c54e92a

Rename auxiliary matrix used in SymmetricMatrixCoefficient + move Get…

2b6029a

…Matrix to Constant one

Add reg test for SymmetricMatrixCoefficient

69ac6a0

j-signorelli requested a review from najlkin May 14, 2024 19:57

style

89bb334

najlkin reviewed May 14, 2024

View reviewed changes

fem/coefficient.hpp Show resolved Hide resolved

tests/unit/fem/test_coefficient.cpp Outdated Show resolved Hide resolved

paul-hilscher reviewed May 14, 2024

View reviewed changes

linalg/symmat.cpp Show resolved Hide resolved

paul-hilscher reviewed May 14, 2024

View reviewed changes

fem/coefficient.hpp Show resolved Hide resolved

paul-hilscher approved these changes May 14, 2024

View reviewed changes

j-signorelli added 2 commits May 15, 2024 14:01

Keep but deprecated SymmetricMatrixCoefficient::GetMatrix

54dcdc7

Randomize unit test vector values

9834126

minor

13e1067

tzanio approved these changes May 17, 2024

View reviewed changes

tzanio added the in-next label May 17, 2024

tzanio added a commit that referenced this pull request May 17, 2024

Merge branch 'symmatcoeff-project-fix' into next (PR #4271)

7820759

v-dobrev reviewed May 17, 2024

View reviewed changes

tests/unit/fem/test_coefficient.cpp Show resolved Hide resolved

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix `SymmetricMatrixCoefficient::ProjectSymmetric` bug #4271

Fix `SymmetricMatrixCoefficient::ProjectSymmetric` bug #4271

j-signorelli commented Apr 26, 2024 •

edited by tzanio

j-signorelli commented Apr 26, 2024

sebastiangrimberg commented Apr 26, 2024 •

edited

tzanio commented May 11, 2024

najlkin left a comment

paul-hilscher commented May 13, 2024 •

edited

j-signorelli commented May 14, 2024

paul-hilscher left a comment

j-signorelli commented May 15, 2024

tzanio left a comment

tzanio commented May 17, 2024

tzanio commented May 17, 2024

najlkin commented May 17, 2024

sebastiangrimberg commented May 17, 2024 •

edited

najlkin commented May 17, 2024

sebastiangrimberg commented May 17, 2024

v-dobrev commented May 17, 2024 •

edited

v-dobrev commented May 17, 2024

v-dobrev commented May 18, 2024

v-dobrev commented May 18, 2024

tzanio commented May 18, 2024

Fix SymmetricMatrixCoefficient::ProjectSymmetric bug #4271

Are you sure you want to change the base?

Fix SymmetricMatrixCoefficient::ProjectSymmetric bug #4271

Conversation

j-signorelli commented Apr 26, 2024 • edited by tzanio

j-signorelli commented Apr 26, 2024

sebastiangrimberg commented Apr 26, 2024 • edited

tzanio commented May 11, 2024

najlkin left a comment

Choose a reason for hiding this comment

paul-hilscher commented May 13, 2024 • edited

j-signorelli commented May 14, 2024

paul-hilscher left a comment

Choose a reason for hiding this comment

j-signorelli commented May 15, 2024

tzanio left a comment

Choose a reason for hiding this comment

tzanio commented May 17, 2024

tzanio commented May 17, 2024

najlkin commented May 17, 2024

sebastiangrimberg commented May 17, 2024 • edited

najlkin commented May 17, 2024

sebastiangrimberg commented May 17, 2024

v-dobrev commented May 17, 2024 • edited

v-dobrev commented May 17, 2024

v-dobrev commented May 18, 2024

v-dobrev commented May 18, 2024

tzanio commented May 18, 2024

Fix `SymmetricMatrixCoefficient::ProjectSymmetric` bug #4271

Fix `SymmetricMatrixCoefficient::ProjectSymmetric` bug #4271

j-signorelli commented Apr 26, 2024 •

edited by tzanio

sebastiangrimberg commented Apr 26, 2024 •

edited

paul-hilscher commented May 13, 2024 •

edited

sebastiangrimberg commented May 17, 2024 •

edited

v-dobrev commented May 17, 2024 •

edited