Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

performance_ex1 with triangles #4202

Open
aschaf opened this issue Mar 19, 2024 · 6 comments
Open

performance_ex1 with triangles #4202

aschaf opened this issue Mar 19, 2024 · 6 comments

Comments

@aschaf
Copy link
Member

aschaf commented Mar 19, 2024

Hi,

I was playing around with the performance miniapp performance_ex1, and in order to be able to run the program with the -perf and -asm option on triangular meshes, I had to apply a small fix to tbilinearform.hpp. At

mfem/fem/tbilinearform.hpp

Lines 505 to 507 in 37e03ab

const Array<int> *dof_map = sol_fe.GetDofMap();
const int *dof_map_ = dof_map->GetData();
DenseMatrix M_loc_perm(dofs*vdim,dofs*vdim); // initialized with zeros

I replaced line 506 with

const int *dof_map_ = (dof_map) ? dof_map->GetData() : NULL;

which then allowed the program to run with -perf -ams. Second, I also tried using a TPiecewiseConstCoefficient and I found an error in

mfem/fem/teltrans.hpp

Lines 94 to 106 in 37e03ab

template <typename vint_t, int NE>
inline MFEM_ALWAYS_INLINE
void SetAttributes(int el, vint_t (&attrib)[NE]) const
{
const int vsize = sizeof(vint_t)/sizeof(attrib[0][0]);
for (int i = 0; i < NE; i++)
{
for (int j = 0; j < vsize; i++)
{
attrib[i][j] = elements[el+j+i*vsize]->GetAttribute();
}
}
}

where the increment in the inner loop increases the wrong variable. If these fixes are reasonable, I would open a pull request fixing those issues.

Best,
Andreas

@v-dobrev
Copy link
Member

Your suggested fixes look good to me. Thanks @aschaf!

@tzanio tzanio self-assigned this Mar 19, 2024
@aschaf
Copy link
Member Author

aschaf commented Mar 20, 2024

I opened #4206 to resolve the issue.

A related question: if I want to further speed up the assembly (eg. on my laptop) should I use the MPI parallel version to utilize all available cores, or would it make sense to somehow extend the TBilinearform to utilize eg. OpenMP for the assembly loop?

@v-dobrev
Copy link
Member

A related question: if I want to further speed up the assembly (eg. on my laptop) should I use the MPI parallel version to utilize all available cores, or would it make sense to somehow extend the TBilinearform to utilize eg. OpenMP for the assembly loop?

The MPI-parallel version should be easier to try -- just look at how miniapps/performance/ex1p.cpp differes from miniapps/performance/ex1.cpp. Implementing an OpenMP version will require some effort and in the end may actually be slower.

@aschaf
Copy link
Member Author

aschaf commented May 14, 2024

I noticed another issue I somewhat missed in the pull request. When turning on SIMD, it again crashes at line 103 in teltrans.hpp

mfem/fem/teltrans.hpp

Lines 94 to 106 in 7c296d0

template <typename vint_t, int NE>
inline MFEM_ALWAYS_INLINE
void SetAttributes(int el, vint_t (&attrib)[NE]) const
{
const int vsize = sizeof(vint_t)/sizeof(attrib[0][0]);
for (int i = 0; i < NE; i++)
{
for (int j = 0; j < vsize; j++)
{
attrib[i][j] = elements[el+j+i*vsize]->GetAttribute();
}
}
}

specifically because at some point el+j+i*vsize is greater than the number of elements. A quick and dirty fix was just to replace line 103 with

attrib[i][j] = ((el+j+i*vsize) < fes.GetNE()) ? elements[el+j+i*vsize]->GetAttribute() : 1;

@v-dobrev
Copy link
Member

@aschaf, you are right -- this is a bug. In the respective situation in VectorExtract, we stop copying values when we reach the last element, see

void VectorExtract(const vec_layout_t &vl,

A better approach for here and for VectorExtract is probably to replicate the last element attribute or values, so that all SIMD entries have meaningful values.

@aschaf
Copy link
Member Author

aschaf commented May 14, 2024

@v-dobrev Thank you for the explanation, I will open a PR once this is fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants