Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AWS tutorial #1171

Open
wants to merge 17 commits into
base: develop
Choose a base branch
from
Open

AWS tutorial #1171

wants to merge 17 commits into from

Conversation

kennyweiss
Copy link
Member

@kennyweiss kennyweiss commented Aug 21, 2023

Summary

  • This PR adds an axom tutorial building up to a GPU- and spatial index-accelerated application akin to our mesh_tester utility (in <axom>/src/tools/mesh_tester)
  • The tutorial currently has lessons on
    • building/configuring an application against an installed axom
    • reading in an STL mesh
    • developing a naive self-intersection algorithm
  • The following lessons will be added in short order
    • extending the naive algorithm using RAJA and Umpire via axom::forall
    • developing a RAJA-based algorithm using spatial indexes
  • I will also add a top-level README outlining the structure of the tutorial

Misc changes:

  • Fixes some warnings in Release configs
  • Optimizes primal::intersect(BoundingBox, BoundingBox)
  • Fixes a bug with axom::utilities::locale() when the user's default locale is missing
  • Fixes Component graph is missing #1109 -- we can once again generate graphs in our readthedocs builds

@kennyweiss kennyweiss self-assigned this Aug 21, 2023
@kennyweiss kennyweiss changed the title [WIP] aws tutorial [WIP] AWS tutorial Aug 24, 2023
@kennyweiss kennyweiss changed the title [WIP] AWS tutorial AWS tutorial Aug 24, 2023
@kennyweiss kennyweiss marked this pull request as ready for review August 24, 2023 02:51
otherBB.m_min[i],
otherBB.m_max[i]);
} // END for all dimensions
if(!detail::intersect_bbox_bbox(m_min[i],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Nice optimization

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! For the "naive" self-intersection algorithm in the tutorial, the intersect(BoundingBox,BoundingBox) test used something like half the cycles on a test that ran for a few seconds. This optimization improved the runtime by something like 30% on our blueos Power9s.

I still need to benchmark this before merging to ensure that it didn't give up performance on Intel runs.

Tag: @publixsubfan

# configure and install 'radiuss_tutorial' example
#------------------------------------------------------------------------------
configure_file(
radiuss_tutorial/host-config.cmake.in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would just call this the tutorial... but we can debate/fix that later.

#------------------------------------------------------------------------------

set(tutorial_deps axom axom::fmt axom::cli11)
blt_list_append(TO tutorial_deps ELEMENTS cuda IF CUDA_FOUND)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
blt_list_append(TO tutorial_deps ELEMENTS cuda IF CUDA_FOUND)
blt_list_append(TO tutorial_deps ELEMENTS cuda IF ENABLE_CUDA)

# Config related to compiler
set(CMAKE_C_COMPILER "@CMAKE_C_COMPILER@" CACHE PATH "")
set(CMAKE_CXX_COMPILER "@CMAKE_CXX_COMPILER@" CACHE PATH "")
set(CMAKE_Fortran_COMPILER "@CMAKE_Fortran_COMPILER@" CACHE PATH "")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is fortran being used in this tutorial?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No -- I adapted this from the using_with_blt one. I'm pretty sure our TPLs were also built w/ fortran in the docker image

};
```

The constructor calls ``slic::initialize()``, sets the logging level to ``slic::message::Debug`` and then creates log streams for the different message levels. The destructor finalizes the slic component with `slic::finalize()`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The constructor calls ``slic::initialize()``, sets the logging level to ``slic::message::Debug`` and then creates log streams for the different message levels. The destructor finalizes the slic component with `slic::finalize()`.
The constructor calls ``slic::initialize()``, sets the logging level to ``slic::message::Debug`` and then creates log streams for the different message levels. The destructor finalizes the SLIC component with `slic::finalize()`.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is component the right word here in the last sentence?

@@ -97,7 +97,7 @@ std::locale locale(const std::string& name)
}
catch(std::runtime_error&)
{
loc = std::locale(loc, "", std::locale::ctype);
loc = std::locale(loc, "C", std::locale::ctype);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Locales are the gift that keep on giving!

The standard appears to require std::locale("C") and std::locale(""), but apparently the latter cannot always be relied upon. See: https://stackoverflow.com/a/46364980
Specifically -- one of the comments there says:

Note that even though "" is in the set of valid string arguments, I've seen exceptions thrown on std::locale("") (from a plugin run under LibreOffice on Mac)

The original was failing in our AWS docker image (but passing in our docker-based CI, and everywhere else I've run it).

See also: #1128

Copy link
Member

@white238 white238 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the hard work Kenny! This looks great and I look forward for us to continue to build on it.

Thanks @white238!

Co-authored-by: Chris White <white238@llnl.gov>
@kennyweiss
Copy link
Member Author

Docs build for this branch w/ working graphviz: https://axom.readthedocs.io/en/feature-kweiss-aws-tutorial/

Copy link
Member

@rhornung67 rhornung67 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kennyweiss This looks like a great foundation to build on for a more comprehensive tutorial

Comment on lines +427 to +429
IndexArray intersections_d(axom::ArrayOptions::Uninitialized {},
numIndices,
kernel_allocator);
Copy link
Contributor

@bmhan12 bmhan12 Aug 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
IndexArray intersections_d(axom::ArrayOptions::Uninitialized {},
numIndices,
kernel_allocator);
IndexArray intersections_d(axom::ArrayOptions::Uninitialized {},
numIndices,
numIndices,
kernel_allocator);

Bugfix for CUDA policy. Lesson 4 also has a few instances of this constructor call as well. Looks like the uninitialized constructor does work, just needs a different overload. @kennyweiss reported this works on rzansel as-is, maybe it's the Automatic Translatation Service covering for us?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @bmhan12

My solution was to use the initializing constructor -- e,g,

    IndexArray intersections_d(numIndices,numIndices,kernel_allocator);

Yours is better!

I'll update the code for lesson_03 and lesson_04.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Component graph is missing
4 participants