Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add functions for retrieving the default tile sizes in MDRangePolicy #6839

Open
wants to merge 5 commits into
base: develop
Choose a base branch
from

Conversation

ldh4
Copy link
Contributor

@ldh4 ldh4 commented Feb 23, 2024

Resolves #6609

This PR adds query functions in MDRangePolicy to allow retrieving the default tile sizes that are used internally, if a custom tile size is not specified during the construction of the policy:

  • tile_type tile_size_recommended() const
  • int tile_size_recommended(const int tile_rank) const
  • tile_type largest_tile_size_recommended() const
  • int largest_tile_size_recommended(const int tile_rank) const
  • int max_total_tile_size() const

The base setting of TileSizeProperties is in:

struct TileSizeProperties {
int max_threads;
int default_largest_tile_size;
int default_tile_size;
int max_total_tile_size;
};
template <typename ExecutionSpace>
TileSizeProperties get_tile_size_properties(const ExecutionSpace&) {
// Host settings
TileSizeProperties properties;
properties.max_threads = std::numeric_limits<int>::max();
properties.default_largest_tile_size = 0;
properties.default_tile_size = 2;
properties.max_total_tile_size = std::numeric_limits<int>::max();
return properties;
}

And this setting is used to configure the tile size during the construction of the policy.

For Cuda, HIP and SYCL, the settings of TileSizeProperties are in Kokkos_Cuda_MDRangePolicy.hpp, Kokkos_HIP_MDRangePolicy.hpp and Kokkos_SYCL_MDRangePolicy.hpp, respectively.

return default_tile_size;
}

int tile_length_max_recommended_per_rank(const int tile_rank) const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this one called tile_length_* while all the others are tile_size_*? We don't use tile_length anywhere else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had mixed thoughts on this. Unlike the other two functions, this one only returns a length of a tile in 1 dimension (of a specified rank). But in the interest of keep the interfaces consistent, changed it to tile_size_*.

core/src/KokkosExp_MDRangePolicy.hpp Outdated Show resolved Hide resolved
core/unit_test/TestMDRangePolicyConstructors.hpp Outdated Show resolved Hide resolved
core/unit_test/TestMDRangePolicyConstructors.hpp Outdated Show resolved Hide resolved
Copy link
Member

@dalg24 dalg24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need/want tile_length_max_recommended_per_rank?

Not sure about the name for tile_size_max_total.

Comment on lines +149 to +150
auto default_size_properties =
Kokkos::Impl::get_tile_size_properties(TEST_EXECSPACE());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused about what that test is trying to achieve.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main purpose of this test is to verify that the new functions are correctly returning the default tile sizes that are internally used during the construction of MDRangePolicy. Some of these are used from the values set in a function for each backend:

template <typename ExecutionSpace>
TileSizeProperties get_tile_size_properties(const ExecutionSpace&) {
// Host settings
TileSizeProperties properties;
properties.max_threads = std::numeric_limits<int>::max();
properties.default_largest_tile_size = 0;
properties.default_tile_size = 2;
properties.max_total_tile_size = std::numeric_limits<int>::max();
return properties;
}

with exception to the last rank of the policy.
So, the test is checking against the internal value to confirm that the function will be returning the right values.

core/unit_test/TestMDRangePolicyConstructors.hpp Outdated Show resolved Hide resolved
constexpr int rank = 3;
using Policy = Kokkos::MDRangePolicy<TEST_EXECSPACE, Kokkos::Rank<rank>>;
using tile_type = typename Policy::tile_type;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please assert that the recommended tile size is "valid", that is the tile size is non-negative (if appropriate) for each dimension and the total flattened size does not exceed the limit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added checks to confirm that the recommended tile size per rank is positive value and is less than the recommended largest tile size per rank.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for checking the total flattened size not exceeding the limit, I was going to add the following unit test:

#ifndef KOKKOS_ENABLE_OPENMPTARGET  // FIXME_OPENMPTARGET
TEST(TEST_CATEGORY_DEATH, policy_tile_size_exceeding_limit) {
  using Policy = Kokkos::MDRangePolicy<TEST_EXECSPACE, Kokkos::Rank<2>>;
  using tile_type = typename Policy::tile_type;

  ::testing::FLAGS_gtest_death_test_style = "threadsafe";
  ASSERT_DEATH(
      {
        auto max_threads =
            Kokkos::Impl::get_tile_size_properties(TEST_EXECSPACE())
                .max_threads;
        printf("%d\n", max_threads);
        (void)Policy({0, 0, 0}, {100, 100, 100},
                     tile_type{{max_threads, max_threads, max_threads}});
      }, "");
}
#endif

but it seems that there may be some deficiencies in MDRangePolicy that checks for this condition.

m_prod_tile_dims *= m_tile[i];
}
if (m_prod_tile_dims > static_cast<index_type>(properties.max_threads)) {
printf(" Product of tile dimensions exceed maximum limit: %d\n",
static_cast<int>(properties.max_threads));
Kokkos::abort(
"ExecSpace Error: MDRange tile dims exceed maximum number "
"of threads per block - choose smaller tile dims");
}

Currently the product of all tile sizes are checked against properties.max_threads instead of properties.max_total_tile_size. For Cuda and HIP, properties.max_total_tile_size != properties.max_threads, so some inconsistencies are found when running the unit test. I think this part will need to be addressed separately.

auto properties = Impl::get_tile_size_properties(m_space);
return tile_size_last_rank(properties,
m_upper[tile_rank] - m_lower[tile_rank]);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets only expose tile_size_recommended and max_total_tile_size, I am not sure how a user would correctly use largest_tile_size_recommended which really is more like a "recommended tile size in the rank which corresponds to the fastest iteration".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed largest_tile_size_recommended().
I have no strong opinion on keeping or removing it. But for Cuda/HIP/SYCL, the tile size of '16' is used by default in the rank with the fastest iteration. Users won't have access to this default number without this function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants