Add functions for retrieving the default tile sizes in MDRangePolicy #6839

Open
wants to merge 5 commits into
base: develop
Changes from 1 commit
35 changes: 32 additions & 3 deletions core/src/KokkosExp_MDRangePolicy.hpp
@@ -314,7 +314,38 @@ struct MDRangePolicy : public Kokkos::Impl::PolicyTraits<Properties...> {
  }
  bool impl_tune_tile_size() const { return m_tune_tile_size; }

  tile_type tile_size_recommended() const {
    auto properties = Impl::get_tile_size_properties(m_space);
    tile_type default_tile_size = {};

    for (std::size_t i = 0; i < default_tile_size.size(); ++i) {
      default_tile_size[i] = properties.default_tile_size;
    }

    int last_rank = (inner_direction == Iterate::Right) ? rank - 1 : 0;
    default_tile_size[last_rank] = tile_length_last_rank(
        properties, m_upper[last_rank] - m_lower[last_rank]);
    return default_tile_size;
  }

  int tile_length_max_recommended_per_rank(const int tile_rank) const {
Contributor

Why is this one called tile_length_* while all the others are tile_size_*? We don't use tile_length anywhere else.

Contributor Author

I had mixed thoughts on this. Unlike the other two functions, this one returns only the length of a tile in one dimension (for a specified rank). But in the interest of keeping the interfaces consistent, I changed it to tile_size_*.

    auto properties = Impl::get_tile_size_properties(m_space);
    return tile_length_last_rank(properties,
                                 m_upper[tile_rank] - m_lower[tile_rank]);
  }
Member

Let's only expose tile_size_recommended and max_total_tile_size. I am not sure how a user would correctly use largest_tile_size_recommended, which really is more like a "recommended tile size in the rank which corresponds to the fastest iteration".

Contributor Author

Removed largest_tile_size_recommended().
I have no strong opinion on keeping or removing it. But for Cuda/HIP/SYCL, the tile size of '16' is used by default in the rank with the fastest iteration. Users won't have access to this default number without this function.
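
For reference, here is a minimal standalone sketch of the fallback discussed above, assuming it follows tile_length_last_rank() as shown in the diff: a nonzero backend default (such as the 16 mentioned for Cuda/HIP/SYCL) is returned as-is, and otherwise the tile covers the whole extent. TilePropsSketch and fastest_rank_tile are hypothetical names used only for illustration; they are not Kokkos API.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the one Impl::TileSizeProperties field used here.
struct TilePropsSketch {
  int default_largest_tile_size;  // 0 means "no backend default"
};

// Mirrors the fallback in tile_length_last_rank(): use the backend default
// when it is nonzero, otherwise cover the whole extent (but at least 1).
inline int fastest_rank_tile(const TilePropsSketch& props, long extent) {
  assert(props.default_largest_tile_size >= 0);
  if (props.default_largest_tile_size == 0) {
    return static_cast<int>(std::max<long>(extent, 1));
  }
  return props.default_largest_tile_size;
}
```

With a default of 16 and an extent of 100 this returns 16; with a default of 0 it returns the extent, 100. A caller who only has tile_size_recommended() can still read this value from the entry for the fastest-iterating rank.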


  int tile_size_max_total() const {
    return Impl::get_tile_size_properties(m_space).max_total_tile_size;
  }

 private:
  int tile_length_last_rank(const Impl::TileSizeProperties properties,
                            const index_type length) const {
    return properties.default_largest_tile_size == 0
               ? std::max<int>(length, 1)
               : properties.default_largest_tile_size;
  }

  void init_helper(Impl::TileSizeProperties properties) {
    m_prod_tile_dims = 1;
    int increment = 1;
@@ -352,9 +383,7 @@ struct MDRangePolicy : public Kokkos::Impl::PolicyTraits<Properties...> {
             m_tile[i] = 1;
           }
         } else {
-          m_tile[i] = properties.default_largest_tile_size == 0
-                          ? std::max<int>(length, 1)
-                          : properties.default_largest_tile_size;
+          m_tile[i] = tile_length_last_rank(properties, length);
         }
       }
       m_tile_end[i] =
46 changes: 46 additions & 0 deletions core/unit_test/TestMDRangePolicyConstructors.hpp
@@ -138,4 +138,50 @@ TEST(TEST_CATEGORY_DEATH, policy_invalid_bounds) {
}
#endif

TEST(TEST_CATEGORY, policy_get_tile_size) {
  constexpr int rank = 3;
  using Policy = Kokkos::MDRangePolicy<TEST_EXECSPACE, Kokkos::Rank<rank>>;
  using tile_type = typename Policy::tile_type;

Member

Please assert that the recommended tile size is "valid"; that is, the tile size is non-negative (if appropriate) for each dimension and the total flattened size does not exceed the limit.
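
One way such a validity check could look, sketched without Kokkos so it stands alone. tile_is_valid is a hypothetical helper, not part of the PR, and it assumes "valid" means strictly positive in every dimension (the stricter reading) plus a flattened size within max_total.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper, not Kokkos API: true when every tile extent is
// positive and the flattened tile size stays within max_total.
template <std::size_t Rank>
bool tile_is_valid(const std::array<int, Rank>& tile, long long max_total) {
  assert(max_total > 0);
  long long product = 1;
  for (int t : tile) {
    if (t <= 0) return false;
    // Compare before multiplying so the running product cannot overflow.
    if (product > max_total / t) return false;
    product *= t;
  }
  return true;
}
```

In the unit test, the same two conditions could be expressed as one EXPECT_GT per rank plus a single EXPECT_LE on the product against policy.tile_size_max_total().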

Contributor Author

Added checks to confirm that the recommended tile size per rank is a positive value and is less than the recommended largest tile size per rank.

Contributor Author

As for checking the total flattened size not exceeding the limit, I was going to add the following unit test:

#ifndef KOKKOS_ENABLE_OPENMPTARGET  // FIXME_OPENMPTARGET
TEST(TEST_CATEGORY_DEATH, policy_tile_size_exceeding_limit) {
  // Rank<3> to match the three-element bounds and tile size used below.
  using Policy = Kokkos::MDRangePolicy<TEST_EXECSPACE, Kokkos::Rank<3>>;
  using tile_type = typename Policy::tile_type;

  ::testing::FLAGS_gtest_death_test_style = "threadsafe";
  ASSERT_DEATH(
      {
        auto max_threads =
            Kokkos::Impl::get_tile_size_properties(TEST_EXECSPACE())
                .max_threads;
        printf("%d\n", max_threads);
        (void)Policy({0, 0, 0}, {100, 100, 100},
                     tile_type{{max_threads, max_threads, max_threads}});
      }, "");
}
#endif

but it seems that there may be some deficiencies in the part of MDRangePolicy that checks for this condition:

    m_prod_tile_dims *= m_tile[i];
  }
  if (m_prod_tile_dims > static_cast<index_type>(properties.max_threads)) {
    printf(" Product of tile dimensions exceed maximum limit: %d\n",
           static_cast<int>(properties.max_threads));
    Kokkos::abort(
        "ExecSpace Error: MDRange tile dims exceed maximum number "
        "of threads per block - choose smaller tile dims");
  }

Currently the product of all tile sizes is checked against properties.max_threads instead of properties.max_total_tile_size. For Cuda and HIP, properties.max_total_tile_size != properties.max_threads, so some inconsistencies are found when running the unit test. I think this part will need to be addressed separately.
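
To make the mismatch concrete, here is a small standalone sketch. TileLimitsSketch, LimitKind, and exceeds_limit are hypothetical names, and the limits 1024 and 512 used in the note below are illustrative assumptions chosen only so that the two limits differ; they are not taken from this PR.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for the two limits in Impl::TileSizeProperties.
struct TileLimitsSketch {
  int max_threads;
  int max_total_tile_size;
};

enum class LimitKind { MaxThreads, MaxTotalTileSize };

// Returns true when the product of the tile extents exceeds the chosen limit.
inline bool exceeds_limit(const std::array<int, 3>& tile,
                          const TileLimitsSketch& limits, LimitKind kind) {
  long long product = 1;
  for (int t : tile) {
    assert(t > 0);
    product *= t;
  }
  const int limit = (kind == LimitKind::MaxThreads)
                        ? limits.max_threads
                        : limits.max_total_tile_size;
  return product > limit;
}
```

Under those assumed limits, a {2, 4, 80} tile (640 elements) passes a check against max_threads but fails one against max_total_tile_size, so a death test written against one limit would not fire when the constructor checks the other.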

  std::size_t last_rank =
      (Policy::inner_direction == Kokkos::Iterate::Right) ? rank - 1 : 0;

  auto default_size_properties =
      Kokkos::Impl::get_tile_size_properties(TEST_EXECSPACE());
Comment on lines +149 to +150
Member

I am confused about what that test is trying to achieve.

Contributor Author

The main purpose of this test is to verify that the new functions correctly return the default tile sizes that are used internally during the construction of MDRangePolicy. Some of these are taken from the values that each backend sets in a function like this one:

template <typename ExecutionSpace>
TileSizeProperties get_tile_size_properties(const ExecutionSpace&) {
  // Host settings
  TileSizeProperties properties;
  properties.max_threads = std::numeric_limits<int>::max();
  properties.default_largest_tile_size = 0;
  properties.default_tile_size = 2;
  properties.max_total_tile_size = std::numeric_limits<int>::max();
  return properties;
}

with the exception of the last rank of the policy.
So the test checks against the internal values to confirm that the functions return the right values.
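
The expectation the test encodes can be sketched without Kokkos as follows, assuming the logic of tile_size_recommended() shown in the diff. DefaultsSketch and recommended_tile are hypothetical names, and the inner_right flag stands in for the inner_direction == Iterate::Right comparison; the host values 2 and 0 come from the snippet above.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical mirror of the two defaults read by tile_size_recommended().
struct DefaultsSketch {
  int default_tile_size;
  int default_largest_tile_size;  // 0 means "use the extent"
};

// Fill every rank with default_tile_size, then override the fastest rank.
template <std::size_t Rank>
std::array<int, Rank> recommended_tile(const DefaultsSketch& d,
                                       const std::array<long, Rank>& extent,
                                       bool inner_right) {
  static_assert(Rank > 0, "rank must be positive");
  assert(d.default_tile_size > 0);
  std::array<int, Rank> tile{};
  tile.fill(d.default_tile_size);
  const std::size_t last = inner_right ? Rank - 1 : 0;
  tile[last] = d.default_largest_tile_size == 0
                   ? static_cast<int>(std::max<long>(extent[last], 1))
                   : d.default_largest_tile_size;
  return tile;
}
```

With the host settings (2 and 0), extents of 100 in each of three ranks, and inner_right set to true, this yields {2, 2, 100}, which is the per-rank pattern the EXPECT_EQ calls in the test compare against.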


  {
    Policy policy({0, 0, 0}, {100, 100, 100}, tile_type{{2, 4, 16}});

    auto rec_tile_size = policy.tile_size_recommended();

    EXPECT_EQ(default_size_properties.max_total_tile_size,
              policy.tile_size_max_total());

    for (std::size_t i = 0; i < rank; ++i) {
      if (i != last_rank) {
        EXPECT_EQ(default_size_properties.default_tile_size, rec_tile_size[i]);
      } else {
#if defined(KOKKOS_ENABLE_CUDA) || defined(KOKKOS_ENABLE_HIP) || \
    defined(KOKKOS_ENABLE_SYCL)
        if (default_size_properties.default_largest_tile_size == 0)
          EXPECT_EQ(100, rec_tile_size[i]);
        else
          EXPECT_EQ(default_size_properties.default_largest_tile_size,
                    rec_tile_size[i]);
#else
        EXPECT_EQ(policy.tile_length_max_recommended_per_rank(last_rank),
                  rec_tile_size[i]);
#endif
      }
#if defined(KOKKOS_ENABLE_CUDA) || defined(KOKKOS_ENABLE_HIP) || \
    defined(KOKKOS_ENABLE_SYCL)
      EXPECT_EQ(default_size_properties.default_largest_tile_size,
                policy.tile_length_max_recommended_per_rank(i));
#else
      EXPECT_EQ(100, policy.tile_length_max_recommended_per_rank(i));
#endif
    }
  }
}

} // namespace