Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crimson: Support basic deployments #57593

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

Matan-B
Copy link
Contributor

@Matan-B Matan-B commented May 21, 2024

This new options is there to allow basic cluster deployments.
crimson_seastar_num_threads can be used globally for all the OSDs in the cluster:

osd:
  crimson_seastar_num_threads: <n>

As a result, CPUs 0-<n-1> will be allocated to seastar threads and CPUs <n>-<nproc-1>
will be allocated to Alienstore threads

Note: The core allocation will overlap between OSDS. An optiomal deployment
requires crimson_alien_thread_cpu_cores and crimson_seastar_cpu_cores to be set
for each OSD based on the host its located at.

Documentation followup: #57352

  /*
   * In descending order, one of the following options *must* be set:
   * 1) crimson_alien_thread_cpu_cores
   * 2) crimson_seastar_num_threads
   * 3) crimson_seastar_cpu_cores
  */

Fixes: https://tracker.ceph.com/issues/65752

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

@Matan-B Matan-B requested a review from a team as a code owner May 21, 2024 10:04
@Matan-B Matan-B force-pushed the wip-matanb-crimson-default-cpu-cores branch from e3d7b2f to 6faf209 Compare May 21, 2024 17:01
@Matan-B Matan-B changed the title crimson: Default (non-optimal) startup cpu_set [WIP] crimson: introduce crimson_seastar_reactor_num May 21, 2024
@Matan-B Matan-B added the DNM label May 21, 2024
default: 0
desc: number of CPU cores on which seastar reactor threads will run
flags:
- startup
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why crimson_seastar_reactor_num is needed.

Looks it is identical to configure crimson_seastar_cpu_cores as "0-crimson_seastar_reactor_num-1" ?

Which seems not useful when there are multiple OSDs because --cpuset always start with 0, causing overlap.

Copy link
Contributor Author

@Matan-B Matan-B May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it's practically the same. It's there only to make Crimson deployment friendlier by setting a global (non-optimal) OSD value of crimson_seastar_reactor_num = n.
Once crimson_seastar_cpu_cores and crimson_alien_thread_cpu_cores are set, this value will no longer have any impact.
For crimson_seastar_cpu_cores to be set properly, each OSD on each host must be configured separately which may be too specific for just getting the cluster up and running.

Another reason is that if crimson_seastar_cpu_cores is not set, we fallback to Seastar's default of using all available cores. That means that on nodes with 80 cores we will create 80 seastar threads.

By looking again, crimson_seastar_thread_num is probably a better terminology and we can also cap it to 32.

Copy link
Member

@cyx1231st cyx1231st May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I generally understand the purpose.

Which seems not useful when there are multiple OSDs because --cpuset always start with 0, causing overlap.

But I'm still wondering how to prevent "--cpuset" overlapping if there are multiple OSDs in the same host, only with crimson_seastar_thread_num or crimson_seastar_reactor_num configured.

I suppose the OSD still needs to know that it is the nth daemon in the same host to produce the accurate --cpuset for seastar. But if it requires yet another Ceph configuration (which still has to be configured separately), probably the current crimson_seastar_cpu_cores is simpler?

Or, does each OSD live in a separate container, so that overlapping is impossible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I'm still wondering how to prevent "--cpuset" overlapping if there are multiple OSDs in the same host, only with crimson_seastar_thread_num or crimson_seastar_reactor_num configured.

It's expected to overlap. As mentioned, it's not optimal (at all), but it's just there to let the cluster come up in a simple way. Actual usage will have to set crimson_seastar_cpu_cores and crimson_alien_thread_cpu_cores.
I'll emphasis that in the docs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cyx1231st, Please see updated commit

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-c [ --smp ] arg number of threads (default: one per CPU)

Looks it configures the seastar thread number, and thread-affinity=0 to unpin the CPUs.

Copy link
Contributor Author

@Matan-B Matan-B May 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks it configures the seastar thread number, and thread-affinity=0 to unpin the CPUs.

(From our offline conversation)

See reactor.cc:

    if (smp_opts.smp) {
        smp::count = smp_opts.smp.get_value();
    } else {
        smp::count = cpu_set.size();
    }
    std::vector<reactor*> reactors(smp::count);

All that --smp does is replace cpu_set.size() usage explicitly. Setting --smp = n or --cpuset = 0-n-1 will result in the same number of smp::count. The only difference would be which CPUs are used.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only difference would be which CPUs are used.

And also quote from @athanatos's comment above:

With crimson_seastar_reactor_num, we don't set a cpu_mask at all so the scheduler is free to place them whereever. In that case, they may overlap, but at least all of the cpus on the host will be used.

So my understanding from the above is that with crimson_seastar_reactor_num=n, we expect the n seastar reactor threads to be scheduled to all the available CPUs, rather than being limitted to --cpuset=0-(n-1), right?

Also, if we want to isolate the CPUs between seastar and alienstore threads in this case, for example, cpuset 0-(m-1) for seastar threads and cpuset m-(nproc-1) for alienstore thread, m should be another independent configuration, is that true?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it makes sense to allow the user to specify a number of alienstore cpu threads different from the number of seastar threads. To make things simpler, I'd be fine with requiring the user to specify both.

Copy link
Contributor Author

@Matan-B Matan-B May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Option 1 (Optimal)

  • CPUs are pinned to the provided sets (seastar and alien cpu sets)
  • thread-affinity = 1 (seastar's default).
  • smp (smp::count / reactor count): seastar_cpu_set.size()
  • Seastar reactor threads: seastar_cpu_set.size()
  • Alienstore threads: alien_cpu_set.size() * crimson_alien_op_num_threads_per_core (default = 4).

Option 2 (Basic):

  • No CPU pinning.
  • thread-affinity = 0
  • smp (smp::count / reactor count) = crimson_seastar_num_threads
  • Seastar reactor threads num: crimson_seastar_num_threads
  • Alienstore threads num: crimson_alien_op_num_threads

@Matan-B Matan-B force-pushed the wip-matanb-crimson-default-cpu-cores branch from 6faf209 to 9f6182b Compare May 22, 2024 11:02
@Matan-B Matan-B changed the title [WIP] crimson: introduce crimson_seastar_reactor_num Crimson: Support basic deployments May 22, 2024
@Matan-B Matan-B requested a review from cyx1231st May 22, 2024 11:03
@Matan-B Matan-B removed the DNM label May 22, 2024
@Matan-B Matan-B force-pushed the wip-matanb-crimson-default-cpu-cores branch from 9f6182b to a1519a2 Compare May 22, 2024 11:11
@Matan-B Matan-B force-pushed the wip-matanb-crimson-default-cpu-cores branch 4 times, most recently from 1a00aee to 335b0e4 Compare May 29, 2024 13:26
src/common/options/crimson.yaml.in Outdated Show resolved Hide resolved
}
std::string smp = fmt::format("{}", reactor_num);
ret.early_args.emplace_back("--smp");
ret.early_args.emplace_back(smp);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack.

@Matan-B Matan-B force-pushed the wip-matanb-crimson-default-cpu-cores branch from 335b0e4 to 539241f Compare May 30, 2024 09:16
@Matan-B Matan-B requested a review from cyx1231st May 30, 2024 09:17
@Matan-B Matan-B force-pushed the wip-matanb-crimson-default-cpu-cores branch from 539241f to dd1b201 Compare May 30, 2024 09:19
@Matan-B
Copy link
Contributor Author

Matan-B commented May 30, 2024

@cyx1231st, @athanatos - should be ready with all comments addressed. I'll follow up with doc update once approved/merged.

Copy link
Member

@cyx1231st cyx1231st left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except for minor comments.

crimson_alien_op_num_threads_per_core can be removed from the commit message.

src/common/options/crimson.yaml.in Outdated Show resolved Hide resolved
src/crimson/os/alienstore/alien_store.cc Outdated Show resolved Hide resolved
    This new option is there to allow *basic* cluster deployments.
    crimson_seastar_num_threads can be used globally for all the OSDs in the cluster:
    ```
    osd:
      crimson_seastar_num_threads: <n>
      crimson_alien_op_num_threads: <m>
    ```

    As a result, all the available CPUs will be allocated to *both* seastar reactor threads
    and to Alienstore threads - without any exclusion.

    Notes:

    * The core allocation will most likely overlap between OSDS and/or Seastar and Alienstore.
    An optiomal deployment requires `crimson_alien_thread_cpu_cores`
    and `crimson_seastar_cpu_cores` to be set for each OSD based on the host its located at.

    * Seastar reactor number (smp::count) will be deduced directly from crimson_seastar_num_threads

    ---

    Documentation followup: ceph#57352

---
    ### Option 1 (Optimal)
    * CPUs are pinned to the provided sets (seastar and alien cpu sets)
    * thread-affinity = 1 (seastar's default).
    * smp (smp::count / reactor count): `seastar_cpu_set.size()`
    * Seastar reactor threads: `seastar_cpu_set.size()`
    * Alienstore threads num: `crimson_alien_op_num_threads`
    ---
    ### Option 2 (Basic):
    * No CPU pinning.
    * thread-affinity = 0
    * smp (smp::count / reactor count) = `crimson_seastar_num_threads`
    * Seastar reactor threads num: `crimson_seastar_num_threads`
    * Alienstore threads num: `crimson_alien_op_num_threads`
---
    Fixes: https://tracker.ceph.com/issues/65752

    Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
@Matan-B Matan-B force-pushed the wip-matanb-crimson-default-cpu-cores branch from dd1b201 to 64a9915 Compare June 2, 2024 10:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants