
Tidying Tictac Ecosystem for non-users #1883

Open
martinsumner opened this issue Nov 17, 2023 · 2 comments

Comments

@martinsumner
Contributor

An ecosystem of changes exists in riak_kv around TictacAAE:

  • the AAE process itself;
  • nextgenrepl - ttaae full-sync;
  • nextgenrepl - real-time repl;
  • reaper;
  • eraser;
  • reader;
  • aae fold infrastructure.

What about Riak users who don't require any of these features? Some of these features create log noise (#1875). A process in one of these features that crashes with sufficient intensity can also bring down riak_kv. This crashing issue has been seen when one of these processes was disabled through an environment variable change, using the term disabled (as in riak.conf) rather than the boolean expected by the code.

A request has been made by @nsaadouni to look at bundling this ecosystem into a separate dependency. This might not be straightforward, especially with respect to aae_fold support.
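To illustrate the disabled-versus-boolean mismatch described above, here is a minimal sketch of a tolerant flag check. The module and function names are hypothetical and are not part of riak_kv; the sketch only assumes that a setting may arrive either as the atoms enabled/disabled or as a boolean.

```erlang
-module(ttaae_flag_sketch).
-export([is_enabled/1]).

%% Hypothetical helper, not part of riak_kv: accept both the
%% riak.conf-style atoms and plain booleans, so that either form
%% of the setting is handled without a crash.
-spec is_enabled(term()) -> boolean().
is_enabled(true) -> true;
is_enabled(enabled) -> true;
is_enabled(false) -> false;
is_enabled(disabled) -> false;
is_enabled(Other) ->
    %% Unknown value: log and treat as disabled rather than
    %% letting the calling process crash on a bad match.
    logger:warning("Unexpected flag value ~p, treating as disabled",
                   [Other]),
    false.
```

Whether an unknown value should fall back to disabled or to enabled is a design choice; the fallback here is only an assumption for the sketch.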

As a minimum though, what is proposed is:

  • Have a dedicated supervisor for the Tictac ecosystem, and start all of these processes via that supervisor, not directly via the main riak_kv supervision tree;
  • Have a configuration option (enabled by default) which allows the Tictac ecosystem to be disabled (i.e. riak_kv will not start the ecosystem supervisor if the option has been set to disabled);
  • Strip, where possible, references to the Tictac ecosystem from riak_kv_vnode; in particular, manage exchange/rebuild pokes in a ttaae manager, not within the vnode itself.
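The first two points above could be sketched roughly as follows. This is an illustration under stated assumptions, not the riak_kv implementation: the module name, the tictac_ecosystem environment key, and the restart values are all hypothetical, and the child list is left empty.

```erlang
-module(ttaae_ecosystem_sup_sketch).
-behaviour(supervisor).
-export([start_link/0, init/1, child_specs/0]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    %% Placeholder restart limits: the ecosystem gets its own
    %% intensity budget, separate from the main riak_kv supervisor.
    SupFlags = #{strategy => one_for_one, intensity => 10, period => 10},
    %% The AAE, nextgenrepl, reaper, eraser and reader children
    %% would be listed here; omitted in this sketch.
    {ok, {SupFlags, []}}.

%% Intended to be called from the parent supervisor's init/1.
%% Returns [] when the (hypothetical) option is disabled, so the
%% ecosystem supervisor is never started.
child_specs() ->
    case application:get_env(riak_kv, tictac_ecosystem, enabled) of
        Off when Off =:= disabled; Off =:= false ->
            [];
        _ ->
            [#{id => ?MODULE,
               start => {?MODULE, start_link, []},
               restart => permanent,
               type => supervisor}]
    end.
```

The restart type and intensity chosen for the ecosystem supervisor determine how far a failure inside it can escalate to the parent, so those values would need to be decided with care.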

A user with a pure-Basho feature set can then be assured that their operation is not impacted by the running of any processes from the nextgen ecosystem.

Another potential change might be to switch real-time repl to a more standard mechanism for hooking into the PUT process. The coordinator hook was made to reduce real-time latency, but it can be too fast when not correctly configured. This change would clean up the riak_kv_vnode code for non-nextgenrepl users.

@fadushin
Contributor

👍🏽

Re: "real-time repl", do you mean "next gen real-time repl"? (Considering that "legacy" realtime replication is already hooked into the Put FSM.)

@martinsumner
Contributor Author

Yes, that's correct. Currently nextgen real-time repl is hooked into the vnode coordinating the PUT (not as a post-commit hook), with the aim of reducing replication latency.
