
Reduce getNodeByQuery overhead #13221

Open
wants to merge 13 commits into base: unstable
Conversation

@fcostaoliveira fcostaoliveira commented Apr 17, 2024

This PR makes the following changes based on CPU profile info. The getNodeByQuery function accounts for 8.2% of the 12.3% overhead observed when comparing a single-shard cluster with standalone.
Proposed changes:

  • Inline keyHashSlot to reduce the overhead of that function call
  • Reduce duplicate calls to getCommandFlags within getNodeByQuery
  • Move crc16 to a header file and inline it to reduce the overhead of that function call

The above changes yield an improvement of approximately 5% in achievable ops/sec, as shown below.

results

steps to reproduce

2 Nodes.
DB node (swap IP):

taskset -c 0 ./src/redis-server --save '' --cluster-announce-ip 192.168.1.200 --bind 192.168.1.200 --protected-mode no --requirepass perf --daemonize yes --cluster-enabled yes --logfile redis.log
redis-cli -h 192.168.1.200 -a perf flushall
redis-cli -h 192.168.1.200 -a perf cluster flushslots
redis-cli -h 192.168.1.200 -a perf cluster addslotsrange 0 16383 
redis-cli -h 192.168.1.200 -a perf cluster info

Client node:

taskset -c 0,1 memtier_benchmark --server 192.168.1.200 --port 6379 --authenticate perf --cluster-mode --pipeline 10 --data-size 100 --ratio 1:0 --key-pattern P:P --key-minimum=1 --key-maximum 1000000 --test-time 180 -c 25 -t 2 --hide-histogram

results unstable (804110a)

root@micro1:~# taskset -c 0,1 memtier_benchmark --server 192.168.1.200 --port 6379 --authenticate perf --cluster-mode --pipeline 10 --data-size 100 --ratio 1:0 --key-pattern P:P --key-minimum=1 --key-maximum 1000000 --test-time 180 -c 25 -t 2 --hide-histogram
Writing results to stdout
[RUN #1] Preparing benchmark client...
[RUN #1] Launching threads now...
[RUN #1 100%, 180 secs]  0 threads:   107876070 ops,  599582 (avg:  599300) ops/sec, 83.99MB/sec (avg: 83.95MB/sec),  0.83 (avg:  0.83) msec latency

2         Threads
25        Connections per thread
180       Seconds


ALL STATS
======================================================================================================================================================
Type         Ops/sec     Hits/sec   Misses/sec    MOVED/sec      ASK/sec    Avg. Latency     p50 Latency     p99 Latency   p99.9 Latency       KB/sec 
------------------------------------------------------------------------------------------------------------------------------------------------------
Sets       599300.19          ---          ---         0.00         0.00         0.83133         0.82300         1.25500         3.07100     85967.12 
Gets            0.00         0.00         0.00         0.00         0.00             ---             ---             ---             ---         0.00 
Waits           0.00          ---          ---          ---          ---             ---             ---             ---             ---          --- 
Totals     599300.19         0.00         0.00         0.00         0.00         0.83133         0.82300         1.25500         3.07100     85967.12 

results this PR (7e1d2ea)

root@micro1:~# taskset -c 0,1 memtier_benchmark --server 192.168.1.200 --port 6379 --authenticate perf --cluster-mode --pipeline 10 --data-size 100 --ratio 1:0 --key-pattern P:P --key-minimum=1 --key-maximum 1000000 --test-time 180 -c 25 -t 2 --hide-histogram
Writing results to stdout
[RUN #1] Preparing benchmark client...
[RUN #1] Launching threads now...
[RUN #1 100%, 180 secs]  0 threads:   112251930 ops,  618650 (avg:  623611) ops/sec, 86.66MB/sec (avg: 87.36MB/sec),  0.80 (avg:  0.80) msec latency

2         Threads
25        Connections per thread
180       Seconds


ALL STATS
======================================================================================================================================================
Type         Ops/sec     Hits/sec   Misses/sec    MOVED/sec      ASK/sec    Avg. Latency     p50 Latency     p99 Latency   p99.9 Latency       KB/sec 
------------------------------------------------------------------------------------------------------------------------------------------------------
Sets       623611.19          ---          ---         0.00         0.00         0.79897         0.79100         1.20700         2.76700     89454.70 
Gets            0.00         0.00         0.00         0.00         0.00             ---             ---             ---             ---         0.00 
Waits           0.00          ---          ---          ---          ---             ---             ---             ---             ---          --- 
Totals     623611.19         0.00         0.00         0.00         0.00         0.79897         0.79100         1.20700         2.76700     89454.70 

@fcostaoliveira changed the title from "Improve.get node by query" to "Reduce getNodeByQuery overhead" on Apr 17, 2024
```c
for (e = s+1; e < keylen; e++)
    if (key[e] == '}') break;

/* No '}' or nothing between {} ? Hash the whole key. */
```
Contributor
While we're at it, do we want to put an "unlikely" here? I am far from an expert, just asking.

Author

Will test the impact on the results and reply back.

sundb and others added 8 commits May 28, 2024 11:10
…after defragment (redis#13231)

Introduced by redis#13013

After defragmenting the dictionary in the kvstore, if the dict is
reallocated, the value of its node in the kvstore rehashing list must be
updated.
By my mistake, in the last revert commit in redis#13231 I intended to revert
the last fix but reverted the penultimate one instead.
Now that we have fixed another potential memory read issue in [`743f1dd`
(redis#13231)](redis@743f1dd),
reverting this just avoids confusion; I will verify in the future
whether it has any impact, and if so we will backport this PR.

Failed CI: https://github.com/sundb/redis/actions/runs/8826731960
Because it does not cause any propagation (arguably it should; see the
comment in the tcl file).

The motivation for this fix is that in 6.2, if dirty changed without
propagation inside MULTI/EXEC, it would cause propagation of EXEC only,
which would result in the replica sending errors to its master.
Added reverse history search to redis-cli. Use it with the following:

* CTRL+R : enable search backward mode, and search the next match when
pressing CTRL+R again, until reaching index 0.
```
127.0.0.1:6379> keys one
127.0.0.1:6379> keys two
(reverse-i-search):                   # press CTRL+R
(reverse-i-search): keys two          # input `keys`
(reverse-i-search): keys one          # press CTRL+R again
(reverse-i-search): keys one          # press CTRL+R again, still `keys one` due to reaching index 0
(i-search): keys two                  # press CTRL+S, enable search forward
(i-search): keys two                  # press CTRL+S, still `keys two` due to reaching index 1
```

* CTRL+S : enable search forward mode, and search the next match when pressing
CTRL+S again, until reaching the last index.
```
127.0.0.1:6379> keys one
127.0.0.1:6379> keys two
(i-search):                       # press CTRL+S
(i-search): keys one              # input `keys`
(i-search): keys two              # press CTRL+S again
(i-search): keys two              # press CTRL+S again, still `keys two` due to reaching index 1
(reverse-i-search): keys one      # press CTRL+R, enable search backward
(reverse-i-search): keys one      # press CTRL+S, still `keys one` due to reaching index 1
```

* CTRL+G : disable
```
127.0.0.1:6379> keys one
127.0.0.1:6379> keys two
(reverse-i-search):                   # press CTRL+R
(reverse-i-search): keys two          # input `keys`
127.0.0.1:6379>                       # press CTRL+G
```

* CTRL+C : disable
```
127.0.0.1:6379> keys one
127.0.0.1:6379> keys two
(reverse-i-search):                   # press CTRL+R
(reverse-i-search): keys two          # input `keys`
127.0.0.1:6379>                       # press CTRL+C
```

* TAB : use the current search result and exit search mode
```
127.0.0.1:6379> keys one
127.0.0.1:6379> keys two
(reverse-i-search):                # press CTRL+R
(reverse-i-search): keys two       # input `keys`
127.0.0.1:6379> keys two           # press TAB
```

* ENTER : use the current search result and execute the command
```
127.0.0.1:6379> keys one
127.0.0.1:6379> keys two
(reverse-i-search):                 # press CTRL+R
(reverse-i-search): keys two        # input `keys`
127.0.0.1:6379> keys two            # press ENTER
(empty array)
127.0.0.1:6379>
```

* any arrow key will disable reverse search

Your result will have the search match bolded; you can press ENTER to
execute the full result.

Note: I have _only added this for multi-line mode_, as it seems to be
forced that way when `repl` is called.

Closes: redis#8277

---------

Co-authored-by: Clayton Northey <clayton@knowbl.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Co-authored-by: debing.sun <debing.sun@redis.com>
Co-authored-by: Bjorn Svensson <bjorn.a.svensson@est.tech>
Co-authored-by: Viktor Söderqvist <viktor@zuiderkwast.se>
…SP3 in MULTI (redis#13255)

This test was introduced by redis#13251.
Normally we auto-transform the reply format of XREADGROUP to an array under
RESP3 (see trasformer_funcs).
But when we execute the XREADGROUP command inside MULTI this doesn't work,
which causes the new test to fail.
The solution is to verify the reply of XREADGROUP in advance rather than
inside MULTI.

Failed validate schema CI:
https://github.com/redis/redis/actions/runs/9025128323/job/24800285684

---------

Co-authored-by: guybe7 <guy.benoish@redislabs.com>
…an bug (redis#13150)" (redis#13266)

The kernel config `vm.mmap_rnd_bits` has been reverted in
actions/runner-images#9491, so we can revert
the changes from redis#13150.

CI only with ASAN passed:
https://github.com/sundb/redis/actions/runs/9058263634
…ommand (redis#13276)

This PR is based on the commits from PR redis#12944.

Allow SPUBLISH command within multi/exec on replica

Behavior on unstable:

```
127.0.0.1:6380> CLUSTER NODES
39ce8aa20f1f0d91f1a88d976ee1926dfefcdf1a 127.0.0.1:6380@16380 myself,slave 8b0feb120b68aac489d6a5af9c77dc40d71bc792 0 0 0 connected
8b0feb120b68aac489d6a5af9c77dc40d71bc792 127.0.0.1:6379@16379 master - 0 1705091681202 0 connected 0-16383
127.0.0.1:6380> SPUBLISH hello world
(integer) 0
127.0.0.1:6380> MULTI
OK
127.0.0.1:6380(TX)> SPUBLISH hello world
QUEUED
127.0.0.1:6380(TX)> EXEC
(error) MOVED 866 127.0.0.1:6379
```

With this change:

```
127.0.0.1:6380> SPUBLISH hello world
(integer) 0
127.0.0.1:6380> MULTI
OK
127.0.0.1:6380(TX)> SPUBLISH hello world
QUEUED
127.0.0.1:6380(TX)> EXEC
1) (integer) 0
```

---------

Co-authored-by: Harkrishn Patro <harkrisp@amazon.com>
Co-authored-by: oranagra <oran@redislabs.com>
`reclaimFilePageCache` did not set `errno`, but `rdbSaveInternal`, which
logs the error, assumed it did. This change makes sure `errno` is set.

Fixes redis#13245

Signed-off-by: Ted Lyngmo <ted@lyncon.se>