Add spatial index for objects #14631

Open
wants to merge 16 commits into
base: master

Conversation

appgurueu
Contributor

@appgurueu appgurueu commented May 9, 2024

Optimizes range queries for objects in order to resolve #14613.

The spatial index is a dynamic forest of static k-d-trees, as outlined in my comment.
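For readers unfamiliar with the idea, here is a minimal Python sketch of one common way to build a dynamic forest of static k-d-trees, the so-called logarithmic method: every tree is immutable once built, and an insert merges equally sized trees like a carry in a binary counter. This illustrates the general technique only. Whether the C++ code in this PR works exactly like this is an assumption, all names (`KdForest`, `build`, `query`) are hypothetical, and deletion and shrinking are left out.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]
Entry = Tuple[Point, int]  # (position, object id)


class Node:
    """Node of a static k-d tree (never modified after construction)."""

    def __init__(self, entry: Entry, axis: int,
                 left: "Optional[Node]", right: "Optional[Node]") -> None:
        self.entry, self.axis, self.left, self.right = entry, axis, left, right


def build(entries: List[Entry], depth: int = 0) -> Optional[Node]:
    """Build a balanced tree by splitting at the median, cycling x, y, z."""
    if not entries:
        return None
    axis = depth % 3
    entries = sorted(entries, key=lambda e: e[0][axis])
    mid = len(entries) // 2
    return Node(entries[mid], axis,
                build(entries[:mid], depth + 1),
                build(entries[mid + 1:], depth + 1))


def query(node: Optional[Node], lo: Point, hi: Point, out: List[int]) -> None:
    """Append the ids of all entries inside the box [lo, hi] to out."""
    if node is None:
        return
    pos, obj_id = node.entry
    if all(lo[i] <= pos[i] <= hi[i] for i in range(3)):
        out.append(obj_id)
    # Descend only into the sides the query box can overlap.
    if lo[node.axis] <= pos[node.axis]:
        query(node.left, lo, hi, out)
    if hi[node.axis] >= pos[node.axis]:
        query(node.right, lo, hi, out)


class KdForest:
    """Slot i is either empty or holds a static tree with 2**i entries."""

    def __init__(self) -> None:
        self.slots: List[Optional[Tuple[List[Entry], Node]]] = []

    def insert(self, pos: Point, obj_id: int) -> None:
        carry: List[Entry] = [(pos, obj_id)]
        i = 0
        # Merge with every occupied slot on the way up (binary-counter carry).
        while i < len(self.slots) and self.slots[i] is not None:
            carry += self.slots[i][0]
            self.slots[i] = None
            i += 1
        if i == len(self.slots):
            self.slots.append(None)
        self.slots[i] = (carry, build(carry))

    def in_area(self, lo: Point, hi: Point) -> List[int]:
        out: List[int] = []
        for slot in self.slots:
            if slot is not None:
                query(slot[1], lo, hi, out)
        return out
```

With n objects there are at most about log2(n) trees, so a range query searches a logarithmic number of balanced trees, and each object is rebuilt into a larger tree only a logarithmic number of times.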

To do

This PR is WIP; the TODOs are in-source and might not be exhaustive.

How to test

There is a randomized unit test which tests this against a naive implementation. I also recommend playing e.g. Shadow Forest or other games with entities to give this some "field testing".
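The general shape of such a randomized test against a naive implementation can be sketched as follows. This is not the unit test from the PR: `SortedXIndex` is a hypothetical stand-in for the index under test, and the operation mix, coordinate ranges and seed are arbitrary assumptions.

```python
import bisect
import random


def naive_in_area(objects, lo, hi):
    """Reference implementation: linear scan over all objects."""
    return sorted(oid for oid, p in objects.items()
                  if all(lo[i] <= p[i] <= hi[i] for i in range(3)))


class SortedXIndex:
    """Hypothetical stand-in for the index under test (sorted by x)."""

    def __init__(self):
        self.entries = []  # sorted list of (x, y, z, id)

    def insert(self, oid, pos):
        bisect.insort(self.entries, (*pos, oid))

    def remove(self, oid, pos):
        self.entries.remove((*pos, oid))

    def in_area(self, lo, hi):
        out = []
        start = bisect.bisect_left(self.entries, (lo[0],))
        for x, y, z, oid in self.entries[start:]:
            if x > hi[0]:
                break
            if lo[1] <= y <= hi[1] and lo[2] <= z <= hi[2]:
                out.append(oid)
        return sorted(out)


def run_randomized_test(index, rounds=2000, seed=0):
    """Apply random inserts, removals, moves and queries; compare results."""
    rng = random.Random(seed)
    objects = {}  # reference model: id -> position
    next_id = 0
    queries = 0
    for _ in range(rounds):
        op = rng.random()
        if op < 0.4 or not objects:
            pos = tuple(rng.randint(-20, 20) for _ in range(3))
            objects[next_id] = pos
            index.insert(next_id, pos)
            next_id += 1
        elif op < 0.55:
            oid = rng.choice(sorted(objects))
            index.remove(oid, objects.pop(oid))
        elif op < 0.7:
            # A move is modelled as remove + insert at the new position.
            oid = rng.choice(sorted(objects))
            new_pos = tuple(rng.randint(-20, 20) for _ in range(3))
            index.remove(oid, objects[oid])
            index.insert(oid, new_pos)
            objects[oid] = new_pos
        else:
            a = [rng.randint(-25, 25) for _ in range(3)]
            b = [rng.randint(-25, 25) for _ in range(3)]
            lo = tuple(min(p, q) for p, q in zip(a, b))
            hi = tuple(max(p, q) for p, q in zip(a, b))
            assert index.in_area(lo, hi) == naive_in_area(objects, lo, hi)
            queries += 1
    return queries
```

Integer coordinates in a small range are used on purpose so that duplicate positions and objects lying exactly on the query boundary occur frequently.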

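The benchmarks added by sfan show the following range query speedups on my setup (ratio of old to new mean time for the in_area benchmarks; the inside_radius benchmarks improve by ca. 3.7x, 11x and 69x respectively):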

  • 200 objects: ca. 6x
  • 1450 objects: ca. 22x
  • 10000 objects: ca. 153x (this demonstrates that the new index indeed scales way better)
Data
old
benchmark name                                                                                                       samples       iterations    estimated
                                                                                                                     mean          low mean      high mean
                                                                                                                     std dev       low std dev   high std dev
---------------------------------------------------------------------------------------------------------------------------------------------------------------
inside_radius_200                                                                                                              100          1261    139.593 ms 
                                                                                                                         1.1102 us    1.10106 us    1.12325 us 
                                                                                                                        55.2183 ns    41.7996 ns    73.7618 ns 
                                                                                                                                                               
inside_radius_1450                                                                                                             100           142    140.367 ms 
                                                                                                                        10.0835 us     9.9536 us    10.2648 us 
                                                                                                                        774.521 ns    597.986 ns    1.04772 us 
                                                                                                                                                               
inside_radius_10000                                                                                                            100            11    151.907 ms 
                                                                                                                        174.836 us    168.695 us    181.905 us 
                                                                                                                        33.4375 us    29.0096 us    38.6354 us 
                                                                                                                                                               
in_area_200                                                                                                                    100           850     139.57 ms 
                                                                                                                         1.5991 us    1.59156 us    1.60799 us 
                                                                                                                        41.7553 ns    35.4184 ns    50.8028 ns 
                                                                                                                                                               
in_area_1450                                                                                                                   100            97    140.446 ms 
                                                                                                                        14.5087 us    14.3894 us     14.627 us 
                                                                                                                        606.878 ns    520.745 ns     712.16 ns 
                                                                                                                                                               
in_area_10000                                                                                                                  100             9    144.056 ms 
                                                                                                                        216.703 us     211.43 us    224.314 us 
                                                                                                                        31.9183 us    23.6851 us    53.2383 us 

new
benchmark name                                                                                                       samples       iterations    estimated
                                                                                                                     mean          low mean      high mean
                                                                                                                     std dev       low std dev   high std dev
---------------------------------------------------------------------------------------------------------------------------------------------------------------
inside_radius_200                                                                                                              100          6393    184.118 ms 
                                                                                                                        297.734 ns    295.987 ns    299.511 ns 
                                                                                                                        9.03226 ns    7.87456 ns    10.4505 ns 
                                                                                                                                                               
inside_radius_1450                                                                                                             100          2011     184.61 ms 
                                                                                                                        930.247 ns    925.143 ns    940.841 ns 
                                                                                                                        35.8154 ns    20.3918 ns    69.6263 ns 
                                                                                                                                                               
inside_radius_10000                                                                                                            100           803     184.61 ms 
                                                                                                                        2.53827 us     2.4752 us    2.69643 us 
                                                                                                                        476.573 ns      198.8 ns    917.299 ns 
                                                                                                                                                               
in_area_200                                                                                                                    100          7484    184.106 ms 
                                                                                                                        246.216 ns    244.321 ns    247.968 ns 
                                                                                                                        9.27178 ns    7.80405 ns    11.5285 ns 
                                                                                                                                                               
in_area_1450                                                                                                                   100          2907    184.595 ms 
                                                                                                                        647.267 ns    644.314 ns     649.87 ns 
                                                                                                                         14.137 ns    11.7168 ns    17.8412 ns 
                                                                                                                                                               
in_area_10000                                                                                                                  100          1403    184.635 ms 
                                                                                                                        1.41243 us    1.39912 us    1.44893 us 
                                                                                                                        103.922 ns    46.1786 ns    221.902 ns 
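The speedup figures can be re-derived from the mean times in the two tables. The snippet below is only a convenience sketch; the values are copied from the output above and converted to nanoseconds.

```python
# Mean time per benchmark in nanoseconds, copied from the "old" and "new" tables.
OLD_NS = {
    "inside_radius_200": 1110.2,
    "inside_radius_1450": 10083.5,
    "inside_radius_10000": 174836.0,
    "in_area_200": 1599.1,
    "in_area_1450": 14508.7,
    "in_area_10000": 216703.0,
}
NEW_NS = {
    "inside_radius_200": 297.734,
    "inside_radius_1450": 930.247,
    "inside_radius_10000": 2538.27,
    "in_area_200": 246.216,
    "in_area_1450": 647.267,
    "in_area_10000": 1412.43,
}

# Speedup = old mean / new mean.
speedups = {name: OLD_NS[name] / NEW_NS[name] for name in OLD_NS}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")
```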

@appgurueu appgurueu added WIP The PR is still being worked on by its author and not ready yet. Performance labels May 9, 2024
@sfan5
Member

sfan5 commented May 9, 2024

I think this will need comprehensive tests for how the spatial index and object lifecycle interact. (basically the check list I had in that issue)

@lhofhansl
Contributor

lhofhansl commented May 14, 2024

How does this relate to #14643? Do they solve the same problem? Are they complementary?

@appgurueu
Contributor Author

How does this relate to #14643? Do they solve the same problem? Are they complementary?

They solve the same problem in different ways. I am not yet sure which way is better. Some thoughts:

  • Performance on the relatively evenly distributed entities in sfan's benchmark is comparable. Both solutions seem capable of producing speedups large enough to basically solve the problem for now.
  • exe_virus' solution potentially performs poorly in two theoretical cases:
    • Very sparse distributions of entities, large area queries: You would need to check many (empty) buckets for such a query; in this case, performance would be even worse than the current linear search. This is probably relatively irrelevant in practice however due to being dwarfed by the associated costs related to mapblock management. For big query areas, the linear search could also be used as a fallback.
    • Very dense distributions, small area queries: In this case there would be too many entities per bucket, so we would be back to near-linear performance in "miniature" settings (think something like rubenwardy's conquer or a chess game where all pieces are entities). Again I don't know how relevant this is in practice, though I could imagine "swarms" or "clumps" of objects to appear, but I don't know whether (1) these are large enough (probably not); (2) large queries are done anyways, meaning there is hardly a way to optimize these.
    • TL;DR: exe_virus' solution is best suited to a distribution of entities with a low, roughly constant number of entities per bucket. This seems to be what real-world distributions are likely to look like.
  • This data structure can provide better worst-case runtime guarantees for range queries, but they aren't great either, due to "the curse of dimensionality". They aren't of much relevance anyways, since there is a large difference between the worst and average case here.
  • exe_virus' data structure is much simpler. It's basically just a multimap from bucket pos to entity. In that sense, it's similar to what we already have for mapblocks. The data structure here is a dynamic forest of static k-d-trees, which are merged; the forest is also sometimes shrunk. In terms of code size, exe_virus' spatial map (impl. & header) is roughly 150 LOC, whereas my (header-only) implementation is around 450 LOC. (I am reasonably confident in the correctness of my data structure nevertheless.)
  • I expect updates to be faster in exe_virus' solution (but I don't think that this will be very relevant in the bigger picture, since I expect there to only be linearly many updates per step, so an amortized logarithmic-ish runtime with good constant factors should be fine here).
  • There is currently some remaining potential for optimization in both solutions.
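To make the comparison concrete, a "multimap from bucket pos to entity" can be sketched in a few lines of Python. This is not exe_virus' code: the class and function names are hypothetical, and the bucket size of 16 (one bucket per mapblock) is an assumption.

```python
from collections import defaultdict
from itertools import product

BUCKET_SIZE = 16  # assumption: one bucket per 16x16x16 mapblock


def bucket_of(pos):
    """Bucket coordinates of a position (floor division, works for negatives)."""
    return tuple(int(c // BUCKET_SIZE) for c in pos)


class BucketMap:
    def __init__(self):
        self.buckets = defaultdict(dict)  # bucket pos -> {object id: position}

    def insert(self, oid, pos):
        self.buckets[bucket_of(pos)][oid] = pos

    def remove(self, oid, pos):
        key = bucket_of(pos)
        del self.buckets[key][oid]
        if not self.buckets[key]:
            del self.buckets[key]  # do not keep empty buckets around

    def in_area(self, lo, hi):
        blo, bhi = bucket_of(lo), bucket_of(hi)
        out = []
        # Visit every bucket the query box overlaps, empty or not.
        for key in product(*(range(a, b + 1) for a, b in zip(blo, bhi))):
            for oid, pos in self.buckets.get(key, {}).items():
                if all(lo[i] <= pos[i] <= hi[i] for i in range(3)):
                    out.append(oid)
        return out
```

Both theoretical weak spots are visible in `in_area`: the outer loop grows with the volume of the query box even if every bucket is empty, and the inner loop is a plain linear scan when many objects share one bucket.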

@ExeVirus
Contributor

appgurueu is correct on many fronts here.

To clarify on the two drawbacks mentioned, I have two pending solutions for the subjects:

Very sparse distributions of entities, large area queries

I have implemented exactly the recommended solution here: check the number of buckets we are about to iterate over and compare it against the total number of entities. If there are more mapblocks than entities, I fall back to linear iteration over all entities for now, until the algorithm is more optimized.
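A minimal sketch of that check, assuming 16-node buckets and hypothetical function names (the actual implementation lives in the other PR and may differ):

```python
BUCKET_SIZE = 16  # assumption: one bucket per 16x16x16 mapblock


def buckets_spanned(lo, hi):
    """Number of buckets overlapped by the axis-aligned box [lo, hi]."""
    count = 1
    for a, b in zip(lo, hi):
        count *= int(b // BUCKET_SIZE) - int(a // BUCKET_SIZE) + 1
    return count


def choose_strategy(lo, hi, total_objects):
    """Scan all objects linearly if that is cheaper than visiting the buckets."""
    if buckets_spanned(lo, hi) > total_objects:
        return "linear"
    return "buckets"
```

For example, a 320-node cube spans 20 x 20 x 20 = 8000 buckets, so with only 200 objects on the server the linear scan wins.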

Very dense distributions, small area queries

Except where all entities in the entire server are in a single mapblock, this solution will still help, just not tremendously. In this case, only one or two mapblocks will need to be checked (worst case 8 on a corner), ignoring all the other entities on the server.

I have been debating allowing the data structure to use a non-hardcoded bucket size, so that buckets can be of size 16x16x16, 1x1x1, 64x64x64, etc. Then we could expose it as a setting for servers/games to tweak for their needs.
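The trade-off behind such a setting can be illustrated with a small cost model: a smaller bucket means fewer objects examined per bucket but more buckets visited per query. The sketch below is purely illustrative; `query_cost` and the chess-board example are hypothetical and not part of either PR.

```python
from collections import Counter
from itertools import product


def query_cost(positions, lo, hi, bucket_size):
    """Return (buckets visited, objects examined) for one box query."""
    occupancy = Counter(
        tuple(int(c // bucket_size) for c in p) for p in positions
    )
    ranges = [range(int(a // bucket_size), int(b // bucket_size) + 1)
              for a, b in zip(lo, hi)]
    visited = examined = 0
    for key in product(*ranges):
        visited += 1
        examined += occupancy.get(key, 0)
    return visited, examined


# Dense case: 64 chess pieces, one entity per node on an 8x8 board.
board = [(x, 0, z) for x in range(8) for z in range(8)]

# Query for a single square.
small_16 = query_cost(board, (3, 0, 3), (3, 0, 3), bucket_size=16)
small_1 = query_cost(board, (3, 0, 3), (3, 0, 3), bucket_size=1)

# Query for a 32-node cube.
large_16 = query_cost(board, (0, 0, 0), (31, 31, 31), bucket_size=16)
large_1 = query_cost(board, (0, 0, 0), (31, 31, 31), bucket_size=1)
```

In this model, the single-square query examines all 64 pieces with 16-node buckets but only one piece with 1-node buckets, while the 32-node cube visits 8 buckets at size 16 and 32768 buckets at size 1.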

There is currently some remaining potential for optimization in both solutions.

And yes, both algorithms can be improved. I'm looking to give mine another 2 or 3x improvement for large range queries by using the regularity of the boxes to check for whole rows/columns of mapblocks, rather than each individual mapblock or entity inside a given mapblock. This will only apply for range queries that are at least 2 mapblocks in dimension or larger, however. I'm looking to improve this specifically to help with connected client object updates, which we currently rely on large getObjectsInRadius queries for.

My Recommendation:

Once we are done with optimizations over the next days/weeks, we can do side-by-side comparisons and let the numbers decide for us. If they're comparable, Core Devs can make that call. These structures are certainly mutually exclusive, i.e. we should choose one, not both.

Development

Successfully merging this pull request may close these issues.

The data structure problem with active objects
4 participants