Add crude rate limiter #3283

Draft · wants to merge 3 commits into base: main
Conversation

dureuill (Contributor)

Pull Request

Related issue

Related to #3126 (rate limiting)

What does this PR do?

  • Add 3 rate limiters to the search route:
    • Global rate limiter that limits the total number of search requests regardless of their origin
    • IP rate limiter that limits the number of search requests per IP address of origin
    • API key rate limiter that limits the number of search requests per API key used in the request
  • Add a new set of --rate-limiting-* options to configure the rate limiters (enabled? pool size? cooldown duration?)
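The pool/cooldown semantics above can be sketched as a simple token bucket: a pool of N requests, replenished by one request per cooldown period. This is a minimal, self-contained illustration of the intended behavior, not the actix-governor implementation; `CrudeRateLimiter` is a hypothetical name:

```rust
use std::time::{Duration, Instant};

/// A minimal token-bucket limiter: `capacity` is the burst size and
/// `cooldown` is the time needed to replenish one request.
struct CrudeRateLimiter {
    capacity: u64,
    cooldown: Duration,
    available: f64,
    last_check: Instant,
}

impl CrudeRateLimiter {
    fn new(pool: u64, cooldown: Duration) -> Self {
        Self { capacity: pool, cooldown, available: pool as f64, last_check: Instant::now() }
    }

    /// Returns true if the request is allowed, consuming one token.
    fn check(&mut self) -> bool {
        let elapsed = self.last_check.elapsed();
        self.last_check = Instant::now();
        // Replenish one token per elapsed `cooldown`, capped at the pool size.
        self.available = (self.available + elapsed.as_secs_f64() / self.cooldown.as_secs_f64())
            .min(self.capacity as f64);
        if self.available >= 1.0 {
            self.available -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut limiter = CrudeRateLimiter::new(2, Duration::from_millis(50));
    assert!(limiter.check()); // first request: allowed
    assert!(limiter.check()); // second request: allowed (pool of 2)
    assert!(!limiter.check()); // third immediate request: rejected
}
```

In practice one limiter instance would be kept globally, one per IP, and one per API key, matching the three limiters above.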

Notes on the implementation

  • Actix Governor does not behave as expected: for example with cargo run --release -- --rate-limiting-api-key-pool 1 --rate-limiting-api-key-cooldown-ns 50000000000 I would expect that any request past the first one would be blocked for 50s, but this is not what happens: even at startup I'm allowed to send up to 8 requests, and then I have to wait for about 45s... Not sure where the fault lies here.
  • For API keys, all requests without an API key are put in the same rate-limiting bucket. It would be better to disable the rate limiter when there is no API key, but actix-governor does not give this level of control. We could maybe imply --rate-limiting-disable-api-key when no Master Key was provided 🤔.
  • The cooldown is in ns, which is not very practical but might be necessary to allow maximum control. The alternative is somehow parsing a "human-readable" duration, to allow for a --rate-limiting-cooldown 4s, which the standard library doesn't provide.
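Since the std has no duration parser, a hand-rolled one is not much code for the simple single-unit case. A sketch (crates like humantime would also cover compound forms such as 5h30m20s):

```rust
use std::time::Duration;

/// Parse a single-unit duration such as "4s", "50ms", or "50000000000ns".
/// A hand-rolled sketch, not the PR's code.
fn parse_duration(s: &str) -> Option<Duration> {
    // Split at the first non-digit character: value on the left, unit on the right.
    let split = s.find(|c: char| !c.is_ascii_digit())?;
    let (value, unit) = s.split_at(split);
    let value: u64 = value.parse().ok()?;
    match unit {
        "ns" => Some(Duration::from_nanos(value)),
        "ms" => Some(Duration::from_millis(value)),
        "s" => Some(Duration::from_secs(value)),
        "m" => Some(Duration::from_secs(value * 60)),
        _ => None,
    }
}

fn main() {
    assert_eq!(parse_duration("4s"), Some(Duration::from_secs(4)));
    assert_eq!(parse_duration("50ms"), Some(Duration::from_millis(50)));
    assert_eq!(parse_duration("oops"), None);
}
```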

PR checklist

Please check if your PR fulfills the following requirements:

  • Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
  • Have you read the contributing guidelines?
  • Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!

@irevoire (Member) commented Jan 2, 2023

API key rate limiter that limits the number of search requests per API key used in the request

I'm not sure I see the point of this limitation?
In which case should we use it?


  1. Global rate limiter that limits the total number of search requests regardless of their origin
  2. IP rate limiter that limits the number of search requests per IP address of origin

Personally, I think 1️⃣ is a bad idea: this limit should be defined by Meilisearch itself considering your number of cores / RAM and the number of searches currently running.
I don't see the point of limiting your Meilisearch instance globally, and also, if the global rate limit <= IP rate limit, then one user can completely DoS your Meilisearch instance by using all the available spots.

On the other hand, I think 2️⃣ is great, but if set too low it could bring a horrendous experience, like getting results while you type and then getting locked at the end or before the end of your word, and thus getting zero results once you've given the most complete query.
That's terrible for a search-as-you-type engine 😔
I think it's a good idea to have it, but we should be clear that its value should be quite high, because 50ms could be dramatic with the current engine.
Once we implement the shortcut to return faster results, if we can ensure that we're always going to answer in less than 50ms then it'll become even more useful I think.

@dureuill (Contributor, Author) commented Jan 3, 2023

[about API key limitation]
I'm not sure I see the point of this limitation?
In which case should we use it?

Limiting API key usage is useful in a scheme where a different API key is distributed to each user. This allows authenticating a user with more precision than the IP (I'll develop this point further below).

[About global rate limiter]
Personally, I think 1️⃣ is a bad idea and should be defined by meilisearch itself considering your number of cores / RAM and the number of searches currently running.
I don't see the point of limiting your meilisearch instance globally, and also, if the global rate limiter <= IP rate limiter, then one user can completely DoS your meilisearch instance by using all the available spots.

I think it is possible that we actually agree here.

I don't see the point of limiting your meilisearch instance globally

The point is to control what happens when the instance is overloaded in a more graceful manner. Without global rate limiting, an instance overload will result in an OOM kill, or in being CPU-starved and answering requests very slowly. In legitimate bursts of usage, I think outright rejecting a small number of requests leads to a lower maintenance load than having the instance crash and start rejecting all requests.

should be defined by meilisearch itself considering your number of cores / RAM and the number of searches currently running.

Ideally, it should be the case, and maybe we can set the defaults according to these metrics, but in general this only works when the computing resources are dedicated to Meilisearch, which is not necessarily the case, so a manual level of control can be a welcome escape hatch. Also, I don't personally know how to convert CPU/RAM resources to the correct maximum number of "in flight" requests.

if the global rate limiter <= IP rate limiter

Absolutely; maybe we should error or warn in this case?

On the other hand, I think [IP rate limiting] is great

IP rate limiting has very serious limitations, some of which I think I stated in my rate limiting survey. To restate in a more condensed form:

  1. The main issue is when the instance is behind a reverse proxy of some form: it will only see the IP of the proxy and so become a de facto global limiter. There are workarounds (not implemented here) based on the Forwarded / X-Forwarded-For headers, but they require the reverse proxy to rewrite the header, as the end user might "lie" about it in their request.
  2. In a DoS attack using Tor, even a single attacker will appear as many IPs (one per Tor exit node used in the attack), defeating the purpose of IP rate limiting. Zulip, for instance, provides a mitigation by downloading a list of Tor exit nodes and handling them as a single IP. We might consider this mitigation if need be.
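The workaround in point 1 boils down to trusting only the left-most entry of a proxy-controlled header. A minimal sketch (assuming the reverse proxy rewrites or sanitizes X-Forwarded-For; not the PR's code):

```rust
use std::net::IpAddr;

/// Extract the client IP from an `X-Forwarded-For` header value.
/// The header has the shape "client, proxy1, proxy2"; the left-most entry
/// is the original client, but it is only trustworthy when the reverse
/// proxy overwrites the header instead of forwarding it as received.
fn client_ip(forwarded_for: &str) -> Option<IpAddr> {
    forwarded_for.split(',').next()?.trim().parse().ok()
}

fn main() {
    assert_eq!(client_ip("203.0.113.7, 10.0.0.1"), "203.0.113.7".parse().ok());
    assert_eq!(client_ip("not-an-ip"), None);
}
```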

Hence the motivation to provide an alternative mechanism such as the API key rate limiter. If we can think of other, more convenient mechanisms, I think it would be good too.

I think it's a good idea to have it, but we should be clear that its value should be quite high because 50ms could be dramatic with the current engine.

I'm not sure I understand that sentence, sorry. To give some context, with the rate limiter as implemented, each IP starts with a pool of 200 requests, and that pool is replenished by 1 request every 50ms. This appears sufficient to me for the "interactive usage" use case (maybe not if the user just types "aaaaaaaaaaaaaaaaa" in the form, but I guess in that case we probably don't want to send that many requests anyway?)
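As a sanity check on those numbers: a 50ms cooldown sustains 20 requests per second, and an exhausted 200-request pool takes 10 seconds to refill completely.

```rust
use std::time::Duration;

fn main() {
    let pool: u32 = 200;
    let cooldown = Duration::from_millis(50);

    // Sustained rate: one request replenished per cooldown.
    let sustained_per_sec = 1.0 / cooldown.as_secs_f64();
    assert_eq!(sustained_per_sec, 20.0);

    // Time to refill an exhausted pool from empty.
    let full_refill = cooldown * pool;
    assert_eq!(full_refill, Duration::from_secs(10));
}
```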

@Kerollmops (Member) previously approved these changes Jan 3, 2023 and left a comment:
I like what you've done in this PR.

About the parameters that can't be higher than others: why can't we clamp them to the lowest number and always log the retained values, instead of warning the user in the log? Not sure if we prefer to have a silent behavior 🤔

About supporting human-readable durations, isn't there an equivalent to byte-unit on crates.io? Something able to parse 5h30m20s or 5M4d30h. Also, I don't know if being able to specify nanosecond-precise durations is important; milliseconds or even simple seconds would be enough.

Comment on lines +1575 to +1589
name = "governor"
version = "0.4.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "19775995ee20209163239355bc3ad2f33f83da35d9ef72dea26e5af753552c87"
dependencies = [
"dashmap",
"futures",
"futures-timer",
"no-std-compat",
"nonzero_ext",
"parking_lot",
"quanta",
"rand",
"smallvec",
]
Member:

Can we reduce the number of dependencies that governor brings, here?

dureuill (Author):

I couldn't find a way to disable the features of governor via actix-governor (that uses governor's default features). It is also unclear to me if we can disable them at all: the guide for governor mentions that it uses dashmap by default in the "std mode" but it is unclear what to do to change this and the impact it might have.

@@ -384,6 +390,105 @@ pub fn configure_data(
);
}

#[derive(Clone, Copy)]
pub struct ApiKeyExtractor;
Member:
Would it be possible to document the new structures with a simple sentence, please? What is it used for?

dureuill (Author):

Done, tell me what you think.

@@ -36,7 +37,7 @@ pub fn configure(cfg: &mut web::ServiceConfig) {
)
.service(web::resource("/stats").route(web::get().to(SeqHandler(get_index_stats))))
.service(web::scope("/documents").configure(documents::configure))
.service(web::scope("/search").configure(search::configure))
.service(web::scope("/search").configure(|cfg| search::configure(cfg, rate_limiters)))
Member:
Nice, only enabled on the search route. I like that ❤️

cfg.service(web::scope("/tasks").configure(tasks::configure))
.service(web::resource("/health").route(web::get().to(get_health)))
.service(web::scope("/keys").configure(api_key::configure))
.service(web::scope("/dumps").configure(dump::configure))
.service(web::resource("/stats").route(web::get().to(get_stats)))
.service(web::resource("/version").route(web::get().to(get_version)))
.service(web::scope("/indexes").configure(indexes::configure))
.service(web::scope("/indexes").configure(|cfg| indexes::configure(cfg, rate_limiters)))
Member:
Oh, and the indexes route? Why? Is it costly to ask for indexes info? Not sure.

dureuill (Author):

Not enabled in the indexes route: the configure function of the indexes is what calls the configure function for the search, which requires the rate limiters, so we need to pass them through.

The rate limiters are only in effect on the search route.

@dureuill (Contributor, Author) commented Jan 3, 2023

@Kerollmops

I like what you've done in this PR.

Thanks! I must emphasize that the rate-limiter does not behave as I expected it to, though... as described in the "Notes on the implementation" in the description above

About the parameters that can't be higher than others, why can't we clamp them to the lowest number and always log the retained values instead of warning the user in the log? Not sure if we prefer to have a silent behavior 🤔

I think we can do both: [WARNING] Using the global rate limiting pool value (:globalValue) for the per-IP rate limiting pool value (:ipValue) because it doesn't make sense to globally restrict more than per IP, or something more elegant.
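The clamp-and-warn behavior could look something like this (a sketch of the idea only; the option names in the message are hypothetical):

```rust
/// Clamp the per-IP pool to the global pool, warning when clamping occurs,
/// so the global limiter can never be stricter than the per-IP one.
fn effective_ip_pool(global_pool: u64, ip_pool: u64) -> u64 {
    if ip_pool > global_pool {
        eprintln!(
            "[WARNING] Using the global rate limiting pool value ({global_pool}) \
             for the per-IP rate limiting pool value ({ip_pool}) because it \
             doesn't make sense to globally restrict more than per IP"
        );
        global_pool
    } else {
        ip_pool
    }
}

fn main() {
    assert_eq!(effective_ip_pool(100, 500), 100); // clamped, with a warning
    assert_eq!(effective_ip_pool(100, 50), 50); // kept as-is, silent
}
```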

About supporting human readable durations, isn't there an equivalent to byte-unit on crates.io? Something able to parse: 5h30m20s, 5M4d30h.

I found https://lib.rs/crates/duration-str and https://lib.rs/crates/parse_duration but didn't bother to vet them for a first iteration.

Also, I don't know if being able to specify nanosecond-precise duration is important. Milliseconds or even simple seconds would be enough.

I must stress that this duration is the time needed to replenish a single request from the pool (I wish actix-governor had used a more classic "window" rate limiting algorithm, but this is not up to me). Having it specified in seconds means that users would be unable to replenish, e.g., the 200-request pool in less than 3 minutes and 20 seconds, which seems too much to me.

@Kerollmops (Member) commented:
Thanks! I must emphasize that the rate-limiter does not behave as I expected it to, though... as described in the "Notes on the implementation" in the description above

Yup, I noticed, and this is the reason why we will not merge this PR in v1 😂

I found https://lib.rs/crates/duration-str and https://lib.rs/crates/parse_duration but didn't bother to vet them for a first iteration.

We found https://lib.rs/crates/humantime-serde that was looking pretty good too!

Having it specified in seconds means that users would be unable to replenish, e.g., the 200-request pool in less than 3 minutes and 20 seconds, which seems too much to me.

Indeed, but having a millisecond precision would be enough.

@dureuill dureuill removed this from the v1.0.0 milestone Jan 9, 2023
@curquiza curquiza marked this pull request as draft April 26, 2023 14:48
3 participants