-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add crude rate limiter #3283
base: main
Are you sure you want to change the base?
Add crude rate limiter #3283
Conversation
853fe3c
to
f0b4a00
Compare
I'm not sure I see the point of this limitation?
Personally, I think 1锔忊儯 is a bad idea and should be defined by meilisearch itself considering your number of cores / RAM and the number of searches currently running. On the other hand, I think 2锔忊儯 is great, but if set too low could bring a horrendous experience, like getting results while you type and then getting locked at the end or before the end of your word and thus getting zero results once you gave the most complete query. |
Limiting API key usage is useful in a scheme where a different API key is distributed to each user. This allows authenticating a user with more precisions than the IP (I'll develop this point further in a minute).
I think it is possible that we actually agree here.
The point is to control what happens when the instance is overloaded in a more graceful manner. When there is no global rate limiting, an instance overload will result in an OOM kill, or being limited in CPU and answering very slowly to requests. In legitimate bursts of usage, I think it might lead lower maintenance load to outright reject a small number of requests than to have the instance crash and starts rejecting all requests.
Ideally, it should be the case, and maybe we can set the defaults according to these metrics, but in general, it only works when the computing resources are dedicated to Meilisearch which is not necessarily the case, so a manual level of control can be a welcome escape hatch. Also, I don't personally know how to convert CPU/RAM resources to the correct maximum number of "in flight" requests.
absolutely, maybe we should error or warn in this case?
IP rate limiting has very serious limitations, some of which I think I stated in my rate limiting survey. To restate in a more condensed form:
Hence the motivation to provide an alternative mechanism such as the API key rate limiter. If we can think of other, more convenient mechanisms, I think it would be good too.
I'm not sure I understand that sentence, sorry. To give some context, with the rate limiter as implemented, each IP starts with a pool of 200 requests, and that pool is replenished by 1 request every 50ms. This appears sufficient to me to address the "interactive usage" use case (maybe not if the user just presses "aaaaaaaaaaaaaaaaa" in the form, but I guess in that case, we probably don't want to send that many requests?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like what you've done in this PR.
About the parameters that can't be higher than others, why can't we clamp them to the lowest number and always log the retained values instead of warning the user in the log? Not sure if we prefer to have a silent behavior 馃
About supporting human readable durations, isn't there an equivalent to byte-unit
on crates.io? Something able to parse: 5h30m20s, 5M4d30h. Also, I don't know if being able to specify nanosecond-precise duration is important. Milliseconds or even simple seconds would be enough.
name = "governor" | ||
version = "0.4.2" | ||
source = "registry+https://github.com/rust-lang/crates.io-index" | ||
checksum = "19775995ee20209163239355bc3ad2f33f83da35d9ef72dea26e5af753552c87" | ||
dependencies = [ | ||
"dashmap", | ||
"futures", | ||
"futures-timer", | ||
"no-std-compat", | ||
"nonzero_ext", | ||
"parking_lot", | ||
"quanta", | ||
"rand", | ||
"smallvec", | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we reduce the number of dependencies that governor brings, here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't find a way to disable the features of governor via actix-governor (that uses governor's default features). It is also unclear to me if we can disable them at all: the guide for governor mentions that it uses dashmap by default in the "std mode" but it is unclear what to do to change this and the impact it might have.
@@ -384,6 +390,105 @@ pub fn configure_data( | |||
); | |||
} | |||
|
|||
#[derive(Clone, Copy)] | |||
pub struct ApiKeyExtractor; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to document the new structures with a simple sentence, please? What is it used for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, tell me what you think.
@@ -36,7 +37,7 @@ pub fn configure(cfg: &mut web::ServiceConfig) { | |||
) | |||
.service(web::resource("/stats").route(web::get().to(SeqHandler(get_index_stats)))) | |||
.service(web::scope("/documents").configure(documents::configure)) | |||
.service(web::scope("/search").configure(search::configure)) | |||
.service(web::scope("/search").configure(|cfg| search::configure(cfg, rate_limiters))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, only enabled on the search route. I like that 鉂わ笍
cfg.service(web::scope("/tasks").configure(tasks::configure)) | ||
.service(web::resource("/health").route(web::get().to(get_health))) | ||
.service(web::scope("/keys").configure(api_key::configure)) | ||
.service(web::scope("/dumps").configure(dump::configure)) | ||
.service(web::resource("/stats").route(web::get().to(get_stats))) | ||
.service(web::resource("/version").route(web::get().to(get_version))) | ||
.service(web::scope("/indexes").configure(indexes::configure)) | ||
.service(web::scope("/indexes").configure(|cfg| indexes::configure(cfg, rate_limiters))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ho and the indexes route? Why? Is it costly to ask for indexes info? Not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not enabled in the indexes route: the configure function of the indexes is what calls the configure function for the search, which requires the rate limiters, so we need to pass them through.
Thanks! I must emphasize that the rate-limiter does not behave as I expected it to, though... as described in the "Notes on the implementation" in the description above
I think we can do both:
I found https://lib.rs/crates/duration-str and https://lib.rs/crates/parse_duration but didn't bother to vet them for a first iteration.
I must stress that the duration is the necessary duration to replenish a single request from the pool (I wish that actix governor had used a more classic "window" rate limiting algorithm, but this does not depend on me). Having it specified in seconds means that users would be unable to replenish, e.g. the 200-request pool before 1 minute and 4脿 seconds, which seems too much for me. |
f0b4a00
to
6678491
Compare
Yup, I noticed and this is the reason why we will not merge this PR in v1 馃槀
We found https://lib.rs/crates/humantime-serde that was looking pretty good too!
Indeed, but having a millisecond precision would be enough. |
Pull Request
Related issue
Related to #3126 (rate limiting)
What does this PR do?
--rate-limiting-*
options to configure the rate limiters (enabled? pool size? cooldown duration?)Notes on the implementation
cargo run --release -- --rate-limiting-api-key-pool 1 --rate-limiting-api-key-cooldown-ns 50000000000
I would expect that any request past the first one would be blocked for 50s, but this is not what happens: even at startup I'm allowed to send up to 8 requests, and then I have to wait for about 45s... Not sure where the fault lies here.--rate-limiting-disable-api-key
when no Master Key was provided 馃 .--rate-limiting-cooldown 4s
, which the std doesn't allow.PR checklist
Please check if your PR fulfills the following requirements:
Thank you so much for contributing to Meilisearch!