pkg/blobserver/google/cloudstorage: add rate limiting #1285
base: master
Conversation
…e only when it's explicitly specified.
pkg/serverinit/genconfig.go
@@ -733,14 +734,17 @@ func (b *lowBuilder) addGoogleDriveConfig(v string) error {
 	return nil
 }

-var errGCSUsage = errors.New(`genconfig: expected "googlecloudstorage" field to be of form "client_id:client_secret:refresh_token:bucket[/dir/]" or ":bucketname[/dir/]"`)
+var errGCSUsage = errors.New(`genconfig: expected "googlecloudstorage" field to be of form "client_id:client_secret:refresh_token:bucket[/dir/][:ratelimit]" or ":bucketname[/dir/]"`)
It's not obvious what the format/unit of ratelimit is.
I think we should probably hold on making this configurable until the new configuration format is done. (I keep forgetting to finish it, but this is some extra motivation.)
pkg/serverinit/genconfig.go
@@ -755,7 +759,8 @@ func (b *lowBuilder) addGoogleCloudStorageConfig(v string) error {
 	if isReplica {
 		gsPrefix := "/sto-googlecloudstorage/"
 		b.addPrefix(gsPrefix, "storage-googlecloudstorage", args{
-			"bucket": bucket,
+			"bucket":     bucket,
+			"rate_limit": rate,
I think everything else in Perkeep (+ JSON in general?) uses fooBar case, not underscores?
Solved this (and the obviousness of units) by renaming the config item to qps.
@@ -87,6 +89,7 @@ func newFromConfig(_ blobserver.Loader, config jsonconfig.Obj) (blobserver.Stora
 		auth      = config.RequiredObject("auth")
 		bucket    = config.RequiredString("bucket")
 		cacheSize = config.OptionalInt64("cacheSize", 32<<20)
+		ratestr   = config.OptionalString("rate_limit", "10ms") // no faster than 1 req per this duration
Hah, cacheSize above is without underscore and client_id and client_secret below use an underscore. Great. :)
Where does 10ms come from? Is that a Google-published number?
Added a comment referencing the relevant GCS docs. Short answer: up to 1000 qps for writing and 5000 qps for reading is fine; above that, you're supposed to ramp up to it slowly. The default here is very conservative.
@@ -108,16 +111,22 @@ func newFromConfig(_ blobserver.Loader, config jsonconfig.Obj) (blobserver.Stora
 	if dirPrefix != "" && !strings.HasSuffix(dirPrefix, "/") {
 		dirPrefix += "/"
 	}

+	dur, err := time.ParseDuration(ratestr)
+	if err != nil {
You'll probably want to return a more descriptive error here so the user knows that this was from their rate limit.
Update: this change does not in fact suffice to avoid rate-limit-exceeded errors. Digging into it a little further, I find that many (all?) write operations modify the same object: my PGP public-key blob! It's the modifying of that one object that happens too often for GCS's taste, even if the overall rate is well within limits. Offhand, it looks like the GCS blobserver code will happily upload any blob anew even if it's already present. I'll see if I can whip up a fix for that (in a separate PR).
…has no OptionalFloat method.)
This solves rate-limit-exceeded errors when dumping large numbers of blobs into a Google Cloud Storage backend. The default limit is 1 request per 10ms and can be overridden with the new optional setting rate_limit.