
Pick one compression algorithm #2756

Open
martinthomson opened this issue Mar 12, 2024 · 20 comments

Comments

@martinthomson
Contributor

The current compression dictionary spec defines both br-d and zstd-d. These algorithms have roughly similar performance, so why do we need two?

Having the ability to negotiate different algorithms is useful, but having two means far less interoperability. If you want good interoperability, then either servers generally implement both or clients generally implement both. Anything less than that and you fail to use the feature when you could otherwise. In this case, it probably means less performance for those who fail to negotiate the feature, which isn't so serious, but it's still a missed opportunity... and worse if it means that some sites or browsers lose out on performance.

Given the potential need for a server or caches to hold tons of content compressed with two algorithms, the logical way to get good interoperability is to insist that clients implement both. Being responsible for a client, however, my preference is to avoid carrying two implementations, so I don't like that outcome.

From my perspective, I see no reason not to limit this to zstd-d and drop br-d. Performance is generally better in zstd than brotli, except in some narrow cases. But I don't care as much about that choice as I do about eliminating one or another.
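For readers following along, the negotiation under discussion works roughly like this: the client hashes a previously fetched resource it holds as a dictionary and advertises it alongside the dictionary-aware content codings. A minimal Python sketch, assuming the header names from the compression dictionary transport draft (the exact serialization details may differ):

```python
import base64
import hashlib

def available_dictionary_header(dictionary: bytes) -> str:
    """Build an Available-Dictionary value: the SHA-256 hash of the
    cached dictionary, serialized as a structured-field byte sequence."""
    digest = hashlib.sha256(dictionary).digest()
    return ":" + base64.b64encode(digest).decode("ascii") + ":"

# A client that has a previous version of a resource cached as a
# dictionary advertises it along with the dictionary-aware codings;
# the server can then respond with Content-Encoding: br-d or zstd-d.
cached = b"...previously fetched resource bytes..."
headers = {
    "Accept-Encoding": "gzip, br, zstd, br-d, zstd-d",
    "Available-Dictionary": available_dictionary_header(cached),
}
```

The interoperability question in this issue is about that last step: if the client only implements one of `br-d`/`zstd-d` and the server only has content prepared with the other, the negotiation silently falls back to plain `br`/`zstd`/`gzip`.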

@pmeenan
Contributor

pmeenan commented Mar 21, 2024

CPU performance on the compression side of things is around the same last I heard (when comparing equivalent compression ratios) with ZStandard being faster on decompression.

I'm not an expert on either but I believe there are also differences on how they deal with streaming compression and accumulating data by block before emitting compressed data that could impact the dynamic use case.

At the limit, Brotli 11 compresses significantly smaller than Zstandard 22 in all of the dictionary compression tests I've run. That matters less for the dynamic on-the-fly case but is probably worth the savings when delta compressing at build time.
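Neither Brotli nor Zstandard is in the Python standard library, but the delta-compression idea itself can be sketched with zlib's preset-dictionary support (DEFLATE, so the absolute ratios are far worse than either, but the mechanism is the same): compressing a new version of a resource against the old version as a dictionary collapses all the shared bytes.

```python
import hashlib
import zlib

# Deterministic "old version" bytes that don't compress well on their
# own, so the dictionary's contribution is easy to see (4 KB).
old = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(128))
new = old[:2000] + b"PATCHED" + old[2000:]  # new version: a small edit

# Without a dictionary, the new version is essentially incompressible.
plain = zlib.compress(new, 9)

# With the old version as a preset dictionary, only the edit and some
# match references need to be encoded.
co = zlib.compressobj(9, zlib.DEFLATED, zdict=old)
delta = co.compress(new) + co.flush()

# Decompression requires the exact same dictionary bytes, which is why
# the negotiation hashes the dictionary before the server uses it.
do = zlib.decompressobj(zdict=old)
assert do.decompress(delta) == new

print(len(new), len(plain), len(delta))  # delta is far smaller
```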

Doesn't the same question hold for the zstd and br content encodings? The vast majority of servers support br for dynamic compression mostly because that's all that clients have supported, but as Chrome rolled out support for zstd, some companies preferred to use it over br, and some used it where they weren't using br at all.

If we had to pick, we would probably prefer br-d over zstd-d.

@pmeenan
Contributor

pmeenan commented Mar 21, 2024

A couple of other notes that are probably limitations of the cli tools or default configs that we have seen:

  • Zstandard allows using the dictionary at all compression levels. Brotli doesn't start using it until level 5, so use cases that want minimal CPU for "some" compression would favor Zstandard (though the Brotli team has mentioned that this is just a config on the compressing side and could be changed).
  • The Brotli CLI is limited to 50 MB dictionaries, while Zstandard uses the dictionary as part of the compression window, which defaults to 128 MB (though behavior diverges as you approach those limits: Brotli always has the dictionary available, while Zstandard loses it at window size + 1 byte).

@pmeenan
Contributor

pmeenan commented Apr 3, 2024

@martinthomson given the window size discussions in #2754 are you still interested in limiting it to a single algorithm?

If so, I can ping the mailing list to get more feedback, but my recommendation in that case would be Brotli, since it allows larger resources to be delta-compressed (up to 50 MB vs 8 MB for Zstandard), given how they treat the dictionary relative to the compression window.

@martinthomson
Contributor Author

It weakens my position slightly, but I find it hard to get excited about improving the performance of resources that are more than 8MB in size.

I'd still prefer one rather than two, especially since these schemes have substantially similar performance. The differences only affect very specific scenarios.

@ekinnear

ekinnear commented Apr 5, 2024

I'm mixed on this; normally I'd advocate for fewer options, which can lead to wider deployment of just one.

That said, if we think that br and zstd are both worth keeping around (and we appear to), we've been thinking of the ability to transport additional compression dictionaries as a generic feature that can be supported by any relatively modern compression algorithm. From that perspective, it would be nice to have it be available/consistently defined for both brotli and zstandard.

@martinthomson
Contributor Author

if we think that br and zstd are both worth keeping around (and we appear to)

I'm not sure about this, personally. I'm still of the view that we would be better off picking one.

@yoavweiss
Contributor

I agree with @pmeenan that:

  • Accept-Encoding already supports multiple encodings and we're better off for it.
  • In the immediate term, brotli seems more valuable than zstd.

I haven't dug into it, but it is possible that e.g. brotli is better for the static case whereas zstd is better for the dynamic one.
Beyond that, shipping with multiple options helps guard against ossification.

@martinthomson
Contributor Author

We're already shipping with multiple options. That's the value that using content codings provides. We get to use this alongside gzip, which isn't going away any time soon.

As for two, when the differences we're talking about are so minor, the only reason to do both is probably because browsers have both already. That's not a great reason.

@pmeenan
Contributor

pmeenan commented Apr 30, 2024

I'm not sure the differences are that minor. I need to pull some hard data to be sure, but based on what I've seen so far:

Brotli Benefits

  • Can delta-compress up to 50 MB resources with 16 MB window
  • Compresses measurably better at the max (11) compared to Zstandard (22)
  • Streaming compressor - emits first bytes with less latency than Zstandard (block-based)

Zstandard Benefits:

  • Effective dictionary compression at lower quality levels (Brotli only starts using dictionaries at level 5)
  • Able to delta-compress resources north of 100 MB (requires a window as big as the resource)
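The streaming point above is observable with any streaming compressor: a sync flush forces the encoder to emit immediately decodable bytes for everything consumed so far, before the input is complete. A zlib sketch of the behavior being described (illustrating the property, not Brotli itself):

```python
import zlib

co = zlib.compressobj(6)

# Feed the first chunk of a response and force a sync flush: the bytes
# emitted are decodable right away, so they can go on the wire now.
head_input = b"<html><body>streamed head" * 40
first = co.compress(head_input) + co.flush(zlib.Z_SYNC_FLUSH)
assert len(first) > 0

# A block-based encoder would instead buffer until its block fills,
# adding latency before the first bytes reach the client.
rest = co.compress(b"...the remainder of the page...") + co.flush()

do = zlib.decompressobj()
head = do.decompress(first)   # already yields the complete head
assert head == head_input
body = do.decompress(rest)
```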

I know Meta has quite a bit of Zstandard infrastructure for their apps and I'm sure there are reasons I'm missing for why it was chosen.

If I were to pick only one it would be Brotli because it better maps to the HTTP use cases, both in the static case (max compression) and dynamic case (lower latency) assuming the dynamic compression can be at least level 5.

It's possible that both could be improved without changing the stream format. I know Brotli's compression CPU cost has improved since the original releases and is now similar to Zstandard's, and maybe changes can be made so that dictionaries can be utilized at lower compression levels.

As things stand right now, I could see developers picking one or the other specifically for their use case based on just the differences above. Having both as options also keeps the incentive to improve both of them.

@martinthomson
Contributor Author

I guess we'll have to disagree. Those seem pretty minor to me. Compression performance (bytes, CPU) is close enough to make no real difference. If more differences manifest when applying larger amounts of CPU or memory, I don't think that's very relevant to the use case.

@pmeenan
Contributor

pmeenan commented Apr 30, 2024

Just to put some numbers behind the static size differences, using the wayback machine, the YouTube desktop player from April 29th 2024 is 8,868,144 bytes. Using the player from April 26th 2024 as a dictionary at the maximum settings:

Brotli 11: 1,429,850 bytes
Zstandard 22: 1,495,378 bytes
Brotli 11 with dictionary: 45,854 bytes
Zstandard 22 with dictionary: 138,134 bytes

Both are significantly smaller using dictionaries but the Brotli one is 1/3 the size of the Zstandard one. At scale, that becomes pretty significant.

On the latency front, I'm probably a bit biased, coming from fighting for every ms of TTFB for search and for large sites that stream their responses, like Amazon and Yahoo, but even a few milliseconds of buffering delay can make or break a ship decision on dictionary compression.

On the resource size side of things, something like the web version of Adobe Photoshop ships wasm bundles that are 60+ MB. Not being able to use Zstandard would likely be the difference between being able to use delta compression and not. Though it only extends the usable range from 50 MB to 128 MB; past the window size, the dictionary no longer helps the delta-encoding case.

@martinthomson
Contributor Author

That's a fairly significant difference. How confident are you that this is a product of the format, as opposed to current algorithmic choices? Other metrics I've seen show a smaller gap between the two at different times, which suggests that the zstd format might produce a similar result with a better compressor in the same way that the brotli compressor has improved recently.

Photoshop probably doesn't need to ship all 60+MB at once, I'd hope.

@pmeenan
Contributor

pmeenan commented Apr 30, 2024

There's a much smaller gap at all levels in the non-dictionary case. I can't speak to how much of that is from framing and format decisions vs optimization. It's possible that work on the compressor side of things could reduce the difference if resources are tasked to work on it. Is that work more likely to be funded if both are available for sites to use, or if only one is (competition vs need)?

As things stand today, Brotli looks better for the use cases in discussion so I guess the question would be if there are reasons to also include Zstandard rather than to try to get Zstandard to perform as well as Brotli (at least in the specific case of maximum compression for delta-encoded resources).

I do know that how dictionaries and window sizes interact, as well as the limitations on dictionary sizes, is a result of the format.

As far as Photoshop goes, I don't know if it is "needed" but there are multiple wasm assets that make up the application, several of which are over 50 MB. Today, all of those need to be fully downloaded every time the app updates. I don't know that increasing the limit from 50 MB to 128 MB is enough to justify Zstandard all on its own but it does remove a blocker for some use cases in production today.

@martinthomson
Contributor Author

That makes a pretty solid case for (a) brotli-only and (b) not running Photoshop on the web.

Leaving the second part aside... What justifies retaining zstd then? Could we leave zstd for now?

@pmeenan
Contributor

pmeenan commented May 1, 2024

FWIW, the points I observed were from the perspective of a web dev. There very well may be deployment reasons why one would pick Zstandard over Brotli that I'm just not aware of. Maybe we can get the Facebook team to weigh in on why they chose Zstandard for their app-based HTTP traffic over Brotli.

Given we're just rolling out support for zstd in Chrome and Firefox now, it seems a bit strange to limit the dictionary case to a single algorithm while letting sites pick between br and zstd as they see fit.

I know there are a couple of large sites that had not been using br but had deployed zstd. It's possible that's because of older CPU-utilization tests, or maybe Zstandard's libs simply integrated better with their serving infra.

Basically, I have a strong case for wanting to keep Brotli, but I don't know about any cases where Zstandard may be a significantly better option and I'd hate to make that call and exclude those cases from being explored.

@felixhandte

felixhandte commented May 1, 2024

Just to put some numbers behind the static size differences, using the wayback machine, the YouTube desktop player from April 29th 2024 is 8,868,144 bytes. Using the player from April 26th 2024 as a dictionary at the maximum settings:

Brotli 11: 1,429,850 bytes
Zstandard 22: 1,495,378 bytes
Brotli 11 with dictionary: 45,854 bytes
Zstandard 22 with dictionary: 138,134 bytes

I get a different result

$ curl -s -L https://web.archive.org/web/20240426053445js_/https://www.youtube.com/s/desktop/5e42dd8a/jsbin/desktop_polymer.vflset/desktop_polymer.js -o 20240426053445_desktop_polymer.js
$ curl -s -L https://web.archive.org/web/20240428234943js_/https://www.youtube.com/s/desktop/5519da25/jsbin/desktop_polymer.vflset/desktop_polymer.js -o 20240428234943_desktop_polymer.js
$ zstd -vv --ultra -22 --long -D 20240426053445_desktop_polymer.js 20240428234943_desktop_polymer.js -o diff.zst
*** Zstandard CLI (64-bit) v1.5.6, by Yann Collet ***
--zstd=wlog=27,clog=26,hlog=25,slog=9,mml=3,tlen=999,strat=9
--format=.zst --no-sparse --block-size=0 --memory=134217728 --threads=1 --content-size
Loading 20240426053445_desktop_polymer.js as dictionary 
Decompression will require 8868143 B of memory
20240428234943_desktop_polymer.js :  0.58%   (8868143 B =>  51051 B, diff.zst) 
20240428234943_desktop_polymer.js : Completed in 5.48 sec  (cpu load : 100%)

I guess you missed the --ultra flag in your test, which is what lets the CLI use more than an 8 MB window. Since this JS file is just over 8 MB, the corresponding sections at the tail end of the input were out of reach.

@felixhandte

With respect to "why zstd?": Meta saw significant performance benefits from using zstd, both without and then with dictionaries, for our traffic to our mobile apps. We root-caused a significant part of that benefit to the faster decompression speed it offers over brotli and gzip. We also see better density with Zstd than Brotli at the levels we use for dynamic responses.

Also, I understand the desire for simplicity in the spec, but this doesn't seem like a topic that we should decide by fiat in the spec. Perhaps it would be sufficient to add a little more clarity to the draft that the two encodings it defines, although they share a lot of underlying mechanics, are independent of one another and implementers can implement either one without the other should they choose.

@pmeenan
Contributor

pmeenan commented May 1, 2024

Thanks, that helps. I missed the --ultra and --long flags going past 19.

If there is concern about the encodings being tied with the negotiation, I could potentially separate out the actual brotli and Zstandard dictionary encodings as separate specs and have this just define the dictionary hash negotiation. We could also word it in such a way that the negotiation could be used for future dictionary-aware encodings but that it includes the details for encodings that can already handle dictionary-based encoding as of the time of the spec.
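If negotiation were split out from the specific codings, the server side would reduce to a generic selection step over whatever dictionary-aware codings exist at the time. A hypothetical sketch of that selection (the function name and the registry of codings are illustrative, not from the draft, and q-values are ignored for simplicity):

```python
# Hypothetical registry: codings known to support an external dictionary.
DICTIONARY_CODINGS = {"br-d", "zstd-d"}

def pick_dictionary_coding(accept_encoding: str,
                           preference=("br-d", "zstd-d")):
    """Pick a dictionary-aware coding the client accepts, or None.

    A real server would also honor q-weights and verify that the
    Available-Dictionary hash matches a dictionary it holds.
    """
    offered = {token.split(";")[0].strip()
               for token in accept_encoding.split(",")}
    for coding in preference:
        if coding in DICTIONARY_CODINGS and coding in offered:
            return coding
    return None

print(pick_dictionary_coding("gzip, br, zstd-d"))    # zstd-d
print(pick_dictionary_coding("gzip, br-d, zstd-d"))  # br-d
print(pick_dictionary_coding("gzip, br"))            # None
```

The design point is that nothing in the selection step depends on which codings exist, which is what would let future dictionary-aware encodings reuse the same negotiation.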

In either case, I think "what we ship as browsers" should be separated out from "what we specify for HTTP" and artificially limiting it to (and picking) one at the spec level is probably not appropriate.

With my browser hat on, I still think Chrome would be interested in supporting both (at least initially) and letting the ecosystem figure out what works well for different situations.

@pmeenan
Contributor

pmeenan commented May 16, 2024

Having talked to both the Brotli and Zstandard devs, there are different tradeoffs that each team made that are fundamental to the stream formats that impact the compression rates, CPU utilization and encode/decode latency.

There are also differences in the encoder implementations that each may be able to improve on by leveraging ideas implemented in the other but I'm not convinced that those changes will necessarily be explored without the competition that having options brings.

As things stand right now, I'd rather spec both and see how the usage evolves for the various use cases.

@pmeenan
Contributor

pmeenan commented May 28, 2024

I'm seeing ~3 options for moving forward on this:

1 - Strip out both content encodings and spec only the dictionary negotiation.

This feels like we'd just be kicking the can down the road a bit, since both the Brotli and Zstandard teams would follow up with specs for the dictionary-encoded versions and we'd have the same discussion across 2 different drafts.

2 - Spec a single content encoding that is allowed to be used with the dictionary negotiation and disallow other encodings.

This would draw pretty strong objections from not just the teams that prefer the encoding not picked but by those interested in non-browser use cases in the future that may use different encodings (including the same algorithms with different parameters).

3 - Spec both in HTTP and have the web-facing encodings discussion in browser forums (maybe the intent process?)

This is probably the cleanest and would keep the web ecosystem discussions in browser-space instead of at the HTTP level but may remove a bit of leverage from the discussion and also pushes those discussions down the road. Maybe make it part of the standards positions as well? mozilla/standards-positions#771

@martinthomson are you comfortable with something like option 3 or would you prefer option 1 where the content encodings are defined separately? From the discussion on the list and in this issue, I don't see option 2 as likely to happen.


5 participants