Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster decompression. #793

Open
wants to merge 2 commits into
base: develop
Choose a base branch
from
Open

Conversation

icodywu
Copy link

@icodywu icodywu commented Mar 28, 2023

I made the following changes to effectively boost the decompression throughput (~40% over Silesia Corpus)

  1. faster data copy for LZ match, as given in FAST_LZ_COPY()
  2. faster Huffman decoding, as defined by HUFF_DECODE_LEN(), HUFF_DECODE_DIST(), using the redefined Huffman decoding tables.
  3. Break up the long sequence of if...else if...else if... inside the for loop in infate_fast().
  4. Buffer read in size_t, as opposed to byte-wise
  5. Expanded boundary, left>=258 -> left>=8, for applying inflate_fast()

Need to do:
Validate 32-bit CPU
Validate infback()
Add abundant comments in the final stage

@icodywu
Copy link
Author

icodywu commented Apr 11, 2023

Dear Maxpaj,

Would you mind explaining what in your view is bogus? Did you test the code performance?

@nmoinvaz
Copy link
Contributor

nmoinvaz commented Apr 11, 2023

It might be a good idea to split your PR into multiple functional commits explaining each change in detail. This allows each change to be performance tested individually.

@icodywu
Copy link
Author

icodywu commented Apr 12, 2023

I appreciate your constructive comments. The changes are intermingled, so it is difficult to split them into multiple commits in a clear-cut way. How about I amend a document to contrast each of the critical changes and explain the underlying reasons?

@madler
Copy link
Owner

madler commented Apr 12, 2023

I am waiting for the proposer's to-do list to be completed, and I will then test the commit for speed and inspect for correctness and portability. I will then consider whether I should merge it or some rewrite of it.I do not require that this be split into multiple commits.

- added inflateBackWrap() for stand-alone validation of inflateBack(). Moreover, inflateBackWrap() exhibits identical functionality as in inflate().
- added comments for code revisions
- added log in ChangeLog
@icodywu
Copy link
Author

icodywu commented May 3, 2023

I have created inflateBackWrap() to validate inflateBack(). Moreover, inflateBackWrap() exhibits identical functionality as inflate(). This allows for an apple-to-apple comparison of performance between inflate() and inflateBack() under different setups.
I have also added detailed comments inside the code to explain the underlying rationale.

@madler, could you take over and review the code? Feel free to make changes in your manner.

@hsiangkao
Copy link

hsiangkao commented Sep 4, 2023

Hi @madler, may I ask what the current status of this work.

we'd like to enable Intel DEFLATE IAA/QAT in-kernel support for EROFS, but we'd also like to have a faster DEFLATE software decompression fallback support anyway.

I'm not sure if zlib official codebase could address that (otherwise zlib-ng codebase has to be considered instead) since DEFLATE-family is currently a de-facto standard for various common use cases/formats, and better optimized (de)compression for those modern processors could benefit all of us.

And I guess old 16-bit platforms are unimportant now since they could still be supported by old stable zlib versions.

@madler
Copy link
Owner

madler commented Sep 4, 2023

This is on my to-do list.

@madler madler changed the base branch from master to develop January 17, 2024 23:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants