zypper is really slow #513

Open
jengelh opened this issue Nov 1, 2023 · 7 comments

@jengelh
Contributor

jengelh commented Nov 1, 2023

OS/Version: openSUSE Tumbleweed 20231030 x86_64, zypper-1.14.66-2.4, libzypp-17.31.23-14.2, libsolv-tools-0.7.25-1.2

Obtain a baseline:

# zypper --non-interactive in --download-only $(cat ghclist)
# time rpm -Uhv /var/cache/zypp/packages/base/x86_64/ghc-*.rpm
…
 157:ghc-ordered-containers-0.2.3-1.7 ################################# [ 99%]
 158:ghc-utf8-string-1.0.2-2.10       ################################# [100%]

real    0m2.694s
user    0m1.119s
sys     0m0.672s
# sync; time rpm -e $(rpm -qa 'ghc-*')
real    0m2.933s
user    0m1.482s
sys     0m0.636s

Then exercise zypper. Yes, there's a bigger solver component in zypper than there is in rpm, but it accounts for only a fraction of the overall wall time.

# time zypper --non-interactive in $(cat ghclist)
Loading repository data...
Reading installed packages...
Resolving package dependencies...

The following 158 NEW packages are going to be installed:
  ghc-Glob ghc-JuicyPixels ghc-OneTuple ghc-Only ghc-QuickCheck ghc-SHA ghc-StateVar ghc-aeson ghc-aeson-pretty
  ghc-ansi-terminal ghc-ansi-terminal-types ghc-appar ghc-array ghc-asn1-encoding ghc-asn1-parse ghc-asn1-types ghc-assoc
  ghc-async ghc-attoparsec ghc-base ghc-base-compat ghc-base-compat-batteries ghc-base-orphans ghc-base16-bytestring ghc-base64
  ghc-base64-bytestring ghc-basement ghc-bifunctors ghc-binary ghc-bitvec ghc-blaze-builder ghc-blaze-html ghc-blaze-markup
  ghc-byteorder ghc-bytestring ghc-case-insensitive ghc-cassava ghc-cereal ghc-colour ghc-commonmark ghc-commonmark-extensions
  ghc-commonmark-pandoc ghc-comonad ghc-conduit ghc-conduit-extra ghc-connection ghc-containers ghc-contravariant ghc-cookie
  ghc-cryptonite ghc-data-default ghc-data-default-class ghc-data-default-instances-containers ghc-data-default-instances-dlist
  ghc-data-default-instances-old-locale ghc-data-fix ghc-deepseq ghc-digest ghc-directory ghc-distributive ghc-dlist
  ghc-doclayout ghc-doctemplates ghc-emojis ghc-exceptions ghc-file-embed ghc-filepath ghc-foldable1-classes-compat
  ghc-generically ghc-ghc-boot-th ghc-gridtables ghc-haddock-library ghc-hashable ghc-haskell-lexer ghc-hourglass
  ghc-http-client ghc-http-client-tls ghc-http-types ghc-indexed-traversable ghc-indexed-traversable-instances
  ghc-integer-logarithms ghc-iproute ghc-ipynb ghc-jira-wiki-markup ghc-libyaml ghc-memory ghc-mime-types ghc-mono-traversable
  ghc-mtl ghc-network ghc-network-uri ghc-old-locale ghc-ordered-containers ghc-pandoc-types ghc-parsec ghc-pem ghc-pretty
  ghc-pretty-show ghc-primitive ghc-process ghc-random ghc-regex-base ghc-regex-tdfa ghc-resourcet ghc-safe ghc-scientific
  ghc-semialign ghc-semigroupoids ghc-socks ghc-split ghc-splitmix ghc-stm ghc-streaming-commons ghc-strict ghc-syb ghc-tagged
  ghc-tagsoup ghc-template-haskell ghc-temporary ghc-texmath ghc-text ghc-text-conversions ghc-text-short ghc-th-abstraction
  ghc-th-compat ghc-th-lift ghc-th-lift-instances ghc-these ghc-time ghc-time-compat ghc-tls ghc-transformers
  ghc-transformers-compat ghc-typed-process ghc-typst-symbols ghc-unicode-collation ghc-unicode-data ghc-unicode-transforms
  ghc-uniplate ghc-unix ghc-unliftio-core ghc-unordered-containers ghc-utf8-string ghc-uuid-types ghc-vector
  ghc-vector-algorithms ghc-vector-stream ghc-witherable ghc-x509 ghc-x509-store ghc-x509-system ghc-x509-validation ghc-xml
  ghc-xml-conduit ghc-xml-types ghc-yaml ghc-zip-archive ghc-zlib

158 new packages to install.
Overall download size: 0 B. Already cached: 31.4 MiB. After the operation, additional 211.3 MiB will be used.
Continue? [y/n/v/...? shows all options] (y): y
In cache ghc-base-4.17.2.0-1.2.x86_64.rpm                                                                (1/158),   3.1 MiB    
…
(158/158) Installing: ghc-http-client-tls-0.3.6.1-2.10.x86_64 ...........................................................[done]
Running post-transaction scripts ........................................................................................[done]
real    0m39.364s
user    0m19.816s
sys     0m17.949s

# time zypper --non-interactive rm 'ghc-*'
…
(158/158) Removing ghc-base-4.17.2.0-1.2.x86_64 .........................................................................[done]
Running post-transaction scripts ........................................................................................[done]
There are running programs which still use files and libraries deleted or updated by recent upgrades. They should be restarted to benefit from the latest updates. Run 'zypper ps -s' to list these programs.
real    0m30.475s
user    0m16.013s
sys     0m13.089s

@jengelh
Contributor Author

jengelh commented Nov 1, 2023

Cross-checking the expected baseline with dnf:

# dnf5 install --downloadonly $(cat ghclist)
# sync; time dnf5 install -y $(cat ghclist)
…
[158/158] Total                                                                        100% |   0.0   B/s |   0.0   B |  00m00s
Running transaction
[  1/160] Verify package files                                                         100% | 993.0   B/s | 158.0   B |  00m00s
[  2/160] Prepare transaction                                                          100% |   1.0 KiB/s | 158.0   B |  00m00s
[  3/160] Installing ghc-base-0:4.17.2.0-1.2.x86_64                                    100% | 189.0 MiB/s |  19.3 MiB |  00m00s
…
[160/160] Installing ghc-utf8-string-0:1.0.2-2.10.x86_64                               100% |   1.2 MiB/s | 296.6 KiB |  00m00s

real    0m4.599s
user    0m2.451s
sys     0m1.128s

# sync; time dnf5 remove -y 'ghc-*'
…
Transaction Summary:
 Removing:        158 packages

After this operation 211 MiB will be freed (install 0 B, remove 211 MiB).

Running transaction
[  1/159] Prepare transaction                                                          100% |   1.0 KiB/s | 158.0   B |  00m00s
[  2/159] Erasing ghc-commonmark-pandoc-0:0.2.1.3-2.15.x86_64                          100% | 444.0   B/s |   4.0   B |  00m00s
…
[159/159] Erasing ghc-base-0:4.17.2.0-1.2.x86_64                                       100% | 108.0   B/s |  16.0   B |  00m00s
>>> Running post-uninstall scriptlet: ghc-base-0:4.17.2.0-1.2.x86_64
>>> Stop post-uninstall scriptlet: ghc-base-0:4.17.2.0-1.2.x86_64

real    0m2.587s
user    0m1.078s
sys     0m0.820s

@mlandres
Member

mlandres commented Nov 3, 2023

Maybe you can provide us with the zypper.log of the install and remove commands. And if you're going to measure again, it would be interesting to see whether the numbers for ZYPP_SINGLE_RPMTRANS=1 zypper ... also differ that much.
We are aware that several actions unrelated to installation consume extra time and leave room for improvement. Most of them are related to the repo autorefresh and cache building.
The traditional zypp backend and rpm/dnf are hard to compare, because the traditional backend forks and execs rpm for each single package. SINGLE_RPMTRANS uses librpm to form a single transaction. The pre- and post-processing unrelated to installation, however, is the same in both cases.
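
The comparison requested here could be scripted roughly as follows. This is only a sketch: `time_install` is a hypothetical helper and not part of zypper, `ghclist` is the package list file from the original report, and it assumes that leaving ZYPP_SINGLE_RPMTRANS unset selects the traditional backend.

```shell
#!/bin/sh
# Hypothetical helper (not part of zypper): time one install of the packages
# listed in a file, with the backend selected via ZYPP_SINGLE_RPMTRANS.
#   $1 = 1 for the single rpm transaction backend, anything else for the
#        traditional backend (variable left unset, an assumption)
#   $2 = file containing the package names
time_install() {
    sync
    if [ "$1" = 1 ]; then
        time env ZYPP_SINGLE_RPMTRANS=1 zypper --non-interactive in $(cat "$2")
    else
        time zypper --non-interactive in $(cat "$2")
    fi
}

# Usage, as root, removing the packages between the two runs:
#   time_install 0 ghclist
#   zypper --non-interactive rm 'ghc-*'
#   time_install 1 ghclist
```

Both runs should start from the same state (packages downloaded to the cache but not installed), otherwise the timings are not comparable.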

@jengelh
Contributor Author

jengelh commented Nov 3, 2023

ZYPP_SINGLE_RPMTRANS=1 provides the desirable execution time characteristics 👍 . Can this be made default?

The commands for obtaining the timings should be shown above. Today I am getting a runtime of about 15s (my system was probably somewhat loaded at the time of the report). The exact list of the 158 packages was:

ghc-Glob ghc-JuicyPixels ghc-OneTuple ghc-Only ghc-QuickCheck ghc-SHA ghc-StateVar ghc-aeson ghc-aeson-pretty ghc-ansi-terminal ghc-ansi-terminal-types ghc-appar ghc-array ghc-asn1-encoding ghc-asn1-parse ghc-asn1-types ghc-assoc ghc-async ghc-attoparsec ghc-base ghc-base-compat ghc-base-compat-batteries ghc-base-orphans ghc-base16-bytestring ghc-base64 ghc-base64-bytestring ghc-basement ghc-bifunctors ghc-binary ghc-bitvec ghc-blaze-builder ghc-blaze-html ghc-blaze-markup ghc-byteorder ghc-bytestring ghc-case-insensitive ghc-cassava ghc-cereal ghc-colour ghc-commonmark ghc-commonmark-extensions ghc-commonmark-pandoc ghc-comonad ghc-conduit ghc-conduit-extra ghc-connection ghc-containers ghc-contravariant ghc-cookie ghc-cryptonite ghc-data-default ghc-data-default-class ghc-data-default-instances-containers ghc-data-default-instances-dlist ghc-data-default-instances-old-locale ghc-data-fix ghc-deepseq ghc-digest ghc-directory ghc-distributive ghc-dlist ghc-doclayout ghc-doctemplates ghc-emojis ghc-exceptions ghc-file-embed ghc-filepath ghc-foldable1-classes-compat ghc-generically ghc-ghc-boot-th ghc-gridtables ghc-haddock-library ghc-hashable ghc-haskell-lexer ghc-hourglass ghc-http-client ghc-http-client-tls ghc-http-types ghc-indexed-traversable ghc-indexed-traversable-instances ghc-integer-logarithms ghc-iproute ghc-ipynb ghc-jira-wiki-markup ghc-libyaml ghc-memory ghc-mime-types ghc-mono-traversable ghc-mtl ghc-network ghc-network-uri ghc-old-locale ghc-ordered-containers ghc-pandoc-types ghc-parsec ghc-pem ghc-pretty ghc-pretty-show ghc-primitive ghc-process ghc-random ghc-regex-base ghc-regex-tdfa ghc-resourcet ghc-safe ghc-scientific ghc-semialign ghc-semigroupoids ghc-socks ghc-split ghc-splitmix ghc-stm ghc-streaming-commons ghc-strict ghc-syb ghc-tagged ghc-tagsoup ghc-template-haskell ghc-temporary ghc-texmath ghc-text ghc-text-conversions ghc-text-short ghc-th-abstraction ghc-th-compat ghc-th-lift ghc-th-lift-instances ghc-these ghc-time ghc-time-compat 
ghc-tls ghc-transformers ghc-transformers-compat ghc-typed-process ghc-typst-symbols ghc-unicode-collation ghc-unicode-data ghc-unicode-transforms ghc-uniplate ghc-unix ghc-unliftio-core ghc-unordered-containers ghc-utf8-string ghc-uuid-types ghc-vector ghc-vector-algorithms ghc-vector-stream ghc-witherable ghc-x509 ghc-x509-store ghc-x509-system ghc-x509-validation ghc-xml ghc-xml-conduit ghc-xml-types ghc-yaml ghc-zip-archive ghc-zlib

@mlandres
Member

mlandres commented Nov 3, 2023

We're about to push this to become the new default.

@mlschroe
Member

mlschroe commented Nov 3, 2023

Just FYI: the underlying problem is rpm's file conflict check. Most of our packages contain a 'COPYING' or 'LICENSE' file. When such a package is installed or erased, rpm needs to check every other package that also contains a file of that name. The different directories do not matter, as rpm needs to account for "aliased" directories, i.e. directory symlinks (an example is the "/bin -> /usr/bin" symlink).
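
To get a feeling for how many files share a basename, something like the following could be used. It is a rough sketch: `count_basenames` is a hypothetical helper, and the sample paths are made up for illustration.

```shell
#!/bin/sh
# Read file paths on stdin and print how many times each basename occurs,
# most frequent first. Lines that are not absolute paths are skipped.
count_basenames() {
    grep '^/' | sed 's|.*/||' | sort | uniq -c | sort -rn
}

# Made-up sample paths, for illustration only.
sample_counts=$(printf '%s\n' \
    /usr/share/licenses/ghc-aeson/LICENSE \
    /usr/share/licenses/ghc-text/LICENSE \
    /usr/lib64/libexample.so | count_basenames)
echo "$sample_counts"

# On a real system the input could come from rpm's file listing:
#   rpm -qal | count_basenames | head
```

In the sample, LICENSE is counted twice even though the two files live in different directories, which mirrors the basename-only matching described above.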

So when zypper calls rpm for every transaction step, we get O(N^2) time. When using a single transaction, rpm can optimize the check and we get just O(N) time.

For everything else, it does not matter if a single transaction is used or not. It's really just the file conflict check and those license files...
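
The difference between O(N^2) and O(N) can be illustrated with a toy count. This is not how rpm implements the check. It assumes each of the N packages ships exactly one LICENSE file, that a per-package rpm call compares it against every already installed package with the same basename, and that a single transaction needs one pass over the N files.

```shell
#!/bin/sh
# Toy model only: it counts comparisons and never calls rpm.
n=${1:-158}   # number of packages, default taken from the report

# One rpm call per package: step i compares the new LICENSE file against
# the i-1 packages installed in earlier steps.
per_package=0
i=1
while [ "$i" -le "$n" ]; do
    per_package=$((per_package + i - 1))
    i=$((i + 1))
done

# Single transaction: assumed to need one pass over the n files.
single=$n

echo "one rpm call per package: $per_package comparisons"
echo "single transaction:       $single comparisons"
```

For the 158 packages from the report, the model gives 12403 comparisons against 158. The real gap depends on how many other installed packages also carry a file with that basename.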

@dirkmueller
Member

How are we doing in regards to speeding zypper up here?

@bzeller
Contributor

bzeller commented Mar 13, 2024

> How are we doing in regards to speeding zypper up here?

There were some problems with rpm --root that had to be fixed first; that's why it is not yet the default.
