Efficient parallel use of eval caches #10590

Open

layus wants to merge 3 commits into master

Conversation

@layus layus (Member) commented Apr 22, 2024

Motivation

This commit makes the evaluation caches behave properly in the presence of parallel
users. The transactions and SQL statements themselves are designed so that no race
conditions can occur, while keeping each transaction very small.
(Transactions used to last for the whole lifetime of the cache.)
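
A minimal sketch of the small-transaction idea, using the SQLite C API directly; the helper, table, and column names here are illustrative, not the PR's actual code:

    // Each write gets its own short IMMEDIATE transaction instead of one
    // transaction held for the lifetime of the cache, so the write lock is
    // only held briefly and concurrent processes are not starved.
    #include <sqlite3.h>
    #include <stdexcept>
    #include <string>

    static void exec(sqlite3 * db, const char * sql)
    {
        char * err = nullptr;
        if (sqlite3_exec(db, sql, nullptr, nullptr, &err) != SQLITE_OK) {
            std::string msg = err ? err : "unknown SQLite error";
            sqlite3_free(err);
            throw std::runtime_error(msg);
        }
    }

    // Insert one attribute under a short-lived write transaction.
    // (Error handling of the prepare/bind calls is omitted for brevity.)
    void writeAttr(sqlite3 * db, const std::string & name, const std::string & value)
    {
        // BEGIN IMMEDIATE takes the write lock up front, so the transaction
        // either starts cleanly or fails fast with SQLITE_BUSY.
        exec(db, "begin immediate");

        sqlite3_stmt * stmt = nullptr;
        sqlite3_prepare_v2(db,
            "insert into Attributes (name, value) values (?, ?)",
            -1, &stmt, nullptr);
        sqlite3_bind_text(stmt, 1, name.c_str(), -1, SQLITE_TRANSIENT);
        sqlite3_bind_text(stmt, 2, value.c_str(), -1, SQLITE_TRANSIENT);
        sqlite3_step(stmt);
        sqlite3_finalize(stmt);

        exec(db, "commit");
    }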

On top of #10570, it makes it possible to significantly speed up the evaluation of a
large number of attributes by splitting them across several parallel invocations of nix:

echo flake#attr1 ... flake#attrN | \
    parallel --pipe-part --linebuffer --block -1 \
	"xargs nix path-info --derivation"

From rough measurements on 16 cores, this goes from 12 s to 5 s for ~200 attributes in the same flake.

This also fixes a range of issues discovered during the implementation, such as
insert-or-update statements that would create a brand-new rowid and silently make
all the children of the old rowid unreachable, invalidating significant portions
of the cache.
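
To illustrate the rowid pitfall (the schema and statements below are assumed for illustration, not copied from the PR): with a table keyed by rowid and a UNIQUE constraint on another column pair, SQLite resolves an "insert or replace" conflict by deleting the old row and inserting a fresh one with a new rowid, whereas an upsert updates the existing row in place:

    // Problematic: on conflict the old row is deleted and a new rowid is
    // allocated, so child rows that reference the old rowid are orphaned.
    const char * insertOrReplace =
        "insert or replace into Attributes (parent, name, value) values (?, ?, ?)";

    // Safer (SQLite >= 3.24): the conflicting row is updated in place and
    // keeps its rowid, so child rows stay reachable.
    const char * upsert =
        "insert into Attributes (parent, name, value) values (?, ?, ?) "
        "on conflict (parent, name) do update set value = excluded.value";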

This code has been stress-tested a bit, so it should be devoid of obvious mistakes.


@layus layus marked this pull request as draft April 22, 2024 19:19
@layus layus marked this pull request as ready for review April 22, 2024 19:54
@roberth roberth (Member) left a comment

No wonder we've been getting sqlite busy warnings. Great find!

I've given it a superficial look, as I'm not too familiar with the eval cache yet.
Maybe @thufschmitt or @edolstra could finish review?

src/libexpr/eval-cache.cc (resolved review thread)
@edolstra (Member)

The reason that we're not using WAL / small transactions is that (at least at the time) there was significant overhead to using them while populating the cache, especially for something like Nixpkgs.

Can you benchmark the performance of something like rm -rf ~/.cache/nix/eval-cache*; nix search nixpkgs hello with and without this change?

@layus layus (Member, Author) commented Apr 29, 2024

@edolstra

Yes, that is indeed right. While WAL does help a lot, transactions do appear to have some overhead:

Benchmark 1: ./baseline/bin/nix --extra-experimental-features 'nix-command flakes' search nixpkgs hello
  Time (mean ± σ):     19.053 s ±  0.300 s    [User: 17.738 s, System: 1.286 s]
  Range (min … max):   18.745 s … 19.667 s    10 runs
 
Benchmark 2: ./wal/bin/nix --extra-experimental-features 'nix-command flakes' search nixpkgs hello
  Time (mean ± σ):     30.813 s ±  2.847 s    [User: 24.298 s, System: 4.352 s]
  Range (min … max):   26.250 s … 35.306 s    10 runs
 
Benchmark 3: ./no-wal/bin/nix --extra-experimental-features 'nix-command flakes' search nixpkgs hello
  Time (mean ± σ):     40.453 s ±  0.851 s    [User: 26.327 s, System: 13.886 s]
  Range (min … max):   39.018 s … 41.679 s    10 runs
 
Summary
  ./baseline/bin/nix --extra-experimental-features 'nix-command flakes' search nixpkgs hello ran
    1.62 ± 0.15 times faster than ./wal/bin/nix --extra-experimental-features 'nix-command flakes' search nixpkgs hello
    2.12 ± 0.06 times faster than ./no-wal/bin/nix --extra-experimental-features 'nix-command flakes' search nixpkgs hello
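
For reference, a minimal sketch of enabling SQLite's write-ahead log, which is presumably what distinguishes the "wal" build above (standard SQLite usage, not code from the PR):

    #include <sqlite3.h>

    // In WAL mode readers are not blocked by a writer; the setting persists
    // in the database file once applied.
    void enableWal(sqlite3 * db)
    {
        sqlite3_exec(db, "pragma journal_mode = wal;", nullptr, nullptr, nullptr);
    }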

@layus layus (Member, Author) commented Apr 29, 2024

But then, we cannot reasonably keep a write lock on the database for the whole duration of the nix search command.

Would you consider a patch that performs only reads, and then writes all the computed data at once when the cursors are destroyed? It would still be efficient, and friendlier to concurrent accesses to the database.
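
A rough sketch of that "read now, write later" idea (assumed names and schema, not an actual patch): computed values are buffered in memory and flushed in a single short transaction when the cache object is destroyed:

    #include <sqlite3.h>
    #include <cstdint>
    #include <string>
    #include <vector>

    struct PendingAttr
    {
        int64_t parent;
        std::string name;
        std::string value;
    };

    class EvalCacheWriter
    {
        sqlite3 * db;
        std::vector<PendingAttr> pending;

    public:
        explicit EvalCacheWriter(sqlite3 * db) : db(db) {}

        // Called during evaluation: nothing is written to the database here.
        void record(int64_t parent, std::string name, std::string value)
        {
            pending.push_back({parent, std::move(name), std::move(value)});
        }

        // Flush everything at once; the write lock is held only for this
        // single batch. (Error handling omitted for brevity.)
        ~EvalCacheWriter()
        {
            if (pending.empty()) return;
            sqlite3_exec(db, "begin immediate", nullptr, nullptr, nullptr);
            sqlite3_stmt * stmt = nullptr;
            sqlite3_prepare_v2(db,
                "insert into Attributes (parent, name, value) values (?, ?, ?) "
                "on conflict (parent, name) do update set value = excluded.value",
                -1, &stmt, nullptr);
            for (auto & a : pending) {
                sqlite3_bind_int64(stmt, 1, a.parent);
                sqlite3_bind_text(stmt, 2, a.name.c_str(), -1, SQLITE_TRANSIENT);
                sqlite3_bind_text(stmt, 3, a.value.c_str(), -1, SQLITE_TRANSIENT);
                sqlite3_step(stmt);
                sqlite3_reset(stmt);
                sqlite3_clear_bindings(stmt);
            }
            sqlite3_finalize(stmt);
            sqlite3_exec(db, "commit", nullptr, nullptr, nullptr);
        }
    };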

@edolstra
Copy link
Member

@layus Yeah that sounds good.

An alternative might be (if possible) to start with a read lock and only upgrade to a write lock if we actually need to make changes to the database. Or to disable the eval cache if we can't acquire a write lock. Then it would be slower but at least it wouldn't be blocked.
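
A note on the first alternative: a plain SQLite DEFERRED transaction already behaves roughly like a read lock that is upgraded to a write lock by the first write statement. The second fallback could look like this sketch (assumed names, not a proposal for the actual code): try to take the write lock without waiting, and disable the eval cache if another process holds it:

    #include <sqlite3.h>

    // Returns true if we got the write lock; false means the caller should
    // fall back to evaluating without the cache instead of blocking.
    bool tryAcquireWriteLock(sqlite3 * db)
    {
        sqlite3_busy_timeout(db, 0);   // fail immediately instead of waiting
        int rc = sqlite3_exec(db, "begin immediate", nullptr, nullptr, nullptr);
        if (rc == SQLITE_BUSY || rc == SQLITE_LOCKED)
            return false;
        return rc == SQLITE_OK;
    }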
