Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Virtual path semantics that are deterministic, lazy and suitable for performing deduplication of values by a path value key. #10689

Open
roberth opened this issue May 13, 2024 · 2 comments
Labels
feature Feature request or proposal fetching Networking with the outside (non-Nix) world language The Nix expression language; parser, interpreter, primops, evaluation, etc

Comments

@roberth
Copy link
Member

roberth commented May 13, 2024

Is your feature request related to a problem? Please describe.

This issue discusses the solution space for a crucial step to make NixOS evaluation not copy nixpkgs to the store.

#6530 solves the problem of lazy fetching while using the module system, by introducing another problem, which is new observable nondeterminism in the form of virtual path string placeholders (/nix/store/virtual0000<n>).

Lazy paths (#10252) also add laziness, but the module system's use of paths causes it to have to copy nonetheless.

Describe the solution you'd like

Some possible solutions

This is an unsolved problem. Lazy trees will leak non-deterministic virtual path names in this case, which in my view is the primary reason why we can't just adopt that solution.
Possible solutions I'm aware of

  • make the module system perform the transformation instead. This should be possible to do entirely in Nix. (1. split into virtual root path and relative path string without context, 2. maintain a mapping from virtual root path to numbered placeholders, 3. insert those placeholders into the module paths, so that they're just strings)
    • O(n) unless something is added to the language (nondeterministicComparePathRoot function? nondeterministicPathRootIdString? A new "path map" type?)
    • not transparent
    • slows down the module system if it's complicated on the expression side
  • allow path values as attribute names
    • only depends on lazy path values
    • order is non-deterministic, unless we require a source fingerprint (or disallow a bunch functions??)
  • allow strings (with context) in attribute names
    • requires a proper lazy path strings implementation
    • same concern about order nondeterminism

Additional context

Discussed in meeting before, links tbd.

Also came up in

Priorities

Add 👍 to issues you find important.

@roberth roberth added feature Feature request or proposal language The Nix expression language; parser, interpreter, primops, evaluation, etc fetching Networking with the outside (non-Nix) world labels May 13, 2024
@edolstra
Copy link
Member

The non-determinism issue can be solved in the common case by deriving the virtual path from the input fingerprint. This wouldn't work for impure inputs, but I suppose if you have impure inputs, it's to be expected that you might end up with non-determinism in your evaluation.

It's also not obvious how to come up with a fingerprint for accessors like those returned by builtins.filterPath (since we don't have a way to fingerprint the filter function).

Another solution might be to restrict what you can do with a virtual path substring, e.g. make it illegal to call builtins.substring to get individual characters of the virtual path hash.

@roberth
Copy link
Member Author

roberth commented May 13, 2024

For stability, it'd be best to expose the fingerprints as little as possible, e.g. only their ordering but not their contents, but even exposing the ordering is a liability that we don't currently have.
Even if we can call a git rev deterministic, that doesn't mean that the user experience will be deterministic (e.g.: why was the list in the config reordered after this commit which only touches the readme...).

This wouldn't work for impure inputs

We might want to consider locking them (in memory) anyway, if we don't already do that.
If performance concerns prevent us from doing that, we could use an imperfect fingerprint.

builtins.filterPath (since we don't have a way to fingerprint the filter function).

We could evaluate the set of all included paths and then fingerprint the set and the underlying source. It's eager at the language level, but does not require any copying.
When the underlying source is itself a filtering source, we should merge the functions (inductively) instead of naively fingerprinting the underlying source to recover && short-circuit behavior, but at that point I think it should work well.

Another solution might be to restrict what you can do with a virtual path substring, e.g. make it illegal to call builtins.substring to get individual characters of the virtual path hash.

Or fall back to copying. This could be done by adding string-specific thunk constructors (e.g. tStringConcat in https://discourse.nixos.org/t/2024-04-29-nix-team-meeting-minutes-142/45020).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Feature request or proposal fetching Networking with the outside (non-Nix) world language The Nix expression language; parser, interpreter, primops, evaluation, etc
Projects
None yet
Development

No branches or pull requests

2 participants