edolstra (Collaborator) commented Jan 21, 2026

Motivation

This is an updated version of upstream NixOS#11749.

Nix has historically been bad at answering the question "where did this store path come from", i.e. at providing traceability from a store path back to the Nix expression from which it was built. Nix tracks the "deriver" of a store path (the .drv file that built it), but that is of little use in practice, since it doesn't link back to the Nix expressions.

So this PR adds a "provenance" field (a JSON object) to the ValidPaths table and to .narinfo files that describes where the store path came from and how it can be reproduced.

There are currently the following types of provenance:

  • copied: Records that the store path was copied or substituted from another store (typically a binary cache). Its "from" field is the URL of the origin store. Its "provenance" field propagates the provenance of the store path on the origin store.

  • build: Records that the store path was produced by building a derivation. This is equivalent to the "deriver" field, but it has a nested "provenance" field that records how the .drv file was created.

  • tree: The store path is the result of a call to fetchTree (e.g. it's a flake or flake input). It includes the fetcher attributes.

  • subpath: The store path was created by copying a subpath of some other provenance (e.g. /foo/builder.sh from a tree).

  • flake: Records that the store path was created during the evaluation of a flake output.

  • fetchurl: The store path is the result of a builtins.fetchurl call.
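
To make the nesting in the list above concrete, here is a minimal sketch of how a "copied" record can wrap a "build" record, which in turn wraps a "flake" record. Everything in it is an assumption for illustration: the `*Sketch` types are not the classes added by this PR, the `"next"` and `"drv"` field names are placeholders for whatever the real serializers emit, and the string building does no JSON escaping.

```cpp
// Illustrative only: these types are NOT the ones added by the PR.
#include <memory>
#include <string>
#include <utility>

struct ProvenanceSketch
{
    virtual ~ProvenanceSketch() = default;
    virtual std::string toJson() const = 0;
};

using ProvPtr = std::shared_ptr<const ProvenanceSketch>;

// Wrap a string in double quotes (no escaping; fine for this sketch only).
inline std::string quoted(const std::string & s)
{
    return "\"" + s + "\"";
}

// Render a nested record, or JSON null when the chain ends here.
inline std::string nestedJson(const ProvPtr & p)
{
    return p ? p->toJson() : "null";
}

struct FlakeSketch : ProvenanceSketch
{
    std::string flakeOutput;
    explicit FlakeSketch(std::string output) : flakeOutput(std::move(output)) {}
    std::string toJson() const override
    {
        return "{\"type\":\"flake\",\"flakeOutput\":" + quoted(flakeOutput) + "}";
    }
};

struct BuildSketch : ProvenanceSketch
{
    std::string drv; // name of the .drv file that was built (hypothetical field)
    ProvPtr next;    // how the .drv file itself was created
    BuildSketch(std::string drvName, ProvPtr inner) : drv(std::move(drvName)), next(std::move(inner)) {}
    std::string toJson() const override
    {
        return "{\"type\":\"build\",\"drv\":" + quoted(drv) + ",\"next\":" + nestedJson(next) + "}";
    }
};

struct CopiedSketch : ProvenanceSketch
{
    std::string from; // URL of the origin store
    ProvPtr next;     // provenance of the path on the origin store, may be null
    CopiedSketch(std::string origin, ProvPtr inner) : from(std::move(origin)), next(std::move(inner)) {}
    std::string toJson() const override
    {
        return "{\"type\":\"copied\",\"from\":" + quoted(from) + ",\"next\":" + nestedJson(next) + "}";
    }
};
```

Building `CopiedSketch("https://cache.nixos.org", build)` on top of a `BuildSketch` that points at a `FlakeSketch` yields one nested object per hop, so a reader can walk from the substituted path back to the flake output that produced it.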

Context

Summary by CodeRabbit

  • New Features

    • Experimental provenance tracking: provenance metadata can be recorded and propagated for fetches, builds, store paths, flakes, workers, and substitutions.
  • Documentation

    • Store object schema extended with an optional provenance field (nullable JSON object).
  • Bug Fixes

    • Flake prefetch output no longer strips the "__final" attribute.
  • Tests

    • New functional tests validating provenance capture, propagation, and substitution.

coderabbitai bot commented Jan 21, 2026

Walkthrough

Introduce a Provenance subsystem and thread provenance metadata through fetchers, source accessors, eval/flake, store write/copy flows, worker/daemon protocol, DB schema, JSON (de)serialization, public APIs, and functional tests.

Changes

Cohort / File(s) / Summary

  • Schema
    File(s): doc/manual/source/protocols/json/schema/store-object-info-v2.yaml
    Summary: Add optional provenance (null

  • Provenance core
    File(s): src/libutil/include/nix/util/provenance.hh, src/libutil/provenance.cc
    Summary: Add Provenance base, registry, Unknown/SubpathProvenance, JSON (de)serialization and factory helpers.

  • Source accessors & paths
    File(s): src/libutil/include/nix/util/source-accessor.hh, src/libutil/source-accessor.cc, src/libutil/include/nix/util/source-path.hh, src/libutil/mounted-source-accessor.cc, src/libutil/union-source-accessor.cc
    Summary: Add accessor provenance member, getProvenance() API, and SourcePath::getProvenance delegations/wrappers.

  • Fetchers & fetch provenance types
    File(s): src/libfetchers/include/nix/fetchers/provenance.hh, src/libfetchers/provenance.cc, src/libfetchers/tarball.cc, src/libfetchers/fetchers.cc, src/libfetchers/filtering-source-accessor.{cc,hh}
    Summary: Introduce TreeProvenance and FetchurlProvenance; attach provenance to SourceAccessor instances after fetch/substitution; update includes and meson lists.

  • Eval, flake & concurrency hooks
    File(s): src/libexpr/include/nix/expr/eval.hh, src/libexpr/eval.cc, src/libexpr/primops.cc, src/libcmd/include/nix/cmd/installable-flake.hh, src/libcmd/installable-flake.cc, src/libflake/include/nix/flake/provenance.hh, src/libflake/provenance.cc, src/libflake/flake.cc
    Summary: Add thread-local EvalContext.provenance, PushProvenance helper, InstallableFlake::makeProvenance, and propagate provenance into derivation writes and flake objects.

  • Async writer & derivation plumbing
    File(s): src/libstore/include/nix/store/async-path-writer.hh, src/libstore/async-path-writer.cc, src/libstore/derivations.cc, src/libstore/include/nix/store/derivations.hh
    Summary: Extend AsyncPathWriter::addPath and writeDerivation overloads to accept provenance; Items/queued paths carry provenance.

  • Store API, types & provenance structs
    File(s): src/libstore/include/nix/store/store-api.hh, src/libstore/include/nix/store/path-info.hh, src/libstore/include/nix/store/provenance.hh, src/libstore/provenance.cc, src/libstore/include/nix/store/build-result.hh
    Summary: Expose Provenance in public API, add includeInProvenance(), add provenance members to path/build types; implement BuildProvenance and CopiedProvenance with serializers and registration.

  • Store implementations & DB changes
    File(s): src/libstore/local-store.cc, src/libstore/include/nix/store/local-store.hh, src/libstore/remote-store.cc, src/libstore/include/nix/store/remote-store.hh, src/libstore/restricted-store.cc, src/libstore/dummy-store.cc, src/libstore/binary-cache-store.cc, src/libstore/include/nix/store/binary-cache-store.hh, src/libstore/ssh-store.cc, src/libstore/include/nix/store/legacy-ssh-store.hh
    Summary: Thread provenance through addToStoreFromDump/addCAToStore/copy flows; LocalStore DB schema and queries updated to persist provenance; remote handshake/transmission carry provenance; stores expose includeInProvenance().

  • Copy semantics & store-api logic
    File(s): src/libstore/store-api.cc
    Summary: copyStorePath now returns shared ValidPathInfo, wraps provenance into CopiedProvenance for copied paths, and propagates provenance across copy flows.

  • Build/substitution wiring
    File(s): src/libstore/build/derivation-building-goal.cc, src/libstore/include/nix/store/build/derivation-builder.hh, src/libstore/include/nix/store/build/derivation-building-goal.hh, src/libstore/build/substitution-goal.cc, src/libstore/include/nix/store/build/substitution-goal.hh, src/libstore/unix/build/derivation-builder.cc
    Summary: Thread drv provenance into DerivationBuilderParams and BuildResult; doneSuccess signatures accept provenance; substitution promise returns ValidPathInfo provenance.

  • Daemon, worker protocol & RPC
    File(s): src/libstore/daemon.cc, src/libstore/include/nix/store/worker-protocol.hh, src/libstore/include/nix/store/worker-protocol-connection.hh, src/libstore/worker-protocol.cc, src/libstore/worker-protocol-connection.cc
    Summary: Add WorkerProto.featureProvenance and provenance flags in ReadConn/WriteConn; handshake negotiates provenance capability; UnkeyedValidPathInfo reads/writes provenance when enabled.

  • NarInfo, PathInfo, feature flag & tests
    File(s): src/libstore/nar-info.cc, src/libstore/path-info.cc, src/libutil/experimental-features.cc, src/libutil/include/nix/util/experimental-features.hh, tests/functional/flakes/provenance.sh, tests/functional/common/init.sh
    Summary: NarInfo/PathInfo (de)serialization updated to include provenance under the experimental feature; add ExperimentalFeature::Provenance and functional tests validating provenance chains and copies.

  • Concurrency API refactor
    File(s): src/libexpr/include/nix/expr/parallel-eval.hh, src/libexpr/parallel-eval.cc, src/libexpr/value-to-json.cc, src/nix/search.cc
    Summary: Introduce Executor::WorkItems alias and state.addWork/makeWork/spawn helpers; update spawn APIs and call sites to use WorkItems.

  • Build system & exports
    File(s): multiple meson.build and header lists under src/libfetchers, src/libflake, src/libstore, src/libutil
    Summary: Add provenance.cc sources and provenance.hh public headers to meson/source/header lists across libraries.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant Daemon
  participant Store
  participant Provenance
  Client->>Daemon: AddToStoreFromDump(dump,..., provenanceJSON?)
  Daemon->>Store: addToStoreFromDump(..., provenance)
  Note right of Store: create ValidPathInfo\ninfo.provenance = provenance
  Store->>Provenance: wrap/chain provenance (Build/Copied)
  Store-->Daemon: return path + provenance (if negotiated)
  Daemon-->Client: response (path + provenance if negotiated)
sequenceDiagram
  participant Fetcher
  participant Input
  participant Accessor
  participant Store
  participant Provenance
  Fetcher->>Input: fetchToStore(...)
  Input-->>Accessor: accessor (attrs)
  Accessor->>Provenance: accessor.provenance = TreeProvenance(input.attrs)
  Fetcher->>Store: addToStoreFromDump(..., provenance = accessor.getProvenance(path))
  Store->>Provenance: record BuildProvenance/CopiedProvenance chain
  Store-->Fetcher: store path with provenance

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Suggested reviewers

  • cole-h
  • grahamc

Poem

"I hopped through code with a nibble and grin,
Breadcrumbs of origin tucked safely within.
From fetch to store and flake to file,
Provenance chains make metadata smile.
A rabbit applauds each traced little win!" 🐇

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage
    Status: ⚠️ Warning
    Explanation: Docstring coverage is 6.90% which is insufficient. The required threshold is 80.00%.
    Resolution: Write docstrings for the functions missing them to satisfy the coverage threshold.

  • Title check
    Status: ❓ Inconclusive
    Explanation: The title 'Provenance' is vague and generic, using a single non-descriptive term that does not convey meaningful information about the changeset beyond the feature name itself.
    Resolution: Use a more descriptive title that summarizes the main change, e.g., 'Add provenance tracking for store paths' or 'Track and record provenance of store objects in ValidPaths and narinfo'.

✅ Passed checks (1 passed)

  • Description Check
    Status: ✅ Passed
    Explanation: Check skipped - CodeRabbit’s high-level summary is enabled.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/libstore/dummy-store.cc (1)

235-304: Provenance parameter is accepted but not used in addToStoreFromDump.

The provenance parameter is added to the signature but is never used in the implementation. The ValidPathInfo created on lines 276–288 and stored in contents doesn't include the provenance. Other stores handle this correctly: BinaryCacheStore::addToStoreFromDump (line 382) and LocalStore::addToStoreFromDump (line 1331) both assign info.provenance = provenance.

Add the provenance assignment:

Suggested fix
         auto info = ValidPathInfo::makeFromCA(
             *this,
             name,
             ContentAddressWithReferences::fromParts(
                 hashMethod,
                 std::move(hash),
                 {
                     .others = references,
                     // caller is not capable of creating a self-reference, because
                     // this is content-addressed without modulus
                     .self = false,
                 }),
             std::move(narHash.first));

         info.narSize = narHash.second.value();
+        info.provenance = std::move(provenance);
🤖 Fix all issues with AI agents
In `@src/libexpr/include/nix/expr/eval.hh`:
- Around line 1134-1147: Members are declared in the wrong order which breaks
the intended destruction order: move std::shared_ptr<const Provenance>
rootProvenance so it is declared before ref<Executor> executor (or alternatively
move ref<Executor> executor to be the final data member) to ensure executor is
destroyed last; update the declarations around the class fields that contain
rootProvenance, executor and the public methods
setRootProvenance/getRootProvenance accordingly so destructor ordering is
preserved.

In `@src/libfetchers/filtering-source-accessor.cc`:
- Around line 71-76: FilteringSourceAccessor::getProvenance currently bypasses
checkAccess() and the SubpathProvenance wrapping from
SourceAccessor::getProvenance, so update this method to first call
checkAccess(path) (or the appropriate access-checking call used by
SourceAccessor), then if the local provenance member exists return it, otherwise
obtain the child provenance via next->getProvenance(prefix / path) and wrap that
result in a SubpathProvenance constructed with the returned provenance and the
prefix before returning; also add an `#include` for the SubpathProvenance header
if not already present.

In `@src/libfetchers/tarball.cc`:
- Around line 106-109: The code currently stores the raw URL in provenance
(info.provenance = std::make_shared<FetchurlProvenance>(url)) which can leak
credentials or signed tokens; before creating FetchurlProvenance and calling
store.addToStore(info, source, ...), sanitize/redact the URL (remove userinfo
and sensitive query params) using the existing redaction helper if available (or
add a small helper that strips user:pass and known signed token query keys),
then pass the redacted string into FetchurlProvenance instead of the original
url so no secrets are persisted in narinfo/metadata.
- Around line 12-34: Add a Provenance::Register for the FetchurlProvenance type
so deserialization of {"type":"fetchurl"} returns a FetchurlProvenance instance:
register a Provenance::Register (e.g. registerFetchurlProvenance) that uses
getObject(json), valueAt(obj, "url"), getString(...) and
make_ref<FetchurlProvenance>(...) to construct the object; place this
registration in this file (or src/libfetchers/provenance.cc) so the type is
recognized at load time. Also sanitize/redact sensitive components of the url
before storing/serializing in FetchurlProvenance::to_json (e.g. strip
credentials or tokens) to avoid leaking credentials in provenance.

In `@src/libflake/include/nix/flake/flake.hh`:
- Around line 98-101: Flake::~Flake() in flake.cc needs the complete type for
Provenance because it destroys the std::shared_ptr<const Provenance> member
provenance; add `#include` "nix/flake/provenance.hh" to src/libflake/flake.cc so
the Provenance definition is visible when instantiating the destructor for Flake
(referencing the provenance member and the Flake::~Flake() destructor).

In `@src/libstore/include/nix/store/path-info.hh`:
- Around line 127-132: The header now exposes std::shared_ptr via the member
"provenance" (type std::shared_ptr<const Provenance>), so add a direct include
for <memory> at the top of this header to avoid relying on transitive includes;
ensure the include is placed with the other standard headers and before any code
that references std::shared_ptr/Provenance.

In `@src/libstore/include/nix/store/provenance.hh`:
- Around line 9-36: DerivationProvenance is missing a from_json registration so
Provenance::from_json() cannot reconstruct derivation nodes; add a registration
call analogous to CopiedProvenance using Provenance::Register to register
"derivation" (or the existing type tag) with a factory that constructs a
DerivationProvenance from JSON, ensuring the registration happens during module
init (the same translation unit where other registrations like CopiedProvenance
are registered) so Provenance::from_json can deserialize DerivationProvenance
objects.

In `@src/libstore/nar-info.cc`:
- Around line 134-136: The text serialization in NarInfo (in
src/libstore/nar-info.cc) appends provenance unconditionally; update the
NarInfo::to_string (or the function that builds `res`) to only append
"Provenance: " + provenance->to_json_str() when
experimentalFeatureSettings.isEnabled(Xp::Provenance) is true, mirroring the
gating used during parsing and the UnkeyedValidPathInfo::toJSON() behavior;
locate the block that currently does `if (provenance) res += "Provenance: " +
provenance->to_json_str() + "\n";` and change it to check the feature flag
before using `provenance`.

In `@src/libutil/experimental-features.cc`:
- Around line 323-330: The provenance experimental-feature entry currently has
an empty .trackingUrl which leads to a broken link in documentation; update the
documentation generation (documentExperimentalFeatures) to check
feature.trackingUrl (or equivalent field) and only emit the tracking/issue link
when it is non-empty, or alternatively set a real URL for the Xp::Provenance
entry; reference the Xp::Provenance feature record (name "provenance") and the
trackingUrl field when making the change so the link is suppressed for empty
values.
🧹 Nitpick comments (5)
src/libstore/include/nix/store/remote-store.hh (1)

91-99: Consider restoring default arguments for RemoteStore callers.

C++ default args aren’t inherited, so removing them here can break direct RemoteStore usages even if Store still provides defaults. If this isn’t intentional, consider mirroring the base defaults to preserve API ergonomics.

♻️ Proposed adjustment
-        FileSerialisationMethod dumpMethod,
-        ContentAddressMethod hashMethod,
-        HashAlgorithm hashAlgo,
-        const StorePathSet & references,
-        RepairFlag repair,
-        std::shared_ptr<const Provenance> provenance) override;
+        FileSerialisationMethod dumpMethod = FileSerialisationMethod::NixArchive,
+        ContentAddressMethod hashMethod = ContentAddressMethod::Raw::NixArchive,
+        HashAlgorithm hashAlgo = HashAlgorithm::SHA256,
+        const StorePathSet & references = StorePathSet(),
+        RepairFlag repair = NoRepair,
+        std::shared_ptr<const Provenance> provenance = nullptr) override;
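
The point about default arguments can be shown with a small standalone sketch. `StoreSketch` and `RemoteStoreSketch` are hypothetical stand-ins, not the real Nix classes; the example only illustrates the C++ rule that a default argument belongs to the declaration found through the static type of the object expression.

```cpp
// Hypothetical stand-ins for Store / RemoteStore; not the real Nix classes.
#include <string>

struct StoreSketch
{
    virtual ~StoreSketch() = default;

    // The base declaration supplies a default for `repair`.
    virtual std::string add(const std::string & name, bool repair = false)
    {
        return name + (repair ? ":repair" : ":norepair");
    }
};

struct RemoteStoreSketch : StoreSketch
{
    // The override declares no default, so a call made through the derived
    // type must pass `repair` explicitly.
    std::string add(const std::string & name, bool repair) override
    {
        return "remote:" + name + (repair ? ":repair" : ":norepair");
    }
};
```

A call such as `base.add("foo")` through a `StoreSketch &` compiles and dispatches to the override with `repair = false`, whereas `remote.add("foo")` on a `RemoteStoreSketch` object does not compile. Callers that only hold a base reference are therefore unaffected by dropping the defaults on the override.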
src/libstore/provenance.cc (1)

6-25: Inconsistent next field serialization between provenance types.

DerivationProvenance::to_json() always includes the "next" field (as null when absent), while CopiedProvenance::to_json() only includes it when present. This inconsistency may cause confusion or issues during deserialization.

Consider making the serialization consistent. Based on the FlakeProvenance pattern from src/libflake/provenance.cc, which always emits "next", aligning with that approach would be more consistent:

♻️ Suggested fix for consistency
 nlohmann::json CopiedProvenance::to_json() const
 {
-    nlohmann::json j{
+    return nlohmann::json{
         {"type", "copied"},
         {"from", from},
+        {"next", next ? next->to_json() : nlohmann::json(nullptr)},
     };
-    if (next)
-        j["next"] = next->to_json();
-    return j;
 }
src/libstore/build/derivation-building-goal.cc (1)

443-447: Consider capturing provenance for the AlreadyValid shortcut.
Provenance is fetched after the early return at Line 349, so AlreadyValid results will report null provenance even when it exists. If you want consistent provenance in build results, consider querying it before that shortcut or in the AlreadyValid branch.

src/libstore/store-api.cc (1)

888-895: Consider returning existing path info on the early-valid shortcut.
With the new nullptr return when the destination already has the path (Line 893–895), callers that want provenance (or other info) need an extra query. If that data is important in the “already present” case, consider returning dstStore.queryPathInfo(storePath) or explicitly documenting the contract.

Also applies to: 907-942

src/libutil/include/nix/util/provenance.hh (1)

12-38: Add a virtual destructor to Provenance as best practice.
While the base class is polymorphic and owns derived types through std::shared_ptr (which correctly captures concrete type deletes at construction), explicitly marking a polymorphic base class with a virtual destructor clarifies intent and ensures safety if the code evolves. This follows standard C++ polymorphism conventions.

Proposed fix
 struct Provenance
 {
+    virtual ~Provenance() = default;
     static ref<const Provenance> from_json_str(std::string_view);
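
The remark that std::shared_ptr captures the concrete deleter at construction can be checked with a standalone sketch. The types below are hypothetical and deliberately omit the virtual destructor; they are not the Provenance classes from this PR.

```cpp
// Hypothetical types; deliberately NO virtual destructor on the base.
#include <memory>

static int destroyedCountSketch = 0;

struct BaseSketch
{
    int tag = 0;
};

struct DerivedSketch : BaseSketch
{
    // Counts how often the most-derived destructor actually runs.
    ~DerivedSketch() { ++destroyedCountSketch; }
};

// make_shared<DerivedSketch> records how to destroy a DerivedSketch in the
// control block, so converting to a base pointer does not lose that knowledge.
inline std::shared_ptr<const BaseSketch> makeDerivedSketch()
{
    return std::make_shared<DerivedSketch>();
}
```

When the last `std::shared_ptr<const BaseSketch>` returned by `makeDerivedSketch()` goes out of scope, `~DerivedSketch()` still runs. The same would not hold for a `std::unique_ptr<BaseSketch>` or a raw `delete` through the base pointer, which is the case the virtual destructor guards against.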

@edolstra edolstra changed the title from "Provenance detsys" to "Provenance" Jan 22, 2026
github-actions bot commented Jan 22, 2026

@github-actions github-actions bot temporarily deployed to pull request January 22, 2026 14:14 Inactive
@github-actions github-actions bot temporarily deployed to pull request January 22, 2026 14:43 Inactive
@edolstra edolstra force-pushed the provenance-detsys branch 6 times, most recently from fb14a66 to e8599c8 Compare January 22, 2026 14:53
@github-actions github-actions bot temporarily deployed to pull request January 22, 2026 14:55 Inactive
@github-actions github-actions bot temporarily deployed to pull request January 22, 2026 14:55 Inactive
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/libstore/include/nix/store/worker-protocol.hh (1)

66-84: Verify provenance handling in hardcoded protocol version 16 cases only—main negotiation path is correct.

The primary code paths using WorkerProto::BasicClientConnection and WorkerProto::BasicServerConnection correctly set the provenance flag based on negotiated features. However, three direct constructions with hardcoded version = 16 do not set provenance and will silently default to false:

  • src/libstore/store-api.cc:204 in addMultipleToStore() (marked FIXME)
  • src/libstore/export-import.cc:147 and :60 for nario format import/export

These are non-negotiated contexts marked as technical debt. If provenance should flow through these paths, they need explicit handling. Otherwise, document that these legacy/special-format paths intentionally bypass provenance.

src/libstore/include/nix/store/store-api.hh (1)

948-957: Handle nullptr return from copyStorePath.
The function can return nullptr when the path already exists in the destination store. At least one call site in src/libstore/build/substitution-goal.cc:258 sets a promise value with the return directly without guarding against nullptr, which could cause issues downstream.

🤖 Fix all issues with AI agents
In `@src/libstore/build/derivation-building-goal.cc`:
- Around line 443-447: The AlreadyValid fast‑path returns BuildResult::Success
before provenance is fetched, causing missing provenance; hoist the
maybeQueryPathInfo(drvPath) lookup (calling worker.evalStore.maybeQueryPathInfo
and capturing info->provenance into a std::shared_ptr<const Provenance>
provenance) before the early return for the AlreadyValid case and reuse that
provenance when constructing/returning BuildResult::Success so provenance is
present for the fast path as well as later build paths.

In `@src/libstore/dummy-store.cc`:
- Around line 235-243: The addToStoreFromDump implementation accepts a
provenance parameter but never attaches it to the created ValidPathInfo, so
provenance is dropped; update the function (addToStoreFromDump) to set
info.provenance = provenance (or equivalent member) on the ValidPathInfo
instance before calling whatever inserts the entry (e.g., insertPath /
_store.emplace or the existing insertion code) so the saved path records the
provided provenance.

In `@src/libstore/include/nix/store/legacy-ssh-store.hh`:
- Around line 76-79: The isUsefulProvenance() override currently returns true
but LegacySSHStore (ServeProto) cannot transmit provenance; change the method
implementation in the LegacySSHStore class so isUsefulProvenance() returns false
to accurately reflect protocol capabilities (or alternatively implement
provenance support in ServeProto/ReadConn/WriteConn and UnkeyedValidPathInfo
before keeping true) — locate the isUsefulProvenance() override and update its
return value to false.

In `@src/libutil/include/nix/util/provenance.hh`:
- Around line 3-8: Add explicit standard headers for the used types: include
<map>, <string>, <string_view>, and <memory> at the top of the header so that
uses of std::map, std::string, std::string_view, and std::shared_ptr are no
longer reliant on transitive includes; update the include block near the
existing includes ("nix/util/ref.hh", "nix/util/canon-path.hh", <functional>,
<nlohmann/json_fwd.hpp>) to add these four headers.
♻️ Duplicate comments (6)
src/libexpr/include/nix/expr/eval.hh (1)

1125-1147: Keep executor as the final data member to preserve destruction order.
This places rootProvenance after executor, which violates the “keep this last” guarantee. Please move rootProvenance before executor (or move executor back to the end).

src/libfetchers/tarball.cc (2)

18-34: Register fetchurl provenance for deserialization.
Without a Provenance::Register for "fetchurl", deserialization will fall back to UnknownProvenance, losing typed data.

🧩 Suggested registration (example)
+static Provenance::Register registerFetchurlProvenance(
+    "fetchurl",
+    [](nlohmann::json json) {
+        auto & obj = getObject(json);
+        return make_ref<FetchurlProvenance>(getString(valueAt(obj, "url")));
+    });

107-108: Avoid persisting credentials or signed tokens in provenance URLs.
Storing the raw URL in provenance risks leaking userinfo or signed query params into .narinfo/metadata. Please redact/sanitize before storing.

🛡️ Example guard (pseudo)
-        info.provenance = std::make_shared<FetchurlProvenance>(url);
+        info.provenance = std::make_shared<FetchurlProvenance>(redactSensitiveUrl(url));
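
One possible shape for the guard above, as a sketch only: `redactSensitiveUrl` is the hypothetical name used in the pseudo-code, not an existing Nix helper, and the policy here (drop userinfo, drop the whole query string and fragment) is an assumption rather than what the PR does.

```cpp
// Hypothetical helper; the name comes from the pseudo-code above and is
// not an existing Nix API.
#include <string>

inline std::string redactSensitiveUrl(const std::string & url)
{
    std::string out = url;

    // Conservative policy: drop the query string and fragment entirely,
    // since signed tokens usually live there.
    const auto cut = out.find_first_of("?#");
    if (cut != std::string::npos)
        out.erase(cut);

    // Locate the authority component: it starts after "://" and ends at
    // the next '/' (or at the end of the string).
    const auto scheme = out.find("://");
    if (scheme == std::string::npos)
        return out;
    const auto authStart = scheme + 3;
    auto authEnd = out.find('/', authStart);
    if (authEnd == std::string::npos)
        authEnd = out.size();

    // Strip "user:password@" if an '@' occurs inside the authority.
    const auto at = out.rfind('@', authEnd);
    if (at != std::string::npos && at >= authStart)
        out.erase(authStart, at + 1 - authStart);

    return out;
}
```

Dropping the whole query string is blunt: it also removes parameters that are needed to re-fetch the file, so a real implementation would more likely filter a list of known-sensitive keys. The sketch trades reproducibility for safety to keep the example short.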
src/libutil/experimental-features.cc (1)

323-330: Suppress tracking links when trackingUrl is empty.
documentExperimentalFeatures() always emits a tracking link; empty URL yields a broken link for provenance.

🧾 Suggested guard
-        docOss << fmt(
-            "\nRefer to [%1% tracking issue](%2%) for feature tracking.", xpFeature.name, xpFeature.trackingUrl);
+        if (!xpFeature.trackingUrl.empty()) {
+            docOss << fmt(
+                "\nRefer to [%1% tracking issue](%2%) for feature tracking.", xpFeature.name, xpFeature.trackingUrl);
+        }
src/libstore/include/nix/store/path-info.hh (1)

127-131: Add <memory> include for std::shared_ptr (duplicate).

This header now stores std::shared_ptr, so it should include <memory> directly to avoid reliance on transitive includes.

🔧 Suggested fix
 #include <string>
 #include <optional>
+#include <memory>
src/libfetchers/filtering-source-accessor.cc (1)

71-76: Preserve access checks and subpath provenance (Line 71-76).

This override skips checkAccess() and drops SubpathProvenance wrapping for non-root paths, so callers can obtain root provenance for any subpath. Consider mirroring SourceAccessor::getProvenance.

🧩 Proposed fix
 std::shared_ptr<const Provenance> FilteringSourceAccessor::getProvenance(const CanonPath & path)
 {
+    checkAccess(path);
     if (provenance)
-        return provenance;
+        return path.isRoot() ? provenance : std::make_shared<SubpathProvenance>(provenance, path);
     return next->getProvenance(prefix / path);
 }

If needed, add the SubpathProvenance include to this file.

🧹 Nitpick comments (3)
src/libstore/path-info.cc (1)

219-220: Consider limiting provenance to JSON format V2 (Line 219-220, Line 296-300).

To avoid surprising strict V1 consumers, consider gating provenance emission and parsing to PathInfoJsonFormat::V2 only.

♻️ Suggested adjustment
-        if (experimentalFeatureSettings.isEnabled(Xp::Provenance))
+        if (experimentalFeatureSettings.isEnabled(Xp::Provenance) && format == PathInfoJsonFormat::V2)
             jsonObject["provenance"] = provenance ? provenance->to_json() : nullptr;
-    if (experimentalFeatureSettings.isEnabled(Xp::Provenance)) {
+    if (experimentalFeatureSettings.isEnabled(Xp::Provenance) && format == PathInfoJsonFormat::V2) {
         auto prov = json.find("provenance");
         if (prov != json.end() && !prov->second.is_null())
             res.provenance = Provenance::from_json(prov->second);
     }

Also applies to: 296-300

src/libstore/provenance.cc (1)

6-25: Inconsistent handling of optional next field in JSON serialization.

DerivationProvenance::to_json() always emits the next field (with null if absent), while CopiedProvenance::to_json() only emits next when present. This asymmetry may cause issues with JSON comparison, diffing, or consumers that expect consistent structure.

Consider aligning the behavior—either always emit next (with null) or conditionally emit it in both classes.

Option A: Always emit `next` in CopiedProvenance (matching DerivationProvenance)
 nlohmann::json CopiedProvenance::to_json() const
 {
-    nlohmann::json j{
+    return nlohmann::json{
         {"type", "copied"},
         {"from", from},
+        {"next", next ? next->to_json() : nlohmann::json(nullptr)},
     };
-    if (next)
-        j["next"] = next->to_json();
-    return j;
 }
Option B: Conditionally emit `next` in DerivationProvenance (matching CopiedProvenance)
 nlohmann::json DerivationProvenance::to_json() const
 {
-    return nlohmann::json{
+    nlohmann::json j{
         {"type", "derivation"},
         {"drv", drvPath.to_string()},
         {"output", output},
-        {"next", next ? next->to_json() : nlohmann::json(nullptr)},
     };
+    if (next)
+        j["next"] = next->to_json();
+    return j;
 }
src/libstore/build/substitution-goal.cc (1)

72-72: Consider returning provenance for AlreadyValid paths.
Right now, AlreadyValid reports nullptr even though provenance may exist in the store. If consumers rely on it, consider fetching it here.

💡 Optional improvement
-    if (!repair && worker.store.isValidPath(storePath)) {
-        co_return doneSuccess(BuildResult::Success::AlreadyValid, nullptr);
-    }
+    if (!repair && worker.store.isValidPath(storePath)) {
+        auto info = worker.store.maybeQueryPathInfo(storePath);
+        co_return doneSuccess(BuildResult::Success::AlreadyValid, info ? info->provenance : nullptr);
+    }

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/libutil/include/nix/util/provenance.hh`:
- Around line 12-23: The Provenance polymorphic base class lacks a virtual
destructor; add a public virtual destructor to Provenance (e.g., declare a
virtual ~Provenance() = default) so derived instances (referenced by
std::shared_ptr<const Provenance> and used by functions like from_json_str,
from_json_str_optional, from_json, to_json_str, and the pure virtual to_json)
are safely destroyed when deleted via a base pointer.
♻️ Duplicate comments (1)
src/libutil/include/nix/util/provenance.hh (1)

3-8: Add missing standard library headers.

This header uses std::map, std::string, std::string_view, and std::shared_ptr but relies on transitive includes. Add explicit headers for better portability.

 #include "nix/util/ref.hh"
 #include "nix/util/canon-path.hh"
 
 #include <functional>
+#include <map>
+#include <memory>
+#include <string>
+#include <string_view>
 
 #include <nlohmann/json_fwd.hpp>
🧹 Nitpick comments (1)
src/libutil/include/nix/util/provenance.hh (1)

34-40: Consider logging or asserting on duplicate type registration.

insert_or_assign silently overwrites existing registrations. If two modules accidentally register the same type name, debugging would be difficult. Consider using insert and checking the return value, or at least logging when a type is overwritten.
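
A minimal sketch of the `insert`-based alternative follows. The registry, the factory type, and the function names are hypothetical simplifications (the real registry maps type names to JSON parsers); only `std::map::insert` and its returned `bool` are the actual standard-library behaviour being relied on.

```cpp
// Hypothetical simplification of a type registry; not the PR's code.
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Stand-in for a factory that would parse a JSON object into a provenance.
using FactorySketch = std::function<std::string(const std::string &)>;

inline std::map<std::string, FactorySketch> & registrySketch()
{
    static std::map<std::string, FactorySketch> registry;
    return registry;
}

inline void registerTypeSketch(const std::string & type, FactorySketch factory)
{
    // insert() leaves an existing entry untouched and reports, via the bool
    // in the returned pair, whether the new element was actually added.
    const bool inserted = registrySketch().insert({type, std::move(factory)}).second;
    if (!inserted)
        throw std::logic_error("provenance type '" + type + "' registered twice");
}
```

If registration happens from static initializers, as a `Provenance::Register` object would, an exception thrown there terminates the program at startup. That is loud, which may be the desired outcome for a duplicate type name; a warning log would be the softer option.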

@github-actions github-actions bot temporarily deployed to pull request January 22, 2026 18:42 Inactive

auto attrPath = attr->getAttrPathStr();

state->setRootProvenance(makeProvenance(attrPath));
Member

Rather than having to call this, is there any way we could maybe make this automatic?

(Mostly thinking about protecting from developer error -- if I have to write code here for whatever reason, and add a new path where I should have called setRootProvenance but forgot to, that's probably not great. But if it's integrated into the construction of e.g. attrPath, that means we can't forget it because it'll always happen)
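A rough sketch of what "automatic" could mean here, with hypothetical stand-in types (`EvalState`, `Provenance`, and `ResolvedAttr` below are simplified and are not the real Nix classes). The idea is that the only way to obtain the attribute path is through a type whose constructor records the root provenance, so a new code path cannot skip the call.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified stand-ins for the real types.
struct Provenance
{
    std::string flakeOutput;
};

struct EvalState
{
    std::shared_ptr<const Provenance> rootProvenance;

    void setRootProvenance(std::shared_ptr<const Provenance> p)
    {
        rootProvenance = std::move(p);
    }
};

// Hypothetical wrapper: constructing it is the only way to get the
// attribute path, and construction always records the provenance.
class ResolvedAttr
{
    std::string attrPath;

public:
    ResolvedAttr(EvalState & state, std::string path)
        : attrPath(std::move(path))
    {
        state.setRootProvenance(
            std::make_shared<const Provenance>(Provenance{attrPath}));
    }

    const std::string & getAttrPathStr() const
    {
        return attrPath;
    }
};
```

Whether this fits depends on how many places construct attribute paths today; it trades an explicit call for a constructor side effect.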

cole-h commented Jan 22, 2026

(Also if you could go through the coderabbit comments and see which ones are real things to be resolved or just LLM hallucinations and close them accordingly, that would be great)

Example of a substitution event:

  {
    "action": "result",
    "id": 0,
    "payload": {
      "builtOutputs": {},
      "path": "3bb116cnl86svn2lgc41a3i4a9qblgsf-libtool-2.4.7",
      "provenance": {
        "from": "https://cache.nixos.org",
        "type": "copied"
      },
      "startTime": 0,
      "status": "Substituted",
      "stopTime": 0,
      "success": true,
      "timesBuilt": 0
    },
    "type": 110
  }

Example of a derivation event:

  {
    "action": "result",
    "id": 3381333262860569,
    "payload": {
      "builtOutputs": {
        "out": {
          "dependentRealisations": {},
          "id": "sha256:deb37b0f322203d852a27010200f08e2dd739cb02b51d77999bd7f3162cdfe39!out",
          "outPath": "6b9w3gdjnbdvi50c0h0b9xg91hq6aryl-patchelf-0.18.0",
          "signatures": []
        }
      },
      "cpuSystem": 2118013,
      "cpuUser": 11471586,
      "path": {
        "drvPath": "fm8zrgh4dazysyz3imcva658h0iv34k0-patchelf-0.18.0.drv",
        "outputs": [
          "*"
        ]
      },
      "provenance": {
        "flakeOutput": "packages.x86_64-linux.default",
        "next": {
          "attrs": {
            "dirtyRev": "bb2f1eb3c1e4dc9c4523642a3e39d55806fc9a81-dirty",
            "dirtyShortRev": "bb2f1eb-dirty",
            "lastModified": 1768573749,
            "type": "git",
            "url": "file:///home/eelco/Dev/patchelf"
          },
          "type": "tree"
        },
        "type": "flake"
      },
      "startTime": 1768993105,
      "status": "Built",
      "stopTime": 1768993120,
      "success": true,
      "timesBuilt": 1
    },
    "type": 110
  }
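
The derivation event above nests one provenance record inside another through the `next` key (`flake`, then `tree`). A consumer could walk that chain roughly as follows. This is a sketch over a hypothetical in-memory `ProvenanceNode` type, not the real Nix classes, and it leaves out JSON parsing entirely.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical in-memory form of one provenance record.
struct ProvenanceNode
{
    std::string type;
    // Nested record (the "next" key in the event), or null at the end.
    std::shared_ptr<const ProvenanceNode> next;
};

// Collect the record types from the outermost record to the innermost.
std::vector<std::string> chainTypes(const ProvenanceNode * node)
{
    std::vector<std::string> types;
    for (; node; node = node->next.get())
        types.push_back(node->type);
    return types;
}
```

For the event above, a `flake` node whose `next` is a `tree` node yields the sequence `flake`, `tree`.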
This allows provenance to be propagated correctly to worker threads.
cole-h commented Feb 5, 2026

I think the last piece of the puzzle before this can be merged is handling --impure flake outputs.

Right now, builds that happen as a result of e.g. nix build -f get no provenance data, because they are not reproducible or hermetic. However, --impure flake outputs do get a non-null provenance field. We should handle these consistently and also refuse to give provenance data for --impure outputs, at least for now.

Other than that, this looks good to me.

@cole-h cole-h left a comment

I think this roughly looks good to me now, thanks! Let's start playing with it!

@edolstra edolstra enabled auto-merge February 6, 2026 16:52

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/libstore/unix/build/derivation-builder.cc (1)

1868-1883: ⚠️ Potential issue | 🟡 Minor

Guard provenance emission for impure derivations.

Line 1870 sets provenance whenever drvProvenance is present, but impure derivations should not have provenance recorded for their outputs since they are non-deterministic. Add an explicit guard to prevent this:

Required fix
-            if (drvProvenance)
+            if (drvProvenance && !drv.type().isImpure())
                 newInfo.provenance = std::make_shared<const BuildProvenance>(drvPath, outputName, drvProvenance);

@edolstra edolstra added this pull request to the merge queue Feb 6, 2026
Merged via the queue into main with commit ae71c42 Feb 6, 2026
28 checks passed
@edolstra edolstra deleted the provenance-detsys branch February 6, 2026 17:52