Skip to content

Conversation

@karlem
Copy link
Contributor

@karlem karlem commented Oct 9, 2025

Close #1440


Note

Introduce an F3 proof generation and caching service (with RocksDB, metrics, and verifier), wire it into Fendermint and add CLI tools to inspect/manage the cache; also sync contract event/signature markers and minor CI/test tweaks.

  • Fendermint VM (new service):
    • Add fendermint/vm/topdown/proof-service crate for F3 certificate validation, proof bundle assembly, two-level cache (in-memory + RocksDB), persistence, metrics, and verification (assembler.rs, cache.rs, persistence.rs, service.rs, verifier.rs).
    • Provide launch_service(...) API and a dev binary proof-cache-test.
  • App Integration:
    • Add dependency on fendermint_vm_topdown_proof_service and include it in workspace.
    • New CLI subcommands fendermint proof-cache {inspect|stats|get|clear} for cache management.
  • Contracts:
    • Annotate event signatures and storage slot offsets in LibGateway.sol, LibGatewayActorStorage.sol, LibPowerChangeLog.sol, and Subnet.sol to stay in sync with proof-service.
  • IPLD Resolver:
    • Reduce membership publish intervals in tests; clean up unused log import.
  • CI:
    • Switch Aderyn installation to npm install -g @cyfrin/aderyn.
  • Deps/Workspace:
    • Add parking_lot, humantime-serde; bump bls-signatures to 0.15; update Cargo.toml members.

Written by Cursor Bugbot for commit b0213ec. This will update automatically on new commits. Configure here.

@karlem karlem changed the base branch from main to f3-add-actor October 9, 2025 16:41
@karlem karlem force-pushed the f3-add-actor branch 2 times, most recently from 6f22fee to 023528d Compare October 20, 2025 18:15
@karlem karlem marked this pull request as ready for review October 27, 2025 18:58
@karlem karlem requested a review from a team as a code owner October 27, 2025 18:58
@karlem karlem changed the title F3 proofs cache feat: F3 proofs cache Oct 27, 2025
cursor[bot]

This comment was marked as outdated.

@karlem karlem changed the title feat: F3 proofs cache feat: add F3 proofs cache Oct 27, 2025
cursor[bot]

This comment was marked as outdated.

cursor[bot]

This comment was marked as outdated.

Base automatically changed from f3-add-actor to main October 31, 2025 15:10
cursor[bot]

This comment was marked as outdated.

@karlem karlem requested a review from a team November 5, 2025 13:50
@karlem karlem requested a review from LePremierHomme November 6, 2025 12:14

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this AI-generated? In many places it mentions exact identifiers and full data structures from the current code base. Do we really want to keep such detailed readme in sync with future code changes?

Copy link
Contributor Author

@karlem karlem Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is AI generated, so updating is quite easy (as we could also have it re-generated on CI?). But I can remove some of the details from it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's up to you, I just shared my opinion

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's already out of sync with the actual source code 🙃

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because there was no need to re-generate it give that there was no final implementation approved.

Copy link

@sergefdrv sergefdrv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did my first round of review and left some comments/questions. I would do another round after getting your answers.

One general suggestion: maybe we should consider removing the persistent storage functionality from this PR and make it simpler and easier to review? We can add it later, in a separate PR. WDYT?

@karlem
Copy link
Contributor Author

karlem commented Nov 18, 2025

One general suggestion: maybe we should consider removing the persistent storage functionality from this PR and make it simpler and easier to review? We can add it later, in a separate PR. WDYT?

I honestly think that the storage is NOT a big part of the PR so it should be fine with it.

@sergefdrv
Copy link

One general suggestion: maybe we should consider removing the persistent storage functionality from this PR and make it simpler and easier to review? We can add it later, in a separate PR. WDYT?

I honestly think that the storage is NOT a big part of the PR so it should be fine with it.

Then I should review the interplay between the in-memory cache, the persistent storage, and the on-chain state more closely.

StorageProofSpec {
actor_id: self.gateway_actor_id,
slot: calculate_storage_slot("", NEXT_CONFIG_NUMBER_STORAGE_SLOT),
},
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Storage slot calculation uses empty string for global config

The second StorageProofSpec for NEXT_CONFIG_NUMBER_STORAGE_SLOT uses an empty string "" for calculate_storage_slot, while the first spec uses self.subnet_id. If nextConfigurationNumber is a global field (not per-subnet), passing an empty string may be intentional, but if it's per-subnet like the first spec, this would generate an incorrect storage slot. The inconsistency between the two specs suggests a potential copy-paste error or misunderstanding of the storage layout.

Fix in Cursor Fix in Web

@karlem karlem requested a review from sergefdrv December 4, 2025 20:00
@karlem karlem requested a review from sergefdrv December 8, 2025 20:35
"calibrationnet",
initial_instance,
)
.await?;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Hardcoded F3 network name in development testing tool

The development tool accepts arbitrary RPC URLs via --rpc-url but hardcodes "calibrationnet" as the F3 network name. The F3 light client uses this network name as part of the BLS signing domain, so if a user points the tool at a mainnet or other network RPC endpoint, certificate validation will silently fail with cryptographic errors rather than a clear message about the network mismatch. The network name parameter is missing from the CLI arguments despite the tool being designed to work with arbitrary RPC endpoints.

Fix in Cursor Fix in Web

Copy link

@sergefdrv sergefdrv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job! Now this PR looks good. I don't have any serious concern, just some small minor suggestions. Feel free to merge at your discretion.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's already out of sync with the actual source code 🙃

"calibrationnet",
initial_instance,
)
.await?;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Hardcoded network name prevents multi-network usage

The F3 network name is hardcoded to "calibrationnet" when creating the F3Client, but the binary accepts --rpc-url as a user-configurable parameter implying multi-network support. If a user provides a mainnet or other network RPC URL, the F3 client would be initialized with the wrong network name, causing certificate validation failures or incorrect behavior. The network name parameter is missing from the CLI arguments.

Fix in Cursor Fix in Web

} else {
println!("No proof found for instance {}", instance_id);
println!();
println!("Available instances: {:?}", entries.len());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Misleading CLI output shows count instead of instance IDs

In the get_proof function, when a proof is not found, the error message displays "Available instances: X" where X is entries.len() (the count of entries). This is misleading because the label "Available instances" suggests the user will see a list of instance IDs they can query, but instead they only see a number. Compare this to show_stats which correctly labels the count as "Count" and shows actual instance IDs separately. Users who see "Available instances: 5" would reasonably expect to see something like [100, 101, 102, 103, 104] so they know which specific instances to query.

Fix in Cursor Fix in Web

/// This captures validator power changes that need to be reflected in the subnet
pub const NEW_POWER_CHANGE_REQUEST_SIGNATURE: &str =
"NewPowerChangeRequest(PowerOperation,address,bytes,uint64)";

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Event signatures use non-canonical Solidity types

NEW_TOPDOWN_MESSAGE_SIGNATURE and NEW_POWER_CHANGE_REQUEST_SIGNATURE use IpcEnvelope and PowerOperation type names, but Ethereum event topics are derived from canonical ABI types (structs as tuples, enums as uint8). This likely makes EventProofSpec.event_signature hash to the wrong topic0, so event proofs won’t match logs and generated bundles can be incomplete/invalid.

Fix in Cursor Fix in Web

/// This captures validator power changes that need to be reflected in the subnet
pub const NEW_POWER_CHANGE_REQUEST_SIGNATURE: &str =
"NewPowerChangeRequest(PowerOperation,address,bytes,uint64)";

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Incorrect Solidity event signatures in proofs

NEW_TOPDOWN_MESSAGE_SIGNATURE and NEW_POWER_CHANGE_REQUEST_SIGNATURE use custom type names (IpcEnvelope, PowerOperation) instead of ABI canonical types (tuple(...) for structs and uint8 for enums). Any filtering/proof generation keyed off these signatures will not match actual event topics, leading to missing/invalid event proofs.

Fix in Cursor Fix in Web

/// Bindings: contract_bindings::lib_power_change_log::NewPowerChangeRequestFilter
/// This captures validator power changes that need to be reflected in the subnet
pub const NEW_POWER_CHANGE_REQUEST_SIGNATURE: &str =
"NewPowerChangeRequest(PowerOperation,address,bytes,uint64)";
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Incorrect event signature strings for proofs

The constants NEW_TOPDOWN_MESSAGE_SIGNATURE and NEW_POWER_CHANGE_REQUEST_SIGNATURE use non-canonical Solidity types (IpcEnvelope, PowerOperation) instead of ABI-canonical types (tuples/underlying integer). This likely produces mismatched topic hashes during proof generation, causing event proofs to miss real NewTopDownMessage/NewPowerChangeRequest logs.

Fix in Cursor Fix in Web

slot: calculate_storage_slot("", NEXT_CONFIG_NUMBER_STORAGE_SLOT),
},
]
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Wrong storage slot computation for config number

build_storage_specs computes the nextConfigurationNumber slot via calculate_storage_slot(\"\", NEXT_CONFIG_NUMBER_STORAGE_SLOT). If nextConfigurationNumber is a plain storage slot (not a mapping keyed by a string), hashing with an empty key will target a different slot and generate invalid storage proofs.

Fix in Cursor Fix in Web

},
StorageProofSpec {
actor_id: self.gateway_actor_id,
slot: calculate_storage_slot("", NEXT_CONFIG_NUMBER_STORAGE_SLOT),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Incorrect storage slot calculation for nextConfigurationNumber

The NEXT_CONFIG_NUMBER_STORAGE_SLOT constant is set to 20, but according to the contract storage layout JSON files, PowerChangeLog (containing nextConfigurationNumber) is stored at slot 9 within ParentValidatorsTracker. The comment claims "nextConfigurationNumber is at slot 20" but this doesn't match the actual storage layout. Additionally, calculate_storage_slot("", NEXT_CONFIG_NUMBER_STORAGE_SLOT) passes an empty string for the key, which may produce incorrect results if the function expects a mapping key for slot calculation. This could cause storage proofs to reference the wrong slot, leading to proof verification failures.

Fix in Cursor Fix in Web

@karlem karlem merged commit 00b3d7f into main Dec 18, 2025
18 checks passed
@karlem karlem deleted the f3-proofs-cache branch December 18, 2025 16:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

F3 topdown: Proof Generator & Local Proof Cache

4 participants