-
Notifications
You must be signed in to change notification settings - Fork 2.2k
multi: Switch store modifications + prevent duplicates in SendHTLC #9777
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
multi: Switch store modifications + prevent duplicates in SendHTLC #9777
Conversation
The SwitchRPC server will be hidden behind a build tag.
Add RPC for dispatching payments via onions. The payment route and onion are computed by the caller and the onion is delivered to the server for forwarding. NOTE: The server does NOT process or peel the onion so it assumed that the onion will be constructed such that the first hop is encrypted to one of the server's channel partners.
Allow the switch to defer error handling when callers of GetAttemptResult do not provide an error decrypter.
Add RPC to lookup the status of a previously forwarded onion. Allow callers of the TrackOnion rpc to indicate whether they would like to handle errors themselves or delegate error decryption to the server. We take care to return ErrPaymentIDNotFound across RPC boundary to the RPC caller. This will allow the caller of TrackOnion to explicitly confirm that there is no HTLC in-flight for the supplied attempt ID, so it is free to safely re-attempt the payment.
Add RPC which constructs a sphinx onion packet for the given payment route. NOTE: This is added primarily to aid with the itests added later.
This demonstrates how the Switch and SendOnion rpc behave when asked to dispatch duplicate onions. Notably, the Switch circuit map detects this - but only if the matching onion is still in flight. Once the circuit is torn down, the duplicate is permitted by the Switch. It is likely that we will add a layer of protection to the SendOnion call itself to prevent duplicates even after the first HTLC is no longer in-flight. TODO: Determine whether this SendOnion duplication protection should presist across restarts.
This allows users of the SendOnion RPC to include all fields that we support using with the UpdateAddHtlc type.
Preperatory refactor to allow for future alteration of the store backing the Switch.
TODO: is this needed after pushing pending result handling deeper into the store and InitAttempt within SendHTLC rather than SendOnion?
Add a new message for pending htlc attempt results in the Switch's network result store. This serves as a place holder to use during initialization of an attempt within the store which will be replaced with either a SETTLE or FAIL message once the HTLC attempt result is received from the network. NOTE: This message is not sent or received externally to channel peers. It was introduced only to avoid having to change the on-disk structure of the network result store.
This can be used to initialize the result store for a given attempt ID prior to sending the HTLC out to the network.
We're seeing what benefit to upstream clients is provided if the underlying SendHTLC implementatation (and by extension the SendOnion RPC) will not forward the same attempt ID twice without the result for a given ID having been cleaned from the result store. We accomplish the duplicate prevention using the InitAttempt method of the Switch Store.
We can now assert that making multiple calls to SendOnion for the same attempt ID is prevented.
We need to ensure that each of our test nodes are aware of the necessary channels. Not sure if AssertChannelActive or AssertChannelInGraph is better for this purpose.
This allows for the resolving of any ambiguity in the status of an HTLC in the case where the ChannelRouter and Switch run in separate processes and communicate over the network. When SendHTLC returns it is expected that the Router know, without any uncertainty whether the HTLC is in-flight (no error) or that it is neither in flight nor will it ever be in flight. This is tricky to guarantee when the communication happens over a network. Instead we allow the server to resolve any uncertainty by calling SendHTLC while resuming a payment prior to attempting to track the result. If the HTLC was successfully received by the remote Switch, then the Router will receive a duplicate HTLC error and can proceed to tracking the attempt result like normal.
|
Important Review skippedAuto reviews are limited to specific labels. 🏷️ Labels to auto review (1)
Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
|
Mistakenly opened this. My apologies. |
Change Description
GOAL: Prevent duplicate payment attempts and unintentional loss of funds when using the SendOnion/TrackOnion RPCs to dispatch and track payments. We intend to use the ChannelRouter in a remote process from lnd and the Switch, but this discussion would apply generally to any client looking to use these RPC endpoints.
We have already explored an approach which prevents this using the SendOnion/TrackOnion implementations themselves and requires little to no changes to Switch or ChannelRouter logic. The approach is demonstrated in #9489. Both approaches are summarized there in the
Avoiding Duplicate Payment Attemptssection.We expand the ChannelRouter to handle new failure modes when SendHTLC is implemented using RPCs over a network - as is the case when Router and Switch run in separate processes.
The ChannelRouter needs to distinguish between an attempt that was initiated successfully and an attempt for which it is not known whether it initiated successfully. It must attempt to track the result only for attempts which are known to have been initiated successfully. Otherwise, since we base retry decision of the result of attempt tracking, we risk duplicate attempts being made if SendHTLC is implemented via RPC between two processes communicating over async network.
Originally, I thought that this approach would require the persisting of the acknowledgement from the Switch server using something like:
Persisted Server Acknowledgment of HTLC Receipt & Dispatch + Modified ChannelRouter Startup
This sequence—RegisterAttempt, SendHTLC, and AcknowledgeAttempt—is helpful in a distributed environment to:
On restart, query the ControlTower for in-flight attempts:
This is roughly demonstrated in calvinrzachman#14.
This PR explores a modification to the above approach to minimize the necessary ChannelRouter changes. To accomplish this we update the contract or assurance offered by the SendHTLC() method by making it safe to call multiple times with the same attempt ID. This allows us to to guarantee that:
With this assurance we safely and defensively call SendHTLC while resuming payments on startup. This should work both in the case where Router and Switch run in the same process (as is the case for the large majority of lnd deployments) and in the scenario where Router and Switch run remotely.