Skip to content

Conversation

@Roasbeef
Copy link
Member

In this commit, we return to an old issue in the tracker: #5837. Due to an older bug fix related to MAC check failures, we started to use two Write calls, one for the header and the other for the body. This enabled us to fix the bug we were seeing, meant that we were sending 2x the packets on the wire.

The referenced issue above notes this tradeoff, and suggests that we disable TCP no delay on the connection object. This serves to enable Nagle's algorithm. The tradeoff with this is that we may see up to a ~200 ms delay as the kernel may wait a bit longer to see if more data shows up before flushing the socket.

In this PR, we take an alternative approach, by using Go's net.Buffer API. This API implements vectored I/O (scatter and gather) via the writev system call. We pass in a slice of buffers to be written atomically to the wire. In the case that this fails, we'll try again with the remaining amount that was written.

In practice, we should now see that we send half as many packets on the wire as before. Verfiying this in the wild will require use of tcpdump or wireshark, etc, etc.

In this commit, we return to an old issue in the tracker:
lightningnetwork#5837. Due to an older bug
fix related to MAC check failures, we started to use two `Write` calls,
one for the header and the other for the body. This enabled us to fix
the bug we were seeing, meant that we were sending 2x the packets on the
wire.

THe referenced issue above notes this tradeoff, and suggests that we
disable TCP no delay on the connection object. This serves to enable
Nagle's algorithm. The tradeoff with this is that we may see up to a
~200 ms delay as the kernel may wait a bit longer to see if more data
shows up before flushing the socket.

In this PR, we take an alternative approach, by using Go's `net.Buffer`
API. This API implements vectored I/O (scatter and gather) via the
`writev` system call. We pass in a slice of buffers to be written
_atomically_ to the wire. In the case that this fails, we'll try again
with the remaining amount that was written.

In practice, we should now see that we send half as many packets on the
wire as before. Verfiying this in the wild will require use of `tcpdump`
or wireshark, etc, etc.
@gemini-code-assist
Copy link

Summary of Changes

Hello @Roasbeef, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the network efficiency of Brontide connections by refactoring the message flushing mechanism to use vectored I/O. By combining the message header and body into a single atomic write operation, it halves the number of TCP packets sent per message, leading to a more streamlined and performant communication layer. This approach avoids potential latency introduced by alternative methods like Nagle's algorithm while ensuring correct handling of partial writes and maintaining data integrity.

Highlights

  • Network Efficiency: Reduces the number of packets sent on the wire by 50% for Brontide messages by combining header and body writes.
  • Vectored I/O Implementation: Leverages Go's net.Buffers API, which utilizes the writev system call, to atomically write both message header and body.
  • Improved Flush Logic: The Flush method in brontide/noise.go has been refactored to support this new vectored I/O approach, including robust handling of partial writes and accurate calculation of flushed plaintext bytes.
  • New Test Coverage: Introduces TestFlushCoalescedWrite and TestFlushPartialWriteRecovery to thoroughly validate the new write behavior and error recovery.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the brontide.Flush method to use vectored I/O, which should significantly reduce the number of packets sent on the wire by coalescing header and body writes into a single syscall. The change is well-implemented and includes new tests to verify correctness, especially for partial write recovery scenarios. My review includes a couple of suggestions for code clarity and adherence to the style guide.

// If both header and body have been fully flushed, release the pooled
// buffers back to their pools.
//
// TODO(roasbeef): move above again?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This TODO comment seems to be a leftover from development. Moving releaseBuffers() before the error check would introduce a bug: in case of a partial write with an error, the buffers would be released prematurely, preventing a successful retry on the next Flush() call. The current placement is correct. I suggest removing this comment to avoid future confusion.

writes [][]byte
}

func (w *writeRecorder) Write(p []byte) (int, error) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

According to the style guide (rule 17), every function should be commented with its purpose. This method is missing a comment. Also, rule 18 states that function comments must begin with the function name.

I suggest adding a comment like:

// Write records the written bytes. A copy of the slice is taken to avoid
// issues with buffer reuse.
Suggested change
func (w *writeRecorder) Write(p []byte) (int, error) {
// Write records the written bytes. A copy of the slice is taken to avoid
// issues with buffer reuse.
func (w *writeRecorder) Write(p []byte) (int, error) {
References
  1. Rule 17: Every function must be commented with its purpose and assumptions. Rule 18: Function comments must begin with the function name. (link)

Comment on lines +717 to +718
// totalBytes returns the total number of bytes written across all Write calls.
func (w *writeRecorder) totalBytes() int {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

According to the style guide (rule 18), function comments must begin with the function name. This comment starts with a lowercase letter.

Suggested change
// totalBytes returns the total number of bytes written across all Write calls.
func (w *writeRecorder) totalBytes() int {
// TotalBytes returns the total number of bytes written across all Write calls.
func (w *writeRecorder) totalBytes() int {
References
  1. Rule 18: Function comments must begin with the function name. (link)

@ziggie1984 ziggie1984 self-requested a review December 18, 2025 21:31
Copy link
Contributor

@MPins MPins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this change in a local regtest setup using Docker. The node at 172.18.0.1 runs my locally compiled LND, while 172.18.0.3 runs LND v0.19.1 inside the Docker environment.

I compared traffic before and after applying the PR using Wireshark, filtering for TCP port 9735.

Even in this low-traffic scenario, it is possible to observe the benefit of this change in the marked packets, where header and payload are transmitted together rather than as separate segments.

The screenshots below compare message transmission before and after the change during channel opening, invoice payment, and channel closing.

  • channel opening
Screenshot from 2025-12-20 11-53-50 Screenshot from 2025-12-20 11-54-46
  • invoice payment
Screenshot from 2025-12-20 12-34-22 Screenshot from 2025-12-20 12-34-54
  • channel closing
Screenshot from 2025-12-20 12-56-20 Screenshot from 2025-12-20 12-56-50

@lightninglabs-deploy
Copy link

@ziggie1984: review reminder
@Roasbeef, remember to re-request review from reviewers when ready

@saubyk saubyk added this to the v0.20.1 milestone Jan 6, 2026
@saubyk saubyk added this to lnd v0.20 Jan 6, 2026
@saubyk saubyk moved this to In review in lnd v0.20 Jan 6, 2026
@saubyk saubyk removed this from the v0.20.1 milestone Jan 6, 2026
@saubyk saubyk removed this from lnd v0.20 Jan 6, 2026
@saubyk saubyk added this to v0.21 Jan 6, 2026
@saubyk saubyk moved this to In review in v0.21 Jan 6, 2026
@saubyk saubyk added this to the v0.21.0 milestone Jan 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: In review

Development

Successfully merging this pull request may close these issues.

4 participants