brontide: use vectored I/O to send 50% less packets on the wire #10450
base: master
Conversation
In this commit, we return to an old issue in the tracker: lightningnetwork#5837. Due to an older bug fix related to MAC check failures, we started to use two `Write` calls, one for the header and the other for the body. This enabled us to fix the bug we were seeing, but it also meant that we were sending 2x the packets on the wire. The referenced issue above notes this tradeoff, and suggests that we disable TCP_NODELAY on the connection object. This serves to enable Nagle's algorithm. The tradeoff is that we may see up to a ~200 ms delay, as the kernel may wait a bit longer to see if more data shows up before flushing the socket.

In this PR, we take an alternative approach by using Go's `net.Buffers` API. This API implements vectored I/O (scatter/gather) via the `writev` system call. We pass in a slice of buffers to be written _atomically_ to the wire. In the case that this fails, we'll try again with the remaining data that wasn't yet written. In practice, we should now see that we send half as many packets on the wire as before. Verifying this in the wild will require use of `tcpdump` or Wireshark, etc.
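For readers unfamiliar with `net.Buffers`, here is a minimal, self-contained Go sketch of the mechanism the PR relies on. It is not the actual `brontide.Flush` implementation: the local throwaway listener, the stand-in header/body bytes, and the timeout-retry loop are illustrative assumptions only.

```go
package main

import (
	"errors"
	"io"
	"log"
	"net"
)

func main() {
	// A throwaway local listener stands in for the remote peer so the
	// example is runnable end to end.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		log.Fatal(err)
	}
	defer ln.Close()

	go func() {
		peer, err := ln.Accept()
		if err != nil {
			return
		}
		defer peer.Close()
		io.Copy(io.Discard, peer) // the peer just drains whatever arrives
	}()

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// Stand-ins for the encrypted length prefix and the encrypted body.
	header := []byte{0x00, 0x19}
	body := []byte("encrypted message payload")

	// net.Buffers implements io.WriterTo; when the writer is a
	// *net.TCPConn, WriteTo hands both slices to the kernel with a single
	// writev call, so header and body leave in one segment.
	bufs := net.Buffers{header, body}

	// WriteTo consumes the slices as they are written: fully written
	// buffers are dropped and a partially written one is trimmed, so on a
	// write timeout the next iteration resumes exactly where the previous
	// attempt stopped.
	for len(bufs) > 0 {
		if _, err := bufs.WriteTo(conn); err != nil {
			var nerr net.Error
			if errors.As(err, &nerr) && nerr.Timeout() {
				continue
			}
			log.Fatalf("flush failed: %v", err)
		}
	}

	log.Println("header and body flushed with a single vectored write")
}
```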
Summary of Changes

Hello @Roasbeef, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the network efficiency of Brontide connections by refactoring the message flushing mechanism to use vectored I/O. By combining the message header and body into a single atomic write operation, it halves the number of TCP packets sent per message, leading to a more streamlined and performant communication layer. This approach avoids potential latency introduced by alternative methods like Nagle's algorithm while ensuring correct handling of partial writes and maintaining data integrity.
Code Review
This pull request refactors the brontide.Flush method to use vectored I/O, which should significantly reduce the number of packets sent on the wire by coalescing header and body writes into a single syscall. The change is well-implemented and includes new tests to verify correctness, especially for partial write recovery scenarios. My review includes a couple of suggestions for code clarity and adherence to the style guide.
// If both header and body have been fully flushed, release the pooled
// buffers back to their pools.
//
// TODO(roasbeef): move above again?
This TODO comment seems to be a leftover from development. Moving releaseBuffers() before the error check would introduce a bug: in case of a partial write with an error, the buffers would be released prematurely, preventing a successful retry on the next Flush() call. The current placement is correct. I suggest removing this comment to avoid future confusion.
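For clarity, here is a small self-contained sketch of the ordering this comment defends. The `flusher` type and its fields are hypothetical; only `releaseBuffers` and the retry-on-next-Flush semantics come from the comment above.

```go
package brontidesketch

import "net"

// flusher is a hypothetical stand-in for the connection state machine; it
// only exists to illustrate the ordering discussed above.
type flusher struct {
	pending net.Buffers
}

// releaseBuffers stands in for handing the pooled header and body buffers
// back to their pools.
func (f *flusher) releaseBuffers() {
	f.pending = nil
}

// flush shows why the release must come after the error check: on a partial
// write the unwritten bytes still live in the pooled buffers, and the next
// flush call needs them to retry.
func (f *flusher) flush(conn net.Conn) (int, error) {
	n, err := f.pending.WriteTo(conn)
	if err != nil {
		// Keep the buffers so a later flush resumes from the tail.
		return int(n), err
	}

	// Everything was written, so releasing the buffers is now safe.
	f.releaseBuffers()

	return int(n), nil
}
```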
writes [][]byte
}

func (w *writeRecorder) Write(p []byte) (int, error) {
According to the style guide (rule 17), every function should be commented with its purpose. This method is missing a comment. Also, rule 18 states that function comments must begin with the function name.
I suggest adding a comment like:
// Write records the written bytes. A copy of the slice is taken to avoid
// issues with buffer reuse.

Suggested change:
+ // Write records the written bytes. A copy of the slice is taken to avoid
+ // issues with buffer reuse.
  func (w *writeRecorder) Write(p []byte) (int, error) {
References
- Rule 17: Every function must be commented with its purpose and assumptions. Rule 18: Function comments must begin with the function name. (link)
// totalBytes returns the total number of bytes written across all Write calls.
func (w *writeRecorder) totalBytes() int {
According to the style guide (rule 18), function comments must begin with the function name. This comment starts with a lowercase letter.
Suggested change:
- // totalBytes returns the total number of bytes written across all Write calls.
+ // TotalBytes returns the total number of bytes written across all Write calls.
  func (w *writeRecorder) totalBytes() int {
References
- Rule 18: Function comments must begin with the function name. (link)
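Putting the quoted fragments and the suggested comments together, the test helper presumably looks roughly like the sketch below; the copy-on-write body and the `totalBytes` loop are assumptions, not the PR's actual code.

```go
package brontidesketch

// writeRecorder pieces together the test helper quoted in the fragments
// above; the exact implementation in the PR may differ.
type writeRecorder struct {
	writes [][]byte
}

// Write records the written bytes. A copy of the slice is taken to avoid
// issues with buffer reuse.
func (w *writeRecorder) Write(p []byte) (int, error) {
	buf := make([]byte, len(p))
	copy(buf, p)
	w.writes = append(w.writes, buf)

	return len(p), nil
}

// totalBytes returns the total number of bytes written across all Write
// calls.
func (w *writeRecorder) totalBytes() int {
	var total int
	for _, b := range w.writes {
		total += len(b)
	}

	return total
}
```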
MPins left a comment
I tested this change in a local regtest setup using Docker. The node at 172.18.0.1 runs my locally compiled LND, while 172.18.0.3 runs LND v0.19.1 inside the Docker environment.
I compared traffic before and after applying the PR using Wireshark, filtering for TCP port 9735.
Even in this low-traffic scenario, it is possible to observe the benefit of this change in the marked packets, where header and payload are transmitted together rather than as separate segments.
The screenshots below compare message transmission before and after the change during channel opening, invoice payment, and channel closing.
- channel opening
- invoice payment
- channel closing
@ziggie1984: review reminder