This discussion proposes adding native Arrow Flight support to DataFusion-Comet's shuffle implementation. Currently, Comet writes shuffle data in Arrow IPC format but relies on Spark's Netty-based BlockManager for network transfer. By implementing Arrow Flight in the Rust native layer, we can achieve true end-to-end zero-copy columnar shuffle, eliminating the JVM boundary crossing for network I/O.
Current Flow:
- Rust ShuffleWriter -> Arrow IPC files -> Disk
- Spark BlockManager -> Netty -> Remote Executor
- JNI -> Rust ShuffleReader -> RecordBatch
Proposed Flow:
- Rust ShuffleWriter -> In-memory Arrow buffers
- Rust FlightServer -> gRPC/HTTP2 -> Remote Rust FlightClient
- Rust ShuffleReader -> RecordBatch (zero-copy)
Ref