Commit e9c1cbc
feat: implement direct Avro encoder for performance
Implement direct Avro encoder to eliminate GenericDatum intermediate layer,
matching the decoder approach for better performance.
Implementation:
- Add avro_direct_encoder_internal.h with EncodeArrowToAvro API
- Add avro_direct_encoder.cc implementing direct Arrow→Avro encoding
  - All primitive types: bool, int, long, float, double, string, binary
  - Temporal types: date, time, timestamp
  - Logical types: uuid, decimal
  - Nested types: struct, list, map (both string and non-string keys)
  - Union type handling for optional fields
- Modify avro_writer.cc to use DataFileWriterBase with direct encoder
- Add EncodeContext to reuse scratch buffers and avoid allocations
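As a rough illustration of what "direct" encoding means here, the sketch below writes Avro binary bytes straight into a reusable buffer, with no intermediate datum object. It is not the code from this commit: `SketchEncodeContext`, `EncodeLong`, `EncodeString`, `EncodeOptionalLong`, and `EncodeRow` are hypothetical names, the input is plain C++ values rather than Arrow arrays, and the optional field is assumed to use the union schema `["null", "long"]` with null as branch 0. Only the byte layout (zigzag varints, length-prefixed strings, union branch index followed by the value) follows the Avro specification.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical scratch-buffer holder, only loosely analogous to the
// EncodeContext named in the commit message; not the actual type.
struct SketchEncodeContext {
  std::vector<uint8_t> buffer;
  // clear() keeps the capacity, so later rows reuse the same allocation.
  void Reset() { buffer.clear(); }
};

// Avro int/long: zigzag-map the signed value, then emit 7 bits per byte,
// least significant group first, with the high bit set on all but the last.
inline void EncodeLong(int64_t value, std::vector<uint8_t>& out) {
  const uint64_t sign = value < 0 ? ~uint64_t{0} : uint64_t{0};
  uint64_t zz = (static_cast<uint64_t>(value) << 1) ^ sign;
  while (zz >= 0x80) {
    out.push_back(static_cast<uint8_t>((zz & 0x7F) | 0x80));
    zz >>= 7;
  }
  out.push_back(static_cast<uint8_t>(zz));
}

// Avro string: byte length encoded as a long, followed by the UTF-8 bytes.
inline void EncodeString(std::string_view s, std::vector<uint8_t>& out) {
  EncodeLong(static_cast<int64_t>(s.size()), out);
  out.insert(out.end(), s.begin(), s.end());
}

// Optional field as a union, assuming the schema ["null", "long"]:
// the zero-based branch index is written first, then the value (null is empty).
inline void EncodeOptionalLong(const std::optional<int64_t>& v,
                               std::vector<uint8_t>& out) {
  if (!v.has_value()) {
    EncodeLong(0, out);  // branch 0: null, no payload
    return;
  }
  EncodeLong(1, out);    // branch 1: long
  EncodeLong(*v, out);
}

// Encodes one record with fields (id: long, name: string, score: optional long)
// directly into the context's buffer. A record is just its fields in order.
inline const std::vector<uint8_t>& EncodeRow(int64_t id, std::string_view name,
                                             std::optional<int64_t> score,
                                             SketchEncodeContext& ctx) {
  ctx.Reset();
  EncodeLong(id, ctx.buffer);
  EncodeString(name, ctx.buffer);
  EncodeOptionalLong(score, ctx.buffer);
  return ctx.buffer;
}
```

Calling `EncodeRow` repeatedly with the same context reuses one buffer across rows, which is the kind of allocation saving the EncodeContext bullet describes. The returned reference is only valid until the next call on that context.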
This matches the Java Iceberg implementation, which uses the Encoder interface directly,
avoiding intermediate object allocation overhead.

1 parent 61a7de5 commit e9c1cbc
File tree
6 files changed: +982 -18 lines changed
- src/iceberg
- avro
- test