GH-49311: [C++][CI] Use differential fuzzing on IPC file fuzzer #49312
pitrou merged 1 commit into apache:main from
@addisoncrump FYI and if you're not bored of this :) |
|
@github-actions crossbow submit fuzz |
|
Revision: 589a1ff
Submitted crossbow builds: ursacomputing/crossbow @ actions-c68bd13b91 |
|
Looks like a reasonable differential fuzzer impl. Are the two sources of information different? Or is one just a "wrapped" version of another? If so, your fuzzer might be exploring a lot of code that isn't relevant to the actual conversion bit (i.e., the bit you're actually testing). |
|
The IPC file format is just the IPC stream format + a fixed-size header + a footer with additional metadata for random access (like a ZIP catalog, basically). However, since the IPC file format allows for random access, the IPC file reader has specific shortcuts and heuristics to make better use of IO, meaning it takes different code paths than the IPC stream reader. (It's not a problem to exercise the IPC stream reader code either, even though we have a separate fuzz harness for it.) |
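As a rough illustration of the layout described above, the sketch below locates the embedded stream inside a file buffer. `Span` and `EmbeddedStreamSpan` are hypothetical names, not Arrow APIs. The byte layout it assumes (an `ARROW1` magic padded to 8 bytes, the stream, the footer, a little-endian int32 footer size, and a trailing `ARROW1`) comes from the Arrow columnar format specification rather than from this PR, and the actual fuzzer may well obtain the stream differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Byte range [offset, offset + length) of the stream embedded in a file.
struct Span {
  std::size_t offset;
  std::size_t length;
};

// Hypothetical helper (not an Arrow API). Assumed layout:
//   "ARROW1" + 2 padding bytes | stream | footer | int32 footer size | "ARROW1"
std::optional<Span> EmbeddedStreamSpan(std::string_view file) {
  constexpr std::string_view kMagic = "ARROW1";
  constexpr std::size_t kHeaderSize = 8;                   // magic padded to 8 bytes
  constexpr std::size_t kTrailerSize = 4 + kMagic.size();  // footer size + magic
  if (file.size() < kHeaderSize + kTrailerSize) return std::nullopt;
  if (file.substr(0, kMagic.size()) != kMagic) return std::nullopt;
  if (file.substr(file.size() - kMagic.size()) != kMagic) return std::nullopt;

  // The footer size is a little-endian int32 stored just before the trailing magic.
  const auto* p = reinterpret_cast<const unsigned char*>(file.data()) +
                  (file.size() - kTrailerSize);
  const std::uint32_t footer_size = static_cast<std::uint32_t>(p[0]) |
                                    (static_cast<std::uint32_t>(p[1]) << 8) |
                                    (static_cast<std::uint32_t>(p[2]) << 16) |
                                    (static_cast<std::uint32_t>(p[3]) << 24);
  const std::size_t available = file.size() - kHeaderSize - kTrailerSize;
  // Reject negative sizes (as int32) and footers larger than the remaining bytes.
  if (footer_size > 0x7FFFFFFFu || footer_size > available) return std::nullopt;

  const Span span{kHeaderSize, available - footer_size};
  assert(span.offset + span.length <= file.size());
  return span;
}
```

Everything between the header and the footer is what a stream reader would consume sequentially, while a file reader would start from the footer and jump to individual messages.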
|
You're right @kou, thanks. |
Rationale for this change
Enable differential fuzzing to strengthen the invariants exercised by the IPC file fuzzer.
What changes are included in this PR?
When the IPC file fuzzer reads the IPC file successfully, also read the underlying IPC stream and compare the resulting contents for equality. Inequality when reading is treated as a hard failure (crashing the process so that an issue is reported).
There is a caveat: a technically valid IPC file might read differently from the enclosed IPC stream, in which case the fuzzer would report a false positive. It seems unlikely that the fuzzer would generate such a file, but we'll see.
See discussion on the dev ML:
https://lists.apache.org/thread/jpxl3yzm96wkxzb1clokxklsy32b3plh
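The approach and its caveat can be sketched on a toy container format. This is not the Arrow fuzz target: the format and the names `ReadToyStream`, `ReadToyFile` and `ReadersAgree` are invented for illustration. Only `LLVMFuzzerTestOneInput` is a real name, the standard libFuzzer entry point. The toy stream is a sequence of `[u8 length][payload]` records, and the toy file appends a footer of one `u8` offset per record plus a `u8` record count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <optional>
#include <string>
#include <vector>

using Records = std::vector<std::string>;

// Toy stream reader: scans [u8 length][payload] records sequentially.
std::optional<Records> ReadToyStream(const std::uint8_t* data, std::size_t size) {
  Records out;
  std::size_t pos = 0;
  while (pos < size) {
    const std::size_t len = data[pos++];
    if (len > size - pos) return std::nullopt;  // truncated record
    out.emplace_back(reinterpret_cast<const char*>(data + pos), len);
    pos += len;
  }
  return out;
}

// Toy file reader: jumps to each record through the footer offsets.
// Layout: stream | one u8 offset per record | u8 record count.
std::optional<Records> ReadToyFile(const std::uint8_t* data, std::size_t size) {
  if (size == 0) return std::nullopt;
  const std::size_t count = data[size - 1];
  if (count > size - 1) return std::nullopt;
  const std::size_t stream_size = size - 1 - count;
  Records out;
  for (std::size_t i = 0; i < count; ++i) {
    std::size_t pos = data[stream_size + i];
    if (pos >= stream_size) return std::nullopt;  // offset outside the stream
    const std::size_t len = data[pos++];
    if (len > stream_size - pos) return std::nullopt;
    out.emplace_back(reinterpret_cast<const char*>(data + pos), len);
  }
  return out;
}

// True if the file reader rejects the input, or if both readers agree.
bool ReadersAgree(const std::uint8_t* data, std::size_t size) {
  const auto from_file = ReadToyFile(data, size);
  if (!from_file) return true;  // invalid file: nothing to compare
  assert(size > 0);             // guaranteed by a successful file read
  const std::size_t stream_size = size - 1 - data[size - 1];
  const auto from_stream = ReadToyStream(data, stream_size);
  return from_stream.has_value() && *from_stream == *from_file;
}

// libFuzzer entry point: a mismatch aborts so that the fuzzer reports it.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data, std::size_t size) {
  if (!ReadersAgree(data, size)) std::abort();
  return 0;
}
```

In this toy format, a file whose footer lists the records in a different order than they appear in the stream is still readable, yet the two readers return different results. That is the same kind of "valid but different" input the caveat above is about.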
Are these changes tested?
By manually running the fuzz target against existing seed files.
Are there any user-facing changes?
No.