Fuzzing Crash: VarBinViewArray validity mismatch in mask operation (combine_validity creates Nullable for NonNullable array) #6048

@github-actions

Fuzzing Crash Report

Analysis

Crash Location: vortex-array/src/arrays/masked/execute.rs:102 in the mask_validity_varbinview function

Error Message:

[Debug Assertion]: Invalid VarBinViewArray parameters:
  validity Array(BoolArray { dtype: Bool(NonNullable), bits: BitBuffer { buffer: Buffer<u8> { length: 1, alignment: Alignment(1), as_slice: [251] }, offset: 0, len: 3 }, validity: NonNullable, stats_set: ArrayStats { inner: RwLock { data: StatsSet { values: [] } } } }) incompatible with nullability NonNullable
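
The `as_slice: [251]` byte in the message above can be decoded by hand. This is a minimal sketch, assuming the bitmap is packed least-significant-bit first (the Arrow convention; the report does not state the bit order). `decode_lsb_first` is a hypothetical helper written for this illustration, not a Vortex API.

```rust
/// Decode `len` bits starting at bit `offset` from a packed bitmap,
/// assuming least-significant-bit-first order within each byte.
fn decode_lsb_first(bytes: &[u8], offset: usize, len: usize) -> Vec<bool> {
    (offset..offset + len)
        .map(|i| (bytes[i / 8] >> (i % 8)) & 1 == 1)
        .collect()
}

fn main() {
    // Values taken from the error message: as_slice [251], offset 0, len 3.
    let bits = decode_lsb_first(&[251], 0, 3);
    println!("{:?}", bits);
}
```

Under that assumption, 251 = 0b1111_1011 decodes to valid, valid, invalid for the three rows, so the combined validity carries an actual null at index 2 rather than being all-valid.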

Stack Trace:

   0: __rustc::rust_begin_unwind
   1: core::panicking::panic_fmt
   2: panic_display<vortex_error::VortexError>
   3: {closure#1}<(), vortex_error::VortexError>
   4: unwrap_or_else<(), vortex_error::VortexError, vortex_error::{impl#11}::vortex_expect::{closure_env#1}<(), vortex_error::VortexError>>
   5: vortex_expect<(), vortex_error::VortexError>
   6: new_unchecked at ./vortex-array/src/arrays/varbinview/array.rs:165
   7: mask_validity_varbinview at ./vortex-array/src/arrays/masked/execute.rs:102
   8: mask_validity_canonical at ./vortex-array/src/arrays/masked/execute.rs:42:35
   9: execute at ./vortex-array/src/arrays/masked/vtable/mod.rs:147:25
  10: to_canonical<vortex_array::arrays::masked::vtable::MaskedVTable>

Root Cause:

The bug is in the combine_validity function at vortex-array/src/arrays/masked/execute.rs:53-57:

fn combine_validity(validity: &Validity, mask: &Mask, len: usize) -> Validity {
    let current_mask = validity.to_mask(len);
    let combined = current_mask.bitand(mask);
    Validity::from_mask(combined, Nullability::Nullable)  // BUG: Always returns Nullable!
}

This function always creates a Nullable validity, regardless of the input array's nullability. When the result is attached to a VarBinViewArray whose dtype is NonNullable, the validation check at vortex-array/src/arrays/varbinview/array.rs:185-190 rejects it, as it should:

vortex_ensure!(
    validity.nullability() == dtype.nullability(),
    "validity {:?} incompatible with nullability {:?}",
    validity,
    dtype.nullability()
);
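
The interaction between the two snippets above can be reproduced with a toy model. This is a sketch under stated assumptions: the `Validity` and `Nullability` enums, the `Vec<bool>` masks, and `ensure_compatible` are simplified stand-ins written for this illustration, not the real Vortex types or signatures.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Nullability {
    NonNullable,
    Nullable,
}

// Toy stand-in for the validity type; masks are plain Vec<bool> here.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Validity {
    NonNullable,
    AllValid,
    AllInvalid,
    Array(Vec<bool>),
}

impl Validity {
    fn nullability(&self) -> Nullability {
        match self {
            Validity::NonNullable => Nullability::NonNullable,
            _ => Nullability::Nullable,
        }
    }

    fn to_mask(&self, len: usize) -> Vec<bool> {
        match self {
            Validity::NonNullable | Validity::AllValid => vec![true; len],
            Validity::AllInvalid => vec![false; len],
            Validity::Array(bits) => bits.clone(),
        }
    }
}

// Mirrors the reported behavior: the result is always a Nullable variant,
// whatever the nullability of the input validity was.
fn combine_validity(validity: &Validity, mask: &[bool], len: usize) -> Validity {
    let combined: Vec<bool> = validity
        .to_mask(len)
        .iter()
        .zip(mask)
        .map(|(a, b)| *a && *b)
        .collect();
    if combined.iter().all(|b| *b) {
        Validity::AllValid
    } else if combined.iter().all(|b| !*b) {
        Validity::AllInvalid
    } else {
        Validity::Array(combined)
    }
}

// Toy equivalent of the vortex_ensure! check quoted above.
fn ensure_compatible(validity: &Validity, dtype: Nullability) -> Result<(), String> {
    if validity.nullability() == dtype {
        Ok(())
    } else {
        Err(format!(
            "validity {:?} incompatible with nullability {:?}",
            validity, dtype
        ))
    }
}

fn main() {
    // A non-nullable input combined with a mask that clears the last row.
    let combined = combine_validity(&Validity::NonNullable, &[true, true, false], 3);
    println!("{:?}", ensure_compatible(&combined, Nullability::NonNullable));
}
```

In this model even an all-true mask yields `AllValid`, which is still a Nullable variant, so the check against a NonNullable dtype fails whether or not the mask actually introduces a null.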

What Happened:

  1. A ChunkedArray with Utf8(NonNullable) dtype containing a VarBinViewArray chunk is created
  2. A to_canonical operation triggers mask execution through MaskedVTable
  3. mask_validity_canonical calls mask_validity_varbinview
  4. mask_validity_varbinview calls combine_validity, which creates a Nullable validity
  5. VarBinViewArray::new_unchecked checks, as a debug assertion, that the validity's nullability matches the dtype's nullability
  6. The check fails because the array has a NonNullable dtype but a Nullable validity

Impact:

This bug potentially affects every masked operation on a non-nullable array, not just VarBinViewArray, because each of the following functions calls combine_validity in the same way:

  • mask_validity_varbinview (line 102)
  • mask_validity_bool (line 66)
  • mask_validity_primitive (line 72)
  • mask_validity_decimal (line 83)
  • mask_validity_listview (line 113)
  • mask_validity_fixed_size_list (line 123)
  • mask_validity_struct (line 130)

In this crash the issue surfaced through VarBinViewArray because its constructor strictly validates that validity and dtype nullability agree; for the other array types it remains a latent bug in the masked array system.

Debug Output
FuzzArrayAction {
    array: ChunkedArray {
        dtype: Utf8(
            NonNullable,
        ),
        len: 3,
        chunk_offsets: PrimitiveArray {
            dtype: Primitive(
                U64,
                NonNullable,
            ),
            buffer: BufferHandle(
                Host(
                    Buffer<u8> {
                        length: 24,
                        alignment: Alignment(
                            8,
                        ),
                        as_slice: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...],
                    },
                ),
            ),
            validity: NonNullable,
            stats_set: ArrayStats {
                inner: RwLock {
                    data: StatsSet {
                        values: [],
                    },
                },
            },
        },
        chunks: [
            VarBinArray {
                dtype: Utf8(
                    NonNullable,
                ),
                bytes: Buffer<u8> {
                    length: 0,
                    alignment: Alignment(
                        1,
                    ),
                    as_slice: [],
                },
                offsets: PrimitiveArray {
                    dtype: Primitive(
                        U32,
                        NonNullable,
                    ),
                    buffer: BufferHandle(
                        Host(
                            Buffer<u8> {
                                length: 4,
                                alignment: Alignment(
                                    4,
                                ),
                                as_slice: [0, 0, 0, 0],
                            },
                        ),
                    ),
                    validity: NonNullable,
                    stats_set: ArrayStats {
                        inner: RwLock {
                            data: StatsSet {
                                values: [
                                    (
                                        IsSorted,
                                        Exact(
                                            ScalarValue(
                                                Bool(
                                                    true,
                                                ),
                                            ),
                                        ),
                                    ),
                                ],
                            },
                        },
                    },
                },
                validity: NonNullable,
                stats_set: ArrayStats {
                    inner: RwLock {
                        data: StatsSet {
                            values: [],
                        },
                    },
                },
            },
            VarBinViewArray {
                dtype: Utf8(
                    NonNullable,
                ),
                buffers: [],
                views: Buffer<vortex_vector::binaryview::view::BinaryView> {
                    length: 3,
                    alignment: Alignment(
                        16,
                    ),
                    as_slice: [BinaryView { inline: Inlined { size: 0, data: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] } }, BinaryView { inline: Inlined { size: 0, data: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] } }, BinaryView { inline: Inlined { size: 0, data: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] } }],
                },
                validity: NonNullable,
                stats_set: ArrayStats {
                    inner: RwLock {
                        data: StatsSet {
                            values: [],
                        },
                    },
                },
            },
        ],
        stats_set: ArrayStats {
            inner: RwLock {
                data: StatsSet {
                    values: [],
                },
            },
        },
    },
    actions: [
        (
            Compress(
                Default,
            ),
            Array(
                ChunkedArray {
                    dtype: Utf8(
                        NonNullable,
                    ),
                    len: 3,
                    chunk_offsets: PrimitiveArray {
                        dtype: Primitive(
                            U64,
                            NonNullable,
                        ),
                        buffer: BufferHandle(
                            Host(
                                Buffer<u8> {
                                    length: 24,
                                    alignment: Alignment(
                                        8,
                                    ),
                                    as_slice: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...],
                                },
                            ),
                        ),
                        validity: NonNullable,
                        stats_set: ArrayStats {
                            inner: RwLock {
                                data: StatsSet {
                                    values: [],
                                },
                            },
                        },
                    },
                    chunks: [
                        VarBinArray {
                            dtype: Utf8(
                                NonNullable,
                            ),
                            bytes: Buffer<u8> {
                                length: 0,
                                alignment: Alignment(
                                    1,
                                ),
                                as_slice: [],
                            },
                            offsets: PrimitiveArray {
                                dtype: Primitive(
                                    U32,
                                    NonNullable,
                                ),
                                buffer: BufferHandle(
                                    Host(
                                        Buffer<u8> {
                                            length: 4,
                                            alignment: Alignment(
                                                4,
                                            ),
                                            as_slice: [0, 0, 0, 0],
                                        },
                                    ),
                                ),
                                validity: NonNullable,
                                stats_set: ArrayStats {
                                    inner: RwLock {
                                        data: StatsSet {
                                            values: [
                                                (
                                                    IsSorted,
                                                    Exact(
                                                        ScalarValue(
                                                            Bool(
                                                                true,
                                                            ),
                                                        ),
                                                    ),
                                                ),
                                            ],
                                        },
                                    },
                                },
                            },
                            validity: NonNullable,
                            stats_set: ArrayStats {
                                inner: RwLock {
                                    data: StatsSet {
                                        values: [],
                                    },
                                },
                            },
                        },
                        VarBinViewArray {
                            dtype: Utf8(
                                NonNullable,
                            ),
                            buffers: [],
                            views: Buffer<vortex_vector::binaryview::view::BinaryView> {
                                length: 3,
                                alignment: Alignment(
                                    16,
                                ),
                                as_slice: [BinaryView { inline: Inlined { size: 0, data: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] } }, BinaryView { inline: Inlined { size: 0, data: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] } }, BinaryView { inline: Inlined { size: 0, data: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] } }],
                            },
                            validity: NonNullable,
                            stats_set: ArrayStats {
                                inner: RwLock {
                                    data: StatsSet {
                                        values: [],
                                    },
                                },
                            },
                        },
                    ],
                    stats_set: ArrayStats {
                        inner: RwLock {
                            data: StatsSet {
                                values: [],
                            },
                        },
                    },
                },
            ),
        ),
    ],
}

Reproduction

  1. Download the crash artifact:

  2. Reproduce locally:

# The artifact contains array_ops/crash-caec2ffa2e36c2363c142c55a4176bef24a9ef3c
cargo +nightly fuzz run -D --sanitizer=none array_ops array_ops/crash-caec2ffa2e36c2363c142c55a4176bef24a9ef3c -- -rss_limit_mb=0

  3. Get full backtrace:

RUST_BACKTRACE=full cargo +nightly fuzz run -D --sanitizer=none array_ops array_ops/crash-caec2ffa2e36c2363c142c55a4176bef24a9ef3c -- -rss_limit_mb=0

Auto-created by fuzzing workflow with Claude analysis
