Open
Description
Apache Iceberg version
main (development)
Please describe the bug 🐞
Problem
Scanning a PyIceberg table with a row filter that compares a UUID column raises pyarrow.lib.ArrowNotImplementedError: Function 'equal' has no kernel matching input types (extension<arrow.uuid>, extension<arrow.uuid>). The error indicates that PyArrow's equal compute function has no kernel for comparing two values of the arrow.uuid extension type.
Environment
pyiceberg: Nightly build (expected to support UUIDs)
pyarrow: 21.0.0
Python: 3.13
Code to Reproduce
import uuid
from pyiceberg.expressions import EqualTo
# `table` is an already-loaded PyIceberg table with a UUID column named batch_id.
# This fails with ArrowNotImplementedError
df = table.scan(row_filter=EqualTo("batch_id", uuid.UUID("0190de80-647f-4bbc-a80e-efda686b910f")))
Full Error Stack Trace
  File "/opt/homebrew/Cellar/python@3.13/3.13.3/Frameworks/Python.framework/Versions/3.13/lib/python3.13/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/opt/homebrew/Cellar/python@3.13/3.13.3/Frameworks/Python.framework/Versions/3.13/lib/python3.13/concurrent/futures/thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/.venv/lib/python3.13/site-packages/pyiceberg/io/pyarrow.py", line 1694, in batches_for_task
    return list(self._record_batches_from_scan_tasks_and_deletes([task], deletes_per_file))
  File "/Users/.venv/lib/python3.13/site-packages/pyiceberg/io/pyarrow.py", line 1732, in _record_batches_from_scan_tasks_and_deletes
    for batch in batches:
                 ^^^^^^^
  File "/Users/.venv/lib/python3.13/site-packages/pyiceberg/io/pyarrow.py", line 1518, in _task_to_record_batches
    fragment_scanner = ds.Scanner.from_fragment(
        fragment=fragment,
    ...<4 lines>...
        columns=[col.name for col in file_project_schema.columns],
    )
  File "pyarrow/_dataset.pyx", line 3692, in pyarrow._dataset.Scanner.from_fragment
  File "pyarrow/_dataset.pyx", line 3458, in pyarrow._dataset._populate_builder
  File "pyarrow/_compute.pyx", line 2732, in pyarrow._compute._bind
  File "pyarrow/error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
pyarrow.lib.ArrowNotImplementedError: Function 'equal' has no kernel matching input types (extension<arrow.uuid>, extension<arrow.uuid>)
Expected Behavior
The table scan should successfully filter rows by UUID without throwing a kernel matching error.
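Until the filter works, one possible stopgap is to apply the UUID comparison client-side, after the rows have been read without the UUID predicate. The sketch below is plain Python and only illustrates the idea. The `rows` data and the `as_uuid` and `filter_by_uuid` helpers are hypothetical and not part of PyIceberg or PyArrow, and it assumes UUID values arrive either as uuid.UUID objects or as raw 16-byte values.

```python
import uuid


def as_uuid(value):
    """Normalize a uuid.UUID or a raw 16-byte value to uuid.UUID (hypothetical helper)."""
    if isinstance(value, uuid.UUID):
        return value
    return uuid.UUID(bytes=bytes(value))


def filter_by_uuid(rows, column, target):
    """Keep rows whose `column` equals `target`; null values never match."""
    return [
        row
        for row in rows
        if row[column] is not None and as_uuid(row[column]) == target
    ]


target = uuid.UUID("0190de80-647f-4bbc-a80e-efda686b910f")

# Made-up rows standing in for data that was already read without the UUID filter.
rows = [
    {"batch_id": target.bytes, "n": 1},      # raw 16-byte representation
    {"batch_id": uuid.UUID(int=0), "n": 2},  # a different UUID, as an object
    {"batch_id": None, "n": 3},              # null value
]

matches = filter_by_uuid(rows, "batch_id", target)
print(matches)
```

This reads every row before filtering, so it is only practical for small tables and does not replace a proper fix in the scan path.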
Willingness to contribute
- I can contribute a fix for this bug independently
- I would be willing to contribute a fix for this bug with guidance from the Iceberg community
- I cannot contribute a fix for this bug at this time