Closed
Labels
bug: Something isn't working
Description
### Apache Iceberg version
1.9.2 (latest release)
### Query engine
None
### Please describe the bug 🐞
PyIceberg master, commit 5e0499c934c5d5a5869c85452f0d586ef4c09c83.
Reading a V3 table that contains nanosecond timestamp columns fails: PyIceberg tries to downcast the values to microseconds and then raises `ResolveError: Cannot promote timestamptz to timestamptz_ns`. The same read succeeds when the table column is declared as a microsecond timestamp, even though the underlying Parquet files store nanosecond timestamps in both cases.
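For reference, here is a minimal sketch of what a nanosecond-to-microsecond downcast does to a single value, in plain Python. It is not PyIceberg's or PyArrow's code. The helper name `downcast_ns_to_us` is made up for illustration, and the sample value is the first timestamp from the ClickHouse output in this report, assumed to be UTC.

```python
from datetime import datetime, timezone

NANOS_PER_MICRO = 1_000


def downcast_ns_to_us(epoch_ns: int) -> tuple[int, int]:
    """Hypothetical helper: split a nanosecond epoch value into
    (microsecond epoch value, nanoseconds dropped by the downcast)."""
    return divmod(epoch_ns, NANOS_PER_MICRO)


# 2025-07-24 00:00:00.006136643, assumed UTC (the report does not state a zone).
midnight = datetime(2025, 7, 24, tzinfo=timezone.utc)
epoch_ns = int(midnight.timestamp()) * 1_000_000_000 + 6_136_643

epoch_us, dropped_ns = downcast_ns_to_us(epoch_ns)

# Sub-second part after the downcast, in microseconds, and the lost remainder.
print(epoch_us % 1_000_000, dropped_ns)  # 6136 643
```

The downcast itself is plain integer truncation, so it loses the last three digits (643 ns here) but should not fail on its own.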
```
virtual-env/bin/python create_table.py
Iceberg does not yet support 'ns' timestamp precision. Downcasting to 'us'.
Iceberg does not yet support 'ns' timestamp precision. Downcasting to 'us'.
Traceback (most recent call last):
  File "playground/create_table.py", line 19, in <module>
    print(tbl.scan(limit=3).to_arrow())
    ~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "playground/virtual-env/lib/python3.13/site-packages/pyiceberg/table/__init__.py", line 1978, in to_arrow
    ).to_table(self.plan_files())
    ~~~~~~~~^^^^^^^^^^^^^^^^^^^
  File "playground/virtual-env/lib/python3.13/site-packages/pyiceberg/io/pyarrow.py", line 1614, in to_table
    first_batch = next(batches)
  File "playground/virtual-env/lib/python3.13/site-packages/pyiceberg/io/pyarrow.py", line 1665, in to_record_batches
    for batches in executor.map(batches_for_task, tasks):
    ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/concurrent/futures/_base.py", line 619, in result_iterator
    yield _result_or_cancel(fs.pop())
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^
  File "/usr/lib/python3.13/concurrent/futures/_base.py", line 317, in _result_or_cancel
    return fut.result(timeout)
    ~~~~~~~~~~^^^^^^^^^
  File "/usr/lib/python3.13/concurrent/futures/_base.py", line 456, in result
    return self.__get_result()
    ~~~~~~~~~~~~~~~~~^^
  File "/usr/lib/python3.13/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/lib/python3.13/concurrent/futures/thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)
  File "playground/virtual-env/lib/python3.13/site-packages/pyiceberg/io/pyarrow.py", line 1662, in batches_for_task
    return list(self._record_batches_from_scan_tasks_and_deletes([task], deletes_per_file))
  File "playground/virtual-env/lib/python3.13/site-packages/pyiceberg/io/pyarrow.py", line 1699, in _record_batches_from_scan_tasks_and_deletes
    for batch in batches:
    ^^^^^^^
  File "playground/virtual-env/lib/python3.13/site-packages/pyiceberg/io/pyarrow.py", line 1513, in _task_to_record_batches
    result_batch = _to_requested_schema(
        projected_schema,
        ...<2 lines>...
        downcast_ns_timestamp_to_us=True,
    )
  File "playground/virtual-env/lib/python3.13/site-packages/pyiceberg/io/pyarrow.py", line 1717, in _to_requested_schema
    struct_array = visit_with_partner(
        requested_schema,
        ...<2 lines>...
        ArrowAccessor(file_schema),
    )
  File "/usr/lib/python3.13/functools.py", line 934, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
  File "playground/virtual-env/lib/python3.13/site-packages/pyiceberg/schema.py", line 662, in _
    return visitor.schema(schema, partner, visit_with_partner(schema.as_struct(), struct_partner, visitor, accessor)) # type: ignore
    ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/functools.py", line 934, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
  File "playground/virtual-env/lib/python3.13/site-packages/pyiceberg/schema.py", line 677, in _
    return visitor.struct(struct, partner, field_results)
    ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "playground/virtual-env/lib/python3.13/site-packages/pyiceberg/io/pyarrow.py", line 1816, in struct
    array = self._cast_if_needed(field, field_array)
  File "playground/virtual-env/lib/python3.13/site-packages/pyiceberg/io/pyarrow.py", line 1757, in _cast_if_needed
    promote(file_field.field_type, field.field_type), include_field_ids=self._include_field_ids
    ~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/functools.py", line 934, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
  File "playground/virtual-env/lib/python3.13/site-packages/pyiceberg/schema.py", line 1618, in promote
    raise ResolveError(f"Cannot promote {file_type} to {read_type}")
pyiceberg.exceptions.ResolveError: Cannot promote timestamptz to timestamptz_ns
```
Actual parquet:
```
./clickhouse client -q "SELECT * FROM file('2025-07-24T00-00-00_2025-07-25T00-00-00_01K11MWTJ1RX6Q7VRZP7YSN490.parquet', Parquet) order by (event_time, insertion_no) LIMIT 3;" --port 9002
2025-07-24 00:00:00.006136643 8298661145 2025-07-24 00:00:00.003000000 8129311262230 1 3627.72 1.88 false 0 0
2025-07-24 00:00:00.006136643 8298661146 2025-07-24 00:00:00.003000000 8129311262230 1 3627.77 1.409 false 0 0
2025-07-24 00:00:00.006136643 8298661147 2025-07-24 00:00:00.003000000 8129311262230 1 3627.95 4.17 false 0 0
```
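One possible reading of the traceback, sketched below as a toy model: after the downcast the file-side type looks like `timestamptz`, while the table schema still asks for `timestamptz_ns`, and no promotion rule connects the two. This is not PyIceberg's implementation. `ALLOWED_PROMOTIONS`, the string type names, and `try_read` are assumptions made for illustration, and `ResolveError` here is a local stand-in for `pyiceberg.exceptions.ResolveError`.

```python
class ResolveError(Exception):
    """Local stand-in for pyiceberg.exceptions.ResolveError."""


# Hypothetical, heavily simplified promotion table of (file type, read type)
# pairs. Only two well-known widening promotions are listed here.
ALLOWED_PROMOTIONS = {("int", "long"), ("float", "double")}


def promote(file_type: str, read_type: str) -> str:
    """Return the type to read as, or raise if the file type cannot be promoted."""
    if file_type == read_type or (file_type, read_type) in ALLOWED_PROMOTIONS:
        return read_type
    raise ResolveError(f"Cannot promote {file_type} to {read_type}")


def try_read(file_type: str, read_type: str) -> str:
    """Hypothetical helper that reports the outcome instead of raising."""
    try:
        return f"ok: read as {promote(file_type, read_type)}"
    except ResolveError as exc:
        return f"error: {exc}"


# Table column declared in microseconds: the downcast file type matches.
print(try_read("timestamptz", "timestamptz"))
# Table column declared in nanoseconds: the downcast file type no longer matches.
print(try_read("timestamptz", "timestamptz_ns"))
```

If this reading is right, the failure comes from comparing the already-downcast file type against the un-downcast table type, not from the Parquet data itself.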
### Willingness to contribute
- [ ] I can contribute a fix for this bug independently
- [ ] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time
FredKhayat