Commit febd7d6

UNPICK Update PyCapsule failure analysis to clarify symptoms and root causes
1 parent bb3844a · commit febd7d6

File tree

1 file changed · +26 -28 lines changed

1 file changed

+26
-28
lines changed

dev/notes/pycapsule_failure.md

Lines changed: 26 additions & 28 deletions
```diff
@@ -1,38 +1,36 @@
 # PyCapsule Read Failure Analysis
 
 ## Symptom
-Running `python examples/pycapsule_failure.py` terminates the interpreter with a segmentation fault.
+Running `python examples/pycapsule_failure.py` now fails immediately with
+``ValueError: Table provider capsule is missing a destructor`` when
+``SessionContext.read_table`` attempts to coerce the fabricated capsule.
 
 ## Regression Surface
-Commit range `9b4f1442^..d629ced2` replaced the `Table` wrapper-based API with a generic constructor that accepts arbitrary Python objects and auto-discovers how to turn them into a `TableProvider`. The new `PyTable::new` implementation in Rust now attempts to coerce any object that exposes `__datafusion_table_provider__` into an FFI provider. 【F:src/table.rs†L54-L77】
+Commit range `9b4f1442^..6e449da5` teaches
+``table_provider_from_pycapsule`` to recognize raw ``PyCapsule`` instances
+before checking for a ``__datafusion_table_provider__`` attribute.
+【F:src/utils.rs†L205-L222】 As soon as ``SessionContext.read_table`` sees the
+dummy capsule exposed by the example it now calls straight into the capsule
+conversion path without first wrapping it in ``Table``.
 
 ## Root Cause
-`table_provider_from_pycapsule` only validates the capsule name before transmuting its pointer into an `FFI_TableProvider`. 【F:src/utils.rs†L127-L141】 The helper assumes the capsule contains a valid `FFI_TableProvider` allocation created by our bindings. The regression example fabricates a capsule with the correct name but with an arbitrary pointer (`ctypes.create_string_buffer`). 【F:examples/pycapsule_failure.py†L8-L24】 Because the new constructor now reaches this path for any `read_table` call, the bogus pointer is dereferenced immediately, corrupting memory and crashing the interpreter.
-
-Prior to the refactor, callers could not pass arbitrary capsule-bearing objects to `SessionContext.read_table`; they first had to wrap them in `Table`/`RawTable`, which were only constructible through safe helpers that produced trusted capsules. The new auto-coercion path therefore widened the attack surface to unvalidated capsules, exposing the latent unsafety.
+The new fast-path reuses ``table_provider_from_capsule``, which enforces that
+the capsule has a destructor originating from ``datafusion_ffi``’s helper.
+【F:src/utils.rs†L193-L205】 The regression example still fabricates its capsule
+with ``PyCapsule_New`` and passes ``NULL`` for the destructor.
+【F:examples/pycapsule_failure.py†L10-L24】 Because no release callback is
+registered, the validation now fails with the ``missing a destructor`` error.
+Prior to the change, the example first wrapped the capsule in ``Table`` which
+never triggered the destructor check, masking the issue.
 
 ## Runtime failure after 91b90f44
-Commit 91b90f44 changed :meth:`SessionContext.read_table` so that any object
-exposing ``__datafusion_table_provider__`` is normalized through
-``Table.from_table_provider_capsule`` before delegating to the Rust context.
-【F:python/datafusion/context.py†L1189-L1198】 That helper now calls into the
-private binding ``df_internal.catalog.RawTable.from_table_provider_capsule`` to
-wrap the capsule, but the ``RawTable`` type exported from
-``datafusion._internal`` does not currently expose such a constructor.
-【F:python/datafusion/catalog.py†L176-L187】 At runtime the lookup therefore
-raises ``AttributeError`` and prevents `examples/pycapsule_failure.py` from
-running, regressing the original reproducer from a segfault into a hard failure.
-
 ## Suggested Tasks
-1. Export a ``RawTable.from_table_provider_capsule`` constructor from the Rust
-bindings and ensure it becomes available through
-``datafusion._internal.catalog`` during the wheel build so that the Python
-shim can locate it.
-2. Add an integration test that imports ``datafusion._internal`` and asserts
-``hasattr(df_internal.catalog.RawTable, "from_table_provider_capsule")``
-before exercising ``SessionContext.read_table`` with a raw capsule to catch
-regressions.
-3. Consider extending ``table_provider_from_pycapsule`` so that
-``RawTable.__new__`` can directly accept capsule instances (without going
-through the static helper) to reduce the surface area for Python/Rust API
-skew in the future.
+1. Provide a public helper (Python or Rust) that fabricates a minimal, but
+valid, table-provider capsule so that examples/tests no longer rely on
+``PyCapsule_New`` with a ``NULL`` destructor.
+2. Document the requirement that all externally supplied table-provider
+capsules must originate from ``datafusion_ffi`` helpers so the release hook
+is present; include guidance in ``examples/pycapsule_failure.py``.
+3. Add a regression test that round-trips a capsule produced by the new helper
+through ``SessionContext.read_table`` to ensure the destructor validation and
+success path both work as intended.
```
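
For orientation, here is a minimal sketch of a reproducer matching the updated Symptom section. The real script is `examples/pycapsule_failure.py`, which this commit does not show; the capsule name and buffer size below are assumptions, and per the analysis the final call should now raise the `ValueError` rather than segfault.

```python
# Hedged sketch of the reproducer's shape (not the actual
# examples/pycapsule_failure.py): fabricate a PyCapsule that carries a
# plausible name and an arbitrary pointer, but a NULL destructor.
import ctypes

from datafusion import SessionContext

PyCapsule_New = ctypes.pythonapi.PyCapsule_New
PyCapsule_New.restype = ctypes.py_object
PyCapsule_New.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]

# Arbitrary allocation standing in for a real FFI_TableProvider.
buffer = ctypes.create_string_buffer(64)
capsule = PyCapsule_New(
    ctypes.cast(buffer, ctypes.c_void_p),
    b"datafusion_table_provider",  # assumed capsule name
    None,  # NULL destructor: the condition the new validation rejects
)

ctx = SessionContext()
# Per the note above, this now fails with
# "ValueError: Table provider capsule is missing a destructor".
ctx.read_table(capsule)
```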

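The Root Cause hinges on whether a release callback is attached to the capsule. The Rust-side check lives in `src/utils.rs` and is not reproduced here; the snippet below is only a Python analogue of the property it checks, using the CPython capsule API, to show why the fabricated capsule is rejected.

```python
import ctypes

# Python analogue of the destructor check described above; the real
# validation is on the Rust side (src/utils.rs) and may differ in detail.
PyCapsule_GetDestructor = ctypes.pythonapi.PyCapsule_GetDestructor
PyCapsule_GetDestructor.restype = ctypes.c_void_p
PyCapsule_GetDestructor.argtypes = [ctypes.py_object]


def has_destructor(capsule: object) -> bool:
    """True if the capsule carries a non-NULL destructor (release callback)."""
    return PyCapsule_GetDestructor(capsule) is not None
```

Applied to the capsule from the previous sketch, `has_destructor` returns `False`, which is the condition the new fast-path turns into the `missing a destructor` error.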
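
Suggested task 3 asks for a regression test. The success-path half depends on the public helper proposed in task 1, which does not exist yet, so the pytest draft below only covers the rejection path and assumes the error message quoted in the Symptom section.

```python
import ctypes

import pytest
from datafusion import SessionContext


def test_read_table_rejects_capsule_without_destructor():
    # Fabricate a capsule the same way the reproducer sketch does: a
    # plausible name (assumed), an arbitrary pointer, and a NULL destructor.
    PyCapsule_New = ctypes.pythonapi.PyCapsule_New
    PyCapsule_New.restype = ctypes.py_object
    PyCapsule_New.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]

    buffer = ctypes.create_string_buffer(64)  # kept alive for the whole test
    capsule = PyCapsule_New(
        ctypes.cast(buffer, ctypes.c_void_p),
        b"datafusion_table_provider",
        None,
    )

    ctx = SessionContext()
    with pytest.raises(ValueError, match="missing a destructor"):
        ctx.read_table(capsule)
```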