feat: Spec multi-result-set API #3871

zeroshade · 2026-01-08T18:04:11Z

Extracted from #3607 with influence by the comments there and main...CurtHagenlocher:arrow-adbc:MoreResults, this contains a proposal for handling multi-result set query execution via ADBC by adding a new function for drivers, AdbcStatementNextResultSet.

This also includes the necessary changes for an ADBC API Revision 1.2.0 (macro defines and so on). The comment above the function includes all the semantic definitions of the behavior.

zeroshade · 2026-01-08T18:08:09Z

After discussion and agreement on the proposed function, I'll update the relevant implementations here of the other driver managers to leverage and expose this (python, rust, Go, etc.)

lidavidm · 2026-01-09T01:27:11Z

Should we make a branch for spec changes?

zeroshade · 2026-01-09T01:32:31Z

Should we make a branch for spec changes?

That makes sense, gather the spec updates in the branch and then eventually decide to push it as one thing.

lidavidm · 2026-01-09T02:53:33Z

I made https://github.com/apache/arrow-adbc/tree/spec-1.2.0

zeroshade · 2026-01-09T15:39:58Z

I've changed the base for this PR to point to the spec-1.2.0 branch.

Now we just need people to weigh in on the proposal 😄

paleolimbot

Nice!

The previous PR also had:

ADBC_EXPORT
AdbcStatusCode AdbcStatementNextResultSetSchema(struct AdbcStatement* statement,
                                                struct ArrowSchema* schema,
                                                struct AdbcError* error);

Is that no longer planned or no longer needed?

paleolimbot · 2026-01-09T22:41:54Z

c/include/arrow-adbc/adbc.h

+ADBC_EXPORT
+AdbcStatusCode AdbcStatementNextResultSet(struct AdbcStatement* statement,
+                                          struct ArrowArrayStream* out,
+                                          struct AdbcPartitions* partitions,
+                                          int64_t* rows_affected,
+                                          struct AdbcError* error);


Would it be possible to have this just be:

ADBC_EXPORT AdbcStatusCode AdbcStatementNextResultSet(struct AdbcStatement* statement, struct AdbcError* error);

...and then subsequent calls to AdbcStatementExecute(), AdbcStatementExecuteSchema(), AdbcStatementExecutePartitions(), and future async versions of all of those don't have to be redefined?

That's a reasonable idea. Maybe we can specify that the query must be NULL in those cases? (Though, that would translate weirdly into many of the language bindings, albeit they also don't necessarily have to be 1:1.)

(I think we need some way to differentiate, in that case, "I want to start a new query" vs "I want to continue executing the query". And ideally we remain backwards-compatible, so that calling ExecuteQuery again would start a new query as it currently does.)

I tried to essentially handle all of the cases with the one function here (minus async) by specifying particular inputs as null or not. I'm don't particularly like having to figure out how to differentiate between the Execute* methods starting a new execution vs continuing after a call to NextResultSet, though that might just be bias from existing APIs I've used.

My thoughts were that the Execute* methods return a stream/schema/list of partitions, so calling NextResultSet should also do the same.

The other pattern I've seen would be that we'd create an AdbcResult struct which would have a NextResultSet callback on it (which IMO we should have done in the first place), but this would be a breaking change or at least require an entirely new set of functions.

@CurtHagenlocher @davidhcoe do either of you have any opinions on this?

Maybe we can specify that the query must be NULL in those cases?

Does that mean the user has to explicitly set it back to NULL? Or that the statement should have internally set it to NULL when it was executed? That's kind of annoying if you're trying to re-use the same Statement instance to execute the same query multiple times if you have to re-set the query to execute it again. Some users might expect that they need to hold on to the Statement to execute it without paying the cost of preparing it again. They might even try to cache Statements that they've already prepared alongside the parent connection.

Including the partitions API makes this method weirdly modal, I would say if you wanted to lean into the partitions API then you should commit to it and just return a partition from here. Or just split it into two separate methods.

But it's already hard to understand from the API design how you're supposed to use partitions, much less how you're supposed to implement them in the driver. Having to go back to the connection to read the partition doesn't make a ton of sense to me, unless it's a separate query or protocol command to read from it, like fetching from a named cursor.

I suggested privately that there should be a new ExecuteMulti method that explicitly returns a multi-result-set stream. That encodes the user's expectation up-front that there will be multiple result sets from the query, and doesn't require Statement itself to become any more stateful than it is now.

@paleolimbot and @davidhcoe / @CurtHagenlocher?

If everyone's okay with this idea I'll take a stab at updating this proposal accordingly

It's a great point that there's already a lot of state to keep track of in a statement, although that bookeeping still has to happen somewhere (both for the driver, ensuring it doesn't call something that will crash, and the caller, ensuring it doesn't violate any constraint about holding multiple results). Possibly there is just no getting around a lot of new C functions (whether they're on some new AdbcResultSet or the existing AdbcStatement).

I guess what I struggle with is how do you know when additional results are expected? Is there something that would split the SQL statements to know there are multiple? Do you always call AdbcStatementNextResultSet just to check if there is a second one?

I'm only familiar with the libpq way of doing this where you call PQgetResult() until it returns NULL (although some libraries probably expose a way to access the number of statements in something that is prepared).

The ODBC, JDBC and ADO.NET APIs all work in this fashion. One thing to consider for databases which support stored procedures is that it may not be possible to know in advance how many result sets they will return. I know this is true for MSSQL and think it might be true for PostgreSQL as well.

zeroshade · 2026-01-19T22:05:55Z

I've updated the PR based on the comments and discussion. Everyone please take a look and let me know what you think!

lidavidm · 2026-01-20T06:45:03Z

c/include/arrow-adbc/adbc.h

+/// with the result set.
+///
+/// \since ADBC API revision 1.2.0
+struct ADBC_EXPORT AdbcResultSet {


I think I'd vaguely prefer these as free functions to make it so that there's only one struct that needs possible extension later.

lidavidm · 2026-01-20T06:45:16Z

c/include/arrow-adbc/adbc.h

+/// \since ADBC API revision 1.2.0
+struct ADBC_EXPORT AdbcResultSet {
+  /// \brief opaque implementation-defined state
+  void* private_data;


Maybe we should also have a pointer to the AdbcDriver as with the other structs?

lidavidm · 2026-01-20T06:45:31Z

c/include/arrow-adbc/adbc.h

+  /// \return ADBC_STATUS_NOT_IMPLEMENTED if the driver only supports fetching results
+  ///   as partitions, ADBC_STATUS_INVALID_STATE if called at an inappropriate time,
+  ///   and ADBC_STATUS_OK (or an appropriate error code) otherwise.
+  AdbcStatusCode (*NextResultSet)(struct AdbcResultSet* self,


Did we also want a NextSchema callback?

Although, it'd probably be better to indicate the desire for that up front so that the driver can choose what to do properly. (And actually, I'm not sure if database clients generally support this in the first place...)

lidavidm · 2026-01-20T06:47:13Z

c/include/arrow-adbc/adbc.h

+  /// \brief opaque implementation-defined state
+  void* private_data;
+
+  /// \brief Release the result set and any associated resources.


We can say this also cancels the query if not completed?

feat: Spec multi-result-set API

ed8252a

zeroshade requested a review from lidavidm as a code owner January 8, 2026 18:04

zeroshade requested review from CurtHagenlocher, amoeba, felipecrv, kou and paleolimbot January 8, 2026 18:04

github-actions bot added this to the ADBC Libraries 22 milestone Jan 8, 2026

zeroshade added 3 commits January 8, 2026 13:09

linting

8ec1d95

fix tests

6abbf86

pre-commit

133e895

lidavidm modified the milestones: ADBC Libraries 22, ADBC Libraries 23, ADBC API Specification 1.2.0 (Async) Jan 9, 2026

zeroshade changed the base branch from main to spec-1.2.0 January 9, 2026 15:39

paleolimbot reviewed Jan 9, 2026

View reviewed changes

Update based on comments

52e646e

lidavidm reviewed Jan 20, 2026

View reviewed changes

feat: Spec multi-result-set API #3871

Are you sure you want to change the base?

feat: Spec multi-result-set API #3871

Conversation

zeroshade commented Jan 8, 2026

Uh oh!

zeroshade commented Jan 8, 2026

Uh oh!

lidavidm commented Jan 9, 2026

Uh oh!

zeroshade commented Jan 9, 2026

Uh oh!

lidavidm commented Jan 9, 2026

Uh oh!

zeroshade commented Jan 9, 2026

Uh oh!

paleolimbot left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

zeroshade commented Jan 19, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants