Skip to content

Commit f4e76ea

Browse files
committed
Add example for concurrent processing of partitioned streams using asyncio
1 parent b6909a5 commit f4e76ea

File tree

1 file changed

+14
-0
lines changed

1 file changed

+14
-0
lines changed

docs/source/user-guide/dataframe/index.rst

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -227,6 +227,20 @@ partition:
227227
for batch in stream:
228228
... # each stream yields RecordBatches
229229
230+
To process partitions concurrently, first collect the streams into a list
231+
and then poll each one in a separate ``asyncio`` task:
232+
233+
.. code-block:: python
234+
235+
import asyncio
236+
237+
async def consume(stream):
238+
async for batch in stream:
239+
...
240+
241+
streams = list(df.execute_stream_partitioned())
242+
await asyncio.gather(*(consume(s) for s in streams))
243+
230244
See :doc:`../io/arrow` for additional details on the Arrow interface.
231245

232246
HTML Rendering

0 commit comments

Comments
 (0)