# True Async Paging in async-cassandra

## Key Concepts

### 1. Always Use Context Managers (CRITICAL)

```python
# ✅ CORRECT - Prevents resource leaks
async with await session.execute_stream("SELECT * FROM table") as result:
    async for row in result:
        await process_row(row)

# ❌ WRONG - Will leak resources!
result = await session.execute_stream("SELECT * FROM table")
async for row in result:  # Missing context manager!
    await process_row(row)
```

### 2. How Paging Actually Works

The Cassandra driver implements **true streaming** with these characteristics:

- **On-Demand Fetching**: Pages are fetched as you consume data, NOT all at once
- **Async Fetching**: While you process page N, the driver can fetch page N+1
- **Memory Efficient**: Only one page is held in memory at a time
- **No Pre-Fetching of All Data**: The driver never loads the entire result set
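
The on-demand behavior above can be sketched with a toy async generator. This is purely illustrative Python, not the real driver or the async-cassandra API: `fetch_page` is a hypothetical stand-in for one network round trip.

```python
import asyncio

# Illustrative sketch only: mimics on-demand paging. `fetch_page` is a
# hypothetical stand-in for one network round trip to the database.
async def fetch_page(page_num: int, fetch_size: int) -> list[str]:
    await asyncio.sleep(0)  # pretend network latency
    return [f"row-{page_num}-{i}" for i in range(fetch_size)]

async def stream_rows(total_pages: int, fetch_size: int):
    for page_num in range(total_pages):
        # The next page is requested only after the previous one is consumed,
        # so at most one page's worth of rows is alive at a time.
        page = await fetch_page(page_num, fetch_size)
        for row in page:
            yield row

async def main() -> list[str]:
    return [row async for row in stream_rows(total_pages=3, fetch_size=2)]

print(asyncio.run(main()))
# → ['row-0-0', 'row-0-1', 'row-1-0', 'row-1-1', 'row-2-0', 'row-2-1']
```

The real driver additionally overlaps fetching page N+1 with processing page N; the lazy-fetch shape is the same.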

### 3. Page Size Recommendations

```python
# Small pages (1000-5000 rows)
# ✅ Best for: Real-time processing, low memory usage, better responsiveness
# ❌ Trade-off: More network round trips
config = StreamConfig(fetch_size=1000)

# Medium pages (5000-10000 rows)
# ✅ Best for: General-purpose use, good balance
config = StreamConfig(fetch_size=5000)

# Large pages (10000-50000 rows)
# ✅ Best for: Bulk exports, batch processing, fewer round trips
# ❌ Trade-off: Higher memory usage, slower first results
config = StreamConfig(fetch_size=20000)
```

### 4. LIMIT vs Paging

**You don't need LIMIT with paging!**

```python
# ❌ UNNECESSARY - fetch_size already controls data flow
stmt = await session.prepare("SELECT * FROM users LIMIT ?")
async with await session.execute_stream(stmt, [1000]) as result:
    # This limits total results, not page size!
    ...

# ✅ CORRECT - Let paging handle the data flow
stmt = await session.prepare("SELECT * FROM users")
config = StreamConfig(fetch_size=1000)  # This controls page size
async with await session.execute_stream(stmt, stream_config=config) as result:
    # Process all data efficiently, page by page
    ...
```

### 5. Processing Patterns

#### Row-by-Row Processing
```python
# Process each row as it arrives
async with await session.execute_stream("SELECT * FROM large_table") as result:
    async for row in result:
        await process_row(row)  # Non-blocking, pages fetched as needed
```

#### Page-by-Page Processing
```python
# Process entire pages at once (e.g., for batch operations)
config = StreamConfig(fetch_size=5000)
async with await session.execute_stream("SELECT * FROM large_table", stream_config=config) as result:
    async for page in result.pages():
        # Process an entire page (a list of rows)
        await bulk_insert_to_warehouse(page)
```

### 6. Common Misconceptions

**Myth**: "The driver pre-fetches all pages"
**Reality**: Pages are fetched on demand as you consume data

**Myth**: "I need LIMIT to control memory usage"
**Reality**: `fetch_size` controls memory usage; LIMIT only caps the total number of results

**Myth**: "Larger pages are always better"
**Reality**: It depends on your use case; see the recommendations above

**Myth**: "I can skip the context manager"
**Reality**: Context managers are MANDATORY to prevent resource leaks

### 7. Performance Tips

1. **Match fetch_size to your processing speed**
   - Fast processing → larger pages
   - Slow processing → smaller pages

2. **Use page callbacks for monitoring**
   ```python
   config = StreamConfig(
       fetch_size=5000,
       page_callback=lambda page_num, total_rows:
           logger.info(f"Processing page {page_num}, total: {total_rows:,}")
   )
   ```

3. **Consider network latency**
   - High latency → larger pages (fewer round trips)
   - Low latency → smaller pages are fine

4. **Monitor memory usage**
   - Each page holds up to `fetch_size` rows in memory
   - Adjust based on row size and available memory
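
Tip 4 lends itself to a back-of-the-envelope estimate. The sketch assumes you know an approximate average encoded row size (e.g. from sampling a few rows); `estimated_page_bytes` is a hypothetical helper, not part of async-cassandra.

```python
# Rough per-page memory estimate: each in-flight page holds ~fetch_size rows.
def estimated_page_bytes(fetch_size: int, avg_row_bytes: int) -> int:
    return fetch_size * avg_row_bytes

# Example: 20,000-row pages with ~1 KiB rows ≈ 19.5 MiB held per page.
mib = estimated_page_bytes(20_000, 1024) / (1024 * 1024)
print(f"~{mib:.1f} MiB per page")
# → ~19.5 MiB per page
```

If that figure is uncomfortable for your available memory, lower `fetch_size` rather than reaching for LIMIT.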