Commit abc497e

docs and sql injection
1 parent e1d6fc4 commit abc497e

5 files changed, with 414 additions and 38 deletions

SQL_INJECTION_FIXES.md

Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
# SQL Injection Vulnerability Fixes

## Overview

This document describes the SQL (CQL) injection vulnerabilities found and fixed in the async-python-cassandra-client codebase.

## Critical Vulnerabilities Fixed

### 1. Dynamic UPDATE Query Construction
**Severity**: CRITICAL
**Location**: `examples/fastapi_app/main.py:578`

**Vulnerable Code**:
```python
query = f"UPDATE users SET {', '.join(update_fields)} WHERE id = ?"
```

**Issue**: User-controlled field names are interpolated directly into the query string.

**Fix**: Replaced with static prepared statements for all field combinations:
```python
# Prepare all possible UPDATE combinations
update_name_query = await session.prepare("UPDATE users SET name = ? WHERE id = ?")
update_email_query = await session.prepare("UPDATE users SET email = ? WHERE id = ?")
update_age_query = await session.prepare("UPDATE users SET age = ? WHERE id = ?")
# ... etc
```
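
Picking the right prepared statement at request time can be done with an allowlist keyed by field name, so that request data only selects a statement and never becomes part of the query text. The following is a minimal sketch under that assumption; `UPDATE_CQL`, `prepare_updates`, and `update_user` are hypothetical names for illustration and are not taken from `main.py`.

```python
# Hypothetical allowlist: column name -> static CQL text.
# The columns mirror the fix above; the helper names are assumptions.
UPDATE_CQL = {
    "name": "UPDATE users SET name = ? WHERE id = ?",
    "email": "UPDATE users SET email = ? WHERE id = ?",
    "age": "UPDATE users SET age = ? WHERE id = ?",
}


async def prepare_updates(session):
    """Prepare one statement per allowlisted column, once at startup."""
    return {field: await session.prepare(cql) for field, cql in UPDATE_CQL.items()}


async def update_user(session, prepared, user_id, changes):
    """Apply each requested change with its own prepared statement."""
    unknown = set(changes) - set(prepared)
    if unknown:
        # Reject anything outside the allowlist instead of building CQL from it.
        raise ValueError(f"Unsupported fields: {sorted(unknown)}")
    for field, value in changes.items():
        await session.execute(prepared[field], [value, user_id])
```

In this sketch each field is written by a separate statement. If several fields must change together, a batch or a dedicated multi-field statement would be needed.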

### 2. LIMIT Clause Injection
**Severity**: HIGH
**Locations**:
- `examples/fastapi_app/main_enhanced.py:258`
- `examples/fastapi_app/main_enhanced.py:311`

**Vulnerable Code**:
```python
result = await session.execute(f"SELECT * FROM users LIMIT {limit}", timeout=timeout)
```

**Issue**: The `limit` value is interpolated directly into the query string.

**Fix**: Use prepared statements:
```python
list_users_query = await session.prepare("SELECT * FROM users LIMIT ?")
result = await session.execute(list_users_query, [limit])
```
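
A bound `LIMIT ?` still needs an integer, so input that arrives as text has to be converted and range-checked before it is bound. A minimal sketch, assuming a hypothetical helper `parse_limit` and an arbitrary cap `MAX_LIMIT` (neither appears in the codebase):

```python
MAX_LIMIT = 1000  # Hypothetical cap chosen for this sketch.


def parse_limit(raw, default=10):
    """Return an int in [1, MAX_LIMIT]; raise ValueError for bad text or range."""
    if raw is None:
        return default
    # int() raises ValueError for non-numeric text such as "10; DROP TABLE users".
    limit = int(raw)
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    return limit
```

The returned value would then be passed as the bound parameter, for example `await session.execute(list_users_query, [parse_limit(raw_limit)])`, where `raw_limit` stands for whatever the request supplied.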

### 3. Table Name Injection
**Severity**: HIGH
**Location**: `examples/export_large_table.py` (multiple)

**Vulnerable Code**:
```python
result = await session.execute(f"SELECT COUNT(*) FROM {table_name}")
```

**Issue**: The table name is interpolated directly into the query string. Identifiers cannot be bound as parameters, so they must be validated instead.

**Fix**: Validate table names against the system schema before they are interpolated into a query:
```python
# Validate table exists in system schema
validation_query = await session.prepare(
    "SELECT table_name FROM system_schema.tables WHERE keyspace_name = ? AND table_name = ?"
)
result = await session.execute(validation_query, [keyspace, table_name])
if not result.one():
    raise ValueError(f"Table {table_name} does not exist in keyspace {keyspace}")
```
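
The schema lookup can be complemented by a purely syntactic check that runs before any query is sent. The sketch below assumes that only unquoted identifiers are in use; `qualified_table` is a hypothetical helper, and the 48-character cap is an assumption made for this example.

```python
import re

# Unquoted CQL identifiers: a letter followed by letters, digits, or underscores.
# The 48-character cap is an assumption for this sketch.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,47}")


def qualified_table(keyspace: str, table_name: str) -> str:
    """Return 'keyspace.table' if both parts look like plain identifiers."""
    for part in (keyspace, table_name):
        if not _IDENTIFIER.fullmatch(part):
            raise ValueError(f"Invalid CQL identifier: {part!r}")
    return f"{keyspace}.{table_name}"
```

A name that passes this check contains no quotes, spaces, or statement separators, so interpolating the returned string cannot change the structure of the query. The existence check against `system_schema.tables` is still useful for producing a clear error when the table is missing.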

### 4. Keyspace Interpolation
**Severity**: MEDIUM
**Location**: `examples/fastapi_app/main.py` (multiple)

**Vulnerable Code**:
```python
query = f"SELECT * FROM {keyspace}.users WHERE age = ? ALLOW FILTERING"
```

**Issue**: The keyspace name is interpolated into the query string.

**Fix**: Hardcode keyspace names:
```python
query = "SELECT * FROM user_management.users WHERE age = ? ALLOW FILTERING"
```

## Prevention Measures

### 1. New Test Suite
Created `tests/unit/test_sql_injection_protection.py` to verify:
- All queries use prepared statements
- No string interpolation in SQL
- Proper validation of identifiers
- Secure handling of dynamic queries

### 2. Code Patterns to Avoid
```python
# ❌ NEVER DO THIS
query = f"SELECT * FROM {table} WHERE {column} = {value}"
query = "SELECT * FROM users WHERE id = " + user_id
query = "SELECT * FROM users LIMIT %s" % limit

# ✅ ALWAYS DO THIS
query = await session.prepare("SELECT * FROM users WHERE id = ?")
result = await session.execute(query, [user_id])
```

### 3. Best Practices
1. **Always use prepared statements** for any user input
2. **Validate identifiers** against the system schema, since table and keyspace names cannot be bound as parameters
3. **Never interpolate unvalidated input** into queries
4. **Parameterize every value**, including LIMIT clauses
5. **Review all dynamic SQL** construction carefully

## Impact

- All example applications are now protected against the SQL injection issues listed above
- Test coverage helps keep future code secure
- Documentation updated to emphasize secure patterns
- No functionality lost: all features work with the secure implementation

## Verification

Run the SQL injection protection tests:
```bash
pytest tests/unit/test_sql_injection_protection.py -v
```

## References

- [OWASP SQL Injection Prevention](https://cheatsheetseries.owasp.org/cheatsheets/SQL_Injection_Prevention_Cheat_Sheet.html)
- [CQL Prepared Statements](https://docs.datastax.com/en/developer/python-driver/3.25/getting_started/#prepared-statements)

examples/export_large_table.py

Lines changed: 42 additions & 12 deletions
@@ -32,20 +32,30 @@
     logger.warning("aiofiles not installed - using synchronous file I/O")
 
 
-async def count_table_rows(session, table_name: str) -> int:
+async def count_table_rows(session, keyspace: str, table_name: str) -> int:
     """Count total rows in a table (approximate for large tables)."""
     # Note: COUNT(*) can be slow on large tables
     # Consider using token ranges for very large tables
-    result = await session.execute(f"SELECT COUNT(*) FROM {table_name}")
+    # Using system schema to validate table exists and avoid SQL injection
+    validation_query = await session.execute(
+        "SELECT table_name FROM system_schema.tables WHERE keyspace_name = ? AND table_name = ?",
+        [keyspace, table_name],
+    )
+    if not validation_query.one():
+        raise ValueError(f"Table {keyspace}.{table_name} does not exist")
+
+    # Safe to use table name after validation - but still use qualified name
+    # In production, consider using prepared statements even for COUNT queries
+    result = await session.execute(f"SELECT COUNT(*) FROM {keyspace}.{table_name}")
     return result.one()[0]
 
 
-async def export_table_async(session, table_name: str, output_file: str):
+async def export_table_async(session, keyspace: str, table_name: str, output_file: str):
     """Export table using async file I/O (requires aiofiles)."""
-    logger.info(f"Starting async export of {table_name} to {output_file}")
+    logger.info(f"Starting async export of {keyspace}.{table_name} to {output_file}")
 
     # Get approximate row count for progress tracking
-    total_rows = await count_table_rows(session, table_name)
+    total_rows = await count_table_rows(session, keyspace, table_name)
     logger.info(f"Table has approximately {total_rows:,} rows")
 
     # Configure streaming with progress callback
@@ -67,8 +77,16 @@ def progress_callback(page_num: int, rows_so_far: int):
     start_time = datetime.now()
 
     # CRITICAL: Use context manager for streaming to prevent memory leaks
+    # Validate table exists before streaming
+    validation_query = await session.execute(
+        "SELECT table_name FROM system_schema.tables WHERE keyspace_name = ? AND table_name = ?",
+        [keyspace, table_name],
+    )
+    if not validation_query.one():
+        raise ValueError(f"Table {keyspace}.{table_name} does not exist")
+
     async with await session.execute_stream(
-        f"SELECT * FROM {table_name}", stream_config=config
+        f"SELECT * FROM {keyspace}.{table_name}", stream_config=config
     ) as result:
         # Export to CSV
         async with aiofiles.open(output_file, "w", newline="") as f:
@@ -111,13 +129,13 @@ def progress_callback(page_num: int, rows_so_far: int):
     logger.info(f"- File size: {os.path.getsize(output_file):,} bytes")
 
 
-def export_table_sync(session, table_name: str, output_file: str):
+def export_table_sync(session, keyspace: str, table_name: str, output_file: str):
     """Export table using synchronous file I/O."""
-    logger.info(f"Starting sync export of {table_name} to {output_file}")
+    logger.info(f"Starting sync export of {keyspace}.{table_name} to {output_file}")
 
     async def _export():
         # Get approximate row count
-        total_rows = await count_table_rows(session, table_name)
+        total_rows = await count_table_rows(session, keyspace, table_name)
         logger.info(f"Table has approximately {total_rows:,} rows")
 
         # Configure streaming
@@ -133,8 +151,16 @@ async def _export():
         start_time = datetime.now()
 
         # Use context manager for proper streaming cleanup
+        # Validate table exists before streaming
+        validation_query = await session.execute(
+            "SELECT table_name FROM system_schema.tables WHERE keyspace_name = ? AND table_name = ?",
+            [keyspace, table_name],
+        )
+        if not validation_query.one():
+            raise ValueError(f"Table {keyspace}.{table_name} does not exist")
+
         async with await session.execute_stream(
-            f"SELECT * FROM {table_name}", stream_config=config
+            f"SELECT * FROM {keyspace}.{table_name}", stream_config=config
         ) as result:
             # Export to CSV synchronously
             with open(output_file, "w", newline="") as f:
@@ -272,9 +298,13 @@ async def main():
 
     # Export using async I/O if available
     if ASYNC_FILE_IO:
-        await export_table_async(session, "products", str(output_dir / "products_async.csv"))
+        await export_table_async(
+            session, "export_example", "products", str(output_dir / "products_async.csv")
+        )
     else:
-        await export_table_sync(session, "products", str(output_dir / "products_sync.csv"))
+        await export_table_sync(
+            session, "export_example", "products", str(output_dir / "products_sync.csv")
+        )
 
     # Cleanup (optional)
     logger.info("\nCleaning up...")
