Commit 4bc51be
Fix integer overflow in CSV parser causing segfault
Fixes #63089
When parsing scientific notation in CSV files, extremely large exponent
values (e.g., '4e492493924924') caused integer overflow in the exponent
accumulation loop, leading to undefined behavior and segmentation faults.
The issue occurred in xstrtod() at pandas/_libs/src/parser/tokenizer.c
where exponent digits were accumulated without bounds checking:
    int n = 0;
    while (isdigit_ascii(*p)) {
        n = n * 10 + (*p - '0');  // Overflow here with large exponents
        ...
    }
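To make the failure point concrete, the following sketch reports which exponent digit would push the accumulator past the limit of a 32-bit signed integer. It is only an illustration: `first_overflow_digit` is a hypothetical helper that does not exist in pandas, and it assumes `int` is 32 bits wide (modeled here with `int32_t`). The bound is checked before the multiplication, so the helper itself never overflows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of pandas. Returns the 1-based index of the
 * first exponent digit whose accumulation would exceed INT32_MAX, or 0 if
 * every digit fits. */
static int first_overflow_digit(const char *p) {
    assert(p != NULL);
    int32_t n = 0;
    int index = 0;
    while (*p >= '0' && *p <= '9') {
        int32_t d = *p - '0';
        index++;
        /* n * 10 + d <= INT32_MAX  <=>  n <= (INT32_MAX - d) / 10 */
        if (n > (INT32_MAX - d) / 10) {
            return index;
        }
        n = n * 10 + d;
        p++;
    }
    return 0;
}
```

Under these assumptions, the exponent from the report ("492493924924") would already overflow while accumulating its tenth digit, whereas an ordinary exponent such as "308" fits easily.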
Solution:
- Add a maximum exponent digits cap (MAX_EXPONENT_DIGITS = 4) to prevent
  overflow while still allowing valid scientific notation
- Continue consuming remaining digits to maintain correct parsing position
- The capped value (up to 9999) is sufficient since the subsequent range
  check (DBL_MIN_EXP to DBL_MAX_EXP) will catch invalid exponents
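The capped loop described above could look roughly like the following stand-alone sketch. The function name `parse_exponent_capped`, its `end` out-parameter, and the `num_digits` counter are assumptions made for illustration; only `MAX_EXPONENT_DIGITS = 4` comes from the commit message, and the actual change inside xstrtod() may be structured differently.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EXPONENT_DIGITS 4 /* cap named in the commit message */

/* Hypothetical helper, not the pandas implementation. Accumulates at most
 * MAX_EXPONENT_DIGITS digits into the result, but keeps consuming any
 * further digits so the caller resumes parsing after the whole exponent. */
static int parse_exponent_capped(const char *p, const char **end) {
    assert(p != NULL && end != NULL);
    int n = 0;
    int num_digits = 0;
    while (*p >= '0' && *p <= '9') {
        if (num_digits < MAX_EXPONENT_DIGITS) {
            n = n * 10 + (*p - '0'); /* at most 9999, cannot overflow */
            num_digits++;
        }
        p++; /* always advance, even once the cap is reached */
    }
    *end = p; /* first character after the exponent digits */
    return n;
}
```

With this sketch, the exponent "492493924924" yields 4924 and leaves `end` pointing just past the last digit; a value that large is then expected to be rejected by the later range check. Note that the sketch counts leading zeros toward the cap, which may or may not match the actual patch.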
This fix prevents the overflow while maintaining correct parsing behavior
for both valid and invalid exponent values.
Signed-off-by: Samaresh Kumar Singh <ssam3003@gmail.com>
1 parent 415830f, commit 4bc51be
File tree
3 files changed, +55 -1 lines changed
- pandas
  - _libs/src/parser
  - tests/io/parser
[Diff content not captured. The surviving line numbering shows three changed files: one modified near lines 1510-1523 (7 lines added, 1 removed), one with 37 added lines, and one with 11 added lines.]