@kgrgpg commented Aug 12, 2025

Overview

This PR cleans up and unifies the fuzzy-testing stack so we can generate deterministic scenario CSVs, produce Cadence tests with strict precision checks, and generate a per-step drift report to diagnose any mismatches. Legacy tests remain intact and continue to serve as the baseline for Scenarios 1–3.

Why

  • Prior branch iterations mixed several approaches and files, making it hard to trust results
  • We need strict precision (±0.0000001) and end-to-end repeatability
  • We want a one-command flow that generates CSVs → tests → drift report, plus compact, consistent scenario numbering

Key changes

  • Compact CSV numbering (1–9)

    • Removed Scenario 4 (Scaling) from the suite; it requires per-row resets and is excluded by design
    • Renumbered the previous 5–10 as 4–9 in CSVs and generation
    • Final CSV set in repo root:
      • Scenario1_FLOW.csv
      • Scenario2_Instant.csv
      • Scenario3_Path_{A,B,C,D}_precise.csv
      • Scenario4_VolatileMarkets.csv
      • Scenario5_GradualTrends.csv
      • Scenario6_EdgeCases.csv
      • Scenario7_MultiStepPaths_{Bear,Bull,Sideways,Crisis}.csv
      • Scenario8_RandomWalks.csv
      • Scenario9_ExtremeShocks_{FlashCrash,Rebound,YieldHyperInflate,MixedShock}.csv
  • Scenario splits

    • Split Scenario 7 (MultiStepPaths) into four per-case CSVs
    • Split Scenario 9 (ExtremeShocks) into four per-shock CSVs
    • Each per-case file is independent and starts from a fresh baseline
  • Generator (Cadence tests)

    • Outputs tests directly into cadence/tests/
    • Uses strict ±0.0000001 tolerance for all asserts
    • Adds per-step DRIFT logging (machine-parsable); a parsing sketch follows this list:
      • DRIFT|<Label>|<step>|<actualDebt>|<expectedDebt>|<actualY>|<expectedY>|<actualColl>|<expectedColl>
    • Logs every step and performs a single final assertion that flags whether any step exceeded tolerance, so the drift report covers every row rather than stopping at the first failure
    • Aligns step-0 ordering with the simulator: open at baseline 1.0/1.0, set step-0 prices, replay CSV Actions in order, then validate
  • Drift report

    • precision_reports/generate_drift_report.py discovers scenario tests dynamically and runs them
    • Aggregates per-step deltas (absolute and percent) into precision_reports/UNIFIED_FUZZY_DRIFT_REPORT.md; the sketch after this list illustrates the delta math
  • One-command runner

    • scripts/run_fuzzy.sh archives previous artifacts and runs the full pipeline:
      • Archives previous Scenario*.csv, generated tests, and the last drift report into archives/fuzzy_run_<timestamp>/
      • Ensures a Python venv with pandas
      • Generates CSVs → generates tests → rebuilds the drift report
  • Documentation

    • Added FUZZY_TESTING.md with clear instructions, and linked the one-command shortcut
    • Kept only the essential docs files: README.md, UNIFIED_TEST_SUITE.md, FUZZY_TESTING.md, and precision_reports/UNIFIED_FUZZY_DRIFT_REPORT.md
  • Cleanup

    • Removed outdated markdown reports and previous generator/test artifacts
    • Preserved legacy Cadence tests: rebalance_scenario{1,2}_test.cdc and rebalance_scenario3{a,b,c,d}_test.cdc
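
As referenced in the DRIFT-logging and drift-report items above, here is a minimal, self-contained Python sketch of how a DRIFT line could be parsed and reduced to the absolute/percent deltas the report aggregates. It assumes only the pipe-delimited format shown above; it is not the code in precision_reports/generate_drift_report.py, and the example values are made up.

```python
# Sketch only (not generate_drift_report.py): parse one DRIFT line emitted by a
# generated test and compute absolute and percent deltas for each tracked field.
FIELDS = ["Debt", "YieldUnits", "Collateral"]

def parse_drift_line(line):
    # DRIFT|<Label>|<step>|<actualDebt>|<expectedDebt>|<actualY>|<expectedY>|<actualColl>|<expectedColl>
    parts = line.strip().split("|")
    assert parts[0] == "DRIFT" and len(parts) == 9, "unexpected DRIFT format"
    label, step = parts[1], int(parts[2])
    values = [float(v) for v in parts[3:]]
    deltas = {}
    for i, name in enumerate(FIELDS):
        actual, expected = values[2 * i], values[2 * i + 1]
        abs_delta = abs(actual - expected)
        pct_delta = abs_delta / abs(expected) * 100.0 if expected != 0 else float("inf")
        deltas[name] = {"abs": abs_delta, "pct": pct_delta}
    return {"label": label, "step": step, "deltas": deltas}

# Example with made-up numbers:
row = parse_drift_line("DRIFT|Scenario4_VolatileMarkets|3|100.0000002|100.0|5.0|5.0|250.0|250.0")
print(row["deltas"]["Debt"]["abs"])  # ≈ 2e-07, which would exceed the ±0.0000001 tolerance
```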

How to run locally

  • One command (recommended):
      bash scripts/run_fuzzy.sh
  • Or step-by-step:
      # 1) Generate CSVs
      python3 tidal_simulator.py

      # 2) Generate tests from CSVs
      python3 generate_cadence_tests.py

      # 3) Build drift report
      python3 precision_reports/generate_drift_report.py
  • Inspect the outputs:
    • CSVs: repo root (Scenario*.csv)
    • Cadence tests: cadence/tests/rebalance_scenario*_test.cdc
    • Drift report: precision_reports/UNIFIED_FUZZY_DRIFT_REPORT.md

Precision and semantics

  • Tests use strict ±0.0000001 for Debt, YieldUnits, and Collateral
  • DRIFT lines are logged at each step; the test fails once at the end if any step deviates beyond tolerance — enabling a complete per-row drift picture
  • Where large drifts appear, they typically indicate scenario/action-ordering mismatches between CSV expectations and on-chain rebalancing semantics, rather than changes to the math formulas
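
Purely to illustrate the log-everything-then-assert-once pattern, a small self-contained Python sketch (not the generated Cadence code; the step data and the simplified single-field DRIFT line are made up):

```python
# "Log every step, assert once at the end": drift is recorded for all steps and
# the test only fails afterwards, so the report still sees the complete run.
TOLERANCE = 0.0000001  # strict ±1e-7, applied to Debt, YieldUnits, and Collateral

steps = [  # (step, actualDebt, expectedDebt) -- made-up values for illustration
    (0, 100.0, 100.0),
    (1, 100.00000005, 100.0),
    (2, 100.00000008, 100.0),
]

any_exceeded = False
for step, actual, expected in steps:
    print(f"DRIFT|Example|{step}|{actual}|{expected}")  # every step is logged...
    if abs(actual - expected) > TOLERANCE:
        any_exceeded = True                             # ...and a failure is only recorded here

assert not any_exceeded, "at least one step exceeded the ±0.0000001 tolerance"
```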

Verification notes

  • Scenarios 1–3 match legacy expectations under strict precision (these legacy tests remain in the repo)
  • Scenario 4 (Scaling) is intentionally excluded to avoid per-row resets
  • Multi-path and random-walk scenarios now surface full step-by-step drifts to identify semantic gaps

Impact

  • Test generation and naming updated per compact numbering and per-case splits
  • Legacy tests remain intact and continue to be usable
  • CI (if any) can call bash scripts/run_fuzzy.sh to regenerate artifacts and rebuild the drift report

Follow-ups (optional)

  • If desired, adjust CSV Actions semantics or generator step order on specific rows to eliminate remaining large drifts while maintaining the strict tolerance
  • Consider a CI job artifact to upload UNIFIED_FUZZY_DRIFT_REPORT.md for quick review on PRs

Checklist

  • Precision: kept at ±0.0000001
  • Legacy tests: preserved and still passing
  • Docs: updated (FUZZY_TESTING.md) and simplified
  • Runner: added scripts/run_fuzzy.sh for archive + regen + report

nialexsan and others added 20 commits July 25, 2025 11:47
- Created Python simulator (tidal_simulator.py) with 9-decimal precision
- Implemented extended simulator for scenarios 1-10 with fuzzy testing
- Generated 13 Cadence test files from CSV scenarios
- Fixed yield price monotonic constraint (only increase/stay same)
- Added comprehensive test generation framework (generate_cadence_tests.py)
- Fixed test patterns to match existing tests (force rebalancing, proper measurements)
- Created precision comparison and reporting tools
- Fixed position ID issues and Cadence syntax errors
- Added extensive documentation and test reports

Test Status:
- 8/13 generated tests passing (62%)
- 3/13 have minor calculation variances (1-3%)
- 2/13 have syntax issues to fix

All existing tests continue to pass (100%)
- Added 7 test scenarios with CSV configurations for various edge cases
- Implemented tidal_protocol_simulator.py for protocol simulation
- Created simple_cadence_test_generator.py for automated test generation
- Generated Cadence test files for scenarios 1-5 and 3a-3d variations
- Added unified test suite documentation
- Included precision analysis and comparison reports
- Updated existing scenario CSV files with refined test parameters
…io7/9 per case; strict 1e-7 comparisons with per-step DRIFT logging; new run_fuzzy.sh; cleaned outdated docs; preserved legacy tests; updated FUZZY_TESTING.md and drift report.
…1 post-rebalance semantics; drift report covers split tests
…splits S8; drift report includes S8 per-walk; scenario mapping + strict tolerance retained
… and regenerate tests; update drift report; keep strict tolerance