fix(test): enable shuffle mode and fix test isolation bugs #6601
Add proper mock resets in beforeEach blocks to ensure tests are isolated and can run in any order with --sequence.shuffle=true.
Key changes:
- test/util/config/load.test.ts: Reset path.parse mock to actual implementation in beforeEach for combineConfigs, resolveConfigs, and resolveConfigs with external defaultTest blocks
- test/redteam/providers/iterative.test.ts: Add mockReset() for hoisted mocks (mockGetProvider, mockGetTargetResponse, mockCheckPenalizedPhrases, mockGetGraderById) since clearAllMocks only clears call history, not mockReturnValue implementations
- test/commands/modelScan.test.ts: Reset spawn, getModelAuditCurrentVersion, ModelAudit, and HuggingFace mocks in beforeEach for all describe blocks
The root cause was that vi.clearAllMocks() only clears call history but doesn't reset mockReturnValue/mockResolvedValue implementations. When tests set these values, they persist across tests unless explicitly reset with mockReset() or mockImplementation().
Fixes #2265
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
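To illustrate the root cause described in this commit message, here is a toy model of the difference between clearing call history and resetting a configured return value. `ToyMock` and its `invoke` method are hypothetical names invented for this sketch; it is not Vitest's implementation, only a loose analogy to `mockClear()` (what `vi.clearAllMocks()` applies to each mock) and `mockReset()`.

```typescript
// Toy stand-in for a mock function. NOT Vitest's implementation.
class ToyMock<T> {
  calls: unknown[][] = [];
  private impl: (() => T) | undefined;

  // Analogous to mockReturnValue: installs a persistent return value.
  mockReturnValue(value: T): this {
    this.impl = () => value;
    return this;
  }

  // Analogous to mockClear: forgets call history only.
  mockClear(): this {
    this.calls = [];
    return this;
  }

  // Analogous to mockReset: forgets call history and the configured return value.
  mockReset(): this {
    this.mockClear();
    this.impl = undefined;
    return this;
  }

  // Records the call and returns the configured value, if any.
  invoke(...args: unknown[]): T | undefined {
    this.calls.push(args);
    return this.impl ? this.impl() : undefined;
  }
}

// "Test A" configures the mock and uses it.
const getProvider = new ToyMock<string>();
getProvider.mockReturnValue('provider-from-test-A');
getProvider.invoke('a');

// A clear-only beforeEach: history is gone, but the return value leaks into "test B".
getProvider.mockClear();
const leaked = getProvider.invoke('b');

// A reset-style beforeEach: both history and return value are gone.
getProvider.mockReset();
const clean = getProvider.invoke('c');
```

In this sketch `leaked` still carries the value set by "test A", while `clean` is `undefined`. That is the kind of order-dependent state that only surfaces once tests are shuffled.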
Add sequence.shuffle: true to both vitest.config.ts and vitest.integration.config.ts to catch test isolation issues early.
This is the staff engineer approach:
- Single source of truth in config (not scattered across scripts)
- Applies to all test runs (local, CI, watch mode)
- Self-documenting with clear comments
- Override-able with --sequence.shuffle=false for debugging
Also updated test/AGENTS.md with:
- Documentation about shuffle being enabled by default
- Critical mock isolation guidance (vi.clearAllMocks vs mockReset)
- Override flags for debugging
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
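As a minimal sketch of the configuration change this commit describes, a Vitest config with shuffling enabled might look like the following. The PR's actual vitest.config.ts is not shown here, so everything other than the `sequence.shuffle` option is an assumption, and the real file contains further settings (such as its worker and timeout options).

```typescript
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    sequence: {
      // Run tests in random order to surface order-dependent tests early.
      // Override with --sequence.shuffle=false when debugging a failure.
      shuffle: true,
    },
  },
});
```

Keeping the option in the config file rather than in individual npm scripts means that local, CI, and watch-mode runs all pick it up.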
- Reset runExtensionHook mock in defaultTest normalization tests
- Add resolveConfigs mock setup in doGenerateRedteam external defaultTest tests
Both issues were caused by tests relying on mock implementations set by previous tests, which vi.clearAllMocks() doesn't reset.
- Add runExtensionHook mock reset to main evaluator describe block
- Add fetchWithTimeout mock reset to checkForUpdates describe block
- Clear PROMPTFOO_DISABLE_UPDATE env var in checkForUpdates beforeEach
Environment variables set by tests in one describe block were leaking to tests in other describe blocks when shuffle was enabled.
- evaluator.test.ts: Add runExtensionHook mock reset to defaultTest merging describe block
- accounts.test.ts: Add readGlobalConfig mock setup in setUserEmail test
- python.test.ts: Add path.resolve/path.extname mocks to 2 tests that relied on earlier test state
- testCaseReader.test.ts: Fix xlsx module mock by calling resetModules before doMock
These fixes ensure tests pass consistently regardless of execution order when running with shuffle enabled (vitest --sequence.shuffle=true).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
👍 All Clear
I reviewed this PR for LLM security vulnerabilities. The changes focus entirely on test infrastructure improvements - enabling random test execution order and fixing mock isolation issues. No LLM-related code was modified.
Minimum severity threshold for this scan: 🟡 Medium
📝 Walkthrough
This PR implements systematic test isolation improvements across the codebase by enabling random test execution and establishing comprehensive mock reset patterns. Changes include enabling test sequence randomization in Vitest configuration files (vitest.config.ts, vitest.integration.config.ts), adding explicit mock resets in beforeEach hooks across multiple test files (assertions, commands, evaluator, config, redteam, updates, and utilities), converting several beforeEach hooks to async for proper initialization, and documenting best practices for mock isolation in test documentation. The overall objective is preventing test state leakage when tests execute in random order.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Areas requiring extra attention:
Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 0
🧹 Nitpick comments (10)
test/updates.test.ts (1)
72-79: Stronger per-suite isolation for fetchWithTimeout and PROMPTFOO_DISABLE_UPDATE
Resetting `fetchWithTimeout` and clearing `PROMPTFOO_DISABLE_UPDATE` in this `beforeEach` makes the `checkForUpdates` suite deterministic and order-independent, even when other suites queue `mockResolvedValueOnce` calls or tweak that env var. The extra `mockReset` on top of the global one is redundant but harmless and keeps the intent local to this block. Based on learnings, this aligns with the isolation guidance in `test/AGENTS.md`.
test/redteam/commands/generate.test.ts (1)
1497-1526: Resetting resolveConfigs per-suite avoids hoisted mock leakage
Re-initializing `configModule.resolveConfigs` in this `beforeEach` is the right way to stop implementations from other redteam suites leaking into the “external defaultTest” tests, especially with hoisted module mocks and shuffled ordering. The neutral default config you return is minimal but sufficient for these scenarios.
test/evaluator.test.ts (3)
338-347: Resetting runExtensionHook in the main evaluator suite prevents hook leaks
Adding `mockReset()` followed by the default identity implementation in this `beforeEach` stops per-test overrides of `runExtensionHook` (e.g., in the sessionId tests later in the suite) from leaking across tests. Combined with `vi.clearAllMocks()`, this gives the evaluator tests a predictable starting hook state under shuffled execution.
4034-4040: Same runExtensionHook reset pattern correctly applied to defaultTest-merging tests
Using the same `runExtensionHook` reset in the `evaluator defaultTest merging` suite ensures those tests are not affected by hook behavior from the main evaluator block or vice versa. This is consistent with the isolation goal of the PR and keeps the extension-related assertions here trustworthy regardless of run order.
4369-4374: Hook normalization is especially important for extension-focused tests
For the `defaultTest normalization for extensions` suite, normalizing `runExtensionHook` before each test is critical, since these tests explicitly assert on how extensions manipulate `defaultTest`. Guaranteeing a clean, array-backed `runExtensionHook` mock per test avoids very subtle flakiness when other suites modify the same mock. This change aligns nicely with the new AGENTS guidance on extension hooks and defaultTest setup.
test/util/config/load.test.ts (2)
199-217: Path/glob setup in `combineConfigs` beforeEach improves isolation
The async `beforeEach` that clears/restores mocks, fixes `process.cwd`, and rewires `globSync` + `path.parse` back to the real implementation via `vi.importActual('path')` prevents pollution from other suites that also mock these APIs, which is important now that tests run in random order.
If you find yourself tweaking this in more places, consider a small helper (e.g. `resetPathAndGlobMocks()`) to DRY up the pattern. As per coding guidelines on test independence.
1360-1371: `resolveConfigs` beforeEach correctly resets process/mocking state
Resetting all mocks, re-spying `process.cwd`, and restoring `path.parse` from `vi.importActual('path')` ensures `resolveConfigs` tests don't inherit cwd/path/glob state from other describes. This is a good fit for shuffle-enabled runs and for the CLI-exit tests that rely on a clean `process` spy per test.
Same helper you might use for `combineConfigs` could also cover this to keep the reset logic in one place.
test/util/testCaseReader.test.ts (1)
505-512: Module reset + `vi.importActual` usage is correct, but comment is stale
Moving `vi.resetModules()` before the mocks and using:
    const actualFs = await vi.importActual<typeof import('fs')>('fs');
    vi.doMock('fs', () => ({
      ...actualFs,
      existsSync: vi.fn().mockReturnValue(true),
    }));
is the right way to get the real `fs` into a fresh module graph for this test. The remaining comment about “use require to get actual fs since `vi.importActual` may return mocked version” no longer matches the implementation and can be confusing.
    - // Mock fs module - use require to get actual fs since vi.importActual may return mocked version
    - const actualFs = await vi.importActual<typeof import('fs')>('fs');
    + // Mock fs module using the real Node fs implementation
    + const actualFs = await vi.importActual<typeof import('fs')>('fs');
You might also want to align the earlier XLSX test’s `vi.doMock('fs', () => ({ ...vi.importActual('fs'), ... }))` with this safer pattern.
As per coding guidelines about avoiding test pollution via module caches.
test/commands/modelScan.test.ts (1)
37-67: Shared beforeEach reset pattern correctly de-pollutes modelScan mocks
Across the various `describe` blocks you now:
- `vi.clearAllMocks()` per test,
- `mockReset()` the `child_process.spawn` mock,
- re-import and reset `getModelAuditCurrentVersion` to a known default,
- reset `ModelAudit.findByRevision`/`ModelAudit.create` to the default “no existing scan, fixed id” behavior, and
- reset HuggingFace helpers (`isHuggingFaceModel`, `getHuggingFaceMetadata`, `parseHuggingFaceModel`) to neutral values.
This removes hidden coupling between:
- CLI error-path tests,
- re-scan-on-version-change tests,
- installation detection (`checkModelAuditInstalled`), and
- temp-file / no-write behavior,
which is critical now that the test runner shuffles order. The process-exit spy setup/teardown per describe looks consistent with the move to `process.exitCode`.
Given the repetition, consider a small `async resetModelScanTestState()` helper shared by these beforeEach blocks to keep future changes to the default mock behavior in one place. As per coding guidelines on deterministic, order-independent tests.
Also applies to: 326-356, 546-572, 616-646, 865-895
test/AGENTS.md (1)
29-31: Shuffle + mock-isolation guidance matches implementation; consider heading tweak
The additions:
- documenting that tests run in random order by default (with `--sequence.shuffle`/`--sequence.seed` knobs), and
- the “Critical: Mock Isolation” section clarifying `vi.clearAllMocks()` vs `mockReset()` and showing a `beforeEach` pattern,
are exactly aligned with the changes in the test files (hoisted mocks + per-describe resets) and with the independence requirements in this repo.
To satisfy markdownlint (MD036) and improve structure, you could make the “Critical: Mock Isolation” label an actual heading instead of bold text:
    -**Critical: Mock Isolation**
    +### Critical: Mock Isolation
Based on learnings about documenting agent/test behavior in AGENTS.md.
Also applies to: 72-84
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
test/AGENTS.md (2 hunks)
test/assertions/python.test.ts (2 hunks)
test/commands/modelScan.test.ts (5 hunks)
test/evaluator.test.ts (3 hunks)
test/globalConfig/accounts.test.ts (1 hunks)
test/redteam/commands/generate.test.ts (1 hunks)
test/redteam/providers/iterative.test.ts (1 hunks)
test/updates.test.ts (1 hunks)
test/util/config/load.test.ts (6 hunks)
test/util/testCaseReader.test.ts (1 hunks)
vitest.config.ts (1 hunks)
vitest.integration.config.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{ts,tsx,js,jsx}: Follow consistent import order (Biome handles sorting)
Use consistent curly braces for all control statements
Prefer `const` over `let`; avoid `var`
Use object shorthand syntax whenever possible
Use `async`/`await` for asynchronous code
Files:
test/updates.test.ts, test/redteam/commands/generate.test.ts, test/globalConfig/accounts.test.ts, test/redteam/providers/iterative.test.ts, test/commands/modelScan.test.ts, test/util/testCaseReader.test.ts, vitest.config.ts, test/evaluator.test.ts, test/assertions/python.test.ts, vitest.integration.config.ts, test/util/config/load.test.ts
test/**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
Use Vitest for all tests (both `test/` and `src/app/`)
Files:
test/updates.test.ts, test/redteam/commands/generate.test.ts, test/globalConfig/accounts.test.ts, test/redteam/providers/iterative.test.ts, test/commands/modelScan.test.ts, test/util/testCaseReader.test.ts, test/evaluator.test.ts, test/assertions/python.test.ts, test/util/config/load.test.ts
test/**/*.test.{ts,tsx,js}
📄 CodeRabbit inference engine (AGENTS.md)
Backend tests in `test/` should use Vitest with globals enabled (`describe`, `it`, `expect` available without imports)
Files:
test/updates.test.ts, test/redteam/commands/generate.test.ts, test/globalConfig/accounts.test.ts, test/redteam/providers/iterative.test.ts, test/commands/modelScan.test.ts, test/util/testCaseReader.test.ts, test/evaluator.test.ts, test/assertions/python.test.ts, test/util/config/load.test.ts
test/**/*.test.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (test/AGENTS.md)
test/**/*.test.{ts,tsx,js,jsx}: Never increase test timeouts - fix the slow test instead
Never use `.only()` or `.skip()` in committed code
Always clean up mocks in `afterEach` using `vi.resetAllMocks()`
Import test utilities explicitly from 'vitest': `describe`, `it`, `expect`, `beforeEach`, `afterEach`, `vi`
Use Vitest's mocking utilities (`vi.mock`, `vi.fn`, `vi.spyOn`) rather than other mocking libraries
Prefer shallow mocking over deep mocking when using Vitest
Mock external dependencies but not the code being tested
Reset mocks between tests to prevent test pollution
Ensure all tests are independent and can run in any order
Clean up test data and mocks after each test
Test failures should be deterministic
For database tests, use in-memory instances or proper test fixtures
Files:
test/updates.test.ts, test/redteam/commands/generate.test.ts, test/globalConfig/accounts.test.ts, test/redteam/providers/iterative.test.ts, test/commands/modelScan.test.ts, test/util/testCaseReader.test.ts, test/evaluator.test.ts, test/assertions/python.test.ts, test/util/config/load.test.ts
test/**/AGENTS.md
📄 CodeRabbit inference engine (test/CLAUDE.md)
Document all agent implementations and capabilities in AGENTS.md
Files:
test/AGENTS.md
🧠 Learnings (32)
📓 Common learnings
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: test/AGENTS.md:0-0
Timestamp: 2025-12-10T02:05:13.021Z
Learning: Applies to test/**/*.test.{ts,tsx,js,jsx} : Ensure all tests are independent and can run in any order
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: test/AGENTS.md:0-0
Timestamp: 2025-12-10T02:05:13.021Z
Learning: Applies to test/**/*.test.{ts,tsx,js,jsx} : Reset mocks between tests to prevent test pollution
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: test/AGENTS.md:0-0
Timestamp: 2025-12-10T02:05:13.021Z
Learning: Applies to test/**/*.test.{ts,tsx,js,jsx} : Use Vitest's mocking utilities (`vi.mock`, `vi.fn`, `vi.spyOn`) rather than other mocking libraries
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: test/AGENTS.md:0-0
Timestamp: 2025-12-10T02:05:13.021Z
Learning: Always run tests with `--randomize` flag to ensure test independence
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: test/AGENTS.md:0-0
Timestamp: 2025-12-10T02:05:13.021Z
Learning: Applies to test/**/*.test.{ts,tsx,js,jsx} : Always clean up mocks in `afterEach` using `vi.resetAllMocks()`
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-09T06:08:02.324Z
Learning: Applies to test/**/*.{ts,tsx,js,jsx} : Use Vitest for all tests (both `test/` and `src/app/`)
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: src/app/AGENTS.md:0-0
Timestamp: 2025-12-09T06:08:48.482Z
Learning: Applies to src/app/**/*.test.{ts,tsx} : Use `vi.fn()` for mocks and `vi.mock()` for module mocking in Vitest test files
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: test/AGENTS.md:0-0
Timestamp: 2025-12-10T02:05:13.021Z
Learning: Applies to test/**/*.test.{ts,tsx,js,jsx} : Test failures should be deterministic
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: test/AGENTS.md:0-0
Timestamp: 2025-12-10T02:05:13.021Z
Learning: Applies to test/**/*.test.{ts,tsx,js,jsx} : Prefer shallow mocking over deep mocking when using Vitest
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: src/redteam/AGENTS.md:0-0
Timestamp: 2025-12-09T06:09:14.828Z
Learning: Applies to src/redteam/test/redteam/**/*.ts : Add tests for new red team plugins in `test/redteam/` directory following the pattern in `src/redteam/plugins/pii.ts`
📚 Learning: 2025-12-10T02:05:13.021Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: test/AGENTS.md:0-0
Timestamp: 2025-12-10T02:05:13.021Z
Learning: Applies to test/**/*.test.{ts,tsx,js,jsx} : Reset mocks between tests to prevent test pollution
Applied to files:
test/updates.test.ts, test/redteam/commands/generate.test.ts, test/globalConfig/accounts.test.ts, test/redteam/providers/iterative.test.ts, test/commands/modelScan.test.ts, test/util/testCaseReader.test.ts, test/AGENTS.md, test/evaluator.test.ts, test/assertions/python.test.ts, vitest.integration.config.ts, test/util/config/load.test.ts
📚 Learning: 2025-12-10T02:05:13.021Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: test/AGENTS.md:0-0
Timestamp: 2025-12-10T02:05:13.021Z
Learning: Applies to test/**/*.test.{ts,tsx,js,jsx} : Clean up test data and mocks after each test
Applied to files:
test/updates.test.ts, test/redteam/commands/generate.test.ts, test/globalConfig/accounts.test.ts, test/redteam/providers/iterative.test.ts, test/commands/modelScan.test.ts, test/util/testCaseReader.test.ts, test/evaluator.test.ts, test/util/config/load.test.ts
📚 Learning: 2025-12-10T02:05:13.021Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: test/AGENTS.md:0-0
Timestamp: 2025-12-10T02:05:13.021Z
Learning: Applies to test/**/*.test.{ts,tsx,js,jsx} : Always clean up mocks in `afterEach` using `vi.resetAllMocks()`
Applied to files:
test/updates.test.ts, test/redteam/commands/generate.test.ts, test/globalConfig/accounts.test.ts, test/redteam/providers/iterative.test.ts, test/commands/modelScan.test.ts, test/util/testCaseReader.test.ts, test/AGENTS.md, test/evaluator.test.ts, test/assertions/python.test.ts, test/util/config/load.test.ts
📚 Learning: 2025-12-10T02:05:13.021Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: test/AGENTS.md:0-0
Timestamp: 2025-12-10T02:05:13.021Z
Learning: Applies to test/**/*.test.{ts,tsx,js,jsx} : Ensure all tests are independent and can run in any order
Applied to files:
test/updates.test.ts, test/redteam/commands/generate.test.ts, test/redteam/providers/iterative.test.ts, test/commands/modelScan.test.ts, test/util/testCaseReader.test.ts, test/AGENTS.md, vitest.config.ts, test/evaluator.test.ts, test/assertions/python.test.ts, vitest.integration.config.ts, test/util/config/load.test.ts
📚 Learning: 2025-12-10T02:05:13.020Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: test/AGENTS.md:0-0
Timestamp: 2025-12-10T02:05:13.020Z
Learning: Applies to test/**/*.test.{ts,tsx,js,jsx} : Never increase test timeouts - fix the slow test instead
Applied to files:
test/updates.test.ts, test/AGENTS.md
📚 Learning: 2025-12-10T02:05:13.021Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: test/AGENTS.md:0-0
Timestamp: 2025-12-10T02:05:13.021Z
Learning: Applies to test/**/*.test.{ts,tsx,js,jsx} : For database tests, use in-memory instances or proper test fixtures
Applied to files:
test/updates.test.ts, test/redteam/commands/generate.test.ts, test/globalConfig/accounts.test.ts, test/commands/modelScan.test.ts, test/util/testCaseReader.test.ts, test/AGENTS.md
📚 Learning: 2025-12-10T02:05:13.021Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: test/AGENTS.md:0-0
Timestamp: 2025-12-10T02:05:13.021Z
Learning: Applies to test/**/*.test.{ts,tsx,js,jsx} : Mock external dependencies but not the code being tested
Applied to files:
test/updates.test.ts, test/redteam/commands/generate.test.ts, test/globalConfig/accounts.test.ts, test/redteam/providers/iterative.test.ts, test/commands/modelScan.test.ts, test/util/testCaseReader.test.ts, test/AGENTS.md, test/evaluator.test.ts, test/assertions/python.test.ts, test/util/config/load.test.ts
📚 Learning: 2025-12-09T06:09:06.028Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: src/providers/AGENTS.md:0-0
Timestamp: 2025-12-09T06:09:06.028Z
Learning: Applies to src/providers/test/providers/**/*.ts : Test provider success AND error cases, including rate limits, timeouts, and invalid configs
Applied to files:
test/updates.test.ts, test/redteam/commands/generate.test.ts, test/globalConfig/accounts.test.ts, test/redteam/providers/iterative.test.ts, test/AGENTS.md, test/evaluator.test.ts, test/util/config/load.test.ts
📚 Learning: 2025-12-09T06:09:06.028Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: src/providers/AGENTS.md:0-0
Timestamp: 2025-12-09T06:09:06.028Z
Learning: Applies to src/providers/test/providers/**/*.ts : Provider tests must NEVER make real API calls - mock all HTTP requests using `vi.mock`
Applied to files:
test/updates.test.ts, test/redteam/commands/generate.test.ts, test/globalConfig/accounts.test.ts, test/redteam/providers/iterative.test.ts, test/commands/modelScan.test.ts, test/util/testCaseReader.test.ts, test/AGENTS.md, test/evaluator.test.ts, test/assertions/python.test.ts, test/util/config/load.test.ts
📚 Learning: 2025-12-10T02:05:13.021Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: test/AGENTS.md:0-0
Timestamp: 2025-12-10T02:05:13.021Z
Learning: Applies to test/**/*.test.{ts,tsx,js,jsx} : Test failures should be deterministic
Applied to files:
test/updates.test.ts, test/redteam/commands/generate.test.ts, test/AGENTS.md, vitest.config.ts, test/assertions/python.test.ts, vitest.integration.config.ts, test/util/config/load.test.ts
📚 Learning: 2025-12-10T02:05:13.021Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: test/AGENTS.md:0-0
Timestamp: 2025-12-10T02:05:13.021Z
Learning: Applies to test/**/*.test.{ts,tsx,js,jsx} : Use Vitest's mocking utilities (`vi.mock`, `vi.fn`, `vi.spyOn`) rather than other mocking libraries
Applied to files:
test/updates.test.ts, test/redteam/commands/generate.test.ts, test/globalConfig/accounts.test.ts, test/redteam/providers/iterative.test.ts, test/commands/modelScan.test.ts, test/util/testCaseReader.test.ts, test/AGENTS.md, vitest.config.ts, test/evaluator.test.ts, test/assertions/python.test.ts, vitest.integration.config.ts, test/util/config/load.test.ts
📚 Learning: 2025-12-09T06:08:48.482Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: src/app/AGENTS.md:0-0
Timestamp: 2025-12-09T06:08:48.482Z
Learning: Applies to src/app/**/*.test.{ts,tsx} : Use `vi.fn()` for mocks and `vi.mock()` for module mocking in Vitest test files
Applied to files:
test/updates.test.ts, test/redteam/commands/generate.test.ts, test/globalConfig/accounts.test.ts, test/redteam/providers/iterative.test.ts, test/commands/modelScan.test.ts, test/util/testCaseReader.test.ts, test/AGENTS.md, vitest.config.ts, test/evaluator.test.ts, test/assertions/python.test.ts, vitest.integration.config.ts, test/util/config/load.test.ts
📚 Learning: 2025-12-10T02:05:13.021Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: test/AGENTS.md:0-0
Timestamp: 2025-12-10T02:05:13.021Z
Learning: Applies to test/**/*.test.{ts,tsx,js,jsx} : Import test utilities explicitly from 'vitest': `describe`, `it`, `expect`, `beforeEach`, `afterEach`, `vi`
Applied to files:
test/updates.test.ts, test/redteam/commands/generate.test.ts, test/redteam/providers/iterative.test.ts, test/commands/modelScan.test.ts, test/util/testCaseReader.test.ts, test/AGENTS.md, vitest.config.ts, test/evaluator.test.ts, test/assertions/python.test.ts, vitest.integration.config.ts, test/util/config/load.test.ts
📚 Learning: 2025-12-09T06:08:02.324Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-09T06:08:02.324Z
Learning: Applies to test/**/*.test.{ts,tsx,js} : Backend tests in `test/` should use Vitest with globals enabled (`describe`, `it`, `expect` available without imports)
Applied to files:
test/updates.test.ts, vitest.config.ts, test/evaluator.test.ts, test/assertions/python.test.ts, vitest.integration.config.ts, test/util/config/load.test.ts
📚 Learning: 2025-12-09T06:09:14.828Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: src/redteam/AGENTS.md:0-0
Timestamp: 2025-12-09T06:09:14.828Z
Learning: Applies to src/redteam/test/redteam/**/*.ts : Add tests for new red team plugins in `test/redteam/` directory following the pattern in `src/redteam/plugins/pii.ts`
Applied to files:
test/redteam/commands/generate.test.ts, test/redteam/providers/iterative.test.ts, vitest.config.ts, test/assertions/python.test.ts
📚 Learning: 2025-12-09T06:09:14.828Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: src/redteam/AGENTS.md:0-0
Timestamp: 2025-12-09T06:09:14.828Z
Learning: Applies to src/redteam/plugins/*.ts : Generate targeted test cases for specific vulnerability types in plugin implementations
Applied to files:
test/redteam/commands/generate.test.ts
📚 Learning: 2025-12-09T06:09:14.828Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: src/redteam/AGENTS.md:0-0
Timestamp: 2025-12-09T06:09:14.828Z
Learning: Applies to src/redteam/plugins/*.ts : Include assertions defining failure conditions in plugin test cases
Applied to files:
test/redteam/commands/generate.test.ts, test/redteam/providers/iterative.test.ts, test/assertions/python.test.ts
📚 Learning: 2025-12-10T02:05:13.021Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: test/AGENTS.md:0-0
Timestamp: 2025-12-10T02:05:13.021Z
Learning: Applies to test/providers/**/*.test.{ts,tsx,js,jsx} : For provider testing, include test coverage for: success case, error cases (4xx, 5xx, rate limits), configuration validation, and token usage tracking
Applied to files:
test/redteam/providers/iterative.test.ts, test/AGENTS.md, test/evaluator.test.ts
📚 Learning: 2025-12-10T02:05:13.021Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: test/AGENTS.md:0-0
Timestamp: 2025-12-10T02:05:13.021Z
Learning: Applies to test/**/*.test.{ts,tsx,js,jsx} : Prefer shallow mocking over deep mocking when using Vitest
Applied to files:
test/redteam/providers/iterative.test.ts, test/util/testCaseReader.test.ts, test/AGENTS.md, vitest.config.ts, test/evaluator.test.ts, test/assertions/python.test.ts, vitest.integration.config.ts, test/util/config/load.test.ts
📚 Learning: 2025-12-09T06:08:12.794Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: drizzle/AGENTS.md:0-0
Timestamp: 2025-12-09T06:08:12.794Z
Learning: Applies to drizzle/test/**/*.{js,ts} : Use in-memory SQLite databases in test files to verify migrations work correctly without affecting production data
Applied to files:
test/commands/modelScan.test.ts
📚 Learning: 2025-12-09T06:08:55.096Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: src/commands/AGENTS.md:0-0
Timestamp: 2025-12-09T06:08:55.096Z
Learning: Follow the standard CLI command structure demonstrated in `src/commands/eval.ts`: register with Commander, setup environment, track telemetry, use logger for output, and handle errors properly
Applied to files:
test/commands/modelScan.test.ts
📚 Learning: 2025-11-29T00:24:24.883Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: test/CLAUDE.md:0-0
Timestamp: 2025-11-29T00:24:24.883Z
Learning: Applies to test/**/AGENTS.md : Document all agent implementations and capabilities in AGENTS.md
Applied to files:
test/AGENTS.md
📚 Learning: 2025-11-29T00:24:20.916Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: src/server/CLAUDE.md:0-0
Timestamp: 2025-11-29T00:24:20.916Z
Learning: Applies to src/server/**/AGENTS.md : Maintain clear documentation of agent architecture and design decisions in AGENTS.md
Applied to files:
test/AGENTS.md
📚 Learning: 2025-11-29T00:24:07.021Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: src/commands/CLAUDE.md:0-0
Timestamp: 2025-11-29T00:24:07.021Z
Learning: Applies to src/commands/**/*.md : Documentation about agents should be maintained in AGENTS.md
Applied to files:
test/AGENTS.md
📚 Learning: 2025-11-29T00:24:20.916Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: src/server/CLAUDE.md:0-0
Timestamp: 2025-11-29T00:24:20.916Z
Learning: Applies to src/server/**/AGENTS.md : Document agent responsibilities, capabilities, and interactions in AGENTS.md
Applied to files:
test/AGENTS.md
📚 Learning: 2025-11-29T00:24:17.021Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: src/redteam/CLAUDE.md:0-0
Timestamp: 2025-11-29T00:24:17.021Z
Learning: Applies to src/redteam/**/*agent*.{ts,tsx,js,jsx} : Maintain clear agent interface definitions and usage patterns
Applied to files:
test/AGENTS.md
📚 Learning: 2025-12-10T02:05:13.021Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: test/AGENTS.md:0-0
Timestamp: 2025-12-10T02:05:13.021Z
Learning: Always run tests with `--randomize` flag to ensure test independence
Applied to files:
test/AGENTS.md, vitest.config.ts, vitest.integration.config.ts
📚 Learning: 2025-12-09T06:08:02.324Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-09T06:08:02.324Z
Learning: Applies to test/**/*.{ts,tsx,js,jsx} : Use Vitest for all tests (both `test/` and `src/app/`)
Applied to files:
test/AGENTS.md, vitest.config.ts, vitest.integration.config.ts
📚 Learning: 2025-07-18T17:25:57.700Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: .cursor/rules/gh-cli-workflow.mdc:0-0
Timestamp: 2025-07-18T17:25:57.700Z
Learning: Applies to **/*.{test,spec}.{js,ts,jsx,tsx} : Avoid disabling or skipping tests unless absolutely necessary and documented
Applied to files:
test/AGENTS.md
📚 Learning: 2025-12-09T06:08:02.324Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-09T06:08:02.324Z
Learning: Applies to src/app/**/*.test.{ts,tsx} : Frontend tests in `src/app/` should use Vitest with explicit imports
Applied to files:
vitest.config.ts, test/assertions/python.test.ts, vitest.integration.config.ts, test/util/config/load.test.ts
📚 Learning: 2025-10-06T03:43:01.653Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-10-06T03:43:01.653Z
Learning: Applies to test/**/*.{ts,tsx,js,jsx} : Follow Jest best practices using describe and it blocks in tests
Applied to files:
test/util/config/load.test.ts
🧬 Code graph analysis (4)
test/updates.test.ts (1)
src/util/fetch/index.ts (1)
`fetchWithTimeout` (141-173)
test/globalConfig/accounts.test.ts (1)
src/globalConfig/globalConfig.ts (1)
`readGlobalConfig` (21-39)
test/evaluator.test.ts (1)
src/evaluatorHelpers.ts (1)
`runExtensionHook` (542-595)
test/assertions/python.test.ts (3)
scripts/generate-blog-image.cjs (1)
`path` (3-3)
scripts/update-changelog-version.cjs (1)
`path` (11-11)
scripts/validate-changelog.cjs (1)
`path` (13-13)
🪛 markdownlint-cli2 (0.18.1)
test/AGENTS.md
72-72: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: Test on Node 24.x and ubuntu-latest
- GitHub Check: webui tests
- GitHub Check: Test on Node 24.x and windows-latest
- GitHub Check: Build Docs
- GitHub Check: Share Test
- GitHub Check: Test on Node 22.x and ubuntu-latest
- GitHub Check: Test on Node 20.x and ubuntu-latest
- GitHub Check: Test on Node 20.x and macOS-latest
- GitHub Check: Test on Node 20.x and windows-latest
- GitHub Check: Test on Node 22.x and macOS-latest
- GitHub Check: Test on Node 22.x and windows-latest
- GitHub Check: Redteam (Staging API)
- GitHub Check: Redteam (Production API)
- GitHub Check: Generate Assets
- GitHub Check: Build on Node 24.x
- GitHub Check: Build on Node 22.x
- GitHub Check: Build on Node 20.x
- GitHub Check: security-scan
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (7)
test/globalConfig/accounts.test.ts (1)
153-164: Explicitly mocking readGlobalConfig avoids cross-test leakage
Mocking `readGlobalConfig` to a minimal `{ id: 'test-id' }` here ensures `setUserEmail` is exercised against a clean config and won’t accidentally inherit `account` state from other tests when order is randomized. This is a good, tight fix for test isolation.
test/assertions/python.test.ts (1)
379-447: Per-test path.resolve/extname mocks fix order dependency
Adding explicit `path.resolve`/`path.extname` mocks in these tests ensures file‑based Python assertions no longer depend on whatever values previous tests left on the shared `path` mock. This matches the isolation guidance for shared mocks and makes these cases safe under shuffled execution.
vitest.config.ts (1)
20-26: Enabling sequence.shuffle with clear guidance looks good
Turning on `sequence.shuffle` by default (with comments on how to disable for debugging) is aligned with the test AGENTS guidance about catching order‑coupled tests early. The placement under `test` config is appropriate and should work well with the existing forked worker setup.
vitest.integration.config.ts (1)
20-26: Consistent randomized sequencing for integration tests
Mirroring `sequence.shuffle: true` into the integration config keeps unit and integration suites aligned on “random by default, override for debugging,” which is exactly what the updated testing guidance calls for. No issues with the surrounding fork/timeout settings.
test/redteam/providers/iterative.test.ts (1)
47-56: Hoisted mocks are now correctly reset for shuffle-safe isolation
Using `mockReset()` on the hoisted mocks in `beforeEach` (on top of `vi.clearAllMocks()`) ensures both call history and implementations are cleared between tests, which is exactly what you want with `vi.hoisted` + random test order. This prevents leaked `mockReturnValue`/`mockResolvedValueOnce` state across describes.
As per coding guidelines about resetting mocks between tests and the new mock isolation rules in `test/AGENTS.md`, this looks solid.
test/util/config/load.test.ts (2)
1650-1661: `readConfig` beforeEach correctly resets `$RefParser` and `path.parse`
The new async `beforeEach` that:
- clears/restores mocks,
- resets `path.parse` to the real implementation, and
- `mockReset()`s `mockDereference` and re-establishes a pass-through implementation,
prevents queued `mockResolvedValueOnce` calls or custom `path.parse` implementations from other tests leaking into `readConfig` behavior.
This directly implements the mock-isolation guidance for hoisted mocks in `test/AGENTS.md`.
1851-1861: `resolveConfigs with external defaultTest` now has deterministic basePath, deref, and glob behavior
This `beforeEach` does three important things for isolation:
- Sets `cliState.basePath` to a known value.
- Resets `mockDereference` to a pass-through implementation.
- Restores `path.parse` via `vi.importActual('path')`.
The explicit `vi.mocked(globSync).mockReturnValue(['config.json']);` also removes any dependence on prior glob mocks. Together these make the external-defaultTest scenario stable under random test ordering.
If any other suites depend on `cliState.basePath`’s default value, double-check they explicitly set it in their own setup so this mutation can’t leak across files. Based on learnings about test independence.
Also applies to: 1881-1881
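Besides having each suite set the value explicitly, one way to keep a mutation of shared state contained is to snapshot it and restore it afterwards. The sketch below is illustrative only: the `cliState` object is a hypothetical stand-in for the real module, and `withBasePath` is a helper name invented for this example, not something in the repository.

```typescript
// Hypothetical stand-in for a shared module-level singleton such as `cliState`.
const cliState: { basePath?: string } = {};

// Hypothetical helper: runs `fn` with `basePath` set, then restores the previous
// value even if `fn` throws, so the mutation cannot leak into later tests.
function withBasePath<T>(basePath: string, fn: () => T): T {
  const hadKey = 'basePath' in cliState;
  const previous = cliState.basePath;
  cliState.basePath = basePath;
  try {
    return fn();
  } finally {
    if (hadKey) {
      cliState.basePath = previous;
    } else {
      delete cliState.basePath;
    }
  }
}

// Inside the callback the known value is visible; afterwards the state is untouched.
const seenInside = withBasePath('/mock/base', () => cliState.basePath);
console.log(seenInside, cliState.basePath);
```

In a real suite the same snapshot-and-restore step would more likely live in `beforeEach`/`afterEach` hooks than in a wrapper function.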
Replace hard-coded /tmp paths with path.join(os.tmpdir(), ...) to fix test failures on Windows where /tmp doesn't exist. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
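A minimal sketch of the portable-path pattern this commit describes. The helper name `tempPath` and the segment names `promptfoo-test` and `output.json` are hypothetical, chosen only for illustration.

```typescript
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical helper: builds a path under the platform's temp directory
// instead of assuming that /tmp exists.
function tempPath(...segments: string[]): string {
  return path.join(os.tmpdir(), ...segments);
}

// The resulting prefix depends on the platform and environment.
const outputFile = tempPath('promptfoo-test', 'output.json');
console.log(outputFile);
```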
Add mockReset() calls for mockedCheckModelAuditInstalled and mockedSpawn in beforeEach to ensure test isolation when tests run in random order. vi.clearAllMocks() only clears call history, not mock implementations set via mockResolvedValue/mockReturnValue. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
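The difference between clearing and resetting can be illustrated with a toy model. The `ToyMock` class below is a hypothetical stand-in written for this sketch, not Vitest's implementation; it only mimics the behavior described in the commit message, where clearing drops call history and resetting also drops configured return values.

```typescript
// Toy stand-in for a mock function. This is NOT Vitest's implementation; it only
// models the behavior described in the commit message above.
class ToyMock<T> {
  calls: unknown[][] = [];
  private onceQueue: T[] = [];
  private returnValue: T | undefined = undefined;

  mockReturnValue(value: T): this {
    this.returnValue = value;
    return this;
  }

  mockReturnValueOnce(value: T): this {
    this.onceQueue.push(value);
    return this;
  }

  // Models clearing: call history is dropped, configured values are kept.
  mockClear(): this {
    this.calls = [];
    return this;
  }

  // Models resetting: call history and configured values are both dropped.
  mockReset(): this {
    this.calls = [];
    this.onceQueue = [];
    this.returnValue = undefined;
    return this;
  }

  // Records the call, then serves a queued one-time value before the default.
  call(...args: unknown[]): T | undefined {
    this.calls.push(args);
    return this.onceQueue.length > 0 ? this.onceQueue.shift() : this.returnValue;
  }
}

// "Test A" configures the mock and leaves a queued value behind.
const leakyMock = new ToyMock<string>();
leakyMock.mockReturnValue('installed').mockReturnValueOnce('queued');

// Clearing alone: history is gone, but "test B" still sees test A's values.
leakyMock.mockClear();
const afterClear = [leakyMock.call(), leakyMock.call()];

// Resetting: "test B" starts from a blank mock.
leakyMock.mockReset();
const afterReset = leakyMock.call();

console.log(afterClear, afterReset);
```

With shuffled ordering, which test plays the role of "test A" changes from run to run, which is why the leak only shows up intermittently.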
JustinBeckwith
left a comment
Never seen this! Very cool.
Add mockReset() and mockImplementation() for runExtensionHook in the 'Evaluator with external defaultTest' describe block's beforeEach hook. This ensures the mock is properly reset between tests when running with shuffle mode enabled, fixing the Windows Node 24 CI failure. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
sequence: { shuffle: true })
Closes #2265
Test isolation fixes
- `load.test.ts`: `path.parse` and `mockDereference` mocks persisted; `mockReset()` in 4 describe blocks
- `iterative.test.ts`: `mockReset()` for all hoisted mocks
- `modelScan.test.ts`
- `evaluator.test.ts`: `runExtensionHook` mock in 3 describe blocks; `mockReset()` + restore default implementation
- `generate.test.ts`: `resolveConfigs` mock had no default
- `updates.test.ts`: `PROMPTFOO_DISABLE_UPDATE` env var pollution; `delete process.env.X` in beforeEach
- `accounts.test.ts`: `readGlobalConfig` mock state leaked
- `python.test.ts`: `path.resolve`/`path.extname` mocks missing
- `testCaseReader.test.ts`: `resetModules()` before `doMock`
- `watsonx.test.ts`: `WatsonXAI.newInstance` mock missing
Key learnings documented in AGENTS.md
- `vi.clearAllMocks()` only clears call history, NOT `mockImplementation()` - use `mockReset()`
- `mockResolvedValueOnce()` queues survive `clearAllMocks()` - use `mockReset()` to clear
- `vi.importActual` to return mocked modules - call `resetModules()` first
Test plan
--sequence.shuffle=false)