From 03b83e4e1a3740a58bf28045ea4f166dfd05c88a Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Wed, 17 Dec 2025 21:16:43 +0000
Subject: [PATCH 1/2] Initial plan

From be670fe990bfd19ae32f8c8339f2c9816230c188 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Wed, 17 Dec 2025 21:21:58 +0000
Subject: [PATCH 2/2] Add spec.md and progress.md documenting PyRIT 0.10.0 SQLite changes

Co-authored-by: slister1001 <103153180+slister1001@users.noreply.github.com>
---
 .../azure/ai/evaluation/red_team/progress.md | 136 ++++++++++++++++++
 .../azure/ai/evaluation/red_team/spec.md | 97 +++++++++++++
 2 files changed, 233 insertions(+)
 create mode 100644 sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/red_team/progress.md
 create mode 100644 sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/red_team/spec.md

diff --git a/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/red_team/progress.md b/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/red_team/progress.md
new file mode 100644
index 000000000000..20c334b72785
--- /dev/null
+++ b/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/red_team/progress.md
@@ -0,0 +1,136 @@
+# PyRIT FoundryScenario Integration - Progress Tracker
+
+**Last Updated:** 2025-12-17
+**Current Phase:** Planning Complete
+**Next Milestone:** Phase 1 Implementation Start
+
+## Executive Summary
+
+This document tracks the progress of integrating PyRIT's FoundryScenario into Azure AI Evaluation's red teaming framework. The integration targets PyRIT 0.10.0, which removed DuckDB support in favor of SQLite as the sole memory backend.
+
+## Design Decisions
+
+| Decision | Rationale | Date |
+|----------|-----------|------|
+| Target PyRIT 0.10.0+ | Latest stable version with SQLite-only backend | 2025-12-17 |
+| Use SQLite memory | Only option in PyRIT 0.10.0+ (DuckDB removed) | 2025-12-17 |
+| Integrate FoundryScenario | Leverage PyRIT's scenario framework for structured attacks | 2025-12-17 |
+| Maintain backward compatibility | Preserve existing red team API surface | 2025-12-17 |
+| Store DB in output directory | Co-locate memory with scan results | 2025-12-17 |
+
+## Phase Progress
+
+### Phase 0: Planning ✅ (Completed 2025-12-17)
+
+- [x] Research PyRIT 0.10.0 changes
+- [x] Identify DuckDB removal and SQLite migration
+- [x] Document breaking changes in current codebase
+- [x] Define technical specification
+- [x] Create implementation roadmap
+
+### Phase 1: Implementation ⬜ (Week 1-2)
+
+- [ ] Update `_red_team.py` to use SQLite initialization
+- [ ] Create `_foundry_scenario.py` implementation
+- [ ] Create `_scenario_manager.py` for lifecycle management
+- [ ] Integrate with `_orchestrator_manager.py`
+- [ ] Update imports and dependencies
+
+### Phase 2: Testing & Validation ⬜ (Week 3)
+
+- [ ] Write unit tests for new components
+- [ ] Create integration tests
+- [ ] Run end-to-end scenario tests
+- [ ] Performance benchmarking
+- [ ] Security review
+
+### Phase 3: Documentation & Release ⬜ (Week 4)
+
+- [ ] API documentation
+- [ ] Migration guide
+- [ ] Sample scenarios
+- [ ] Code review
+- [ ] Release preparation
+
+## Current Blockers/Challenges
+
+1. **No blockers at this time** - Planning phase complete
+
+2. **Dependencies:**
+   - Requires PyRIT >= 0.10.0
+   - Must coordinate with PyRIT team for scenario API stability
+
+3. **Testing Challenges:**
+   - Need comprehensive scenario coverage
+   - Must validate memory persistence across sessions
+   - Performance testing with large scenario sets
+
+4. **Breaking Change in PyRIT 0.10.0:**
+   - **Issue:** Current code uses `initialize_pyrit(memory_db_type=DUCK_DB)` which no longer exists
+   - **Location:** `_red_team.py` line ~222
+   - **Fix Required:** Change to `initialize_pyrit(memory_db_type=SQLITE, memory_db_path=db_path)`
+   - **Impact:** High - must be addressed before Phase 1 implementation
+
+## Risk Assessment
+
+| Risk | Likelihood | Impact | Mitigation |
+|------|------------|--------|------------|
+| Breaking changes in PyRIT 0.10.0+ | Medium | High | Pin PyRIT version, comprehensive testing |
+| Performance degradation with SQLite | Low | Medium | Benchmark and optimize queries |
+| Backward compatibility issues | Low | High | Maintain existing API surface |
+| Memory persistence issues | Low | Medium | Thorough integration testing |
+
+## Key Metrics
+
+### Code Changes (Estimated)
+- New files: 2 (`_foundry_scenario.py`, `_scenario_manager.py`)
+- Modified files: 2 (`_red_team.py`, `_orchestrator_manager.py`)
+- Lines of code: ~500-800 new, ~50-100 modified
+
+### Test Coverage Goals
+- Unit test coverage: >90%
+- Integration test coverage: >85%
+- E2E scenario coverage: 100% of critical paths
+
+### Performance Targets
+- SQLite initialization: <100ms
+- Scenario execution: No regression vs. current implementation
+- Memory query latency: <50ms for typical queries
+
+## Timeline
+
+| Phase | Start Date | End Date | Status |
+|-------|------------|----------|--------|
+| Phase 0: Planning | 2025-12-10 | 2025-12-17 | ✅ Complete |
+| Phase 1: Implementation | TBD | TBD | ⬜ Not Started |
+| Phase 2: Testing | TBD | TBD | ⬜ Not Started |
+| Phase 3: Documentation | TBD | TBD | ⬜ Not Started |
+
+## Next Steps
+
+1. **Immediate (Week 1):**
+   - Fix breaking change in `_red_team.py` (DUCK_DB → SQLITE)
+   - Set up development environment with PyRIT 0.10.0
+   - Begin `_foundry_scenario.py` implementation
+
+2. **Short-term (Week 2):**
+   - Complete core implementation
+   - Begin unit testing
+   - Integration with orchestrator manager
+
+3. **Medium-term (Week 3-4):**
+   - Comprehensive testing
+   - Documentation
+   - Code review and release preparation
+
+## References
+
+- [PyRIT 0.10.0 Release Notes](https://github.com/Azure/PyRIT/releases)
+- [Technical Specification](spec.md)
+- [Azure AI Evaluation Red Team Documentation](../../README.md)
+
+---
+
+**Document Owner:** Azure AI Evaluation Team
+**Last Review:** 2025-12-17
+**Next Review:** TBD (Phase 1 start)
diff --git a/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/red_team/spec.md b/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/red_team/spec.md
new file mode 100644
index 000000000000..5a0d8649dcc1
--- /dev/null
+++ b/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/red_team/spec.md
@@ -0,0 +1,97 @@
+# PyRIT FoundryScenario Integration - Technical Specification v2.0
+
+**Last Updated:** 2025-12-17
+**Status:** Planning Complete
+**Owner:** Azure AI Evaluation Team
+**Target PyRIT Version:** 0.10.0
+
+> **Note:** This specification targets PyRIT 0.10.0, which removed DuckDB support. SQLite is now the only supported memory backend.
+
+## Overview
+
+This specification outlines the technical approach for integrating PyRIT's FoundryScenario into Azure AI Evaluation's red teaming framework.
+
+## PyRIT Memory: SQLite (v0.10.0+)
+
+**PyRIT 0.10.0 removed DuckDB support.** SQLite is now the only supported memory backend.
+
+### Implementation
+
+```python
+from pyrit.common import initialize_pyrit, SQLITE
+
+# In RedTeam.__init__() or ScenarioManager.__init__()
+db_path = os.path.join(self.output_dir, "pyrit_memory.db")
+initialize_pyrit(memory_db_type=SQLITE, memory_db_path=db_path)
+```
+
+### Memory Retrieval
+
+When retrieving results from PyRIT memory:
+
+```python
+from pyrit.memory import CentralMemory
+
+# CentralMemory uses the SQLite backend configured during initialization
+memory = CentralMemory.get_memory_instance()
+
+# Query by labels (stored in SQLite)
+message_pieces = memory.get_message_pieces(
+    labels={"risk_category": risk_category.value}
+)
+```
+
+**Note:** `CentralMemory.get_memory_instance()` returns the **singleton instance** that uses the SQLite backend configured during `initialize_pyrit()`.
+
+## Implementation Phases
+
+### ⬜ Phase 1: Implementation (Week 1-2)
+
+**Breaking Change Alert:** Current `_red_team.py` uses `initialize_pyrit(memory_db_type=DUCK_DB)` which was removed in PyRIT 0.10.0. Must update to SQLite before implementing FoundryScenario.
+
+**Core Files to Create:**
+- `_foundry_scenario.py`: FoundryScenario implementation
+- `_scenario_manager.py`: Scenario lifecycle management
+
+**Files to Modify:**
+- `_red_team.py`: Update PyRIT initialization to use SQLite
+- `_orchestrator_manager.py`: Integration points for FoundryScenario
+
+### ⬜ Phase 2: Testing & Validation (Week 3)
+
+**Testing Strategy:**
+- Unit tests for FoundryScenario components
+- Integration tests with existing red team framework
+- End-to-end scenario execution tests
+
+### ⬜ Phase 3: Documentation & Release (Week 4)
+
+**Deliverables:**
+- API documentation
+- Migration guide for existing red team usage
+- Sample scenarios and usage examples
+
+## Technical Requirements
+
+### Dependencies
+- PyRIT >= 0.10.0 (SQLite backend only)
+- Azure AI Evaluation SDK
+- Python >= 3.9
+
+### Configuration
+- SQLite database path configurable via output directory
+- Memory persistence across scan sessions
+- Label-based query support for result retrieval
+
+## Success Metrics
+
+- [ ] All existing red team functionality preserved
+- [ ] FoundryScenario successfully integrated
+- [ ] SQLite memory backend properly configured
+- [ ] No performance degradation compared to current implementation
+- [ ] Comprehensive test coverage (>90%)
+
+## References
+
+- [PyRIT Documentation](https://github.com/Azure/PyRIT)
+- [Azure AI Evaluation SDK](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/evaluation/azure-ai-evaluation)
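The spec's "Store DB in output directory" decision and its "SQLite initialization: <100ms" target could be checked with a small pre-flight helper before the path is handed to `initialize_pyrit`. The following is only a sketch using the Python standard library. The helper names `resolve_memory_db_path` and `probe_sqlite` are hypothetical and appear nowhere in the patch. The timing covers a raw SQLite open and one trivial query, not PyRIT's own initialization, so it gives a lower bound at best.

```python
import os
import sqlite3
import tempfile
import time


def resolve_memory_db_path(output_dir: str, filename: str = "pyrit_memory.db") -> str:
    """Return the database path inside output_dir, creating the directory if needed.

    Hypothetical helper: the default file name mirrors the one used in spec.md.
    """
    os.makedirs(output_dir, exist_ok=True)
    return os.path.join(output_dir, filename)


def probe_sqlite(db_path: str) -> float:
    """Open the SQLite file, run a trivial query, and return elapsed milliseconds.

    Hypothetical helper: measures plain sqlite3 only, not PyRIT initialization.
    """
    start = time.perf_counter()
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("SELECT 1").fetchone()
    finally:
        conn.close()
    return (time.perf_counter() - start) * 1000.0


if __name__ == "__main__":
    # Use a throwaway directory as a stand-in for a scan's output directory.
    with tempfile.TemporaryDirectory() as scan_dir:
        path = resolve_memory_db_path(os.path.join(scan_dir, "scan_output"))
        elapsed_ms = probe_sqlite(path)
        print(f"memory DB: {path} (raw SQLite open took {elapsed_ms:.2f} ms)")
```

Creating the directory up front is a deliberate choice in this sketch: `sqlite3.connect` raises an error when the parent directory is missing, so resolving the path first keeps that failure out of the initialization step.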