From 5d956c67523affe85b7efd1ee227b214e57b407a Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Thu, 18 Dec 2025 19:04:18 +0000 Subject: [PATCH 1/2] Initial plan From e8d6b2a33e93343f1707468f4e219fbc19685a83 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Thu, 18 Dec 2025 19:11:15 +0000 Subject: [PATCH 2/2] Add comprehensive v3.0 spec.md and progress.md for PyRIT 0.10.0 integration Co-authored-by: slister1001 <103153180+slister1001@users.noreply.github.com> --- .../azure/ai/evaluation/red_team/progress.md | 252 ++++++++++++++ .../azure/ai/evaluation/red_team/spec.md | 314 ++++++++++++++++++ 2 files changed, 566 insertions(+) create mode 100644 sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/red_team/progress.md create mode 100644 sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/red_team/spec.md diff --git a/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/red_team/progress.md b/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/red_team/progress.md new file mode 100644 index 000000000000..4b96503e48f9 --- /dev/null +++ b/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/red_team/progress.md @@ -0,0 +1,252 @@ +# PyRIT FoundryScenario Integration - Progress Tracking + +**Last Updated:** 2025-12-18 +**Current Phase:** Planning Complete +**Next Milestone:** Phase 1 Implementation Start + +## Current Status + +### Overall Progress: 0% Complete + +| Phase | Status | Start Date | Completion Date | Progress | +|-------|--------|------------|-----------------|----------| +| Planning & Design | ✅ Complete | 2025-10-01 | 2025-12-18 | 100% | +| Phase 1: Core Infrastructure | ⏳ Not Started | TBD | TBD | 0% | +| Phase 2: Result Processing | ⏳ Not Started | TBD | TBD | 0% | +| Phase 3: End-to-End Integration | ⏳ Not Started | TBD | TBD | 0% | +| Phase 4: Testing & Documentation | ⏳ Not Started | TBD | TBD | 0% | + +## Blockers + +### High 
Priority + +1. **PyRIT 0.10.0 Upgrade Required:** + - **Issue:** Project currently depends on PyRIT version that may not include 0.10.0 features + - **Action:** Upgrade to PyRIT 0.10.0+ in requirements + - **Owner:** TBD + - **Status:** Not Started + +2. **SQLite Migration Path:** + - **Issue:** Need to ensure smooth transition from any existing DuckDB usage + - **Action:** Audit codebase for DuckDB references + - **Owner:** TBD + - **Status:** Not Started + +3. **RAI Service Endpoint Availability:** + - **Issue:** Need to validate RAI simulation endpoint is accessible and configured + - **Action:** Verify endpoint credentials and permissions + - **Owner:** TBD + - **Status:** Not Started + +4. **Breaking Change in PyRIT 0.10.0:** + - **Issue:** Current code uses `initialize_pyrit(memory_db_type=DUCK_DB)` which no longer exists + - **Location:** `_red_team.py` line 222 + - **Fix Required:** Change to `initialize_pyrit(memory_db_type=SQLITE, memory_db_path=db_path)` + - **Impact:** High - must be addressed before Phase 1 implementation + +### Medium Priority + +5. **Test Infrastructure Setup:** + - **Issue:** Need mock PyRIT scenario for testing without external dependencies + - **Action:** Create test fixtures and mocks + - **Owner:** TBD + - **Status:** Not Started + +6. 
**Performance Baseline:** + - **Issue:** Need to establish current performance metrics before migration + - **Action:** Run performance benchmarks on existing implementation + - **Owner:** TBD + - **Status:** Not Started + +## Phase 1: Core Infrastructure (Not Started) + +### Tasks + +- [ ] Create strategy mapping module + - [ ] Define `ATTACK_STRATEGY_TO_FOUNDRY_STRATEGY` mapping + - [ ] Add unit tests for mapping correctness + - [ ] Document strategy equivalence + +- [ ] Update PyRIT initialization + - [ ] Fix the `initialize_pyrit` breaking change in `_red_team.py` (see Blocker 4) + - [ ] Implement SQLite database path configuration + - [ ] Add error handling for initialization failures + - [ ] Add logging for initialization steps + +- [ ] Implement scenario manager + - [ ] Create `_scenario_manager.py` module + - [ ] Implement FoundryScenario creation logic + - [ ] Add memory label configuration + - [ ] Implement scenario execution orchestration + +- [ ] Add context preservation + - [ ] Design memory label schema + - [ ] Implement label attachment during scenario creation + - [ ] Add label-based retrieval methods + - [ ] Test label persistence and retrieval + +### Deliverables + +- [ ] `_utils/strategy_mapping.py` with complete mappings +- [ ] Updated `_red_team.py` with SQLite initialization +- [ ] `_scenario_manager.py` with basic scenario orchestration +- [ ] Unit tests with >80% coverage for new modules + +## Phase 2: Result Processing (Not Started) + +### Tasks + +- [ ] Create result converter module + - [ ] Implement `_result_converter.py` + - [ ] Use `get_message_pieces()` API + - [ ] Extract MessagePiece data correctly + - [ ] Handle edge cases (empty results, errors) + +- [ ] Update result processor + - [ ] Migrate from PromptRequestPiece to MessagePiece + - [ ] Update data access patterns + - [ ] Preserve existing result format + - [ ] Add backward compatibility checks + +- [ ] Integration with evaluation pipeline + - [ ] Connect result converter to evaluation processor + - [ ] 
Validate result schema compatibility + - [ ] Add result export functionality + - [ ] Test end-to-end result flow + +### Deliverables + +- [ ] `_result_converter.py` with full conversion logic +- [ ] Updated `_result_processor.py` using MessagePiece +- [ ] Integration tests for result processing +- [ ] Documentation for result schema + +## Phase 3: End-to-End Integration (Not Started) + +### Tasks + +- [ ] Connect all components + - [ ] Wire scenario manager into `_red_team.py` + - [ ] Connect result converter to main flow + - [ ] Add orchestration logic + - [ ] Implement cleanup procedures + +- [ ] Error handling and resilience + - [ ] Add retry logic for transient failures + - [ ] Implement proper error propagation + - [ ] Add logging and diagnostics + - [ ] Handle partial success scenarios + +- [ ] Progress tracking + - [ ] Implement progress callbacks + - [ ] Add status reporting + - [ ] Create progress persistence + - [ ] Add cancellation support + +### Deliverables + +- [ ] Fully integrated red team scan functionality +- [ ] Comprehensive error handling +- [ ] Progress tracking implementation +- [ ] Integration tests covering all strategies + +## Phase 4: Testing & Documentation (Not Started) + +### Tasks + +- [ ] Unit testing + - [ ] Achieve >90% coverage for new modules + - [ ] Add edge case tests + - [ ] Add error scenario tests + - [ ] Add performance tests + +- [ ] Integration testing + - [ ] End-to-end scan tests + - [ ] Multi-strategy tests + - [ ] Context preservation tests + - [ ] Backward compatibility tests + +- [ ] Documentation + - [ ] Update API documentation + - [ ] Create migration guide + - [ ] Add code examples + - [ ] Create sample notebooks + +### Deliverables + +- [ ] Test suite with >90% coverage +- [ ] Published API documentation +- [ ] Migration guide for users +- [ ] Sample code and notebooks + +## Design Decisions + +| Decision | Rationale | Date | +|----------|-----------|------| +| Target PyRIT 0.10.0+ | Latest stable version with 
SQLite-only backend | 2025-12-18 | +| Use SQLite memory | Only option in PyRIT 0.10.0+ (DuckDB removed) | 2025-12-18 | +| Use MessagePiece data model | PyRIT 0.10.0 renamed PromptRequestPiece | 2025-12-18 | +| Preserve public API | Ensure backward compatibility for users | 2025-11-15 | +| Use memory labels for context | Enable filtering and reconstruction of scan sessions | 2025-11-15 | +| Abstract FoundryStrategy mapping | Decouple Azure AI Evaluation from PyRIT internals | 2025-10-15 | +| Maintain abstraction layer | Protect against future PyRIT breaking changes | 2025-10-15 | + +## Risk Register + +| Risk | Status | Mitigation | +|------|--------|------------| +| PyRIT API changes in 0.10.0 | ⚠️ Active | Documented in spec, ready to implement | +| SQLite performance at scale | 🔍 Monitoring | Will benchmark during Phase 2 | +| Memory label key collisions | ✅ Mitigated | Use namespaced keys | +| Backward compatibility issues | 🔍 Monitoring | Extensive testing planned in Phase 4 | + +## Metrics + +### Code Quality Targets +- Test Coverage: >90% +- Pylint Score: >9.0 +- Type Coverage: >95% +- Documentation Coverage: 100% + +### Performance Targets +- Scenario Execution Latency: <5% increase vs current +- Memory Query Performance: <100ms for typical scan +- Result Processing Throughput: >100 conversations/sec +- Resource Utilization: <10% increase in memory/CPU + +## Team Communication + +### Weekly Sync Topics +1. Blocker review and resolution +2. Phase progress updates +3. Design decision review +4. Risk assessment +5. Next week planning + +### Stakeholder Updates +- **Weekly:** Progress summary to team leads +- **Bi-weekly:** Demo to product management +- **Monthly:** Executive summary with metrics + +## Next Actions + +1. **Immediate (This Week):** + - Assign owners to Phase 1 tasks + - Upgrade PyRIT to 0.10.0 + - Set up development environment + +2. 
**Short Term (Next 2 Weeks):** + - Begin Phase 1 implementation + - Create strategy mapping module + - Fix breaking change in `_red_team.py` + +3. **Medium Term (Next Month):** + - Complete Phase 1 + - Begin Phase 2 + - Conduct first integration tests + +--- + +**Document Version History:** +- v2.0 (2025-12-18): Updated for PyRIT 0.10.0 alignment +- v1.0 (2025-10-01): Initial progress tracking document diff --git a/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/red_team/spec.md b/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/red_team/spec.md new file mode 100644 index 000000000000..ee0957b9f1d6 --- /dev/null +++ b/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/red_team/spec.md @@ -0,0 +1,314 @@ +# PyRIT FoundryScenario Integration - Technical Specification v3.0 + +**Last Updated:** 2025-12-18 +**Status:** Planning Complete +**Owner:** Azure AI Evaluation Team +**Target PyRIT Version:** 0.10.0 + +> **Breaking Changes in PyRIT 0.10.0:** +> - DuckDB support removed (SQLite only) +> - `PromptRequestPiece` renamed to `MessagePiece` +> - `get_prompt_request_pieces()` renamed to `get_message_pieces()` + +## Executive Summary + +This specification outlines the integration of PyRIT's FoundryScenario framework into Azure AI Evaluation's red teaming module. 
The integration leverages: + +- **Message-based data model** using `MessagePiece` for conversation tracking +- **SQLite memory** (only option in PyRIT 0.10.0+) for persistent storage +- **FoundryStrategy** mapping for attack strategy orchestration +- **Memory labels** for context preservation across scan sessions + +## Core Components + +### Strategy Mapping Layer + +**File:** `_utils/strategy_mapping.py` + +Maps Azure AI Evaluation's `AttackStrategy` enum to PyRIT's `FoundryStrategy`: + +```python +from pyrit.scenario.scenarios.foundry_scenario import FoundryStrategy + +ATTACK_STRATEGY_TO_FOUNDRY_STRATEGY: Dict[AttackStrategy, FoundryStrategy] = { + AttackStrategy.Direct: FoundryStrategy.Jailbreak, + AttackStrategy.PAIR: FoundryStrategy.Pair, + AttackStrategy.ROT13: FoundryStrategy.ROT13, + AttackStrategy.Base64: FoundryStrategy.Base64, +} +``` + +### Scenario Manager + +**File:** `_scenario_manager.py` + +Manages FoundryScenario lifecycle and configuration: + +- Initialize PyRIT with SQLite (fixes the `DUCK_DB` breaking change in `_red_team.py`) +- Use RAI service simulation endpoint for adversarial chat +- Create FoundryScenario instances per risk category + +**Key Responsibilities:** +- Configure memory database paths +- Set up prompt target endpoints +- Manage scenario execution lifecycle +- Handle objective generation and context injection + +### Result Converter + +**File:** `_result_converter.py` + +Converts PyRIT memory data to Azure AI Evaluation results: + +- Use `get_message_pieces()` instead of `get_prompt_request_pieces()` +- Access `MessagePiece` properties (not `PromptRequestPiece`) +- Extract conversation history and metadata +- Generate evaluation-compatible result objects + +## PyRIT Memory: SQLite (v0.10.0+) + +**PyRIT 0.10.0 removed DuckDB support.** SQLite is now the only supported memory backend. 
+ +### Implementation + +```python +from pyrit.common import initialize_pyrit, SQLITE + +# In ScenarioManager.__init__() +db_path = os.path.join(self.output_dir, "pyrit_memory.db") +initialize_pyrit(memory_db_type=SQLITE, memory_db_path=db_path) +``` + +### Memory Retrieval + +```python +from pyrit.memory import CentralMemory + +memory = CentralMemory.get_memory_instance() +message_pieces = memory.get_message_pieces( + labels={"risk_category": risk_category.value} +) +``` + +## Context Preservation with Memory Labels + +Memory labels attach metadata to each conversation turn, enabling filtering and reconstruction: + +```python +# Attach labels when creating scenario +scenario._memory_labels = { + "risk_category": risk_category.value, + "scan_session_id": scan_session_id, + "objective": objective, + "context": context_data, + "risk_subtype": risk_subtype, +} + +# Retrieve during result processing +memory = CentralMemory.get_memory_instance() +message_pieces = memory.get_message_pieces(labels={"risk_category": "violence"}) + +for piece in message_pieces: + context = piece.labels.get("context", {}) + risk_subtype = piece.labels.get("risk_subtype", "") +``` + +## Migration Strategy + +### Phase 1: Core Infrastructure (Weeks 1-2) + +**⚠️ Breaking Change Alert:** Current `_red_team.py` (line 222) uses: +```python +initialize_pyrit(memory_db_type=DUCK_DB) # ❌ Removed in PyRIT 0.10.0 +``` +Must update to: +```python +initialize_pyrit(memory_db_type=SQLITE, memory_db_path=db_path) # ✅ Only option +``` + +**Tasks:** +1. Create `_utils/strategy_mapping.py` with FoundryStrategy mappings +2. Update PyRIT initialization in `_red_team.py` to use SQLite +3. Implement `_scenario_manager.py` for FoundryScenario orchestration +4. 
Add memory label configuration for context preservation + +**Deliverables:** +- Working FoundryScenario execution for single risk category +- SQLite memory database with proper labeling +- Unit tests for strategy mapping + +### Phase 2: Result Processing (Week 3) + +**Tasks:** +1. Implement `_result_converter.py` using `get_message_pieces()` +2. Update `_result_processor.py` to use MessagePiece data model +3. Integrate with existing evaluation pipeline +4. Add result formatting and export logic + +**Deliverables:** +- Complete result conversion pipeline +- Integration tests with mock PyRIT scenarios +- Documentation for result schema + +### Phase 3: End-to-End Integration (Week 4) + +**Tasks:** +1. Connect all components in `_red_team.py` +2. Add error handling and retry logic +3. Implement progress tracking and logging +4. Performance optimization and validation + +**Deliverables:** +- Full red team scan execution +- Integration tests covering all attack strategies +- Performance benchmarks + +### Phase 4: Testing & Documentation (Week 5) + +**Tasks:** +1. Comprehensive unit and integration tests +2. Update API documentation +3. Create migration guide for existing users +4. Add code examples and usage samples + +**Deliverables:** +- >90% test coverage +- Published documentation +- Sample notebooks + +## Architecture Diagrams + +### Component Interaction Flow + +``` +User API Call + ↓ +RedTeam.scan() + ↓ +ScenarioManager + ├── Initialize PyRIT (SQLite) + ├── Create FoundryScenario instances + └── Execute attack strategies + ↓ +PyRIT Memory (SQLite) + └── Store MessagePieces with labels + ↓ +ResultConverter + ├── Query message_pieces by labels + ├── Extract conversation history + └── Build RedTeamResult objects + ↓ +RedTeamResult + └── Return to user +``` + +### Data Flow + +``` +Attack Objective + ↓ +FoundryScenario.execute() + ↓ +Adversarial Prompt → Target System → Response + ↓ +MessagePiece (with labels) + ↓ +SQLite Database + ↓ +get_message_pieces(labels=...) 
+ ↓ +RedTeamResult +``` + +## API Design + +### Public Interface (No Changes) + +The external API remains unchanged to ensure backward compatibility: + +```python +from azure.ai.evaluation.red_team import AttackStrategy, RedTeam, RiskCategory + +red_team = RedTeam(...) +result = red_team.scan( + risk_categories=[RiskCategory.Violence], + attack_strategies=[AttackStrategy.Direct, AttackStrategy.PAIR] +) +``` + +### Internal Changes + +All changes are internal implementation details: +- Strategy mapping happens transparently +- Memory storage is abstracted +- Result conversion is automatic + +## Testing Strategy + +### Unit Tests +- Strategy mapping correctness +- Memory label configuration +- Result conversion logic +- Error handling + +### Integration Tests +- End-to-end scenario execution +- Memory persistence and retrieval +- Multi-strategy orchestration +- Context preservation + +### Performance Tests +- Scenario execution latency +- Memory query performance +- Result processing throughput +- Resource utilization + +## Risks and Mitigation + +| Risk | Impact | Mitigation | +|------|--------|------------| +| PyRIT API instability | High | Pin to stable version 0.10.0+ | +| SQLite performance issues | Medium | Optimize query patterns, add indexes | +| Memory label collisions | Low | Use namespaced label keys | +| Breaking changes in future PyRIT versions | Medium | Maintain abstraction layer | + +## Success Criteria + +1. ✅ All attack strategies execute via FoundryScenario +2. ✅ Context preservation works across scan sessions +3. ✅ Results match existing format (backward compatible) +4. ✅ Performance meets SLA (<5% degradation) +5. ✅ Test coverage >90% +6. 
✅ Zero breaking changes to public API + +## Appendix + +### PyRIT 0.10.0 Breaking Changes Reference + +| Old API | New API | Impact | +|---------|---------|--------| +| `DUCK_DB` | `SQLITE` | High - required change | +| `PromptRequestPiece` | `MessagePiece` | High - data model change | +| `get_prompt_request_pieces()` | `get_message_pieces()` | High - method rename | +| `memory_db_type=DUCK_DB` | `memory_db_type=SQLITE, memory_db_path=...` | High - signature change | + +### Variable Naming Conventions + +Use consistent terminology in code: +- `message_pieces` (not `prompt_request_pieces`) +- `piece` (not `prompt`) +- `get_message_pieces()` (not `get_prompt_request_pieces()`) + +### Dependencies + +```python +# requirements.txt +pyrit>=0.10.0 +``` + +--- + +**Document Version History:** +- v3.0 (2025-12-18): Updated for PyRIT 0.10.0 breaking changes +- v2.0 (2025-11-15): Added context preservation strategy +- v1.0 (2025-10-01): Initial specification
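
**Note on namespaced memory labels:**

Both risk tables list namespaced label keys as the mitigation for memory label collisions, while the label example in `spec.md` uses bare keys such as `risk_category` and stores a structured `context` value. The following is a minimal sketch of one way to build such labels, not the planned implementation. It assumes that label values should be plain strings, which should be checked against the PyRIT version in use. The prefix `azure_ai_evaluation.red_team` and the helpers `build_memory_labels` and `read_label` are hypothetical names introduced only for illustration.

```python
import json
from typing import Any, Dict, Mapping

# Hypothetical namespace prefix; the spec does not define one.
LABEL_NAMESPACE = "azure_ai_evaluation.red_team"


def build_memory_labels(
    fields: Mapping[str, Any], namespace: str = LABEL_NAMESPACE
) -> Dict[str, str]:
    """Return labels with namespaced keys and string-only values.

    Assumption: label values are stored as strings, so any non-string
    value (for example a context dict) is serialized to JSON.
    """
    labels: Dict[str, str] = {}
    for key, value in fields.items():
        if value is None:
            continue  # leave unset fields out instead of storing "null"
        namespaced_key = f"{namespace}.{key}"
        if isinstance(value, str):
            labels[namespaced_key] = value
        else:
            labels[namespaced_key] = json.dumps(value, sort_keys=True)
    return labels


def read_label(
    labels: Mapping[str, str],
    key: str,
    namespace: str = LABEL_NAMESPACE,
    default: str = "",
) -> str:
    """Look up a label by its un-namespaced key."""
    return labels.get(f"{namespace}.{key}", default)
```

With helpers like these, the result converter would read a structured value back with `json.loads(read_label(piece_labels, "context") or "{}")`, where `piece_labels` stands for the labels of one retrieved message piece. Keeping the prefix in one constant means the scenario manager and the result converter cannot drift apart on key names.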