@apetraru-uipath apetraru-uipath commented Dec 23, 2025

Integration Tests for Agent Guardrails

Overview

Adds comprehensive integration tests for agent guardrails, verifying proper invocation and enforcement at Agent, LLM, and Tool scopes.

Files Added

  • tests/cli/test_agent_with_guardrails.py (1325 lines) - 8 integration tests
  • tests/cli/mocks/joke_agent_with_guardrails.py (333 lines) - Mock agent with 6 guardrails
  • tests/cli/mocks/joke_agent_uipath.json (139 lines) - Guardrail configurations
  • tests/cli/mocks/joke_agent_langgraph.json (8 lines) - LangGraph config
  • tests/cli/conftest.py (updated) - Shared fixtures for mocking

Test Coverage

Agent-Level Guardrails

| Test | Guardrail | Stage | Action | Trigger | Validates |
| --- | --- | --- | --- | --- | --- |
| test_pii_guardrail_not_triggered | PII Detection | POST_EXECUTION | Block | No PII | Execution succeeds |
| test_pii_guardrail_triggered | PII Detection | POST_EXECUTION | Block | Email in output | Execution blocked |

LLM-Level Guardrails

| Test | Guardrail | Stage | Action | Trigger | Validates |
| --- | --- | --- | --- | --- | --- |
| test_prompt_injection_guardrail_triggered | Prompt Injection | PRE_EXECUTION | Block | Malicious prompt | LLM never invoked, execution blocked |
| test_llm_pii_escalation_guardrail_hitl | PII Detection | POST_EXECUTION | Escalate | Email in output | HITL triggered, user approves, execution continues |
| test_llm_pii_escalation_guardrail_rejected | PII Detection | POST_EXECUTION | Escalate | Email in output | HITL triggered, user rejects, execution stops |

Tool-Level Guardrails

| Test | Guardrail | Stage | Action | Trigger | Validates |
| --- | --- | --- | --- | --- | --- |
| test_tool_guardrail_filter_output | Custom (word match) | POST_EXECUTION | Filter | "donkey" in output | Field removed, execution succeeds |
| test_tool_guardrail_block_execution | Custom (word match) | PRE_EXECUTION | Block | "forbidden" in input | Tool never invoked, execution blocked |
| test_tool_pii_guardrail_triggered | PII Detection | PRE_EXECUTION | Block | Email in input | Tool never invoked, execution blocked |
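
For reviewers unfamiliar with the "Tool never invoked" check, here is a minimal sketch of the general pattern using `unittest.mock`. It is not the code in `test_agent_with_guardrails.py`: `GuardrailBlocked`, `run_tool_with_pre_guardrail`, and `check_tool_never_invoked` are hypothetical names, and the plain substring match only stands in for the real guardrail evaluation.

```python
from unittest.mock import MagicMock


class GuardrailBlocked(Exception):
    """Hypothetical exception raised when a PRE_EXECUTION guardrail blocks."""


def run_tool_with_pre_guardrail(tool, tool_input: str, forbidden: str = "forbidden"):
    """Evaluate a word-match guardrail before the tool runs (illustrative only)."""
    if forbidden in tool_input.lower():
        # Block action: stop before the tool is ever called.
        raise GuardrailBlocked(f"input contains {forbidden!r}")
    return tool(tool_input)


def check_tool_never_invoked() -> bool:
    """Return True if the guardrail blocked and the mocked tool was not called."""
    tool = MagicMock(return_value="a joke")
    try:
        run_tool_with_pre_guardrail(tool, "tell me a forbidden joke")
    except GuardrailBlocked:
        blocked = True
    else:
        blocked = False
    # Raises AssertionError if the mock was called at least once.
    tool.assert_not_called()
    return blocked
```

Asserting on the mock's call count, rather than only on the final error, is what separates "blocked before the tool ran" from "the tool ran and the result was discarded".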

Guardrails Tested

Built-in Validators

  • PII Detection (Agent, LLM & Tool scopes) - Detects Email, Address, Person with 0.5 threshold
  • Prompt Injection (LLM scope) - Detects malicious prompts with 0.5 threshold

Custom Deterministic Guardrails

  • Filter Action - Removes input_phrase field when containing "donkey"
  • Block Action - Blocks tool execution when input contains "forbidden"
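
The two word-match rules above can be sketched in plain Python. This is an assumption-laden illustration, not the guardrail implementation under test: `filter_fields`, `block_if_contains`, and `GuardrailBlocked` are hypothetical names, and the case-insensitive substring match is a guess at the matching semantics.

```python
from typing import Any


class GuardrailBlocked(Exception):
    """Hypothetical exception for a block-action guardrail."""


def filter_fields(payload: dict[str, Any], word: str) -> dict[str, Any]:
    """Filter action: return a copy of payload without string fields containing word."""
    needle = word.lower()
    return {
        key: value
        for key, value in payload.items()
        if not (isinstance(value, str) and needle in value.lower())
    }


def block_if_contains(payload: dict[str, Any], word: str) -> None:
    """Block action: raise if any string field contains word."""
    needle = word.lower()
    for key, value in payload.items():
        if isinstance(value, str) and needle in value.lower():
            raise GuardrailBlocked(f"field {key!r} contains {word!r}")
```

In this sketch a filter lets execution continue with a reduced payload, while a block raises before the guarded step runs.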

Actions Tested

  • Block Action - Stops execution at Agent, LLM, or Tool scope
  • Filter Action - Removes fields from outputs
  • Escalate Action (HITL) - Triggers human approval with both approval and rejection flows
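
The approval and rejection flows of the Escalate action reduce to a small decision function. The sketch below is hypothetical: `escalate` and the `reviewer` callback are illustrative names, and the real HITL mechanism is mocked in the tests rather than modelled this way.

```python
from typing import Callable


def escalate(
    output: str,
    violation: bool,
    reviewer: Callable[[str], bool],
) -> tuple[bool, bool]:
    """Return (hitl_triggered, execution_continues) for an escalate-action guardrail.

    reviewer stands in for the human decision: True approves, False rejects.
    """
    if not violation:
        # No violation: the human is never asked and execution continues.
        return (False, True)
    approved = reviewer(output)
    return (True, approved)
```

Passing the reviewer in as a callable lets a test drive both outcomes without a real human step, for example `lambda _: True` for approval and `lambda _: False` for rejection.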

@apetraru-uipath apetraru-uipath force-pushed the chore/tests_for_guardrails branch 4 times, most recently from 185868d to 63e3421 Compare December 23, 2025 20:14
Add comprehensive integration tests for guardrails at different scopes:
- Agent-level guardrails (PII detection)
- LLM-level guardrails (Prompt injection)
- Tool-level guardrails (Filter, Block, and PII detection)

Tests verify that guardrails are properly invoked and block/filter as expected.
@apetraru-uipath apetraru-uipath force-pushed the chore/tests_for_guardrails branch from 63e3421 to 3876794 Compare December 23, 2025 21:16