Commit 63d7bce

Merge branch 'feature/error-analyzer-7' into 'develop'

Feature: Integrated X-Ray tool with error analyzer agent

See merge request genaiic-reusable-assets/engagement-artifacts/genaiic-idp-accelerator!367
2 parents 4fb9055 + d781b61 commit 63d7bce

39 files changed, +2595 -1001 lines changed

config_library/pattern-1/lending-package-sample/config.yaml

Lines changed: 13 additions & 1 deletion
@@ -242,7 +242,19 @@ agents:
   - Extract log_group, log_stream, and events data from tool response
   - Show complete log group and log stream names without truncation
   - Present actual log messages from events array in code blocks
-
+
+  ANALYSIS GUIDELINES:
+  - If has_performance_issues is false, focus on application logic errors
+  - Use service timeline to rule out infrastructure bottlenecks
+  - Service response times help eliminate timeout-related causes
+  - For application errors use CloudWatch error messages for recommendations
+
+  ROOT CAUSE DETERMINATION:
+  - Start with Step Function failure details (most specific)
+  - Validate with CloudWatch error logs (most detailed)
+  - Use X-Ray to categorize as infrastructure vs. application issue
+  - DynamoDB provides supporting timeline context only
+
   RECOMMENDATION GUIDELINES:
   For code-related issues or system bugs:
   - Do not suggest code modifications
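
The ANALYSIS GUIDELINES and ROOT CAUSE DETERMINATION blocks are prompt text that the agent follows; they are not code. As a rough illustration of the ordering they describe, the sketch below combines three tool outputs in that priority. The function name and every dictionary key except `has_performance_issues` are assumptions made for this example and do not come from the commit.

```python
from typing import Any, Dict, Optional


def determine_root_cause(
    stepfunction: Optional[Dict[str, Any]],
    cloudwatch: Optional[Dict[str, Any]],
    xray: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Combine tool outputs in the priority order given by the prompt.

    Hypothetical helper: only has_performance_issues is named in the prompt.
    """
    result: Dict[str, Any] = {"root_cause": None, "evidence": [], "category": "unknown"}

    # 1. Step Function failure details are the most specific starting point.
    if stepfunction and stepfunction.get("error"):
        result["root_cause"] = stepfunction["error"]
        result["evidence"].append("stepfunction")

    # 2. CloudWatch error logs validate the finding, or supply it if none exists yet.
    if cloudwatch and cloudwatch.get("error_messages"):
        if result["root_cause"] is None:
            result["root_cause"] = cloudwatch["error_messages"][0]
        result["evidence"].append("cloudwatch")

    # 3. X-Ray only categorizes the issue: infrastructure vs. application.
    if xray is not None:
        if xray.get("has_performance_issues"):
            result["category"] = "infrastructure"
        else:
            result["category"] = "application"

    return result
```

DynamoDB is left out on purpose, since the prompt treats it as supporting timeline context only.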

config_library/pattern-2/bank-statement-sample/config.yaml

Lines changed: 13 additions & 1 deletion
@@ -703,7 +703,19 @@ agents:
   - Extract log_group, log_stream, and events data from tool response
   - Show complete log group and log stream names without truncation
   - Present actual log messages from events array in code blocks
-
+
+  ANALYSIS GUIDELINES:
+  - If has_performance_issues is false, focus on application logic errors
+  - Use service timeline to rule out infrastructure bottlenecks
+  - Service response times help eliminate timeout-related causes
+  - For application errors use CloudWatch error messages for recommendations
+
+  ROOT CAUSE DETERMINATION:
+  - Start with Step Function failure details (most specific)
+  - Validate with CloudWatch error logs (most detailed)
+  - Use X-Ray to categorize as infrastructure vs. application issue
+  - DynamoDB provides supporting timeline context only
+
   RECOMMENDATION GUIDELINES:
   For code-related issues or system bugs:
   - Do not suggest code modifications

config_library/pattern-2/criteria-validation/config.yaml

Lines changed: 13 additions & 1 deletion
@@ -325,7 +325,19 @@ agents:
   - Extract log_group, log_stream, and events data from tool response
   - Show complete log group and log stream names without truncation
   - Present actual log messages from events array in code blocks
-
+
+  ANALYSIS GUIDELINES:
+  - If has_performance_issues is false, focus on application logic errors
+  - Use service timeline to rule out infrastructure bottlenecks
+  - Service response times help eliminate timeout-related causes
+  - For application errors use CloudWatch error messages for recommendations
+
+  ROOT CAUSE DETERMINATION:
+  - Start with Step Function failure details (most specific)
+  - Validate with CloudWatch error logs (most detailed)
+  - Use X-Ray to categorize as infrastructure vs. application issue
+  - DynamoDB provides supporting timeline context only
+
   RECOMMENDATION GUIDELINES:
   For code-related issues or system bugs:
   - Do not suggest code modifications

config_library/pattern-2/lending-package-sample/config.yaml

Lines changed: 13 additions & 1 deletion
@@ -1482,7 +1482,19 @@ agents:
   - Extract log_group, log_stream, and events data from tool response
   - Show complete log group and log stream names without truncation
   - Present actual log messages from events array in code blocks
-
+
+  ANALYSIS GUIDELINES:
+  - If has_performance_issues is false, focus on application logic errors
+  - Use service timeline to rule out infrastructure bottlenecks
+  - Service response times help eliminate timeout-related causes
+  - For application errors use CloudWatch error messages for recommendations
+
+  ROOT CAUSE DETERMINATION:
+  - Start with Step Function failure details (most specific)
+  - Validate with CloudWatch error logs (most detailed)
+  - Use X-Ray to categorize as infrastructure vs. application issue
+  - DynamoDB provides supporting timeline context only
+
   RECOMMENDATION GUIDELINES:
   For code-related issues or system bugs:
   - Do not suggest code modifications

config_library/pattern-2/rvl-cdip-package-sample-with-few-shot-examples/config.yaml

Lines changed: 13 additions & 1 deletion
@@ -1216,7 +1216,19 @@ agents:
   - Extract log_group, log_stream, and events data from tool response
   - Show complete log group and log stream names without truncation
   - Present actual log messages from events array in code blocks
-
+
+  ANALYSIS GUIDELINES:
+  - If has_performance_issues is false, focus on application logic errors
+  - Use service timeline to rule out infrastructure bottlenecks
+  - Service response times help eliminate timeout-related causes
+  - For application errors use CloudWatch error messages for recommendations
+
+  ROOT CAUSE DETERMINATION:
+  - Start with Step Function failure details (most specific)
+  - Validate with CloudWatch error logs (most detailed)
+  - Use X-Ray to categorize as infrastructure vs. application issue
+  - DynamoDB provides supporting timeline context only
+
   RECOMMENDATION GUIDELINES:
   For code-related issues or system bugs:
   - Do not suggest code modifications

config_library/pattern-2/rvl-cdip-package-sample/config.yaml

Lines changed: 13 additions & 1 deletion
@@ -941,7 +941,19 @@ agents:
   - Extract log_group, log_stream, and events data from tool response
   - Show complete log group and log stream names without truncation
   - Present actual log messages from events array in code blocks
-
+
+  ANALYSIS GUIDELINES:
+  - If has_performance_issues is false, focus on application logic errors
+  - Use service timeline to rule out infrastructure bottlenecks
+  - Service response times help eliminate timeout-related causes
+  - For application errors use CloudWatch error messages for recommendations
+
+  ROOT CAUSE DETERMINATION:
+  - Start with Step Function failure details (most specific)
+  - Validate with CloudWatch error logs (most detailed)
+  - Use X-Ray to categorize as infrastructure vs. application issue
+  - DynamoDB provides supporting timeline context only
+
   RECOMMENDATION GUIDELINES:
   For code-related issues or system bugs:
   - Do not suggest code modifications

config_library/pattern-3/rvl-cdip-package-sample/config.yaml

Lines changed: 13 additions & 1 deletion
@@ -800,7 +800,19 @@ agents:
   - Extract log_group, log_stream, and events data from tool response
   - Show complete log group and log stream names without truncation
   - Present actual log messages from events array in code blocks
-
+
+  ANALYSIS GUIDELINES:
+  - If has_performance_issues is false, focus on application logic errors
+  - Use service timeline to rule out infrastructure bottlenecks
+  - Service response times help eliminate timeout-related causes
+  - For application errors use CloudWatch error messages for recommendations
+
+  ROOT CAUSE DETERMINATION:
+  - Start with Step Function failure details (most specific)
+  - Validate with CloudWatch error logs (most detailed)
+  - Use X-Ray to categorize as infrastructure vs. application issue
+  - DynamoDB provides supporting timeline context only
+
   RECOMMENDATION GUIDELINES:
   For code-related issues or system bugs:
   - Do not suggest code modifications

lib/idp_common_pkg/idp_common/agents/common/response_utils.py

Lines changed: 7 additions & 4 deletions
@@ -67,6 +67,12 @@ def parse_agent_response(response) -> Dict[str, Any]:
         response_str = str(response)
         logger.debug(f"Processing AgentResult as string: {response_str[:100]}...")
 
+        # Check if response looks like JSON before trying to parse
+        response_str = response_str.strip()
+        if not (response_str.startswith("{") or response_str.startswith("```")):
+            logger.debug("Response doesn't appear to be JSON, returning as text")
+            return {"responseType": "text", "content": response_str}
+
         # Extract JSON from markdown code blocks if present
         json_str = extract_json_from_markdown(response_str)
 
@@ -77,8 +83,5 @@ def parse_agent_response(response) -> Dict[str, Any]:
             )
             return parsed_response
         except json.JSONDecodeError as e:
-            logger.error(f"Failed to parse extracted JSON: {e}")
-            logger.error(f"Full LLM response: {response_str}")
-            logger.error(f"Extracted content: {json_str}")
-            # Return a text response with the raw output as fallback
+            logger.warning(f"Failed to parse as JSON, returning as text: {e}")
             return {"responseType": "text", "content": response_str}
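
To make the effect of the new pre-check concrete, here is a minimal standalone sketch of the same control flow. `parse_text_response` is a name invented for this example, and the `extract_json_from_markdown` body is a hypothetical stand-in, since the real helper's implementation is not part of this diff.

```python
import json
import re
from typing import Any, Dict

FENCE = "`" * 3  # three backticks, i.e. a markdown code-fence marker


def extract_json_from_markdown(text: str) -> str:
    """Hypothetical stand-in: return the body of the first fenced block, else the text."""
    match = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    return match.group(1).strip() if match else text


def parse_text_response(response_str: str) -> Dict[str, Any]:
    """Mirror the diff's flow: cheap shape check first, JSON parsing second."""
    response_str = response_str.strip()
    # Pre-check from the diff: only "{" or a code fence counts as JSON-like.
    if not (response_str.startswith("{") or response_str.startswith(FENCE)):
        return {"responseType": "text", "content": response_str}
    try:
        return json.loads(extract_json_from_markdown(response_str))
    except json.JSONDecodeError:
        # Same fallback as the diff: hand the raw text back to the caller.
        return {"responseType": "text", "content": response_str}
```

One consequence of the check as written is that a response beginning with `[` (a bare JSON array) is returned as text without a parse attempt.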

lib/idp_common_pkg/idp_common/agents/error_analyzer/config.py

Lines changed: 4 additions & 6 deletions
@@ -94,7 +94,7 @@ def get_context_limits() -> Dict[str, int]:
         "max_events_per_log_group": 5,
         "max_log_groups": 20,
         "max_stepfunction_timeline_events": 3,
-        "max_stepfunction_error_length": 200,
+        "max_stepfunction_error_length": 400,
         "time_range_hours_default": 24,
     }
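
The diff does not show where `max_stepfunction_error_length` is consumed. Assuming it caps the length of Step Function error text before that text reaches the model, a limit like this might be applied as sketched below. `truncate_error` is a hypothetical helper, and the reduced `get_context_limits` reproduces only the key changed here.

```python
from typing import Dict


def get_context_limits() -> Dict[str, int]:
    # Only the key touched by this commit is reproduced here.
    return {"max_stepfunction_error_length": 400}


def truncate_error(message: str, limit: int) -> str:
    """Hypothetical helper: cut a message to at most `limit` characters."""
    if len(message) <= limit:
        return message
    # Reserve three characters for the ellipsis so the result fits the limit.
    return message[: limit - 3] + "..."


limit = get_context_limits()["max_stepfunction_error_length"]
shortened = truncate_error("x" * 1000, limit)
```

Under that assumption, doubling the limit from 200 to 400 keeps more of each error message at the cost of a somewhat larger prompt.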

@@ -143,11 +143,9 @@ def create_error_response(error: str, **kwargs) -> Dict[str, Any]:
     return response
 
 
-def create_success_response(data: Dict[str, Any]) -> Dict[str, Any]:
-    """Creates standardized success response with consistent format."""
-    response = {"success": True}
-    response.update(data)
-    return response
+def create_response(data: Dict[str, Any]) -> Dict[str, Any]:
+    """Creates standardized response with consistent format."""
+    return data
 
 
 def safe_int_conversion(value: Any, default: int = 0) -> int:
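
The rename from `create_success_response` to `create_response` also changes behavior: the new function no longer adds a `success` key and returns its argument unchanged. The sketch below copies both bodies from the diff and runs them on a made-up payload to show the difference; the payload keys are illustrative only.

```python
from typing import Any, Dict


def create_success_response(data: Dict[str, Any]) -> Dict[str, Any]:
    """Old helper, as removed by the diff."""
    response = {"success": True}
    response.update(data)
    return response


def create_response(data: Dict[str, Any]) -> Dict[str, Any]:
    """New helper, as added by the diff."""
    return data


payload = {"document_id": "doc-1", "errors": []}  # hypothetical payload
old = create_success_response(payload)  # new dict with "success": True merged in
new = create_response(payload)          # the very same dict object, no "success" key
```

Any caller that branched on `response["success"]` would need to be checked after this change, and because the same object is returned, later mutations of the result also affect the caller's original dict.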

lib/idp_common_pkg/idp_common/agents/error_analyzer/tools/__init__.py

Lines changed: 24 additions & 20 deletions
@@ -13,29 +13,33 @@
 - Lambda function context extraction
 """
 
-from .cloudwatch_tools import search_document_logs, search_stack_logs
-from .document_analysis_tool import analyze_document_failure
-from .dynamodb_tools import (
-    get_document_by_key,
-    get_document_status,
-    get_tracking_table_name,
-    query_tracking_table,
+from .cloudwatch_tool import cloudwatch_document_logs, cloudwatch_stack_logs
+from .document_analysis_tool import analyze_document_error
+from .dynamodb_tool import (
+    dynamodb_document_record,
+    dynamodb_document_status,
+    dynamodb_table_name,
+    dynamodb_tracking_query,
 )
 from .error_analysis_tool import analyze_errors
-from .general_analysis_tool import analyze_recent_system_errors
-from .lambda_tools import get_document_context
-from .stepfunction_tools import analyze_stepfunction_execution
+from .general_analysis_tool import analyze_general_errors
+from .lambda_tool import lambda_document_context
+from .stepfunction_tool import stepfunction_execution_details
+from .xray_tool import xray_document_analysis, xray_service_map, xray_stack_traces
 
 __all__ = [
     "analyze_errors",
-    "analyze_document_failure",
-    "analyze_recent_system_errors",
-    "search_document_logs",
-    "search_stack_logs",
-    "get_document_context",
-    "get_document_by_key",
-    "get_document_status",
-    "get_tracking_table_name",
-    "query_tracking_table",
-    "analyze_stepfunction_execution",
+    "analyze_document_error",
+    "analyze_general_errors",
+    "cloudwatch_document_logs",
+    "cloudwatch_stack_logs",
+    "lambda_document_context",
+    "dynamodb_document_record",
+    "dynamodb_document_status",
+    "dynamodb_table_name",
+    "dynamodb_tracking_query",
+    "stepfunction_execution_details",
+    "xray_document_analysis",
+    "xray_service_map",
+    "xray_stack_traces",
 ]
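
For callers migrating across this commit, the renames can be collected in a lookup table. The old-to-new pairing below is inferred from the position of each name in the diff rather than stated by the commit, so treat it as an assumption; `RENAMED_TOOLS`, `NEW_XRAY_TOOLS`, and `new_tool_name` are names invented for this sketch.

```python
from typing import Dict, List

# Old export name -> new export name, paired by position in the diff (assumed).
RENAMED_TOOLS: Dict[str, str] = {
    "search_document_logs": "cloudwatch_document_logs",
    "search_stack_logs": "cloudwatch_stack_logs",
    "analyze_document_failure": "analyze_document_error",
    "get_document_by_key": "dynamodb_document_record",
    "get_document_status": "dynamodb_document_status",
    "get_tracking_table_name": "dynamodb_table_name",
    "query_tracking_table": "dynamodb_tracking_query",
    "analyze_recent_system_errors": "analyze_general_errors",
    "get_document_context": "lambda_document_context",
    "analyze_stepfunction_execution": "stepfunction_execution_details",
}

# Exports that are new in this commit and have no predecessor.
NEW_XRAY_TOOLS: List[str] = [
    "xray_document_analysis",
    "xray_service_map",
    "xray_stack_traces",
]


def new_tool_name(old_name: str) -> str:
    """Return the post-commit name for an export; unchanged names pass through."""
    return RENAMED_TOOLS.get(old_name, old_name)
```

The module files were renamed as well (`cloudwatch_tools` to `cloudwatch_tool`, `dynamodb_tools` to `dynamodb_tool`, `lambda_tools` to `lambda_tool`, `stepfunction_tools` to `stepfunction_tool`), so imports that bypass the package `__init__` need the same treatment.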
