Feature/tesseract anomaly detection #403
Open: vmodak-nv wants to merge 2 commits into NVIDIA:main from vmodak-nv:feature/tesseract-anomaly-detection
327 changes: 327 additions & 0 deletions
industries/asset_lifecycle_management_agent/configs/config-reasoning-tesseract.yaml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,327 @@ | ||
| # SPDX-FileCopyrightText: Copyright (c) 2023-2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| general: | ||
| use_uvloop: true | ||
| telemetry: | ||
| logging: | ||
| console: | ||
| _type: console | ||
| level: DEBUG | ||
| # level: INFO | ||
| # file: | ||
| # _type: file | ||
| # path: "alm.log" | ||
| # level: DEBUG | ||
| # tracing: | ||
| # phoenix: | ||
| # _type: phoenix | ||
| # endpoint: http://localhost:6006/v1/traces | ||
| # project: alm-agent | ||
| # catalyst: | ||
| # _type: catalyst | ||
| # project: "alm-agent" | ||
| # dataset: "alm-agent" | ||
|
|
||
| llms: | ||
| # SQL query generation model | ||
| sql_llm: | ||
| _type: nim | ||
| model_name: "qwen/qwen2.5-coder-32b-instruct" | ||
|
|
||
| # Data analysis and tool calling model | ||
| analyst_llm: | ||
| _type: nim | ||
| model_name: "qwen/qwen3-coder-480b-a35b-instruct" | ||
|
|
||
| # Python code generation model | ||
| coding_llm: | ||
| _type: nim | ||
| model_name: "qwen/qwen2.5-coder-32b-instruct" | ||
|
|
||
| # Main reasoning and planning model | ||
| reasoning_llm: | ||
| _type: nim | ||
| model_name: "nvidia/llama-3.3-nemotron-super-49b-v1" | ||
|
|
||
| # Multimodal evaluation model (Vision-Language Model) | ||
| multimodal_judging_llm: | ||
| _type: nim | ||
| model_name: nvidia/llama-3.1-nemotron-nano-vl-8b-v1 | ||
|
|
||
| embedders: | ||
| # Text embedding model for vector database operations | ||
| vanna_embedder: | ||
| _type: nim | ||
| model_name: "nvidia/llama-3_2-nv-embedqa-1b-v2" | ||
|
|
||
| functions: | ||
| sql_retriever: | ||
| _type: generate_sql_query_and_retrieve_tool | ||
| llm_name: "sql_llm" | ||
| embedding_name: "vanna_embedder" | ||
| # Vector store configuration | ||
| vector_store_type: "chromadb" # Optional, chromadb is default | ||
| vector_store_path: "database" | ||
| # Database configuration | ||
| db_type: "sqlite" # Optional, sqlite is default | ||
| db_connection_string_or_path: "database/nasa_turbo.db" | ||
| # Output configuration | ||
| output_folder: "output_data" | ||
| vanna_training_data_path: "vanna_training_data.yaml" | ||
|
|
||
| predict_rul: | ||
| _type: predict_rul_tool | ||
| output_folder: "output_data" | ||
| scaler_path: "models/scaler_model.pkl" | ||
| model_path: "models/xgb_model_fd001.pkl" | ||
|
|
||
| anomaly_detection: | ||
| _type: nv_tesseract_anomaly_detection_tool | ||
| nim_endpoint: "http://localhost:8001" | ||
| timeout: 120 | ||
| output_folder: "output_data" | ||
| custom_threshold: 3.0 # Lower threshold to catch gradual degradation (default: None for NIM auto-threshold) | ||
|
|
||
| plot_distribution: | ||
| _type: plot_distribution_tool | ||
| output_folder: "output_data" | ||
|
|
||
| plot_line_chart: | ||
| _type: plot_line_chart_tool | ||
| output_folder: "output_data" | ||
|
|
||
| plot_comparison: | ||
| _type: plot_comparison_tool | ||
| output_folder: "output_data" | ||
|
|
||
| plot_anomaly: | ||
| _type: plot_anomaly_tool | ||
| output_folder: "output_data" | ||
|
|
||
| code_generation_assistant: | ||
| _type: code_generation_assistant | ||
| llm_name: "coding_llm" | ||
| code_execution_tool: "code_execution" | ||
| verbose: true | ||
|
|
||
| code_execution: | ||
| _type: code_execution | ||
| uri: http://127.0.0.1:6000/execute | ||
| sandbox_type: "local" | ||
| max_output_characters: 2000 | ||
|
|
||
| data_analysis_assistant: | ||
| _type: react_agent | ||
| llm_name: "analyst_llm" | ||
| max_iterations: 20 | ||
| max_retries: 2 | ||
| tool_names: [ | ||
| "sql_retriever", | ||
| "predict_rul", | ||
| "plot_distribution", | ||
| "plot_line_chart", | ||
| "plot_comparison", | ||
| "anomaly_detection", | ||
| "plot_anomaly", | ||
| "code_generation_assistant" | ||
| ] | ||
| parse_agent_response_max_retries: 2 | ||
| system_prompt: | | ||
| ### TASK DESCRIPTION #### | ||
| You are a helpful data analysis assistant specializing in Asset Lifecycle Management tasks, currently focused on predictive maintenance for turbofan engines. | ||
| **USE THE PROVIDED PLAN THAT FOLLOWS "Here is the plan that you could use if you wanted to.."** | ||
|
|
||
| ### TOOLS ### | ||
| You can use the following tools to help with your task: | ||
| {tools} | ||
|
|
||
| ### RESPONSE FORMAT ### | ||
| **STRICTLY RESPOND IN EITHER OF THE FOLLOWING FORMATS**: | ||
|
|
||
| **FORMAT 1 (to share your thoughts)** | ||
| Input plan: Summarize all the steps in the plan. | ||
| Executing step: the step you are currently executing from the plan along with any instructions provided | ||
| Thought: describe how you are going to execute the step | ||
|
|
||
| **FORMAT 2 (to return the final answer)** | ||
| Input plan: Summarize all the steps in the plan. | ||
| Executing step: the step you are currently executing from the plan along with any instructions provided | ||
| Thought: describe how you are going to execute the step | ||
| Final Answer: the final answer to the original input question including the absolute file paths of the generated files with | ||
| `./output_data/` prepended to the filename. | ||
|
|
||
| **FORMAT 3 (when using a tool)** | ||
| Input plan: Summarize all the steps in the plan. | ||
| Executing step: the step you are currently executing from the plan along with any instructions provided | ||
| Thought: describe how you are going to execute the step | ||
| Action: the action to take, should be one of [{tool_names}] | ||
| Action Input: the input to the tool (if there is no required input, include "Action Input: None") | ||
| Observation: wait for the tool to finish execution and return the result | ||
|
|
||
| ### HOW TO CHOOSE THE RIGHT TOOL ### | ||
| Follow these guidelines while deciding the right tool to use: | ||
| **CRITICAL: When writing Action: tool_name, use PLAIN TEXT ONLY. Do NOT use markdown formatting like **tool_name**. Just write the tool name directly.** | ||
| **Ensure that tool calls do not use single quotes or double quotes within the key-value pairs.** | ||
|
|
||
| 1. **SQL Retrieval Tool** | ||
| - Use this tool to retrieve data from the database. | ||
| - NEVER generate SQL queries by yourself, instead pass the top-level instruction to the tool. | ||
|
|
||
| 2. **Prediction Tools** | ||
| - Use predict_rul for RUL prediction requests. | ||
| - Always call data retrieval tool to get sensor data before predicting RUL. | ||
|
|
||
| 3. **Analysis and Plotting Tools** | ||
| - plot_line_chart: to plot line charts between two columns of a dataset. | ||
| - plot_distribution: to plot a histogram/distribution analysis of a column. | ||
| - plot_comparison: to compare two columns of a dataset by plotting both of them on the same chart. | ||
|
|
||
| 4. **Anomaly Detection Tools** | ||
| - Use anomaly_detection tool for production-grade anomaly detection using NV Tesseract foundation model via NVIDIA NIM. | ||
| - **REQUIRES JSON DATA**: First use sql_retriever to get sensor data, then pass the JSON file path to anomaly_detection. | ||
| - **OUTPUT**: Creates enhanced sensor data with added 'is_anomaly' boolean column. | ||
| - Use plot_anomaly to create interactive visualizations of anomaly detection results. | ||
|
|
||
| 5. **Code Generation Guidelines** | ||
| When using code_generation_assistant, provide comprehensive instructions in a single parameter: | ||
| • Include complete task description with user context and requirements | ||
| • Specify available data files and their structure (columns, format, location) | ||
| • Combine multiple related tasks into bullet points within one instruction | ||
| • Mention specific output requirements (HTML files, JSON data, visualizations) | ||
| • The tool automatically generates and executes Python code, returning results and file paths. | ||
|
|
||
| 6. **File Path Handling** | ||
| - When giving instructions to the code_generation_assistant, use only the filename itself (for example, filename.json). Do not include any folder paths, since the code_generation_assistant already operates within the outputs directory. | ||
|
|
||
| ### TYPICAL WORKFLOW FOR EXECUTING A PLAN ### | ||
|
|
||
| First, Data Extraction and analysis | ||
| - Use SQL retrieval tool to fetch required data | ||
| - **Use code_generation_assistant to perform any data processing using Python code ONLY IF INSTRUCTED TO DO SO.** | ||
| Next, Data Visualization | ||
| - Use existing plotting tools to generate plots | ||
| - Use predict_rul, anomaly_detection or any other relevant tools to perform analysis | ||
| Finally, return the result to the user | ||
| - Return processed information to calling agent. | ||
|
|
||
| workflow: | ||
| _type: reasoning_agent | ||
| augmented_fn: "data_analysis_assistant" | ||
| llm_name: "reasoning_llm" | ||
| verbose: true | ||
| reasoning_prompt_template: | | ||
| ### DESCRIPTION ### | ||
| You are a Data Analysis Reasoning and Planning Expert specialized in Asset Lifecycle Management, with expertise in analyzing turbofan engine sensor data and predictive maintenance tasks. | ||
| You are tasked with creating detailed execution plans for addressing user queries while being conversational and helpful. | ||
|
|
||
| **Your Role and Capabilities:** | ||
| - Expert in Asset Lifecycle Management, turbofan engine data analysis, predictive maintenance, and anomaly detection | ||
| - Provide conversational responses while maintaining technical accuracy | ||
| - Create step-by-step execution plans using available tools which will be invoked by a data analysis assistant | ||
|
|
||
| **You are given a data analysis assistant to execute your plan; all you have to do is generate the plan.** | ||
| DO NOT USE MARKDOWN FORMATTING IN YOUR RESPONSE. | ||
|
|
||
| ### ASSISTANT DESCRIPTION ### | ||
| {augmented_function_desc} | ||
|
|
||
| ### TOOLS AVAILABLE TO THE ASSISTANT ### | ||
| {tools} | ||
|
|
||
| ### CONTEXT ### | ||
| You work with turbofan engine sensor data from multiple engines in a fleet. The data contains: | ||
| - **Time series data** from different engines, each with unique wear patterns and operational history, separated into | ||
| four datasets (FD001, FD002, FD003, FD004); each dataset is further divided into training and test subsets. | ||
| - **26 data columns**: unit number, time in cycles, 3 operational settings, and 21 sensor measurements | ||
| - **Engine lifecycle**: Engines start operating normally, then develop faults that grow until system failure | ||
| - **Asset Lifecycle Management - Operation & Maintenance Phase**: Predict Remaining Useful Life (RUL) - how many operational cycles before failure | ||
| - **Data characteristics**: Contains normal operational variation, sensor noise, and progressive fault development | ||
| This context helps you understand user queries about engine health, sensor patterns, failure prediction, and maintenance planning. | ||
| REMEMBER TO RELY ON THE DATA ANALYSIS ASSISTANT TO RETRIEVE DATA FROM THE DATABASE. | ||
|
|
||
| ### SPECIAL CONSTRAINTS ### | ||
| Create execution plans for Asset Lifecycle Management tasks (currently focused on predictive maintenance and sensor data analysis). For other queries, use standard reasoning. | ||
| Apply piecewise RUL transformation to the actual RUL values when plotting it against predicted RUL values using the code generation assistant. | ||
|
|
||
| ### GUIDELINES ### | ||
| **DO NOT use predict_rul tool to fetch RUL data unless the user explicitly uses the word "predict" or something similar, this is because there is also ground truth RUL data in the database which the user might request sometimes.** | ||
| **REMEMBER: SQL retrieval tool is smart enough to understand queries like counts, totals, basic facts etc. It can use DISTINCT, COUNT(), SUM(), AVG(), MIN(), MAX() to answer simple queries. NO NEED TO USE CODE GENERATION ASSISTANT FOR SIMPLE QUERIES.** | ||
| **CODE GENERATION ASSISTANT IS COSTLY AND UNRELIABLE MOST OF THE TIME. SO PLEASE USE IT ONLY FOR COMPLEX QUERIES THAT REQUIRE DATA PROCESSING AND VISUALIZATION.** | ||
|
|
||
| **User Input:** | ||
| {input_text} | ||
|
|
||
| Analyze the input and create an appropriate execution plan in bullet points. | ||
|
|
||
| eval: | ||
| general: | ||
| output: | ||
| dir: "eval_output" | ||
| cleanup: true | ||
| dataset: | ||
| _type: json | ||
| file_path: "eval_data/eval_set_master.json" | ||
| query_delay: 10 # seconds between queries | ||
| max_concurrent: 1 # process queries sequentially | ||
|
|
||
| evaluators: | ||
| multimodal_eval: | ||
| _type: multimodal_llm_judge_evaluator | ||
| llm_name: "multimodal_judging_llm" | ||
| judge_prompt: | | ||
| You are an expert evaluator for Asset Lifecycle Management agentic workflows, with expertise in predictive maintenance tasks. | ||
| Your task is to evaluate how well a generated response (which may include both text and visualizations) | ||
| matches the reference answer for a given question. | ||
|
|
||
| Question: {question} | ||
| Reference Answer: {reference_answer} | ||
| Generated Response: {generated_answer} | ||
|
|
||
| IMPORTANT: You MUST provide your response ONLY as a valid JSON object. | ||
| Do not include any text before or after the JSON. | ||
|
|
||
| # EVALUATION LOGIC | ||
| Your evaluation mode is determined by whether actual plot images are attached to this message: | ||
| - If PLOT IMAGES are attached → Perform ONLY PLOT EVALUATION by examining the actual plot images | ||
| - If NO IMAGES are attached → Perform ONLY TEXT EVALUATION of the text response | ||
|
|
||
| DO NOT confuse text mentions of plots/files with actual attached images. | ||
| Only evaluate plots if you can actually see plot images in this message. | ||
|
|
||
| ## TEXT EVALUATION (only when no images are attached) | ||
| Check if the generated text answer semantically matches the reference answer: | ||
| - 1.0: Generated answer fully matches the reference answer semantically | ||
| - 0.5: Generated answer partially matches with some missing/incorrect elements | ||
| - 0.0: Generated answer does not match the reference answer semantically | ||
|
|
||
| ## PLOT EVALUATION (only when images are attached) | ||
| Use the reference answer as expected plot description and check how well the actual plot matches: | ||
| - 1.0: Generated plot shows all major elements described in the reference answer | ||
| - 0.5: Generated plot shows some elements but missing significant aspects | ||
| - 0.0: Generated plot does not match the reference answer description | ||
|
|
||
| # RESPONSE FORMAT | ||
| You MUST respond with ONLY this JSON format: | ||
| {{ | ||
| "score": 0.0, | ||
| "reasoning": "EVALUATION TYPE: [TEXT or PLOT] - [your analysis and score with justification]" | ||
| }} | ||
|
|
||
| CRITICAL REMINDER: | ||
| - If images are attached → Use "EVALUATION TYPE: PLOT" | ||
| - If no images → Use "EVALUATION TYPE: TEXT" | ||
|
|
||
| Replace the score with your actual evaluation (0.0, 0.5, or 1.0). | ||
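
The judge prompt above asks the model for a JSON object with a `score` of 0.0, 0.5, or 1.0 and a `reasoning` string that starts with `EVALUATION TYPE: TEXT` or `EVALUATION TYPE: PLOT`. A minimal sketch of how a caller might check such a reply is shown here. The function name `parse_judge_response` and the returned dictionary shape are hypothetical and are not taken from the evaluator in this PR.

```python
import json

# Scores the judge prompt allows; anything else is treated as a malformed reply.
ALLOWED_SCORES = {0.0, 0.5, 1.0}


def parse_judge_response(raw: str) -> dict:
    """Validate a judge reply (hypothetical helper, not the PR's evaluator code)."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on non-JSON text
    if not isinstance(data, dict):
        raise ValueError("judge response must be a JSON object")

    score = data.get("score")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise ValueError("score must be a number")
    if float(score) not in ALLOWED_SCORES:
        raise ValueError(f"score must be one of {sorted(ALLOWED_SCORES)}")

    reasoning = data.get("reasoning")
    if not isinstance(reasoning, str):
        raise ValueError("reasoning must be a string")
    if reasoning.startswith("EVALUATION TYPE: PLOT"):
        evaluation_type = "PLOT"
    elif reasoning.startswith("EVALUATION TYPE: TEXT"):
        evaluation_type = "TEXT"
    else:
        raise ValueError("reasoning must start with 'EVALUATION TYPE: TEXT' or 'EVALUATION TYPE: PLOT'")

    return {"score": float(score), "evaluation_type": evaluation_type, "reasoning": reasoning}


if __name__ == "__main__":
    reply = '{"score": 0.5, "reasoning": "EVALUATION TYPE: TEXT - partial match"}'
    print(parse_judge_response(reply))
```

The doubled braces around the JSON example in the prompt look like template escapes (as in Python's `str.format`), so the model would presumably see single braces. That is an assumption about how the prompt is rendered, not something this diff confirms.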
_type: moment_anomaly_detection_tool
config: ...
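
The config in this PR wires components together by name: `llm_name` and `embedding_name` point at entries under `llms` and `embedders`, while `tool_names` and `code_execution_tool` point at entries under `functions`. Renaming a tool or its `_type` can leave one of these references dangling. The sketch below is one way to catch that before running the agent. It works on a plain dictionary that mirrors a trimmed subset of the config; the helper `find_dangling_references` is hypothetical and is not part of the toolkit.

```python
def find_dangling_references(config: dict) -> list[str]:
    """Report name references that do not resolve (hypothetical helper)."""
    llms = config.get("llms", {})
    embedders = config.get("embedders", {})
    functions = config.get("functions", {})
    problems = []

    for name, fn in functions.items():
        llm = fn.get("llm_name")
        if llm is not None and llm not in llms:
            problems.append(f"functions.{name}.llm_name: unknown llm '{llm}'")

        embedder = fn.get("embedding_name")
        if embedder is not None and embedder not in embedders:
            problems.append(f"functions.{name}.embedding_name: unknown embedder '{embedder}'")

        executor = fn.get("code_execution_tool")
        if executor is not None and executor not in functions:
            problems.append(f"functions.{name}.code_execution_tool: unknown function '{executor}'")

        for tool in fn.get("tool_names", []):
            if tool not in functions:
                problems.append(f"functions.{name}.tool_names: unknown function '{tool}'")

    return problems


# Trimmed mirror of the YAML in this PR; only the keys needed for the check are kept.
config = {
    "llms": {"sql_llm": {}, "analyst_llm": {}, "coding_llm": {}},
    "embedders": {"vanna_embedder": {}},
    "functions": {
        "sql_retriever": {"llm_name": "sql_llm", "embedding_name": "vanna_embedder"},
        "anomaly_detection": {"_type": "nv_tesseract_anomaly_detection_tool"},
        "plot_anomaly": {"_type": "plot_anomaly_tool"},
        "code_execution": {"_type": "code_execution"},
        "code_generation_assistant": {
            "llm_name": "coding_llm",
            "code_execution_tool": "code_execution",
        },
        "data_analysis_assistant": {
            "llm_name": "analyst_llm",
            "tool_names": [
                "sql_retriever",
                "anomaly_detection",
                "plot_anomaly",
                "code_generation_assistant",
            ],
        },
    },
}

if __name__ == "__main__":
    for problem in find_dangling_references(config):
        print(problem)
```

In practice the dictionary would come from loading the YAML file with a YAML parser; the check itself does not depend on how the config is loaded.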