# How Triage Preemptive Tool Calls Resolve Arguments for `device_lookup`

## Overview

This guide explains how the triage preemptive tool call determines the correct arguments for the `device_lookup` function, how it uses function signatures and docstrings, and what known issues and caveats exist (particularly around category values such as `smartphones` vs `phones`).

## Prerequisites

Before using or modifying the triage preemptive tool call behavior, you should:

- Be familiar with:
  - Python function signatures and docstrings (`__doc__`).
  - How tools are registered and exposed to the triage/infobot system (for example, via `TOOLS_BY_ROUTINE` or similar registries).
  - The initialization flow of:
    - `KnowledgeBase.init`
    - The FastAPI application startup
    - Swarm (or equivalent orchestration layer)
- Have access to and be able to read the relevant codebase, including:
  - `libraries/nectar/nectar/util.py` (specifically around line 224).
  - The `infobot_tools` module.
  - `engine.py` where `KnowledgeBase.init` is invoked.

## How Argument Resolution Works

### 1. Argument Extraction from Function Signature

The triage preemptive tool call primarily determines the arguments for `device_lookup` from the function’s Python signature, not from the docstring.

- The tool registration logic inspects the function object for:
  - Parameter names
  - Parameter types (if annotated)
  - Default values
- These are then converted into the tool schema (for example, via a helper such as `function_to_json` or equivalent logic in `nectar/util.py`).

Implication:
Even if the docstring is not yet set or is empty, the argument list itself is still derived correctly from the function signature.
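
As a rough illustration of signature-driven extraction, the sketch below builds a minimal schema with `inspect.signature`. The helper name `function_to_schema`, the type mapping, and the stand-in `device_lookup` signature are all assumptions made for this example; the actual logic in `nectar/util.py` may differ.

```python
import inspect
from typing import Any, Callable

# Hypothetical mapping from Python annotations to JSON-schema type names.
_TYPE_NAMES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def function_to_schema(func: Callable[..., Any]) -> dict:
    """Build a minimal tool schema from a function's signature and docstring."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unannotated or unknown types fall back to "string" in this sketch.
        properties[name] = {"type": _TYPE_NAMES.get(param.annotation, "string")}
        # Parameters without a default value are treated as required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": func.__doc__ or "",  # empty when no docstring is set
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


# Hypothetical stand-in for the real device_lookup; it deliberately has no docstring.
def device_lookup(category: str, brand: str = ""):
    return []


schema = function_to_schema(device_lookup)
```

Here the argument names and the required list come out intact even though the description is empty, which matches the implication above.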

### 2. Use of Docstrings for Descriptions and Semantic Hints

The docstring of `device_lookup` is used to provide:

- Descriptions of the function and its parameters.
- Semantic hints such as:
  - Valid categories (for example, `phones` vs `smartphones`).
  - Brand or category filtering behavior.

The current behavior is:

- **Signature** → Source of argument names and structure.
- **Docstring** → Source of argument descriptions and domain-specific hints (such as which category labels to use).

This explains why the system can still call `device_lookup` with the correct argument names, but may choose the wrong category value (for example, `smartphones` instead of `phones`) if the docstring is not available or not correctly initialized.

### 3. Initialization Order and Docstring Assignment

There is a critical dependency on when the docstring for `device_lookup` is set relative to when the knowledge base and tools are initialized:

- In one implementation:
  - `__doc__` for `device_lookup` is set at the **top level** of the module.
  - When the module is imported, the docstring is immediately available.
  - When the function is added to `TOOLS_BY_ROUTINE` (or similar), the docstring is already populated.

- In the `infobot_tools` implementation:
  - The docstring (`__doc__`) is set **inside an `init` function**, not at the top level.
  - `KnowledgeBase.init` is called from `engine.py` when FastAPI starts, before Swarm is initialized.
  - If `KnowledgeBase.init` (or any tool registration logic) runs **before** `init` in `infobot_tools` is called, then:
    - The `device_lookup` function is registered with an **empty or default docstring**.
    - Argument descriptions and category hints from the docstring are not available at registration time.

This ordering issue is the likely cause of incorrect category suggestions (for example, `smartphones` instead of `phones`), because the tool schema is built before the docstring-based metadata is applied.
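
The ordering problem can be reproduced in isolation. The following is a minimal sketch, assuming the registry snapshots `__doc__` at registration time; `TOOLS`, `register`, the `init` body, and the docstring text are hypothetical stand-ins rather than the real `infobot_tools` code.

```python
TOOLS = {}  # hypothetical stand-in for TOOLS_BY_ROUTINE


def device_lookup(category: str, brand: str = ""):
    return []


def register(func):
    # Snapshot the description at registration time, as a schema builder would.
    TOOLS[func.__name__] = {"description": func.__doc__ or ""}


def init():
    # Hypothetical late docstring assignment, mirroring the pattern described above.
    device_lookup.__doc__ = "Look up devices. Valid categories: phones, tablets."


register(device_lookup)  # registration runs first
init()                   # the docstring arrives too late for the snapshot

stale = TOOLS["device_lookup"]["description"]

register(device_lookup)  # registering again after init picks up the docstring
fresh = TOOLS["device_lookup"]["description"]
```

In this sketch `stale` is empty while `fresh` carries the category hint, so the model would see no guidance about `phones` if only the first registration happened.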

## Important Notes and Caveats

1. **Docstring Timing Matters**
   The docstring must be set **before** the function is registered as a tool. If `__doc__` is assigned inside an initialization function that runs later, the tool registry will not see the updated docstring.

2. **Function Imports Do Not Re-Apply Docstrings**
   Importing the function from `infobot_tools` executes top-level code only. If the docstring is set inside `init` and `init` is not called before tool registration, the docstring will remain empty at registration time.

3. **Differences Between Implementations**
   There are known differences between:
   - The “core” or reference implementation (for example, in `nectar/util.py`).
   - The `infobot_tools` implementation.

   These differences include:
   - Where and when `__doc__` is set.
   - How and when the knowledge base is initialized.
   - Potentially, whether category/brand filtering is implemented in the same way.

4. **Recent Hotfixes May Not Be Final**
   A recent change was added to address this issue (ensuring `KnowledgeBase.init` runs at FastAPI startup and/or adjusting when docstrings are set). The author expressed low confidence that this is a robust or long-term fix. A better abstraction for tools is desired.

5. **Enum-Based Categories as a Future Improvement**
   An “enum trick” was mentioned as a potential solution. Using enumerations for categories would:
   - Make valid category values explicit.
   - Reduce reliance on docstring parsing for category semantics.
   - Improve robustness of category selection (for example, enforcing `phones` instead of `smartphones`).
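
One way the enum idea might look is sketched below. The `Category` members, the `param_schema` helper, and the `device_lookup` signature are assumptions for illustration only; the definitive category list and the real schema builder are not shown in the source thread.

```python
import inspect
from enum import Enum


class Category(str, Enum):
    # Hypothetical values; the definitive category list is not given here.
    PHONES = "phones"
    TABLETS = "tablets"


def device_lookup(category: Category, brand: str = ""):
    return []


def param_schema(annotation):
    # Enum annotations expose their allowed values directly in the schema.
    if inspect.isclass(annotation) and issubclass(annotation, Enum):
        return {"type": "string", "enum": [member.value for member in annotation]}
    return {"type": "string"}


properties = {
    name: param_schema(param.annotation)
    for name, param in inspect.signature(device_lookup).parameters.items()
}
```

Because the allowed values travel with the signature, they do not depend on when `__doc__` is assigned, and converting an unexpected value such as `Category("smartphones")` raises `ValueError` instead of silently passing through.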

## Troubleshooting

### Symptom: Category is `smartphones` Instead of `phones`

**Observed issue**

The triage tool call is producing a category of `smartphones` instead of the expected `phones`. The correct category information is known to be present in the `device_lookup` docstring.

**Likely cause**

The docstring for `device_lookup` is not set at the time the tool is registered, due to initialization order:

- `KnowledgeBase.init` (or equivalent tool registration) runs.
- `device_lookup` is added to `TOOLS_BY_ROUTINE` while its `__doc__` is empty.
- Later, `init` in `infobot_tools` sets the docstring, but the tool schema has already been built.

**Steps to diagnose**

1. **Check where `__doc__` is set for `device_lookup`**
   - Confirm whether `__doc__` is assigned at the top level of the module or inside an `init` function.
   - If it is inside `init`, note that it may be too late for tool registration.

2. **Verify initialization order**
   - Confirm when `KnowledgeBase.init` is called in `engine.py` relative to:
     - FastAPI application startup.
     - Any `init` function in `infobot_tools` that sets docstrings.
   - Ensure that the function that sets `__doc__` is executed **before** `KnowledgeBase.init`.

3. **Inspect the tool registry at runtime**
   - After FastAPI starts and before any requests, inspect `TOOLS_BY_ROUTINE` (or equivalent) to see:
     - Whether `device_lookup` is present.
     - What description and argument metadata it has.
   - Confirm whether the category hints from the docstring are present in the tool metadata.

4. **Compare with the reference implementation**
   - Review the logic around line 224 in `libraries/nectar/nectar/util.py` to see:
     - How function signatures and docstrings are converted into tool schemas.
     - Whether your `infobot_tools` implementation diverges in a way that affects docstring usage.
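
The registry inspection in step 3 could be scripted along these lines. This is only a sketch: the real structure of `TOOLS_BY_ROUTINE` is not documented here, so the example assumes a plain dict mapping tool names to entries with a `description` field, and `missing_hints` is a hypothetical helper.

```python
def missing_hints(registry: dict, tool_name: str, expected: list[str]) -> list[str]:
    """Return the expected substrings that are absent from a tool's description."""
    entry = registry.get(tool_name)
    if entry is None:
        raise KeyError(f"{tool_name} is not registered")
    description = entry.get("description", "")
    return [hint for hint in expected if hint not in description]


# Hypothetical snapshot of a registry whose docstring was not set in time.
registry = {"device_lookup": {"description": ""}}

result = missing_hints(registry, "device_lookup", ["Valid categories"])
```

A non-empty `result` right after startup suggests the schema was built before the docstring was assigned.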

**Potential fixes**

- **Move docstring assignment to top level**
  Set `device_lookup.__doc__` at module import time, not inside `init`, so it is always available when the function is imported and registered.

- **Ensure `init` runs before tool registration**
  If you must keep docstring assignment inside `init`, call `init` before `KnowledgeBase.init` or any tool registration logic.

- **Introduce explicit enums for categories**
  Replace or supplement docstring-based category hints with explicit enumerations in the function signature or tool schema. This reduces reliance on docstring timing and parsing.
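
A minimal sketch of the first two fixes follows, assuming a simple dict registry. The `register_tool` guard, the `TOOLS` dict, and the docstring text are hypothetical; the guard is an optional extra that makes an ordering mistake fail loudly at startup rather than surfacing later as a wrong category.

```python
TOOLS = {}  # hypothetical stand-in for TOOLS_BY_ROUTINE


def register_tool(func):
    # Fail fast if the docstring has not been set by the time of registration.
    if not (func.__doc__ or "").strip():
        raise RuntimeError(f"{func.__name__} has no docstring at registration time")
    TOOLS[func.__name__] = func
    return func


def device_lookup(category: str, brand: str = ""):
    return []


# Top-level assignment: this runs at import time, before any registration call.
device_lookup.__doc__ = "Look up devices by category (use 'phones', not 'smartphones')."

register_tool(device_lookup)
```

If the assignment were moved back inside an `init` function that runs after `register_tool`, the guard in this sketch would raise instead of registering a tool with an empty description.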

## Additional Information Needed

To fully document and validate the behavior, the following would be helpful:

- The exact implementation of:
  - The function that converts Python functions into tool schemas (for example, `function_to_json`).
  - The `KnowledgeBase.init` method and how it registers tools.
  - The `infobot_tools` `init` function and where `device_lookup.__doc__` is set.
- A definitive list of valid categories and how they are intended to be enforced (for example, via enums, constants, or docstring conventions).
- Confirmation of the final, agreed-upon abstraction for tools (for example, a standardized way to specify argument descriptions and allowed values outside of docstrings).

---
*Source: [Original Slack thread](https://distylai.slack.com/archives/impl-tower-infobot/p1737151724894869)*