diff --git a/content/en/llm_observability/instrumentation/api.md b/content/en/llm_observability/instrumentation/api.md
index 63cf4aaeaa1..83abc4859ff 100644
--- a/content/en/llm_observability/instrumentation/api.md
+++ b/content/en/llm_observability/instrumentation/api.md
@@ -153,6 +153,8 @@ If the request is successful, the API responds with a 202 network code and an em
| messages | [Message](#message) | List of messages. This should only be used for LLM spans. |
| documents | [Document](#document) | List of documents. This should only be used as the output for retrieval spans. |
| prompt | [Prompt](#prompt) | Structured prompt metadata that includes the template and variables used for the LLM input. This should only be used for input IO on LLM spans. |
+| embedding | [float] | Vector embedding representation of the input or output. |
+| parameters | Dict[key (string), any] | Additional parameters associated with the input or output. |
**Note**: When only `input.messages` is set for an LLM span, Datadog infers `input.value` from `input.messages` and uses the following inference logic:
@@ -164,8 +166,10 @@ If the request is successful, the API responds with a 202 network code and an em
| Field | Type | Description |
|----------------------|--------|--------------------------|
-| content [*required*] | string | The body of the message. |
+| content | string | The body of the message. |
| role | string | The role of the entity. |
+| tool_calls | [[ToolCall](#toolcall)] | List of tool calls made in this message. |
+| tool_results | [[ToolResult](#toolresult)] | List of tool results returned in this message. |
#### Document
| Field | Type | Description |
@@ -175,6 +179,24 @@ If the request is successful, the API responds with a 202 network code and an em
| score | float | The score associated with this document. |
| id | string | The id of this document. |
+#### ToolCall
+
+| Field | Type | Description |
+|----------------------|--------|--------------------------|
+| name | string | The name of the tool being called. |
+| arguments | Dict[key (string), any] | Arguments passed to the tool. |
+| tool_id | string | Unique identifier for this tool call. |
+| type | string | The type of tool call. |
+
+#### ToolResult
+
+| Field | Type | Description |
+|----------------------|--------|--------------------------|
+| name | string | The name of the tool that returned this result. |
+| result | string | The result returned by the tool. |
+| tool_id | string | Unique identifier matching the tool call. |
+| type | string | The type of tool result. |
+
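Putting the pieces together, a message can carry a tool call and the matching tool result. The sketch below uses the field names from the Message, ToolCall, and ToolResult tables above; the tool name, arguments, and surrounding payload shape are illustrative assumptions, not a verbatim API body:

```python
# Sketch of LLM-span messages carrying a tool call and its result.
# Field names come from the Message/ToolCall/ToolResult tables; the
# example tool ("get_weather") and payload nesting are hypothetical.
import json

tool_call = {
    "name": "get_weather",            # hypothetical tool name
    "arguments": {"city": "Paris"},
    "tool_id": "call_001",
    "type": "function",
}

tool_result = {
    "name": "get_weather",
    "result": "18C, partly cloudy",
    "tool_id": "call_001",            # must match the tool call's tool_id
    "type": "tool_result",
}

messages = [
    {"role": "assistant", "content": "", "tool_calls": [tool_call]},
    {"role": "tool", "content": "", "tool_results": [tool_result]},
]

payload = json.dumps({"input": {"messages": messages}})
```

Since `content` is no longer required, a message that exists only to carry `tool_calls` can leave it empty.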
#### Prompt
LLM Observability registers a new version of a template when the `template` or `chat_template` value is updated. If the input is expected to change between invocations, extract the dynamic parts into variables.
@@ -183,6 +205,7 @@ If the request is successful, the API responds with a 202 network code and an em
{{% tab "Model" %}}
| Field | Type | Description |
|----------------------|--------|--------------------------|
+| name | string | The name of the prompt. |
| id | string | Logical identifier for this prompt template. Should be unique per `ml_app`. |
| version | string | Version tag for the prompt (for example, "1.0.0"). If not provided, LLM Observability automatically generates a version by computing a hash of the template content. |
| template | string | Single string template form. Use placeholder syntax (like `{{variable_name}}`) to embed variables. This should not be set with `chat_template`. |
@@ -222,22 +245,32 @@ If the request is successful, the API responds with a 202 network code and an em
| error | [Error](#error) | Error information on the span. |
| input | [IO](#io) | The span's input information. |
| output | [IO](#io) | The span's output information. |
+| expected_output | [IO](#io) | The expected output for the span. Used for evaluation purposes. |
| metadata | Dict[key (string), value] where the value is a float, bool, or string | Data about the span that is not input or output related. Use the following metadata keys for LLM spans: `temperature`, `max_tokens`, `model_name`, and `model_provider`. |
+| tool_definitions | [[ToolDefinition](#tooldefinition)] | List of tools available for use in the LLM request. |
+| intent | string | The intent or purpose of the span. |
#### Metrics
| Field | Type | Description |
|------------------------|---------|--------------|
+| prompt_tokens | float64 | The number of prompt tokens. **Only valid for LLM spans.** |
+| completion_tokens | float64 | The number of completion tokens. **Only valid for LLM spans.** |
| input_tokens | float64 | The number of input tokens. **Only valid for LLM spans.** |
| output_tokens | float64 | The number of output tokens. **Only valid for LLM spans.** |
+| reasoning_output_tokens | float64 | The number of reasoning output tokens. **Only valid for LLM spans.** |
| total_tokens | float64 | The total number of tokens associated with the span. **Only valid for LLM spans.** |
| time_to_first_token | float64 | The time in seconds it takes for the first output token to be returned in streaming-based LLM applications. Set for root spans. |
| time_per_output_token | float64 | The average time in seconds per output token returned in streaming-based LLM applications. Set for root spans. |
-| input_cost | float64 | The input cost in dollars. **Only valid for LLM and embedding spans.** |
-| output_cost | float64 | The output cost in dollars. **Only valid for LLM spans.** |
-| total_cost | float64 | The total cost in dollars. **Only valid for LLM spans.** |
-| non_cached_input_cost | float64 | The non cached input cost in dollars. **Only valid for LLM spans.** |
-| cache_read_input_cost | float64 | The cache read input cost in dollars. **Only valid for LLM spans.** |
-| cache_write_input_cost | float64 | The cache write input cost in dollars. **Only valid for LLM spans.** |
+| estimated_input_cost | float64 | The estimated input cost in dollars. **Only valid for LLM and embedding spans.** |
+| estimated_output_cost | float64 | The estimated output cost in dollars. **Only valid for LLM spans.** |
+| estimated_total_cost | float64 | The estimated total cost in dollars. **Only valid for LLM spans.** |
+| cache_read_input_tokens | int64 | The number of cache read input tokens. **Only valid for LLM spans.** |
+| cache_write_input_tokens | int64 | The number of cache write input tokens. **Only valid for LLM spans.** |
+| non_cached_input_tokens | int64 | The number of non-cached input tokens. **Only valid for LLM spans.** |
+| estimated_cache_read_input_cost | float64 | The estimated cache read input cost in dollars. **Only valid for LLM spans.** |
+| estimated_cache_write_input_cost | float64 | The estimated cache write input cost in dollars. **Only valid for LLM spans.** |
+| estimated_non_cached_input_cost | float64 | The estimated non-cached input cost in dollars. **Only valid for LLM spans.** |
+| estimated_reasoning_output_cost | float64 | The estimated reasoning output cost in dollars. **Only valid for LLM spans.** |
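As a sketch of how the token and estimated-cost fields relate, the example below assumes the cached and non-cached input tokens sum to the input tokens, and the input and output costs sum to the total cost; these additivity relationships are illustrative assumptions, not guarantees of the API:

```python
# Illustrative metrics object for an LLM span. The specific numbers
# and the additivity assumptions (cached + non-cached = input tokens,
# input + output cost = total cost) are for the sketch only.
metrics = {
    "input_tokens": 1200.0,
    "output_tokens": 300.0,
    "cache_read_input_tokens": 800,      # read from the provider's prompt cache
    "non_cached_input_tokens": 400,      # 800 + 400 = 1200 input tokens
    "estimated_input_cost": 0.0036,      # dollars
    "estimated_output_cost": 0.0045,     # dollars
}

metrics["total_tokens"] = metrics["input_tokens"] + metrics["output_tokens"]
metrics["estimated_total_cost"] = (
    metrics["estimated_input_cost"] + metrics["estimated_output_cost"]
)
```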
#### Span
@@ -255,6 +288,17 @@ If the request is successful, the API responds with a 202 network code and an em
| metrics | [Metrics](#metrics) | Datadog metrics to collect. |
| session_id | string | The span's `session_id`. Overrides the top-level `session_id` field. |
| tags | [[Tag](#tag)] | A list of tags to apply to this particular span. |
+| service | string | The service name associated with the span. |
+| ml_app | string | The ML application name. Overrides the top-level `ml_app` field. |
+| ml_app_version | string | The ML application version. |
+
+#### ToolDefinition
+
+| Field | Type | Description |
+|-------------|-------------------|---------------------|
+| name | string | The name of the tool. |
+| description | string | The description of the tool's function. |
+| schema | Dict[key (string), any] | The schema defining the arguments the tool accepts. |
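A tool definition pairs a name and description with a schema for its arguments. The sketch below uses the field names from the ToolDefinition table; the example tool and its JSON-Schema-style argument schema are hypothetical:

```python
# Sketch of a tool_definitions entry for a span. Field names match the
# ToolDefinition table; the "get_weather" tool and its schema are
# hypothetical examples.
tool_definition = {
    "name": "get_weather",
    "description": "Returns the current weather for a given city.",
    "schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

tool_definitions = [tool_definition]   # attached to the span's meta
```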
#### SpansRequestData
| Field | Type | Description |
@@ -466,6 +510,7 @@ Evaluations must be joined to a unique span. You can identify the target span us
| join_on [*required*] | [[JoinOn](#joinon)] | How the evaluation is joined to a span. |
| timestamp_ms [*required*] | int64 | A UTC UNIX timestamp in milliseconds representing the time the request was sent. |
| ml_app [*required*] | string | The name of your LLM application. See [Application naming guidelines](#application-naming-guidelines). |
+| ml_app_version | string | The version of the ML application that produced this metric. |
| metric_type [*required*] | string | The type of evaluation: `"categorical"`, `"score"`, or `"boolean"`. |
| label [*required*] | string | The unique name or label for the provided evaluation. |
| categorical_value [*required if the metric_type is "categorical"*] | string | A string representing the category that the evaluation belongs to. |
@@ -474,6 +519,7 @@ Evaluations must be joined to a unique span. You can identify the target span us
| assessment | string | An assessment of this evaluation. Accepted values are `pass` and `fail`. |
| reasoning | string | A text explanation of the evaluation result. |
| tags | [[Tag](#tag)] | A list of tags to apply to this particular evaluation metric. |
+| metadata | Dict[key (string), any] | Additional metadata to attach to the evaluation metric. |
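Assembled from the fields above, a score-type evaluation body might look like the following sketch. The `score_value` field, the JoinOn shape, and the Tag shape are not shown in this excerpt and are assumptions here; the application name and score are illustrative:

```python
# Sketch of a score-type evaluation metric body. Required fields are
# taken from the table above; "score_value", the join_on shape, and
# the tag shape are assumptions not shown in this excerpt.
import time

evaluation = {
    "join_on": [{"trace_id": "def456", "span_id": "abc123"}],  # shape assumed; see JoinOn
    "timestamp_ms": int(time.time() * 1000),
    "ml_app": "weather-bot",                 # hypothetical application name
    "ml_app_version": "2.1.0",
    "metric_type": "score",
    "label": "relevance",
    "score_value": 0.92,                     # required because metric_type is "score"
    "assessment": "pass",
    "reasoning": "Answer directly addresses the question.",
    "metadata": {"judge_model": "gpt-4o"},   # free-form evaluation metadata
}
```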
#### JoinOn