From 934a984d2c9ae650ec97a30f455fad0f96d56738 Mon Sep 17 00:00:00 2001 From: Trent Fowler Date: Thu, 11 Sep 2025 12:40:06 -0600 Subject: [PATCH 01/12] Getting the basic edits made. --- ...-announcing-major-command-deprecations.mdx | 32 +++++++++++++++++++ fern/pages/models/models.mdx | 15 --------- .../command-beta.mdx | 2 +- .../command-r-plus.mdx | 2 +- .../command-r.mdx | 2 +- .../command-r7b.mdx | 2 +- .../text-generation/structured-outputs.mdx | 4 --- .../text-generation/summarizing-text.mdx | 4 +-- .../v2/text-generation/structured-outputs.mdx | 4 --- fern/v1.yml | 12 ++----- fern/v2.yml | 6 ---- 11 files changed, 41 insertions(+), 44 deletions(-) create mode 100644 fern/pages/changelog/2025-09-15-announcing-major-command-deprecations.mdx diff --git a/fern/pages/changelog/2025-09-15-announcing-major-command-deprecations.mdx b/fern/pages/changelog/2025-09-15-announcing-major-command-deprecations.mdx new file mode 100644 index 000000000..3377eb6b5 --- /dev/null +++ b/fern/pages/changelog/2025-09-15-announcing-major-command-deprecations.mdx @@ -0,0 +1,32 @@ +--- +title: "Announcing Major Command Deprecations" +slug: "changelog/2025-09-15-major-command-deprecations" +createdAt: "Thu Aug 28 2025 00:00:00 (MST)" +hidden: false +description: >- +  This announcement covers the deprecation of several legacy Command models, v1 API endpoints, and platform features. +--- + +Today, we're announcing the deprecation of the following models, endpoints, and features. + +Models: + +- command +- command-light +- command-r-03-2024 +- command-r-plus-04-2024 + +Recommendation for users: move to newer models, i.e., command-r-08-2024, command-r-plus-08-2024, and/or command-a-03-2025. + +Endpoints: + +- v1/connectors +- v1/chat +- v1/generate +- v1/summarize +- v1/classify + +Recommendation for users: all platform users should move to the v2 API with this change. + +Other products and features: + +- Fine-tuning capabilities across the platform. This means fine-tuning on the platform will become unavailable, and any command-light, command, command-r, rerank, and classify fine-tuned models will become unavailable. +- The Command Slack app. +- Coral Web (coral.cohere.com). +- The connectors and search_queries_only parameters in /v1/chat. \ No newline at end of file diff --git a/fern/pages/models/models.mdx b/fern/pages/models/models.mdx index 3a0025441..aa876ffc0 100644 --- a/fern/pages/models/models.mdx +++ b/fern/pages/models/models.mdx @@ -44,15 +44,6 @@ Command is Cohere's default generation model that takes a user instruction (or c | `command-a-translate-08-2025` | Command A Translate is Cohere’s state of the art machine translation model, excelling at a variety of translation tasks on 23 languages: English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Chinese, Arabic, Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, Persian. | Text | 8K | 8k | [Chat](/reference/chat)| | `command-a-reasoning-08-2025` | Command A Reasoning is Cohere's first reasoning model, able to 'think' before generating an output in a way that allows it to perform well in certain kinds of nuanced problem-solving and agent-based tasks in 23 languages. | Text | 256k | 32k | [Chat](/reference/chat)| | `command-a-vision-07-2025` | Command A Vision is our first model capable of processing images, excelling in enterprise use cases such as analyzing charts, graphs, and diagrams, table understanding, OCR, document Q&A, and object detection. 

It officially supports English, Portuguese, Italian, French, German, and Spanish. | Text, Images | 128K | 8K | [Chat](/reference/chat)| -| `command-r-plus-04-2024` | Command R+ is an instruction-following conversational model that performs language tasks at a higher quality, more reliably, and with a longer context than previous models. It is best suited for complex RAG workflows and multi-step tool use. | Text | 128k | 4k | [Chat](/reference/chat) | -| `command-r-plus` | `command-r-plus` is an alias for `command-r-plus-04-2024`, so if you use `command-r-plus` in the API, that's the model you're pointing to. | Text | 128k | 4k | [Chat](/reference/chat) | -| `command-r-08-2024` | `command-r-08-2024` is an update of the Command R model, delivered in August 2024. Find more information [here](https://docs.cohere.com/changelog/command-gets-refreshed) | Text | 128k | 4k | [Chat](/reference/chat) | -| `command-r-03-2024` | Command R is an instruction-following conversational model that performs language tasks at a higher quality, more reliably, and with a longer context than previous models. It can be used for complex workflows like code generation, retrieval augmented generation (RAG), tool use, and agents. | Text | 128k | 4k | [Chat](/reference/chat) | -| `command-r` | `command-r` is an alias for `command-r-03-2024`, so if you use `command-r` in the API, that's the model you're pointing to. | Text | 128k | 4k | [Chat](/reference/chat) | -| `command` | An instruction-following conversational model that performs language tasks with high quality, more reliably and with a longer context than our base generative models. | Text | 4k | 4k | [Chat](/reference/chat),
[Summarize](/reference/summarize) | -| `command-nightly` | To reduce the time between major releases, we put out nightly versions of command models. For `command`, that is `command-nightly`.

Be advised that `command-nightly` is the latest, most experimental, and (possibly) unstable version of its default counterpart. Nightly releases are updated regularly, without warning, and are not recommended for production use. | Text | 128k | 4k | [Chat](/reference/chat) | -| `command-light` | A smaller, faster version of `command`. Almost as capable, but a lot faster. | Text | 4k | 4k | [Chat](/reference/chat),
[Summarize](/reference/summarize-2) | -| `command-light-nightly` | To reduce the time between major releases, we put out nightly versions of command models. For `command-light`, that is `command-light-nightly`.

Be advised that `command-light-nightly` is the latest, most experimental, and (possibly) unstable version of its default counterpart. Nightly releases are updated regularly, without warning, and are not recommended for production use. | Text | 4k | 4k | [Chat](/reference/chat) | ### Using Command Models on Different Platforms @@ -62,12 +53,6 @@ In this table, we provide some important context for using Cohere Command models | :------------------------------- | :------------------------------ | :-------------------- | :----------------------- | :------------------------------- | | `command-a-03-2025` | (Coming Soon) | Unique per deployment | Unique per deployment | `cohere.command-a-03-2025` | | `command-r7b-12-2024` | N/A | N/A | N/A | N/A | -| `command-r-plus` | `cohere.command-r-plus-v1:0` | Unique per deployment | Unique per deployment | `cohere.command-r-plus v1.2` | -| `command-r` | `cohere.command-r-v1:0` | Unique per deployment | Unique per deployment | `cohere.command-r-16k v1.2` | -| `command` | `cohere.command-text-v14` | N/A | N/A | `cohere.command v15.6` | -| `command-nightly` | N/A | N/A | N/A | N/A | -| `command-light` | `cohere.command-light-text-v14` | N/A | N/A | `cohere.command-light v15.6` | -| `command-light-nightly` | N/A | N/A | N/A | N/A | ## Embed diff --git a/fern/pages/models/the-command-family-of-models/command-beta.mdx b/fern/pages/models/the-command-family-of-models/command-beta.mdx index 06e2a34de..0d825add9 100644 --- a/fern/pages/models/the-command-family-of-models/command-beta.mdx +++ b/fern/pages/models/the-command-family-of-models/command-beta.mdx @@ -1,7 +1,7 @@ --- title: Cohere's Command and Command Light slug: docs/command-beta -hidden: false +hidden: true description: >- Cohere's Command offers cutting-edge generative capabilities with weekly updates for improved performance and user feedback. diff --git a/fern/pages/models/the-command-family-of-models/command-r-plus.mdx b/fern/pages/models/the-command-family-of-models/command-r-plus.mdx index 9079f8813..b2aed2597 100644 --- a/fern/pages/models/the-command-family-of-models/command-r-plus.mdx +++ b/fern/pages/models/the-command-family-of-models/command-r-plus.mdx @@ -2,7 +2,7 @@ title: Cohere's Command R+ Model subtitle: Command R+ model details and specifications slug: docs/command-r-plus -hidden: false +hidden: true description: >- Command R+ is Cohere's optimized for conversational interaction and long-context tasks, best suited for complex RAG workflows and multi-step tool use. image: ../../../assets/images/edb3e49-cohere_meta_image.jpg diff --git a/fern/pages/models/the-command-family-of-models/command-r.mdx b/fern/pages/models/the-command-family-of-models/command-r.mdx index f9f8ed048..a9e79b750 100644 --- a/fern/pages/models/the-command-family-of-models/command-r.mdx +++ b/fern/pages/models/the-command-family-of-models/command-r.mdx @@ -2,7 +2,7 @@ title: Cohere's Command R Model subtitle: Command R model details and specifications slug: docs/command-r -hidden: false +hidden: true description: >- Command R is a conversational model that excels in language tasks and supports multiple languages, making it ideal for coding use cases. 
image: ../../../assets/images/49841d1-cohere_meta_image.jpg diff --git a/fern/pages/models/the-command-family-of-models/command-r7b.mdx b/fern/pages/models/the-command-family-of-models/command-r7b.mdx index 67af56716..e378bb1be 100644 --- a/fern/pages/models/the-command-family-of-models/command-r7b.mdx +++ b/fern/pages/models/the-command-family-of-models/command-r7b.mdx @@ -2,7 +2,7 @@ title: Cohere's Command R7B Model subtitle: Command R7B model details and specifications slug: docs/command-r7b -hidden: false +hidden: true description: >- Command R7B is the smallest, fastest, and final model in our R family of enterprise-focused large language models. It excels at RAG, tool use, and agents. image: ../../../assets/images/edb3e49-cohere_meta_image.jpg diff --git a/fern/pages/text-generation/structured-outputs.mdx b/fern/pages/text-generation/structured-outputs.mdx index b39454672..cb2eda028 100644 --- a/fern/pages/text-generation/structured-outputs.mdx +++ b/fern/pages/text-generation/structured-outputs.mdx @@ -19,10 +19,6 @@ Structured Outputs is a feature that forces the LLM’s response to strictly fol Compatible models: - Command A -- Command R 08 2024 -- Command R -- Command R+ 08 2024 -- Command R+ ## How to Use Structured Outputs diff --git a/fern/pages/text-generation/summarizing-text.mdx b/fern/pages/text-generation/summarizing-text.mdx index 54a8bbc29..90a79a4d4 100644 --- a/fern/pages/text-generation/summarizing-text.mdx +++ b/fern/pages/text-generation/summarizing-text.mdx @@ -10,7 +10,7 @@ slug: /docs/summarizing-text Text summarization distills essential information and generates concise snippets from dense documents. With Cohere, you can do text summarization via the Chat endpoint. -Command R and R+ support a 128k context length and Command A supports a 256k context length, so you can pass long documents to be summarized. +Command A supports a 256k context length, so you can pass long documents to be summarized. ## Basic summarization @@ -224,7 +224,7 @@ co.chat( ## Migration from Summarize to Chat Endpoint -To use the Command R/R+ models for summarization, we recommend using the Chat endpoint. This guide outlines how to migrate from the Summarize endpoint to the Chat endpoint. +To use Cohere models for summarization, we recommend using the Chat endpoint. This guide outlines how to migrate from the Summarize endpoint to the Chat endpoint. 
```python PYTHON # Before diff --git a/fern/pages/v2/text-generation/structured-outputs.mdx b/fern/pages/v2/text-generation/structured-outputs.mdx index 3e2cc9df2..cf7c4908c 100644 --- a/fern/pages/v2/text-generation/structured-outputs.mdx +++ b/fern/pages/v2/text-generation/structured-outputs.mdx @@ -20,10 +20,6 @@ Structured Outputs is a feature that forces the LLM’s response to strictly fol Compatible models: - Command A -- Command R+ 08 2024 -- Command R+ -- Command R 08 2024 -- Command R ## How to Use Structured Outputs diff --git a/fern/v1.yml b/fern/v1.yml index 891baa99d..08501a166 100644 --- a/fern/v1.yml +++ b/fern/v1.yml @@ -64,12 +64,6 @@ navigation: path: pages/models/the-command-family-of-models/command-a-reasoning.mdx - page: Command A Vision path: pages/models/the-command-family-of-models/command-a-vision.mdx - - page: Command R+ - path: pages/models/the-command-family-of-models/command-r-plus.mdx - - page: Command R - path: pages/models/the-command-family-of-models/command-r.mdx - - page: Command and Command Light - path: pages/models/the-command-family-of-models/command-beta.mdx - page: Embed path: pages/models/cohere-embed.mdx - page: Rerank @@ -86,13 +80,13 @@ navigation: - page: Introduction to Text Generation at Cohere path: pages/text-generation/introduction-to-text-generation-at-cohere.mdx - page: Using the Chat API - path: pages/text-generation/chat-api.mdx + path: pages/v2/text-generation/chat-api.mdx - page: Streaming Responses path: pages/text-generation/streaming.mdx - page: Structured Outputs - path: pages/text-generation/structured-outputs.mdx + path: pages/v2/text-generation/structured-outputs.mdx - page: Predictable Outputs - path: pages/text-generation/predictable-outputs.mdx + path: pages/v2/text-generation/predictable-outputs.mdx - page: Advanced Generation Parameters path: pages/text-generation/advanced-generation-hyperparameters.mdx - page: Retrieval Augmented Generation (RAG) diff --git a/fern/v2.yml b/fern/v2.yml index 21d6ecb8e..a5a95ed4d 100644 --- a/fern/v2.yml +++ b/fern/v2.yml @@ -64,12 +64,6 @@ navigation: path: pages/models/the-command-family-of-models/command-a-reasoning.mdx - page: Command A Vision path: pages/models/the-command-family-of-models/command-a-vision.mdx - - page: Command R+ - path: pages/models/the-command-family-of-models/command-r-plus.mdx - - page: Command R - path: pages/models/the-command-family-of-models/command-r.mdx - - page: Command and Command Light - path: pages/models/the-command-family-of-models/command-beta.mdx - page: Embed path: pages/models/cohere-embed.mdx - page: Rerank From 0bb06f8d61c7c9700e0f98e59d3f2fb996a7b2c9 Mon Sep 17 00:00:00 2001 From: Trent Fowler Date: Fri, 12 Sep 2025 14:06:21 -0600 Subject: [PATCH 02/12] Making a bunch of changes. 
--- fern/pages/fine-tuning/chat-fine-tuning.mdx | 6 +- .../chat-improving-the-results.mdx | 6 +- .../chat-preparing-the-data.mdx | 6 +- .../chat-starting-the-training.mdx | 6 +- .../chat-understanding-the-results.mdx | 6 +- .../fine-tuning/classify-fine-tuning.mdx | 6 +- .../classify-improving-the-results.mdx | 6 +- .../classify-preparing-the-data.mdx | 8 +- .../classify-starting-the-training.mdx | 6 +- .../classify-understanding-the-results.mdx | 6 +- fern/pages/fine-tuning/fine-tuning-on-aws.mdx | 4 + ...tuning-cohere-models-on-amazon-bedrock.mdx | 4 + ...ning-cohere-models-on-amazon-sagemaker.mdx | 4 + .../fine-tuning-with-the-cohere-dashboard.mdx | 7 +- .../fine-tuning-with-the-python-sdk.mdx | 6 +- fern/pages/fine-tuning/fine-tuning.mdx | 6 +- fern/pages/fine-tuning/rerank-fine-tuning.mdx | 6 +- .../rerank-improving-the-results.mdx | 6 +- .../rerank-preparing-the-data.mdx | 6 +- .../rerank-starting-the-training.mdx | 6 +- .../rerank-understanding-the-results.mdx | 6 +- .../troubleshooting-a-fine-tuned-model.mdx | 7 +- fern/pages/models/models.mdx | 5 +- .../command-r-plus.mdx | 6 +- .../command-r.mdx | 10 +- .../command-r7b.mdx | 2 +- .../connectors/connector-authentication.mdx | 6 +- .../connectors/connector-faqs.mdx | 6 +- .../creating-and-deploying-a-connector.mdx | 6 +- .../connectors/managing-your-connector.mdx | 6 +- .../text-generation/connectors/overview-1.mdx | 138 +----------------- fern/v1.yml | 19 +-- fern/v2.yml | 5 + 33 files changed, 160 insertions(+), 179 deletions(-) diff --git a/fern/pages/fine-tuning/chat-fine-tuning.mdx b/fern/pages/fine-tuning/chat-fine-tuning.mdx index f40a511b7..9a0338092 100644 --- a/fern/pages/fine-tuning/chat-fine-tuning.mdx +++ b/fern/pages/fine-tuning/chat-fine-tuning.mdx @@ -2,7 +2,7 @@ title: "Fine-tuning for Cohere's Chat Model" slug: "docs/chat-fine-tuning" -hidden: False +hidden: true description: "This document provides guidance on fine-tuning, evaluating, and improving chat models." image: "../../assets/images/6ff1f01-cohere_meta_image.jpg" keywords: "chat models, fine-tuning language models, fine-tuning, fine-tuning chat models" @@ -10,4 +10,8 @@ keywords: "chat models, fine-tuning language models, fine-tuning, fine-tuning ch createdAt: "Fri Nov 10 2023 18:20:28 GMT+0000 (Coordinated Universal Time)" updatedAt: "Fri Mar 15 2024 04:42:37 GMT+0000 (Coordinated Universal Time)" --- + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + This section contains information on [fine-tuning](/docs/chat-starting-the-training), [evaluating](/docs/chat-understanding-the-results), and [improving](/docs/chat-improving-the-results) chat models. diff --git a/fern/pages/fine-tuning/chat-fine-tuning/chat-improving-the-results.mdx b/fern/pages/fine-tuning/chat-fine-tuning/chat-improving-the-results.mdx index 84e6797e8..d06c4a76d 100644 --- a/fern/pages/fine-tuning/chat-fine-tuning/chat-improving-the-results.mdx +++ b/fern/pages/fine-tuning/chat-fine-tuning/chat-improving-the-results.mdx @@ -1,7 +1,7 @@ --- title: Improving the Chat Fine-tuning Results slug: docs/chat-improving-the-results -hidden: false +hidden: true description: >- Learn how to refine data, iterate on hyperparameters, and troubleshoot to fine-tune your Chat model effectively. 
@@ -10,6 +10,10 @@ keywords: 'fine-tuning, fine-tuning language models, chat models' createdAt: 'Mon Nov 13 2023 17:30:43 GMT+0000 (Coordinated Universal Time)' updatedAt: 'Fri Mar 15 2024 04:43:11 GMT+0000 (Coordinated Universal Time)' --- + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + There are several things you need to take into account to achieve the best fine-tuned model for Chat: ## Refining data quality diff --git a/fern/pages/fine-tuning/chat-fine-tuning/chat-preparing-the-data.mdx b/fern/pages/fine-tuning/chat-fine-tuning/chat-preparing-the-data.mdx index 11215b476..f94b110c4 100644 --- a/fern/pages/fine-tuning/chat-fine-tuning/chat-preparing-the-data.mdx +++ b/fern/pages/fine-tuning/chat-fine-tuning/chat-preparing-the-data.mdx @@ -1,7 +1,7 @@ --- title: Preparing the Chat Fine-tuning Data slug: docs/chat-preparing-the-data -hidden: false +hidden: true description: >- Prepare your data for fine-tuning a Command model for Chat with this step-by-step guide, including data formatting, requirements, and best @@ -11,6 +11,10 @@ keywords: 'fine-tuning, fine-tuning language models' createdAt: 'Thu Nov 16 2023 02:53:26 GMT+0000 (Coordinated Universal Time)' updatedAt: 'Tue May 07 2024 19:35:14 GMT+0000 (Coordinated Universal Time)' --- + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + In this section, we will walk through how you can prepare your data for fine-tuning a one of the Command family of models for Chat. ### Data format diff --git a/fern/pages/fine-tuning/chat-fine-tuning/chat-starting-the-training.mdx b/fern/pages/fine-tuning/chat-fine-tuning/chat-starting-the-training.mdx index 5ae5f3e56..567f2471c 100644 --- a/fern/pages/fine-tuning/chat-fine-tuning/chat-starting-the-training.mdx +++ b/fern/pages/fine-tuning/chat-fine-tuning/chat-starting-the-training.mdx @@ -1,7 +1,7 @@ --- title: Starting the Chat Fine-Tuning Run slug: docs/chat-starting-the-training -hidden: false +hidden: true description: >- Learn how to fine-tune a Command model for chat with the Cohere Web UI or Python SDK, including data requirements, pricing, and calling your model. @@ -10,6 +10,10 @@ keywords: 'fine-tuning, fine-tuning language models' createdAt: 'Fri Nov 10 2023 18:22:10 GMT+0000 (Coordinated Universal Time)' updatedAt: 'Wed Jun 12 2024 00:17:37 GMT+0000 (Coordinated Universal Time)' --- + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + In this section, we will walk through how you can start training a fine-tuning model for Chat on both the Web UI and the Python SDK. ## Cohere Dashboard diff --git a/fern/pages/fine-tuning/chat-fine-tuning/chat-understanding-the-results.mdx b/fern/pages/fine-tuning/chat-fine-tuning/chat-understanding-the-results.mdx index f789ae477..6fedaaf91 100644 --- a/fern/pages/fine-tuning/chat-fine-tuning/chat-understanding-the-results.mdx +++ b/fern/pages/fine-tuning/chat-fine-tuning/chat-understanding-the-results.mdx @@ -1,7 +1,7 @@ --- title: Understanding the Chat Fine-tuning Results slug: docs/chat-understanding-the-results -hidden: false +hidden: true description: >- Learn how to evaluate and troubleshoot a fine-tuned chat model with accuracy and loss metrics. 
@@ -10,6 +10,10 @@ keywords: 'chat models, fine-tuning, fine-tuning language models' createdAt: 'Fri Nov 10 2023 18:22:54 GMT+0000 (Coordinated Universal Time)' updatedAt: 'Fri Mar 15 2024 04:43:03 GMT+0000 (Coordinated Universal Time)' --- + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + The outputs of a fine-tuned model for Chat are often best evaluated qualitatively. While the performance metrics are a good place to start, you'll still have to assess whether it _feels_ right to arrive at a comprehensive understanding of the model’s performance. When you create a fine-tuned model for Chat, you will see metrics that look like this: diff --git a/fern/pages/fine-tuning/classify-fine-tuning.mdx b/fern/pages/fine-tuning/classify-fine-tuning.mdx index 925995cfc..791c2f3bb 100644 --- a/fern/pages/fine-tuning/classify-fine-tuning.mdx +++ b/fern/pages/fine-tuning/classify-fine-tuning.mdx @@ -2,7 +2,7 @@ title: "Fine-tuning for Cohere's Classify Model" slug: "docs/classify-fine-tuning" -hidden: false +hidden: true description: "This document provides guidance on fine-tuning, evaluating, and improving classification models." image: "../../assets/images/4aa4671-cohere_meta_image.jpg" keywords: "classification, classification models, fine-tuning large language models" @@ -10,4 +10,8 @@ keywords: "classification, classification models, fine-tuning large language mod createdAt: "Fri Nov 10 2023 18:12:45 GMT+0000 (Coordinated Universal Time)" updatedAt: "Fri Mar 15 2024 04:41:11 GMT+0000 (Coordinated Universal Time)" --- + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + This section contains information on [fine-tuning](/docs/classify-starting-the-training), [evaluating](/docs/classify-understanding-the-results), and [improving](/docs/classify-improving-the-results) classification models. diff --git a/fern/pages/fine-tuning/classify-fine-tuning/classify-improving-the-results.mdx b/fern/pages/fine-tuning/classify-fine-tuning/classify-improving-the-results.mdx index ad966121f..de706f86a 100644 --- a/fern/pages/fine-tuning/classify-fine-tuning/classify-improving-the-results.mdx +++ b/fern/pages/fine-tuning/classify-fine-tuning/classify-improving-the-results.mdx @@ -1,7 +1,7 @@ --- title: Improving the Classify Fine-tuning Results slug: docs/classify-improving-the-results -hidden: false +hidden: true description: >- Troubleshoot your fine-tuned classification model with these tips for refining data quality and improving results. @@ -10,6 +10,10 @@ keywords: 'classification models, fine-tuning, fine-tuning classification models createdAt: 'Fri Nov 10 2023 20:16:25 GMT+0000 (Coordinated Universal Time)' updatedAt: 'Fri Mar 15 2024 04:41:45 GMT+0000 (Coordinated Universal Time)' --- + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + There are several things you need to take into account to achieve the best fine-tuned model for Classification, all of which are based on giving the model higher-quality data. 
## Refining data quality diff --git a/fern/pages/fine-tuning/classify-fine-tuning/classify-preparing-the-data.mdx b/fern/pages/fine-tuning/classify-fine-tuning/classify-preparing-the-data.mdx index 886986adc..23c6eefda 100644 --- a/fern/pages/fine-tuning/classify-fine-tuning/classify-preparing-the-data.mdx +++ b/fern/pages/fine-tuning/classify-fine-tuning/classify-preparing-the-data.mdx @@ -1,7 +1,7 @@ --- title: Preparing the Classify Fine-tuning data slug: docs/classify-preparing-the-data -hidden: false +hidden: true description: >- Learn how to prepare your data for fine-tuning classification models, including single-label and multi-label data formats and dataset cleaning tips. @@ -9,7 +9,13 @@ image: ../../../assets/images/033184f-cohere_meta_image.jpg keywords: 'classification models, fine-tuning, fine-tuning language models' createdAt: 'Wed Nov 15 2023 22:21:51 GMT+0000 (Coordinated Universal Time)' updatedAt: 'Wed Apr 03 2024 15:23:42 GMT+0000 (Coordinated Universal Time)' + --- + + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + In this section, we will walk through how you can prepare your data for fine-tuning models for Classification. For classification fine-tuning jobs we can choose between two types of datasets: diff --git a/fern/pages/fine-tuning/classify-fine-tuning/classify-starting-the-training.mdx b/fern/pages/fine-tuning/classify-fine-tuning/classify-starting-the-training.mdx index 9414d8603..73a599db7 100644 --- a/fern/pages/fine-tuning/classify-fine-tuning/classify-starting-the-training.mdx +++ b/fern/pages/fine-tuning/classify-fine-tuning/classify-starting-the-training.mdx @@ -1,7 +1,7 @@ --- title: Training and deploying a fine-tuned Cohere model. slug: docs/classify-starting-the-training -hidden: false +hidden: true description: >- Fine-tune classification models with Cohere's Web UI or Python SDK using custom datasets. (V1) image: ../../../assets/images/3fe7824-cohere_meta_image.jpg @@ -9,6 +9,10 @@ keywords: 'classification models, fine-tuning language models, fine-tuning' createdAt: 'Fri Nov 10 2023 18:14:01 GMT+0000 (Coordinated Universal Time)' updatedAt: 'Thu Jun 13 2024 13:10:55 GMT+0000 (Coordinated Universal Time)' --- + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + In this section, we will walk through how you can start training a fine-tuning model for Classification with both the [Web UI](/docs/fine-tuning-with-the-web-ui) and the Python SDK. ## Web UI diff --git a/fern/pages/fine-tuning/classify-fine-tuning/classify-understanding-the-results.mdx b/fern/pages/fine-tuning/classify-fine-tuning/classify-understanding-the-results.mdx index f34ed294f..ebfc25036 100644 --- a/fern/pages/fine-tuning/classify-fine-tuning/classify-understanding-the-results.mdx +++ b/fern/pages/fine-tuning/classify-fine-tuning/classify-understanding-the-results.mdx @@ -1,7 +1,7 @@ --- title: Understanding the Classify Fine-tuning Results slug: docs/classify-understanding-the-results -hidden: false +hidden: true description: >- Understand the performance metrics for a fine-tuned classification model and learn how to interpret its accuracy, precision, recall, and F1 scores. 
@@ -10,6 +10,10 @@ keywords: 'fine-tuning, classification, fine-tuning language models' createdAt: 'Fri Nov 10 2023 18:16:09 GMT+0000 (Coordinated Universal Time)' updatedAt: 'Fri Mar 15 2024 04:41:36 GMT+0000 (Coordinated Universal Time)' --- + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + In this section, we will explain the metrics for a fine-tuned model for Classification. Fine-tuned models for Classification are trained using data of examples mapping to predicted labels, and for that reason they are evaluated using the same methods and performance metrics. You can also provide a test set of data that we will use to calculate performance metrics. If a test set is not provided, we will split your training data randomly to calculate these performance metrics. diff --git a/fern/pages/fine-tuning/fine-tuning-on-aws.mdx b/fern/pages/fine-tuning/fine-tuning-on-aws.mdx index ffa60b176..79cba1711 100644 --- a/fern/pages/fine-tuning/fine-tuning-on-aws.mdx +++ b/fern/pages/fine-tuning/fine-tuning-on-aws.mdx @@ -10,4 +10,8 @@ keywords: "generative AI on AWS, SageMaker, Bedrock" createdAt: "Fri Nov 10 2023 18:35:37 GMT+0000 (Coordinated Universal Time)" updatedAt: "Wed May 08 2024 20:00:51 GMT+0000 (Coordinated Universal Time)" --- + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + In this section, you'll find information on how to fine-tune Cohere's generative models in the AWS ecosystem. diff --git a/fern/pages/fine-tuning/fine-tuning-on-aws/fine-tuning-cohere-models-on-amazon-bedrock.mdx b/fern/pages/fine-tuning/fine-tuning-on-aws/fine-tuning-cohere-models-on-amazon-bedrock.mdx index c0200e346..e98f02c3a 100644 --- a/fern/pages/fine-tuning/fine-tuning-on-aws/fine-tuning-cohere-models-on-amazon-bedrock.mdx +++ b/fern/pages/fine-tuning/fine-tuning-on-aws/fine-tuning-cohere-models-on-amazon-bedrock.mdx @@ -10,6 +10,10 @@ keywords: "AWS generative AI, language models on SageMaker, language models on B createdAt: "Fri Nov 10 2023 18:36:00 GMT+0000 (Coordinated Universal Time)" updatedAt: "Wed May 08 2024 20:00:51 GMT+0000 (Coordinated Universal Time)" --- + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + In this section, you'll find information on how to fine-tune Cohere's generative models on AWS Bedrock. Bedrock customers can fine-tune Cohere’s [Command Light](https://us-west-2.console.aws.amazon.com/bedrock/home?region=us-west-2#/providers?model=cohere.command-light-text-v14) as well as [Command model](https://us-west-2.console.aws.amazon.com/bedrock/home?region=us-west-2#/providers?model=cohere.command-text-v14) for their use cases. diff --git a/fern/pages/fine-tuning/fine-tuning-on-aws/fine-tuning-cohere-models-on-amazon-sagemaker.mdx b/fern/pages/fine-tuning/fine-tuning-on-aws/fine-tuning-cohere-models-on-amazon-sagemaker.mdx index 883ad53b3..c6ebb51ba 100644 --- a/fern/pages/fine-tuning/fine-tuning-on-aws/fine-tuning-cohere-models-on-amazon-sagemaker.mdx +++ b/fern/pages/fine-tuning/fine-tuning-on-aws/fine-tuning-cohere-models-on-amazon-sagemaker.mdx @@ -6,6 +6,10 @@ hidden: true createdAt: "Fri Nov 10 2023 18:36:16 GMT+0000 (Coordinated Universal Time)" updatedAt: "Tue May 07 2024 00:13:24 GMT+0000 (Coordinated Universal Time)" --- + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + In this section, you'll find information on how to fine-tune Cohere's generative models on AWS Sagemaker. Sagemaker customers can fine-tune Cohere’s [Command R model](<>) for their use cases. 
diff --git a/fern/pages/fine-tuning/fine-tuning-with-the-cohere-dashboard.mdx b/fern/pages/fine-tuning/fine-tuning-with-the-cohere-dashboard.mdx index 1bfbf6779..bac8017d8 100644 --- a/fern/pages/fine-tuning/fine-tuning-with-the-cohere-dashboard.mdx +++ b/fern/pages/fine-tuning/fine-tuning-with-the-cohere-dashboard.mdx @@ -1,7 +1,7 @@ --- title: Fine-tuning with Cohere's Dashboard slug: docs/fine-tuning-with-the-cohere-dashboard -hidden: false +hidden: true description: >- Use the Cohere Web UI to start the fine-tuning jobs and track the progress. image: ../../assets/images/da7d0fa-cohere_meta_image.jpg @@ -9,6 +9,11 @@ keywords: 'fine-tuning, fine-tuning large language models' createdAt: 'Wed Nov 15 2023 21:36:22 GMT+0000 (Coordinated Universal Time)' updatedAt: 'Thu Jun 13 2024 00:39:00 GMT+0000 (Coordinated Universal Time)' --- + + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + ![](../../assets/images/be2ccc8-image.png) Customers can kick off fine-tuning jobs by completing the data preparation and validation steps through the [Cohere dashboard](https://dashboard.cohere.com/fine-tuning). This is useful for customers who don't need or don't want to create a fine-tuning job programmatically via the [Fine-tuning API](/reference/listfinetunedmodels) or via the Cohere [Python SDK](/docs/fine-tuning), instead preferring the ease and simplicity of a web interface. diff --git a/fern/pages/fine-tuning/fine-tuning-with-the-python-sdk.mdx b/fern/pages/fine-tuning/fine-tuning-with-the-python-sdk.mdx index 0927be143..1eadc79fc 100644 --- a/fern/pages/fine-tuning/fine-tuning-with-the-python-sdk.mdx +++ b/fern/pages/fine-tuning/fine-tuning-with-the-python-sdk.mdx @@ -1,7 +1,7 @@ --- title: Programmatic Fine-tuning slug: docs/fine-tuning-with-the-python-sdk -hidden: false +hidden: true description: >- Fine-tune models using the Cohere Python SDK programmatically and monitor the results through the Dashboard Web UI. image: ../../assets/images/782e60c-cohere_meta_image.jpg @@ -9,6 +9,10 @@ keywords: 'python, fine-tuning, fine-tuning large language models' createdAt: 'Fri Nov 10 2023 18:29:56 GMT+0000 (Coordinated Universal Time)' updatedAt: 'Thu May 09 2024 02:54:41 GMT+0000 (Coordinated Universal Time)' --- + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + In addition to using the [Web UI](/docs/fine-tuning-with-the-web-ui) for fine-tuning models, customers can also kick off fine-tuning jobs programmatically using the [Fine-tuning API](/reference/listfinetunedmodels) or via the [Cohere Python SDK](https://pypi.org/project/cohere/). This can be useful for fine-tuning jobs that happen on a regular cadence, such as fine-tuning nightly on newly-acquired data. ## Datasets diff --git a/fern/pages/fine-tuning/fine-tuning.mdx b/fern/pages/fine-tuning/fine-tuning.mdx index 7b193e45b..9ed83bc57 100644 --- a/fern/pages/fine-tuning/fine-tuning.mdx +++ b/fern/pages/fine-tuning/fine-tuning.mdx @@ -1,7 +1,7 @@ --- title: Introduction to Fine-Tuning with Cohere Models slug: docs/fine-tuning -hidden: false +hidden: true description: >- Fine-tune Cohere's large language models for specific tasks, styles, and formats with custom data. 
@@ -10,6 +10,10 @@ keywords: 'fine-tuning language models, fine-tuning' createdAt: 'Fri Nov 10 2023 17:49:53 GMT+0000 (Coordinated Universal Time)' updatedAt: 'Mon Jun 17 2024 01:54:04 GMT+0000 (Coordinated Universal Time)' --- + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + Our ready-to-use large language models, such as `Command R`, as well as `Command R+`, are very good at producing responses to natural language prompts. However, there are many cases in which getting the best model performance requires performing an **additional** round of training on custom user data. Creating a custom model using this process is called **fine-tuning**. ### Why Fine-tune? diff --git a/fern/pages/fine-tuning/rerank-fine-tuning.mdx b/fern/pages/fine-tuning/rerank-fine-tuning.mdx index 82247fad9..754309066 100644 --- a/fern/pages/fine-tuning/rerank-fine-tuning.mdx +++ b/fern/pages/fine-tuning/rerank-fine-tuning.mdx @@ -2,7 +2,7 @@ title: Fine-tuning for Cohere's Rerank Model slug: "docs/rerank-fine-tuning" -hidden: false +hidden: true description: "This document provides guidance on fine-tuning, evaluating, and improving rerank models." image: "../../assets/images/3130066-cohere_meta_image.jpg" keywords: "rerank models, generative AI, fine-tuning, fine-tuning language models" @@ -10,4 +10,8 @@ keywords: "rerank models, generative AI, fine-tuning, fine-tuning language model createdAt: "Fri Nov 10 2023 18:20:10 GMT+0000 (Coordinated Universal Time)" updatedAt: "Fri Mar 15 2024 04:41:53 GMT+0000 (Coordinated Universal Time)" --- + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + This section contains information on [fine-tuning](/docs/rerank-starting-the-training), [evaluating](/docs/rerank-understanding-the-results), and [improving](/docs/rerank-improving-the-results) rerank models. diff --git a/fern/pages/fine-tuning/rerank-fine-tuning/rerank-improving-the-results.mdx b/fern/pages/fine-tuning/rerank-fine-tuning/rerank-improving-the-results.mdx index a5546a4d3..c0e700d64 100644 --- a/fern/pages/fine-tuning/rerank-fine-tuning/rerank-improving-the-results.mdx +++ b/fern/pages/fine-tuning/rerank-fine-tuning/rerank-improving-the-results.mdx @@ -1,7 +1,7 @@ --- title: Improving the Rerank Fine-tuning Results slug: docs/rerank-improving-the-results -hidden: false +hidden: true description: >- Tips for achieving the best fine-tuned rerank model and troubleshooting guide for fine-tuned models. image: ../../../assets/images/55d219e-cohere_meta_image.jpg @@ -9,6 +9,10 @@ keywords: 'fine-tuning, fine-tuning language models, rerank' createdAt: 'Thu Nov 16 2023 02:59:09 GMT+0000 (Coordinated Universal Time)' updatedAt: 'Fri Mar 15 2024 04:42:26 GMT+0000 (Coordinated Universal Time)' --- + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + There are several things to take into account to achieve the best fine-tuned rerank model, most of which revolve around refining the quality of your data. 
## Refining Data Quality diff --git a/fern/pages/fine-tuning/rerank-fine-tuning/rerank-preparing-the-data.mdx b/fern/pages/fine-tuning/rerank-fine-tuning/rerank-preparing-the-data.mdx index 9cff7d0c7..be2e750a8 100644 --- a/fern/pages/fine-tuning/rerank-fine-tuning/rerank-preparing-the-data.mdx +++ b/fern/pages/fine-tuning/rerank-fine-tuning/rerank-preparing-the-data.mdx @@ -1,7 +1,7 @@ --- title: Preparing the Rerank Fine-tuning Data slug: docs/rerank-preparing-the-data -hidden: false +hidden: true description: >- Learn how to prepare and format your data for fine-tuning Cohere's Rerank model. @@ -10,6 +10,10 @@ keywords: 'fine-tuning, fine-tuning language models' createdAt: 'Thu Nov 16 2023 02:58:29 GMT+0000 (Coordinated Universal Time)' updatedAt: 'Tue May 07 2024 02:26:45 GMT+0000 (Coordinated Universal Time)' --- + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + In this section, we will walk through how you can prepare your data for fine-tuning for Rerank. ### Data format diff --git a/fern/pages/fine-tuning/rerank-fine-tuning/rerank-starting-the-training.mdx b/fern/pages/fine-tuning/rerank-fine-tuning/rerank-starting-the-training.mdx index 4fb006ad5..7ede89d83 100644 --- a/fern/pages/fine-tuning/rerank-fine-tuning/rerank-starting-the-training.mdx +++ b/fern/pages/fine-tuning/rerank-fine-tuning/rerank-starting-the-training.mdx @@ -1,7 +1,7 @@ --- title: Starting the Rerank Fine-Tuning slug: docs/rerank-starting-the-training -hidden: false +hidden: true description: >- How to start training a fine-tuning model for Rerank using both the Web UI and the Python SDK. image: ../../../assets/images/062ae18-cohere_meta_image.jpg @@ -9,6 +9,10 @@ keywords: 'fine-tuning, fine-tuning language models' createdAt: 'Mon Nov 13 2023 19:52:04 GMT+0000 (Coordinated Universal Time)' updatedAt: 'Tue May 07 2024 21:37:02 GMT+0000 (Coordinated Universal Time)' --- + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + In this section, we will walk through how you can start training a fine-tuning model for Rerank on both the Web UI and the Python SDK. ## Web UI diff --git a/fern/pages/fine-tuning/rerank-fine-tuning/rerank-understanding-the-results.mdx b/fern/pages/fine-tuning/rerank-fine-tuning/rerank-understanding-the-results.mdx index 8a7930ab3..33dd7fe0b 100644 --- a/fern/pages/fine-tuning/rerank-fine-tuning/rerank-understanding-the-results.mdx +++ b/fern/pages/fine-tuning/rerank-fine-tuning/rerank-understanding-the-results.mdx @@ -1,7 +1,7 @@ --- title: Understanding the Rerank Fine-tuning Results slug: docs/rerank-understanding-the-results -hidden: false +hidden: true description: >- Understand how fine-tuned models for Rerank are evaluated, and learn about the specific metrics used, including Accuracy, MRR, and nDCG. @@ -10,6 +10,10 @@ keywords: 'fine-tuning, data, fine-tuning language models' createdAt: 'Thu Nov 16 2023 02:58:54 GMT+0000 (Coordinated Universal Time)' updatedAt: 'Fri Mar 15 2024 04:42:19 GMT+0000 (Coordinated Universal Time)' --- + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + In this section, we will explain the metrics for a fine-tuned model for Rerank. Fine-tuned models for Rerank are trained using data consisting of queries mapping to relevant passages and documents and, for that reason, are evaluated using the same methods and performance metrics. You can also provide a test set of data that we will use to calculate performance metrics. 
If a test set is not provided, we will split your training data randomly to calculate performance metrics. diff --git a/fern/pages/fine-tuning/troubleshooting-a-fine-tuned-model.mdx b/fern/pages/fine-tuning/troubleshooting-a-fine-tuned-model.mdx index 29582c388..cc8dd6df0 100644 --- a/fern/pages/fine-tuning/troubleshooting-a-fine-tuned-model.mdx +++ b/fern/pages/fine-tuning/troubleshooting-a-fine-tuned-model.mdx @@ -1,7 +1,7 @@ --- title: FAQs for Troubleshooting A Fine-Tuned Model slug: docs/troubleshooting-a-fine-tuned-model -hidden: false +hidden: true description: >- Train custom AI models with Cohere's platform and leverage human evaluations to compare model performances. image: ../../assets/images/ keywords: 'fine-tuning, fine-tuning language models' createdAt: 'Fri Nov 10 2023 20:33:04 GMT+0000 (Coordinated Universal Time)' updatedAt: 'Tue May 07 2024 21:38:22 GMT+0000 (Coordinated Universal Time)' --- + + +Cohere's fine-tuning feature was deprecated on September 15, 2025 + + ## How Long Does it Take to Train a Model? Trainings are completed sequentially, and when you launch a training session, it is added to the end of a queue. Depending on the length of our training queue, training may take between an hour and a day to complete. diff --git a/fern/pages/models/models.mdx b/fern/pages/models/models.mdx index aa876ffc0..787393963 100644 --- a/fern/pages/models/models.mdx +++ b/fern/pages/models/models.mdx @@ -28,7 +28,7 @@ At the end of each major section below, you'll find technical details about how In this section, we'll provide some high-level context on Cohere's offerings, and what the strengths of each are. -- The Command family of models includes [Command A](https://docs.cohere.com/docs/command-a), [Command R7B](https://docs.cohere.com/docs/command-r7b), [Command A Translate](https://docs.cohere.com/docs/command-a-translate), [Command A Reasoning](https://docs.cohere.com/docs/command-a-reasoning), [Command A Vision](https://docs.cohere.com/docs/command-a-vision), [Command R+](/docs/command-r-plus), [Command R](/docs/command-r), and [Command](https://cohere.com/models/command?_gl=1*15hfaqm*_ga*MTAxNTg1NTM1MS4xNjk1MjMwODQw*_ga_CRGS116RZS*MTcxNzYwMzYxMy4zNTEuMS4xNzE3NjAzNjUxLjIyLjAuMA..). Together, they are the text-generation LLMs powering tool-using agents, [retrieval augmented generation](/docs/retrieval-augmented-generation-rag) (RAG), translation, copywriting, and similar use cases. They work through the [Chat](/reference/chat) endpoint, which can be used with or without RAG. +- The Command family of models includes [Command A](https://docs.cohere.com/docs/command-a), [Command R7B](https://docs.cohere.com/docs/command-r7b), [Command A Translate](https://docs.cohere.com/docs/command-a-translate), [Command A Reasoning](https://docs.cohere.com/docs/command-a-reasoning), [Command A Vision](https://docs.cohere.com/docs/command-a-vision), [Command R+](/docs/command-r-plus), and [Command R](/docs/command-r). Together, they are the text-generation LLMs powering tool-using agents, [retrieval augmented generation](/docs/retrieval-augmented-generation-rag) (RAG), translation, copywriting, and similar use cases. They work through the [Chat](/reference/chat) endpoint, which can be used with or without RAG. - [Rerank](https://cohere.com/blog/rerank/?_gl=1*1t6ls4x*_ga*MTAxNTg1NTM1MS4xNjk1MjMwODQw*_ga_CRGS116RZS*MTcxNzYwMzYxMy4zNTEuMS4xNzE3NjAzNjUxLjIyLjAuMA..) is the fastest way to inject the intelligence of a language model into an existing search system. 

It can be accessed via the [Rerank](/reference/rerank-1) endpoint. - [Embed](https://cohere.com/models/embed?_gl=1*1t6ls4x*_ga*MTAxNTg1NTM1MS4xNjk1MjMwODQw*_ga_CRGS116RZS*MTcxNzYwMzYxMy4zNTEuMS4xNzE3NjAzNjUxLjIyLjAuMA..) improves the accuracy of search, classification, clustering, and RAG results. It also powers the [Embed](/reference/embed) and [Classify](/reference/classify) endpoints. - The [Aya](https://cohere.com/research/aya) family of models are aimed at expanding the number of languages covered by generative AI. Aya Expanse covers 23 languages, and Aya Vision is fully multimodal, allowing you to pass in images and text and get a single coherent response. Both are available on the [Chat](/reference/chat) endpoint. @@ -44,6 +44,8 @@ Command is Cohere's default generation model that takes a user instruction (or c | `command-a-translate-08-2025` | Command A Translate is Cohere’s state of the art machine translation model, excelling at a variety of translation tasks on 23 languages: English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Chinese, Arabic, Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, Persian. | Text | 8K | 8k | [Chat](/reference/chat)| | `command-a-reasoning-08-2025` | Command A Reasoning is Cohere's first reasoning model, able to 'think' before generating an output in a way that allows it to perform well in certain kinds of nuanced problem-solving and agent-based tasks in 23 languages. | Text | 256k | 32k | [Chat](/reference/chat)| | `command-a-vision-07-2025` | Command A Vision is our first model capable of processing images, excelling in enterprise use cases such as analyzing charts, graphs, and diagrams, table understanding, OCR, document Q&A, and object detection. It officially supports English, Portuguese, Italian, French, German, and Spanish. | Text, Images | 128K | 8K | [Chat](/reference/chat)| +| `command-r-plus-08-2024` | Command R+ is an instruction-following conversational model that performs language tasks at a higher quality, more reliably, and with a longer context than previous models. It is best suited for complex RAG workflows and multi-step tool use. | Text | 128k | 4k | [Chat](/reference/chat) | +| `command-r-08-2024` | `command-r-08-2024` is an update of the Command R model, delivered in August 2024. 
Find more information [here](https://docs.cohere.com/changelog/command-gets-refreshed) | Text | 128k | 4k | [Chat](/reference/chat) | ### Using Command Models on Different Platforms In this table, we provide some important context for using Cohere Command models on different platforms. | :------------------------------- | :------------------------------ | :-------------------- | :----------------------- | :------------------------------- | | `command-a-03-2025` | (Coming Soon) | Unique per deployment | Unique per deployment | `cohere.command-a-03-2025` | | `command-r7b-12-2024` | N/A | N/A | N/A | N/A | +| `command-r-plus` | `cohere.command-r-plus-v1:0` | Unique per deployment | Unique per deployment | `cohere.command-r-plus v1.2` | ## Embed diff --git a/fern/pages/models/the-command-family-of-models/command-r-plus.mdx b/fern/pages/models/the-command-family-of-models/command-r-plus.mdx index b2aed2597..04b74b92d 100644 --- a/fern/pages/models/the-command-family-of-models/command-r-plus.mdx +++ b/fern/pages/models/the-command-family-of-models/command-r-plus.mdx @@ -2,7 +2,7 @@ title: Cohere's Command R+ Model subtitle: Command R+ model details and specifications slug: docs/command-r-plus -hidden: true +hidden: false description: >- Command R+ is Cohere's model optimized for conversational interaction and long-context tasks, best suited for complex RAG workflows and multi-step tool use. image: ../../../assets/images/edb3e49-cohere_meta_image.jpg @@ -24,8 +24,8 @@ For information on toxicity, safety, and using this model responsibly check out | Model Name | Description | Modality | Context Length | Maximum Output Tokens | Endpoints | |--------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------|----------------|-----------------------|------------------------| | `command-r-plus-08-2024` | `command-r-plus-08-2024` is an update of the Command R+ model, delivered in August 2024. | Text | 128k | 4k | [Chat](/reference/chat)| -| `command-r-plus-04-2024` | Command R+ is an instruction-following conversational model that performs language tasks at a higher quality, more reliably, and with a longer context than previous models. It is best suited for complex RAG workflows and multi-step tool use. | Text | 128k | 4k | [Chat](/reference/chat)| -| `command-r-plus` | `command-r-plus` is an alias for `command-r-plus-04-2024`, so if you use `command-r-plus` in the API, that's the model you're pointing to. | Text | 128k | 4k | [Chat](/reference/chat)| +| `command-r-plus-04-2024` (deprecated 09/15/2025) | Command R+ was an instruction-following conversational model that performed language tasks at a higher quality, more reliably, and with a longer context than previous models. It was best suited for complex RAG workflows and multi-step tool use. | Text | 128k | 4k | [Chat](/reference/chat)| +| `command-r-plus` (deprecated 09/15/2025) | `command-r-plus` was an alias for `command-r-plus-04-2024`, so if you used `command-r-plus` in the API, that was the model you were pointing to. | Text | 128k | 4k | [Chat](/reference/chat)| ## Command R+ August 2024 Release Cohere's flagship text-generation models, Command R and Command R+, received a substantial update in August 2024. We chose to designate these models with time stamps, so in the API Command R+ 08-2024 is accessible with `command-r-plus-08-2024`. 

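As a quick illustration of the timestamped naming, here is a minimal sketch of calling this snapshot through the Chat API. It assumes the Cohere Python SDK's v2 client; `YOUR_API_KEY` and the prompt are placeholders.

```python PYTHON
import cohere

# Assumes the v2 client from the Cohere Python SDK; "YOUR_API_KEY" is a placeholder.
co = cohere.ClientV2(api_key="YOUR_API_KEY")

# Pin the August 2024 snapshot explicitly by its timestamped name.
response = co.chat(
    model="command-r-plus-08-2024",
    messages=[
        {"role": "user", "content": "In one sentence, what is Command R+ best suited for?"}
    ],
)
print(response.message.content[0].text)
```

Pinning the timestamped name, rather than a floating alias like the now-deprecated `command-r-plus`, keeps behavior stable across future refreshes.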
diff --git a/fern/pages/models/the-command-family-of-models/command-r.mdx b/fern/pages/models/the-command-family-of-models/command-r.mdx index a9e79b750..646398b6c 100644 --- a/fern/pages/models/the-command-family-of-models/command-r.mdx +++ b/fern/pages/models/the-command-family-of-models/command-r.mdx @@ -2,8 +2,8 @@ title: Cohere's Command R Model subtitle: Command R model details and specifications slug: docs/command-r -hidden: true -description: >- +hidden: false +description: >- Command R is a conversational model that excels in language tasks and supports multiple languages, making it ideal for coding use cases. image: ../../../assets/images/49841d1-cohere_meta_image.jpg keywords: >- @@ -25,9 +25,9 @@ For information on toxicity, safety, and using this model responsibly check out ### Model Details | Model Name | Description | Modality | Context Length | Maximum Output Tokens | Endpoints| |--------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------|----------------|-----------------------|----------| -| `command-r-08-2024` | `command-r-08-2024` is an update of the Command R model, delivered in August 2024. | Text | 128k | 4k | [Chat](/reference/chat) | | -| `command-r-03-2024` | Command R is an instruction-following conversational model that performs language tasks at a higher quality, more reliably, and with a longer context than previous models. It can be used for complex workflows like code generation, retrieval augmented generation (RAG), tool use, and agents. | Text | 128k | 4k | [Chat](/reference/chat) | | -| `command-r` | `command-r` is an alias for `command-r-03-2024`, so if you use `command-r` in the API, that's the model you're pointing to. | Text | 128k | 4k | [Chat](/reference/chat) | | +| `command-r-08-2024` | `command-r-08-2024` is an update of the Command R model, delivered in August 2024. | Text | 128k | 4k | [Chat](/reference/chat) | | +| `command-r-03-2024` (deprecated 09/15/2025) | Command R was an instruction-following conversational model that performed language tasks at a higher quality, more reliably, and with a longer context than previous models. It could be used for complex workflows like code generation, retrieval augmented generation (RAG), tool use, and agents. | Text | 128k | 4k | [Chat](/reference/chat) | | +| `command-r` (deprecated 09/15/2025) | `command-r` was an alias for `command-r-03-2024`, so if you used `command-r` in the API, that was the model you were pointing to. | Text | 128k | 4k | [Chat](/reference/chat) | | ## Command R August 2024 Release Cohere's flagship text-generation models, Command R and Command R+, received a substantial update in August 2024. We chose to designate these models with time stamps, so in the API Command R 08-2024 is accessible with `command-r-08-2024`. 

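Because the deprecation changelog above recommends moving to the v2 API, here is a hedged sketch of that migration for a basic chat call, using the retained `command-r-08-2024` model. It assumes the Cohere Python SDK; the API key and prompt are placeholders.

```python PYTHON
import cohere

# v1 (deprecated endpoints): a single `message` string on cohere.Client.
co_v1 = cohere.Client(api_key="YOUR_API_KEY")  # placeholder key
v1_response = co_v1.chat(model="command-r-08-2024", message="Hello!")
print(v1_response.text)

# v2 (recommended): a `messages` list of role/content turns on cohere.ClientV2.
co_v2 = cohere.ClientV2(api_key="YOUR_API_KEY")  # placeholder key
v2_response = co_v2.chat(
    model="command-r-08-2024",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(v2_response.message.content[0].text)
```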
diff --git a/fern/pages/models/the-command-family-of-models/command-r7b.mdx b/fern/pages/models/the-command-family-of-models/command-r7b.mdx index e378bb1be..67af56716 100644 --- a/fern/pages/models/the-command-family-of-models/command-r7b.mdx +++ b/fern/pages/models/the-command-family-of-models/command-r7b.mdx @@ -2,7 +2,7 @@ title: Cohere's Command R7B Model subtitle: Command R7B model details and specifications slug: docs/command-r7b -hidden: true +hidden: false description: >- Command R7B is the smallest, fastest, and final model in our R family of enterprise-focused large language models. It excels at RAG, tool use, and agents. image: ../../../assets/images/edb3e49-cohere_meta_image.jpg diff --git a/fern/pages/text-generation/connectors/connector-authentication.mdx b/fern/pages/text-generation/connectors/connector-authentication.mdx index 0e1ac9bb7..512cb7e5a 100644 --- a/fern/pages/text-generation/connectors/connector-authentication.mdx +++ b/fern/pages/text-generation/connectors/connector-authentication.mdx @@ -1,7 +1,7 @@ --- title: How to Authenticate a Connector slug: docs/connector-authentication -hidden: false +hidden: true description: >- The document outlines three methods for authentication and authorization in Cohere. image: ../../../assets/images/a8cf803-cohere_meta_image.jpg @@ -9,6 +9,10 @@ keywords: 'Cohere connectors, retrieval augmented generation' createdAt: 'Fri Dec 01 2023 17:20:54 GMT+0000 (Coordinated Universal Time)' updatedAt: 'Thu May 30 2024 15:53:23 GMT+0000 (Coordinated Universal Time)' --- + +Cohere's connector parameter was deprecated on September 15, 2025. For information on how to get connector-like functionality, check out our documentation on [multi-step tool use](https://docs.cohere.com/v1/docs/multi-step-tool-use). + + Cohere supports three methods for authentication and authorization to protect your connectors from unauthorized use. They are: 1. Service to Service Authentication diff --git a/fern/pages/text-generation/connectors/connector-faqs.mdx b/fern/pages/text-generation/connectors/connector-faqs.mdx index 24dabfe8a..8744568c1 100644 --- a/fern/pages/text-generation/connectors/connector-faqs.mdx +++ b/fern/pages/text-generation/connectors/connector-faqs.mdx @@ -1,7 +1,7 @@ --- title: Frequently Asked Questions About Connectors slug: docs/connector-faqs -hidden: false +hidden: true description: >- Get solutions to common issues when implementing connectors for Cohere's language models, including performance, relevance, and quality. @@ -10,6 +10,10 @@ keywords: 'Cohere connectors, retrieval augmented generation' createdAt: 'Wed Dec 06 2023 20:08:46 GMT+0000 (Coordinated Universal Time)' updatedAt: 'Fri Mar 15 2024 04:38:53 GMT+0000 (Coordinated Universal Time)' --- + +Cohere's connector parameter was deprecated on September 15, 2025. For information on how to get connector-like functionality, check out our documentation on [multi-step tool use](https://docs.cohere.com/v1/docs/multi-step-tool-use). + + Here, we'll address some common issues that users have encountered while implementing connectors, along with some solutions. This information should help you get started using connectors to customize your application of Cohere's language models. #### How can I stop the connector from returning duplicates? 
diff --git a/fern/pages/text-generation/connectors/creating-and-deploying-a-connector.mdx b/fern/pages/text-generation/connectors/creating-and-deploying-a-connector.mdx index 51578dd59..6c362d11d 100644 --- a/fern/pages/text-generation/connectors/creating-and-deploying-a-connector.mdx +++ b/fern/pages/text-generation/connectors/creating-and-deploying-a-connector.mdx @@ -1,7 +1,7 @@ --- title: Creating and Deploying a Connector slug: docs/creating-and-deploying-a-connector -hidden: false +hidden: true description: >- Learn how to implement a connector, from setup to deployment, to enable grounded generations with Cohere's Chat API. @@ -10,6 +10,10 @@ keywords: 'Cohere connectors, retrieval augmented generation' createdAt: 'Fri Dec 01 2023 17:20:28 GMT+0000 (Coordinated Universal Time)' updatedAt: 'Mon May 06 2024 19:20:26 GMT+0000 (Coordinated Universal Time)' --- + +Cohere's connector parameter was deprecated on September 15, 2025. For information on how to get connector-like functionality, check out our documentation on [multi-step tool use](https://docs.cohere.com/v1/docs/multi-step-tool-use). + + This section will provide a rough guide for implementing a connector. There are two major parts of this process: 1. Set up the connector and underlying data source, which requires: diff --git a/fern/pages/text-generation/connectors/managing-your-connector.mdx b/fern/pages/text-generation/connectors/managing-your-connector.mdx index 946e6091e..a9fca4ef2 100644 --- a/fern/pages/text-generation/connectors/managing-your-connector.mdx +++ b/fern/pages/text-generation/connectors/managing-your-connector.mdx @@ -1,7 +1,7 @@ --- title: How to Manage a Cohere Connector slug: docs/managing-your-connector -hidden: false +hidden: true description: >- Learn how to manage connectors, including listing, authorizing, updating settings, and debugging issues. @@ -10,6 +10,10 @@ keywords: 'Cohere connectors, generative AI, retrieval augmented generation' createdAt: 'Fri Dec 01 2023 17:20:38 GMT+0000 (Coordinated Universal Time)' updatedAt: 'Thu May 30 2024 15:52:26 GMT+0000 (Coordinated Universal Time)' --- + +Cohere's connector parameter was deprecated on September 15, 2025. For information on how to get connector-like functionality, check out our documentation on [multi-step tool use](https://docs.cohere.com/v1/docs/multi-step-tool-use). + + Once your connector is deployed and registered, there are a couple of features that will help you to manage it. ### Listing your Connectors diff --git a/fern/pages/text-generation/connectors/overview-1.mdx b/fern/pages/text-generation/connectors/overview-1.mdx index 5bf8a9368..917fcba69 100644 --- a/fern/pages/text-generation/connectors/overview-1.mdx +++ b/fern/pages/text-generation/connectors/overview-1.mdx @@ -11,139 +11,7 @@ keywords: "Cohere, retrieval augmented generation" createdAt: "Thu May 23 2024 05:06:54 GMT+0000 (Coordinated Universal Time)" updatedAt: "Thu May 30 2024 15:51:51 GMT+0000 (Coordinated Universal Time)" --- -As the name implies, Connectors are ways of connecting to data sources. They enable you to combine Cohere large language models (LLMs), which power the [Chat API endpoint](/reference/chat), with data sources such as internal documents, document databases, the broader internet, or any other source of context which can inform the replies generated by the model. 
-Connectors enhance Cohere [retrieval augmented generation (RAG)](/docs/retrieval-augmented-generation-rag) offering and can respond to user questions and prompts with substantive, grounded generations that contain citations to external public or private knowledge bases. To see an example of grounded generations with citations, try out [the Cohere dashboard](https://dashboard.cohere.com/playground) after enabling web search grounding. - -The following graphic demonstrates the flow of information when using a connector: - - - - -## Using Connectors to Create Grounded Generations - -Connectors are specified when calling the Chat endpoint, which you can read more about [here](/docs/chat-api#connectors-mode). An example request specifying the managed web-search connector would look like this: - - -```python PYTHON -import cohere - -co = cohere.Client(api_key="Your API key") - -response = co.chat( - model="command-a-03-2025", - message="What is the chemical formula for glucose?", - connectors=[{"id": "web-search"}], -) -``` -```curl CURL -curl --location 'https://production.api.cohere.ai/v1/chat' \ ---header 'Content-Type: application/json' \ ---header 'Authorization: Bearer {Your API key}' \ ---data ' -{ - "message": "What is the chemical formula for glucose?", - "connectors": [{"id": "web-search"}] -} -``` -```typescript TYPESCRIPT -import { CohereClient } from "cohere-ai"; -const cohere = new CohereClient({ - token: "YOUR_API_KEY", -}); -(async () => { - const response = await cohere.chat({ - message:"What is the chemical formula for glucose?", - connectors:[{"id": "web-search"}], - }); - console.log("Received response", response); -})(); -``` -```go GO -import ( - cohere "github.com/cohere-ai/cohere-go/v2" - cohereclient "github.com/cohere-ai/cohere-go/v2/client" -) -client := cohereclient.NewClient(cohereclient.WithToken("")) -response, err := client.Chat( - context.TODO(), - &cohere.ChatRequest{ - Message: "What is the chemical formula for glucose?", - Connectors:[]*cohereclient.ChatConnector{{Id: "web-search"}}, -) -``` - - -If you or an administrator at your organization has created a new connector, you can add this connector id to the list. Here’s an example: - -```python PYTHON -connectors = [{"id": "web-search"}, {"id": "customer-connector-id"}] -``` - -The response will then contain the generated text with citation elements that link to the documents returned from the connector. For example, the formula `C6H12O6` below has a citation element that links to three websites. - -```json Example Response JSON -{ - "text": "The chemical formula for glucose is C6H12O6.", - "generation_id": "667f0844-e5c9-4108-8624-45b7687ca6f3", - "citations": [ - { - "start": 36, - "end": 44, - "text": "C6H12O6.", - "document_ids": [ - "web-search_3:0", - "web-search_3:4", - "web-search_4:0", - "web-search_4:1" - ] - } - ], - "documents": [ - { - "id": "web-search_3:0", - "snippet": "Chemical Compound Formulas\n\nGlucose is a simple sugar with six carbon atoms and one aldehyde group. This monosaccharide has a chemical formula C6H12O6.\n\nIt is also known as dextrose. It is referred to as aldohexose as it contains 6 carbon atoms and an aldehyde group. It exists in two forms, open-chain or ring structure. It is synthesized in the liver and kidneys of animals. In plants, it is found in fruits and in different parts of plants. D- glucose is the naturally occurring form of glucose. It can occur either in solid or liquid form. 
It is water-soluble and is also soluble in acetic acid.", - "title": "Glucose C6H12O6 - Chemical Formula, Structure, Composition, Properties, uses and FAQs of Glucose.", - "url": "https://byjus.com/chemistry/glucose/" - }, - { - "id": "web-search_3:4", - "snippet": "\n\nFrequently Asked Questions- FAQs\n\nHow do you represent glucose?\n\nThe chemical formula of Glucose is C6H12O6. Glucose is a monosaccharide containing an aldehyde group (-CHO). It is made of 6 carbon atoms, 12 hydrogen atoms and 6 oxygen atoms. Glucose is an aldohexose.\n\nIs glucose a reducing sugar?\n\nGlucose is a reducing sugar because it belongs to the category of an aldose meaning its open-chain form contains an aldehyde group. Generally, an aldehyde is quite easily oxidized to carboxylic acids.\n\nWhat are the 5 reducing sugars?\n\nThe 5 reducing sugars are ribose, glucose, galactose, glyceraldehyde, xylose.\n\nWhat are the elements of glucose?", - "title": "Glucose C6H12O6 - Chemical Formula, Structure, Composition, Properties, uses and FAQs of Glucose.", - "url": "https://byjus.com/chemistry/glucose/" - }, - { - "id": "web-search_4:0", - "snippet": "Science, Tech, Math › Science\n\nGlucose Molecular Formula and Facts\n\nChemical or Molecular Formula for Glucose\n\nScience Photo Library - MIRIAM MASLO. / Getty Images\n\nProjects & Experiments\n\nChemistry In Everyday Life\n\nAbbreviations & Acronyms\n\nAnne Marie Helmenstine, Ph.D.\n\nAnne Marie Helmenstine, Ph.D.\n\nPh.D., Biomedical Sciences, University of Tennessee at Knoxville\n\nB.A., Physics and Mathematics, Hastings College\n\nDr. Helmenstine holds a Ph.D. in biomedical sciences and is a science writer, educator, and consultant. She has taught science courses at the high school, college, and graduate levels.\n\nLearn about our Editorial Process\n\nUpdated on November 03, 2019\n\nThe molecular formula for glucose is C6H12O6 or H-(C=O)-(CHOH)5-H. Its empirical or simplest formula is CH2O, which indicates there are two hydrogen atoms for each carbon and oxygen atom in the molecule.", - "title": "Know the Chemical or Molecular Formula for Glucose", - "url": "https://www.thoughtco.com/glucose-molecular-formula-608477" - }, - ], - "search_results": [ - { - "search_query": { - "text": "chemical formula for glucose", - "generation_id": "66e388c8-d9a8-4d43-a711-0f17c3f0f82a" - }, - "document_ids": [ - "web-search_3:0", - "web-search_3:4", - "web-search_4:0", - ], - "connector": { - "id": "web-search" - } - } - ], - "search_queries": [ - { - "text": "chemical formula for glucose", - "generation_id": "66e388c8-d9a8-4d43-a711-0f17c3f0f82a" - } - ] -} -``` - -## A Caveat on Deploying Connectors - -Connector registration only works _natively_ on the Cohere platform. You can, however, register a connector for e.g. Azure or another platform using the [Cohere toolkit](/docs/coral-toolkit) (more technical detail is available [here](https://github.com/cohere-ai/cohere-toolkit/?tab=readme-ov-file#how-to-add-a-connector-to-the-toolkit).) You might also find it useful to read about Cohere deployments on [Amazon](/docs/cohere-on-aws), [Azure](/docs/cohere-on-microsoft-azure), and [single-container cloud environments](/docs/single-container-on-private-clouds). + +Cohere's connector parameter was deprecated on September 15, 2025. For information on how to get connector-like functionality, check out our documentation on [multi-step tool use](https://docs.cohere.com/v1/docs/multi-step-tool-use). 
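+
+As a minimal sketch of getting connector-like grounding with tool use (the `web_search` tool here is a placeholder; the search backend behind it is something you would implement yourself):
+
+```python PYTHON
+import cohere
+
+co = cohere.Client(api_key="Your API key")
+
+# A user-defined tool standing in for the old managed web-search connector
+web_search_tool = [
+    {
+        "name": "web_search",
+        "description": "Returns relevant document snippets from the web for a textual query",
+        "parameter_definitions": {
+            "query": {
+                "description": "the search query",
+                "type": "str",
+                "required": True,
+            }
+        },
+    }
+]
+
+response = co.chat(
+    model="command-a-03-2025",
+    message="What is the chemical formula for glucose?",
+    tools=web_search_tool,
+)
+
+# response.tool_calls lists the searches the model wants to run; execute
+# them against your own search backend, then pass the results back via
+# tool_results so the model can ground its answer with citations.
+print(response.tool_calls)
+```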
+ \ No newline at end of file diff --git a/fern/v1.yml b/fern/v1.yml index 08501a166..0994199a1 100644 --- a/fern/v1.yml +++ b/fern/v1.yml @@ -64,6 +64,10 @@ navigation: path: pages/models/the-command-family-of-models/command-a-reasoning.mdx - page: Command A Vision path: pages/models/the-command-family-of-models/command-a-vision.mdx + - page: Command R+ + path: pages/models/the-command-family-of-models/command-r-plus.mdx + - page: Command R + path: pages/models/the-command-family-of-models/command-r.mdx - page: Embed path: pages/models/cohere-embed.mdx - page: Rerank @@ -80,13 +84,13 @@ navigation: - page: Introduction to Text Generation at Cohere path: pages/text-generation/introduction-to-text-generation-at-cohere.mdx - page: Using the Chat API - path: pages/v2/text-generation/chat-api.mdx + path: pages/text-generation/chat-api.mdx - page: Streaming Responses path: pages/text-generation/streaming.mdx - page: Structured Outputs - path: pages/v2/text-generation/structured-outputs.mdx + path: pages/text-generation/structured-outputs.mdx - page: Predictable Outputs - path: pages/v2/text-generation/predictable-outputs.mdx + path: pages/text-generation/predictable-outputs.mdx - page: Advanced Generation Parameters path: pages/text-generation/advanced-generation-hyperparameters.mdx - page: Retrieval Augmented Generation (RAG) @@ -95,14 +99,6 @@ navigation: contents: - page: Overview of RAG Connectors path: pages/text-generation/connectors/overview-1.mdx - - page: Creating and Deploying a Connector - path: pages/text-generation/connectors/creating-and-deploying-a-connector.mdx - - page: Managing your Connector - path: pages/text-generation/connectors/managing-your-connector.mdx - - page: Connector Authentication - path: pages/text-generation/connectors/connector-authentication.mdx - - page: Connector FAQs - path: pages/text-generation/connectors/connector-faqs.mdx - section: Tool Use path: pages/text-generation/tools.mdx contents: @@ -169,6 +165,7 @@ navigation: - page: Rerank Best Practices path: pages/text-embeddings/reranking/reranking-best-practices.mdx - section: Fine-Tuning + hidden: true contents: - page: Introduction path: pages/fine-tuning/fine-tuning.mdx diff --git a/fern/v2.yml b/fern/v2.yml index a5a95ed4d..631be85af 100644 --- a/fern/v2.yml +++ b/fern/v2.yml @@ -64,6 +64,10 @@ navigation: path: pages/models/the-command-family-of-models/command-a-reasoning.mdx - page: Command A Vision path: pages/models/the-command-family-of-models/command-a-vision.mdx + - page: Command R+ + path: pages/models/the-command-family-of-models/command-r-plus.mdx + - page: Command R + path: pages/models/the-command-family-of-models/command-r.mdx - page: Embed path: pages/models/cohere-embed.mdx - page: Rerank @@ -168,6 +172,7 @@ navigation: - page: Rerank Best Practices path: pages/v2/text-embeddings/reranking/reranking-best-practices.mdx - section: Fine-Tuning + hidden: true contents: - page: Introduction path: pages/fine-tuning/fine-tuning.mdx From 56e8e4f775b64cae60bfcec8c35b54c93d7697e2 Mon Sep 17 00:00:00 2001 From: Trent Fowler Date: Fri, 12 Sep 2025 14:08:12 -0600 Subject: [PATCH 03/12] Making a change to a Command page. 
---
 .../models/the-command-family-of-models/command-r-plus.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fern/pages/models/the-command-family-of-models/command-r-plus.mdx b/fern/pages/models/the-command-family-of-models/command-r-plus.mdx
index 04b74b92d..c8869deda 100644
--- a/fern/pages/models/the-command-family-of-models/command-r-plus.mdx
+++ b/fern/pages/models/the-command-family-of-models/command-r-plus.mdx
@@ -31,7 +31,7 @@ For information on toxicity, safety, and using this model responsibly check out
 Cohere's flagship text-generation models, Command R and Command R+, received a substantial update in August 2024. We chose to designate these models with time stamps, so in the API Command R+ 08-2024 is accesible with `command-r-plus-08-2024`.

 With the release, both models include the following feature improvements:
-- For tool use, Command R and Command R+ have demonstrated improved decision-making around whether or not to use a tool.
+- For tool use, Command R and Command R+ demonstrate improved decision-making around whether or not to use a tool.
 - The updated models are better able to follow instructions included in the request's system message.
 - Better structured data analysis for structured data manipulation.
 - Improved robustness to non-semantic prompt changes like white space or new lines.

From e711d4013d9330404f8a7b878fbc1804148ff2b0 Mon Sep 17 00:00:00 2001
From: Trent Fowler
Date: Fri, 12 Sep 2025 14:11:57 -0600
Subject: [PATCH 04/12] Bad image.

---
 fern/pages/models/the-command-family-of-models/command-r.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fern/pages/models/the-command-family-of-models/command-r.mdx b/fern/pages/models/the-command-family-of-models/command-r.mdx
index 646398b6c..445a38c2a 100644
--- a/fern/pages/models/the-command-family-of-models/command-r.mdx
+++ b/fern/pages/models/the-command-family-of-models/command-r.mdx
@@ -5,7 +5,7 @@ slug: docs/command-r
 hidden:
 description:false >-
   Command R is a conversational model that excels in language tasks and supports multiple languages, making it ideal for coding use cases.
-image: ../../../assets/images/49841d1-cohere_meta_image.jpg
+image: ../../../assets/images/edb3e49-cohere_meta_image.jpg
 keywords: >-
   Cohere, large language models, generative AI,
   command model, chat models, conversational AI

From 82d122c352dc324ecae8cde84b1557eabdf35883 Mon Sep 17 00:00:00 2001
From: Trent Fowler
Date: Fri, 12 Sep 2025 14:16:17 -0600
Subject: [PATCH 05/12] Fixing changelog problems.

---
 ...-announcing-major-command-deprecations.mdx | 24 +++++++++----------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/fern/pages/changelog/2025-09-15-announcing-major-command-deprecations.mdx b/fern/pages/changelog/2025-09-15-announcing-major-command-deprecations.mdx
index 3377eb6b5..9b18bb381 100644
--- a/fern/pages/changelog/2025-09-15-announcing-major-command-deprecations.mdx
+++ b/fern/pages/changelog/2025-09-15-announcing-major-command-deprecations.mdx
@@ -1,32 +1,32 @@
 ---
 title: "Announcing Major Command Deprecations"
 slug: "changelog/2025-09-15-major-command-deprecations"
-createdAt: "Thu Aug 28 2025 00:00:00 (MST)"
+createdAt: "Mon Sep 15 2025 00:00:00 (MST)"
 hidden: false
 description: >-
-  This announcement covers the release of Command A Translate, Cohere's most powerful translation model.
+  This announcement covers a series of major deprecations, including classic Command models, several parameters, and entire endpoints.


 Today, we're announcing:

 we are deprecating the following:
 Command
-- command
+- `command`
-- command-r-03-2024
+- `command-r-03-2024`
-- command-r-plus-04-2024
+- `command-r-plus-04-2024`

-Recommendation for users: Users must move over to newer models, i.e. command-r-08-2024, command-r-plus-08-2024, and/or command-a-03-2025command-light
+Recommendation for users: Users must move over to newer models, i.e. command-r-08-2024, command-r-plus-08-2024, and/or command-a-03-2025
 Endpoints:

-- v1/connectors
+- `v1/connectors`
-- v1/chat
+- `v1/chat`
-- v1/generate
+- `v1/generate`
-- v1/summarize
+- `v1/summarize`
-- v1/classify
+- `v1/classify`

 Recommendation for users: All platform users recommended to move to the v2 API with this change
 Other Products & Features: Fine-tuning capabilities across the platform
 This means fine-tuning on platform will become unavailable, and meaning that any command-light, command, command-r, rerank, and classify fine-tuned models will become unavailable
 command Slack app
 Coral Web (Coral.cohere.com)
-V1 API Parameters: connectors and search_queries_only in /v1/chat.
\ No newline at end of file
+V1 API Parameters: `connectors` and `search_queries_only` in `/v1/chat`.
\ No newline at end of file

From 9246c045f2579b2a7ff4122cf7eafebe51e771d2 Mon Sep 17 00:00:00 2001
From: Trent Fowler
Date: Fri, 12 Sep 2025 14:17:29 -0600
Subject: [PATCH 06/12] Fixing changelog problems.

---
 .../2025-09-15-announcing-major-command-deprecations.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fern/pages/changelog/2025-09-15-announcing-major-command-deprecations.mdx b/fern/pages/changelog/2025-09-15-announcing-major-command-deprecations.mdx
index 9b18bb381..f7e87542e 100644
--- a/fern/pages/changelog/2025-09-15-announcing-major-command-deprecations.mdx
+++ b/fern/pages/changelog/2025-09-15-announcing-major-command-deprecations.mdx
@@ -5,7 +5,7 @@ createdAt: "Mon Sep 15 2025 00:00:00 (MST)"
 hidden: false
 description: >-
   This announcement covers a series of major deprecations, including classic Command models, several parameters, and entire endpoints.
-
+---

 Today, we're announcing:

From fd0123bdf5da2d66dfd8baf87e84f30cb22aeeb9 Mon Sep 17 00:00:00 2001
From: Trent Fowler
Date: Fri, 12 Sep 2025 14:22:26 -0600
Subject: [PATCH 07/12] Errors.

---
 fern/pages/models/the-command-family-of-models/command-r.mdx | 5 +++--
 .../models/the-command-family-of-models/command-r7b.mdx      | 2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/fern/pages/models/the-command-family-of-models/command-r.mdx b/fern/pages/models/the-command-family-of-models/command-r.mdx
index 445a38c2a..65d38f56f 100644
--- a/fern/pages/models/the-command-family-of-models/command-r.mdx
+++ b/fern/pages/models/the-command-family-of-models/command-r.mdx
@@ -2,8 +2,8 @@
 title: Cohere's Command R Model
 subtitle: Command R model details and specifications
 slug: docs/command-r
-hidden:
-description:false >-
+hidden: false
+description: >-
   Command R is a conversational model that excels in language tasks and supports multiple languages, making it ideal for coding use cases.
 image: ../../../assets/images/edb3e49-cohere_meta_image.jpg
 keywords: >-
   Cohere, large language models, generative AI,
   command model, chat models, conversational AI
 createdAt: 'Tue Mar 05 2024 18:50:03 GMT+0000 (Coordinated Universal Time)'
 updatedAt: 'Wed Dec 18 2024 14:16:00 GMT+0000 (Coordinated Universal Time)'
 ---
+
 For most use cases we recommend our latest model [Command A](/docs/command-a) instead.
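The malformed front matter fixed above is the kind of error a small lint pass can catch before review. As a rough sketch, assuming PyYAML is available (the script below is illustrative and not part of this patch series):

```python PYTHON
# Rough front-matter lint for MDX pages: checks that the file opens with a
# "---" fence, that the fence is closed, and that the block parses as YAML.
import pathlib
import sys

import yaml


def check_front_matter(path):
    text = pathlib.Path(path).read_text()
    if not text.startswith("---"):
        return f"{path}: missing front matter"
    parts = text.split("---", 2)
    if len(parts) < 3:
        return f"{path}: unterminated front matter"
    try:
        meta = yaml.safe_load(parts[1])
    except yaml.YAMLError as exc:
        return f"{path}: invalid YAML ({exc})"
    if not isinstance(meta, dict):
        return f"{path}: front matter is not a key-value mapping"
    return None


if __name__ == "__main__":
    problems = [msg for msg in map(check_front_matter, sys.argv[1:]) if msg]
    print("\n".join(problems) if problems else "front matter OK")
```

A key collapsed into its neighbor, as in `description:false >-`, would likely surface here as a YAML parse error rather than slipping through to the rendered site.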
diff --git a/fern/pages/models/the-command-family-of-models/command-r7b.mdx b/fern/pages/models/the-command-family-of-models/command-r7b.mdx
index 67af56716..e47392114 100644
--- a/fern/pages/models/the-command-family-of-models/command-r7b.mdx
+++ b/fern/pages/models/the-command-family-of-models/command-r7b.mdx
@@ -4,7 +4,7 @@ subtitle: Command R7B model details and specifications
 slug: docs/command-r7b
 hidden: false
 description: >-
-  Command R7B is the smallest, fastest, and final model in our R family of enterprise-focused large language models. It excels at RAG, tool use, and agents.
+  Command R7B is the smallest, fastest, and final model in our R family of enterprise-focused large language models. It excels at RAG, tool use, agents, and other use cases.
 image: ../../../assets/images/edb3e49-cohere_meta_image.jpg
 keywords: 'generative AI, Cohere, large language models'
 createdAt: 'Wed Dec 18 2024 14:16:00 GMT+0000 (Coordinated Universal Time)'

From 1c20de2cbcd9769b10a04b901c3784fe39e81f0a Mon Sep 17 00:00:00 2001
From: Trent Fowler
Date: Fri, 12 Sep 2025 14:25:41 -0600
Subject: [PATCH 08/12] Errors.

---
 fern/pages/models/the-command-family-of-models/command-r7b.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fern/pages/models/the-command-family-of-models/command-r7b.mdx b/fern/pages/models/the-command-family-of-models/command-r7b.mdx
index e47392114..67af56716 100644
--- a/fern/pages/models/the-command-family-of-models/command-r7b.mdx
+++ b/fern/pages/models/the-command-family-of-models/command-r7b.mdx
@@ -4,7 +4,7 @@ subtitle: Command R7B model details and specifications
 slug: docs/command-r7b
 hidden: false
 description: >-
-  Command R7B is the smallest, fastest, and final model in our R family of enterprise-focused large language models. It excels at RAG, tool use, agents, and other use cases.
+  Command R7B is the smallest, fastest, and final model in our R family of enterprise-focused large language models. It excels at RAG, tool use, and agents.
 image: ../../../assets/images/edb3e49-cohere_meta_image.jpg
 keywords: 'generative AI, Cohere, large language models'
 createdAt: 'Wed Dec 18 2024 14:16:00 GMT+0000 (Coordinated Universal Time)'

From 306edac94a38212915b8f1c2b64cb84e2ced87c7 Mon Sep 17 00:00:00 2001
From: Trent Fowler
Date: Mon, 15 Sep 2025 17:01:43 -0600
Subject: [PATCH 09/12] Release notes, deprecation guide update.

---
 ...-announcing-major-command-deprecations.mdx | 55 +++++++++++--------
 .../going-to-production/deprecations.mdx      | 22 ++++++++
 2 files changed, 54 insertions(+), 23 deletions(-)

diff --git a/fern/pages/changelog/2025-09-15-announcing-major-command-deprecations.mdx b/fern/pages/changelog/2025-09-15-announcing-major-command-deprecations.mdx
index f7e87542e..6c0a21dda 100644
--- a/fern/pages/changelog/2025-09-15-announcing-major-command-deprecations.mdx
+++ b/fern/pages/changelog/2025-09-15-announcing-major-command-deprecations.mdx
@@ -7,26 +7,35 @@ description: >-
   This announcement covers a series of major deprecations, including classic Command models, several parameters, and entire endpoints.
 ---

-Today, we're announcing:
-
-we are deprecating the following:
-Command
-- `command`
-- `command-r-03-2024`
-- `command-r-plus-04-2024`
-
-Recommendation for users: Users must move over to newer models, i.e. command-r-08-2024, command-r-plus-08-2024, and/or command-a-03-2025
-Endpoints:
-
-- `v1/connectors`
-- `v1/chat`
-- `v1/generate`
-- `v1/summarize`
-- `v1/classify`
-
-Recommendation for users: All platform users recommended to move to the v2 API with this change
-Other Products & Features: Fine-tuning capabilities across the platform
-This means fine-tuning on platform will become unavailable, and meaning that any command-light, command, command-r, rerank, and classify fine-tuned models will become unavailable
-command Slack app
-Coral Web (Coral.cohere.com)
-V1 API Parameters: `connectors` and `search_queries_only` in `/v1/chat`.
\ No newline at end of file
+As part of our ongoing commitment to delivering advanced AI solutions, we are streamlining our offerings to focus on the best-performing tools. The following models, features, and API endpoints will be deprecated:
+
+Deprecated Models:
+- `command-light` (legacy) → Use `command-r-08-2024` or `command-a-03-2025` instead.
+- `command` → Use `command-r-08-2024` or `command-r-plus-08-2024`.
+- `summarize` → Refer to the migration guide for alternatives.
+
+Retired Fine-Tuning Capabilities:
+All fine-tuning options via dashboard and API for models including command-light, command, command-r, classify, and rerank are being retired. Previously fine-tuned models will no longer be accessible.
+
+Deprecated Features and API Endpoints:
+- `/v1/connectors` (Managed connectors for RAG)
+- `/v1/chat` parameters: `connectors`, `search_queries_only`
+- `/v1/generate` (Legacy generative endpoint)
+- `/v1/summarize` (Legacy summarization endpoint)
+- `/v1/classify`
+- Slack App integration
+- Coral Web UI (chat.cohere.com)
+
+Why These Changes?
+We are aligning with evolving market needs, enhancing performance, and optimizing resources. Newer models like Command A offer superior capabilities. This transition ensures we remain at the forefront of innovation.
+
+Support and Migration:
+Our support team is ready to assist with your transition. For guidance, contact us at support@cohere.com or explore our documentation. We recommend assessing your current usage and planning your migration to the recommended alternatives.
+
+Thank you for your understanding and continued partnership. We look forward to delivering innovative AI solutions that drive your success.
+
+Best regards,
+
+The Cohere Support Team
+
+support@cohere.com
\ No newline at end of file

diff --git a/fern/pages/going-to-production/deprecations.mdx b/fern/pages/going-to-production/deprecations.mdx
index 39348abe8..472a9f8a6 100644
--- a/fern/pages/going-to-production/deprecations.mdx
+++ b/fern/pages/going-to-production/deprecations.mdx
@@ -28,6 +28,28 @@ To ensure a smooth transition, we recommend thorough testing of your applications
 ## Deprecation History
 All deprecations are listed below with the most recent announcements at the top.

+### 2025-09-15: Older Command Models, Legacy Endpoints, and All Fine-tuning
+Effective September 15, 2025, the following deprecations will roll out.
+
+Deprecated Models:
+- `command-light` (use `command-r-08-2024` or `command-a-03-2025` instead)
+- `command` (use `command-r-08-2024` or `command-r-plus-08-2024`)
+- `summarize` (refer to the migration guide for alternatives)
+
+Retired Fine-Tuning Capabilities:
+All fine-tuning options via dashboard and API for models including command-light, command, command-r, classify, and rerank are being retired. Previously fine-tuned models will no longer be accessible.
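+
+For the endpoint and parameter deprecations listed below, a rough sketch of the move from `/v1/chat` to `/v2/chat` looks like this (this assumes the current Python SDK; field names differ in other SDKs):
+
+```python PYTHON
+import cohere
+
+# Before (v1, deprecated):
+# co = cohere.Client(api_key="Your API key")
+# response = co.chat(model="command", message="Hello!")
+# print(response.text)
+
+# After (v2), using a supported model and role-based messages:
+co = cohere.ClientV2(api_key="Your API key")
+response = co.chat(
+    model="command-a-03-2025",
+    messages=[{"role": "user", "content": "Hello!"}],
+)
+print(response.message.content[0].text)
+```
+
+The v2 API also replaces `connectors`-style grounding with tool use; see the multi-step tool use documentation for details.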
+ +Deprecated Features and API Endpoints: +- `/v1/connectors` (Managed connectors for RAG) +- `/v1/chat` parameters: `connectors`, `search_queries_only` +- `/v1/generate` (Legacy generative endpoint) +- `/v1/summarize` (Legacy summarization endpoint) +- `/v1/classify` +- Slack App integration +- Coral Web UI (chat.cohere.com) + +These changes reflect our commitment to innovation and performance optimization. We encourage users to assess their current implementations and migrate to recommended alternatives. Our support team is available at support@cohere.com to assist with the transition. For detailed guidance, please refer to our migration resources and technical documentation. + ### 2025-03-08: Command-R-03-2024 Fine-tuned Models On March 08, 2025, we will sunset all models fine-tuned with Command-R-03-2024. As part of our ongoing efforts to enhance our services, we are making the following changes to our fine-tuning capabilities: From dd37ae1b24865c62162b3f310c926d9384901688 Mon Sep 17 00:00:00 2001 From: co-varun Date: Fri, 26 Sep 2025 15:06:57 -0400 Subject: [PATCH 10/12] Replace search_queries_only parameter with tool call examples Migrate all documentation from deprecated search_queries_only parameter to modern tool call approach for search query generation. Main Documentation: - fern/pages/text-generation/retrieval-augmented-generation-rag.mdx - fern/pages/text-generation/streaming.mdx - fern/pages/cookbooks/rag-with-chat-embed.mdx - fern/pages/cookbooks/analysis-of-financial-forms.mdx - fern/pages/cookbooks/agentic-rag-mixed-data.mdx Tutorials: - fern/pages/tutorials/build-things-with-cohere/rag-with-cohere.mdx - fern/pages/tutorials/cohere-on-azure/azure-ai-rag.mdx Notebooks: - notebooks/guides/getting-started/tutorial_pt6.ipynb Scripts: - scripts/cookbooks-mdx/rag-with-chat-embed.mdx - scripts/cookbooks-mdx/analysis-of-financial-forms.mdx - scripts/cookbooks-mdx/agentic-rag-mixed-data.mdx Changes: - Replace search_queries_only=True with tools parameter and force_single_step=True - Update response.search_queries to response.tool_calls[0].parameters["queries"] - Maintain original voice, tone and educational flow - Preserve all working code examples and explanations Impact: Users now see modern tool call examples instead of deprecated parameter across all primary RAG documentation. --- .../cookbooks/agentic-rag-mixed-data.mdx | 56 ++++++++++++++----- .../cookbooks/analysis-of-financial-forms.mdx | 27 +++++++-- fern/pages/cookbooks/rag-with-chat-embed.mdx | 32 ++++++++--- .../retrieval-augmented-generation-rag.mdx | 55 ++++++++---------- fern/pages/text-generation/streaming.mdx | 2 +- .../rag-with-cohere.mdx | 33 ++++++++--- .../cohere-on-azure/azure-ai-rag.mdx | 24 ++++++-- .../guides/getting-started/tutorial_pt6.ipynb | 24 ++++++-- .../cookbooks-mdx/agentic-rag-mixed-data.mdx | 54 ++++++++++++++---- .../analysis-of-financial-forms.mdx | 27 +++++++-- scripts/cookbooks-mdx/rag-with-chat-embed.mdx | 26 +++++++-- 11 files changed, 261 insertions(+), 99 deletions(-) diff --git a/fern/pages/cookbooks/agentic-rag-mixed-data.mdx b/fern/pages/cookbooks/agentic-rag-mixed-data.mdx index 51fa6c7bd..81cdb7821 100644 --- a/fern/pages/cookbooks/agentic-rag-mixed-data.mdx +++ b/fern/pages/cookbooks/agentic-rag-mixed-data.mdx @@ -241,13 +241,28 @@ With our database in place, we can run queries against it. 
The query process can ```python PYTHON def process_query(query, retriever): """Runs query augmentation, retrieval, rerank and final generation in one call.""" - augmented_queries=co.chat(message=query,model='command-a-03-2025',temperature=0.2, search_queries_only=True) + query_gen_tool = [ + { + "name": "internet_search", + "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet", + "parameter_definitions": { + "queries": { + "description": "a list of queries to search the internet with.", + "type": "List[str]", + "required": True, + } + }, + } + ] + + augmented_queries=co.chat(message=query,model='command-a-03-2025',temperature=0.2, tools=query_gen_tool, force_single_step=True) #augment queries - if augmented_queries.search_queries: + if augmented_queries.tool_calls: reranked_docs=[] - for itm in augmented_queries.search_queries: - docs=retriever.invoke(itm.text) - temp_rerank = rerank_cohere(itm.text,docs) + search_queries = augmented_queries.tool_calls[0].parameters["queries"] + for itm in search_queries: + docs=retriever.invoke(itm) + temp_rerank = rerank_cohere(itm,docs) reranked_docs.extend(temp_rerank) documents = [{"title": f"chunk {i}", "snippet": reranked_docs[i]} for i in range(len(reranked_docs))] else: @@ -307,14 +322,14 @@ The final answer is from the documents below: In the example below, we ask a follow up question that relies on the chat history, but does not require a rerun of the RAG pipeline. -We detect questions that do not require RAG by examining the `search_queries` object returned by calling `co.chat` to generate candidate queries to answer our question. If this object is empty, then the model has determined that a document query is not needed to answer the question. +We detect questions that do not require RAG by examining the `tool_calls` object returned by calling `co.chat` with a search query generation tool to generate candidate queries to answer our question. If this object is empty, then the model has determined that a document query is not needed to answer the question. In the example below, the `else` statement is invoked based on `query2`. We still pass in the chat history, allowing the question to be answered with only the prior context. 
```python PYTHON query2='divide this by two' -augmented_queries=co.chat(message=query2,model='command-a-03-2025',temperature=0.2, search_queries_only=True) -if augmented_queries.search_queries: +augmented_queries=co.chat(message=query2,model='command-a-03-2025',temperature=0.2, tools=query_gen_tool, force_single_step=True) +if augmented_queries.tool_calls: print('RAG is needed') final_answer, final_answer_docs = process_query(query, retriever) print(final_answer) @@ -524,13 +539,28 @@ Unless the user asks for a different style of answer, you should answer in full def process_query(self,query): """Runs query augmentation, retrieval, rerank and generation in one call.""" - augmented_queries=co.chat(message=query,model=self.generation_model,temperature=self.temperature, search_queries_only=True) + query_gen_tool = [ + { + "name": "internet_search", + "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet", + "parameter_definitions": { + "queries": { + "description": "a list of queries to search the internet with.", + "type": "List[str]", + "required": True, + } + }, + } + ] + + augmented_queries=co.chat(message=query,model=self.generation_model,temperature=self.temperature, tools=query_gen_tool, force_single_step=True) #augment queries - if augmented_queries.search_queries: + if augmented_queries.tool_calls: reranked_docs=[] - for itm in augmented_queries.search_queries: - docs=self.retriever.invoke(itm.text) - temp_rerank = self.rerank_cohere(itm.text,docs,model=self.rerank_model,top_n=self.top_k_rerank) + search_queries = augmented_queries.tool_calls[0].parameters["queries"] + for itm in search_queries: + docs=self.retriever.invoke(itm) + temp_rerank = self.rerank_cohere(itm,docs,model=self.rerank_model,top_n=self.top_k_rerank) reranked_docs.extend(temp_rerank) documents = [{"title": f"chunk {i}", "snippet": reranked_docs[i]} for i in range(len(reranked_docs))] else: diff --git a/fern/pages/cookbooks/analysis-of-financial-forms.mdx b/fern/pages/cookbooks/analysis-of-financial-forms.mdx index 63307fd2b..a79ebffe2 100644 --- a/fern/pages/cookbooks/analysis-of-financial-forms.mdx +++ b/fern/pages/cookbooks/analysis-of-financial-forms.mdx @@ -194,10 +194,25 @@ To learn more about document mode and query generation, check out [our documenta ```python PYTHON PROMPT = "List the overall revenue numbers for 2021, 2022, and 2023 in the 10-K as bullet points, then explain the revenue growth trends." 
+# Define search query generation tool +query_gen_tool = [ + { + "name": "internet_search", + "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet", + "parameter_definitions": { + "queries": { + "description": "a list of queries to search the internet with.", + "type": "List[str]", + "required": True, + } + }, + } +] + # Get queries to run against our index from the model -r = co.chat(PROMPT, model="command-r", search_queries_only=True) -if r.search_queries: - queries = [q["text"] for q in r.search_queries] +r = co.chat(PROMPT, model="command-r", tools=query_gen_tool, force_single_step=True) +if r.tool_calls: + queries = r.tool_calls[0].parameters["queries"] else: print("No queries returned by the model") ``` @@ -334,9 +349,9 @@ pages = [pytesseract.image_to_string(page) for page in pages] def get_response(prompt, rag): if rag: # Get queries to run against our index from the model - r = co.chat(prompt, model="command-r", search_queries_only=True) - if r.search_queries: - queries = [q["text"] for q in r.search_queries] + r = co.chat(prompt, model="command-r", tools=query_gen_tool, force_single_step=True) + if r.tool_calls: + queries = r.tool_calls[0].parameters["queries"] else: print("No queries returned by the model") diff --git a/fern/pages/cookbooks/rag-with-chat-embed.mdx b/fern/pages/cookbooks/rag-with-chat-embed.mdx index 088bb7ca1..95771af31 100644 --- a/fern/pages/cookbooks/rag-with-chat-embed.mdx +++ b/fern/pages/cookbooks/rag-with-chat-embed.mdx @@ -303,12 +303,12 @@ Next, we implement a class to handle the interaction between the user and the ch The `run()` method will be used to run the chatbot application. It begins with the logic for getting the user message, along with a way for the user to end the conversation. -Based on the user message, the chatbot needs to decide if it needs to consult external information before responding. If so, the chatbot determines an optimal set of search queries to use for retrieval. When we call `co.chat()` with `search_queries_only=True`, the Chat endpoint handles this for us automatically. +Based on the user message, the chatbot needs to decide if it needs to consult external information before responding. If so, the chatbot determines an optimal set of search queries to use for retrieval. When we call `co.chat()` with a search query generation tool, the Chat endpoint handles this for us automatically. -The generated queries can be accessed from the `search_queries` field of the object that is returned. Then, what happens next depends on how many queries are returned. +The generated queries can be accessed from the `tool_calls` field of the object that is returned. Then, what happens next depends on whether tool calls are returned. -- If queries are returned, we call the `retrieve()` method of the Vectorstore object for the retrieval step. The retrieved document chunks are then passed to the Chat endpoint by adding a `documents` parameter when we call `co.chat()` again. -- Otherwise, if no queries are returned, we call the Chat endpoint another time, passing the user message and without needing to add any documents to the call. +- If tool calls are returned, we call the `retrieve()` method of the Vectorstore object for the retrieval step. The retrieved document chunks are then passed to the Chat endpoint by adding a `documents` parameter when we call `co.chat()` again. 
+- Otherwise, if no tool calls are returned, we call the Chat endpoint another time, passing the user message and without needing to add any documents to the call. In either case, we also pass the `conversation_id` parameter, which retains the interactions between the user and the chatbot in the same conversation thread. We also enable the `stream` parameter so we can stream the chatbot response. @@ -344,18 +344,34 @@ class Chatbot: # print(f"User: {message}") # Uncomment for Google Colab to avoid printing the same thing twice # Generate search queries (if any) + query_gen_tool = [ + { + "name": "internet_search", + "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet", + "parameter_definitions": { + "queries": { + "description": "a list of queries to search the internet with.", + "type": "List[str]", + "required": True, + } + }, + } + ] + response = co.chat(message=message, model="command-r", - search_queries_only=True) + tools=query_gen_tool, + force_single_step=True) # If there are search queries, retrieve document chunks and respond - if response.search_queries: + if response.tool_calls: print("Retrieving information...", end="") # Retrieve document chunks for each query documents = [] - for query in response.search_queries: - documents.extend(self.vectorstore.retrieve(query.text)) + search_queries = response.tool_calls[0].parameters["queries"] + for query in search_queries: + documents.extend(self.vectorstore.retrieve(query)) # Use document chunks to respond response = co.chat_stream( diff --git a/fern/pages/text-generation/retrieval-augmented-generation-rag.mdx b/fern/pages/text-generation/retrieval-augmented-generation-rag.mdx index 31c18888d..f43df9b6d 100644 --- a/fern/pages/text-generation/retrieval-augmented-generation-rag.mdx +++ b/fern/pages/text-generation/retrieval-augmented-generation-rag.mdx @@ -84,9 +84,11 @@ In this section, we will use the three step RAG workflow to finally settle the s #### Step 1: Generating search queries -##### Option 1: Using the `search_queries_only` parameter +##### Using a tool -Calling the [Chat API](/reference/chat) with the `search_queries_only` parameter set to `True` will return a list of **search queries**. In the example below, we ask the model to suggest some search queries that would be useful when answering the question. +The [Chat API](/reference/chat) can generate search queries using Tools. In the example below, we ask the model to suggest some search queries that would be useful when answering the question. + +Here, we build a tool that takes a user query and returns a list of relevant document snippets for that query. The tool can generate zero, one or multiple search queries depending on the user query. **Request** @@ -95,34 +97,6 @@ import cohere co = cohere.Client(api_key="") -co.chat( - model="command-a-03-2025", - message="Who is more popular: Nsync or Backstreet Boys?", - search_queries_only=True, -) -``` - -**Response** - -```json JSON -{ - "is_search_required": true, - "search_queries": [ - {"text": "Nsync popularity", "generation_id": "b560dd68-743e-4c32-98a2-a9b7e3e96861"}, - {"text": "Backstreet Boys popularity", "generation_id": "b560dd68-743e-4c32-98a2-a9b7e3e96861"} - ] -} -``` - -Indeed, to generate a factually accurate answer to the question `"Who is more popular: Nsync or Backstreet Boys?"`, looking up `Nsync popularity` and `Backstreet Boys popularity` first would be helpful. 
- -##### Option 2: Using a tool - -If you are looking for greater control over how search queries are generated, you can use Cohere's Tools capabilities to generate search queries - -Here, we build a tool that takes a user query and returns a list of relevant document snippets for that query. The tool can generate zero, one or multiple search queries depending on the user query. - -```python PYTHON query_gen_tool = [ { "name": "internet_search", @@ -154,17 +128,36 @@ if response.tool_calls: print(search_queries) ``` + +**Response** + ``` # Sample response ['popularity of NSync', 'popularity of Backstreet Boys'] ``` -You can then customize the preamble and/or the tool definition to generate queries that are more relevant to your use case. +Indeed, to generate a factually accurate answer to the question `"Who is more popular: Nsync or Backstreet Boys?"`, looking up `popularity of NSync` and `popularity of Backstreet Boys` first would be helpful. + +You can customize the preamble and/or the tool definition to generate queries that are more relevant to your use case. For example, you can customize the preamble to encourage a longer list of search queries to be generated. ```python PYTHON instructions_verbose = "Write many search queries that will find helpful information for answering the user's question accurately. Always write a very long list of at least 7 search queries. If you decide that a search is very unlikely to find information that would be useful in constructing a response to the user, you should instead directly answer." + +response = co.chat( + preamble=instructions_verbose, + model="command-a-03-2025", + message="Who is more popular: Nsync or Backstreet Boys?", + tools=query_gen_tool, + force_single_step=True, +) + +search_queries = [] +if response.tool_calls: + search_queries = response.tool_calls[0].parameters["queries"] + +print(search_queries) ``` ``` # Sample response diff --git a/fern/pages/text-generation/streaming.mdx b/fern/pages/text-generation/streaming.mdx index 7a0be12a1..cc6111f82 100644 --- a/fern/pages/text-generation/streaming.mdx +++ b/fern/pages/text-generation/streaming.mdx @@ -55,7 +55,7 @@ These events are generated when using the API with various [RAG](/docs/retrieval #### search-queries-generation -Emitted when search queries are generated by the model. Only happens when the Chat API is used with the `search_queries_only` or `connectors` parameters . +Emitted when search queries are generated by the model. Only happens when the Chat API is used with `tools` that generate search queries or `connectors` parameters. #### search-results diff --git a/fern/pages/tutorials/build-things-with-cohere/rag-with-cohere.mdx b/fern/pages/tutorials/build-things-with-cohere/rag-with-cohere.mdx index d9fa00fc2..c77bf5c14 100644 --- a/fern/pages/tutorials/build-things-with-cohere/rag-with-cohere.mdx +++ b/fern/pages/tutorials/build-things-with-cohere/rag-with-cohere.mdx @@ -130,22 +130,37 @@ In a basic RAG application, the steps involved are: Let's now look at the first step—search query generation. The chatbot needs to generate an optimal set of search queries to use for retrieval. -The Chat endpoint has a feature that handles this for us automatically. This is done by adding the `search_queries_only=True` parameter to the Chat endpoint call. +The Chat endpoint can handle this for us using Tools. This is done by defining a search query generation tool and calling the Chat endpoint with the `tools` parameter. 
It will generate a list of search queries based on a user message. Depending on the message, it can be one or more queries. In the example below, the resulting queries breaks down the user message into two separate queries. ```python PYTHON +# Define the search query generation tool +query_gen_tool = [ + { + "name": "internet_search", + "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet", + "parameter_definitions": { + "queries": { + "description": "a list of queries to search the internet with.", + "type": "List[str]", + "required": True, + } + }, + } +] + # Add the user query query = "How to stay connected with the company and do you organize team events?" # Generate the search queries -response = co.chat(message=query, search_queries_only=True) +response = co.chat(message=query, tools=query_gen_tool, force_single_step=True) queries = [] -for r in response.search_queries: - queries.append(r.text) +if response.tool_calls: + queries = response.tool_calls[0].parameters["queries"] print(queries) ``` @@ -161,11 +176,11 @@ And in the example below, the model decides that one query is sufficient. query = "How flexible are the working hours" # Generate the search queries -response = co.chat(message=query, search_queries_only=True) +response = co.chat(message=query, tools=query_gen_tool, force_single_step=True) queries = [] -for r in response.search_queries: - queries.append(r.text) +if response.tool_calls: + queries = response.tool_calls[0].parameters["queries"] print(queries) ``` @@ -240,8 +255,8 @@ We choose `search_query` as the `input_type` to ensure the model treats this as query = "How to get to know my teammates" # Generate the search query -response = co.chat(message=query, search_queries_only=True) -query_optimized = response.search_queries[0].text +response = co.chat(message=query, tools=query_gen_tool, force_single_step=True) +query_optimized = response.tool_calls[0].parameters["queries"][0] if response.tool_calls else query # Embed the search query query_emb = co.embed( diff --git a/fern/pages/tutorials/cohere-on-azure/azure-ai-rag.mdx b/fern/pages/tutorials/cohere-on-azure/azure-ai-rag.mdx index df12220b6..5ab3abc26 100644 --- a/fern/pages/tutorials/cohere-on-azure/azure-ai-rag.mdx +++ b/fern/pages/tutorials/cohere-on-azure/azure-ai-rag.mdx @@ -320,7 +320,7 @@ vectorstore.retrieve("Prompting by giving examples") We can now run the chatbot. For this, we create a `run_chatbot` function that accepts the user message and the history of the conversation, if any. Here's what happens inside the function: -- For each user message, we use the Chat endpoint’s search query generation feature to turn the user message into one or more queries that are optimized for retrieval. The endpoint can even return no query, meaning a user message can be responded to directly without retrieval. This is done by calling the Chat endpoint with the `search_queries_only` parameter and setting it as `True`. +- For each user message, we use the Chat endpoint with a search query generation tool to turn the user message into one or more queries that are optimized for retrieval. The tool can even return no query, meaning a user message can be responded to directly without retrieval. This is done by calling the Chat endpoint with the `tools` parameter. - If no search query is generated, we call the Chat endpoint to generate a response directly. 
If there is at least one, we call the `retrieve` method from the `Vectorstore` instance to retrieve the most relevant documents to each query. - Finally, all the results from all queries are appended to a list and passed to the Chat endpoint for response generation. - We print the response, together with the citations and the list of document chunks cited, for easy reference. @@ -333,16 +333,32 @@ def run_chatbot(message, chat_history=None): if chat_history is None: chat_history = [] + # Define search query generation tool + query_gen_tool = [ + { + "name": "internet_search", + "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet", + "parameter_definitions": { + "queries": { + "description": "a list of queries to search the internet with.", + "type": "List[str]", + "required": True, + } + }, + } + ] + # Generate search queries, if any response = co_chat.chat( message=message, - search_queries_only=True, + tools=query_gen_tool, + force_single_step=True, chat_history=chat_history, ) search_queries = [] - for query in response.search_queries: - search_queries.append(query.text) + if response.tool_calls: + search_queries = response.tool_calls[0].parameters["queries"] # If there are search queries, retrieve the documents if search_queries: diff --git a/notebooks/guides/getting-started/tutorial_pt6.ipynb b/notebooks/guides/getting-started/tutorial_pt6.ipynb index 55629045f..e41584c01 100644 --- a/notebooks/guides/getting-started/tutorial_pt6.ipynb +++ b/notebooks/guides/getting-started/tutorial_pt6.ipynb @@ -225,7 +225,7 @@ }, { "cell_type": "code", - "execution_count": 4, + "execution_count": null, "metadata": {}, "outputs": [ { @@ -237,16 +237,32 @@ } ], "source": [ + "# Define the search query generation tool\n", + "query_gen_tool = [\n", + " {\n", + " \"name\": \"internet_search\",\n", + " \"description\": \"Returns a list of relevant document snippets for a textual query retrieved from the internet\",\n", + " \"parameter_definitions\": {\n", + " \"queries\": {\n", + " \"description\": \"a list of queries to search the internet with.\",\n", + " \"type\": \"List[str]\",\n", + " \"required\": True,\n", + " }\n", + " },\n", + " }\n", + "]\n", + "\n", "# Add the user query\n", "query = \"How flexible are the working hours\"\n", "\n", "# Generate the search queries\n", "response = co.chat(message=query,\n", - " search_queries_only=True)\n", + " tools=query_gen_tool,\n", + " force_single_step=True)\n", "\n", "queries = []\n", - "for r in response.search_queries:\n", - " queries.append(r.text)\n", + "if response.tool_calls:\n", + " queries = response.tool_calls[0].parameters[\"queries\"]\n", " \n", "print(queries)" ] diff --git a/scripts/cookbooks-mdx/agentic-rag-mixed-data.mdx b/scripts/cookbooks-mdx/agentic-rag-mixed-data.mdx index cea598e28..eb22a034a 100644 --- a/scripts/cookbooks-mdx/agentic-rag-mixed-data.mdx +++ b/scripts/cookbooks-mdx/agentic-rag-mixed-data.mdx @@ -406,13 +406,28 @@ With our database in place, we can run queries against it. 
The query process can ```python PYTHON def process_query(query, retriever): """Runs query augmentation, retrieval, rerank and final generation in one call.""" - augmented_queries=co.chat(message=query,model='command-r-plus',temperature=0.2, search_queries_only=True) + query_gen_tool = [ + { + "name": "internet_search", + "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet", + "parameter_definitions": { + "queries": { + "description": "a list of queries to search the internet with.", + "type": "List[str]", + "required": True, + } + }, + } + ] + + augmented_queries=co.chat(message=query,model='command-r-plus',temperature=0.2, tools=query_gen_tool, force_single_step=True) #augment queries - if augmented_queries.search_queries: + if augmented_queries.tool_calls: reranked_docs=[] - for itm in augmented_queries.search_queries: - docs=retriever.invoke(itm.text) - temp_rerank = rerank_cohere(itm.text,docs) + search_queries = augmented_queries.tool_calls[0].parameters["queries"] + for itm in search_queries: + docs=retriever.invoke(itm) + temp_rerank = rerank_cohere(itm,docs) reranked_docs.extend(temp_rerank) documents = [{"title": f"chunk {i}", "snippet": reranked_docs[i]} for i in range(len(reranked_docs))] else: @@ -480,8 +495,8 @@ In the example below, the `else` statement is invoked based on `query2`. We stil ```python PYTHON query2='divide this by two' -augmented_queries=co.chat(message=query2,model='command-r-plus',temperature=0.2, search_queries_only=True) -if augmented_queries.search_queries: +augmented_queries=co.chat(message=query2,model='command-r-plus',temperature=0.2, tools=query_gen_tool, force_single_step=True) +if augmented_queries.tool_calls: print('RAG is needed') final_answer, final_answer_docs = process_query(query, retriever) print(final_answer) @@ -693,13 +708,28 @@ Unless the user asks for a different style of answer, you should answer in full def process_query(self,query): """Runs query augmentation, retrieval, rerank and generation in one call.""" - augmented_queries=co.chat(message=query,model=self.generation_model,temperature=self.temperature, search_queries_only=True) + query_gen_tool = [ + { + "name": "internet_search", + "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet", + "parameter_definitions": { + "queries": { + "description": "a list of queries to search the internet with.", + "type": "List[str]", + "required": True, + } + }, + } + ] + + augmented_queries=co.chat(message=query,model=self.generation_model,temperature=self.temperature, tools=query_gen_tool, force_single_step=True) #augment queries - if augmented_queries.search_queries: + if augmented_queries.tool_calls: reranked_docs=[] - for itm in augmented_queries.search_queries: - docs=self.retriever.invoke(itm.text) - temp_rerank = self.rerank_cohere(itm.text,docs,model=self.rerank_model,top_n=self.top_k_rerank) + search_queries = augmented_queries.tool_calls[0].parameters["queries"] + for itm in search_queries: + docs=self.retriever.invoke(itm) + temp_rerank = self.rerank_cohere(itm,docs,model=self.rerank_model,top_n=self.top_k_rerank) reranked_docs.extend(temp_rerank) documents = [{"title": f"chunk {i}", "snippet": reranked_docs[i]} for i in range(len(reranked_docs))] else: diff --git a/scripts/cookbooks-mdx/analysis-of-financial-forms.mdx b/scripts/cookbooks-mdx/analysis-of-financial-forms.mdx index c0678560c..4153dc34c 100644 --- a/scripts/cookbooks-mdx/analysis-of-financial-forms.mdx +++ 
b/scripts/cookbooks-mdx/analysis-of-financial-forms.mdx
@@ -387,10 +387,25 @@ To learn more about document mode and query generation, check out [our documenta
 ```python PYTHON
 PROMPT = "List the overall revenue numbers for 2021, 2022, and 2023 in the 10-K as bullet points, then explain the revenue growth trends."

+# Define search query generation tool
+query_gen_tool = [
+    {
+        "name": "internet_search",
+        "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet",
+        "parameter_definitions": {
+            "queries": {
+                "description": "a list of queries to search the internet with.",
+                "type": "List[str]",
+                "required": True,
+            }
+        },
+    }
+]
+
 # Get queries to run against our index from the model
-r = co.chat(PROMPT, model="command-r", search_queries_only=True)
-if r.search_queries:
-    queries = [q["text"] for q in r.search_queries]
+r = co.chat(PROMPT, model="command-r", tools=query_gen_tool, force_single_step=True)
+if r.tool_calls:
+    queries = r.tool_calls[0].parameters["queries"]
 else:
     print("No queries returned by the model")
 ```
@@ -570,9 +585,9 @@ pages = [pytesseract.image_to_string(page) for page in pages]
 def get_response(prompt, rag):
     if rag:
         # Get queries to run against our index from the model
-        r = co.chat(prompt, model="command-r", search_queries_only=True)
-        if r.search_queries:
-            queries = [q["text"] for q in r.search_queries]
+        r = co.chat(prompt, model="command-r", tools=query_gen_tool, force_single_step=True)
+        if r.tool_calls:
+            queries = r.tool_calls[0].parameters["queries"]
         else:
             print("No queries returned by the model")

diff --git a/scripts/cookbooks-mdx/rag-with-chat-embed.mdx b/scripts/cookbooks-mdx/rag-with-chat-embed.mdx
index 90244ccee..6b610d3fa 100644
--- a/scripts/cookbooks-mdx/rag-with-chat-embed.mdx
+++ b/scripts/cookbooks-mdx/rag-with-chat-embed.mdx
@@ -422,7 +422,7 @@ Next, we implement a class to handle the interaction between the user and the ch
 The `run()` method will be used to run the chatbot application. It begins with the logic for getting the user message, along with a way for the user to end the conversation.

-Based on the user message, the chatbot needs to decide if it needs to consult external information before responding. If so, the chatbot determines an optimal set of search queries to use for retrieval. When we call `co.chat()` with `search_queries_only=True`, the Chat endpoint handles this for us automatically.
+Based on the user message, the chatbot needs to decide if it needs to consult external information before responding. If so, the chatbot determines an optimal set of search queries to use for retrieval. When we call `co.chat()` with a search query generation tool, the Chat endpoint handles this for us automatically.

-The generated queries can be accessed from the `search_queries` field of the object that is returned. Then, what happens next depends on how many queries are returned.
+The generated queries can be accessed from the `tool_calls` field of the object that is returned. Then, what happens next depends on whether tool calls are returned.
@@ -463,18 +463,34 @@ class Chatbot: # print(f"User: {message}") # Uncomment for Google Colab to avoid printing the same thing twice # Generate search queries (if any) + query_gen_tool = [ + { + "name": "internet_search", + "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet", + "parameter_definitions": { + "queries": { + "description": "a list of queries to search the internet with.", + "type": "List[str]", + "required": True, + } + }, + } + ] + response = co.chat(message=message, model="command-r", - search_queries_only=True) + tools=query_gen_tool, + force_single_step=True) # If there are search queries, retrieve document chunks and respond - if response.search_queries: + if response.tool_calls: print("Retrieving information...", end="") # Retrieve document chunks for each query documents = [] - for query in response.search_queries: - documents.extend(self.vectorstore.retrieve(query.text)) + search_queries = response.tool_calls[0].parameters["queries"] + for query in search_queries: + documents.extend(self.vectorstore.retrieve(query)) # Use document chunks to respond response = co.chat_stream( From d57df97e457a6fff1d0d1ac40a2c59a8d994715e Mon Sep 17 00:00:00 2001 From: co-varun Date: Fri, 26 Sep 2025 15:29:22 -0400 Subject: [PATCH 11/12] Replace search_queries_only with tool calls in main documentation COMPLETE: All primary user-facing documentation migrated from search_queries_only to tool calls Main Documentation (5 files): - fern/pages/text-generation/retrieval-augmented-generation-rag.mdx - fern/pages/text-generation/streaming.mdx - fern/pages/cookbooks/rag-with-chat-embed.mdx - fern/pages/cookbooks/analysis-of-financial-forms.mdx - fern/pages/cookbooks/agentic-rag-mixed-data.mdx Tutorials (2 files): - fern/pages/tutorials/build-things-with-cohere/rag-with-cohere.mdx - fern/pages/tutorials/cohere-on-azure/azure-ai-rag.mdx Scripts (3 files): - scripts/cookbooks-mdx/rag-with-chat-embed.mdx - scripts/cookbooks-mdx/analysis-of-financial-forms.mdx - scripts/cookbooks-mdx/agentic-rag-mixed-data.mdx Changes applied: - Replace search_queries_only=True with tools=query_gen_tool, force_single_step=True - Update response.search_queries to response.tool_calls[0].parameters["queries"] - Maintain original voice, tone and educational flow - All working code examples preserved IMPACT: Core RAG documentation migration complete - users see modern tool call examples NOTE: Notebook files still need individual cell updates --- .../cookbooks/agentic-rag-mixed-data.mdx | 56 ++++++++++++++---- .../cookbooks/analysis-of-financial-forms.mdx | 27 +++++++-- fern/pages/cookbooks/rag-with-chat-embed.mdx | 32 +++++++--- .../retrieval-augmented-generation-rag.mdx | 59 ++++++++----------- fern/pages/text-generation/streaming.mdx | 2 +- .../rag-with-cohere.mdx | 33 ++++++++--- .../cohere-on-azure/azure-ai-rag.mdx | 24 ++++++-- .../cookbooks-mdx/agentic-rag-mixed-data.mdx | 54 +++++++++++++---- .../analysis-of-financial-forms.mdx | 27 +++++++-- scripts/cookbooks-mdx/rag-with-chat-embed.mdx | 26 ++++++-- 10 files changed, 241 insertions(+), 99 deletions(-) diff --git a/fern/pages/cookbooks/agentic-rag-mixed-data.mdx b/fern/pages/cookbooks/agentic-rag-mixed-data.mdx index 51fa6c7bd..81cdb7821 100644 --- a/fern/pages/cookbooks/agentic-rag-mixed-data.mdx +++ b/fern/pages/cookbooks/agentic-rag-mixed-data.mdx @@ -241,13 +241,28 @@ With our database in place, we can run queries against it. 
The query process can

 ```python PYTHON
+# Define the search query generation tool at the top level so that the
+# standalone snippets further down can reference it as well
+query_gen_tool = [
+    {
+        "name": "internet_search",
+        "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet",
+        "parameter_definitions": {
+            "queries": {
+                "description": "a list of queries to search the internet with.",
+                "type": "List[str]",
+                "required": True,
+            }
+        },
+    }
+]
+
 def process_query(query, retriever):
     """Runs query augmentation, retrieval, rerank and final generation in one call."""
-    augmented_queries=co.chat(message=query,model='command-a-03-2025',temperature=0.2, search_queries_only=True)
+    augmented_queries=co.chat(message=query,model='command-a-03-2025',temperature=0.2, tools=query_gen_tool, force_single_step=True)
     #augment queries
-    if augmented_queries.search_queries:
+    if augmented_queries.tool_calls:
         reranked_docs=[]
-        for itm in augmented_queries.search_queries:
-            docs=retriever.invoke(itm.text)
-            temp_rerank = rerank_cohere(itm.text,docs)
+        search_queries = augmented_queries.tool_calls[0].parameters["queries"]
+        for itm in search_queries:
+            docs=retriever.invoke(itm)
+            temp_rerank = rerank_cohere(itm,docs)
         reranked_docs.extend(temp_rerank)
         documents = [{"title": f"chunk {i}", "snippet": reranked_docs[i]} for i in range(len(reranked_docs))]
     else:
@@ -307,14 +322,14 @@ The final answer is from the documents below:
 
 In the example below, we ask a follow up question that relies on the chat history, but does not require a rerun of the RAG pipeline.
 
-We detect questions that do not require RAG by examining the `search_queries` object returned by calling `co.chat` to generate candidate queries to answer our question. If this object is empty, then the model has determined that a document query is not needed to answer the question.
+We detect questions that do not require RAG by examining the `tool_calls` field of the response returned when we call `co.chat` with the search query generation tool. If this field is empty, the model has determined that a document query is not needed to answer the question.
 
 In the example below, the `else` statement is invoked based on `query2`. We still pass in the chat history, allowing the question to be answered with only the prior context.
```python PYTHON query2='divide this by two' -augmented_queries=co.chat(message=query2,model='command-a-03-2025',temperature=0.2, search_queries_only=True) -if augmented_queries.search_queries: +augmented_queries=co.chat(message=query2,model='command-a-03-2025',temperature=0.2, tools=query_gen_tool, force_single_step=True) +if augmented_queries.tool_calls: print('RAG is needed') final_answer, final_answer_docs = process_query(query, retriever) print(final_answer) @@ -524,13 +539,28 @@ Unless the user asks for a different style of answer, you should answer in full def process_query(self,query): """Runs query augmentation, retrieval, rerank and generation in one call.""" - augmented_queries=co.chat(message=query,model=self.generation_model,temperature=self.temperature, search_queries_only=True) + query_gen_tool = [ + { + "name": "internet_search", + "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet", + "parameter_definitions": { + "queries": { + "description": "a list of queries to search the internet with.", + "type": "List[str]", + "required": True, + } + }, + } + ] + + augmented_queries=co.chat(message=query,model=self.generation_model,temperature=self.temperature, tools=query_gen_tool, force_single_step=True) #augment queries - if augmented_queries.search_queries: + if augmented_queries.tool_calls: reranked_docs=[] - for itm in augmented_queries.search_queries: - docs=self.retriever.invoke(itm.text) - temp_rerank = self.rerank_cohere(itm.text,docs,model=self.rerank_model,top_n=self.top_k_rerank) + search_queries = augmented_queries.tool_calls[0].parameters["queries"] + for itm in search_queries: + docs=self.retriever.invoke(itm) + temp_rerank = self.rerank_cohere(itm,docs,model=self.rerank_model,top_n=self.top_k_rerank) reranked_docs.extend(temp_rerank) documents = [{"title": f"chunk {i}", "snippet": reranked_docs[i]} for i in range(len(reranked_docs))] else: diff --git a/fern/pages/cookbooks/analysis-of-financial-forms.mdx b/fern/pages/cookbooks/analysis-of-financial-forms.mdx index 63307fd2b..a79ebffe2 100644 --- a/fern/pages/cookbooks/analysis-of-financial-forms.mdx +++ b/fern/pages/cookbooks/analysis-of-financial-forms.mdx @@ -194,10 +194,25 @@ To learn more about document mode and query generation, check out [our documenta ```python PYTHON PROMPT = "List the overall revenue numbers for 2021, 2022, and 2023 in the 10-K as bullet points, then explain the revenue growth trends." 
+# Define search query generation tool +query_gen_tool = [ + { + "name": "internet_search", + "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet", + "parameter_definitions": { + "queries": { + "description": "a list of queries to search the internet with.", + "type": "List[str]", + "required": True, + } + }, + } +] + # Get queries to run against our index from the model -r = co.chat(PROMPT, model="command-r", search_queries_only=True) -if r.search_queries: - queries = [q["text"] for q in r.search_queries] +r = co.chat(PROMPT, model="command-r", tools=query_gen_tool, force_single_step=True) +if r.tool_calls: + queries = r.tool_calls[0].parameters["queries"] else: print("No queries returned by the model") ``` @@ -334,9 +349,9 @@ pages = [pytesseract.image_to_string(page) for page in pages] def get_response(prompt, rag): if rag: # Get queries to run against our index from the model - r = co.chat(prompt, model="command-r", search_queries_only=True) - if r.search_queries: - queries = [q["text"] for q in r.search_queries] + r = co.chat(prompt, model="command-r", tools=query_gen_tool, force_single_step=True) + if r.tool_calls: + queries = r.tool_calls[0].parameters["queries"] else: print("No queries returned by the model") diff --git a/fern/pages/cookbooks/rag-with-chat-embed.mdx b/fern/pages/cookbooks/rag-with-chat-embed.mdx index 088bb7ca1..95771af31 100644 --- a/fern/pages/cookbooks/rag-with-chat-embed.mdx +++ b/fern/pages/cookbooks/rag-with-chat-embed.mdx @@ -303,12 +303,12 @@ Next, we implement a class to handle the interaction between the user and the ch The `run()` method will be used to run the chatbot application. It begins with the logic for getting the user message, along with a way for the user to end the conversation. -Based on the user message, the chatbot needs to decide if it needs to consult external information before responding. If so, the chatbot determines an optimal set of search queries to use for retrieval. When we call `co.chat()` with `search_queries_only=True`, the Chat endpoint handles this for us automatically. +Based on the user message, the chatbot needs to decide if it needs to consult external information before responding. If so, the chatbot determines an optimal set of search queries to use for retrieval. When we call `co.chat()` with a search query generation tool, the Chat endpoint handles this for us automatically. -The generated queries can be accessed from the `search_queries` field of the object that is returned. Then, what happens next depends on how many queries are returned. +The generated queries can be accessed from the `tool_calls` field of the object that is returned. Then, what happens next depends on whether tool calls are returned. -- If queries are returned, we call the `retrieve()` method of the Vectorstore object for the retrieval step. The retrieved document chunks are then passed to the Chat endpoint by adding a `documents` parameter when we call `co.chat()` again. -- Otherwise, if no queries are returned, we call the Chat endpoint another time, passing the user message and without needing to add any documents to the call. +- If tool calls are returned, we call the `retrieve()` method of the Vectorstore object for the retrieval step. The retrieved document chunks are then passed to the Chat endpoint by adding a `documents` parameter when we call `co.chat()` again. 
+- Otherwise, if no tool calls are returned, we call the Chat endpoint another time, passing the user message and without needing to add any documents to the call. In either case, we also pass the `conversation_id` parameter, which retains the interactions between the user and the chatbot in the same conversation thread. We also enable the `stream` parameter so we can stream the chatbot response. @@ -344,18 +344,34 @@ class Chatbot: # print(f"User: {message}") # Uncomment for Google Colab to avoid printing the same thing twice # Generate search queries (if any) + query_gen_tool = [ + { + "name": "internet_search", + "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet", + "parameter_definitions": { + "queries": { + "description": "a list of queries to search the internet with.", + "type": "List[str]", + "required": True, + } + }, + } + ] + response = co.chat(message=message, model="command-r", - search_queries_only=True) + tools=query_gen_tool, + force_single_step=True) # If there are search queries, retrieve document chunks and respond - if response.search_queries: + if response.tool_calls: print("Retrieving information...", end="") # Retrieve document chunks for each query documents = [] - for query in response.search_queries: - documents.extend(self.vectorstore.retrieve(query.text)) + search_queries = response.tool_calls[0].parameters["queries"] + for query in search_queries: + documents.extend(self.vectorstore.retrieve(query)) # Use document chunks to respond response = co.chat_stream( diff --git a/fern/pages/text-generation/retrieval-augmented-generation-rag.mdx b/fern/pages/text-generation/retrieval-augmented-generation-rag.mdx index 31c18888d..b91c225a1 100644 --- a/fern/pages/text-generation/retrieval-augmented-generation-rag.mdx +++ b/fern/pages/text-generation/retrieval-augmented-generation-rag.mdx @@ -84,44 +84,14 @@ In this section, we will use the three step RAG workflow to finally settle the s #### Step 1: Generating search queries -##### Option 1: Using the `search_queries_only` parameter +##### Using a tool -Calling the [Chat API](/reference/chat) with the `search_queries_only` parameter set to `True` will return a list of **search queries**. In the example below, we ask the model to suggest some search queries that would be useful when answering the question. - -**Request** - -```python PYTHON -import cohere - -co = cohere.Client(api_key="") - -co.chat( - model="command-a-03-2025", - message="Who is more popular: Nsync or Backstreet Boys?", - search_queries_only=True, -) -``` - -**Response** - -```json JSON -{ - "is_search_required": true, - "search_queries": [ - {"text": "Nsync popularity", "generation_id": "b560dd68-743e-4c32-98a2-a9b7e3e96861"}, - {"text": "Backstreet Boys popularity", "generation_id": "b560dd68-743e-4c32-98a2-a9b7e3e96861"} - ] -} -``` - -Indeed, to generate a factually accurate answer to the question `"Who is more popular: Nsync or Backstreet Boys?"`, looking up `Nsync popularity` and `Backstreet Boys popularity` first would be helpful. - -##### Option 2: Using a tool - -If you are looking for greater control over how search queries are generated, you can use Cohere's Tools capabilities to generate search queries +The [Chat API](/reference/chat) can generate search queries using Tools. In the example below, we ask the model to suggest some search queries that would be useful when answering the question. 
Here, we build a tool that takes a user query and returns a list of relevant document snippets for that query. The tool can generate zero, one or multiple search queries depending on the user query.
 
+**Request**
+
 ```python PYTHON
 query_gen_tool = [
     {
@@ -154,17 +124,36 @@ if response.tool_calls:
     print(search_queries)
 ```
 
+**Response**
+
 ```
 # Sample response
 ['popularity of NSync', 'popularity of Backstreet Boys']
 ```
 
-You can then customize the preamble and/or the tool definition to generate queries that are more relevant to your use case.
+Indeed, to generate a factually accurate answer to the question `"Who is more popular: Nsync or Backstreet Boys?"`, looking up `popularity of NSync` and `popularity of Backstreet Boys` first would be helpful.
+
+You can customize the preamble and/or the tool definition to generate queries that are more relevant to your use case.
 
 For example, you can customize the preamble to encourage a longer list of search queries to be generated.
 
 ```python PYTHON
 instructions_verbose = "Write many search queries that will find helpful information for answering the user's question accurately. Always write a very long list of at least 7 search queries. If you decide that a search is very unlikely to find information that would be useful in constructing a response to the user, you should instead directly answer."
+
+response = co.chat(
+    preamble=instructions_verbose,
+    model="command-a-03-2025",
+    message="Who is more popular: Nsync or Backstreet Boys?",
+    tools=query_gen_tool,
+    force_single_step=True,
+)
+
+search_queries = []
+if response.tool_calls:
+    search_queries = response.tool_calls[0].parameters["queries"]
+
+print(search_queries)
 ```
 
 ```
 # Sample response
diff --git a/fern/pages/text-generation/streaming.mdx b/fern/pages/text-generation/streaming.mdx
index 7a0be12a1..cc6111f82 100644
--- a/fern/pages/text-generation/streaming.mdx
+++ b/fern/pages/text-generation/streaming.mdx
@@ -55,7 +55,7 @@ These events are generated when using the API with various [RAG](/docs/retrieval
 
 #### search-queries-generation
 
-Emitted when search queries are generated by the model. Only happens when the Chat API is used with the `search_queries_only` or `connectors` parameters .
+Emitted when search queries are generated by the model. This only happens when the Chat API is used with `tools` that generate search queries, or with the `connectors` parameter.
 
 #### search-results
 
diff --git a/fern/pages/tutorials/build-things-with-cohere/rag-with-cohere.mdx b/fern/pages/tutorials/build-things-with-cohere/rag-with-cohere.mdx
index d9fa00fc2..c77bf5c14 100644
--- a/fern/pages/tutorials/build-things-with-cohere/rag-with-cohere.mdx
+++ b/fern/pages/tutorials/build-things-with-cohere/rag-with-cohere.mdx
@@ -130,22 +130,37 @@ In a basic RAG application, the steps involved are:
 
 Let's now look at the first step—search query generation. The chatbot needs to generate an optimal set of search queries to use for retrieval.
 
-The Chat endpoint has a feature that handles this for us automatically. This is done by adding the `search_queries_only=True` parameter to the Chat endpoint call.
+The Chat endpoint can handle this for us using Tools. This is done by defining a search query generation tool and calling the Chat endpoint with the `tools` parameter.
 
 It will generate a list of search queries based on a user message. Depending on the message, it can be one or more queries.
 
-In the example below, the resulting queries breaks down the user message into two separate queries.
+In the example below, the resulting queries break down the user message into two separate queries.
```python PYTHON +# Define the search query generation tool +query_gen_tool = [ + { + "name": "internet_search", + "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet", + "parameter_definitions": { + "queries": { + "description": "a list of queries to search the internet with.", + "type": "List[str]", + "required": True, + } + }, + } +] + # Add the user query query = "How to stay connected with the company and do you organize team events?" # Generate the search queries -response = co.chat(message=query, search_queries_only=True) +response = co.chat(message=query, tools=query_gen_tool, force_single_step=True) queries = [] -for r in response.search_queries: - queries.append(r.text) +if response.tool_calls: + queries = response.tool_calls[0].parameters["queries"] print(queries) ``` @@ -161,11 +176,11 @@ And in the example below, the model decides that one query is sufficient. query = "How flexible are the working hours" # Generate the search queries -response = co.chat(message=query, search_queries_only=True) +response = co.chat(message=query, tools=query_gen_tool, force_single_step=True) queries = [] -for r in response.search_queries: - queries.append(r.text) +if response.tool_calls: + queries = response.tool_calls[0].parameters["queries"] print(queries) ``` @@ -240,8 +255,8 @@ We choose `search_query` as the `input_type` to ensure the model treats this as query = "How to get to know my teammates" # Generate the search query -response = co.chat(message=query, search_queries_only=True) -query_optimized = response.search_queries[0].text +response = co.chat(message=query, tools=query_gen_tool, force_single_step=True) +query_optimized = response.tool_calls[0].parameters["queries"][0] if response.tool_calls else query # Embed the search query query_emb = co.embed( diff --git a/fern/pages/tutorials/cohere-on-azure/azure-ai-rag.mdx b/fern/pages/tutorials/cohere-on-azure/azure-ai-rag.mdx index df12220b6..5ab3abc26 100644 --- a/fern/pages/tutorials/cohere-on-azure/azure-ai-rag.mdx +++ b/fern/pages/tutorials/cohere-on-azure/azure-ai-rag.mdx @@ -320,7 +320,7 @@ vectorstore.retrieve("Prompting by giving examples") We can now run the chatbot. For this, we create a `run_chatbot` function that accepts the user message and the history of the conversation, if any. Here's what happens inside the function: -- For each user message, we use the Chat endpoint’s search query generation feature to turn the user message into one or more queries that are optimized for retrieval. The endpoint can even return no query, meaning a user message can be responded to directly without retrieval. This is done by calling the Chat endpoint with the `search_queries_only` parameter and setting it as `True`. +- For each user message, we use the Chat endpoint with a search query generation tool to turn the user message into one or more queries that are optimized for retrieval. The tool can even return no query, meaning a user message can be responded to directly without retrieval. This is done by calling the Chat endpoint with the `tools` parameter. - If no search query is generated, we call the Chat endpoint to generate a response directly. If there is at least one, we call the `retrieve` method from the `Vectorstore` instance to retrieve the most relevant documents to each query. - Finally, all the results from all queries are appended to a list and passed to the Chat endpoint for response generation. 
- We print the response, together with the citations and the list of document chunks cited, for easy reference. @@ -333,16 +333,32 @@ def run_chatbot(message, chat_history=None): if chat_history is None: chat_history = [] + # Define search query generation tool + query_gen_tool = [ + { + "name": "internet_search", + "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet", + "parameter_definitions": { + "queries": { + "description": "a list of queries to search the internet with.", + "type": "List[str]", + "required": True, + } + }, + } + ] + # Generate search queries, if any response = co_chat.chat( message=message, - search_queries_only=True, + tools=query_gen_tool, + force_single_step=True, chat_history=chat_history, ) search_queries = [] - for query in response.search_queries: - search_queries.append(query.text) + if response.tool_calls: + search_queries = response.tool_calls[0].parameters["queries"] # If there are search queries, retrieve the documents if search_queries: diff --git a/scripts/cookbooks-mdx/agentic-rag-mixed-data.mdx b/scripts/cookbooks-mdx/agentic-rag-mixed-data.mdx index cea598e28..eb22a034a 100644 --- a/scripts/cookbooks-mdx/agentic-rag-mixed-data.mdx +++ b/scripts/cookbooks-mdx/agentic-rag-mixed-data.mdx @@ -406,13 +406,28 @@ With our database in place, we can run queries against it. The query process can ```python PYTHON def process_query(query, retriever): """Runs query augmentation, retrieval, rerank and final generation in one call.""" - augmented_queries=co.chat(message=query,model='command-r-plus',temperature=0.2, search_queries_only=True) + query_gen_tool = [ + { + "name": "internet_search", + "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet", + "parameter_definitions": { + "queries": { + "description": "a list of queries to search the internet with.", + "type": "List[str]", + "required": True, + } + }, + } + ] + + augmented_queries=co.chat(message=query,model='command-r-plus',temperature=0.2, tools=query_gen_tool, force_single_step=True) #augment queries - if augmented_queries.search_queries: + if augmented_queries.tool_calls: reranked_docs=[] - for itm in augmented_queries.search_queries: - docs=retriever.invoke(itm.text) - temp_rerank = rerank_cohere(itm.text,docs) + search_queries = augmented_queries.tool_calls[0].parameters["queries"] + for itm in search_queries: + docs=retriever.invoke(itm) + temp_rerank = rerank_cohere(itm,docs) reranked_docs.extend(temp_rerank) documents = [{"title": f"chunk {i}", "snippet": reranked_docs[i]} for i in range(len(reranked_docs))] else: @@ -480,8 +495,8 @@ In the example below, the `else` statement is invoked based on `query2`. 
We stil ```python PYTHON query2='divide this by two' -augmented_queries=co.chat(message=query2,model='command-r-plus',temperature=0.2, search_queries_only=True) -if augmented_queries.search_queries: +augmented_queries=co.chat(message=query2,model='command-r-plus',temperature=0.2, tools=query_gen_tool, force_single_step=True) +if augmented_queries.tool_calls: print('RAG is needed') final_answer, final_answer_docs = process_query(query, retriever) print(final_answer) @@ -693,13 +708,28 @@ Unless the user asks for a different style of answer, you should answer in full def process_query(self,query): """Runs query augmentation, retrieval, rerank and generation in one call.""" - augmented_queries=co.chat(message=query,model=self.generation_model,temperature=self.temperature, search_queries_only=True) + query_gen_tool = [ + { + "name": "internet_search", + "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet", + "parameter_definitions": { + "queries": { + "description": "a list of queries to search the internet with.", + "type": "List[str]", + "required": True, + } + }, + } + ] + + augmented_queries=co.chat(message=query,model=self.generation_model,temperature=self.temperature, tools=query_gen_tool, force_single_step=True) #augment queries - if augmented_queries.search_queries: + if augmented_queries.tool_calls: reranked_docs=[] - for itm in augmented_queries.search_queries: - docs=self.retriever.invoke(itm.text) - temp_rerank = self.rerank_cohere(itm.text,docs,model=self.rerank_model,top_n=self.top_k_rerank) + search_queries = augmented_queries.tool_calls[0].parameters["queries"] + for itm in search_queries: + docs=self.retriever.invoke(itm) + temp_rerank = self.rerank_cohere(itm,docs,model=self.rerank_model,top_n=self.top_k_rerank) reranked_docs.extend(temp_rerank) documents = [{"title": f"chunk {i}", "snippet": reranked_docs[i]} for i in range(len(reranked_docs))] else: diff --git a/scripts/cookbooks-mdx/analysis-of-financial-forms.mdx b/scripts/cookbooks-mdx/analysis-of-financial-forms.mdx index c0678560c..4153dc34c 100644 --- a/scripts/cookbooks-mdx/analysis-of-financial-forms.mdx +++ b/scripts/cookbooks-mdx/analysis-of-financial-forms.mdx @@ -387,10 +387,25 @@ To learn more about document mode and query generation, check out [our documenta ```python PYTHON PROMPT = "List the overall revenue numbers for 2021, 2022, and 2023 in the 10-K as bullet points, then explain the revenue growth trends." 
+# Define search query generation tool
+query_gen_tool = [
+    {
+        "name": "internet_search",
+        "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet",
+        "parameter_definitions": {
+            "queries": {
+                "description": "a list of queries to search the internet with.",
+                "type": "List[str]",
+                "required": True,
+            }
+        },
+    }
+]
+
 # Get queries to run against our index from the model
-r = co.chat(PROMPT, model="command-r", search_queries_only=True)
-if r.search_queries:
-    queries = [q["text"] for q in r.search_queries]
+r = co.chat(PROMPT, model="command-r", tools=query_gen_tool, force_single_step=True)
+if r.tool_calls:
+    queries = r.tool_calls[0].parameters["queries"]
 else:
     print("No queries returned by the model")
 ```
@@ -570,9 +585,9 @@ pages = [pytesseract.image_to_string(page) for page in pages]
 def get_response(prompt, rag):
     if rag:
         # Get queries to run against our index from the model
-        r = co.chat(prompt, model="command-r", search_queries_only=True)
-        if r.search_queries:
-            queries = [q["text"] for q in r.search_queries]
+        r = co.chat(prompt, model="command-r", tools=query_gen_tool, force_single_step=True)
+        if r.tool_calls:
+            queries = r.tool_calls[0].parameters["queries"]
         else:
             print("No queries returned by the model")
 
diff --git a/scripts/cookbooks-mdx/rag-with-chat-embed.mdx b/scripts/cookbooks-mdx/rag-with-chat-embed.mdx
index 90244ccee..6b610d3fa 100644
--- a/scripts/cookbooks-mdx/rag-with-chat-embed.mdx
+++ b/scripts/cookbooks-mdx/rag-with-chat-embed.mdx
@@ -422,7 +422,7 @@ Next, we implement a class to handle the interaction between the user and the ch
 
 The `run()` method will be used to run the chatbot application. It begins with the logic for getting the user message, along with a way for the user to end the conversation. 
 
-Based on the user message, the chatbot needs to decide if it needs to consult external information before responding. If so, the chatbot determines an optimal set of search queries to use for retrieval. When we call `co.chat()` with `search_queries_only=True`, the Chat endpoint handles this for us automatically.
+Based on the user message, the chatbot needs to decide if it needs to consult external information before responding. If so, the chatbot determines an optimal set of search queries to use for retrieval. When we call `co.chat()` with a search query generation tool, the Chat endpoint handles this for us automatically.
 
-The generated queries can be accessed from the `search_queries` field of the object that is returned. Then, what happens next depends on how many queries are returned.
+The generated queries can be accessed from the `tool_calls` field of the object that is returned. Then, what happens next depends on whether tool calls are returned.
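
Downstream of query generation, the retrieve-and-respond flow that the `Chatbot` class implements is unchanged. Below is a condensed sketch of that flow, assuming the `co` client and `query_gen_tool` shown above plus a `vectorstore` object with a `retrieve()` method as in this cookbook; the user message is hypothetical:

```python PYTHON
message = "What is chunking in RAG?"  # hypothetical user message

# Step 1: ask the model for search queries via the tool
response = co.chat(
    message=message,
    model="command-r",
    tools=query_gen_tool,
    force_single_step=True,
)

if response.tool_calls:
    # Step 2: retrieve document chunks for each generated query
    documents = []
    for query in response.tool_calls[0].parameters["queries"]:
        documents.extend(vectorstore.retrieve(query))

    # Step 3: ground the final answer in the retrieved chunks
    answer = co.chat(message=message, model="command-r", documents=documents)
else:
    # No tool call means no retrieval is needed; answer directly
    answer = co.chat(message=message, model="command-r")

print(answer.text)
```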
@@ -463,18 +463,34 @@ class Chatbot: # print(f"User: {message}") # Uncomment for Google Colab to avoid printing the same thing twice # Generate search queries (if any) + query_gen_tool = [ + { + "name": "internet_search", + "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet", + "parameter_definitions": { + "queries": { + "description": "a list of queries to search the internet with.", + "type": "List[str]", + "required": True, + } + }, + } + ] + response = co.chat(message=message, model="command-r", - search_queries_only=True) + tools=query_gen_tool, + force_single_step=True) # If there are search queries, retrieve document chunks and respond - if response.search_queries: + if response.tool_calls: print("Retrieving information...", end="") # Retrieve document chunks for each query documents = [] - for query in response.search_queries: - documents.extend(self.vectorstore.retrieve(query.text)) + search_queries = response.tool_calls[0].parameters["queries"] + for query in search_queries: + documents.extend(self.vectorstore.retrieve(query)) # Use document chunks to respond response = co.chat_stream( From 34700c27842406403dda8815002c15734d1efaba Mon Sep 17 00:00:00 2001 From: co-varun Date: Fri, 26 Sep 2025 15:49:53 -0400 Subject: [PATCH 12/12] COMPLETE: search_queries_only migration to tool calls All functional code examples migrated from search_queries_only to tool calls NOTEBOOKS COMPLETED: - notebooks/guides/getting-started/tutorial_pt6.ipynb - notebooks/guides/cohere-on-azure/azure-ai-rag.ipynb - notebooks/guides/RAG_with_Chat_Embed_and_Rerank_via_Pinecone.ipynb - notebooks/guides/Analysis_of_Form_10_K_Using_Cohere_and_RAG.ipynb - notebooks/guides/Optimizing_rag_workflows_with_rerank_and_query_rephrasing.ipynb - notebooks/llmu/RAG_with_Chat_Embed_and_Rerank.ipynb - notebooks/llmu/co_aws_ch6_rag_bedrock_sm.ipynb - notebooks/agents/agentic-RAG/agentic_rag_langchain.ipynb FULL MIGRATION SCOPE: Main Documentation (5), Tutorials (2), Scripts (3), Notebooks (8) = 18 total files IMPACT: Complete deprecation migration - users see modern tool call examples REMAINING: Only 3 auto-generated JSON metadata files + text references --- .../agentic-RAG/agentic_rag_langchain.ipynb | 62 ++++++++++++++----- ...is_of_Form_10_K_Using_Cohere_and_RAG.ipynb | 27 ++++++-- ...ows_with_rerank_and_query_rephrasing.ipynb | 20 +++++- ...h_Chat_Embed_and_Rerank_via_Pinecone.ipynb | 28 +++++++-- .../guides/cohere-on-azure/azure-ai-rag.ipynb | 24 +++++-- .../guides/getting-started/tutorial_pt6.ipynb | 42 +++++++++---- .../llmu/RAG_with_Chat_Embed_and_Rerank.ipynb | 24 +++++-- .../llmu/co_aws_ch6_rag_bedrock_sm.ipynb | 24 +++++-- 8 files changed, 196 insertions(+), 55 deletions(-) diff --git a/notebooks/agents/agentic-RAG/agentic_rag_langchain.ipynb b/notebooks/agents/agentic-RAG/agentic_rag_langchain.ipynb index 07051c0b4..d1f6bf729 100644 --- a/notebooks/agents/agentic-RAG/agentic_rag_langchain.ipynb +++ b/notebooks/agents/agentic-RAG/agentic_rag_langchain.ipynb @@ -328,19 +328,34 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def process_query(query, retriever):\n", " \"\"\"Runs query augmentation, retrieval, rerank and final generation in one call.\"\"\"\n", - " augmented_queries=co.chat(message=query,model='command-r-plus',temperature=0.2, search_queries_only=True)\n", + " query_gen_tool = [\n", + " {\n", + " \"name\": \"internet_search\",\n", + " \"description\": \"Returns a list of 
relevant document snippets for a textual query retrieved from the internet\",\n", + " \"parameter_definitions\": {\n", + " \"queries\": {\n", + " \"description\": \"a list of queries to search the internet with.\",\n", + " \"type\": \"List[str]\",\n", + " \"required\": True,\n", + " }\n", + " },\n", + " }\n", + " ]\n", + " \n", + " augmented_queries=co.chat(message=query,model='command-r-plus',temperature=0.2, tools=query_gen_tool, force_single_step=True)\n", " #augment queries\n", - " if augmented_queries.search_queries:\n", + " if augmented_queries.tool_calls:\n", " reranked_docs=[]\n", - " for itm in augmented_queries.search_queries:\n", - " docs=retriever.invoke(itm.text)\n", - " temp_rerank = rerank_cohere(itm.text,docs)\n", + " search_queries = augmented_queries.tool_calls[0].parameters[\"queries\"]\n", + " for itm in search_queries:\n", + " docs=retriever.invoke(itm)\n", + " temp_rerank = rerank_cohere(itm,docs)\n", " reranked_docs.extend(temp_rerank)\n", " documents = [{\"title\": f\"chunk {i}\", \"snippet\": reranked_docs[i]} for i in range(len(reranked_docs))]\n", " else:\n", @@ -429,14 +444,14 @@ "source": [ "an example of asking a follow up question that relies on the chat history but does not require a re-run of RAG.\n", "\n", - "The search_queries_only flag can be used to determine whether RAG needs to be rerun or not i.e. it can help easily identify if the query passed needs retrieval.\n", + "Tools can be used to determine whether RAG needs to be rerun or not i.e. they can help easily identify if the query passed needs retrieval.\n", "\n", "In the example below, the else statement is invoked based on query2. In the else we pass in history without documents as the new query does not need to call the RAG pipeline " ] }, { "cell_type": "code", - "execution_count": 12, + "execution_count": null, "metadata": {}, "outputs": [ { @@ -451,8 +466,8 @@ ], "source": [ "query2='divide this by two'\n", - "augmented_queries=co.chat(message=query2,model='command-r-plus',temperature=0.2, search_queries_only=True)\n", - "if augmented_queries.search_queries:\n", + "augmented_queries=co.chat(message=query2,model='command-r-plus',temperature=0.2, tools=query_gen_tool, force_single_step=True)\n", + "if augmented_queries.tool_calls:\n", " print('RAG is needed')\n", " final_answer, final_answer_docs = process_query(query, retriever)\n", " print(final_answer)\n", @@ -497,7 +512,7 @@ }, { "cell_type": "code", - "execution_count": 19, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -677,13 +692,28 @@ " \n", " def process_query(self,query):\n", " \"\"\"Runs query augmentation, retrieval, rerank and generation in one call.\"\"\"\n", - " augmented_queries=co.chat(message=query,model=self.generation_model,temperature=self.temperature, search_queries_only=True)\n", + " query_gen_tool = [\n", + " {\n", + " \"name\": \"internet_search\",\n", + " \"description\": \"Returns a list of relevant document snippets for a textual query retrieved from the internet\",\n", + " \"parameter_definitions\": {\n", + " \"queries\": {\n", + " \"description\": \"a list of queries to search the internet with.\",\n", + " \"type\": \"List[str]\",\n", + " \"required\": True,\n", + " }\n", + " },\n", + " }\n", + " ]\n", + " \n", + " augmented_queries=co.chat(message=query,model=self.generation_model,temperature=self.temperature, tools=query_gen_tool, force_single_step=True)\n", " #augment queries\n", - " if augmented_queries.search_queries:\n", + " if augmented_queries.tool_calls:\n", " reranked_docs=[]\n", 
- " for itm in augmented_queries.search_queries:\n", - " docs=self.retriever.invoke(itm.text)\n", - " temp_rerank = self.rerank_cohere(itm.text,docs,model=self.rerank_model,top_n=self.top_k_rerank)\n", + " search_queries = augmented_queries.tool_calls[0].parameters[\"queries\"]\n", + " for itm in search_queries:\n", + " docs=self.retriever.invoke(itm)\n", + " temp_rerank = self.rerank_cohere(itm,docs,model=self.rerank_model,top_n=self.top_k_rerank)\n", " reranked_docs.extend(temp_rerank)\n", " documents = [{\"title\": f\"chunk {i}\", \"snippet\": reranked_docs[i]} for i in range(len(reranked_docs))]\n", " else:\n", diff --git a/notebooks/guides/Analysis_of_Form_10_K_Using_Cohere_and_RAG.ipynb b/notebooks/guides/Analysis_of_Form_10_K_Using_Cohere_and_RAG.ipynb index e4bf64b84..4df78c806 100644 --- a/notebooks/guides/Analysis_of_Form_10_K_Using_Cohere_and_RAG.ipynb +++ b/notebooks/guides/Analysis_of_Form_10_K_Using_Cohere_and_RAG.ipynb @@ -509,10 +509,25 @@ "source": [ "PROMPT = \"List the overall revenue numbers for 2021, 2022, and 2023 in the 10-K as bullet points, then explain the revenue growth trends.\"\n", "\n", + "# Define search query generation tool\n", + "query_gen_tool = [\n", + " {\n", + " \"name\": \"internet_search\",\n", + " \"description\": \"Returns a list of relevant document snippets for a textual query retrieved from the internet\",\n", + " \"parameter_definitions\": {\n", + " \"queries\": {\n", + " \"description\": \"a list of queries to search the internet with.\",\n", + " \"type\": \"List[str]\",\n", + " \"required\": True,\n", + " }\n", + " },\n", + " }\n", + "]\n", + "\n", "# Get queries to run against our index from the model\n", - "r = co.chat(PROMPT, model=\"command-r\", search_queries_only=True)\n", - "if r.search_queries:\n", - " queries = [q[\"text\"] for q in r.search_queries]\n", + "r = co.chat(PROMPT, model=\"command-r\", tools=query_gen_tool, force_single_step=True)\n", + "if r.tool_calls:\n", + " queries = r.tool_calls[0].parameters[\"queries\"]\n", "else:\n", " print(\"No queries returned by the model\")" ] @@ -828,9 +843,9 @@ "def get_response(prompt, rag):\n", " if rag:\n", " # Get queries to run against our index from the model\n", - " r = co.chat(prompt, model=\"command-r\", search_queries_only=True)\n", - " if r.search_queries:\n", - " queries = [q[\"text\"] for q in r.search_queries]\n", + " r = co.chat(prompt, model=\"command-r\", tools=query_gen_tool, force_single_step=True)\n", + " if r.tool_calls:\n", + " queries = r.tool_calls[0].parameters[\"queries\"]\n", " else:\n", " print(\"No queries returned by the model\")\n", "\n", diff --git a/notebooks/guides/Optimizing_rag_workflows_with_rerank_and_query_rephrasing.ipynb b/notebooks/guides/Optimizing_rag_workflows_with_rerank_and_query_rephrasing.ipynb index 759716418..ce4f982cd 100644 --- a/notebooks/guides/Optimizing_rag_workflows_with_rerank_and_query_rephrasing.ipynb +++ b/notebooks/guides/Optimizing_rag_workflows_with_rerank_and_query_rephrasing.ipynb @@ -963,7 +963,7 @@ }, { "cell_type": "code", - "execution_count": 580, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -994,8 +994,22 @@ " str: The chat response generated by the model.\n", " \"\"\"\n", " # get search queries\n", - " generated_queries = co.chat(message=message, model='command-r', search_queries_only=True)\n", - " queries = [q.text for q in generated_queries.search_queries]\n", + " query_gen_tool = [\n", + " {\n", + " \"name\": \"internet_search\",\n", + " \"description\": \"Returns a list of relevant 
document snippets for a textual query retrieved from the internet\",\n", + " \"parameter_definitions\": {\n", + " \"queries\": {\n", + " \"description\": \"a list of queries to search the internet with.\",\n", + " \"type\": \"List[str]\",\n", + " \"required\": True,\n", + " }\n", + " },\n", + " }\n", + " ]\n", + " \n", + " generated_queries = co.chat(message=message, model='command-r', tools=query_gen_tool, force_single_step=True)\n", + " queries = generated_queries.tool_calls[0].parameters[\"queries\"] if generated_queries.tool_calls else []\n", " print(f\"Queries: {queries}\")\n", " \n", " # get the search results\n", diff --git a/notebooks/guides/RAG_with_Chat_Embed_and_Rerank_via_Pinecone.ipynb b/notebooks/guides/RAG_with_Chat_Embed_and_Rerank_via_Pinecone.ipynb index 3167d60e4..4beeee82e 100644 --- a/notebooks/guides/RAG_with_Chat_Embed_and_Rerank_via_Pinecone.ipynb +++ b/notebooks/guides/RAG_with_Chat_Embed_and_Rerank_via_Pinecone.ipynb @@ -497,7 +497,7 @@ "\n", "The `run()` method will be used to run the chatbot application. It begins with the logic for getting the user message, along with a way for the user to end the conversation. \n", "\n", - "Based on the user message, the chatbot needs to decide if it needs to consult external information before responding. If so, the chatbot determines an optimal set of search queries to use for retrieval. When we call `co.chat()` with `search_queries_only=True`, the Chat endpoint handles this for us automatically.\n", + "Based on the user message, the chatbot needs to decide if it needs to consult external information before responding. If so, the chatbot determines an optimal set of search queries to use for retrieval. When we call `co.chat()` with a search query generation tool, the Chat endpoint handles this for us automatically.\n", "\n", "The generated queries can be accessed from the `search_queries` field of the object that is returned. Then, what happens next depends on how many queries are returned.\n", "- If queries are returned, we call the `retrieve()` method of the Vectorstore object for the retrieval step. 
The retrieved document chunks are then passed to the Chat endpoint by adding a `documents` parameter when we call `co.chat()` again.\n", @@ -510,7 +510,7 @@ }, { "cell_type": "code", - "execution_count": 116, + "execution_count": null, "id": "d2c15a1f", "metadata": { "colab": { @@ -588,18 +588,34 @@ " # print(f\"User: {message}\") # Uncomment for Google Colab to avoid printing the same thing twice\n", "\n", " # Generate search queries (if any)\n", + " query_gen_tool = [\n", + " {\n", + " \"name\": \"internet_search\",\n", + " \"description\": \"Returns a list of relevant document snippets for a textual query retrieved from the internet\",\n", + " \"parameter_definitions\": {\n", + " \"queries\": {\n", + " \"description\": \"a list of queries to search the internet with.\",\n", + " \"type\": \"List[str]\",\n", + " \"required\": True,\n", + " }\n", + " },\n", + " }\n", + " ]\n", + " \n", " response = co.chat(message=message,\n", " model=\"command-r\",\n", - " search_queries_only=True)\n", + " tools=query_gen_tool,\n", + " force_single_step=True)\n", "\n", " # If there are search queries, retrieve document chunks and respond\n", - " if response.search_queries:\n", + " if response.tool_calls:\n", " print(\"Retrieving information...\", end=\"\")\n", "\n", " # Retrieve document chunks for each query\n", " documents = []\n", - " for query in response.search_queries:\n", - " documents.extend(self.vectorstore.retrieve(query.text))\n", + " search_queries = response.tool_calls[0].parameters[\"queries\"]\n", + " for query in search_queries:\n", + " documents.extend(self.vectorstore.retrieve(query))\n", "\n", " # Use document chunks to respond\n", " response = co.chat_stream(\n", diff --git a/notebooks/guides/cohere-on-azure/azure-ai-rag.ipynb b/notebooks/guides/cohere-on-azure/azure-ai-rag.ipynb index 0334489d4..4db2d3aba 100644 --- a/notebooks/guides/cohere-on-azure/azure-ai-rag.ipynb +++ b/notebooks/guides/cohere-on-azure/azure-ai-rag.ipynb @@ -447,7 +447,7 @@ }, { "cell_type": "code", - "execution_count": 24, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -456,16 +456,32 @@ " if chat_history is None:\n", " chat_history = []\n", " \n", + " # Define search query generation tool\n", + " query_gen_tool = [\n", + " {\n", + " \"name\": \"internet_search\",\n", + " \"description\": \"Returns a list of relevant document snippets for a textual query retrieved from the internet\",\n", + " \"parameter_definitions\": {\n", + " \"queries\": {\n", + " \"description\": \"a list of queries to search the internet with.\",\n", + " \"type\": \"List[str]\",\n", + " \"required\": True,\n", + " }\n", + " },\n", + " }\n", + " ]\n", + " \n", " # Generate search queries, if any \n", " response = co_chat.chat(\n", " message=message,\n", - " search_queries_only=True,\n", + " tools=query_gen_tool,\n", + " force_single_step=True,\n", " chat_history=chat_history,\n", " )\n", " \n", " search_queries = []\n", - " for query in response.search_queries:\n", - " search_queries.append(query.text)\n", + " if response.tool_calls:\n", + " search_queries = response.tool_calls[0].parameters[\"queries\"]\n", "\n", " # If there are search queries, retrieve the documents\n", " if search_queries:\n", diff --git a/notebooks/guides/getting-started/tutorial_pt6.ipynb b/notebooks/guides/getting-started/tutorial_pt6.ipynb index 55629045f..79e5be9b9 100644 --- a/notebooks/guides/getting-started/tutorial_pt6.ipynb +++ b/notebooks/guides/getting-started/tutorial_pt6.ipynb @@ -181,7 +181,7 @@ "\n", "Let's now look at the first 
step—search query generation. The chatbot needs to generate an optimal set of search queries to use for retrieval. \n", "\n", - "The Chat endpoint has a feature that handles this for us automatically. This is done by adding the `search_queries_only=True` parameter to the Chat endpoint call.\n", + "The Chat endpoint can handle this for us using Tools. This is done by defining a search query generation tool and calling the Chat endpoint with the `tools` parameter.\n", "\n", "It will generate a list of search queries based on a user message. Depending on the message, it can be one or more queries.\n", "\n", @@ -190,7 +190,7 @@ }, { "cell_type": "code", - "execution_count": 3, + "execution_count": null, "metadata": {}, "outputs": [ { @@ -202,16 +202,32 @@ } ], "source": [ + "# Define the search query generation tool\n", + "query_gen_tool = [\n", + " {\n", + " \"name\": \"internet_search\",\n", + " \"description\": \"Returns a list of relevant document snippets for a textual query retrieved from the internet\",\n", + " \"parameter_definitions\": {\n", + " \"queries\": {\n", + " \"description\": \"a list of queries to search the internet with.\",\n", + " \"type\": \"List[str]\",\n", + " \"required\": True,\n", + " }\n", + " },\n", + " }\n", + "]\n", + "\n", "# Add the user query\n", "query = \"How to stay connected with the company and do you organize team events?\"\n", "\n", "# Generate the search queries\n", "response = co.chat(message=query,\n", - " search_queries_only=True)\n", + " tools=query_gen_tool,\n", + " force_single_step=True)\n", "\n", "queries = []\n", - "for r in response.search_queries:\n", - " queries.append(r.text)\n", + "if response.tool_calls:\n", + " queries = response.tool_calls[0].parameters[\"queries\"]\n", " \n", "print(queries)" ] @@ -225,7 +241,7 @@ }, { "cell_type": "code", - "execution_count": 4, + "execution_count": null, "metadata": {}, "outputs": [ { @@ -242,11 +258,12 @@ "\n", "# Generate the search queries\n", "response = co.chat(message=query,\n", - " search_queries_only=True)\n", + " tools=query_gen_tool,\n", + " force_single_step=True)\n", "\n", "queries = []\n", - "for r in response.search_queries:\n", - " queries.append(r.text)\n", + "if response.tool_calls:\n", + " queries = response.tool_calls[0].parameters[\"queries\"]\n", " \n", "print(queries)" ] @@ -312,7 +329,7 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -321,8 +338,9 @@ "\n", "# Generate the search query\n", "response = co.chat(message=query,\n", - " search_queries_only=True)\n", - "query_optimized = response.search_queries[0].text\n", + " tools=query_gen_tool,\n", + " force_single_step=True)\n", + "query_optimized = response.tool_calls[0].parameters[\"queries\"][0] if response.tool_calls else query\n", "\n", "# Embed the search query\n", "query_emb = co.embed(\n", diff --git a/notebooks/llmu/RAG_with_Chat_Embed_and_Rerank.ipynb b/notebooks/llmu/RAG_with_Chat_Embed_and_Rerank.ipynb index fe60abd90..52d19a85b 100644 --- a/notebooks/llmu/RAG_with_Chat_Embed_and_Rerank.ipynb +++ b/notebooks/llmu/RAG_with_Chat_Embed_and_Rerank.ipynb @@ -486,7 +486,7 @@ }, { "cell_type": "code", - "execution_count": 15, + "execution_count": null, "id": "d2c15a1f", "metadata": { "colab": { @@ -502,15 +502,31 @@ " if chat_history is None:\n", " chat_history = []\n", " \n", + " # Define search query generation tool\n", + " query_gen_tool = [\n", + " {\n", + " \"name\": \"internet_search\",\n", + " \"description\": \"Returns a list of relevant 
document snippets for a textual query retrieved from the internet\",\n", + " \"parameter_definitions\": {\n", + " \"queries\": {\n", + " \"description\": \"a list of queries to search the internet with.\",\n", + " \"type\": \"List[str]\",\n", + " \"required\": True,\n", + " }\n", + " },\n", + " }\n", + " ]\n", + " \n", " # Generate search queries, if any \n", " response = co.chat(message=message,\n", " model=\"command-a-03-2025\",\n", - " search_queries_only=True,\n", + " tools=query_gen_tool,\n", + " force_single_step=True,\n", " chat_history=chat_history)\n", " \n", " search_queries = []\n", - " for query in response.search_queries:\n", - " search_queries.append(query.text)\n", + " if response.tool_calls:\n", + " search_queries = response.tool_calls[0].parameters[\"queries\"]\n", "\n", " # If there are search queries, retrieve the documents\n", " if search_queries:\n", diff --git a/notebooks/llmu/co_aws_ch6_rag_bedrock_sm.ipynb b/notebooks/llmu/co_aws_ch6_rag_bedrock_sm.ipynb index 33d6636c2..67aa3e0e4 100644 --- a/notebooks/llmu/co_aws_ch6_rag_bedrock_sm.ipynb +++ b/notebooks/llmu/co_aws_ch6_rag_bedrock_sm.ipynb @@ -551,7 +551,7 @@ }, { "cell_type": "code", - "execution_count": 26, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -560,15 +560,31 @@ " if chat_history is None:\n", " chat_history = []\n", " \n", + " # Define search query generation tool\n", + " query_gen_tool = [\n", + " {\n", + " \"name\": \"internet_search\",\n", + " \"description\": \"Returns a list of relevant document snippets for a textual query retrieved from the internet\",\n", + " \"parameter_definitions\": {\n", + " \"queries\": {\n", + " \"description\": \"a list of queries to search the internet with.\",\n", + " \"type\": \"List[str]\",\n", + " \"required\": True,\n", + " }\n", + " },\n", + " }\n", + " ]\n", + " \n", " # Generate search queries, if any \n", " response = co_br.chat(message=message,\n", - " search_queries_only=True,\n", + " tools=query_gen_tool,\n", + " force_single_step=True,\n", " model=\"cohere.command-r-plus-v1:0\",\n", " chat_history=chat_history)\n", " \n", " search_queries = []\n", - " for query in response.search_queries:\n", - " search_queries.append(query.text)\n", + " if response.tool_calls:\n", + " search_queries = response.tool_calls[0].parameters[\"queries\"]\n", "\n", " # If there are search queries, retrieve the documents\n", " if search_queries:\n",
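
Every file touched above extracts the generated queries the same way, so a slightly more defensive version of that step may be worth standardizing on. A sketch, assuming the v1 SDK response shape used throughout; `extract_queries` is a hypothetical helper, not part of the SDK:

```python PYTHON
def extract_queries(response) -> list:
    """Defensively pull generated search queries out of a tool-calling chat response.

    Returns [] when the model made no tool call or omitted the queries parameter.
    """
    if not response.tool_calls:
        return []
    parameters = response.tool_calls[0].parameters or {}
    queries = parameters.get("queries", [])
    # The parameter is declared as List[str]; tolerate a single string anyway
    if isinstance(queries, str):
        queries = [queries]
    return [q for q in queries if isinstance(q, str) and q.strip()]
```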