@@ -0,0 +1,41 @@
---
title: "Announcing Major Command Deprecations"
slug: "changelog/2025-09-15-major-command-deprecations"
createdAt: "Mon Sep 15 2025 00:00:00 (MST)"
hidden: false
description: >-
This announcement covers a series of major deprecations, including classic Command models, several request parameters, and entire endpoints.
---

As part of our ongoing commitment to delivering advanced AI solutions, we are streamlining our offerings to focus on the best-performing tools. The following models, features, and API endpoints will be deprecated:

Deprecated Models:
- command-light (legacy) → Use command-r-08-2024 or command-a-03-2025 instead (see the migration sketch below).
- command → Use command-r-03-2024 or command-r-plus-04-2024.
- summarize → Refer to the migration guide for alternatives.
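
For most applications, migrating is a one-line model-name change in existing Chat calls. A minimal sketch using the Python SDK (the prompt and key placeholder are illustrative):

```python PYTHON
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key for illustration

# Before (deprecated):
# response = co.chat(message="Draft a renewal reminder email.", model="command")

# After: point the same call at a recommended replacement model
response = co.chat(
    message="Draft a renewal reminder email.",
    model="command-r-03-2024",
)
print(response.text)
```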

Retired Fine-Tuning Capabilities:
All fine-tuning options, via both the dashboard and the API, for models including command-light, command, command-r, classify, and rerank are being retired. Previously fine-tuned models will no longer be accessible.

Deprecated Features and API Endpoints:
- /v1/connectors (Managed connectors for RAG)
- The `connectors` and `search_queries_only` parameters on /v1/chat (the /v1/chat endpoint itself is not deprecated; a migration sketch follows this list)
> **Contributor:** Should we separate this line from the rest? It reads a bit like we're deprecating /v1/chat?

- /v1/generate (Legacy generative endpoint)
- /v1/summarize (Legacy summarization endpoint)
- /v1/classify
- Slack App integration
- Coral Web UI (chat.cohere.com)
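
Of these, `search_queries_only` has the most mechanical migration: the same query-generation behavior can be reproduced by passing a query-generation tool to Chat, which is the pattern the updated cookbooks below adopt. A minimal sketch (the tool name and schema mirror those cookbooks; the prompt and key placeholder are illustrative):

```python PYTHON
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key for illustration

# A tool whose only job is to make the model emit search queries
query_gen_tool = [
    {
        "name": "internet_search",
        "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet",
        "parameter_definitions": {
            "queries": {
                "description": "a list of queries to search the internet with.",
                "type": "List[str]",
                "required": True,
            }
        },
    }
]

# Before (deprecated):
# response = co.chat(message=prompt, model="command-r", search_queries_only=True)
# queries = [q["text"] for q in response.search_queries]

# After: read the generated queries off the tool call instead
response = co.chat(
    message="How did revenue change year over year?",
    model="command-r",
    tools=query_gen_tool,
    force_single_step=True,
)
if response.tool_calls:
    queries = response.tool_calls[0].parameters["queries"]
else:
    queries = []  # the model decided no retrieval is needed
print(queries)
```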

Why These Changes?
We are aligning with evolving market needs, enhancing performance, and optimizing resources. Newer models like Command A offer superior capabilities. This transition ensures we remain at the forefront of innovation.

Support and Migration:
Our support team is ready to assist with your transition. For guidance, contact us at support@cohere.com or explore our documentation. We recommend assessing your current usage and planning your migration to the recommended alternatives.

Thank you for your understanding and continued partnership. We look forward to delivering innovative AI solutions that drive your success.

Best regards,

The Cohere Support Team

support@cohere.com
56 changes: 43 additions & 13 deletions fern/pages/cookbooks/agentic-rag-mixed-data.mdx
@@ -241,13 +241,28 @@ With our database in place, we can run queries against it. The query process can
```python PYTHON
 def process_query(query, retriever):
     """Runs query augmentation, retrieval, rerank and final generation in one call."""
-    augmented_queries=co.chat(message=query,model='command-a-03-2025',temperature=0.2, search_queries_only=True)
+    query_gen_tool = [
+        {
+            "name": "internet_search",
+            "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet",
+            "parameter_definitions": {
+                "queries": {
+                    "description": "a list of queries to search the internet with.",
+                    "type": "List[str]",
+                    "required": True,
+                }
+            },
+        }
+    ]
+
+    augmented_queries=co.chat(message=query,model='command-a-03-2025',temperature=0.2, tools=query_gen_tool, force_single_step=True)
     #augment queries
-    if augmented_queries.search_queries:
+    if augmented_queries.tool_calls:
         reranked_docs=[]
-        for itm in augmented_queries.search_queries:
-            docs=retriever.invoke(itm.text)
-            temp_rerank = rerank_cohere(itm.text,docs)
+        search_queries = augmented_queries.tool_calls[0].parameters["queries"]
+        for itm in search_queries:
+            docs=retriever.invoke(itm)
+            temp_rerank = rerank_cohere(itm,docs)
             reranked_docs.extend(temp_rerank)
         documents = [{"title": f"chunk {i}", "snippet": reranked_docs[i]} for i in range(len(reranked_docs))]
     else:
@@ -307,14 +322,14 @@ The final answer is from the documents below:

In the example below, we ask a follow-up question that relies on the chat history, but does not require a rerun of the RAG pipeline.

-We detect questions that do not require RAG by examining the `search_queries` object returned by calling `co.chat` to generate candidate queries to answer our question. If this object is empty, then the model has determined that a document query is not needed to answer the question.
+We detect questions that do not require RAG by examining the `tool_calls` object returned by calling `co.chat` with a search query generation tool. If this object is empty, then the model has determined that a document query is not needed to answer the question.

In the example below, the `else` statement is invoked based on `query2`. We still pass in the chat history, allowing the question to be answered with only the prior context.

```python PYTHON
 query2='divide this by two'
-augmented_queries=co.chat(message=query2,model='command-a-03-2025',temperature=0.2, search_queries_only=True)
-if augmented_queries.search_queries:
+augmented_queries=co.chat(message=query2,model='command-a-03-2025',temperature=0.2, tools=query_gen_tool, force_single_step=True)
+if augmented_queries.tool_calls:
     print('RAG is needed')
     final_answer, final_answer_docs = process_query(query, retriever)
     print(final_answer)
@@ -524,13 +539,28 @@ Unless the user asks for a different style of answer, you should answer in full

     def process_query(self,query):
         """Runs query augmentation, retrieval, rerank and generation in one call."""
-        augmented_queries=co.chat(message=query,model=self.generation_model,temperature=self.temperature, search_queries_only=True)
+        query_gen_tool = [
+            {
+                "name": "internet_search",
+                "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet",
+                "parameter_definitions": {
+                    "queries": {
+                        "description": "a list of queries to search the internet with.",
+                        "type": "List[str]",
+                        "required": True,
+                    }
+                },
+            }
+        ]
+
+        augmented_queries=co.chat(message=query,model=self.generation_model,temperature=self.temperature, tools=query_gen_tool, force_single_step=True)
         #augment queries
-        if augmented_queries.search_queries:
+        if augmented_queries.tool_calls:
             reranked_docs=[]
-            for itm in augmented_queries.search_queries:
-                docs=self.retriever.invoke(itm.text)
-                temp_rerank = self.rerank_cohere(itm.text,docs,model=self.rerank_model,top_n=self.top_k_rerank)
+            search_queries = augmented_queries.tool_calls[0].parameters["queries"]
+            for itm in search_queries:
+                docs=self.retriever.invoke(itm)
+                temp_rerank = self.rerank_cohere(itm,docs,model=self.rerank_model,top_n=self.top_k_rerank)
                 reranked_docs.extend(temp_rerank)
             documents = [{"title": f"chunk {i}", "snippet": reranked_docs[i]} for i in range(len(reranked_docs))]
         else:
27 changes: 21 additions & 6 deletions fern/pages/cookbooks/analysis-of-financial-forms.mdx
@@ -194,10 +194,25 @@ To learn more about document mode and query generation, check out [our documenta
```python PYTHON
 PROMPT = "List the overall revenue numbers for 2021, 2022, and 2023 in the 10-K as bullet points, then explain the revenue growth trends."

+# Define search query generation tool
+query_gen_tool = [
+    {
+        "name": "internet_search",
+        "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet",
+        "parameter_definitions": {
+            "queries": {
+                "description": "a list of queries to search the internet with.",
+                "type": "List[str]",
+                "required": True,
+            }
+        },
+    }
+]
+
 # Get queries to run against our index from the model
-r = co.chat(PROMPT, model="command-r", search_queries_only=True)
-if r.search_queries:
-    queries = [q["text"] for q in r.search_queries]
+r = co.chat(PROMPT, model="command-r", tools=query_gen_tool, force_single_step=True)
+if r.tool_calls:
+    queries = r.tool_calls[0].parameters["queries"]
 else:
     print("No queries returned by the model")
```
@@ -334,9 +349,9 @@ pages = [pytesseract.image_to_string(page) for page in pages]
 def get_response(prompt, rag):
     if rag:
         # Get queries to run against our index from the model
-        r = co.chat(prompt, model="command-r", search_queries_only=True)
-        if r.search_queries:
-            queries = [q["text"] for q in r.search_queries]
+        r = co.chat(prompt, model="command-r", tools=query_gen_tool, force_single_step=True)
+        if r.tool_calls:
+            queries = r.tool_calls[0].parameters["queries"]
         else:
             print("No queries returned by the model")
32 changes: 24 additions & 8 deletions fern/pages/cookbooks/rag-with-chat-embed.mdx
@@ -303,12 +303,12 @@ Next, we implement a class to handle the interaction between the user and the ch

The `run()` method will be used to run the chatbot application. It begins with the logic for getting the user message, along with a way for the user to end the conversation.

-Based on the user message, the chatbot needs to decide if it needs to consult external information before responding. If so, the chatbot determines an optimal set of search queries to use for retrieval. When we call `co.chat()` with `search_queries_only=True`, the Chat endpoint handles this for us automatically.
+Based on the user message, the chatbot needs to decide if it needs to consult external information before responding. If so, the chatbot determines an optimal set of search queries to use for retrieval. When we call `co.chat()` with a search query generation tool, the Chat endpoint handles this for us automatically.

-The generated queries can be accessed from the `search_queries` field of the object that is returned. Then, what happens next depends on how many queries are returned.
+The generated queries can be accessed from the `tool_calls` field of the object that is returned. Then, what happens next depends on whether tool calls are returned.

-- If queries are returned, we call the `retrieve()` method of the Vectorstore object for the retrieval step. The retrieved document chunks are then passed to the Chat endpoint by adding a `documents` parameter when we call `co.chat()` again.
-- Otherwise, if no queries are returned, we call the Chat endpoint another time, passing the user message without needing to add any documents to the call.
+- If tool calls are returned, we call the `retrieve()` method of the Vectorstore object for the retrieval step. The retrieved document chunks are then passed to the Chat endpoint by adding a `documents` parameter when we call `co.chat()` again.
+- Otherwise, if no tool calls are returned, we call the Chat endpoint another time, passing the user message without needing to add any documents to the call.

In either case, we also pass the `conversation_id` parameter, which retains the interactions between the user and the chatbot in the same conversation thread. We also enable the `stream` parameter so we can stream the chatbot response.

@@ -344,18 +344,34 @@ class Chatbot:
             # print(f"User: {message}") # Uncomment for Google Colab to avoid printing the same thing twice

             # Generate search queries (if any)
+            query_gen_tool = [
+                {
+                    "name": "internet_search",
+                    "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet",
+                    "parameter_definitions": {
+                        "queries": {
+                            "description": "a list of queries to search the internet with.",
+                            "type": "List[str]",
+                            "required": True,
+                        }
+                    },
+                }
+            ]
+
             response = co.chat(message=message,
                                model="command-r",
-                               search_queries_only=True)
+                               tools=query_gen_tool,
+                               force_single_step=True)

             # If there are search queries, retrieve document chunks and respond
-            if response.search_queries:
+            if response.tool_calls:
                 print("Retrieving information...", end="")

                 # Retrieve document chunks for each query
                 documents = []
-                for query in response.search_queries:
-                    documents.extend(self.vectorstore.retrieve(query.text))
+                search_queries = response.tool_calls[0].parameters["queries"]
+                for query in search_queries:
+                    documents.extend(self.vectorstore.retrieve(query))

                 # Use document chunks to respond
                 response = co.chat_stream(
6 changes: 5 additions & 1 deletion fern/pages/fine-tuning/chat-fine-tuning.mdx
@@ -2,12 +2,16 @@
title: "Fine-tuning for Cohere's Chat Model"
slug: "docs/chat-fine-tuning"

-hidden: False
+hidden: true
> **@invader89** (Collaborator, Oct 2, 2025): nit: I wonder if it's expected to set this page as hidden. I believe it will then return 404 instead of being available via direct link (and the same for other pages). In another similar PR we changed the hidden field in v1.yml and v2.yml to hide pages from navigation while keeping them available via direct links. (slack conversation)
description: "This document provides guidance on fine-tuning, evaluating, and improving chat models."
image: "../../assets/images/6ff1f01-cohere_meta_image.jpg"
keywords: "chat models, fine-tuning language models, fine-tuning, fine-tuning chat models"

createdAt: "Fri Nov 10 2023 18:20:28 GMT+0000 (Coordinated Universal Time)"
updatedAt: "Fri Mar 15 2024 04:42:37 GMT+0000 (Coordinated Universal Time)"
---
+<Warning>
+Cohere's fine-tuning feature was deprecated on September 15, 2025
+</Warning>
+
This section contains information on [fine-tuning](/docs/chat-starting-the-training), [evaluating](/docs/chat-understanding-the-results), and [improving](/docs/chat-improving-the-results) chat models.
@@ -1,7 +1,7 @@
---
title: Improving the Chat Fine-tuning Results
slug: docs/chat-improving-the-results
-hidden: false
+hidden: true
description: >-
Learn how to refine data, iterate on hyperparameters, and troubleshoot to
fine-tune your Chat model effectively.
@@ -10,6 +10,10 @@ keywords: 'fine-tuning, fine-tuning language models, chat models'
createdAt: 'Mon Nov 13 2023 17:30:43 GMT+0000 (Coordinated Universal Time)'
updatedAt: 'Fri Mar 15 2024 04:43:11 GMT+0000 (Coordinated Universal Time)'
---
+<Warning>
+Cohere's fine-tuning feature was deprecated on September 15, 2025
+</Warning>
+
There are several things you need to take into account to achieve the best fine-tuned model for Chat:

## Refining data quality
@@ -1,7 +1,7 @@
---
title: Preparing the Chat Fine-tuning Data
slug: docs/chat-preparing-the-data
-hidden: false
+hidden: true
description: >-
Prepare your data for fine-tuning a Command model for Chat with this
step-by-step guide, including data formatting, requirements, and best
@@ -11,6 +11,10 @@ keywords: 'fine-tuning, fine-tuning language models'
createdAt: 'Thu Nov 16 2023 02:53:26 GMT+0000 (Coordinated Universal Time)'
updatedAt: 'Tue May 07 2024 19:35:14 GMT+0000 (Coordinated Universal Time)'
---
+<Warning>
+Cohere's fine-tuning feature was deprecated on September 15, 2025
+</Warning>
+
In this section, we will walk through how you can prepare your data for fine-tuning one of the Command family of models for Chat.

### Data format
@@ -1,7 +1,7 @@
---
title: Starting the Chat Fine-Tuning Run
slug: docs/chat-starting-the-training
-hidden: false
+hidden: true
description: >-
Learn how to fine-tune a Command model for chat with the Cohere Web UI or
Python SDK, including data requirements, pricing, and calling your model.
@@ -10,6 +10,10 @@ keywords: 'fine-tuning, fine-tuning language models'
createdAt: 'Fri Nov 10 2023 18:22:10 GMT+0000 (Coordinated Universal Time)'
updatedAt: 'Wed Jun 12 2024 00:17:37 GMT+0000 (Coordinated Universal Time)'
---
+<Warning>
+Cohere's fine-tuning feature was deprecated on September 15, 2025
+</Warning>
+
In this section, we will walk through how you can start training a fine-tuning model for Chat on both the Web UI and the Python SDK.

## Cohere Dashboard
@@ -1,7 +1,7 @@
---
title: Understanding the Chat Fine-tuning Results
slug: docs/chat-understanding-the-results
-hidden: false
+hidden: true
description: >-
Learn how to evaluate and troubleshoot a fine-tuned chat model with accuracy
and loss metrics.
@@ -10,6 +10,10 @@ keywords: 'chat models, fine-tuning, fine-tuning language models'
createdAt: 'Fri Nov 10 2023 18:22:54 GMT+0000 (Coordinated Universal Time)'
updatedAt: 'Fri Mar 15 2024 04:43:03 GMT+0000 (Coordinated Universal Time)'
---
+<Warning>
+Cohere's fine-tuning feature was deprecated on September 15, 2025
+</Warning>
+
The outputs of a fine-tuned model for Chat are often best evaluated qualitatively. While the performance metrics are a good place to start, you'll still have to assess whether it _feels_ right to arrive at a comprehensive understanding of the model’s performance.

When you create a fine-tuned model for Chat, you will see metrics that look like this:
6 changes: 5 additions & 1 deletion fern/pages/fine-tuning/classify-fine-tuning.mdx
@@ -2,12 +2,16 @@
title: "Fine-tuning for Cohere's Classify Model"
slug: "docs/classify-fine-tuning"

-hidden: false
+hidden: true
description: "This document provides guidance on fine-tuning, evaluating, and improving classification models."
image: "../../assets/images/4aa4671-cohere_meta_image.jpg"
keywords: "classification, classification models, fine-tuning large language models"

createdAt: "Fri Nov 10 2023 18:12:45 GMT+0000 (Coordinated Universal Time)"
updatedAt: "Fri Mar 15 2024 04:41:11 GMT+0000 (Coordinated Universal Time)"
---
+<Warning>
+Cohere's fine-tuning feature was deprecated on September 15, 2025
+</Warning>
+
This section contains information on [fine-tuning](/docs/classify-starting-the-training), [evaluating](/docs/classify-understanding-the-results), and [improving](/docs/classify-improving-the-results) classification models.
@@ -1,7 +1,7 @@
---
title: Improving the Classify Fine-tuning Results
slug: docs/classify-improving-the-results
-hidden: false
+hidden: true
description: >-
Troubleshoot your fine-tuned classification model with these tips for refining
data quality and improving results.
@@ -10,6 +10,10 @@ keywords: 'classification models, fine-tuning, fine-tuning classification models
createdAt: 'Fri Nov 10 2023 20:16:25 GMT+0000 (Coordinated Universal Time)'
updatedAt: 'Fri Mar 15 2024 04:41:45 GMT+0000 (Coordinated Universal Time)'
---
+<Warning>
+Cohere's fine-tuning feature was deprecated on September 15, 2025
+</Warning>
+
There are several things you need to take into account to achieve the best fine-tuned model for Classification, all of which are based on giving the model higher-quality data.

## Refining data quality
@@ -1,15 +1,21 @@
---
title: Preparing the Classify Fine-tuning data
slug: docs/classify-preparing-the-data
-hidden: false
+hidden: true
description: >-
Learn how to prepare your data for fine-tuning classification models,
including single-label and multi-label data formats and dataset cleaning tips.
image: ../../../assets/images/033184f-cohere_meta_image.jpg
keywords: 'classification models, fine-tuning, fine-tuning language models'
createdAt: 'Wed Nov 15 2023 22:21:51 GMT+0000 (Coordinated Universal Time)'
updatedAt: 'Wed Apr 03 2024 15:23:42 GMT+0000 (Coordinated Universal Time)'

---

+<Warning>
+Cohere's fine-tuning feature was deprecated on September 15, 2025
+</Warning>

In this section, we will walk through how you can prepare your data for fine-tuning models for Classification.

For classification fine-tuning jobs we can choose between two types of datasets: