02_Overview_Cosmos_DB/README.md
+1 -1 (1 addition & 1 deletion)
@@ -14,7 +14,7 @@ The focus for this developer guide is [Azure Cosmos DB for NoSQL](https://learn.
 
 The [RU architecture](https://learn.microsoft.com/azure/cosmos-db/request-units) for Azure Cosmos DB for NoSQL offers instantaneous scalability with zero warmup period, automatic and transparent sharding, and 99.999% availability. It supports active-active databases across multiple regions, cost-efficient, granular, unlimited scalability, real-time analytics, and serverless deployments paying only per operation.
 
-[vCore-based Azure Cosmos DB for NoSQL architecture](https://learn.microsoft.com/azure/cosmos-db/convert-vcore-to-request-unit) integrates AI-based applications with private organizational data, with text indexing for easy querying. Simplify the development process with high-capacity vertical scaling and free 35-day backups with a point-in-time restore (PITR).
+[Azure Cosmos DB for NoSQL architecture](https://learn.microsoft.com/azure/cosmos-db/convert-vcore-to-request-unit) integrates AI-based applications with private organizational data, with text indexing for easy querying. Simplify the development process with high-capacity vertical scaling and free 35-day backups with a point-in-time restore (PITR).
 
 The [choice between vCore and Request Units (RU)](https://learn.microsoft.com/azure/cosmos-db/convert-vcore-to-request-unit) in Azure Cosmos DB for NoSQL API depends on the workload. A list of [compatibility and feature support between RU and vCore](https://learn.microsoft.com/azure/cosmos-db/mongodb/vcore/compatibility) is available.
03_Overview_Azure_OpenAI/README.md
+1 -1 (1 addition & 1 deletion)
@@ -46,7 +46,7 @@ Developers can use Azure AI Services, along with other Azure services, to build
 
 ### Azure AI Services
 
-While this guide focuses on building intelligent apps using Azure OpenAI combined with vCore-based Azure Cosmos DB for NoSQL, the Azure AI Platform consists of many additional AI services. Each AI service is built to fit a specific AI and/or Machine Learning (ML) need.
+While this guide focuses on building intelligent apps using Azure OpenAI combined with Azure Cosmos DB for NoSQL, the Azure AI Platform consists of many additional AI services. Each AI service is built to fit a specific AI and/or Machine Learning (ML) need.
 
 Here's a list of the AI services within the [Azure AI platform](https://learn.microsoft.com/azure/ai-services/what-are-ai-services):
07_Create_First_Cosmos_DB_Project/README.md
+1 -1 (1 addition & 1 deletion)
@@ -14,7 +14,7 @@ Learn more about the pre-requisites and installation of the emulator [here](http
 
 >**NOTE**: When using the Azure CosmosDB emulator using the API for MongoDB it must be started with the [MongoDB endpoint options enabled](https://learn.microsoft.com/azure/cosmos-db/how-to-develop-emulator?tabs=windows%2Cpython&pivots=api-mongodb#start-the-emulator) at the command-line.
 
-**The Azure Cosmos DB emulator does not support vector search. To complete the vector search and AI-related labs, a vCore-based Azure Cosmos DB for NoSQL account in Azure is required.**
+**The Azure Cosmos DB emulator does not support vector search. To complete the vector search and AI-related labs, an Azure Cosmos DB for NoSQL account in Azure is required.**
09_Vector_Search_Cosmos_DB/README.md
+7 -7 (7 additions & 7 deletions)
@@ -1,12 +1,12 @@
 # Use vector search on embeddings in Azure Cosmos DB for NoSQL
 
->**NOTE**: vCore-based Azure Cosmos DB for NoSQL supports vector search on embeddings. This functionality is not supported on RUs-based accounts.
+>**NOTE**: Azure Cosmos DB for NoSQL supports vector search on embeddings. This functionality is not supported on RUs-based accounts.
 
 ## Embeddings and vector search
 
 Embedding is a way of serializing the semantic meaning of data into a vector representation. Because the generated vector embedding represents the semantic meaning, it means that when it is searched, it can find similar data based on the semantic meaning of the data rather than exact text. Data can come from many sources, including text, images, audio, and video. Because the data is represented as a vector, vector search can, therefore, find similar data across all different types of data.
 
-Embeddings are created by sending data to an embedding model, where it is transformed into a vector, which then can be stored as a vector field within its source document in vCore-based Azure Cosmos DB for NoSQL. vCore-based Azure Cosmos DB for NoSQL supports the creation of vector search indexes on top of these vector fields. A vector search index is a collection of vectors in [latent space](https://idl.cs.washington.edu/papers/latent-space-cartography/) that enables a semantic similarity search across all data (vectors) contained within.
+Embeddings are created by sending data to an embedding model, where it is transformed into a vector, which then can be stored as a vector field within its source document in Azure Cosmos DB for NoSQL. Azure Cosmos DB for NoSQL supports the creation of vector search indexes on top of these vector fields. A vector search index is a collection of vectors in [latent space](https://idl.cs.washington.edu/papers/latent-space-cartography/) that enables a semantic similarity search across all data (vectors) contained within.
 
 

@@ -16,15 +16,15 @@ Vector search is an important RAG (Retrieval Augmented Generation) pattern compo
 
 A vector index search allows for a prompt pre-processing step where information can be semantically retrieved from an index and then used to generate a factually accurate prompt for the LLM to reason over. This provides the knowledge augmentation and focus (attention) to the LLM.
 
-In this example, assume textual data is vectorized and stored within an vCore-based Azure Cosmos DB for NoSQL database. The text data and embeddings/vector field are stored in the same document. A vector search index has been created on the vector field. When a message is received from a chat application, this message is also vectorized using the same embedding model (ex., Azure OpenAI text-embedding-ada-002), which is then used as input to the vector search index. The vector search index returns a list of documents whose vector field is semantically similar to the incoming message. The unvectorized text stored within the same document is then used to augment the LLM prompt. The LLM receives the prompt and responds to the requestor based on the information it has been given.
+In this example, assume textual data is vectorized and stored within an Azure Cosmos DB for NoSQL database. The text data and embeddings/vector field are stored in the same document. A vector search index has been created on the vector field. When a message is received from a chat application, this message is also vectorized using the same embedding model (ex., Azure OpenAI text-embedding-ada-002), which is then used as input to the vector search index. The vector search index returns a list of documents whose vector field is semantically similar to the incoming message. The unvectorized text stored within the same document is then used to augment the LLM prompt. The LLM receives the prompt and responds to the requestor based on the information it has been given.
 
 
 
-## Why use vCore-based Azure Cosmos DB for NoSQL as a vector store?
+## Why use Azure Cosmos DB for NoSQL as a vector store?
 
-It is common practice to store vectorized data in a dedicated vector store as vector search indexing is not a common capability of most databases. However, this introduces additional complexity to the solution as the data must be stored in two different locations. vCore-based Azure Cosmos DB for NoSQL supports vector search indexing, which means that the vectorized data can be stored in the same document as the original data. This reduces the complexity of the solution and allows for a single database to be used for both the vector store and the original data.
+It is common practice to store vectorized data in a dedicated vector store as vector search indexing is not a common capability of most databases. However, this introduces additional complexity to the solution as the data must be stored in two different locations. Azure Cosmos DB for NoSQL supports vector search indexing, which means that the vectorized data can be stored in the same document as the original data. This reduces the complexity of the solution and allows for a single database to be used for both the vector store and the original data.
 
-## Lab - Use vector search on embeddings in vCore-based Azure Cosmos DB for NoSQL
+## Lab - Use vector search on embeddings in Azure Cosmos DB for NoSQL
 
 In this lab, a notebook demonstrates how to add an embedding field to a document, create a vector search index, and perform a vector search query. The notebook ends with a demonstration of utilizing vector search with an LLM in a RAG scenario using Azure OpenAI.

@@ -36,7 +36,7 @@ On the **Settings** screen, select the **Resource** tab, then copy and record th
 
 
 
->**NOTE**: This lab can only be completed using a deployed vCore-based Azure Cosmos DB for NoSQL account due to the use of vector search. The Azure Cosmos DB Emulator does not support vector search.
+>**NOTE**: This lab can only be completed using a deployed Azure Cosmos DB for NoSQL account due to the use of vector search. The Azure Cosmos DB Emulator does not support vector search.
 
 This lab also requires the data provided in the previous lab titled [Load data into Azure Cosmos DB API for NoSQL collections](../08_Load_Data/README.md#lab---load-data-into-azure-cosmos-db-api-for-mongodb-collections). Run all cells in this notebook to prepare the data for use in this lab.
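
Reviewer note on the retrieval flow described in the 09_Vector_Search_Cosmos_DB README hunks above (vectorize the incoming message, find documents whose vector field is similar, then augment the LLM prompt with their text). The following is a minimal, self-contained sketch of that flow, not the lab's implementation. It assumes toy three-dimensional vectors in place of a real embedding model, an in-memory list in place of the database's vector index, and cosine similarity as the similarity measure. The function names, the `text` field, and the sample documents are hypothetical; only the `contentVector` field name comes from the lab text.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def vector_search(query_vector, documents, k=2, vector_field="contentVector"):
    """Brute-force stand-in for a vector index: return the k most similar documents."""
    scored = [(cosine_similarity(query_vector, doc[vector_field]), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def build_prompt(question, retrieved):
    """Augment the prompt with the unvectorized text stored next to each vector."""
    context = "\n".join(f"- {doc['text']}" for doc in retrieved)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical documents: text and its vector live in the same document.
documents = [
    {"_id": "1", "text": "Mountain bike with front suspension", "contentVector": [0.9, 0.1, 0.0]},
    {"_id": "2", "text": "Road bike with carbon frame", "contentVector": [0.8, 0.3, 0.1]},
    {"_id": "3", "text": "Waterproof hiking boots", "contentVector": [0.0, 0.2, 0.9]},
]

# In a real application this vector would come from the same embedding model
# that produced the stored vectors; here it is hard-coded for illustration.
query_vector = [1.0, 0.2, 0.0]

top = vector_search(query_vector, documents, k=2)
prompt = build_prompt("Which bikes do you sell?", top)
print(prompt)
```

Keeping the text and its vector in one document is what lets `build_prompt` read the context straight from the search results, which is the single-store argument the README makes. A real vector index would replace the brute-force scan in `vector_search` with an approximate nearest-neighbor lookup.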
Labs/lab_3_mongodb_vector_search.ipynb
+3 -3 (3 additions & 3 deletions)
@@ -4,7 +4,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Vector Search using vCore-based Azure Cosmos DB for NoSQL\n",
+"# Vector Search using Azure Cosmos DB for NoSQL\n",
 "\n",
 "This notebook demonstrates using an Azure OpenAI embedding model to vectorize documents already stored in Azure Cosmos DB API for MongoDB, storing the embedding vectors and the creation of a vector index. Lastly, the notebook will demonstrate how to query the vector index to find similar documents.\n",
 "\n",
@@ -270,9 +270,9 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## Use vector search in vCore-based Azure Cosmos DB for NoSQL\n",
+"## Use vector search in Azure Cosmos DB for NoSQL\n",
 "\n",
-"Now that each document has its associated vector embedding and the vector indexes have been created on each collection, we can now use the vector search capabilities of vCore-based Azure Cosmos DB for NoSQL."
+"Now that each document has its associated vector embedding and the vector indexes have been created on each collection, we can now use the vector search capabilities of Azure Cosmos DB for NoSQL."
Labs/lab_4_langchain.ipynb
+3 -3 (3 additions & 3 deletions)
@@ -76,9 +76,9 @@
 "source": [
 "## Vector search with LangChain\n",
 "\n",
-"In the previous lab, the `pymongo` library was used to perform a vector search through a db command to find product documents that were most similar to the user's input. In this lab, you will use the `langchain` library to perform the same search. LangChain has a vector store class named **AzureCosmosDBVectorSearch**, a community contribution, that supports vector search in vCore-based Azure Cosmos DB for NoSQL.\n",
+"In the previous lab, the `pymongo` library was used to perform a vector search through a db command to find product documents that were most similar to the user's input. In this lab, you will use the `langchain` library to perform the same search. LangChain has a vector store class named **AzureCosmosDBVectorSearch**, a community contribution, that supports vector search in Azure Cosmos DB for NoSQL.\n",
 "\n",
-"When establishing the connection to the vector store (vCore-based Azure Cosmos DB for NoSQL), recall that in previous labs the products collection was populated and a contentVector field added that contains the vectorized embeddings of the document itself. Finally, a vector index was also created on the contentVector field to enable vector search. The vector index in each collection is named `VectorSearchIndex`.\n",
+"When establishing the connection to the vector store (Azure Cosmos DB for NoSQL), recall that in previous labs the products collection was populated and a contentVector field added that contains the vectorized embeddings of the document itself. Finally, a vector index was also created on the contentVector field to enable vector search. The vector index in each collection is named `VectorSearchIndex`.\n",
 "\n",
 "The return value of a vector search in LangChain is a list of `Document` objects. The LangChain `Document` class contains two properties: `page_content`, that represents the textual content that is typically used to augment the prompt, and `metadata` that contains all other attributes of the document. In the cell below, we'll use the `_id` field as the page_content, and the rest of the fields are returned as metadata.\n",
 "\n",
@@ -272,7 +272,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"# Create tools that will use vector search in vCore-based Azure Cosmos DB for NoSQL collections\n",
+"# Create tools that will use vector search in Azure Cosmos DB for NoSQL collections\n",
 "\n",
 "# create a chain on the retriever to format the documents as JSON\n",