README.md
Lines changed: 38 additions & 13 deletions
@@ -62,6 +62,30 @@ There are two options in this workshop to generate vector embeddings from data:
62
62
1. Use the `/embed` endpoint provided in this repository to transform the data. *You need an OpenAI API key to use this option.*
63
63
2. Import the data with *already generated embeddings* directly into the Couchbase bucket. You can use the data provided in the `./data/individual_items_with_embedding` directory.
64
64
65
+
### Using Local Embeddings vs OpenAI API
66
+
67
+
This workshop lets you choose between using the pre-generated embeddings stored locally in the repository and generating embeddings with the OpenAI API.
68
+
69
+
- If you have pre-generated embeddings (provided in the repository), you can use the `useLocalEmbedding` flag to avoid using the OpenAI API.
70
+
- If you want to generate embeddings dynamically from the text, you need to provide your OpenAI API key and set the `useLocalEmbedding` flag to `false`.
71
+
72
+
#### Setting the `USE_LOCAL_EMBEDDING` Flag
73
+
74
+
In the `.env` file, set the `USE_LOCAL_EMBEDDING` flag to control the mode:
75
+
76
+
```bash
77
+
USE_LOCAL_EMBEDDING=true
78
+
```
79
+
80
+
- `true`: Use pre-generated embeddings (no OpenAI API key required).
81
+
- `false`: Use the OpenAI API to generate embeddings (OpenAI API key required).
82
+
83
+
Make sure to set the `OPENAI_API_KEY` in the `.env` file if you set `USE_LOCAL_EMBEDDING` to `false`.
84
+
85
+
```bash
86
+
OPENAI_API_KEY=your_openai_api_key
87
+
```
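As a rough illustration of how the two `.env` settings interact, the sketch below resolves the embedding mode from an environment object. The helper name `resolveEmbeddingMode` and the returned object shape are hypothetical and are not taken from the workshop's `server.js`; it also assumes that an unset `USE_LOCAL_EMBEDDING` is treated as `false`.

```javascript
// Hypothetical helper, not part of the workshop code.
// Values loaded from a .env file are always strings, so the string "false"
// would be truthy if used directly in a condition. Compare explicitly instead.
function resolveEmbeddingMode(env) {
  const raw = String(env.USE_LOCAL_EMBEDDING ?? "").trim().toLowerCase();
  const useLocalEmbedding = raw === "true"; // assumption: anything else means false

  // The OpenAI API key is only needed when embeddings are generated on demand.
  if (!useLocalEmbedding && !env.OPENAI_API_KEY) {
    throw new Error(
      'OPENAI_API_KEY must be set when USE_LOCAL_EMBEDDING is not "true"'
    );
  }

  return {
    useLocalEmbedding,
    openAiApiKey: useLocalEmbedding ? undefined : env.OPENAI_API_KEY,
  };
}
```

Calling `resolveEmbeddingMode(process.env)` at startup would surface a missing key immediately instead of on the first embedding request.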
88
+
65
89
Follow the instructions below for the option you choose.
66
90
67
91
### Option 1: Use the `/embed` Endpoint
@@ -74,27 +98,27 @@ The Codespace environment already has all the dependencies installed. You can st
74
98
node server.js
75
99
```
76
100
77
-
The repository also has a sample set of data in the `./data/individual_items` directory. You can transform this data by making a POST request to the `/embed` endpoint providing the paths to the data files as an array in the request body.
101
+
The repository also has a sample set of data in the `./data/individual_items` directory. You can transform this data by making a `POST` request to the `/embed` endpoint providing the paths to the data files as an array in the request body.
78
102
79
103
```bash
80
104
curl -X POST http://localhost:3000/embed -H "Content-Type: application/json" -d '["./data/data1.json", "./data/data2.json"]'
81
105
```
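The same request can be issued from Node.js. The sketch below mirrors the `curl` example; the helper names `buildEmbedRequest` and `embedFiles` are hypothetical, and because the response format of `/embed` is not described here, the response is returned as raw text. It assumes Node.js 18 or later for the global `fetch`.

```javascript
// Hypothetical helpers that mirror the curl example; not part of the workshop code.
function buildEmbedRequest(paths, baseUrl = "http://localhost:3000") {
  if (!Array.isArray(paths) || paths.length === 0) {
    throw new TypeError("paths must be a non-empty array of file paths");
  }
  return {
    url: `${baseUrl}/embed`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // The request body is a JSON array of paths, as in the curl example.
      body: JSON.stringify(paths),
    },
  };
}

async function embedFiles(paths) {
  const { url, options } = buildEmbedRequest(paths);
  const response = await fetch(url, options); // global fetch, Node.js 18+
  if (!response.ok) {
    throw new Error(`Embedding request failed with status ${response.status}`);
  }
  // Response format is an unknown here, so hand back the raw text.
  return response.text();
}

// Usage (requires the server from `node server.js` to be running):
// embedFiles(["./data/data1.json", "./data/data2.json"]).then(console.log);
```

Separating request construction from the network call keeps the payload easy to inspect before anything is sent.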
82
106
83
107
The data has now been converted into vector embeddings and stored in the Couchbase bucket that you created earlier.
84
108
85
-
### Option 2: Import Data with Embeddings
109
+
### Option 2: Import Data with Pre-Generated Embeddings
86
110
87
111
If you choose to import the data directly, you can use the data provided in the `./data/individual_items_with_embedding` directory. The data is already in the format required to enable vector search on it.
88
112
89
-
Once you have opened this repositority in a [GitHub Codespace](https://codespaces.new/hummusonrails/vector-search-nodejs-workshop), you can import the data with the generated embeddings using the [Couchbase shell](https://couchbase.sh/docs/#_importing_data) from the command line.
113
+
Once you have opened this repository in a [GitHub Codespace](https://codespaces.new/hummusonrails/vector-search-nodejs-workshop), you can import the data with the generated embeddings using the [Couchbase shell](https://couchbase.sh/docs/#_importing_data) from the command line.
90
114
91
115
#### Edit the Config File
92
116
93
117
First, edit the `./config_file/config` file with your Couchbase Capella information.
94
118
95
119
You can find a pre-filled config file in the Couchbase Capella dashboard under the "Connect" tab.
96
120
97
-
Once you click on the "Connect" tab, you will see a section called "Couchbase Shell" among the options on the left-hand menu. You can choose the access credentials for the shell and copy the config file contet provided and paste it in the `./config_file/config` file.
121
+
Once you click on the "Connect" tab, you will see a section called "Couchbase Shell" among the options on the left-hand menu. You can choose the access credentials for the shell, copy the config file content provided, and paste it into the `./config_file/config` file.
Replace `name_of_your_bucket` with the name of the bucket you created.
158
182
159
183
You can perform a sanity check to ensure the index was created by querying for all the indexes. You should see `vector_search_index` in the list:
160
184
161
185
```bash
162
-
>query indexes
186
+
query indexes
163
187
```
164
188
165
189
## Search Data
@@ -178,9 +202,9 @@ Once the server is running, you can either search using the provided query with
178
202
179
203
### Search with the provided query
180
204
181
-
You can search for similar items based on the provided query item by making a POST request to the `/search` endpoint.
205
+
You can search for similar items based on the provided query item by making a `POST` request to the `/search` endpoint.
182
206
183
-
Here is an example cURL command to search for similar items based on the provided query item:
207
+
Here is an example `cURL` command to search for similar items based on the provided query item:
184
208
185
209
```bash
186
210
curl -X POST http://localhost:3000/search \
@@ -194,12 +218,13 @@ As you can see, we use the `useLocalEmbedding` flag to indicate that we want to
194
218
195
219
If you want to search for similar items based on your own query item, you can provide the query item in the request body.
196
220
197
-
The query will be automatically converted into a vector embedding using the OpenAI API. You need to provide your OpenAI API key in the `.env` file before starting the Express.js application.
221
+
The query will be automatically converted into a vector embedding using the OpenAI API. You need to provide your OpenAI API key in the `.env` file before starting the Express.js application.
198
222
199
223
Here is an example cURL command to search for similar items based on your own query item: