Commit 4415c0d

refactor: allow bypassing table creation in BaseDuckDBStore.hpp;
chores: update docs
1 parent cb08e17 commit 4415c0d

5 files changed

+49
-98
lines changed

README.md

Lines changed: 10 additions & 7 deletions
@@ -39,13 +39,16 @@ For library itself:
 
 Complete project plan is tracked at [Project kanban](https://github.com/users/RobinQu/projects/1/views/1?layout=board).
 
-| Milestone | Features | DDL |
-|---------------------------------------------------------------|------------------------------------------------------------------------------------------------------|------|
-| v0.1.0 | Long-short memory, PDF/TXT/DOCX ingestor, `Chain` programing paradigm, RAG reference app `doc-agent` | 3.29 |
-| [v0.1.1](https://github.com/RobinQu/instinct.cpp/milestone/1) | Performance tuning, RAG evaluation, Function calling agent | 4.16 |
-| [v0.1.2](https://github.com/RobinQu/instinct.cpp/milestone/2) | OpenAI Assistant API initial implementation, single-binary reference app `mini-assistant` | 4.30 |
-| [v0.1.3](https://github.com/RobinQu/instinct.cpp/releases/tag/v0.1.3) | * `mini-assistant`: tool calls with opensourced LLMs<br> | 5.17 |
-| v0.1.4 | * `doc-agent` : rerank model<br>* `mini-assistant`: `file-search` tool support. | 6.18 |
+| Milestone | Features | DDL |
+|--------------------------------------------------------------|--------------------------------------------------------------|---------------|
+| v0.1.0 | Long-short memory, PDF/TXT/DOCX ingestor, `Chain` programing paradigm, RAG reference app `doc-agent` | 3.29 |
+| [v0.1.1](https://github.com/RobinQu/instinct.cpp/milestone/1) | Performance tuning, RAG evaluation, Function calling agent | 4.16 |
+| [v0.1.2](https://github.com/RobinQu/instinct.cpp/milestone/2) | OpenAI Assistant API initial implementation, single-binary reference app `mini-assistant` | 4.30 |
+| [v0.1.3](https://github.com/RobinQu/instinct.cpp/releases/tag/v0.1.3) | * `mini-assistant`: tool calls with opensourced LLMs<br> | 5.17 |
+| [v0.1.4](https://github.com/RobinQu/instinct.cpp/milestone/4) | * `doc-agent` : rerank model<br>* `mini-assistant`: `file-search` tool support. | ~~6.18~~ 6.14 |
+| [v0.1.5](https://github.com/RobinQu/instinct.cpp/milestone/5) | Overall optimization | 6.30 |
+| [v0.1.6](https://github.com/RobinQu/instinct.cpp/milestone/6) | `code-interpreter` in `mini-assistant` | 7.15 |
+
 
 
 
docs/assistant_api.md

Lines changed: 28 additions & 87 deletions
@@ -47,109 +47,50 @@ In first release of `mini-assistant`, following endpoints are supported:
 * DELETE `/files/:file_id`
 * GET `/files/:file_id`
 * GET `/files/:file_id/content`
+* VectorStore
+* POST `/v1/vector_stores/`
+* GET `/v1/vector_stores/`
+* GET `/v1/vector_stores/:vector_store_id`
+* POST `/v1/vector_stores/:vector_store_id`
+* DELETE `/v1/vector_stores/:vector_store_id`
+* GET `/v1/vector_stores/:vector_store_id/files`
+* POST `/v1/vector_stores/:vector_store_id/files`
+* GET `/v1/vector_stores/:vector_store_id/files/:file_id`
+* DELETE `/v1/vector_stores/:vector_store_id/files/:file_id`
+* POST `/v1/vector_stores/:vector_store_id/file_batches`
+* GET `/v1/vector_stores/:vector_store_id/file_batches/:batch_id`
+* POST `/v1/vector_stores/:vector_store_id/file_batches/:batch_id/cancel`
+* GET `/v1/vector_stores/:vector_store_id/file_batches/:batch_id/files`
 
-## task-queue
-
-Task scheduling is needed in following sections:
-
-* instinct-assistant
-* Periodic task to check status of run objects and run step objects in background.
-* FIFO queue for execution of run object, which handles agent execution. Only one running task is allowed for single run object.
-
-### Worker queue for run objects
-
-#### Preconditions
-
-* run object is `queued` or `required_action`.
-* all file resources needed are alive
-
-#### Procedures
-
-* Create `IAgentExecutor` instance with given tools setup
-* Recover `AgentState` from database
-* Run `IAgentExecutor::Stream` loop.
-* If run object is `cancelling` or `expired`, then stop.
-* Update status of run object to `in_progress`.
-* if resolved step is agent thought, then
-* create a message with thought text.
-* create run step with status of `completed` and `step_details` with `message_creation` type.
-* create another run step object with status of `in_progress`, create a message with `step_details` with `tool_call` type.
-* if thought contains actions for function calls **stop**.
-* if thought contains actions for other tool uses, then continue. As `code_interpreter` and `file_search` is triggered automatically.
-* if resolved agent step is agent observation, then
-* update `step_details` of last run step object.
-* if resolved agent step is final message, then
-* create a message object containing the final message text.
-* create a run step with `step_details` of `message_creation` type and related `message_id`.
-* and **stop**
-* if stopped and
-* status of run object is `cancelling`, then update status of run object to `canclled`.
-* final message is generated, update status of run object to `completed`.
-* function tool calls are required, then update status of run object to `requires_action`.
-* error occurs during loop, update status of run object to `failed`.
-
-
-#### Outcomes
-* run object should be in intermediate status other than `queued`.
-* run steps and generated messages are saved to database.
 
-### Background queue for run objects
-
-#### Preconditions
-
-* Only one thread is running this task across entire cluster.
-* Variables:
-* `RUN_TIMEOUT` defaults to 10min.
-
-
-#### Procedures
-
-* Find all run objects that matches `modified_at < now() - RUN_TIMEOUT`.
-* Loop run objects found
-* Find last run step that are `in_progress` and update them to status of `expired`.
-* Update run object to status of `expired`.
-
-
-### Implementation notes
+## task-queue
 
-* An in-memory local queue is preferred in `mini-assistant`. [cameron314/concurrentqueue](https://github.com/cameron314/concurrentqueue) seems to be a good option.
-* Uniform interface for task queue is needed for more scalable implementation on the cloud, where dedicated task scheduler will be used with multiple worker nodes setup. Possible options:
-* Celery or other task-focused frameworks.
-* Queue facilities in distributed compute frameworks like `ray`, e.g. [ray.util.queue.Queue](https://docs.ray.io/en/latest/ray-core/api/doc/ray.util.queue.Queue.html).
-* Custom implementation based on message broker like Kafka.
+In `mini-assistant`, an in-process, multi-consumer task queue is created. See [ThreadPoolTaskScheduler.hpp](../modules/instinct-data/include/task_scheduler/ThreadPoolTaskScheduler.hpp) for more details.
 
-
-### Outcomes
+Primary task handler classes are:
 
-* task and task steps that are timeout have been marked as `expired`.
+* [FileObjectTaskHandler.hpp](../modules/instinct-assistant/include/assistant/v2/task_handler/FileObjectTaskHandler.hpp): To process uploaded file.
+* [RunObjectTaskHandler.hpp](../modules/instinct-assistant/include/assistant/v2/task_handler/RunObjectTaskHandler.hpp): To execute a run request for threads.
 
-## `tool-server`
+`ThreadPoolTaskScheduler` is kind of `ILifeCycle` and it's bootstrap in main.
 
-### `file-search`
 
-Considerations:
+## tool-server
 
-* duckdb implementation, mainly used for `mini-assistant`.
-* For Each file object we will generate one document table and embedding table.
-* Given only one `file_id` can be assigned to thread currently and only one process is accessing database files, we can manage all `VectorStorePtr` dynamically in memory. e.g a map from `file_id` to `VectorStorePtr`.
-* a more scalable solution involves standalone vector database.
-* A file object can be embedded and linked to a `collection`.
-* Mapping from `file_id` and `collection`'s id is required.
+### file-search
 
+Primary classes:
 
-Primary workflows:
+* [SummaryGuidedFileSearch.hpp](../modules/instinct-assistant/include/assistant/v2/toolkit/SummaryGuidedFileSearch.hpp): Actual implementation of search tool
+* [RunObjectTaskHandler.hpp](../modules/instinct-assistant/include/assistant/v2/task_handler/RunObjectTaskHandler.hpp): Bring the search tool to user's run requests.
 
-1. File ingestion: operations in `FileBatch` and `File` endpoints will trigger `FileIngestionTaskHandler`, where file is split and transformed into embeddings.
-2. Online search: `file-search` as built-in tools in run objects if explicitly requested.
 
-Primary classes:
+Search pipeline:
 
-* `VectorStoreController`: manage multiple `IVectorStore` instances.
-* `FileIngestionTaskHandler`: ingest uploaded file and update corresponding `IVectorStore`.
-* `FileSearchTool`: gather user query and search against given `IVectorStore`.
+![file_search_pipeline.png](file_search_pipeline.png)
 
 
-### `code-interpreter`
+### code-interpreter
 
 Prompting is straightforward. The challenge would the sandbox for Python scripts.
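
Reviewer sketch for the updated `task-queue` section: the docs now describe an in-process, multi-consumer task queue but the diff does not show `ThreadPoolTaskScheduler` itself. The block below is a minimal illustration of that idea using only the C++ standard library. `SimpleTaskQueue`, `Enqueue`, `Shutdown` and `RunDemo` are hypothetical names chosen for this note and are not taken from the instinct.cpp code base; the real scheduler may differ in interface and behaviour.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for an in-process, multi-consumer task queue.
// This is NOT the ThreadPoolTaskScheduler API from instinct.cpp.
class SimpleTaskQueue {
public:
    explicit SimpleTaskQueue(std::size_t consumer_count) {
        for (std::size_t i = 0; i < consumer_count; ++i) {
            consumers_.emplace_back([this] { ConsumeLoop(); });
        }
    }

    ~SimpleTaskQueue() { Shutdown(); }

    // Enqueue a task; any idle consumer thread may pick it up.
    void Enqueue(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        ready_.notify_one();
    }

    // Let consumers drain queued tasks, then join them. Safe to call twice.
    void Shutdown() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        ready_.notify_all();
        for (auto& consumer : consumers_) {
            if (consumer.joinable()) consumer.join();
        }
    }

private:
    void ConsumeLoop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to run
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run the task outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()>> tasks_;
    std::vector<std::thread> consumers_;
    bool stopping_ = false;
};

// Runs task_count counter increments on consumer_count threads and
// returns how many tasks actually executed.
int RunDemo(std::size_t consumer_count, int task_count) {
    std::atomic<int> counter{0};
    SimpleTaskQueue queue(consumer_count);
    for (int i = 0; i < task_count; ++i) {
        queue.Enqueue([&counter] { ++counter; });
    }
    queue.Shutdown();
    return counter.load();
}
```

In this sketch, file-processing and run-execution handlers would each be wrapped in a callable and passed to `Enqueue`; tying `Shutdown` to application teardown is one way to read the "bootstrap in main" remark, though that mapping is an assumption.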

docs/file_search_pipeline.png

153 KB

modules/instinct-retrieval/include/store/duckdb/BaseDuckDBStore.hpp

Lines changed: 9 additions & 4 deletions
@@ -54,6 +54,11 @@ namespace INSTINCT_RETRIEVAL_NS {
          */
         bool create_or_replace_table = false;
 
+        /**
+         * A flag to bypass table checking
+         */
+        bool bypass_table_check = false;
+
         /**
          * Optional instance id
          */
@@ -316,10 +321,10 @@ namespace INSTINCT_RETRIEVAL_NS {
 
             const auto sql = details::make_create_table_sql(options_.table_name, options_.dimension, metadata_schema_, options_.create_or_replace_table);
             LOG_DEBUG("create document table with SQL if necessary: {}", sql);
-
-            const auto create_table_result = connection_.Query(sql);
-            assert_query_ok(create_table_result);
-
+            if (!options_.bypass_table_check) {
+                const auto create_table_result = connection_.Query(sql);
+                assert_query_ok(create_table_result);
+            }
             prepared_count_all_statement_ = connection_.Prepare(details::make_prepared_count_sql(options_.table_name));
             assert_prepared_ok(prepared_count_all_statement_, "Failed to prepare count statement");
         }
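
Reviewer sketch of the guard introduced in this hunk: the table-creation statement is still built, but it is only executed when `bypass_table_check` is false. The block below reproduces that control flow in isolation. `StoreOptionsSketch`, `RecordingConnection` and `InitializeStore` are hypothetical names for this note, the SQL text is an illustrative assumption (the output of `details::make_create_table_sql` is not shown in the diff), and the recording connection stands in for a real DuckDB connection.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified mirror of the two flags visible in the diff.
struct StoreOptionsSketch {
    std::string table_name = "documents";
    bool create_or_replace_table = false;
    bool bypass_table_check = false;  // the flag added by this commit
};

// Hypothetical stand-in for a database connection: it only records SQL.
struct RecordingConnection {
    std::vector<std::string> executed;
    void Query(const std::string& sql) { executed.push_back(sql); }
};

// Same shape as the guarded block: the statement is always built, and the
// DDL is skipped entirely when bypass_table_check is set.
void InitializeStore(RecordingConnection& connection, const StoreOptionsSketch& options) {
    // Illustrative SQL only; the real schema comes from make_create_table_sql.
    const std::string sql =
        (options.create_or_replace_table ? "CREATE OR REPLACE TABLE "
                                         : "CREATE TABLE IF NOT EXISTS ")
        + options.table_name + "(id VARCHAR PRIMARY KEY);";
    if (!options.bypass_table_check) {
        connection.Query(sql);
    }
}
```

With the flag set, a caller such as the vector store operator takes on the responsibility of making sure the table already exists, since later prepared statements against a missing table would presumably fail.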

modules/instinct-retrieval/include/store/duckdb/DuckDBVectorStoreOperator.hpp

Lines changed: 2 additions & 0 deletions
@@ -85,6 +85,8 @@ namespace INSTINCT_RETRIEVAL_NS {
             auto metadata_schema = std::make_shared<MetadataSchema>(instance->metadata_schema());
             const auto embedding_model = std::invoke(embedding_model_selector_, instance_id, metadata_schema);
             DuckDBStoreOptions options;
+            // skip table creating as it should be already provisioned when calling this function
+            options.bypass_table_check = true;
             options.instance_id = instance_id;
             ConfigureDuckDBOptions(options, embedding_model);
             return CreateDuckDBVectorStore(duck_db_, embedding_model, options, metadata_schema);
