refactor: allow bypassing table creation in BaseDuckDBStore.hpp

RobinQu · RobinQu · commit 4415c0d9e2ee · 2024-06-16T21:48:29.000+08:00
chores: update docs
diff --git a/README.md b/README.md
@@ -39,13 +39,16 @@ For library itself:
 
 Complete project plan is tracked at [Project kanban](https://github.com/users/RobinQu/projects/1/views/1?layout=board).
 
-| Milestone                                                     | Features                                                                                             | DDL  |
-|---------------------------------------------------------------|------------------------------------------------------------------------------------------------------|------|
-| v0.1.0                                                        | Long-short memory, PDF/TXT/DOCX ingestor, `Chain` programing paradigm, RAG reference app `doc-agent` | 3.29 |
-| [v0.1.1](https://github.com/RobinQu/instinct.cpp/milestone/1) | Performance tuning, RAG evaluation,  Function calling agent                                          | 4.16 |
-| [v0.1.2](https://github.com/RobinQu/instinct.cpp/milestone/2) | OpenAI Assistant API initial implementation, single-binary reference app `mini-assistant`            | 4.30 |
-| [v0.1.3](https://github.com/RobinQu/instinct.cpp/releases/tag/v0.1.3)                                                        | * `mini-assistant`:  tool calls with opensourced LLMs<br>                                            | 5.17 |
-| v0.1.4                                                        | * `doc-agent` : rerank model<br>* `mini-assistant`: `file-search` tool support.                      | 6.18 |
+| Milestone                                                    | Features                                                     | DDL           |
+|--------------------------------------------------------------|--------------------------------------------------------------|---------------|
+| v0.1.0                                                       | Long-short memory, PDF/TXT/DOCX ingestor, `Chain` programming paradigm, RAG reference app `doc-agent` | 3.29          |
+| [v0.1.1](https://github.com/RobinQu/instinct.cpp/milestone/1) | Performance tuning, RAG evaluation,  Function calling agent  | 4.16          |
+| [v0.1.2](https://github.com/RobinQu/instinct.cpp/milestone/2) | OpenAI Assistant API initial implementation, single-binary reference app `mini-assistant` | 4.30          |
+| [v0.1.3](https://github.com/RobinQu/instinct.cpp/releases/tag/v0.1.3) | * `mini-assistant`:  tool calls with open-source LLMs<br>    | 5.17          |
+| [v0.1.4](https://github.com/RobinQu/instinct.cpp/milestone/4) | * `doc-agent` : rerank model<br>* `mini-assistant`: `file-search` tool support. | ~~6.18~~ 6.14 |
+| [v0.1.5](https://github.com/RobinQu/instinct.cpp/milestone/5) | Overall optimization                                         | 6.30          |
+| [v0.1.6](https://github.com/RobinQu/instinct.cpp/milestone/6) | `code-interpreter` in `mini-assistant`                       | 7.15          |
+
 
 
 
diff --git a/docs/assistant_api.md b/docs/assistant_api.md
@@ -47,109 +47,50 @@ In first release of `mini-assistant`, following endpoints are supported:
   * DELETE `/files/:file_id`
   * GET `/files/:file_id`
   * GET `/files/:file_id/content`
+* VectorStore
+  * POST `/v1/vector_stores/`
+  * GET `/v1/vector_stores/`
+  * GET `/v1/vector_stores/:vector_store_id`
+  * POST `/v1/vector_stores/:vector_store_id`
+  * DELETE `/v1/vector_stores/:vector_store_id`
+  * GET `/v1/vector_stores/:vector_store_id/files`
+  * POST `/v1/vector_stores/:vector_store_id/files`
+  * GET `/v1/vector_stores/:vector_store_id/files/:file_id`
+  * DELETE `/v1/vector_stores/:vector_store_id/files/:file_id`
+  * POST `/v1/vector_stores/:vector_store_id/file_batches`
+  * GET `/v1/vector_stores/:vector_store_id/file_batches/:batch_id`
+  * POST `/v1/vector_stores/:vector_store_id/file_batches/:batch_id/cancel`
+  * GET `/v1/vector_stores/:vector_store_id/file_batches/:batch_id/files`
 
-## task-queue 
-
-Task scheduling is needed in following sections:
-
-* instinct-assistant
-  * Periodic task to check status of run objects and run step objects in background.
-  * FIFO queue for execution of run object, which handles agent execution. Only one running task is allowed for single run object.
-
-### Worker queue for run objects
-
-#### Preconditions
-
-* run object is `queued` or `required_action`.
-* all file resources needed are alive
-
-#### Procedures
-
-* Create `IAgentExecutor` instance with given tools setup
-* Recover `AgentState` from database
-* Run `IAgentExecutor::Stream` loop.
-  * If run object is `cancelling` or `expired`, then stop. 
-  * Update status of run object to `in_progress`.
-  * if resolved step is agent thought, then
-    * create a message with thought text.
-    * create run step with status of `completed` and `step_details` with `message_creation` type.
-    * create another run step object with status of `in_progress`, create a message with `step_details` with `tool_call` type.
-      * if thought contains actions for function calls **stop**.
-      * if thought contains actions for other tool uses, then continue. As `code_interpreter` and `file_search` is triggered automatically.
-  * if resolved agent step is agent observation, then
-    * update `step_details` of last run step object.
-  * if resolved agent step is final message, then
-    * create a message object containing the final message text.
-    * create a run step with `step_details` of `message_creation` type and related `message_id`.
-    * and **stop**
-* if stopped and
-  * status of run object is `cancelling`, then update status of run object to `canclled`.
-  * final message is generated, update status of run object to `completed`. 
-  * function tool calls are required, then update status of run object to `requires_action`.
-  * error occurs during loop, update status of run object to `failed`.
-
-
-#### Outcomes
-* run object should be in intermediate status other than `queued`.
-* run steps and generated messages are saved to database. 
 
-### Background queue for run objects
-
-#### Preconditions
-
-* Only one thread is running this task across entire cluster.
-* Variables: 
-  * `RUN_TIMEOUT` defaults to 10min.
-
-
-#### Procedures 
-
-* Find all run objects that matches `modified_at < now() - RUN_TIMEOUT`.
-* Loop run objects found
-  * Find last run step that are `in_progress` and update them to status of `expired`.
-  * Update run object to status of `expired`.
-
-
-### Implementation notes
+## task-queue 
 
-* An in-memory local queue is preferred in `mini-assistant`. [cameron314/concurrentqueue](https://github.com/cameron314/concurrentqueue) seems to be a good option.
-* Uniform interface for task queue is needed for more scalable implementation on the cloud, where dedicated task scheduler will be used with multiple worker nodes setup. Possible options: 
-  * Celery or other task-focused frameworks.
-  * Queue facilities in distributed compute frameworks like `ray`, e.g. [ray.util.queue.Queue](https://docs.ray.io/en/latest/ray-core/api/doc/ray.util.queue.Queue.html).
-  * Custom implementation based on message broker like Kafka.  
+In `mini-assistant`, an in-process, multi-consumer task queue is created. See [ThreadPoolTaskScheduler.hpp](../modules/instinct-data/include/task_scheduler/ThreadPoolTaskScheduler.hpp) for more details.
 
-  
-### Outcomes
+Primary task handler classes are:
 
-* task and task steps that are timeout have been marked as `expired`.
+* [FileObjectTaskHandler.hpp](../modules/instinct-assistant/include/assistant/v2/task_handler/FileObjectTaskHandler.hpp): To process uploaded files.
+* [RunObjectTaskHandler.hpp](../modules/instinct-assistant/include/assistant/v2/task_handler/RunObjectTaskHandler.hpp): To execute a run request for threads.
 
-## `tool-server`
+`ThreadPoolTaskScheduler` is a kind of `ILifeCycle` and is bootstrapped in `main`.
 
-### `file-search`
 
-Considerations:
+## tool-server
 
-* duckdb implementation, mainly used for `mini-assistant`.
-  * For Each file object we will generate one document table and embedding table.
-  * Given only one `file_id` can be assigned to thread currently and only one process is accessing database files, we can manage all `VectorStorePtr` dynamically in memory.  e.g a map from `file_id` to `VectorStorePtr`.
-* a more scalable solution involves standalone vector database.
-  * A file object can be embedded and linked to a `collection`.
-  * Mapping from `file_id` and `collection`'s id is required.
+### file-search
 
+Primary classes:
 
-Primary workflows:
+* [SummaryGuidedFileSearch.hpp](../modules/instinct-assistant/include/assistant/v2/toolkit/SummaryGuidedFileSearch.hpp): Actual implementation of the search tool.
+* [RunObjectTaskHandler.hpp](../modules/instinct-assistant/include/assistant/v2/task_handler/RunObjectTaskHandler.hpp): Brings the search tool to users' run requests.
 
-1. File ingestion: operations in `FileBatch` and `File` endpoints will trigger `FileIngestionTaskHandler`, where file is split and transformed into embeddings.
-2. Online search: `file-search` as built-in tools in run objects if explicitly requested.
 
-Primary classes:
+Search pipeline:
 
-* `VectorStoreController`: manage multiple `IVectorStore` instances.
-* `FileIngestionTaskHandler`: ingest uploaded file and update corresponding `IVectorStore`.
-* `FileSearchTool`: gather user query and search against given `IVectorStore`.
+![file_search_pipeline.png](file_search_pipeline.png)
 
 
-### `code-interpreter`
+### code-interpreter
 
 Prompting is straightforward. The challenge would be the sandbox for Python scripts.
 
diff --git a/docs/file_search_pipeline.png b/docs/file_search_pipeline.png
diff --git a/modules/instinct-retrieval/include/store/duckdb/BaseDuckDBStore.hpp b/modules/instinct-retrieval/include/store/duckdb/BaseDuckDBStore.hpp
@@ -54,6 +54,11 @@ namespace INSTINCT_RETRIEVAL_NS {
          */
         bool create_or_replace_table = false;
 
+        /**
+        * A flag to skip the table-creation query when the store is set up
+        */
+        bool bypass_table_check = false;
+
         /**
          * Optional instance id
          */
@@ -316,10 +321,10 @@ namespace INSTINCT_RETRIEVAL_NS {
 
             const auto sql = details::make_create_table_sql(options_.table_name, options_.dimension, metadata_schema_, options_.create_or_replace_table);
             LOG_DEBUG("create document table with SQL if necessary: {}", sql);
-
-            const auto create_table_result = connection_.Query(sql);
-            assert_query_ok(create_table_result);
-
+            if (!options_.bypass_table_check) {
+                const auto create_table_result = connection_.Query(sql);
+                assert_query_ok(create_table_result);
+            }
             prepared_count_all_statement_ = connection_.Prepare(details::make_prepared_count_sql(options_.table_name));
             assert_prepared_ok(prepared_count_all_statement_, "Failed to prepare count statement");
         }
diff --git a/modules/instinct-retrieval/include/store/duckdb/DuckDBVectorStoreOperator.hpp b/modules/instinct-retrieval/include/store/duckdb/DuckDBVectorStoreOperator.hpp
@@ -85,6 +85,8 @@ namespace INSTINCT_RETRIEVAL_NS {
             auto metadata_schema = std::make_shared<MetadataSchema>(instance->metadata_schema());
             const auto embedding_model = std::invoke(embedding_model_selector_, instance_id, metadata_schema);
             DuckDBStoreOptions options;
+            // skip table creation, as the table should already be provisioned when this function is called
+            options.bypass_table_check = true;
             options.instance_id = instance_id;
             ConfigureDuckDBOptions(options, embedding_model);
             return CreateDuckDBVectorStore(duck_db_, embedding_model, options, metadata_schema);
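
The effect of `bypass_table_check` can be illustrated with a minimal, self-contained sketch. It does not use DuckDB or the project's real classes: `FakeConnection`, `StoreOptions`, and `InitializeStore` are hypothetical stand-ins, and the SQL strings are placeholders rather than the output of `details::make_create_table_sql`. The sketch only mirrors the control flow in the hunk above: the create-table statement is skipped when the flag is set, while the count statement is still set up.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a database connection: it only records the SQL it receives.
struct FakeConnection {
    std::vector<std::string> executed;
    void Query(const std::string& sql) { executed.push_back(sql); }
};

// Mirrors the two flags visible in the diff; all other option fields are omitted.
struct StoreOptions {
    std::string table_name;
    bool create_or_replace_table = false;
    bool bypass_table_check = false;
};

// Simplified stand-in for the store's setup step.
// Returns the total number of statements recorded on the connection.
std::size_t InitializeStore(FakeConnection& connection, const StoreOptions& options) {
    // Placeholder SQL; the real statement is built by the project's helper.
    const std::string verb = options.create_or_replace_table
        ? "CREATE OR REPLACE TABLE "
        : "CREATE TABLE IF NOT EXISTS ";
    const std::string create_sql = verb + options.table_name + " (id VARCHAR)";

    // Table creation is skipped when the caller knows the table already exists.
    if (!options.bypass_table_check) {
        connection.Query(create_sql);
    }

    // The count statement is needed regardless of the flag.
    connection.Query("SELECT count(*) FROM " + options.table_name);
    return connection.executed.size();
}
```

Calling `InitializeStore` with default options records two statements, while setting `bypass_table_check = true` records only the count statement. This matches the intent stated in `DuckDBVectorStoreOperator.hpp`, where the table is expected to be provisioned before the store is loaded.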