feat: add OpenAI-compatible embeddings endpoint #4
Conversation
Add a /v1/embeddings POST endpoint that proxies to the OpenSecret backend with E2EE via TEE attestation. Supports single string and array inputs.

- Update opensecret SDK to 0.2.7 for embeddings support
- Add create_embeddings handler following the chat completions pattern
- Register the /v1/embeddings route

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
📝 Walkthrough
This PR adds OpenAI-compatible embeddings support to the proxy by introducing a new POST /v1/embeddings endpoint, exposing the corresponding handler function in the public re-exports, and bumping the opensecret dependency from 0.2.3 to 0.2.7.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Proxy as Proxy Handler<br/>(create_embeddings)
    participant Auth as Auth Layer
    participant Backend as Backend Client
    Client->>Proxy: POST /v1/embeddings<br/>+ API Key + EmbeddingRequest
    Proxy->>Auth: Extract & validate API key
    alt Auth Success
        Auth-->>Proxy: API key valid
        Proxy->>Backend: Initialize client with auth
        Proxy->>Backend: Forward EmbeddingRequest
        Backend-->>Proxy: EmbeddingResponse
        Proxy-->>Client: 200 OK<br/>EmbeddingResponse (JSON)
    else Auth Failure
        Auth-->>Proxy: Auth error
        Proxy-->>Client: 401 Unauthorized<br/>OpenAIError
    else Backend Error
        Backend-->>Proxy: Internal error
        Proxy-->>Client: 500 Internal Server Error<br/>OpenAIError
    end
```
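The control flow in the diagram (authenticate, forward, map failures to 401/500) can be sketched as a minimal, std-only Rust simulation. The names `validate_key`, `forward_to_backend`, and `handle_embeddings` are illustrative stand-ins, not the actual handler's API, which uses Axum extractors and the opensecret client:

```rust
// Sketch of the handler's branching: auth check first, then forward,
// mapping each failure mode to its OpenAI-style status. Illustrative only.

#[derive(Debug, PartialEq)]
enum ProxyResponse {
    Ok(String),           // 200 with the embeddings JSON body
    Unauthorized(String), // 401 with an OpenAIError payload
    ServerError(String),  // 500 with an OpenAIError payload
}

fn validate_key(key: &str) -> bool {
    // Placeholder check; the real proxy validates against the backend.
    key.starts_with("sk-")
}

fn forward_to_backend(body: &str) -> Result<String, String> {
    // Placeholder for the attested TEE call.
    if body.is_empty() {
        Err("empty request body".to_string())
    } else {
        Ok(format!("{{\"object\":\"list\",\"input\":{}}}", body))
    }
}

fn handle_embeddings(api_key: &str, body: &str) -> ProxyResponse {
    if !validate_key(api_key) {
        return ProxyResponse::Unauthorized("invalid API key".to_string());
    }
    match forward_to_backend(body) {
        Ok(json) => ProxyResponse::Ok(json),
        Err(e) => ProxyResponse::ServerError(e),
    }
}

fn main() {
    println!("{:?}", handle_embeddings("sk-test", "\"hello\""));
}
```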
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (2 passed)
✨ Finishing touches
Actionable comments posted: 1
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
Cargo.toml, src/lib.rs, src/proxy.rs
🧰 Additional context used
📓 Path-based instructions (4)
Cargo.toml
📄 CodeRabbit inference engine (CLAUDE.md)
Depend on opensecret version 0.2.0
Files:
Cargo.toml
src/lib.rs
📄 CodeRabbit inference engine (CLAUDE.md)
src/lib.rs: lib.rs exports a create_app function that builds the Axum router
The Axum router must include health check endpoints at / and /health
Expose OpenAI-compatible endpoints at /v1/models and /v1/chat/completions
Support optional CORS in the application router
Enable request tracing in the application router
Files:
src/lib.rs
**/*.rs
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.rs: Format Rust code with rustfmt
Run clippy with strict warnings and address all lints
Files:
src/lib.rs, src/proxy.rs
src/proxy.rs
📄 CodeRabbit inference engine (CLAUDE.md)
src/proxy.rs: Extract API keys from the Authorization header or fall back to the default key
Create the OpenSecret client and perform TEE attestation before forwarding
Forward requests to the configured TEE backend
Handle streaming responses for chat completions
Transform backend responses to OpenAI-compatible format
Files:
src/proxy.rs
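The first rule above (extract the API key from the Authorization header, else fall back to the default key) can be sketched in a few lines of std-only Rust. `extract_api_key` is a hypothetical name for illustration, not the actual function in `proxy.rs`:

```rust
// Sketch: pull a bearer token out of an Authorization header value,
// falling back to a configured default key when the header is absent,
// malformed, or empty. Hypothetical helper, not the real proxy code.
fn extract_api_key(auth_header: Option<&str>, default_key: &str) -> String {
    auth_header
        .and_then(|h| h.strip_prefix("Bearer "))
        .map(|token| token.trim().to_string())
        .filter(|token| !token.is_empty())
        .unwrap_or_else(|| default_key.to_string())
}

fn main() {
    // Well-formed header wins; anything else falls back to the default.
    assert_eq!(extract_api_key(Some("Bearer sk-abc"), "sk-default"), "sk-abc");
    assert_eq!(extract_api_key(None, "sk-default"), "sk-default");
    assert_eq!(extract_api_key(Some("Basic xyz"), "sk-default"), "sk-default");
    println!("ok");
}
```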
🧠 Learnings (8)
📓 Common learnings
Learnt from: CR
Repo: OpenSecretCloud/maple-proxy PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-12T18:29:41.802Z
Learning: Applies to src/lib.rs : Expose OpenAI-compatible endpoints at /v1/models and /v1/chat/completions
Learnt from: CR
Repo: OpenSecretCloud/maple-proxy PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-12T18:29:41.802Z
Learning: Applies to src/proxy.rs : Transform backend responses to OpenAI-compatible format
📚 Learning: 2025-09-12T18:29:41.802Z
Learnt from: CR
Repo: OpenSecretCloud/maple-proxy PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-12T18:29:41.802Z
Learning: Applies to Cargo.toml : Depend on opensecret version 0.2.0
Applied to files:
Cargo.toml
📚 Learning: 2025-09-12T18:29:41.802Z
Learnt from: CR
Repo: OpenSecretCloud/maple-proxy PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-12T18:29:41.802Z
Learning: Applies to src/lib.rs : Expose OpenAI-compatible endpoints at /v1/models and /v1/chat/completions
Applied to files:
src/lib.rs, src/proxy.rs
📚 Learning: 2025-09-12T18:29:41.802Z
Learnt from: CR
Repo: OpenSecretCloud/maple-proxy PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-12T18:29:41.802Z
Learning: Applies to src/lib.rs : lib.rs exports a create_app function that builds the Axum router
Applied to files:
src/lib.rs
📚 Learning: 2025-09-12T18:29:41.802Z
Learnt from: CR
Repo: OpenSecretCloud/maple-proxy PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-12T18:29:41.802Z
Learning: Applies to src/proxy.rs : Transform backend responses to OpenAI-compatible format
Applied to files:
src/lib.rs, src/proxy.rs
📚 Learning: 2025-09-12T18:29:41.802Z
Learnt from: CR
Repo: OpenSecretCloud/maple-proxy PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-12T18:29:41.802Z
Learning: Applies to src/proxy.rs : Handle streaming responses for chat completions
Applied to files:
src/lib.rs, src/proxy.rs
📚 Learning: 2025-09-12T18:29:41.802Z
Learnt from: CR
Repo: OpenSecretCloud/maple-proxy PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-12T18:29:41.802Z
Learning: Applies to src/proxy.rs : Create the OpenSecret client and perform TEE attestation before forwarding
Applied to files:
src/lib.rs, src/proxy.rs
📚 Learning: 2025-09-12T18:29:41.802Z
Learnt from: CR
Repo: OpenSecretCloud/maple-proxy PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-12T18:29:41.802Z
Learning: Applies to src/proxy.rs : Forward requests to the configured TEE backend
Applied to files:
src/lib.rs
🧬 Code graph analysis (2)
src/lib.rs (1)
src/proxy.rs (1)
create_embeddings (202-232)
src/proxy.rs (1)
src/config.rs (1)
server_error (113-115)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Build windows-x86_64
- GitHub Check: Build linux-aarch64
- GitHub Check: build-and-push (linux/arm64, ubuntu-22.04-arm64-4core)
- GitHub Check: Build macos-aarch64
- GitHub Check: build-and-push (linux/amd64, ubuntu-latest)
- GitHub Check: security_audit
🔇 Additional comments (2)
src/proxy.rs (1)
202-232: LGTM! Embeddings endpoint follows established patterns. The implementation correctly mirrors the existing endpoints with proper authentication, client creation with TEE attestation, error handling (401 for auth errors, 500 for server errors), and debug logging. The non-streaming design is appropriate for embeddings.
src/lib.rs (1)
5-5: LGTM! Route registration follows existing conventions. The embeddings endpoint is correctly imported and registered at /v1/embeddings with a POST handler, consistent with the /v1/chat/completions pattern. This properly exposes the new OpenAI-compatible embeddings endpoint. Also applies to: 31-31
Summary
Adds support for text embeddings via the OpenAI-compatible /v1/embeddings endpoint, proxying to the OpenSecret backend with E2EE via TEE attestation.

Changes
- Add create_embeddings handler in proxy.rs following the same pattern as chat completions
- Register the /v1/embeddings POST route in lib.rs

API Format
Follows standard OpenAI embeddings API:
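The concrete request example was elided from this page, so the sketch below shows what a standard OpenAI-style embeddings request typically looks like. The field values are illustrative; the model name is taken from the testing note in this PR:

```json
{
  "model": "nomic-embed-text",
  "input": ["first text to embed", "second text to embed"]
}
```

The response follows OpenAI's list shape: an `"object": "list"` wrapper whose `data` array holds one embedding vector per input (with its `index`), plus a `usage` block with token counts.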
Testing
Tested against the dev environment with both single and multiple input embeddings; working correctly with 768-dimension vectors from the nomic-embed-text model.

Summary by CodeRabbit
New Features
Added a /v1/embeddings endpoint to the API, enabling users to generate embeddings for text inputs in addition to the existing chat completion and model listing capabilities.

Dependencies