Conversation

@Akhil-Pathivada Akhil-Pathivada commented Dec 24, 2025

Summary

Adds support for Lucene Scalar Quantization (SQ) and FAISS 16-bit Scalar Quantization (FP16) for OSS OpenSearch, enabling users to reduce memory usage for in-memory vector indexes.

Changes

Backend

  • Refactored the quantization enum: fp32/fp16 → None/LuceneSQ/FaissSQfp16
  • Added new fields:
    • confidence_interval (float, optional): For Lucene SQ quantile calculation
    • clip (bool, optional): For FAISS FP16 out-of-range value handling
  • Implemented validator: Converts UI/CLI string values to enum with backward compatibility
  • Updated encoder logic: Engine-aware configuration for Lucene SQ and FAISS FP16
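
The validator described above could look roughly like this — a minimal sketch, assuming the legacy `fp32`/`fp16` strings map to `None`/`FaissSQfp16` respectively (names and mapping are illustrative, not the exact implementation):

```python
from enum import Enum

class QuantizationType(Enum):
    NONE = "None"
    LUCENE_SQ = "LuceneSQ"
    FAISS_SQ_FP16 = "FaissSQfp16"

# Assumed mapping of the old enum values, kept for backward compatibility
# with existing configs.
_LEGACY_ALIASES = {
    "fp32": QuantizationType.NONE,
    "fp16": QuantizationType.FAISS_SQ_FP16,
}

def parse_quantization(value: str) -> QuantizationType:
    """Convert a UI/CLI string into the enum, accepting legacy names."""
    if value in _LEGACY_ALIASES:
        return _LEGACY_ALIASES[value]
    try:
        return QuantizationType(value)
    except ValueError:
        valid = [m.value for m in QuantizationType] + sorted(_LEGACY_ALIASES)
        raise ValueError(
            f"unknown quantization type {value!r}; expected one of {valid}"
        ) from None
```

Routing legacy strings through an alias table keeps old configs working without leaking the deprecated names into the new enum.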

Frontend

  • Engine-aware UI: Separate quantization dropdowns for Lucene and FAISS engines
    • Lucene: ["None", "LuceneSQ"]
    • FAISS: ["None", "FaissSQfp16"]
  • Progressive disclosure: Optional parameters appear only when relevant
    • confidence_interval shown for Lucene SQ
    • clip shown for FAISS FP16
  • Prevents invalid combinations: UI logic ensures engine/quantization compatibility
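
The engine/quantization compatibility rules above can be sketched as two lookup tables plus a check (the function and table names here are illustrative, not the actual frontend code):

```python
# Engine -> quantization choices offered in the dropdown.
QUANT_OPTIONS = {
    "Lucene": ["None", "LuceneSQ"],
    "Faiss": ["None", "FaissSQfp16"],
}

# Quantization -> optional parameters to disclose in the UI.
OPTIONAL_PARAMS = {
    "LuceneSQ": ["confidence_interval"],
    "FaissSQfp16": ["clip"],
}

def validate_combination(engine, quantization):
    """Reject engine/quantization pairs the UI would never offer."""
    allowed = QUANT_OPTIONS.get(engine, ["None"])
    if quantization not in allowed:
        raise ValueError(
            f"{quantization!r} is not valid for engine {engine!r}; "
            f"choose from {allowed}"
        )

def visible_params(quantization):
    """Return the optional parameters shown for the selected quantization."""
    return OPTIONAL_PARAMS.get(quantization, [])
```
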

CLI

  • Updated options: --quantization-type now accepts None, LuceneSQ, FaissSQfp16
  • New parameters: --confidence-interval and --clip for fine-tuning

Technical Details

Lucene SQ

  • Converts 32-bit float vectors to 7-bit integers (4x memory reduction)
  • Optional confidence_interval (0.0-1.0) controls quantile calculation
  • OpenSearch API: encoder: {name: "sq", parameters: {confidence_interval: <value>}}
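
A small helper can build that encoder block and enforce the documented range — a sketch only; the helper name and the surrounding `hnsw`/`l2` method fields are assumptions for illustration:

```python
def lucene_sq_encoder(confidence_interval=None):
    """Build the Lucene SQ encoder block for an OpenSearch k-NN method."""
    encoder = {"name": "sq", "parameters": {}}
    if confidence_interval is not None:
        if not 0.0 <= confidence_interval <= 1.0:
            raise ValueError("confidence_interval must be in [0.0, 1.0]")
        encoder["parameters"]["confidence_interval"] = confidence_interval
    return encoder

# Hypothetical usage inside a method definition (hnsw/l2 assumed):
method = {
    "name": "hnsw",
    "engine": "lucene",
    "space_type": "l2",
    "parameters": {"encoder": lucene_sq_encoder(0.9)},
}
```
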

FAISS FP16

  • Converts 32-bit float vectors to 16-bit floats (2x memory reduction)
  • Optional clip parameter handles out-of-range values ([-65504, 65504])
  • OpenSearch API: encoder: {name: "sq", parameters: {type: "fp16", clip: true}}
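
The FAISS variant is the same shape with a `type` and `clip` instead of `confidence_interval` — again a sketch, with the helper name assumed:

```python
FP16_MAX = 65504.0  # largest finite float16 magnitude

def faiss_fp16_encoder(clip=False):
    """Build the FAISS 16-bit SQ encoder block; clip=True asks OpenSearch to
    clamp out-of-range components into [-65504, 65504] instead of failing."""
    return {"name": "sq", "parameters": {"type": "fp16", "clip": bool(clip)}}
```
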

Testing

  • ✅ All linter checks passed
  • ✅ Backward compatible with existing configs
  • ✅ Engine-aware UI prevents invalid configurations
  • ✅ CLI and UI both functional

Screenshots

*(Two screenshots attached, taken 2025-12-24.)*

@Akhil-Pathivada Akhil-Pathivada force-pushed the feature/quantization-support branch from c81a1ef to f528fdc Compare December 24, 2025 11:11
@Akhil-Pathivada Akhil-Pathivada marked this pull request as ready for review December 24, 2025 11:17
@Akhil-Pathivada

/assign @alwayslove2013

@sre-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Akhil-Pathivada, alwayslove2013
To complete the pull request process, please assign xuanyang-cn after the PR has been reviewed.
You can assign the PR to them by writing /assign @xuanyang-cn in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@alwayslove2013 alwayslove2013 merged commit eb79134 into zilliztech:main Dec 25, 2025
4 checks passed
@Akhil-Pathivada Akhil-Pathivada deleted the feature/quantization-support branch December 25, 2025 06:03