@viragtripathi
Contributor

Two critical fixes for multi-node CockroachDB clusters:

  1. Connection Timeout Handling: On multi-node v25.4 clusters, CREATE VECTOR INDEX issued from a subprocess context hits a 30-second connection timeout, even though the index creation continues successfully in the background. This fix detects the timeout and polls for completion (up to 5 minutes).

  2. Vector Index Usage: vector_search_beam_size was not being set on pooled connections, which caused queries to fall back to a full table scan instead of using the vector index. The fix configures every connection taken from the pool with the proper beam size.
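
For readers who want the shape of the timeout handling in item 1, here is a minimal sketch, not the code merged in this PR. The names `wait_for_index`, `create_index_tolerating_timeout`, `run_create` and `index_exists` are hypothetical, and the exception type the driver raises on the 30-second timeout is an assumption (plain `TimeoutError` is used as a stand-in). The existence check is left as a callable so it can be backed by whatever query the client prefers, for example inspecting `SHOW INDEXES FROM <table>`.

```python
import time
from typing import Callable, Tuple, Type


def wait_for_index(
    index_exists: Callable[[], bool],
    timeout_s: float = 300.0,   # "up to 5 minutes", as in the PR description
    interval_s: float = 5.0,    # polling interval: an assumption, not from the PR
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until index_exists() returns True or the deadline passes."""
    deadline = clock() + timeout_s
    while True:
        if index_exists():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def create_index_tolerating_timeout(
    run_create: Callable[[], None],
    index_exists: Callable[[], bool],
    timeout_errors: Tuple[Type[BaseException], ...] = (TimeoutError,),
    **poll_kwargs,
) -> bool:
    """Run the CREATE statement; on a timeout, fall back to polling.

    run_create is a hypothetical callable that issues CREATE VECTOR INDEX.
    timeout_errors should be set to whatever the driver actually raises.
    """
    try:
        run_create()
        return True
    except timeout_errors:
        # The connection dropped, but the index build may still be running
        # server-side, so check for completion instead of failing outright.
        return wait_for_index(index_exists, **poll_kwargs)
```

Injecting `sleep` and `clock` keeps the helper testable without real waiting. Any other exception still propagates, so only the known timeout case is treated as "possibly still succeeding".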

Testing:

  • Single-node: Works without timeout (178s index creation)
  • Multi-node: Successfully handles timeout and completes (131s total)
  • Vector index: Now properly used for all searches (verified with EXPLAIN)
  • Both the single-node and multi-node setups achieve ~83% recall with good QPS

Fixes issues where:

  • Benchmarks would fail despite successful index creation
  • Searches were slow due to full table scans instead of index usage
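
The per-connection configuration described in item 2 could look roughly like the sketch below. It is an illustration under assumptions, not the merged change: `configure_connection` is a hypothetical helper, the default beam size of 32 is an arbitrary placeholder, and `conn` is assumed to expose `execute()` and `commit()` the way a psycopg 3 connection does.

```python
def configure_connection(conn, beam_size: int = 32) -> None:
    """Set vector_search_beam_size on a single connection.

    Session variables are per connection, so setting the value once on one
    connection does not affect the others handed out by the pool.
    The default of 32 is a placeholder, not a value taken from the PR.
    """
    beam = int(beam_size)  # coerce to int so nothing else reaches the SQL text
    if beam <= 0:
        raise ValueError("beam_size must be a positive integer")
    conn.execute(f"SET vector_search_beam_size = {beam}")
    # Commit so the connection is returned to the pool in an idle state.
    conn.commit()


# Possible wiring, assuming psycopg_pool is the pool in use (not executed here):
#
#   from functools import partial
#   from psycopg_pool import ConnectionPool
#   pool = ConnectionPool(conninfo, configure=partial(configure_connection, beam_size=64))
```

Running `EXPLAIN` on a search query from a pooled connection, as in the testing notes above, is a way to confirm that the plan uses the vector index rather than a full scan.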

@viragtripathi viragtripathi force-pushed the fix/cockroachdb-subprocess-timeout branch from df89d8d to de2f2a3 Compare November 26, 2025 04:07
@viragtripathi viragtripathi force-pushed the fix/cockroachdb-subprocess-timeout branch from e64eaba to e65f330 Compare November 26, 2025 04:15
@viragtripathi
Contributor Author

/assign @XuanYang-cn

@viragtripathi
Contributor Author

@alwayslove2013 Please take a look and merge this fix/PR. Thank you!

@sre-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: alwayslove2013, viragtripathi
To complete the pull request process, please assign xuanyang-cn after the PR has been reviewed.
You can assign the PR to them by writing /assign @xuanyang-cn in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@alwayslove2013 alwayslove2013 merged commit 16c4472 into zilliztech:main Nov 27, 2025
4 checks passed