Skip to content

Conversation

@howlger
Copy link
Contributor

@howlger howlger commented Feb 5, 2025

In org.eclipse.help.internal.search.SearchIndex, refactor the openSearcher() method to return an AutoCloseable object so that it can be used in a try-with-resources statement that implicitly registers and unregisters the search.

Minor performance improvement: Use double-checked locking idiom for lazy initialization of the IndexSearcher.

Grant access to the searches and analyzerDescriptor fields via SearcherInfo#getIndexSearcher() and SearcherInfo#getAnalyzerDescriptor(). These two getter methods are not (yet) used internally. Their only purpose is to be able to implement a hybrid search that combines a semantic search with a sparse keyword search. And the sparse keyword search uses the existing SearchIndex, which requires access to these fields.

@github-actions
Copy link
Contributor

github-actions bot commented Feb 5, 2025

Test Results

 1 758 files  ±0   1 758 suites  ±0   1h 27m 43s ⏱️ -10s
 4 174 tests ±0   4 151 ✅ ±0   23 💤 ±0  0 ❌ ±0 
13 119 runs  ±0  12 952 ✅ ±0  167 💤 ±0  0 ❌ ±0 

Results for commit e26c56d. ± Comparison against base commit 4a4172c.

♻️ This comment has been updated with latest results.

@howlger
Copy link
Contributor Author

howlger commented Feb 5, 2025

@iloveeclipse This is a work for Advantest.

The only purpose of this change is to add a method to an internal class. We want to use this and #1725 to enhance the local help search with the help of semantic search. More precisely, a hybrid search consisting of a semantic search and a sparse keyword search based on the existing internal local search to enhance the top search results by adding hits and/or changing the ranking. The limitation of hybrid search is that help content needs to be shipped with a pre-built index of the embeddings, also called a vector store, because indexing takes significantly longer for semantic search than for keyword search, where a pre-built index is optional.

Advantest intends to open source and contribute the hybrid search to Eclipse, for which some work still needs to be done.

@iloveeclipse
Copy link
Member

The build failed for no obvious reason in unrelated code. I've retriggered build now.

@iloveeclipse
Copy link
Member

The build failed for no obvious reason in unrelated code.

It is because your PR is FAR behind master! I will rebase now.

@iloveeclipse iloveeclipse force-pushed the patch-for-hybrid-search branch from f11951a to 46620d3 Compare February 5, 2025 10:43
@iloveeclipse
Copy link
Member

Same as on other PR: could you provide a simple test that uses this new method (even if it is experimental)?
Could you please check if there are any related tests that could be updated in org.eclipse.ua.tests.help.search.AllSearchTests ?

@iloveeclipse
Copy link
Member

The last build failed due heap space problem, not related to this PR change. I've re-triggered that again.

@howlger
Copy link
Contributor Author

howlger commented Feb 5, 2025

Same as on other PR: could you provide a simple test that uses this new method (even if it is experimental)? Could you please check if there are any related tests that could be updated in org.eclipse.ua.tests.help.search.AllSearchTests ?

I'm not sure if this is really required as the new experimental method customSearch(...) is basically a copy of the search(...) where for the inner part a custom implementation can be used. But if it is wanted, I will try to add a test for it.

@iloveeclipse
Copy link
Member

But if it is wanted, I will try to add a test for it.

The point is: there is an internal public method without uses. As a responsible maintainer I would tend to remove it, simply because any extra unused code is wasting our time/resources. If there is an example test that shows that the code does something useful, I would not touch that and even fix that if it should later fail.

Copy link
Contributor

@jukzi jukzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer not to add this unreferenced code but to make the needed methods more public. is "protected" enough?

@howlger howlger force-pushed the patch-for-hybrid-search branch from 46620d3 to 6372909 Compare February 7, 2025 09:41
@iloveeclipse
Copy link
Member

@howlger : please if you push your updated changes, before doing that rebase your branch on latest state. It just wastes time to run tests on ancient platform state that doesn't compile anymore. I will rebase now.

In `org.eclipse.help.internal.search.SearchIndex`, refactor the
`openSearcher()` method to return an `AutoCloseable` object so that it
can be used in a try-with-resources statement that implicitly registers
and unregisters the search.

Minor performance improvement: Use double-checked locking idiom for lazy
initialization of the `IndexSearcher`.

Grant access to the `searches` and `analyzerDescriptor` fields via
`SearcherInfo#getIndexSearcher()` and
`SearcherInfo#getAnalyzerDescriptor()`. These two getter methods are not
(yet) used internally. Their only purpose is to be able to implement a
hybrid search that combines a semantic search with a sparse keyword
search. And the sparse keyword search uses the existing `SearchIndex`,
which requires access to these fields.
@iloveeclipse iloveeclipse force-pushed the patch-for-hybrid-search branch from 6372909 to e26c56d Compare February 7, 2025 09:55
@howlger
Copy link
Contributor Author

howlger commented Feb 7, 2025

I would prefer not to add this unreferenced code but to make the needed methods more public. is "protected" enough?

Your feedback gave me an idea and I have removed the customSearch(...) method and instead extended the public openSearcher() method to return an AutoCloseable. This way it can be used in a try-with-resources statement that registers and unregisters the search. openSearcher() has been optimized by the double-checked locking idiom and the existing search(...) has been refactored to use openSearcher() in a try-with-resources statement.

@howlger howlger changed the title Add org.eclipse.help.internal.search.SearchIndex#customSearch(...) Refactor org.eclipse.help.internal.search.SearchIndex#openSearcher() Feb 7, 2025
@howlger
Copy link
Contributor Author

howlger commented Feb 7, 2025

@howlger : please if you push your updated changes, before doing that rebase your branch on latest state. It just wastes time to run tests on ancient platform state that doesn't compile anymore. I will rebase now.

@iloveeclipse Thanks, I missed that. Please note that I have now chosen a different approach which IMHO speeds up the existing search a bit and improves the readability of the code a bit.

@iloveeclipse
Copy link
Member

Build failed again due heap space issues (unrelated to the changes here)... I've restarted it.

@iloveeclipse
Copy link
Member

Your feedback gave me an idea and I have removed the customSearch(...) method and instead extended the public openSearcher() method to return an AutoCloseable. This way it can be used in a try-with-resources statement that registers and unregisters the search. openSearcher() has been optimized by the double-checked locking idiom and the existing search(...) has been refactored to use openSearcher() in a try-with-resources statement.

Indeed, the code is cleaner now. Thanks.

@iloveeclipse iloveeclipse merged commit 0fc54b8 into eclipse-platform:master Feb 7, 2025
17 checks passed
@howlger
Copy link
Contributor Author

howlger commented Feb 11, 2025

Thanks everyone for your feedback and for applying.

@iloveeclipse iloveeclipse added this to the 4.35 M3 milestone Feb 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants