Skip to content

Comments

Add q param to algolia search, fix search term highlighting across html nodes#14080

Open
vezwork wants to merge 2 commits intomainfrom
feature/robust-search-highlighting
Open

Add q param to algolia search, fix search term highlighting across html nodes#14080
vezwork wants to merge 2 commits intomainfrom
feature/robust-search-highlighting

Conversation

@vezwork
Copy link
Contributor

@vezwork vezwork commented Feb 20, 2026

Examples

I highlighted and copied some text on various pages that spans decorated and undecorated text, then pasted it into algolia search, and these are the successful matches:

image --- image --- image ---

Description

This PR

  1. replaces text fragments (from PR Add text fragments to quarto-search.js items #14003) in algolia search urls with a q={search term} param, enabling multiple match highlighting
  2. replaces the previous method of highlighting search terms in the page which only worked when the search term was fully contained within a single node. The previous method worked great for matches in paragraphs that did not have any decoration, but did not work for matches that contained both decorated and undecorated text, and did not work well for code cells with syntax highlighting.

I referenced @cderv's PR #14053 while finishing this PR. Building off of that, I found a method for opening containing tabsets that requires significantly less code. See openAllTabsetsContainingEl

@vezwork vezwork requested review from cderv and cscheid February 20, 2026 19:44
@posit-snyk-bot
Copy link
Collaborator

posit-snyk-bot commented Feb 20, 2026

Snyk checks have passed. No issues have been found so far.

Status Scanner Critical High Medium Low Total (0)
Open Source Security 0 0 0 0 0 issues
Licenses 0 0 0 0 0 issues

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

@vezwork vezwork removed the request for review from cscheid February 20, 2026 19:44
activateTabsWithMatches(mainEl);
// Let the browser settle layout after Bootstrap tab transitions
// before calculating scroll position.
requestAnimationFrame(() => scrollToFirstMatch(mainEl));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was moved inside of highlight

Comment on lines -365 to -369
const fullMatches = item.text.matchAll(/<mark class='search-match'>(.*?)<\/mark>/g)
// extract capture group with the search match
// result e.g. ["def fiz"]
const searchMatches = [...fullMatches].map(match => match[1])
if (searchMatches[0]) {
Copy link
Contributor Author

@vezwork vezwork Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We no longer use the matching text in the search, like we did for text fragments. Instead, for the q param, we use the user's full query, exactly as they typed it, whether algolia found a match or not. Algolia doesn't always find matches that the highlighting can, so this should help the user find anything that matches what they typed.

// After search highlighting, activate any tabs whose panes contain <mark> matches.
// This ensures that search results inside inactive Bootstrap tabs become visible.
// Handles nested tabsets by walking up ancestor panes and activating outermost first.
function activateTabsWithMatches(mainEl) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced with openAllTabsetsContainingEl

@vezwork vezwork changed the title Add q param to algolia search, fix highlighting for terms across multiple html nodes Add q param to algolia search, fix search term highlighting across html nodes Feb 20, 2026
@cscheid
Copy link
Collaborator

cscheid commented Feb 21, 2026

Code LGTM but I believe we have a stale playwright test now.

@cderv
Copy link
Collaborator

cderv commented Feb 23, 2026

Thanks for the work on cross-node highlighting — the searchMatches() approach that handles text spanning decorated elements is a real improvement over the per-text-node regex matching.

So I am looking into this more, but the failing test is not stale, it just shows that this PR brings a regression compare to what I have tried to do in previous improvement. Here is my analysis for now

These tests were added in #14053 and cover specific edge cases in how search interacts with Bootstrap tabsets. They pass on current main (https://github.com/quarto-dev/quarto-cli/actions/runs/22241123723, after #14053 merged).

What the tests cover

The test fixture has multiple tabsets with search terms placed strategically to exercise different scenarios. The two failing tests are:

  1. "Search keeps active tab when it already has a match"

When a search term (gamma-both-tabs) exists in both the R tab (active by default) and the Python tab, the expected behavior is to leave the R tab active — it already shows a match, so switching tabs would be disruptive. The previous activateTabsWithMatches() grouped marks by their parent .tab-content container and checked whether the active pane already had a match before deciding to switch. openAllTabsetsContainingEl() doesn't have this check — it activates every .tab-pane ancestor of every match node, so the last one processed wins and the mark in the other tab becomes hidden.

  1. "Search activation overrides localStorage tab preference" (line 79)

When the user has a stored tab preference in localStorage (e.g., "R" for grouped language tabs) but the search term (python-only-content) only matches in the Python tab, search should override the stored preference to show the match. This depends on event timing: tabsets.js registers a pageshowhandler that readsquarto-persistent-tabsets-datafromlocalStorageand callstoggleAll()to restore previously-selected tabs. The previous code in #14053 handled this by registering tab activation as apageshowlistener duringDOMContentLoaded— listener ordering guarantees it runs aftertabsets.jsrestores stored state, so search activation wins. This PR runs tab activation synchronously during highlight() inDOMContentLoaded, so tabsets.js's pageshowhandler fires afterward and overwrites the search-driven tab state with thelocalStorage` preference.

What passes

The other 4 tests pass, which means the core mechanics work: activating an inactive tab for a match, nested tabset activation, not touching tabs when the match is outside tabs, and scrolling to the first match.

Suggestion

The cross-node searchMatches() / markMatches() approach is the right direction — the old per-text-node regex couldn't handle matches spanning decorated elements. The tab activation logic just needs to be brought forward. Specifically:

  1. Before activating a tab, check whether the currently active pane in that tabset already contains a mark. If so, skip it.
  2. Move tab activation to a pageshow listener (registered during DOMContentLoaded) so it runs after tabsets.js restores localStorage state.

I haven't verified locally whether there are other behavioral differences beyond what the tests catch (e.g., how the setTimeout scroll interacts with tab transitions vs the previous requestAnimationFrame approach), but these two are the confirmed regressions from CI.

Happy to discuss it. The demo website is there to show current behavior: https://cderv.github.io/quarto-search-highlight-demo/

@cderv
Copy link
Collaborator

cderv commented Feb 23, 2026

@vezwork I created #14089 to show a possible fix that I think keep your simplified logic, while making the tab handling worked as I initially tried. THe tests should be passing there.

TBD.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants