Commit a24d363

rdhyee and claude authored
Add progressive globe with H3 aggregation and sample drill-down (#50)
* Add H3 spatial indexing, two-tier facet loading, and benchmark optimizations (#5)

  ## Changes
  - isamples_explorer.qmd: Two-tier facet loading (2KB summary for instant counts)
  - parquet_cesium_isamples_wide.qmd: Zoom-adaptive H3 clustering with LOD
  - zenodo_isamples_analysis.qmd: Data-driven H3 regional analysis
  - narrow_vs_wide_performance.qmd: Added geospatial and facet benchmarks

  ## Fixes Applied (Codex review)
  - Fixed MODE(n) → MODE(source) for cluster coloring
  - Added camera listener cleanup to prevent leaks
  - Added NaN guard for cluster label parsing
  - Added user-facing warning for facet summary failures

  Closes #1, #2, #3, #4

* Add progressive globe demo with H3 aggregated loading

  Loads 580KB H3 res4 summary for instant globe render (<1s), then switches to res6/res8 on zoom with viewport filtering. Click triggers sample detail query from full 280MB parquet.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix progressive globe: render stats bar from OJS cells

  DOM elements created in raw HTML aren't available when OJS cells execute. Move legend, stats bar, and phase indicator into OJS cells and add null guards on all getElementById calls.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Click cluster dot to fly-to and drill down to next H3 resolution

  Clicking an H3 cluster now flies the camera to that location at an altitude that triggers the next resolution level (res4→res6→res8). The zoom watcher then automatically loads finer detail.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Redesign progressive globe: side panel + global data + info-only clicks

  - Side-by-side layout: globe left, live info panel right (always visible)
  - Load full H3 files globally (no viewport filtering), so there are no gaps when panning
  - Click shows cluster info + nearby samples in side panel (no camera fly-to)
  - Zoom watcher switches resolution automatically: res4 → res6 → res8
  - Stats, legend, cluster card, and sample list all in side panel

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix infinite loop: replace OJS reactivity with imperative DOM updates

  The side panel was causing a reactive cycle: globeStatus change → sidePanel re-render → layout re-render → viewer re-create → phase1 re-run → globeStatus change → loop

  Fix: all side panel content is static HTML. Stats, cluster card, and sample list are updated via getElementById/innerHTML only. No OJS mutable variables, no reactive cascade.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add sub-res8 individual sample drill-down to progressive globe

  - New 4th zoom tier: below 120km altitude, switches from H3 clusters to individual sample points loaded from lite parquet (60MB vs 280MB)
  - Two-stage sample card: instant metadata from lite file, lazy-loaded description from full wide parquet on click
  - Viewport caching with 30% padding for smooth panning
  - Stale-request guards for async camera/query flows
  - Hysteresis thresholds (120km enter / 180km exit) to prevent flicker
  - Separate PointPrimitiveCollection for samples vs clusters
  - Cluster click queries now use lite parquet instead of wide (5x faster)

  Data files on R2:
  - isamples_202601_samples_map_lite.parquet (60MB, 6M rows, 9 columns)
  - Still uses H3 summary files for res4/6/8 cluster view

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix bugs from Codex review: deadlock, schema mismatch, timing

  - loadRes: wrap in try/catch/finally so `loading` flag always resets on query failure (was permanent deadlock; finding #2)
  - Schema fix: cluster-click query used `n as source` but the lite parquet has column named `source` (finding #4)
  - Remove unnecessary ORDER BY on H3 loads (finding #8)
  - Use .pop() instead of [0] for performance timing entries (finding #11)
  - Add rel="noopener noreferrer" to target="_blank" link (finding #7)

  Deferred: XSS escaping (trusted data), antimeridian handling, detail click caching, startup error fallback.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add progressive globe to sidebar navigation

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix cluster-click query: remove description column missing from lite parquet

  The samples_map_lite.parquet doesn't have a description column. Use place_name for nearby sample cards instead.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add URL state encoding for shareable deep links

  - Hash-based URL state: lat, lng, alt, heading, pitch, mode, pid
  - v=1 schema versioning for future compatibility
  - parseNum with Number.isFinite (avoids lat=0 bug from Codex review)
  - replaceState for continuous camera movement, pushState for mode transitions and sample/cluster selection
  - Browser back/forward via hashchange listener with flight animation
  - Suppress flag prevents hash write loops during navigation restore
  - Deep-link startup: fly to position and restore sample card from pid
  - Share View button copies current URL to clipboard with toast
  - pid takes precedence over h3 (canonicalized on write)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix startup crash: move _initialHash before globalRect block that reads it

  v._initialHash was set after the once() closure that references it, causing undefined.lat TypeError on page load.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
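The hysteresis thresholds named in the commit message (120km enter / 180km exit) can be illustrated with a minimal sketch. Only the two threshold values come from the message; the function name `nextMode` and the mode labels `"clusters"` and `"samples"` are assumptions made for illustration, not identifiers from the commit.

```javascript
// Hypothetical sketch: only the 120 km / 180 km values come from the commit message.
const ENTER_SAMPLES_KM = 120; // drop to individual samples below this altitude
const EXIT_SAMPLES_KM = 180;  // return to H3 clusters above this altitude

// Returns the display mode to use next, given the current mode and camera altitude.
function nextMode(currentMode, altitudeKm) {
  if (currentMode === "clusters" && altitudeKm < ENTER_SAMPLES_KM) return "samples";
  if (currentMode === "samples" && altitudeKm > EXIT_SAMPLES_KM) return "clusters";
  // Between the two thresholds the current mode is kept, which prevents flicker.
  return currentMode;
}

// Simulate a camera that dips below 120 km and then drifts back up.
let mode = "clusters";
for (const altitudeKm of [200, 150, 110, 150, 170, 190]) {
  mode = nextMode(mode, altitudeKm);
  console.log(`${altitudeKm} km -> ${mode}`);
}
```

With a single threshold, a camera hovering near the boundary would switch modes on every small altitude change. The gap between the enter and exit values means a mode switch is only undone after a deliberate zoom in the other direction.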
1 parent 4cd3164 commit a24d363
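The URL-state commit above mentions `parseNum` with `Number.isFinite` and a "lat=0 bug". The sketch below shows one plausible shape for such a helper. It assumes the bug was a truthiness check that discarded a legitimate `0`; the body of `parseNum`, the `parseHash` function, and the example hash string are all hypothetical and not taken from the commit.

```javascript
// Hypothetical helper: the commit names parseNum and Number.isFinite, nothing more.
function parseNum(raw, fallback = null) {
  // Number("") is 0, so treat missing or empty values as absent first.
  if (raw === null || raw === undefined || raw === "") return fallback;
  const n = Number(raw);
  // Number.isFinite rejects NaN and ±Infinity but accepts 0.
  return Number.isFinite(n) ? n : fallback;
}

// Hypothetical parser for a fragment such as "#v=1&lat=0&lng=-110.5&alt=250000".
function parseHash(hash) {
  const params = new URLSearchParams(hash.replace(/^#/, ""));
  return {
    lat: parseNum(params.get("lat")),
    lng: parseNum(params.get("lng")),
    alt: parseNum(params.get("alt")),
  };
}

// lat=0 (the equator) is kept; the malformed alt falls back to null.
const state = parseHash("#v=1&lat=0&lng=-110.5&alt=abc");
console.log(state);
```

A pattern like `parseFloat(value) || defaultValue` would replace a latitude of `0` with the default, because `0` is falsy in JavaScript. Checking finiteness explicitly avoids that.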

6 files changed: +1794 -238 lines changed
_quarto.yml

Lines changed: 2 additions & 0 deletions
@@ -54,6 +54,8 @@ website:
   href: tutorials/zenodo_isamples_analysis.qmd
 - text: "3D Globe Visualization"
   href: tutorials/parquet_cesium_isamples_wide.qmd
+- text: "Progressive Globe (H3 + Samples)"
+  href: tutorials/progressive_globe.qmd
 - text: "Technical: Narrow vs Wide"
   href: tutorials/narrow_vs_wide_performance.qmd

tutorials/isamples_explorer.qmd

Lines changed: 145 additions & 33 deletions
@@ -12,7 +12,7 @@ Search and explore **6.7 million physical samples** from scientific collections

 ::: {.callout-note}
 ### Serverless Architecture
-This app queries a ~280 MB Parquet file directly in your browser using DuckDB-WASM. No server required!
+This app uses a **two-tier loading strategy**: a 2KB pre-computed summary loads instantly for facet counts (source, material, context, specimen type), while the full ~280 MB Parquet file is only queried when drilling into records. All powered by DuckDB-WASM in your browser -- no server required!
 :::

 ## Setup
@@ -28,6 +28,9 @@ duckdbModule = import("https://cdn.jsdelivr.net/npm/@duckdb/duckdb-wasm@1.28.0/+
 // Data source configuration
 parquet_url = "https://pub-a18234d962364c22a50c787b7ca09fa5.r2.dev/isamples_202601_wide.parquet"

+// Pre-computed facet summaries (2KB - loads instantly)
+facet_summaries_url = "https://pub-a18234d962364c22a50c787b7ca09fa5.r2.dev/isamples_202601_facet_summaries.parquet"
+
 // Source color scheme (consistent with iSamples conventions)
 SOURCE_COLORS = ({
   'SESAR': '#3366CC', // Blue
@@ -79,14 +82,18 @@ viewof searchInput = Inputs.text({

 ### Filters

+```{ojs}
+facetSummariesWarning
+```
+
 **Source**

 ```{ojs}
 //| code-fold: true
-// Source checkboxes with counts
+// Source checkboxes with counts - uses pre-computed summaries for instant load
 viewof sourceCheckboxes = {
-  // Get source counts based on current search
-  const counts = await sourceCounts;
+  // Use pre-computed facet summaries (instant) instead of scanning full parquet
+  const counts = facetsByType.source;
   const options = counts.map(r => r.value);

   return Inputs.checkbox(options, {
@@ -104,6 +111,69 @@ viewof sourceCheckboxes = {
 }
 ```

+**Material**
+
+```{ojs}
+//| code-fold: true
+// Material filter - loaded from pre-computed summaries
+viewof materialCheckboxes = {
+  const counts = facetsByType.material;
+  const options = counts.map(r => r.value);
+  return Inputs.checkbox(options, {
+    value: [],
+    format: (x) => {
+      const r = counts.find(s => s.value === x);
+      const count = r ? Number(r.count).toLocaleString() : "0";
+      return html`<span style="display: inline-flex; align-items: center; gap: 4px;">
+        ${x} <span style="color: #888; font-size: 11px;">(${count})</span>
+      </span>`;
+    }
+  });
+}
+```
+
+**Sampled Feature**
+
+```{ojs}
+//| code-fold: true
+// Context filter - loaded from pre-computed summaries
+viewof contextCheckboxes = {
+  const counts = facetsByType.context;
+  const options = counts.map(r => r.value);
+  return Inputs.checkbox(options, {
+    value: [],
+    format: (x) => {
+      const r = counts.find(s => s.value === x);
+      const count = r ? Number(r.count).toLocaleString() : "0";
+      return html`<span style="display: inline-flex; align-items: center; gap: 4px;">
+        ${x} <span style="color: #888; font-size: 11px;">(${count})</span>
+      </span>`;
+    }
+  });
+}
+```
+
+**Specimen Type**
+
+```{ojs}
+//| code-fold: true
+// Object type filter - loaded from pre-computed summaries
+viewof objectTypeCheckboxes = {
+  const counts = facetsByType.object_type;
+  const options = counts.map(r => r.value);
+  return Inputs.checkbox(options, {
+    value: [],
+    format: (x) => {
+      const r = counts.find(s => s.value === x);
+      const count = r ? Number(r.count).toLocaleString() : "0";
+      return html`<span style="display: inline-flex; align-items: center; gap: 4px;">
+        ${x} <span style="color: #888; font-size: 11px;">(${count})</span>
+      </span>`;
+    }
+  });
+}
+```
+
 ```{ojs}
 //| code-fold: true
 html`<a href="?" style="font-size: 13px;">Clear All Filters</a>`
@@ -131,6 +201,9 @@ viewof maxSamples = Inputs.range([1000, 100000], {
   const params = new URLSearchParams();
   if (searchInput) params.set("q", searchInput);
   if (sourceCheckboxes?.length) params.set("sources", sourceCheckboxes.join(","));
+  if (materialCheckboxes?.length) params.set("material", materialCheckboxes.join(","));
+  if (contextCheckboxes?.length) params.set("context", contextCheckboxes.join(","));
+  if (objectTypeCheckboxes?.length) params.set("object_type", objectTypeCheckboxes.join(","));
   if (viewMode !== "globe") params.set("view", viewMode);

   const newUrl = params.toString() ? `?${params.toString()}` : window.location.pathname;
@@ -264,7 +337,50 @@ async function runQuery(sql) {

 ```{ojs}
 //| code-fold: true
-// Build WHERE clause from current filters
+// Tier 1: Load pre-computed facet summaries (2KB, instant)
+facetSummaries = {
+  facetSummariesError = null;
+  try {
+    const rows = await runQuery(`SELECT * FROM read_parquet('${facet_summaries_url}')`);
+    return rows;
+  } catch (e) {
+    console.error("Facet summaries load error:", e);
+    facetSummariesError = e;
+    return [];
+  }
+}
+
+```
+
+```{ojs}
+//| code-fold: true
+facetSummariesWarning = {
+  if (!facetSummariesError) return null;
+  return html`<div style="margin: 6px 0 10px; padding: 8px 10px; border: 1px solid #f0b429; background: #fff7e6; border-radius: 6px; color: #7a4b00; font-size: 12px;">
+    Facet summaries failed to load. Filter counts may be missing. Try refreshing.
+  </div>`;
+}
+
+// Extract facet counts by type from pre-computed summaries
+facetsByType = {
+  const grouped = { source: [], material: [], context: [], object_type: [] };
+  for (const row of facetSummaries) {
+    const ft = row.facet_type;
+    if (grouped[ft]) {
+      grouped[ft].push({ value: row.facet_value, count: Number(row.count), scheme: row.scheme });
+    }
+  }
+  // Sort each by count descending
+  for (const key of Object.keys(grouped)) {
+    grouped[key].sort((a, b) => b.count - a.count);
+  }
+  return grouped;
+}
+```
+
+```{ojs}
+//| code-fold: true
+// Build WHERE clause from current filters (Tier 2: queries full parquet only when filtering)
 whereClause = {
   const conditions = [
     "otype = 'MaterialSampleRecord'",
@@ -288,40 +404,36 @@ whereClause = {
     conditions.push(`n IN (${sourceList})`);
   }

+  // Material filter
+  const materials = Array.from(materialCheckboxes || []);
+  if (materials.length > 0) {
+    const matList = materials.map(m => `'${m.replace(/'/g, "''")}'`).join(", ");
+    conditions.push(`has_material_category IN (${matList})`);
+  }
+
+  // Context (sampled feature) filter
+  const contexts = Array.from(contextCheckboxes || []);
+  if (contexts.length > 0) {
+    const ctxList = contexts.map(c => `'${c.replace(/'/g, "''")}'`).join(", ");
+    conditions.push(`has_context_category IN (${ctxList})`);
+  }
+
+  // Object type (specimen type) filter
+  const objectTypes = Array.from(objectTypeCheckboxes || []);
+  if (objectTypes.length > 0) {
+    const otList = objectTypes.map(o => `'${o.replace(/'/g, "''")}'`).join(", ");
+    conditions.push(`has_specimen_category IN (${otList})`);
+  }
+
   return conditions.join(" AND ");
 }
 ```

 ```{ojs}
 //| code-fold: true
-// Get source facet counts (respects text search but not source filter)
-sourceCounts = {
-  let baseWhere = "otype = 'MaterialSampleRecord' AND latitude IS NOT NULL";
-
-  if (searchInput?.trim()) {
-    const term = searchInput.trim().replace(/'/g, "''");
-    baseWhere += ` AND (
-      label ILIKE '%${term}%'
-      OR description ILIKE '%${term}%'
-      OR CAST(place_name AS VARCHAR) ILIKE '%${term}%'
-    )`;
-  }
-
-  const query = `
-    SELECT n as value, COUNT(*) as count
-    FROM samples
-    WHERE ${baseWhere}
-    GROUP BY n
-    ORDER BY count DESC
-  `;
-
-  try {
-    return await runQuery(query);
-  } catch (e) {
-    console.error("Facet query error:", e);
-    return [];
-  }
-}
+// Source counts now come from pre-computed facet summaries (Tier 1)
+// No longer scans the full parquet file on every page load
+sourceCounts = facetsByType.source
 ```

 ```{ojs}
