
Conversation


@spuckhafte spuckhafte commented Dec 30, 2025

Fixes #1498.

Description


This PR has:

  • been tested to ensure log ingestion and log query work.
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious to an unfamiliar reader.
  • added documentation for new or modified features or behaviors.

Summary by CodeRabbit

  • New Features

    • New API endpoint to list resources affected by a log stream (filters, dashboards, alerts, roles).
    • New permission to authorize access to affected-resource information.
  • Bug Fixes / Improvements

    • Improved validation and error handling for affected-resource queries, with clearer responses for missing streams and parse failures.



coderabbitai bot commented Dec 30, 2025

Walkthrough

Added a GET endpoint /logstream/{logstream}/affected-resources that verifies a stream exists, aggregates dependent resources (filters, dashboards, alerts, roles) via a new LogstreamAffectedResources utility, and enforces a new RBAC action GetLogstreamAffectedResources.
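
For orientation, here is a minimal sketch of the response shape implied by the walkthrough and the diffs quoted below. The struct and field names come from the review snippets; the exact ID types and derives are assumptions, not the PR's verbatim code.

use serde::Serialize;
use ulid::Ulid;

// Hypothetical reconstruction of the JSON payload returned by
// GET /logstream/{logstream}/affected-resources. Field names are taken
// from the load() snippet quoted in the review below; ID types are assumed.
#[derive(Serialize)]
struct LogstreamAffectedResources {
    filters: Vec<Ulid>,                          // filters saved against the stream
    dashboards: Vec<LogstreamAffectedDashboard>, // dashboards with tiles querying the stream
    alerts: Vec<Ulid>,                           // alerts whose query targets the stream
    roles: Vec<String>,                          // roles with privileges scoped to the stream
}

#[derive(Serialize)]
struct LogstreamAffectedDashboard {
    dashboard_id: Ulid, // see the review note below on missing-ID handling
    affected_tile_ids: Vec<Ulid>,
}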

Changes

  • HTTP Endpoint Handler (src/handlers/http/logstream.rs): new public async handler get_affected_resources(stream_name: Path<String>) that ensures the stream exists (loading it from storage if needed) and returns LogstreamAffectedResources as JSON, or an error.
  • Server Route Registration (src/handlers/http/modal/ingest_server.rs, src/handlers/http/modal/query_server.rs, src/handlers/http/modal/server.rs): registers GET /logstream/{logstream}/affected-resources, routed to logstream::get_affected_resources; the route requires Action::GetLogstreamAffectedResources authorization across all server modes.
  • Core Resource Dependency Logic (src/handlers/http/modal/utils/logstream_utils.rs): new types LogstreamAffectedResources and LogstreamAffectedDashboard; async methods load, fetch_affected_filters, fetch_affected_dashboards, fetch_affected_alerts, and fetch_affected_roles; new errors LogstreamAffectedResourcesError and Bytes2JSONError; a bytes_to_json helper plus dashboard/alert/role parsing logic.
  • RBAC Authorization (src/rbac/role.rs): added Action::GetLogstreamAffectedResources, mapped as a resource-scoped permission in RoleBuilder and included in the default privilege sets (Admin, Editor, Writer, Reader).

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Handler as HTTP Handler\n(logstream::get_affected_resources)
    participant Auth as RBAC Authorization
    participant Metastore as Metastore / ObjectStorage
    participant Parser as JSON Parser & Utils

    Client->>Handler: GET /logstream/{name}/affected-resources
    Handler->>Auth: authorize Action::GetLogstreamAffectedResources
    Auth-->>Handler: allow / deny
    alt allowed
        Handler->>Metastore: ensure stream exists (load if needed)
        Metastore-->>Handler: stream metadata / existence
        Handler->>Metastore: fetch raw bytes for filters/dashboards/alerts/roles
        Metastore-->>Parser: return raw bytes
        Parser-->>Handler: deserialize & filter -> LogstreamAffectedResources
        Handler-->>Client: 200 OK + JSON payload
    else denied
        Handler-->>Client: 403 Forbidden
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • nitisht
  • nikhilsinhaparseable

Poem

🐰 I hopped through bytes and dashboard tiles,
Searched alerts and roles across the files.
One endpoint now reveals the ties,
Filters, dashboards, alerts — a friendly prize. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
  • Description check: ⚠️ Warning. The description references the linked issue (#1498) but lacks implementation details, testing confirmation, and documentation status; key sections are empty with unchecked boxes. Resolution: fill in the description sections with implementation details, confirm testing status, and document whether comments and documentation were added for the new features.
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 5.56%, below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve coverage.
✅ Passed checks (3 passed)
✅ Passed checks (3 passed)
  • Title check: ✅ Passed. The pull request title clearly and concisely summarizes the main change: adding functionality to fetch a logstream's affected resources.
  • Linked Issues check: ✅ Passed. The implementation addresses all coding requirements from issue #1498: an API endpoint for fetching affected filters, alerts, dashboards, and roles.
  • Out of Scope Changes check: ✅ Passed. All changes are scoped to implementing the dependency graph API for logstreams; no unrelated modifications to other features or systems are present in the changeset.

CLA Assistant Lite bot:
Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by posting a Pull Request comment in the format below.


I have read the CLA Document and I hereby sign the CLA


You can re-trigger this bot by commenting recheck in this Pull Request.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (4)
src/handlers/http/logstream.rs (1)

498-516: Fix typo in variable name.

The variable affecred_resources on line 511 has a typo — should be affected_resources.

🔎 Suggested fix
     match LogstreamAffectedResources::load(&stream_name).await {
-        Ok(affecred_resources) 
-            => Ok((web::Json(affecred_resources), StatusCode::OK)),
+        Ok(affected_resources) 
+            => Ok((web::Json(affected_resources), StatusCode::OK)),
         Err(err) 
             => Err(err.into())
     }
src/handlers/http/modal/utils/logstream_utils.rs (3)

122-131: Stale doc comment.

The doc comment mentions returning "A tuple where: First element: true if no resources are affected..." but the function returns Result<Self, LogstreamAffectedResourcesError>. Update or remove the outdated documentation.

🔎 Suggested fix
     /// Load all resources that will be affected if the given logstream is deleted.
     ///
     /// ### Arguments
     /// - `stream_name` - The name of the logstream to check for dependencies
     ///
     /// ### Returns
-    /// A tuple where:
-    /// - First element: `true` if no resources are affected (empty loaded struct), `false` otherwise
-    /// - Second element: The populated `LogstreamAffectedResources` struct
- 
+    /// A `LogstreamAffectedResources` struct populated with all dependent resources,
+    /// or an error if the stream doesn't exist or fetching fails.
     pub async fn load(stream_name: &str) -> Result<Self, LogstreamAffectedResourcesError> {

141-156: Redundant stream existence checks across fetch methods.

Each fetch_affected_* method independently checks PARSEABLE.streams.contains(stream_name). Since load() calls all four methods sequentially, consider checking stream existence once in load() before calling the fetch methods, or make the individual methods private and trust the caller.

This would reduce redundant lookups and centralize the validation logic.
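
As a rough illustration of that suggestion, the existence check could live once in load(), reusing the pattern quoted in the learnings below (PARSEABLE, Mode, and create_stream_and_schema_from_storage are taken from those quotes; the error plumbing is assumed, and this is a sketch rather than the PR's actual code):

pub async fn load(stream_name: &str) -> Result<Self, LogstreamAffectedResourcesError> {
    // Single up-front existence check, covering both query and standalone
    // modes per the pattern from PR 1185 quoted in the learnings below.
    if !PARSEABLE.streams.contains(stream_name)
        && (PARSEABLE.options.mode != Mode::Query
            || !PARSEABLE.create_stream_and_schema_from_storage(stream_name).await?)
    {
        return Err(StreamNotFound(stream_name.to_string()).into());
    }
    // The fetch_* methods can then skip their own per-call checks.
    Ok(Self {
        filters: Self::fetch_affected_filters(stream_name).await?,
        dashboards: Self::fetch_affected_dashboards(stream_name).await?,
        alerts: Self::fetch_affected_alerts(stream_name).await?,
        roles: Self::fetch_affected_roles(stream_name).await?,
    })
}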


341-357: Simplify bytes_to_json to avoid double deserialization.

The function first deserializes to serde_json::Value, then deserializes again to T. This is inefficient — use serde_json::from_slice::<T>() directly.

🔎 Suggested fix
 fn bytes_to_json<T: serde::de::DeserializeOwned>(json_bytes: Bytes) -> Result<T, Bytes2JSONError> {
     if json_bytes.is_empty() {
         return Err(Bytes2JSONError::ZeroSizedBytes);
     }
 
-    let json_bytes_value = match serde_json::from_slice::<serde_json::Value>(&json_bytes) {
-        Ok(value) => value,
-        Err(err) => {
-            return Err(Bytes2JSONError::FailedToParse(format!("{:#?}", err)))
-        }
-    };
-
-    return match serde_json::from_value::<T>(json_bytes_value.clone()) {
-        Ok(parsed_object) => Ok(parsed_object),
-        Err(e) => Err(Bytes2JSONError::FailedToParse(format!("deserialization failed: {:#?}", e)))
-    };
+    serde_json::from_slice::<T>(&json_bytes)
+        .map_err(|e| Bytes2JSONError::FailedToParse(format!("{:#?}", e)))
 }
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cab9fc0 and 7def6f7.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • src/handlers/http/logstream.rs
  • src/handlers/http/modal/ingest_server.rs
  • src/handlers/http/modal/query_server.rs
  • src/handlers/http/modal/server.rs
  • src/handlers/http/modal/utils/logstream_utils.rs
  • src/rbac/role.rs
🧰 Additional context used
🧠 Learnings (11)
📚 Learning: 2025-10-28T02:10:41.140Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1453
File: src/parseable/mod.rs:397-400
Timestamp: 2025-10-28T02:10:41.140Z
Learning: In Parseable enterprise deployments with multiple query nodes, hot tier configuration must be persisted in object storage so that newly started query nodes can fetch and synchronize the hot tier settings at startup (file: src/parseable/mod.rs, function: create_stream_and_schema_from_storage).

Applied to files:

  • src/handlers/http/logstream.rs
📚 Learning: 2025-08-25T01:31:41.786Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metadata.rs:63-68
Timestamp: 2025-08-25T01:31:41.786Z
Learning: The TOTAL_EVENTS_INGESTED_DATE, TOTAL_EVENTS_INGESTED_SIZE_DATE, and TOTAL_EVENTS_STORAGE_SIZE_DATE metrics in src/metadata.rs and src/storage/object_storage.rs are designed to track total events across all streams, not per-stream. They use labels [origin, parsed_date] to aggregate by format and date, while per-stream metrics use [stream_name, origin, parsed_date] labels.

Applied to files:

  • src/handlers/http/logstream.rs
  • src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-08-25T01:32:25.980Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metrics/mod.rs:163-173
Timestamp: 2025-08-25T01:32:25.980Z
Learning: The TOTAL_EVENTS_INGESTED_DATE, TOTAL_EVENTS_INGESTED_SIZE_DATE, and TOTAL_EVENTS_STORAGE_SIZE_DATE metrics in src/metrics/mod.rs are intentionally designed to track global totals across all streams for a given date, using labels ["format", "date"] rather than per-stream labels. This is the correct design for global aggregation purposes.

Applied to files:

  • src/handlers/http/logstream.rs
  • src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-10-20T17:48:53.444Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1448
File: src/handlers/http/cluster/mod.rs:1370-1400
Timestamp: 2025-10-20T17:48:53.444Z
Learning: In src/handlers/http/cluster/mod.rs, the billing metrics processing logic should NOT accumulate counter values from multiple Prometheus samples with the same labels. The intended behavior is to convert each received counter from nodes into individual events for ingestion, using `.insert()` to store the counter value directly.

Applied to files:

  • src/handlers/http/logstream.rs
📚 Learning: 2025-09-18T09:59:20.177Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metrics/mod.rs:700-756
Timestamp: 2025-09-18T09:59:20.177Z
Learning: In src/event/mod.rs, the parsed_timestamp used in increment_events_ingested_by_date() is correctly UTC-normalized: for dynamic streams it remains Utc::now(), and for streams with time partition enabled it uses the time partition value. Both cases result in proper UTC date strings for metrics labeling, preventing double-counting issues.

Applied to files:

  • src/handlers/http/logstream.rs
📚 Learning: 2025-08-18T19:10:11.941Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1405
File: src/handlers/http/ingest.rs:163-164
Timestamp: 2025-08-18T19:10:11.941Z
Learning: Field statistics calculation in src/storage/field_stats.rs uses None for the time_partition parameter when calling flatten_and_push_logs(), as field stats generation does not require time partition functionality.

Applied to files:

  • src/handlers/http/logstream.rs
📚 Learning: 2025-02-14T09:49:25.818Z
Learnt from: de-sh
Repo: parseablehq/parseable PR: 1185
File: src/handlers/http/logstream.rs:255-261
Timestamp: 2025-02-14T09:49:25.818Z
Learning: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern `!PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?)` ensures proper error handling in both modes.

Applied to files:

  • src/handlers/http/logstream.rs
  • src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-09-06T04:26:17.191Z
Learnt from: parmesant
Repo: parseablehq/parseable PR: 1424
File: src/enterprise/utils.rs:65-72
Timestamp: 2025-09-06T04:26:17.191Z
Learning: In Parseable's metastore implementation, MetastoreError::to_detail() returns a MetastoreErrorDetail struct (not a string), which contains structured error information including operation, message, stream_name, and other contextual fields. This struct is designed to be boxed in ObjectStorageError::MetastoreError(Box<MetastoreErrorDetail>).

Applied to files:

  • src/handlers/http/logstream.rs
  • src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-09-18T09:52:07.554Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/storage/object_storage.rs:173-177
Timestamp: 2025-09-18T09:52:07.554Z
Learning: In Parseable's upload system (src/storage/object_storage.rs), the update_storage_metrics function can safely use path.metadata().map_err() to fail on local file metadata read failures because parquet validation (validate_uploaded_parquet_file) ensures file integrity before this step, and the system guarantees local staging files remain accessible throughout the upload flow.

Applied to files:

  • src/handlers/http/logstream.rs
📚 Learning: 2025-10-21T02:22:24.403Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1448
File: src/parseable/mod.rs:419-432
Timestamp: 2025-10-21T02:22:24.403Z
Learning: In Parseable's internal stream creation (`create_internal_stream_if_not_exists` in `src/parseable/mod.rs`), errors should not propagate to fail server initialization. The function creates both pmeta and pbilling internal streams, and failures are logged but the function always returns `Ok(())` to ensure server startup resilience. Individual stream creation failures should not prevent syncing of successfully created streams.

Applied to files:

  • src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-03-28T06:17:01.201Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1276
File: src/prism/logstream/mod.rs:0-0
Timestamp: 2025-03-28T06:17:01.201Z
Learning: In the Parseable datasets API, specific stream names don't need to be logged in error cases because the API is called from the Parseable UI where only authorized users can access and the streams in the request are pre-filtered based on user authorization.

Applied to files:

  • src/handlers/http/modal/utils/logstream_utils.rs
🧬 Code graph analysis (5)
src/handlers/http/modal/server.rs (1)
src/handlers/http/logstream.rs (1)
  • get_affected_resources (498-516)
src/handlers/http/logstream.rs (1)
src/handlers/http/modal/utils/logstream_utils.rs (1)
  • load (132-139)
src/handlers/http/modal/query_server.rs (2)
src/rbac/map.rs (1)
  • get (241-243)
src/handlers/http/logstream.rs (1)
  • get_affected_resources (498-516)
src/handlers/http/modal/ingest_server.rs (1)
src/handlers/http/logstream.rs (1)
  • get_affected_resources (498-516)
src/handlers/http/modal/utils/logstream_utils.rs (1)
src/metadata.rs (1)
  • new (99-136)
🔇 Additional comments (9)
src/handlers/http/modal/query_server.rs (1)

316-323: LGTM!

The new endpoint follows the established routing pattern for logstream-scoped resources and correctly uses resource-based authorization with Action::GetLogstreamAffectedResources.
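
For readers unfamiliar with the pattern being referenced, a bare-bones actix-web sketch of such a route registration follows; it omits the project's authorization wrapper, which the real code applies via Action::GetLogstreamAffectedResources:

use actix_web::web;

// Minimal illustration of registering the route; the actual servers wrap
// the handler with resource-scoped RBAC authorization before serving it.
fn configure(cfg: &mut web::ServiceConfig) {
    cfg.service(
        web::resource("/logstream/{logstream}/affected-resources")
            .route(web::get().to(logstream::get_affected_resources)),
    );
}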

src/handlers/http/modal/server.rs (1)

491-498: LGTM!

The endpoint is consistently wired with the same handler and authorization as in query_server.rs, maintaining uniformity across server modes.

src/handlers/http/modal/ingest_server.rs (1)

275-282: LGTM!

Consistent endpoint implementation across all server modes. The read-only nature of this endpoint makes it appropriate to expose on the ingest server.

src/rbac/role.rs (3)

35-35: LGTM!

The new GetLogstreamAffectedResources action is correctly added and follows the existing pattern for stream-related read operations.


168-168: LGTM!

Correctly classified as a resource-scoped permission, consistent with similar read actions like GetStats and GetSchema.


242-242: LGTM!

Appropriately granted to Editor, Writer, and Reader roles. This is a read-only operation that provides dependency visibility, which all these roles should have access to.

Also applies to: 279-279, 320-320

src/handlers/http/logstream.rs (1)

23-23: LGTM!

Import correctly added for the new LogstreamAffectedResources utility type.

src/handlers/http/modal/utils/logstream_utils.rs (2)

313-327: LGTM!

The From<LogstreamAffectedResourcesError> for StreamError implementation correctly maps error variants, preserving structured errors where possible and falling back to Anyhow for others.
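
A hedged sketch of what such a conversion typically looks like; aside from StreamNotFound and Anyhow, both mentioned in this thread, the variant shapes are illustrative rather than copied from the PR:

// Illustrative only: pass the structured variant through, wrap the rest.
impl From<LogstreamAffectedResourcesError> for StreamError {
    fn from(err: LogstreamAffectedResourcesError) -> Self {
        match err {
            LogstreamAffectedResourcesError::StreamNotFound(name) => {
                StreamError::StreamNotFound(name)
            }
            other => StreamError::Anyhow(anyhow::anyhow!(other)),
        }
    }
}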


91-104: LGTM!

Clean data structures with appropriate Serialize derives for JSON responses. The use of Ulid for IDs is consistent with the codebase patterns.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (2)
src/handlers/http/modal/utils/logstream_utils.rs (2)

189-193: Fragile stream name matching in dashboard tiles.

Using chart_query.contains(stream_name) for matching can produce false positives. For example, a stream named "log" would match a query referencing "catalog" or "logstream_backup".

Consider using a more precise matching approach, such as word-boundary matching or parsing the query to extract the actual stream reference.
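
One way to make the match stricter, sketched with the regex crate (an assumption; the PR itself uses a plain substring test):

use regex::Regex;

// Matches the stream name only at word boundaries, so "log" no longer
// matches "catalog" or "logstream_backup".
fn query_references_stream(chart_query: &str, stream_name: &str) -> bool {
    let pattern = format!(r"\b{}\b", regex::escape(stream_name));
    Regex::new(&pattern)
        .map(|re| re.is_match(chart_query))
        .unwrap_or(false) // treat an unbuildable pattern as "no match"
}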


198-201: Generating new ULID for missing dashboard_id masks data issues.

When dashboard.dashboard_id is None, generating a new Ulid::new() produces a random ID that won't match the actual dashboard. This could confuse API consumers and mask underlying data integrity problems.

Consider returning an error or skipping dashboards with missing IDs instead.

🔎 Suggested alternative
             if !affected_tile_ids.is_empty() {
-                affected_dashboards.push(LogstreamAffectedDashboard { 
-                    dashboard_id: dashboard.dashboard_id.unwrap_or_else(|| {
-                        tracing::warn!("dashboard {}: [id] is missing -- for logstream {}", dash_i, stream_name);
-                        Ulid::new() // default to a new ULID if missing -- what else?
-                    }), 
-                    affected_tile_ids
-                });
+                if let Some(dashboard_id) = dashboard.dashboard_id {
+                    affected_dashboards.push(LogstreamAffectedDashboard { 
+                        dashboard_id, 
+                        affected_tile_ids
+                    });
+                } else {
+                    tracing::warn!("dashboard {}: [id] is missing, skipping -- for logstream {}", dash_i, stream_name);
+                }
             }
🧹 Nitpick comments (3)
src/handlers/http/modal/utils/logstream_utils.rs (3)

131-146: Consider logging when filters lack IDs.

Filters without filter_id are silently dropped (lines 143-144), whereas fetch_affected_dashboards and fetch_affected_alerts log warnings when parsing fails. For consistency and debuggability, consider logging when filters are skipped.

🔎 Suggested enhancement
         Ok(PARSEABLE.metastore.get_filters().await?
             .into_iter()
             .filter_map(|filter| {
                 if filter.stream_name == stream_name && 
                     let Some(f_id) = filter.filter_id { 
                         Some(f_id) 
-                    } else { None }
+                    } else {
+                        if filter.stream_name == stream_name {
+                            tracing::warn!("filter for stream {} has no filter_id, skipping", stream_name);
+                        }
+                        None
+                    }
             }).collect())

242-300: Logic is correct, consider refactoring repetitive pattern matching.

The role fetching logic is sound and handles all privilege types correctly. The pattern matching at lines 262-285 is repetitive but explicit.

🔎 Optional refactoring to reduce repetition

The three privilege arms (Ingestor, Reader, Writer) have identical structure. Consider extracting a helper:

fn extract_stream_from_resource(resource: &ParseableResourceType) -> Option<&str> {
    match resource {
        ParseableResourceType::Stream(stream) => Some(stream.as_str()),
        _ => None
    }
}

Then simplify:

                 let associated_stream = match privilege {
-                    DefaultPrivilege::Ingestor { resource } => {
-                        match resource {
-                            ParseableResourceType::Stream(stream) => stream,
-                            _ => continue
-                        }
-                    },
-
-                    DefaultPrivilege::Reader { resource } => {
-                        match resource {
-                            ParseableResourceType::Stream(stream) => stream,
-                            _ => continue
-                        }
-                    },
-
-                    DefaultPrivilege::Writer { resource } => {
-                        match resource {
-                            ParseableResourceType::Stream(stream) => stream,
-                            _ => continue
-                        }
-                    },
-
+                    DefaultPrivilege::Ingestor { resource } |
+                    DefaultPrivilege::Reader { resource } |
+                    DefaultPrivilege::Writer { resource } => {
+                        match extract_stream_from_resource(resource) {
+                            Some(s) => s,
+                            None => continue
+                        }
+                    },
                     _ => continue
                 };

331-347: Consider simplifying the JSON parsing logic.

The function works correctly but has some style issues:

  1. The return keyword at line 343 is redundant in Rust
  2. The two-step parsing (lines 336-341) is unnecessary overhead
🔎 Suggested simplification
 fn bytes_to_json<T: serde::de::DeserializeOwned>(json_bytes: Bytes) -> Result<T, Bytes2JSONError> {
     if json_bytes.is_empty() {
         return Err(Bytes2JSONError::ZeroSizedBytes);
     }
 
-    let json_bytes_value = match serde_json::from_slice::<serde_json::Value>(&json_bytes) {
-        Ok(value) => value,
-        Err(err) => {
-            return Err(Bytes2JSONError::FailedToParse(format!("{:#?}", err)))
-        }
-    };
-
-    return match serde_json::from_value::<T>(json_bytes_value.clone()) {
-        Ok(parsed_object) => Ok(parsed_object),
-        Err(e) => Err(Bytes2JSONError::FailedToParse(format!("deserialization failed: {:#?}", e)))
-    };
+    serde_json::from_slice::<T>(&json_bytes)
+        .map_err(|e| Bytes2JSONError::FailedToParse(format!("{:#?}", e)))
 }

This is more idiomatic Rust and avoids the unnecessary intermediate Value allocation.

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7def6f7 and cfb0ef1.

📒 Files selected for processing (2)
  • src/handlers/http/logstream.rs
  • src/handlers/http/modal/utils/logstream_utils.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/handlers/http/logstream.rs
🧰 Additional context used
🧠 Learnings (8)
📚 Learning: 2025-09-18T10:08:05.101Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: resources/formats.json:1469-1484
Timestamp: 2025-09-18T10:08:05.101Z
Learning: The "rust_server_logs" format in resources/formats.json was intentionally renamed to "parseable_server_logs" with no backward compatibility concerns because it was unreleased and had no external users depending on it. This was confirmed by nikhilsinhaparseable on PR 1415.

Applied to files:

  • src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-09-09T14:08:45.809Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1427
File: resources/ingest_demo_data.sh:440-440
Timestamp: 2025-09-09T14:08:45.809Z
Learning: In the resources/ingest_demo_data.sh demo script, hardcoded stream names like "demodata" in alert queries should be ignored and not flagged for replacement with $P_STREAM variables.

Applied to files:

  • src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-09-05T09:18:44.813Z
Learnt from: parmesant
Repo: parseablehq/parseable PR: 1425
File: src/query/mod.rs:484-495
Timestamp: 2025-09-05T09:18:44.813Z
Learning: In the Parseable system, stream names and column names cannot contain quotes, which eliminates SQL injection concerns when interpolating these identifiers directly into SQL queries in src/query/mod.rs.

Applied to files:

  • src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-05-01T10:27:56.858Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1305
File: src/handlers/http/users/dashboards.rs:0-0
Timestamp: 2025-05-01T10:27:56.858Z
Learning: The `add_tile()` function in `src/handlers/http/users/dashboards.rs` should use `get_dashboard_by_user(dashboard_id, &user_id)` instead of `get_dashboard(dashboard_id)` to ensure proper authorization checks when modifying a dashboard.

Applied to files:

  • src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-09-09T14:08:38.114Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1427
File: resources/ingest_demo_data.sh:418-418
Timestamp: 2025-09-09T14:08:38.114Z
Learning: In the resources/ingest_demo_data.sh demo script, the hardcoded stream name "demodata" should be kept as-is rather than replaced with the $P_STREAM variable, as this is intentional for demo consistency.

Applied to files:

  • src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-05-01T10:33:51.767Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1305
File: src/handlers/http/users/dashboards.rs:125-148
Timestamp: 2025-05-01T10:33:51.767Z
Learning: When adding a tile to a dashboard in `add_tile()` function, the tile ID must be provided by the client and should not be generated by the server. If the tile ID is missing (nil), the API should fail the operation with an appropriate error message.

Applied to files:

  • src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-06-15T18:18:14.590Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1348
File: src/prism/home/mod.rs:366-368
Timestamp: 2025-06-15T18:18:14.590Z
Learning: In the Parseable dashboard system, dashboard_id is guaranteed to always be present (never None) because IDs are generated at dashboard creation time, and there are no pre-v1 dashboards that could lack IDs.

Applied to files:

  • src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-03-26T06:44:53.362Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1263
File: src/handlers/http/ingest.rs:300-310
Timestamp: 2025-03-26T06:44:53.362Z
Learning: In Parseable, every stream is always associated with a log_source - no stream can exist without a log_source. For otel-traces and otel-metrics, strict restrictions are implemented where ingestion is rejected if a stream already has a different log_source format. However, regular logs from multiple log_sources can coexist in a single stream.

Applied to files:

  • src/handlers/http/modal/utils/logstream_utils.rs
🧬 Code graph analysis (1)
src/handlers/http/modal/utils/logstream_utils.rs (3)
src/parseable/mod.rs (2)
  • storage (284-286)
  • new (180-194)
src/storage/mod.rs (3)
  • from (182-188)
  • new (207-209)
  • new (220-226)
src/metadata.rs (1)
  • new (99-136)
🔇 Additional comments (4)
src/handlers/http/modal/utils/logstream_utils.rs (4)

19-42: LGTM!

The imports and constant definition are appropriate for the new affected resources functionality.


91-119: LGTM!

The data structures are well-designed with appropriate visibility, serialization support, and comprehensive error handling.


210-239: LGTM!

The alert fetching logic correctly uses exact string matching (s == stream_name) to avoid false positives, and properly handles parsing errors with logging.


303-317: LGTM!

The error conversion properly handles specific error types and falls back to anyhow for others.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (4)
src/handlers/http/modal/utils/logstream_utils.rs (4)

122-129: Consider parallelizing independent async operations.

The four fetch methods are independent and could run concurrently using tokio::try_join! to reduce latency. This also consolidates the repeated stream existence checks.

🔎 Suggested refactor
 pub async fn load(stream_name: &str) -> Result<Self, LogstreamAffectedResourcesError> {
+    if !PARSEABLE.streams.contains(stream_name) {
+        return Err(StreamNotFound(stream_name.to_string()).into());
+    }
+
+    let (filters, dashboards, alerts, roles) = tokio::try_join!(
+        Self::fetch_affected_filters_unchecked(stream_name),
+        Self::fetch_affected_dashboards_unchecked(stream_name),
+        Self::fetch_affected_alerts_unchecked(stream_name),
+        Self::fetch_affected_roles_unchecked(stream_name),
+    )?;
+
     Ok(Self {
-        filters: Self::fetch_affected_filters(stream_name).await?,
-        dashboards: Self::fetch_affected_dashboards(stream_name).await?,
-        alerts: Self::fetch_affected_alerts(stream_name).await?,
-        roles: Self::fetch_affected_roles(stream_name).await?,
+        filters,
+        dashboards,
+        alerts,
+        roles,
     })
 }

Then rename the internal methods to _unchecked variants without stream existence checks, keeping the public ones with checks for standalone use.


138-145: Consider more idiomatic filter_map pattern.

The if-let chain syntax works but a standard filter().filter_map() chain may be more readable.

🔎 Suggested refactor
     Ok(PARSEABLE.metastore.get_filters().await?
         .into_iter()
-        .filter_map(|filter| {
-            if filter.stream_name == stream_name && 
-                let Some(f_id) = filter.filter_id { 
-                    Some(f_id) 
-                } else { None }
-        }).collect())
+        .filter(|filter| filter.stream_name == stream_name)
+        .filter_map(|filter| filter.filter_id)
+        .collect())
 }

269-292: Repetitive match arms could be consolidated.

The three privilege variants have identical logic for extracting the stream. Consider consolidating to reduce duplication.

🔎 Suggested refactor
-                let associated_stream = match privilege {
-                    DefaultPrivilege::Ingestor { resource } => {
-                        match resource {
-                            ParseableResourceType::Stream(stream) => stream,
-                            _ => continue
-                        }
-                    },
-
-                    DefaultPrivilege::Reader { resource } => {
-                        match resource {
-                            ParseableResourceType::Stream(stream) => stream,
-                            _ => continue
-                        }
-                    },
-
-                    DefaultPrivilege::Writer { resource } => {
-                        match resource {
-                            ParseableResourceType::Stream(stream) => stream,
-                            _ => continue
-                        }
-                    },
-
-                    _ => continue
-                };
+                let associated_stream = match privilege {
+                    DefaultPrivilege::Ingestor { resource }
+                    | DefaultPrivilege::Reader { resource }
+                    | DefaultPrivilege::Writer { resource } => match resource {
+                        ParseableResourceType::Stream(stream) => stream,
+                        _ => continue,
+                    },
+                    _ => continue,
+                };

338-354: Simplify by parsing directly to target type.

The two-step parsing (bytes → Value → T) adds overhead. Use serde_json::from_slice::<T>() directly. Also, the .clone() on line 350 is unnecessary and the explicit return is non-idiomatic.

🔎 Suggested refactor
 fn bytes_to_json<T: serde::de::DeserializeOwned>(json_bytes: Bytes) -> Result<T, Bytes2JSONError> {
     if json_bytes.is_empty() {
         return Err(Bytes2JSONError::ZeroSizedBytes);
     }

-    let json_bytes_value = match serde_json::from_slice::<serde_json::Value>(&json_bytes) {
-        Ok(value) => value,
-        Err(err) => {
-            return Err(Bytes2JSONError::FailedToParse(format!("{:#?}", err)))
-        }
-    };
-
-    return match serde_json::from_value::<T>(json_bytes_value.clone()) {
-        Ok(parsed_object) => Ok(parsed_object),
-        Err(e) => Err(Bytes2JSONError::FailedToParse(format!("deserialization failed: {:#?}", e)))
-    };
+    serde_json::from_slice::<T>(&json_bytes)
+        .map_err(|e| Bytes2JSONError::FailedToParse(e.to_string()))
 }
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cfb0ef1 and e6c993d.

📒 Files selected for processing (1)
  • src/handlers/http/modal/utils/logstream_utils.rs
🔇 Additional comments (6)
src/handlers/http/modal/utils/logstream_utils.rs (6)

41-42: LGTM!

Good practice documenting the constant's purpose. The field name dbName aligns with the commit message fix for locating stream names in tiles.


91-104: LGTM!

Data structures are well-designed with appropriate derives for serialization and debugging.


106-119: LGTM!

Error type is well-structured with appropriate #[from] conversions for seamless error propagation.


167-169: Deduplication may not filter dashboards with None IDs.

When dashboard_id is None, deduplication keyed on the ID conflates distinct dashboards with missing IDs, since None == None is true. However, this is mitigated by lines 204-211, which skip dashboards with missing IDs anyway.


204-211: Good fix for the missing dashboard_id handling.

The logic now properly skips dashboards with missing IDs instead of generating a random ULID, addressing the previous review concern.


310-324: LGTM!

Error conversion appropriately maps specific errors and wraps others in Anyhow for the catch-all case.

Development

Successfully merging this pull request may close this issue: feat: API for dependency graph (#1498).