Fetch Logstream Affected Resources #1502
base: main
Conversation
Walkthrough
Added a GET endpoint `/logstream/{logstream}/affected-resources` that verifies a stream exists, aggregates dependent resources (filters, dashboards, alerts, roles) via a new `LogstreamAffectedResources` utility, and enforces a new RBAC action `GetLogstreamAffectedResources`.
Sequence Diagram(s)
```mermaid
sequenceDiagram
    participant Client
    participant Handler as HTTP Handler\n(logstream::get_affected_resources)
    participant Auth as RBAC Authorization
    participant Metastore as Metastore / ObjectStorage
    participant Parser as JSON Parser & Utils
    Client->>Handler: GET /logstream/{name}/affected-resources
    Handler->>Auth: authorize Action::GetLogstreamAffectedResources
    Auth-->>Handler: allow / deny
    alt allowed
        Handler->>Metastore: ensure stream exists (load if needed)
        Metastore-->>Handler: stream metadata / existence
        Handler->>Metastore: fetch raw bytes for filters/dashboards/alerts/roles
        Metastore-->>Parser: return raw bytes
        Parser-->>Handler: deserialize & filter -> LogstreamAffectedResources
        Handler-->>Client: 200 OK + JSON payload
    else denied
        Handler-->>Client: 403 Forbidden
    end
```
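For orientation, the flow above can be sketched as a plain-Rust approximation of the aggregated payload. Field and type names follow the ones mentioned in this review (`LogstreamAffectedResources`, `LogstreamAffectedDashboard`); the real structs also derive `Serialize` and use `Ulid` IDs, which are omitted here so the sketch stays self-contained.

```rust
// Hypothetical, trimmed-down sketch of the aggregated response type.
// The actual code derives Serialize and uses Ulid for IDs.

#[derive(Debug, Default, PartialEq)]
struct LogstreamAffectedDashboard {
    dashboard_id: String, // Ulid in the actual code
    affected_tile_ids: Vec<String>,
}

#[derive(Debug, Default, PartialEq)]
struct LogstreamAffectedResources {
    filters: Vec<String>,
    dashboards: Vec<LogstreamAffectedDashboard>,
    alerts: Vec<String>,
    roles: Vec<String>,
}

impl LogstreamAffectedResources {
    // True when deleting the stream would affect nothing.
    fn is_empty(&self) -> bool {
        self.filters.is_empty()
            && self.dashboards.is_empty()
            && self.alerts.is_empty()
            && self.roles.is_empty()
    }
}

fn main() {
    let empty = LogstreamAffectedResources::default();
    assert!(empty.is_empty());

    let with_filter = LogstreamAffectedResources {
        filters: vec!["f1".to_string()],
        ..Default::default()
    };
    assert!(!with_filter.is_empty());
    println!("ok");
}
```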
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
CLA Assistant Lite bot: I have read the CLA Document and I hereby sign the CLA. You can retrigger this bot by commenting `recheck` in this Pull Request.
Actionable comments posted: 2
🧹 Nitpick comments (4)
src/handlers/http/logstream.rs (1)
498-516: Fix typo in variable name.
The variable `affecred_resources` on line 511 has a typo; it should be `affected_resources`.
🔎 Suggested fix
```diff
 match LogstreamAffectedResources::load(&stream_name).await {
-    Ok(affecred_resources) => Ok((web::Json(affecred_resources), StatusCode::OK)),
+    Ok(affected_resources) => Ok((web::Json(affected_resources), StatusCode::OK)),
     Err(err) => Err(err.into())
 }
```
src/handlers/http/modal/utils/logstream_utils.rs (3)
122-131: Stale doc comment.
The doc comment mentions returning "A tuple where: First element: `true` if no resources are affected..." but the function returns `Result<Self, LogstreamAffectedResourcesError>`. Update or remove the outdated documentation.
🔎 Suggested fix
```diff
 /// Load all resources that will be affected if the given logstream is deleted.
 ///
 /// ### Arguments
 /// - `stream_name` - The name of the logstream to check for dependencies
 ///
 /// ### Returns
-/// A tuple where:
-/// - First element: `true` if no resources are affected (empty loaded struct), `false` otherwise
-/// - Second element: The populated `LogstreamAffectedResources` struct
-
+/// A `LogstreamAffectedResources` struct populated with all dependent resources,
+/// or an error if the stream doesn't exist or fetching fails.
 pub async fn load(stream_name: &str) -> Result<Self, LogstreamAffectedResourcesError> {
```
141-156: Redundant stream existence checks across fetch methods.
Each `fetch_affected_*` method independently checks `PARSEABLE.streams.contains(stream_name)`. Since `load()` calls all four methods sequentially, consider checking stream existence once in `load()` before calling the fetch methods, or make the individual methods private and trust the caller.
This would reduce redundant lookups and centralize the validation logic.
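The suggested consolidation can be sketched in a minimal, self-contained form. `Streams`, `LoadError`, and the fetch helper here are hypothetical stand-ins for `PARSEABLE.streams` and the real `fetch_affected_*` methods; the point is only the single validation site in `load()`.

```rust
// Hypothetical stand-ins: in the real code, the registry is PARSEABLE.streams
// and the fetch helpers hit the metastore.
#[derive(Debug, PartialEq)]
enum LoadError {
    StreamNotFound(String),
}

struct Streams(Vec<String>);

impl Streams {
    fn contains(&self, name: &str) -> bool {
        self.0.iter().any(|s| s == name)
    }
}

// Private helper: no longer re-checks existence, trusts the caller.
fn fetch_affected_filters(_stream: &str) -> Vec<String> {
    vec![]
}

fn load(streams: &Streams, stream_name: &str) -> Result<Vec<String>, LoadError> {
    // Single validation point before any fetches run.
    if !streams.contains(stream_name) {
        return Err(LoadError::StreamNotFound(stream_name.to_string()));
    }
    Ok(fetch_affected_filters(stream_name))
}

fn main() {
    let streams = Streams(vec!["app".to_string()]);
    assert!(load(&streams, "app").is_ok());
    assert_eq!(
        load(&streams, "missing"),
        Err(LoadError::StreamNotFound("missing".to_string()))
    );
    println!("ok");
}
```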
341-357: Simplify `bytes_to_json` to avoid double deserialization.
The function first deserializes to `serde_json::Value`, then deserializes again to `T`. This is inefficient; use `serde_json::from_slice::<T>()` directly.
🔎 Suggested fix
```diff
 fn bytes_to_json<T: serde::de::DeserializeOwned>(json_bytes: Bytes) -> Result<T, Bytes2JSONError> {
     if json_bytes.is_empty() {
         return Err(Bytes2JSONError::ZeroSizedBytes);
     }
-    let json_bytes_value = match serde_json::from_slice::<serde_json::Value>(&json_bytes) {
-        Ok(value) => value,
-        Err(err) => {
-            return Err(Bytes2JSONError::FailedToParse(format!("{:#?}", err)))
-        }
-    };
-
-    return match serde_json::from_value::<T>(json_bytes_value.clone()) {
-        Ok(parsed_object) => Ok(parsed_object),
-        Err(e) => Err(Bytes2JSONError::FailedToParse(format!("deserialization failed: {:#?}", e)))
-    };
+    serde_json::from_slice::<T>(&json_bytes)
+        .map_err(|e| Bytes2JSONError::FailedToParse(format!("{:#?}", e)))
 }
```
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`Cargo.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (6)
src/handlers/http/logstream.rs
src/handlers/http/modal/ingest_server.rs
src/handlers/http/modal/query_server.rs
src/handlers/http/modal/server.rs
src/handlers/http/modal/utils/logstream_utils.rs
src/rbac/role.rs
🧰 Additional context used
🧠 Learnings (11)
📚 Learning: 2025-10-28T02:10:41.140Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1453
File: src/parseable/mod.rs:397-400
Timestamp: 2025-10-28T02:10:41.140Z
Learning: In Parseable enterprise deployments with multiple query nodes, hot tier configuration must be persisted in object storage so that newly started query nodes can fetch and synchronize the hot tier settings at startup (file: src/parseable/mod.rs, function: create_stream_and_schema_from_storage).
Applied to files:
src/handlers/http/logstream.rs
📚 Learning: 2025-08-25T01:31:41.786Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metadata.rs:63-68
Timestamp: 2025-08-25T01:31:41.786Z
Learning: The TOTAL_EVENTS_INGESTED_DATE, TOTAL_EVENTS_INGESTED_SIZE_DATE, and TOTAL_EVENTS_STORAGE_SIZE_DATE metrics in src/metadata.rs and src/storage/object_storage.rs are designed to track total events across all streams, not per-stream. They use labels [origin, parsed_date] to aggregate by format and date, while per-stream metrics use [stream_name, origin, parsed_date] labels.
Applied to files:
src/handlers/http/logstream.rs
src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-08-25T01:32:25.980Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metrics/mod.rs:163-173
Timestamp: 2025-08-25T01:32:25.980Z
Learning: The TOTAL_EVENTS_INGESTED_DATE, TOTAL_EVENTS_INGESTED_SIZE_DATE, and TOTAL_EVENTS_STORAGE_SIZE_DATE metrics in src/metrics/mod.rs are intentionally designed to track global totals across all streams for a given date, using labels ["format", "date"] rather than per-stream labels. This is the correct design for global aggregation purposes.
Applied to files:
src/handlers/http/logstream.rs
src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-10-20T17:48:53.444Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1448
File: src/handlers/http/cluster/mod.rs:1370-1400
Timestamp: 2025-10-20T17:48:53.444Z
Learning: In src/handlers/http/cluster/mod.rs, the billing metrics processing logic should NOT accumulate counter values from multiple Prometheus samples with the same labels. The intended behavior is to convert each received counter from nodes into individual events for ingestion, using `.insert()` to store the counter value directly.
Applied to files:
src/handlers/http/logstream.rs
📚 Learning: 2025-09-18T09:59:20.177Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metrics/mod.rs:700-756
Timestamp: 2025-09-18T09:59:20.177Z
Learning: In src/event/mod.rs, the parsed_timestamp used in increment_events_ingested_by_date() is correctly UTC-normalized: for dynamic streams it remains Utc::now(), and for streams with time partition enabled it uses the time partition value. Both cases result in proper UTC date strings for metrics labeling, preventing double-counting issues.
Applied to files:
src/handlers/http/logstream.rs
📚 Learning: 2025-08-18T19:10:11.941Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1405
File: src/handlers/http/ingest.rs:163-164
Timestamp: 2025-08-18T19:10:11.941Z
Learning: Field statistics calculation in src/storage/field_stats.rs uses None for the time_partition parameter when calling flatten_and_push_logs(), as field stats generation does not require time partition functionality.
Applied to files:
src/handlers/http/logstream.rs
📚 Learning: 2025-02-14T09:49:25.818Z
Learnt from: de-sh
Repo: parseablehq/parseable PR: 1185
File: src/handlers/http/logstream.rs:255-261
Timestamp: 2025-02-14T09:49:25.818Z
Learning: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern `!PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?)` ensures proper error handling in both modes.
Applied to files:
src/handlers/http/logstream.rs
src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-09-06T04:26:17.191Z
Learnt from: parmesant
Repo: parseablehq/parseable PR: 1424
File: src/enterprise/utils.rs:65-72
Timestamp: 2025-09-06T04:26:17.191Z
Learning: In Parseable's metastore implementation, MetastoreError::to_detail() returns a MetastoreErrorDetail struct (not a string), which contains structured error information including operation, message, stream_name, and other contextual fields. This struct is designed to be boxed in ObjectStorageError::MetastoreError(Box<MetastoreErrorDetail>).
Applied to files:
src/handlers/http/logstream.rs
src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-09-18T09:52:07.554Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/storage/object_storage.rs:173-177
Timestamp: 2025-09-18T09:52:07.554Z
Learning: In Parseable's upload system (src/storage/object_storage.rs), the update_storage_metrics function can safely use path.metadata().map_err() to fail on local file metadata read failures because parquet validation (validate_uploaded_parquet_file) ensures file integrity before this step, and the system guarantees local staging files remain accessible throughout the upload flow.
Applied to files:
src/handlers/http/logstream.rs
📚 Learning: 2025-10-21T02:22:24.403Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1448
File: src/parseable/mod.rs:419-432
Timestamp: 2025-10-21T02:22:24.403Z
Learning: In Parseable's internal stream creation (`create_internal_stream_if_not_exists` in `src/parseable/mod.rs`), errors should not propagate to fail server initialization. The function creates both pmeta and pbilling internal streams, and failures are logged but the function always returns `Ok(())` to ensure server startup resilience. Individual stream creation failures should not prevent syncing of successfully created streams.
Applied to files:
src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-03-28T06:17:01.201Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1276
File: src/prism/logstream/mod.rs:0-0
Timestamp: 2025-03-28T06:17:01.201Z
Learning: In the Parseable datasets API, specific stream names don't need to be logged in error cases because the API is called from the Parseable UI where only authorized users can access and the streams in the request are pre-filtered based on user authorization.
Applied to files:
src/handlers/http/modal/utils/logstream_utils.rs
🧬 Code graph analysis (5)
src/handlers/http/modal/server.rs (1)
src/handlers/http/logstream.rs (1)
get_affected_resources(498-516)
src/handlers/http/logstream.rs (1)
src/handlers/http/modal/utils/logstream_utils.rs (1)
load(132-139)
src/handlers/http/modal/query_server.rs (2)
src/rbac/map.rs (1)
get(241-243)
src/handlers/http/logstream.rs (1)
get_affected_resources(498-516)
src/handlers/http/modal/ingest_server.rs (1)
src/handlers/http/logstream.rs (1)
get_affected_resources(498-516)
src/handlers/http/modal/utils/logstream_utils.rs (1)
src/metadata.rs (1)
new(99-136)
🔇 Additional comments (9)
src/handlers/http/modal/query_server.rs (1)
316-323: LGTM! The new endpoint follows the established routing pattern for logstream-scoped resources and correctly uses resource-based authorization with `Action::GetLogstreamAffectedResources`.
src/handlers/http/modal/server.rs (1)
491-498: LGTM! The endpoint is consistently wired with the same handler and authorization as in `query_server.rs`, maintaining uniformity across server modes.
src/handlers/http/modal/ingest_server.rs (1)
275-282: LGTM! Consistent endpoint implementation across all server modes. The read-only nature of this endpoint makes it appropriate to expose on the ingest server.
src/rbac/role.rs (3)
35-35: LGTM! The new `GetLogstreamAffectedResources` action is correctly added and follows the existing pattern for stream-related read operations.
168-168: LGTM! Correctly classified as a resource-scoped permission, consistent with similar read actions like `GetStats` and `GetSchema`.
242-242: LGTM! Appropriately granted to Editor, Writer, and Reader roles. This is a read-only operation that provides dependency visibility, which all these roles should have access to.
Also applies to: 279-279, 320-320
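To make the "resource-scoped" distinction concrete, here is a heavily trimmed, hypothetical stand-in for the `Action` enum in `src/rbac/role.rs`; the real enum has many more variants and the real authorization path checks a stream name against the user's privileges.

```rust
// Hypothetical, trimmed stand-in for the RBAC Action enum.
#[derive(Debug, PartialEq)]
enum Action {
    GetStats,
    GetSchema,
    GetLogstreamAffectedResources,
    PutUser, // not stream-scoped, for contrast
}

// Resource-scoped actions are authorized against a specific stream,
// rather than globally.
fn is_stream_scoped(action: &Action) -> bool {
    matches!(
        action,
        Action::GetStats | Action::GetSchema | Action::GetLogstreamAffectedResources
    )
}

fn main() {
    assert!(is_stream_scoped(&Action::GetLogstreamAffectedResources));
    assert!(is_stream_scoped(&Action::GetStats));
    assert!(!is_stream_scoped(&Action::PutUser));
    println!("ok");
}
```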
src/handlers/http/logstream.rs (1)
23-23: LGTM! Import correctly added for the new `LogstreamAffectedResources` utility type.
src/handlers/http/modal/utils/logstream_utils.rs (2)
313-327: LGTM! The `From<LogstreamAffectedResourcesError> for StreamError` implementation correctly maps error variants, preserving structured errors where possible and falling back to `Anyhow` for others.
91-104: LGTM! Clean data structures with appropriate `Serialize` derives for JSON responses. The use of `Ulid` for IDs is consistent with the codebase patterns.
Actionable comments posted: 0
♻️ Duplicate comments (2)
src/handlers/http/modal/utils/logstream_utils.rs (2)
189-193: Fragile stream name matching in dashboard tiles.
Using `chart_query.contains(stream_name)` for matching can produce false positives. For example, a stream named `"log"` would match a query referencing `"catalog"` or `"logstream_backup"`.
Consider using a more precise matching approach, such as word-boundary matching or parsing the query to extract the actual stream reference.
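One stdlib-only way to implement the word-boundary matching suggested here (the helper names are hypothetical): a match only counts when it is not flanked by identifier characters.

```rust
// A substring match counts only if neither neighbor is an identifier byte.
fn is_ident_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

fn query_references_stream(query: &str, stream: &str) -> bool {
    query.match_indices(stream).any(|(start, _)| {
        let bytes = query.as_bytes();
        let end = start + stream.len();
        let before_ok = start == 0 || !is_ident_byte(bytes[start - 1]);
        let after_ok = end == query.len() || !is_ident_byte(bytes[end]);
        before_ok && after_ok
    })
}

fn main() {
    assert!(query_references_stream("select * from log", "log"));
    // bare `contains` would wrongly match both of these:
    assert!(!query_references_stream("select * from catalog", "log"));
    assert!(!query_references_stream("select * from logstream_backup", "log"));
    println!("ok");
}
```

Parsing the query properly would still be more robust (quoted identifiers, case folding), but this removes the obvious substring false positives.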
198-201: Generating a new ULID for missing dashboard_id masks data issues.
When `dashboard.dashboard_id` is `None`, generating a new `Ulid::new()` produces a random ID that won't match the actual dashboard. This could confuse API consumers and mask underlying data integrity problems.
Consider returning an error or skipping dashboards with missing IDs instead.
🔎 Suggested alternative
```diff
 if !affected_tile_ids.is_empty() {
-    affected_dashboards.push(LogstreamAffectedDashboard {
-        dashboard_id: dashboard.dashboard_id.unwrap_or_else(|| {
-            tracing::warn!("dashboard {}: [id] is missing -- for logstream {}", dash_i, stream_name);
-            Ulid::new() // default to a new ULID if missing -- what else?
-        }),
-        affected_tile_ids
-    });
+    if let Some(dashboard_id) = dashboard.dashboard_id {
+        affected_dashboards.push(LogstreamAffectedDashboard {
+            dashboard_id,
+            affected_tile_ids
+        });
+    } else {
+        tracing::warn!("dashboard {}: [id] is missing, skipping -- for logstream {}", dash_i, stream_name);
+    }
 }
```
🧹 Nitpick comments (3)
src/handlers/http/modal/utils/logstream_utils.rs (3)
131-146: Consider logging when filters lack IDs.
Filters without `filter_id` are silently dropped (lines 143-144), whereas `fetch_affected_dashboards` and `fetch_affected_alerts` log warnings when parsing fails. For consistency and debuggability, consider logging when filters are skipped.
🔎 Suggested enhancement
```diff
 Ok(PARSEABLE.metastore.get_filters().await?
     .into_iter()
     .filter_map(|filter| {
         if filter.stream_name == stream_name &&
             let Some(f_id) = filter.filter_id {
             Some(f_id)
-        } else { None }
+        } else {
+            if filter.stream_name == stream_name {
+                tracing::warn!("filter for stream {} has no filter_id, skipping", stream_name);
+            }
+            None
+        }
     }).collect())
```
242-300: Logic is correct; consider refactoring repetitive pattern matching.
The role fetching logic is sound and handles all privilege types correctly. The pattern matching at lines 262-285 is repetitive but explicit.
🔎 Optional refactoring to reduce repetition
The three privilege arms (Ingestor, Reader, Writer) have identical structure. Consider extracting a helper:
```rust
fn extract_stream_from_resource(resource: &ParseableResourceType) -> Option<&str> {
    match resource {
        ParseableResourceType::Stream(stream) => Some(stream.as_str()),
        _ => None
    }
}
```
Then simplify:
```diff
 let associated_stream = match privilege {
-    DefaultPrivilege::Ingestor { resource } => {
-        match resource {
-            ParseableResourceType::Stream(stream) => stream,
-            _ => continue
-        }
-    },
-
-    DefaultPrivilege::Reader { resource } => {
-        match resource {
-            ParseableResourceType::Stream(stream) => stream,
-            _ => continue
-        }
-    },
-
-    DefaultPrivilege::Writer { resource } => {
-        match resource {
-            ParseableResourceType::Stream(stream) => stream,
-            _ => continue
-        }
-    },
-
+    DefaultPrivilege::Ingestor { resource } |
+    DefaultPrivilege::Reader { resource } |
+    DefaultPrivilege::Writer { resource } => {
+        match extract_stream_from_resource(resource) {
+            Some(s) => s,
+            None => continue
+        }
+    },
     _ => continue
 };
```
331-347: Consider simplifying the JSON parsing logic.
The function works correctly but has some style issues:
- The `return` keyword at line 343 is redundant in Rust
- The two-step parsing (lines 336-341) is unnecessary overhead
🔎 Suggested simplification
```diff
 fn bytes_to_json<T: serde::de::DeserializeOwned>(json_bytes: Bytes) -> Result<T, Bytes2JSONError> {
     if json_bytes.is_empty() {
         return Err(Bytes2JSONError::ZeroSizedBytes);
     }
-    let json_bytes_value = match serde_json::from_slice::<serde_json::Value>(&json_bytes) {
-        Ok(value) => value,
-        Err(err) => {
-            return Err(Bytes2JSONError::FailedToParse(format!("{:#?}", err)))
-        }
-    };
-
-    return match serde_json::from_value::<T>(json_bytes_value.clone()) {
-        Ok(parsed_object) => Ok(parsed_object),
-        Err(e) => Err(Bytes2JSONError::FailedToParse(format!("deserialization failed: {:#?}", e)))
-    };
+    serde_json::from_slice::<T>(&json_bytes)
+        .map_err(|e| Bytes2JSONError::FailedToParse(format!("{:#?}", e)))
 }
```
This is more idiomatic Rust and avoids the unnecessary intermediate `Value` allocation.
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/handlers/http/logstream.rs
src/handlers/http/modal/utils/logstream_utils.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- src/handlers/http/logstream.rs
🧰 Additional context used
🧠 Learnings (14)
📚 Learning: 2025-09-18T10:08:05.101Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: resources/formats.json:1469-1484
Timestamp: 2025-09-18T10:08:05.101Z
Learning: The "rust_server_logs" format in resources/formats.json was intentionally renamed to "parseable_server_logs" with no backward compatibility concerns because it was unreleased and had no external users depending on it. This was confirmed by nikhilsinhaparseable on PR 1415.
Applied to files:
src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-09-09T14:08:45.809Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1427
File: resources/ingest_demo_data.sh:440-440
Timestamp: 2025-09-09T14:08:45.809Z
Learning: In the resources/ingest_demo_data.sh demo script, hardcoded stream names like "demodata" in alert queries should be ignored and not flagged for replacement with $P_STREAM variables.
Applied to files:
src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-02-14T09:49:25.818Z
Learnt from: de-sh
Repo: parseablehq/parseable PR: 1185
File: src/handlers/http/logstream.rs:255-261
Timestamp: 2025-02-14T09:49:25.818Z
Learning: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern `!PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?)` ensures proper error handling in both modes.
Applied to files:
src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-09-05T09:18:44.813Z
Learnt from: parmesant
Repo: parseablehq/parseable PR: 1425
File: src/query/mod.rs:484-495
Timestamp: 2025-09-05T09:18:44.813Z
Learning: In the Parseable system, stream names and column names cannot contain quotes, which eliminates SQL injection concerns when interpolating these identifiers directly into SQL queries in src/query/mod.rs.
Applied to files:
src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-05-01T10:27:56.858Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1305
File: src/handlers/http/users/dashboards.rs:0-0
Timestamp: 2025-05-01T10:27:56.858Z
Learning: The `add_tile()` function in `src/handlers/http/users/dashboards.rs` should use `get_dashboard_by_user(dashboard_id, &user_id)` instead of `get_dashboard(dashboard_id)` to ensure proper authorization checks when modifying a dashboard.
Applied to files:
src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-09-09T14:08:38.114Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1427
File: resources/ingest_demo_data.sh:418-418
Timestamp: 2025-09-09T14:08:38.114Z
Learning: In the resources/ingest_demo_data.sh demo script, the hardcoded stream name "demodata" should be kept as-is rather than replaced with the $P_STREAM variable, as this is intentional for demo consistency.
Applied to files:
src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-03-28T06:17:01.201Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1276
File: src/prism/logstream/mod.rs:0-0
Timestamp: 2025-03-28T06:17:01.201Z
Learning: In the Parseable datasets API, specific stream names don't need to be logged in error cases because the API is called from the Parseable UI where only authorized users can access and the streams in the request are pre-filtered based on user authorization.
Applied to files:
src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-05-01T10:33:51.767Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1305
File: src/handlers/http/users/dashboards.rs:125-148
Timestamp: 2025-05-01T10:33:51.767Z
Learning: When adding a tile to a dashboard in `add_tile()` function, the tile ID must be provided by the client and should not be generated by the server. If the tile ID is missing (nil), the API should fail the operation with an appropriate error message.
Applied to files:
src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-06-15T18:18:14.590Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1348
File: src/prism/home/mod.rs:366-368
Timestamp: 2025-06-15T18:18:14.590Z
Learning: In the Parseable dashboard system, dashboard_id is guaranteed to always be present (never None) because IDs are generated at dashboard creation time, and there are no pre-v1 dashboards that could lack IDs.
Applied to files:
src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-08-25T01:31:41.786Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metadata.rs:63-68
Timestamp: 2025-08-25T01:31:41.786Z
Learning: The TOTAL_EVENTS_INGESTED_DATE, TOTAL_EVENTS_INGESTED_SIZE_DATE, and TOTAL_EVENTS_STORAGE_SIZE_DATE metrics in src/metadata.rs and src/storage/object_storage.rs are designed to track total events across all streams, not per-stream. They use labels [origin, parsed_date] to aggregate by format and date, while per-stream metrics use [stream_name, origin, parsed_date] labels.
Applied to files:
src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-08-25T01:32:25.980Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metrics/mod.rs:163-173
Timestamp: 2025-08-25T01:32:25.980Z
Learning: The TOTAL_EVENTS_INGESTED_DATE, TOTAL_EVENTS_INGESTED_SIZE_DATE, and TOTAL_EVENTS_STORAGE_SIZE_DATE metrics in src/metrics/mod.rs are intentionally designed to track global totals across all streams for a given date, using labels ["format", "date"] rather than per-stream labels. This is the correct design for global aggregation purposes.
Applied to files:
src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-10-21T02:22:24.403Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1448
File: src/parseable/mod.rs:419-432
Timestamp: 2025-10-21T02:22:24.403Z
Learning: In Parseable's internal stream creation (`create_internal_stream_if_not_exists` in `src/parseable/mod.rs`), errors should not propagate to fail server initialization. The function creates both pmeta and pbilling internal streams, and failures are logged but the function always returns `Ok(())` to ensure server startup resilience. Individual stream creation failures should not prevent syncing of successfully created streams.
Applied to files:
src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-09-06T04:26:17.191Z
Learnt from: parmesant
Repo: parseablehq/parseable PR: 1424
File: src/enterprise/utils.rs:65-72
Timestamp: 2025-09-06T04:26:17.191Z
Learning: In Parseable's metastore implementation, MetastoreError::to_detail() returns a MetastoreErrorDetail struct (not a string), which contains structured error information including operation, message, stream_name, and other contextual fields. This struct is designed to be boxed in ObjectStorageError::MetastoreError(Box<MetastoreErrorDetail>).
Applied to files:
src/handlers/http/modal/utils/logstream_utils.rs
📚 Learning: 2025-03-26T06:44:53.362Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1263
File: src/handlers/http/ingest.rs:300-310
Timestamp: 2025-03-26T06:44:53.362Z
Learning: In Parseable, every stream is always associated with a log_source - no stream can exist without a log_source. For otel-traces and otel-metrics, strict restrictions are implemented where ingestion is rejected if a stream already has a different log_source format. However, regular logs from multiple log_sources can coexist in a single stream.
Applied to files:
src/handlers/http/modal/utils/logstream_utils.rs
🧬 Code graph analysis (1)
src/handlers/http/modal/utils/logstream_utils.rs (3)
src/parseable/mod.rs (2)
storage(284-286)
new(180-194)
src/storage/mod.rs (3)
from(182-188)
new(207-209)
new(220-226)
src/metadata.rs (1)
new(99-136)
🔇 Additional comments (4)
src/handlers/http/modal/utils/logstream_utils.rs (4)
19-42: LGTM! The imports and constant definition are appropriate for the new affected resources functionality.
91-119: LGTM! The data structures are well-designed with appropriate visibility, serialization support, and comprehensive error handling.
210-239: LGTM! The alert fetching logic correctly uses exact string matching (`s == stream_name`) to avoid false positives, and properly handles parsing errors with logging.
303-317: LGTM! The error conversion properly handles specific error types and falls back to anyhow for others.
Actionable comments posted: 0
🧹 Nitpick comments (4)
src/handlers/http/modal/utils/logstream_utils.rs (4)
122-129: Consider parallelizing independent async operations.
The four fetch methods are independent and could run concurrently using `tokio::try_join!` to reduce latency. This also consolidates the repeated stream existence checks.
🔎 Suggested refactor
```diff
 pub async fn load(stream_name: &str) -> Result<Self, LogstreamAffectedResourcesError> {
+    if !PARSEABLE.streams.contains(stream_name) {
+        return Err(StreamNotFound(stream_name.to_string()).into());
+    }
+
+    let (filters, dashboards, alerts, roles) = tokio::try_join!(
+        Self::fetch_affected_filters_unchecked(stream_name),
+        Self::fetch_affected_dashboards_unchecked(stream_name),
+        Self::fetch_affected_alerts_unchecked(stream_name),
+        Self::fetch_affected_roles_unchecked(stream_name),
+    )?;
+
     Ok(Self {
-        filters: Self::fetch_affected_filters(stream_name).await?,
-        dashboards: Self::fetch_affected_dashboards(stream_name).await?,
-        alerts: Self::fetch_affected_alerts(stream_name).await?,
-        roles: Self::fetch_affected_roles(stream_name).await?,
+        filters,
+        dashboards,
+        alerts,
+        roles,
     })
 }
```
Then rename the internal methods to `_unchecked` variants without stream existence checks, keeping the public ones with checks for standalone use.
138-145: Consider a more idiomatic filter_map pattern.
The if-let chain syntax works, but a standard `filter().filter_map()` chain may be more readable.
🔎 Suggested refactor
```diff
 Ok(PARSEABLE.metastore.get_filters().await?
     .into_iter()
-    .filter_map(|filter| {
-        if filter.stream_name == stream_name &&
-            let Some(f_id) = filter.filter_id {
-            Some(f_id)
-        } else { None }
-    }).collect())
+    .filter(|filter| filter.stream_name == stream_name)
+    .filter_map(|filter| filter.filter_id)
+    .collect())
```
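The suggested chain can be demonstrated with a stand-in `Filter` type (the real filter type lives in the metastore layer, so this is a self-contained approximation):

```rust
// Stand-in for the metastore filter record.
struct Filter {
    stream_name: String,
    filter_id: Option<String>,
}

fn affected_filter_ids(filters: Vec<Filter>, stream_name: &str) -> Vec<String> {
    filters
        .into_iter()
        .filter(|f| f.stream_name == stream_name) // keep this stream only
        .filter_map(|f| f.filter_id)              // drop filters without an ID
        .collect()
}

fn main() {
    let filters = vec![
        Filter { stream_name: "app".into(), filter_id: Some("f1".into()) },
        Filter { stream_name: "app".into(), filter_id: None },
        Filter { stream_name: "other".into(), filter_id: Some("f2".into()) },
    ];
    assert_eq!(affected_filter_ids(filters, "app"), vec!["f1".to_string()]);
    println!("ok");
}
```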
269-292: Repetitive match arms could be consolidated.
The three privilege variants have identical logic for extracting the stream. Consider consolidating to reduce duplication.
🔎 Suggested refactor
```diff
-    let associated_stream = match privilege {
-        DefaultPrivilege::Ingestor { resource } => {
-            match resource {
-                ParseableResourceType::Stream(stream) => stream,
-                _ => continue
-            }
-        },
-
-        DefaultPrivilege::Reader { resource } => {
-            match resource {
-                ParseableResourceType::Stream(stream) => stream,
-                _ => continue
-            }
-        },
-
-        DefaultPrivilege::Writer { resource } => {
-            match resource {
-                ParseableResourceType::Stream(stream) => stream,
-                _ => continue
-            }
-        },
-
-        _ => continue
-    };
+    let associated_stream = match privilege {
+        DefaultPrivilege::Ingestor { resource }
+        | DefaultPrivilege::Reader { resource }
+        | DefaultPrivilege::Writer { resource } => match resource {
+            ParseableResourceType::Stream(stream) => stream,
+            _ => continue,
+        },
+        _ => continue,
+    };
```
338-354: Simplify by parsing directly to the target type.
The two-step parsing (bytes → Value → T) adds overhead. Use `serde_json::from_slice::<T>()` directly. Also, the `.clone()` on line 350 is unnecessary and the explicit `return` is non-idiomatic.
🔎 Suggested refactor
```diff
 fn bytes_to_json<T: serde::de::DeserializeOwned>(json_bytes: Bytes) -> Result<T, Bytes2JSONError> {
     if json_bytes.is_empty() {
         return Err(Bytes2JSONError::ZeroSizedBytes);
     }
-    let json_bytes_value = match serde_json::from_slice::<serde_json::Value>(&json_bytes) {
-        Ok(value) => value,
-        Err(err) => {
-            return Err(Bytes2JSONError::FailedToParse(format!("{:#?}", err)))
-        }
-    };
-
-    return match serde_json::from_value::<T>(json_bytes_value.clone()) {
-        Ok(parsed_object) => Ok(parsed_object),
-        Err(e) => Err(Bytes2JSONError::FailedToParse(format!("deserialization failed: {:#?}", e)))
-    };
+    serde_json::from_slice::<T>(&json_bytes)
+        .map_err(|e| Bytes2JSONError::FailedToParse(e.to_string()))
 }
```
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/handlers/http/modal/utils/logstream_utils.rs
🔇 Additional comments (6)
src/handlers/http/modal/utils/logstream_utils.rs (6)
41-42: LGTM! Good practice documenting the constant's purpose. The field name `dbName` aligns with the commit-message fix for locating stream names in tiles.
91-104: LGTM! Data structures are well-designed with appropriate derives for serialization and debugging.
106-119: LGTM! Error type is well-structured with appropriate `#[from]` conversions for seamless error propagation.
167-169: Deduplication collapses dashboards with `None` IDs. When `dashboard_id` is `None`, distinct dashboards with missing IDs compare equal (`None == None`), so deduplication keeps only the first of them. However, this is mitigated by lines 204-211, which skip dashboards with missing IDs anyway.
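The pitfall is easy to reproduce: keying deduplication on an `Option` treats every missing ID as the same key. A small stand-alone sketch (the item tuples and names are hypothetical, not the PR's dashboard type):

```rust
use std::collections::HashSet;

// Deduplicate (id, name) pairs by id. Because None == None, every entry
// without an id shares one key, so only the first ID-less entry survives.
fn dedup_by_id<'a>(items: Vec<(Option<&'a str>, &'a str)>) -> Vec<&'a str> {
    let mut seen = HashSet::new();
    items
        .into_iter()
        .filter(|(id, _)| seen.insert(*id)) // insert returns false for a repeated key
        .map(|(_, name)| name)
        .collect()
}

fn main() {
    let kept = dedup_by_id(vec![
        (Some("d1"), "errors"),
        (None, "latency"),
        (None, "throughput"),       // dropped: shares the None key with "latency"
        (Some("d1"), "errors-copy"), // dropped: duplicate id
    ]);
    assert_eq!(kept, vec!["errors", "latency"]);
    println!("{:?}", kept);
}
```

Skipping ID-less entries before deduplication, as the handler now does, sidesteps the ambiguity entirely.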
204-211: Good fix for the missing `dashboard_id` handling. The logic now properly skips dashboards with missing IDs instead of generating a random ULID, addressing the previous review concern.
310-324: LGTM! Error conversion appropriately maps specific errors and wraps others in `Anyhow` for the catch-all case.
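What `#[from]` generates can be sketched by hand with stdlib only. The names below mirror the PR's error types, but the implementation is illustrative, not the PR's actual code:

```rust
use std::fmt;

#[derive(Debug)]
enum HandlerError {
    Parse(String),
    Anyhow(String), // catch-all wrapper for errors with no dedicated variant
}

struct Bytes2JSONError(String);

// This is roughly what `#[from] Bytes2JSONError` on a variant expands to:
// a From impl that lets `?` convert the source error automatically.
impl From<Bytes2JSONError> for HandlerError {
    fn from(e: Bytes2JSONError) -> Self {
        HandlerError::Parse(e.0)
    }
}

impl fmt::Display for HandlerError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            HandlerError::Parse(m) => write!(f, "parse error: {m}"),
            HandlerError::Anyhow(m) => write!(f, "internal error: {m}"),
        }
    }
}

fn parse_step() -> Result<(), HandlerError> {
    let raw: Result<(), Bytes2JSONError> = Err(Bytes2JSONError("zero sized bytes".into()));
    raw?; // `?` routes through From<Bytes2JSONError> for HandlerError
    Ok(())
}

fn main() {
    let err = parse_step().unwrap_err();
    assert_eq!(err.to_string(), "parse error: zero sized bytes");
    println!("{err}");
}
```

Mapping known errors to dedicated variants while funneling everything else through one catch-all keeps the HTTP handler's `?` chains short without losing error detail.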
Fixes #1498.
Description
This PR has:
Summary by CodeRabbit
New Features
- GET /logstream/{logstream}/affected-resources returns the filters, dashboards, alerts, and roles that depend on a stream, guarded by the new RBAC action GetLogstreamAffectedResources.
Bug Fixes / Improvements
- Dashboards with a missing ID are skipped during aggregation instead of being assigned a random ULID.