Skip to content

Conversation

@nikhilsinhaparseable
Copy link
Contributor

@nikhilsinhaparseable nikhilsinhaparseable commented Dec 29, 2025

Summary by CodeRabbit

  • New Features
    • Introduced a new daily billing metric to track total metrics collected per date, now available through the metrics monitoring system for usage analytics and billing insights.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 29, 2025

Walkthrough

A new per-date billing metric total_metrics_collected is added across the metrics collection pipeline. The BillingMetricsCollector struct receives a new HashMap field to store date-keyed counts, a corresponding Prometheus counter is registered in the metrics module, and the counter is incremented during metrics flattening operations.

Changes

Cohort / File(s) Summary
Billing Metrics Collector
src/handlers/http/cluster/mod.rs
Added total_metrics_collected_by_date: HashMap<String, u64> field to BillingMetricsCollector; metric is included in event conversion under metric_type "total_metrics_collected"; is_simple_metric updated to recognize the new metric type; process_simple_metric handles insertion into the HashMap keyed by date.
Prometheus Metrics Infrastructure
src/metrics/mod.rs
Registered new public static counter TOTAL_METRICS_COLLECTED_BY_DATE as IntCounterVec; added public increment function increment_metrics_collected_by_date(date: &str) to increment the counter by one for a given date.
Metrics Flattening Integration
src/otel/metrics.rs
Imported increment_metrics_collected_by_date function; integrated counter increment call into flatten_metrics_record after ensuring data point existence, computing current date, and invoking the increment function.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

  • parseablehq/parseable#1298: Modifies flatten_metrics_record function in src/otel/metrics.rs, directly overlapping with integration point in this PR.
  • parseablehq/parseable#1448: Extends BillingMetricsCollector struct and metrics ingestion pipeline, covering the same billing metrics collection infrastructure being enhanced here.

Poem

🐰 Hop, hop! A counter springs to life,
By date it tracks without the strife,
Each metric now has tales to tell,
In billing buckets, safe to dwell!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Description check ⚠️ Warning The pull request has no description provided by the author. It lacks all required sections from the template: issue reference, description, solution rationale, key changes, and testing/documentation checkboxes. Add a comprehensive pull request description following the template, including issue reference, goal, rationale, key changes, and completion of the testing and documentation checklist.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'add metrics collections in billing metrics' is directly related to the changeset, which adds a new per-date billing metric (total_metrics_collected_by_date) across three files. It accurately summarizes the main objective.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1f2d3b4 and 851358a.

📒 Files selected for processing (3)
  • src/handlers/http/cluster/mod.rs
  • src/metrics/mod.rs
  • src/otel/metrics.rs
🧰 Additional context used
🧠 Learnings (10)
📓 Common learnings
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1441
File: src/storage/s3.rs:837-844
Timestamp: 2025-09-25T07:13:58.909Z
Learning: In the Parseable codebase, the existing pattern of placing metrics calls after the `?` operator in S3 list_hours method is correct and should not be changed. The metrics are properly recorded only on successful operations.
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metadata.rs:63-68
Timestamp: 2025-08-25T01:31:41.786Z
Learning: The TOTAL_EVENTS_INGESTED_DATE, TOTAL_EVENTS_INGESTED_SIZE_DATE, and TOTAL_EVENTS_STORAGE_SIZE_DATE metrics in src/metadata.rs and src/storage/object_storage.rs are designed to track total events across all streams, not per-stream. They use labels [origin, parsed_date] to aggregate by format and date, while per-stream metrics use [stream_name, origin, parsed_date] labels.
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metrics/mod.rs:163-173
Timestamp: 2025-08-25T01:32:25.980Z
Learning: The TOTAL_EVENTS_INGESTED_DATE, TOTAL_EVENTS_INGESTED_SIZE_DATE, and TOTAL_EVENTS_STORAGE_SIZE_DATE metrics in src/metrics/mod.rs are intentionally designed to track global totals across all streams for a given date, using labels ["format", "date"] rather than per-stream labels. This is the correct design for global aggregation purposes.
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1448
File: src/handlers/http/cluster/mod.rs:1370-1400
Timestamp: 2025-10-20T17:48:53.444Z
Learning: In src/handlers/http/cluster/mod.rs, the billing metrics processing logic should NOT accumulate counter values from multiple Prometheus samples with the same labels. The intended behavior is to convert each received counter from nodes into individual events for ingestion, using `.insert()` to store the counter value directly.
📚 Learning: 2025-08-25T01:32:25.980Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metrics/mod.rs:163-173
Timestamp: 2025-08-25T01:32:25.980Z
Learning: The TOTAL_EVENTS_INGESTED_DATE, TOTAL_EVENTS_INGESTED_SIZE_DATE, and TOTAL_EVENTS_STORAGE_SIZE_DATE metrics in src/metrics/mod.rs are intentionally designed to track global totals across all streams for a given date, using labels ["format", "date"] rather than per-stream labels. This is the correct design for global aggregation purposes.

Applied to files:

  • src/otel/metrics.rs
  • src/metrics/mod.rs
  • src/handlers/http/cluster/mod.rs
📚 Learning: 2025-08-25T01:31:41.786Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metadata.rs:63-68
Timestamp: 2025-08-25T01:31:41.786Z
Learning: The TOTAL_EVENTS_INGESTED_DATE, TOTAL_EVENTS_INGESTED_SIZE_DATE, and TOTAL_EVENTS_STORAGE_SIZE_DATE metrics in src/metadata.rs and src/storage/object_storage.rs are designed to track total events across all streams, not per-stream. They use labels [origin, parsed_date] to aggregate by format and date, while per-stream metrics use [stream_name, origin, parsed_date] labels.

Applied to files:

  • src/otel/metrics.rs
  • src/metrics/mod.rs
  • src/handlers/http/cluster/mod.rs
📚 Learning: 2025-08-18T19:10:11.941Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1405
File: src/handlers/http/ingest.rs:163-164
Timestamp: 2025-08-18T19:10:11.941Z
Learning: Field statistics calculation in src/storage/field_stats.rs uses None for the time_partition parameter when calling flatten_and_push_logs(), as field stats generation does not require time partition functionality.

Applied to files:

  • src/otel/metrics.rs
📚 Learning: 2025-09-18T09:59:20.177Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metrics/mod.rs:700-756
Timestamp: 2025-09-18T09:59:20.177Z
Learning: In src/event/mod.rs, the parsed_timestamp used in increment_events_ingested_by_date() is correctly UTC-normalized: for dynamic streams it remains Utc::now(), and for streams with time partition enabled it uses the time partition value. Both cases result in proper UTC date strings for metrics labeling, preventing double-counting issues.

Applied to files:

  • src/otel/metrics.rs
  • src/metrics/mod.rs
  • src/handlers/http/cluster/mod.rs
📚 Learning: 2025-10-20T17:48:53.444Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1448
File: src/handlers/http/cluster/mod.rs:1370-1400
Timestamp: 2025-10-20T17:48:53.444Z
Learning: In src/handlers/http/cluster/mod.rs, the billing metrics processing logic should NOT accumulate counter values from multiple Prometheus samples with the same labels. The intended behavior is to convert each received counter from nodes into individual events for ingestion, using `.insert()` to store the counter value directly.

Applied to files:

  • src/metrics/mod.rs
  • src/handlers/http/cluster/mod.rs
📚 Learning: 2025-09-25T07:12:40.189Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1441
File: src/storage/gcs.rs:625-633
Timestamp: 2025-09-25T07:12:40.189Z
Learning: In Parseable's object storage metrics system, metrics should only be captured when API calls succeed, not when they error out. The current pattern of calling increment_object_store_calls_by_date and related metrics functions after the `?` operator is the correct approach.

Applied to files:

  • src/handlers/http/cluster/mod.rs
📚 Learning: 2025-09-25T07:12:33.401Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1441
File: src/storage/azure_blob.rs:727-735
Timestamp: 2025-09-25T07:12:33.401Z
Learning: In Parseable's object storage implementations, metrics should only be captured when operations succeed, not when they are attempted. The increment_object_store_calls_by_date() calls should be placed after the await? to ensure they only execute on successful operations.

Applied to files:

  • src/handlers/http/cluster/mod.rs
📚 Learning: 2025-09-25T07:12:27.407Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1441
File: src/storage/azure_blob.rs:693-700
Timestamp: 2025-09-25T07:12:27.407Z
Learning: In the Parseable codebase, object store metrics (increment_object_store_calls_by_date, increment_files_scanned_in_object_store_calls_by_date, etc.) should only be recorded for successful operations, not for failed attempts. The metric calls should be placed after error handling operators like `?` to ensure they only execute when operations succeed.

Applied to files:

  • src/handlers/http/cluster/mod.rs
📚 Learning: 2025-09-25T07:13:04.112Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1441
File: src/storage/gcs.rs:674-681
Timestamp: 2025-09-25T07:13:04.112Z
Learning: In the Parseable codebase, object store metrics should only be captured for successful operations. Metric recording calls should be placed after `await?` operators to ensure they only execute when operations succeed, not when they fail.

Applied to files:

  • src/handlers/http/cluster/mod.rs
🧬 Code graph analysis (1)
src/otel/metrics.rs (1)
src/metrics/mod.rs (1)
  • increment_metrics_collected_by_date (637-641)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: coverage
  • GitHub Check: Build Default aarch64-unknown-linux-gnu
  • GitHub Check: Build Default x86_64-pc-windows-msvc
  • GitHub Check: Build Default x86_64-unknown-linux-gnu
  • GitHub Check: Build Default aarch64-apple-darwin
  • GitHub Check: Build Kafka aarch64-apple-darwin
  • GitHub Check: Build Kafka x86_64-unknown-linux-gnu
  • GitHub Check: Quest Smoke and Load Tests for Distributed deployments
  • GitHub Check: Quest Smoke and Load Tests for Standalone deployments
🔇 Additional comments (9)
src/metrics/mod.rs (3)

376-386: LGTM! Metric declaration follows established patterns.

The new TOTAL_METRICS_COLLECTED_BY_DATE metric follows the same structure as other date-based billing metrics, using an IntCounterVec with a single "date" label.


487-489: LGTM! Metric registration is correct.

The metric is properly registered in the custom metrics registry, consistent with the registration pattern used for all other billing metrics.


637-641: LGTM! Helper function implementation is correct.

The increment_metrics_collected_by_date function follows the same pattern as other billing metric helpers, incrementing the counter by 1 for the specified date.

src/otel/metrics.rs (2)

27-27: LGTM! Import statement is correct.

The import of increment_metrics_collected_by_date is properly scoped and necessary for the metric increment call in flatten_metrics_record.


504-506: Use metric timestamps instead of processing time, and only increment when actual data points exist.

The code uses Utc::now() to determine the billing date, which is inconsistent with how other date-based metrics work: increment_events_ingested_by_date() uses the event's parsed_timestamp, and increment_parquets_stored_by_date() uses the file creation date. Since OTEL metric data points contain time_unix_nano and start_time_unix_nano (extracted in flatten_number_data_points), these should be used instead to avoid attributing metrics to the wrong date when processing is delayed.

Additionally, the counter increments unconditionally even when data_points_json is empty and only metric_json metadata exists (lines 501–503). Metrics with zero actual data points should not be counted toward the total.

⛔ Skipped due to learnings
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1441
File: src/storage/gcs.rs:625-633
Timestamp: 2025-09-25T07:12:40.189Z
Learning: In Parseable's object storage metrics system, metrics should only be captured when API calls succeed, not when they error out. The current pattern of calling increment_object_store_calls_by_date and related metrics functions after the `?` operator is the correct approach.
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metrics/mod.rs:700-756
Timestamp: 2025-09-18T09:59:20.177Z
Learning: In src/event/mod.rs, the parsed_timestamp used in increment_events_ingested_by_date() is correctly UTC-normalized: for dynamic streams it remains Utc::now(), and for streams with time partition enabled it uses the time partition value. Both cases result in proper UTC date strings for metrics labeling, preventing double-counting issues.
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1441
File: src/storage/azure_blob.rs:693-700
Timestamp: 2025-09-25T07:12:27.407Z
Learning: In the Parseable codebase, object store metrics (increment_object_store_calls_by_date, increment_files_scanned_in_object_store_calls_by_date, etc.) should only be recorded for successful operations, not for failed attempts. The metric calls should be placed after error handling operators like `?` to ensure they only execute when operations succeed.
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1441
File: src/storage/azure_blob.rs:727-735
Timestamp: 2025-09-25T07:12:33.401Z
Learning: In Parseable's object storage implementations, metrics should only be captured when operations succeed, not when they are attempted. The increment_object_store_calls_by_date() calls should be placed after the await? to ensure they only execute on successful operations.
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1441
File: src/storage/s3.rs:837-844
Timestamp: 2025-09-25T07:13:58.909Z
Learning: In the Parseable codebase, the existing pattern of placing metrics calls after the `?` operator in S3 list_hours method is correct and should not be changed. The metrics are properly recorded only on successful operations.
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1448
File: src/handlers/http/cluster/mod.rs:1370-1400
Timestamp: 2025-10-20T17:48:53.444Z
Learning: In src/handlers/http/cluster/mod.rs, the billing metrics processing logic should NOT accumulate counter values from multiple Prometheus samples with the same labels. The intended behavior is to convert each received counter from nodes into individual events for ingestion, using `.insert()` to store the counter value directly.
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metrics/mod.rs:163-173
Timestamp: 2025-08-25T01:32:25.980Z
Learning: The TOTAL_EVENTS_INGESTED_DATE, TOTAL_EVENTS_INGESTED_SIZE_DATE, and TOTAL_EVENTS_STORAGE_SIZE_DATE metrics in src/metrics/mod.rs are intentionally designed to track global totals across all streams for a given date, using labels ["format", "date"] rather than per-stream labels. This is the correct design for global aggregation purposes.
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metadata.rs:63-68
Timestamp: 2025-08-25T01:31:41.786Z
Learning: The TOTAL_EVENTS_INGESTED_DATE, TOTAL_EVENTS_INGESTED_SIZE_DATE, and TOTAL_EVENTS_STORAGE_SIZE_DATE metrics in src/metadata.rs and src/storage/object_storage.rs are designed to track total events across all streams, not per-stream. They use labels [origin, parsed_date] to aggregate by format and date, while per-stream metrics use [stream_name, origin, parsed_date] labels.
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1441
File: src/storage/s3.rs:788-796
Timestamp: 2025-09-25T07:13:52.733Z
Learning: In Parseable's S3 storage implementation, the current pattern of placing increment_object_store_calls_by_date() calls after the `?` operator is correct and intentional, as it ensures metrics are only captured when operations succeed, not when they fail.
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1441
File: src/storage/s3.rs:788-796
Timestamp: 2025-09-25T07:13:52.733Z
Learning: In Parseable's S3 storage implementation, the current pattern of placing increment_object_store_calls_by_date() calls after the `?` operator is correct and intentional, as it ensures metrics are only captured when operations succeed, not when they fail.
src/handlers/http/cluster/mod.rs (4)

102-102: LGTM! Field addition follows established pattern.

The total_metrics_collected_by_date field is added to BillingMetricsCollector with the same type and pattern as other simple date-based metrics.


198-204: LGTM! Event generation logic is correct.

The code properly checks if the metric collection is non-empty before generating events, and uses the correct metric type name "total_metrics_collected" (consistent with the pattern where the _by_date suffix is omitted in the event).


1275-1275: LGTM! Metric classification is correct.

The metric name parseable_total_metrics_collected_by_date is properly added to the is_simple_metric match pattern with the correct namespace prefix.


1342-1346: LGTM! Metric processing follows correct pattern.

The processing logic correctly uses .insert() to store the counter value directly, which aligns with the intended behavior for billing metrics. Based on learnings, this ensures each counter value from nodes is converted to individual events rather than being accumulated.


Comment @coderabbitai help to get the list of available commands and usage tips.

@nitisht nitisht merged commit cab9fc0 into parseablehq:main Dec 29, 2025
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants