feat(usage): added UI for metrics and metrics persistence #2
Conversation
Walkthrough

The changes introduce a metrics persistence system with periodic file-based snapshots, a new HTTP endpoint and dashboard UI for viewing metrics by model and time period, Docker infrastructure updates, request ID tracking, and sensitive header filtering from logs.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Startup as Server Startup
    participant Store as Metrics Store
    participant File as File Storage
    participant Persist as Persistence Loop
    Startup->>Store: LoadMetricsFromFile(metrics.json)
    activate Store
    File-->>Store: Load saved snapshot
    Store-->>Startup: Metrics loaded
    deactivate Store
    Startup->>Persist: StartPeriodicSaving(metrics.json, interval)
    activate Persist
    Note over Persist: Background goroutine started
    par Request Handling
        Note over Store: Requests update metrics in-memory
    and Periodic Saves
        Persist->>Store: Read current snapshot
        Persist->>File: Write to metrics.json
        File-->>Persist: Saved
    end
    Startup-->>Startup: Server shutdown signal
    Startup->>Persist: StopMetricsPersistence()
    Persist->>Store: Final snapshot
    Persist->>File: Final save
    File-->>Persist: Saved
    deactivate Persist
```
Estimated Code Review Effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks: ✅ 3 checks passed
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
internal/api/middleware/request_logging.go (1)
77-84: Consider skipping sensitive headers during the initial copy for efficiency.

The current approach copies all headers and then deletes the sensitive ones. Filtering during the copy operation itself is both faster and clearer.
Apply this diff to skip sensitive headers during the copy:

```diff
-	// Capture headers
-	headers := make(map[string][]string)
-	for key, values := range c.Request.Header {
-		headers[key] = values
-	}
-
-	delete(headers, "Authorization")
-	delete(headers, "Cookie")
+	// Capture headers (excluding sensitive ones)
+	headers := make(map[string][]string)
+	for key, values := range c.Request.Header {
+		if key != "Authorization" && key != "Cookie" {
+			headers[key] = values
+		}
+	}
```

internal/usage/logger_plugin.go (1)
170-224: Fix latency calculation for missing RequestedAt

Line 170 computes `time.Since(record.RequestedAt)` before we fall back to `time.Now()` on Line 165, so any record lacking `RequestedAt` (which we already guard for) produces a gigantic latency (~638 million years) and pollutes the dashboard. Please normalise the timestamp first and only compute latency when the original `RequestedAt` is present.

```diff
-	latency := time.Since(record.RequestedAt)
+	latency := time.Duration(0)
@@
 	if statsKey == "" {
 		statsKey = resolveAPIIdentifier(nil, record)
 	}
+	if !record.RequestedAt.IsZero() {
+		latency = time.Since(record.RequestedAt)
+		if latency < 0 {
+			latency = 0
+		}
+	}
@@
 	RequestID: requestID,
 	LatencyMS: latency.Milliseconds(),
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (12)
.gitignore (1 hunks)
Dockerfile (1 hunks)
cmd/server/main.go (1 hunks)
config.example.yaml (1 hunks)
internal/api/handlers/metrics/handler.go (1 hunks)
internal/api/middleware/request_logging.go (1 hunks)
internal/api/server.go (6 hunks)
internal/cmd/run.go (2 hunks)
internal/config/config.go (2 hunks)
internal/logging/gin_logger.go (2 hunks)
internal/usage/logger_plugin.go (6 hunks)
ui/metrics.html (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
internal/cmd/run.go (1)
internal/usage/logger_plugin.go (1)
StopMetricsPersistence(486-494)
internal/api/handlers/metrics/handler.go (1)
internal/usage/logger_plugin.go (1)
RequestStatistics(62-76)
cmd/server/main.go (1)
internal/usage/logger_plugin.go (2)
LoadMetricsFromFile (370-382)
StartPeriodicSaving (452-483)
internal/api/server.go (3)
internal/api/handlers/metrics/handler.go (2)
Handler (16-18)
NewHandler (21-23)
internal/api/handlers/management/handler.go (2)
Handler (29-42)
NewHandler (45-59)
internal/usage/logger_plugin.go (1)
GetRequestStatistics(143-143)
internal/usage/logger_plugin.go (1)
sdk/cliproxy/usage/manager.go (1)
Record(12-21)
🔇 Additional comments (7)
internal/logging/gin_logger.go (1)
26-28: LGTM! Request ID generation enhances observability.

The UUID-based request ID is properly generated, stored in the context, and propagated via the X-Request-ID header. This enables effective request tracing across logs and distributed systems.
internal/config/config.go (1)
65-72: LGTM! Configuration fields are well-documented and properly integrated.

The new metrics persistence configuration fields are appropriately defined with clear documentation and proper YAML/JSON tags. The use of `omitempty` allows these to remain optional, with sensible defaults applied at startup (metrics.json and a 10-minute interval).

Dockerfile (2)
26-26: LGTM! UI directory copy enables metrics dashboard.Copying the ui directory is necessary to serve the metrics visualization dashboard introduced in this PR.
30-30: The IN_DOCKER environment variable serves a clear purpose and is appropriately used.

Verification confirms that IN_DOCKER is referenced in `internal/api/server.go:286`, where it controls the HTTP server's bind address: `0.0.0.0` when running in Docker to accept external connections, or `localhost` for local development. This usage is legitimate and necessary for the application to function correctly in both containerised and local environments.

.gitignore (1)
18-18: LGTM! Properly excludes runtime metrics data.Adding metrics.json to .gitignore is correct as this file contains runtime state that should not be version-controlled.
internal/cmd/run.go (1)
54-54: LGTM! Graceful metrics persistence shutdown.The StopMetricsPersistence() call is correctly placed after service.Run() to ensure a final metrics save occurs before application exit.
cmd/server/main.go (1)
382-394: LGTM! Metrics persistence initialization is well-structured.The initialization sequence properly:
- Applies sensible defaults (metrics.json, 10-minute interval)
- Loads existing metrics before starting new persistence
- Starts periodic saving with configurable error handling
```go
loopDelay := cfg.LoopDelay
if loopDelay == 0 {
	loopDelay = 10 * time.Minute
}
```
🧹 Nitpick | 🔵 Trivial
Consider adding a minimum threshold for LoopDelay.
Very small LoopDelay values (e.g., 1 second) could cause excessive file I/O. Consider validating a minimum interval, such as 1 minute, to prevent performance issues.
Apply this diff to add validation:

```diff
 loopDelay := cfg.LoopDelay
 if loopDelay == 0 {
 	loopDelay = 10 * time.Minute
+} else if loopDelay < time.Minute {
+	log.Warnf("LoopDelay %v is too small, using minimum of 1 minute", loopDelay)
+	loopDelay = time.Minute
 }
```
```diff
+bindAddr := "localhost"
+if os.Getenv("IN_DOCKER") == "true" {
+	bindAddr = "0.0.0.0"
+}
+
 s.server = &http.Server{
-	Addr:    fmt.Sprintf(":%d", cfg.Port),
+	Addr:    fmt.Sprintf("%s:%d", bindAddr, cfg.Port),
 	Handler: engine,
```
Restore binding to all interfaces
Line 291 now forces the listener to localhost unless IN_DOCKER is set, so any bare-metal or VM deployment instantly becomes unreachable from other machines. Previously we bound to all interfaces (fmt.Sprintf(":%d", cfg.Port)), which worked in every environment, including containers. Please keep the old behaviour and only override when you explicitly need to.
```diff
-bindAddr := "localhost"
-if os.Getenv("IN_DOCKER") == "true" {
-	bindAddr = "0.0.0.0"
-}
-
-s.server = &http.Server{
-	Addr: fmt.Sprintf("%s:%d", bindAddr, cfg.Port),
+addr := fmt.Sprintf(":%d", cfg.Port)
+if os.Getenv("IN_DOCKER") == "true" {
+	addr = fmt.Sprintf("0.0.0.0:%d", cfg.Port)
+}
+
+s.server = &http.Server{
+	Addr: addr,
```
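The reviewer's requested behaviour (bind all interfaces by default, pin an explicit host only inside Docker) can be expressed as a small pure function; `listenAddr` is a hypothetical helper for illustration, not code from the PR.

```go
package main

import "fmt"

// listenAddr keeps the historical default of binding all interfaces
// (":<port>") and only pins an explicit host when running inside Docker.
func listenAddr(port int, inDocker bool) string {
	if inDocker {
		return fmt.Sprintf("0.0.0.0:%d", port)
	}
	return fmt.Sprintf(":%d", port)
}

func main() {
	fmt.Println(listenAddr(8080, false)) // :8080
	fmt.Println(listenAddr(8080, true))  // 0.0.0.0:8080
}
```

Note that for Go's `net/http`, `":8080"` and `"0.0.0.0:8080"` both listen on all IPv4 interfaces (the former also covers IPv6), so the Docker branch is mainly about being explicit rather than changing reachability.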
```html
	grid-area: timeserie;
}
</style>
<script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
```
🧹 Nitpick | 🔵 Trivial
Consider self-hosting Chart.js for reliability.
Loading Chart.js from a CDN introduces a dependency on external infrastructure. If the CDN is unavailable, the entire metrics UI will fail to load.
Consider bundling Chart.js with the application or using a local copy to ensure the UI remains functional even if external CDNs are down. This is particularly important for enterprise environments with restricted internet access.
```js
const initUI = async () => {
	const fetchUrl = `${window.location.origin}/_qs/metrics`;
	const response = await fetch(fetchUrl);
	const data = await response.json();
```
Add error handling for the metrics fetch request.
The fetch request lacks error handling. Network failures or API errors will cause the UI to fail silently without any user feedback.
Apply this diff to add error handling:
```diff
 const initUI = async () => {
-	const fetchUrl = `${window.location.origin}/_qs/metrics`;
-	const response = await fetch(fetchUrl);
-	const data = await response.json();
+	try {
+		const fetchUrl = `${window.location.origin}/_qs/metrics`;
+		const response = await fetch(fetchUrl);
+		if (!response.ok) {
+			throw new Error(`HTTP error! status: ${response.status}`);
+		}
+		const data = await response.json();
+		// ... rest of the existing code
+	} catch (error) {
+		console.error('Failed to load metrics:', error);
+		document.querySelector('main').innerHTML =
+			'<div style="color: red; padding: 20px;">Failed to load metrics. Please try again later.</div>';
+		return;
+	}
+};
```

Note: move the chart initialization code inside the try block.
```js
data.by_model.forEach((modelData) => {
	modelTokenChartData.data.labels.push(modelData.model);
	modelTokenChartData.data.datasets[0].data.push(modelData.tokens);
	modelRequestChartData.data.labels.push(modelData.model);
	modelRequestChartData.data.datasets[0].data.push(modelData.requests);
});
```
Add validation for empty or missing data.
If data.by_model is undefined or empty, the forEach will fail or create empty charts. Add validation before processing the data.
Apply this diff to add data validation:
```diff
+if (!data.by_model || data.by_model.length === 0) {
+	console.warn('No model data available');
+	// Optionally show a message to the user
+} else {
 	data.by_model.forEach((modelData) => {
 		modelTokenChartData.data.labels.push(modelData.model);
 		modelTokenChartData.data.datasets[0].data.push(modelData.tokens);
 		modelRequestChartData.data.labels.push(modelData.model);
 		modelRequestChartData.data.datasets[0].data.push(modelData.requests);
 	});
+}
```

Apply similar validation for data.timeseries.
```html
<div id="modelRequest">
	<canvas id="modelRequestChart"></canvas>
</div>
<div id="modelToken">
	<canvas id="modelTokenChart"></canvas>
</div>
<div id="timeserie">
	<canvas id="timeserieChart"></canvas>
</div>
```
🧹 Nitpick | 🔵 Trivial
Improve accessibility of chart elements.
The canvas elements lack accessible labels and descriptions, making the dashboard difficult to use with screen readers.
Add ARIA attributes to improve accessibility:
```diff
 <div id="modelRequest">
-	<canvas id="modelRequestChart"></canvas>
+	<canvas id="modelRequestChart" role="img" aria-label="Pie chart showing request count by model"></canvas>
 </div>
 <div id="modelToken">
-	<canvas id="modelTokenChart"></canvas>
+	<canvas id="modelTokenChart" role="img" aria-label="Pie chart showing token usage by model"></canvas>
 </div>
 <div id="timeserie">
-	<canvas id="timeserieChart"></canvas>
+	<canvas id="timeserieChart" role="img" aria-label="Line chart showing requests and tokens over time"></canvas>
 </div>
```
Merged b0f6ca1 into bleedingdev-quantum:hiring/leinweberm
@CodeRabbit full review