
Conversation

@rdettai
Collaborator

@rdettai rdettai commented Jun 26, 2025

Description

Closes #5799

Rationalize the object store metrics and record requests and errors more exhaustively.

Important: This is a breaking change as some metrics are renamed.

How was this PR tested?

TODO: add tests to the metrics wrappers

@rdettai rdettai self-assigned this Jun 26, 2025
@rdettai rdettai requested a review from guilload June 26, 2025 15:14
@rdettai rdettai force-pushed the revamp-storage-metrics branch 2 times, most recently from 8bdb431 to 88528bf on July 2, 2025 11:55
@rdettai rdettai force-pushed the revamp-storage-metrics branch from 0d1674d to 93ce4eb on July 15, 2025 14:24
ActionLabel::DeleteObject => "delete_object",
ActionLabel::DeleteObjects => "delete_objects",
ActionLabel::GetObject => "get_object",
ActionLabel::HeadObject => "head_object",
Collaborator Author

Azure: get_properties
S3: head_object
Opendal: stat

@rdettai rdettai force-pushed the revamp-storage-metrics branch from 93ce4eb to cb7eb48 on July 15, 2025 15:00
Comment on lines +334 to +404
/// This is a fork of `tokio::io::copy_buf` that enables tracking the number of
/// bytes transferred. This estimate should be accurate as long as the network
/// is the bottleneck.
mod copy_buf {

    use std::future::Future;
    use std::io;
    use std::pin::Pin;
    use std::task::{Context, Poll, ready};

    use tokio::io::{AsyncBufRead, AsyncWrite};

    #[derive(Debug)]
    #[must_use = "futures do nothing unless you `.await` or poll them"]
    pub struct CopyBuf<'a, R: ?Sized, W: ?Sized> {
        pub reader: &'a mut R,
        pub writer: &'a mut W,
        pub amt: u64,
    }

    impl<R, W> Future for CopyBuf<'_, R, W>
    where
        R: AsyncBufRead + Unpin + ?Sized,
        W: AsyncWrite + Unpin + ?Sized,
    {
        type Output = io::Result<u64>;

        fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
            loop {
                let me = &mut *self;
                let buffer = ready!(Pin::new(&mut *me.reader).poll_fill_buf(cx))?;
                if buffer.is_empty() {
                    ready!(Pin::new(&mut self.writer).poll_flush(cx))?;
                    return Poll::Ready(Ok(self.amt));
                }

                let i = ready!(Pin::new(&mut *me.writer).poll_write(cx, buffer))?;
                if i == 0 {
                    return Poll::Ready(Err(std::io::ErrorKind::WriteZero.into()));
                }
                self.amt += i as u64;
                Pin::new(&mut *self.reader).consume(i);
            }
        }
    }
}
Collaborator Author

It was actually pretty simple to keep an estimate of the number of bytes already downloaded when the error occurs, so I added it.
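For context, a minimal usage sketch (my own illustration, not code from the PR, assuming the caller sits next to the module above): because `CopyBuf` only holds references and a counter it is `Unpin`, so a caller can poll it through a mutable reference and still read `amt` after a failure. The function name and the in-memory sink are hypothetical.

use tokio::io::{AsyncRead, BufReader};

// Hypothetical caller: copy a body into an in-memory sink and report the number
// of bytes transferred, even if the copy fails midway.
async fn copy_with_progress<R: AsyncRead + Unpin>(body: R) -> (std::io::Result<u64>, u64) {
    let mut reader = BufReader::new(body);
    let mut sink: Vec<u8> = Vec::new();
    let mut copy_fut = copy_buf::CopyBuf {
        reader: &mut reader,
        writer: &mut sink,
        amt: 0,
    };
    // `&mut CopyBuf` is itself a future because `CopyBuf` is `Unpin`, so we keep
    // ownership of `copy_fut` and can inspect `amt` afterwards.
    let result = (&mut copy_fut).await;
    let bytes_transferred = copy_fut.amt;
    (result, bytes_transferred)
}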

@rdettai rdettai requested a review from fulmicoton-dd July 15, 2025 15:15
@rdettai-sk rdettai-sk force-pushed the revamp-storage-metrics branch 6 times, most recently from 0da6565 to 0a4a5f9 on November 7, 2025 15:22
@rdettai-sk rdettai-sk force-pushed the revamp-storage-metrics branch from 0a4a5f9 to 69c1c7b on November 14, 2025 16:20
@rdettai-sk rdettai-sk force-pushed the revamp-storage-metrics branch 2 times, most recently from 24a7d1f to 63ae79d on November 25, 2025 08:55
@rdettai-sk rdettai-sk force-pushed the revamp-storage-metrics branch from 63ae79d to 07a973a on December 8, 2025 10:50
object_storage_get_errors_total: new_counter_vec::<1>(
    "object_storage_get_errors_total",
    "Number of GetObject errors.",
object_storage_request_duration: new_histogram_vec(
Collaborator

@fulmicoton-dd fulmicoton-dd Dec 8, 2025

if it is TTFB, it should not be called duration!?

(technically I don't think this is TTFB btw :-/ )

Collaborator

Yes, I agree the terminology TTFB is a bit stretched here as it also applies to other types of queries (deletes). And it's probably not a good idea to call it request duration for GET requests when it doesn't measure the whole download duration.

I still think it's valuable to measure the time it takes for GET requests to return the byte stream. It helps when configuring the StorageTimeoutPolicy, and it's also a way to measure the read performance of the object store without having to normalize by the downloaded size. Would you be fine with calling that separate metric object_storage_get_ttfb, or do you still think it's not appropriate?
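For what it's worth, a minimal sketch of this kind of measurement (illustrative names, plain prometheus crate, not the PR's actual wrapper): the timer stops as soon as the SDK hands back its stream, before any of the payload is read.

use std::time::Instant;

use prometheus::Histogram;

// Hypothetical helper: record the time it takes for the object store to return
// its byte stream; the payload download itself is not part of the measurement.
async fn observe_time_to_stream<F, S, E>(get_future: F, ttfb_histogram: &Histogram) -> Result<S, E>
where
    F: std::future::Future<Output = Result<S, E>>,
{
    let start = Instant::now();
    // Resolves once the SDK returns the stream (or an error).
    let result = get_future.await;
    // Recorded for both success and failure here; the real wrapper may differ.
    ttfb_histogram.observe(start.elapsed().as_secs_f64());
    result
}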

Collaborator

@rdettai what are we measuring exactly?

Collaborator

@fmassot The time for the storage SDK to return a ByteStream.
We can hope most of the metadata layer work is done at that point.

@rdettai-sk

it's also a way to measure the read performance of the object store without having to normalize by the downloaded size

My comment had two parts.

  • the first one is: you called the metric duration. This needs to be renamed.
  • the TTFB part does not matter much, and I agree your metric is useful. I just don't want people to start comparing hyperscalers and drawing wrong conclusions due to minor implementation differences. I would edit the comment to say TTFB-like, for instance.

I think we need to rename that metric and keep it...
And then, count + duration total + num bytes total.

With that, at least we have enough to compute a model: TTFB + bytes * throughput.
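A rough reading of that model (my own interpretation, not spelled out in the thread): treating "throughput" here as a per-byte cost, the proposed aggregates would let you back it out along these lines. The helper function is hypothetical.

// Hypothetical helper: estimate the per-byte cost from the proposed aggregates
// (request count, total duration, total bytes), using a mean TTFB taken from
// the TTFB-like histogram.
// Model: duration_total ≈ count * ttfb + bytes_total * cost_per_byte
fn estimate_cost_per_byte(
    request_count: f64,
    duration_total_secs: f64,
    bytes_total: f64,
    mean_ttfb_secs: f64,
) -> f64 {
    (duration_total_secs - request_count * mean_ttfb_secs) / bytes_total
}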

Collaborator

To summarise the changes to the existing metrics:

  • I'll add a new metric called object_storage_get_slice_latency -> help: "TTFB-like latency of the object store get requests in seconds. This doesn't include the time to download the full response payload."
  • object_storage_request_duration is used only for the other object store requests (only delete for now) with an updated description -> help: "Duration in seconds of the object store request" (a rough sketch of both definitions follows at the end of this comment)

I can add a metric to measure the total duration (download included). Should it be a counter or a histogram?

  • A histogram would be hard to leverage because you lose the connection to the actual download size. If we want the download throughput distribution, I think a histogram of the download throughput (wrapping the download itself with a timer) would be easier to use.
  • If we opt for a counter of the total duration, how would you use it to estimate the throughput?
  • We have retries when receiving an HTTP error response (e.g. a "slow down" throttling response). I think failed attempts should not be included in the duration.
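A hedged sketch of the two metric definitions summarised above, using the plain prometheus crate rather than Quickwit's own metric constructors; the label names and function name are assumptions.

use prometheus::{HistogramVec, register_histogram_vec};

// Illustrative only: names and help strings follow the summary above, the rest is assumed.
fn build_object_storage_metrics() -> prometheus::Result<(HistogramVec, HistogramVec)> {
    // Time until a GET request yields its byte stream (payload download excluded).
    let get_slice_latency = register_histogram_vec!(
        "object_storage_get_slice_latency",
        "TTFB-like latency of the object store get requests in seconds. This doesn't \
         include the time to download the full response payload.",
        &["action"] // assumed label
    )?;
    // Full duration of the other object store requests (only delete for now).
    let request_duration = register_histogram_vec!(
        "object_storage_request_duration",
        "Duration in seconds of the object store request",
        &["action"] // assumed label
    )?;
    Ok((get_slice_latency, request_duration))
}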

Development

Successfully merging this pull request may close these issues.

Fix the object_storage_get_errors_total metric
