Skip to content

Conversation

@helsaawy
Copy link
Contributor

@helsaawy helsaawy commented Aug 28, 2025

A common scenario for pods is to write container logs (stdout and stderr) to a file on the host (specified by the LogPath and LogDirectory CRI fields), and then mount that directory to another (e.g., a monitoring or observability container) in the pod for consumption. For LCOW, this means logs must traverse from the uVM to the host, and then back into the uVM via (plan9) share, which both:

  • adds unnecessary overhead; and
  • exposes data to the host.

Address this by adding annotations to mimic LogPath and LogDirectory functionality, but within the uVM. Since the uVM rootfs is not directly backed by a VHD (i.e., directories are either RO or tmpfs backed), create a dedicated directory in the sandbox scratch for logs, and allow that path to be mounted within a container.

Note: containerd expects to be the end-all sink for container logs, use io.MultiWriter to keep default behavior and write logs to the specified path while also exposing them to the host. Therefore, annotations use TeeLog (instead of just Log) to make behavior explicit. While this helps prevent data exposure to the host by allowing log handling to remain within the uVM, it still requires additional (i.e., policy) configuration to ensure that logs are not eventually forwarded or stored on the host.

Add LCOWTeeLogPath annotation to tee container stdin and stderr to a specified file within the uVM, relative to a dedicated log directory in the sandbox scratch.

Add LCOWTeeLogDirMount annotation to mount the above log directory to the specified path within the container.

Add a dedicated trapsort.MultiWriter type that writes to both an underlying transport.Connection and the provided io.Writer.

Move logConnection from stdio to transport, to keep the different Connection implementations together.

Add functional tests.

Streamline test/internal/util by making CleanName operation on the test name directly (testing.TB.Name()), since that is the majority of the usage.
Add CleanString to handle the original functionality.

@helsaawy helsaawy requested a review from a team as a code owner August 28, 2025 20:46
@rawahars
Copy link

My understanding from this code path was that LCOW does not have a scratch of its own but if we need it, we will share the scratch of the container. If that is correct, then the scratch sharing be validated alongside this new feature.

Also, from security perspective, we should perhaps limit the locations on the UVM where the log files can be written to.

func makeLCOWDoc(ctx context.Context, opts *OptionsLCOW, uvm *UtilityVM) (_ *hcsschema.ComputeSystem, err error) {

@helsaawy
Copy link
Contributor Author

helsaawy commented Sep 2, 2025

@rawahars
I'm not sure about the shared scratch scenario; the log files are meant to be accessed from the uVM directly (either by uVM mounts or some other mechanism running in a privileged container), so the user is meant to handle that on their own.
My understanding of shared scratch is that the containers use the same backing VHD, but that doesn't automatically give them access to each others files

Re: the security perspective, since the uVM itself is the isolation boundary, and containerd does not limit the potential log paths ([1], [2]), it doesn't make sense to do so here either

@apurv15
Copy link
Contributor

apurv15 commented Sep 3, 2025

What is the use case for logging stderr and stdin within UVM?
Can UVM scratch become full of container logs and the logs be also lost if UVM exits?
What is UVM scratch backed with? Is it some VHD inside C:\ContainerPlatData\root\io.containerd.snapshotter.v1.windows-lcow used for UVM scratch?

@helsaawy
Copy link
Contributor Author

helsaawy commented Sep 3, 2025

What is the use case for logging stderr and stdin within UVM?
(privileged) logging/monitoring sidecars need access to other container's stdio
this allows logging to be handled within the uVM, without needing to expose the io to the host, and avoid needing to update the workload containers to change how they write their logs.

Can UVM scratch become full of container logs [...]?

Yes, same as writing to any file within the container

What is UVM scratch backed with? Is it some VHD inside C:\ContainerPlatData\root\io.containerd.snapshotter.v1.windows-lcow used for UVM scratch?

It depends on where the log file is written to. It is up to the user to properly pick where the logs go to and be aware of the uVM's composition, since the uVM filesystem could either be read only, backed by tempfs, or another mount.

[c]an logs be also lost if UVM exits?
yes, this is the same for any sort of data being stored within containers unless an explicit host mount is used

apurv15
apurv15 previously approved these changes Oct 9, 2025
@helsaawy helsaawy changed the base branch from ms/release/0.1 to main October 20, 2025 17:04
@helsaawy helsaawy dismissed apurv15’s stale review October 20, 2025 17:04

The base branch was changed.

@helsaawy helsaawy force-pushed the stdio-tee branch 2 times, most recently from 30769de to 3103e5f Compare October 21, 2025 15:54
@helsaawy helsaawy marked this pull request as draft October 22, 2025 16:51
@helsaawy helsaawy marked this pull request as ready for review October 27, 2025 15:23
}

c.setExitType(signal)
if c.logFile != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the container exits on its own or crashes on its own? Do we guarantee we always call kill in which case we clean this file? I'd prefer that if the multiwriter drops that this code auto runs to close the file personally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed to logic so that the file is opened (and closed) in (*Container).Start.
We dont have Drop in Go (😢), and the Transport interface doesn't have a Close, so we cant tie to log file's lifecycle to that
Additionally, theres some weirdness with how we manage the io relay lifecycle, so i tied the log file to the init process, which will be closed regardless of how the container exits or is removed

A common scenario for pods is to write container logs (`stdout` and
`stderr`) to a file on the host (specified by the `LogPath` and
`LogDirectory` CRI fields), and then mount that directory to another
(e.g., a monitoring or observability container) in the pod for
consumption.
For LCOW, this means logs must traverse from the uVM to the host, and
then back into the uVM via (plan9) share, which both:
 - adds unnecessary overhead; and
 - exposes data to the host.

Address this by adding annotations to mimic `LogPath` and `LogDirectory`
functionality, but within the uVM.
Since the uVM rootfs is not directly backed by a VHD (i.e., directories
are either RO or `tmpfs` backed), create a dedicated directory in the
sandbox scratch for logs, and allow that path to be mounted within a
container.

Note: containerd expects to be the end-all sink for container
logs, use `io.MultiWriter` to keep default behavior and write logs to
the specified path while also exposing them to the host.
Therefore, annotations use `TeeLog` (instead of just `Log`) to make
behavior explicit.

Add `LCOWTeeLogPath` annotation to tee container `stdin` and `stderr`
to a specified file within the uVM, relative to a dedicated log
directory in the sandbox scratch.

Add `LCOWTeeLogDirMount` annotation to mount the above log directory to
the specified path within the container.

Add a dedicated `trapsort.MultiWriter` type that writes to both an
underlying `transport.Connection` and the provided `io.Writer`.

Move `logConnection` from `stdio` to `transport`, to keep the
different `Connection` implementations together.

Add functional tests.

Streamline `test/internal/util` by making `CleanName` operation on the
test name directly (`testing.TB.Name()`), since that is the majority of
the usage.
Add `CleanString` to handle the original functionality.

PR: move log file creation/management to (*Container).Start

Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
Seems to be caused primarily by the `github.com/containerd/errdefs/pkg/errgrpc`
import, but unclear why it is appearing now.

Add replace directive to forcibly ignore older/pre-submodule
`google.golang.org/genproto` package.

See: googleapis/go-genproto#1015

Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
@helsaawy helsaawy merged commit d5a05aa into microsoft:main Nov 5, 2025
17 checks passed
@helsaawy helsaawy deleted the stdio-tee branch November 5, 2025 18:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants