Skip to content

Conversation

@luxas
Copy link
Member

@luxas luxas commented Nov 6, 2025

  • One-line PR description: Initial commit of the Conditional Authorization KEP

/sig auth

Rendered

Status Nov 26: Proposal updated to include discussions from KubeCon and last week's SIG Auth meeting comments.

@k8s-ci-robot k8s-ci-robot added sig/auth Categorizes an issue or PR as relevant to SIG Auth. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory labels Nov 6, 2025
@k8s-ci-robot k8s-ci-robot requested a review from deads2k November 6, 2025 11:59
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: luxas
Once this PR has been reviewed and has the lgtm label, please assign micahhausler for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested a review from ritazh November 6, 2025 11:59
@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Nov 6, 2025
@luxas luxas mentioned this pull request Nov 6, 2025
4 tasks
@enj enj added this to SIG Auth Nov 6, 2025
@enj enj moved this to Needs Triage in SIG Auth Nov 6, 2025
Copy link
Contributor

@everettraven everettraven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome to see this KEP - thanks for digging into this @luxas !

I'll be at both Maintainer Summit and KubeCon, so if folks are planning to discuss this there I'd love to sit in on those discussions :).

Leaving a few thoughts/questions that came to mind as I was doing an initial read.

Comment on lines 525 to 535
// If a NoOpinion condition evaluates to true, the given
// authorizer’s ConditionSet cannot evaluate to Allow anymore, but
// necessarily Deny or NoOpinion, depending on whether there are any
// true EffectDeny conditions.
// However, later authorizers in the chain can still Allow or Deny.
// It is effectively a softer deny that just overrides the
// authorizer's own allow policies. It can be used if an authorizer
// does not consider itself fully authoritative for a given request.
// TODO: Talk about error handling; what happens if any of these
// conditions fail to evaluate.
EffectNoOpinion ConditionEffect = "NoOpinion"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a reader, it might be helpful to have some examples of when an authorizer presenting a condition to be evaluated may not consider itself authoritative.

Probably naively, my understanding of NoOpinion in the existing authorizer sense aligns with "I don't have an enforceable policy defined for this request, and thus have no opinion on the authorization of this request".

If an authorizer is returning a condition to be evaluated for a policy, I would assume that the authorizer inherently has an opinion on whether or not the request should be authorized based on the result of that condition evaluation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great question! 👍 This one is a bit hard to model/reason about yeah. This is effectively a deny policy which can be thought of as appended to each allow condition, as follows, e.g.

  • policy1: "never give lucas any privileges" ("NoOpinion condition")
    • principal.name == "lucas"
  • policy2: "never assign privileges for a node identity" ("NoOpinion condition")
    • principal.groups.contains("system:nodes")
  • policy3: "let group engineers create deployments when serviceAccountName is foo" ("Allow condition")
    • principal.groups.contains("engineers") && action == "create" && resource is core::deployments && resource.request.v1.spec.serviceAccountName == "foo"
  • policy4: "let group readers list deployments" ("Allow condition")
    • principal.groups.contains("readers") && action == "list" && resource is core::deployments

Now, even though these are not all conditional in this example, in theory they could be. The logic of the weaker "NoOpinion deny" conditions is such that they are logically appended to each Allow policy, as follows:

<allow_policy> && !(<noopinion1_policy> || ... || <noopinionN_policy>)
<=>
<allow_policy> && !<noopinion1_policy> && ... && !<noopinionN_policy>

In other words, the allow policy matches only if no NoOpinion deny policy also simultaneously. If any of the NoOpinion policies did match (that is, evaluate to true), it means that all Allow conditions are automatically falsified, as there is ... && !true && ... somewhere in every Allow condition, which necessarily never can make the expression evaluate to true.

In the above example, it is e.g. possible that user lucas also is in the engineers and/or readers groups, and it would be inconvenient to have to "remember" to add && principal.username != "lucas" to every allow condition to remove this possibility.

The above 4 policies are thus logically equivalent to the following two Allow policies, which also could have been written directly, but could be more cumbersome:

  • "let group engineers create deployments when serviceAccountName is foo, unless the principal is lucas or a node"
    • principal.groups.contains("engineers") && action == "create" && resource is core::deployments && resource.request.v1.spec.serviceAccountName == "foo" && principal.username != "lucas" && !principal.groups.contains("system:nodes")
  • "let group readers list deployments, unless the principal is lucas or a node"
    • principal.groups.contains("readers") && action == "list" && resource is core::deployments && principal.username != "lucas" && !principal.groups.contains("system:nodes")

It's always harder to reason about deny policies, and I'm not advocating for their use, but they are such strong primitives that it would feel irresponsible not to take their characteristics into account and plan for their inevitable (?) addition at some point. Today, VAP forms our deny-policy layer, and this proposal (IMO) must thus plan for deny policies existing. VAP, however, always are equivalent to the stronger Deny conditions, as they never consider any other authorizer in the chain, but do exactly as they like 😄.

Whether NoOpinion conditions like these are useful or not, is entirely dependent on what kind of authorizer chain you have. I'm not sure upstream kube can mandate folks to have one type of chain or the other.

However, it would/could be fair to demand a NoOpinion-capable authorizer to do the above rewrite from Allow+NoOpinion conditions to only Allow conditions before sending the conditions to Kubernetes, and in that case Kube would not need to worry about this. However, this is at the cost of longer condition evaluation times (as the same NoOpinion condition would evaluate once for every allow condition, instead of just once), and less nice error messages, as the error message would not be "user lucas cannot get privileges from authorizer foo", but e.g. "let group readers list deployments, unless the principal is lucas or a node", using the example from above.

This is a great point worth discussing, and I'm interested in hearing other people's thoughts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

policy1: "never give lucas any privileges" ("NoOpinion condition")
principal.name == "lucas"
policy2: "never assign privileges for a node identity" ("NoOpinion condition")
principal.groups.contains("system:nodes")

Even with this explanation, I'm not sure I'm following how making these NoOpinion conditions are any different than making them an explicit Deny?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a concrete example is, imagine you have an custom authorizer evaluated prior to NodeAuthorizer. You don't want someone to accidentally write a policy that starts breaking Node requests, so you write a NoOpinion rule saying essentially "Short circuit this whole authorizer, and don't evaluate any of the conditions"

Maybe a question for @luxas would be, why not just have the whole external authorizer short circuit and return a NoOpinon on the SAR with zero conditions instead of have this special case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even with this explanation, I'm not sure I'm following how making these NoOpinion conditions are any different than making them an explicit Deny?

Consider this scenario:

  • Authorizer 1: Allow condition 1, NoOpinion condition 2
  • Authorizer 2: Allow condition 3

If condition 2 matches, it means that the response of the first authorizer will be NoOpinion, regardless of the value of condition 1. However, this means that the next authorizer (2) is considered, which means that the request can become authorized.

If condition 2 would have been of effect Deny, and it evaluated to true, that request would have been necessarily denied.

Maybe a question for @luxas would be, why not just have the whole external authorizer short circuit and return a NoOpinon on the SAR with zero conditions instead of have this special case?

That case you mentioned would indeed not use a NoOpinion condition. The use-case for a NoOpinion condition is to effectively attenuate the every allow policies of the current authorizer, without affecting later authorizers. This could be useful if the authorizer is not authoritative to say "thou shall never do this" (Deny), but only "I won't ever allow this" (NoOpinion)

Copy link
Member

@lmktfy lmktfy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR doesn't follow th conventions we use for our PRs. It's so different that I found it hard to follow.

To help contributors to review it, I recommend aligning to the conventions much more.

@luxas
Copy link
Member Author

luxas commented Nov 19, 2025

This PR doesn't follow the conventions we use for our PRs. It's so different that I found it hard to follow.

Sorry you felt it is hard to follow. It'll get improved over time, with feedback and increased iteration.
Indeed, this KEP has more "background material" than most other, but the purpose of this is to make sure everyone is on the same page before reading the actual proposal, e.g. as partial evaluation is not a technique used in Kubernetes yet. I'll improve the flow of the text as we go, and eventually add all other KEP-related metadata sections that are not here yet to make it conform with the information needed. However, I prefer to proceed iteratively over polishing all parts fully before we've got enough initial reviews.

Copy link
Member Author

@luxas luxas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding comments that were discussed offline at KubeCon. Will update the KEP to reflect these comments as soon as I have time.

Copy link
Member

@lmktfy lmktfy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial thoughts on this KEP.

subject to through (Self)SubjectAccessReview
- Ensure that a request is evaluated against an atomic set of policies in the
authorizer, even in presence of cached authorization decisions
- Allow conditions to be expressed in both transparent, analyzable forms (e.g.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we think sets of authz policies might exist as a. OCI artefact.

  • good idea?
  • bad idea?
  • out of scope?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd consider this out of scope. This KEP is primarily aimed at defining the plumbing of how Kubernetes can handle conditional authorization. Management of policies is a porcelain layer for another KEP

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same opinion, this is up to the discretion of the authorizer. I don't think there will be an attempt to standardize how authorizers manage data. (Or I don't know what the rationale for trying to do so would be).

- Designing or mandating use of a specific policy language as the user experience
- Designing or mandating use of a specific “official” condition syntax
- Expose the conditions to arbitrary admission controllers
- Support conditional authorization for requests other than write and connect verbs.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I can't use this to restrict reads, or for custom verbs (eg approving certificate signing)? That's important and needs calling out much more visibly.


That restriction conflicts with the ambition to cover impersonation; impersonation is neither a write nor a read.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Custom verbs (approve/sign/bind/scale/etc) are mutations on resource request that go through admission and are in scope.

Reads are not in scope because there is no admission on them.

Impersonation is not a read or mutation of resources stored the database, but an API construct for identity transformations, and something that you do want to authorize.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kind of. In an aggregated API server case, let's say I have an API kind named User, and a custom verb enrol. However this doesn't represent a change in desired state and will only show up in status. The enrollment is actually triggered by a separate API named EnrollmentRequest.

In that story, an enrol not a mutation of anything stored, and the conventional admission layer would only see the EnrollmentRequest.

It doesn't have to be a write either. This kind of mechamism could be relevant to a decrypt verb too. Except we're saying it can't be.


Yes, most APIs are served in-tree. But this feels like quite far away from the abstract model we have for authz. Now the applicability depends on whether the request is [an attempted write || impersonation]. And even that is complexity that people need to hold in mind.

I really hope we can find a way forward that works for all APIs and for all, or at least most, of the ways Kubernetes might evolve.

Copy link
Member Author

@luxas luxas Dec 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call-out 👍 I missed to detail that what verbs we support enforcing conditional authorization on is different from what verbs can return conditional responses in SubjectAccessReview. The former considers only kube-apiserver requests, and is thus restricted to write and connect verbs only, as only for those kube-apiserver has auxiliary request data.

However, the framework as such supports any verb, your authorizer can return conditions on a custom verb, e.g. enrol, and the SubjectAccessReview client (e.g. your custom aggregated API server) can choose to enforce those conditions exactly how it sees fit. So I don't see any conflicts with what you're asking for.

I updated the doc now to call this out.


- Must preserve order of authorizers; evaluation must not differ from how it has
been evaluated using separate authorization and admission phases.
- Works for connectible (CONNECT verb) resources; only when the accessing the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is partial support for CONNECT authz a requirement?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so, as you can already express predicates on options (prominently available in connect requests) in VAP, and there is another KEP (the linked node proxying one) which is all about restricting the options of connectible requests.

[KEP-2862: Fine-grained Kubelet API Authorization](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2862-fine-grained-kubelet-authz/README.md))
- Encode the logic of the NodeRestriction admission controller through a
condition returned by the Node authorizer
- Empower out-of-tree authorizers to perform finer-grained policies, e.g. with
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to understand what this means.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Today, the only dynamic in-tree authorizer is RBAC, and it does not support attribute or label-based controls.

With this capability, out of tree authorizers could enforce rules like "users in group label-owners can only mutate objects that have the label owner = <username>." This is an example of an attribute-based control (referencing the username to the resource) as well as a label-based control (owner label)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this wording now in case it was confusing. Hopefully the goal became clear to you @lmktfy


Now, in the introduction it was mentioned that with this proposal, two new
[authorizer.Decision](https://pkg.go.dev/k8s.io/apiserver/pkg/authorization/authorizer#Decision)
types are added, namely `ConditionalAllow` or `ConditionalDeny`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I do a SubjectAccessReview as an aggregated API server, is it possible that I see a verdict at to ConditionalAllow?

(if I might, what should the API server then do given that verdict?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an open question. Based on recent conversations, I think we'll need to have APIService opt-in (via a new field) to letting ConditionalAllow requests through so the aggregated API can make its own SAR.

When the aggregated API makes its own SAR, it'll get back the condition list in the SAR, at which point the aggregated API will need to resolve those conditions itself.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a lot of extra complexity to inflict on someone writing an aggregated API server.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extra field on APIService is extra complexity? Implementing SAR and admission is already a responsibility for aggregated API server authors

Copy link
Member

@lmktfy lmktfy Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It ("conditional authz") is plausibly going to make the authz check code grow by something like 500%.

So, yes, extra complexity.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for self: for the API server to be able to serve AuthorizationConditionsReview, we need to add AuthorizerName to the serialization to keep track of which authorizer authored the condition.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the updated proposal address your questions @lmktfy?

Copy link
Member

@lmktfy lmktfy Dec 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems a bit … unfair on the client (ie, the aggregated API server). Let's say it makes a v1 SubjectAccessReview request, gets a ConditionalAllow verdict in the response, and because of this value being unknown, crashes that thread. It recovers but the code that was working now needs a new release; also, because the verdicts were part of the library ABI, it needs a new major version.

The person who wrote that against the v1 API shouldn't complain too hard about the need to bump their library version, but that deserialization-to-crash-and-recovery outcome seems plausible.

We could be kinder by returning plain denies to clients that only support SubjectAccessReview v1. Possibly an annotated deny, but a deny rather than a conditional allow.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would it crash? An aggregated API server that does not support conditional authz will never set spec.conditionsMode, which means that the authorizer will see spec.conditionsMode="", which means "never return conditions". If the authorizer wanted to return conditions in this case, it must fold into either NoOpinion or Deny, that the old aggregated API server understands without problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, it wouldn't.

OK. Good to know. The KEP covers lot and I found it hard to work out that this story would go OK.

Copy link
Member Author

@luxas luxas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some updates/comments with info from the SIG Auth meeting of Nov 19


Now, in the introduction it was mentioned that with this proposal, two new
[authorizer.Decision](https://pkg.go.dev/k8s.io/apiserver/pkg/authorization/authorizer#Decision)
types are added, namely `ConditionalAllow` or `ConditionalDeny`.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The flow for aggregated API servers is not any different from the main one.

  1. A request arrives at an aggregated API server. Authentication takes place.
  2. The aggregated API server is configured to webhook a SAR to kube-apiserver. There is no difference from webhooking to kube-apiserver than any other webhook impl
  3. Normal flow, if conditions are returned, the aggregated API server evaluates and enforces them.

No extra APIService field is needed. If a request hits the kube-apiserver first, it forwards it to the aggregated API server IFF the response is Allow or ConditionalAllow. The conditions are not directly propagated from the API server to the aggregated one, but respected thanks to the aggregated API server being configured to use webhook authz to kube-apiserver. In the SIG meeting, it was discussed that if the aggregated API server violates the security guidelines of not using the kube-apiserver as a webhook authorizer, that is a configuration error. And in that case, users can just hit the aggregated API server endpoint directly.

What I need to add to the proposal is the ability for the API server to implement the AuthorizationConditionsReview endpoint, however.

@luxas
Copy link
Member Author

luxas commented Dec 4, 2025

All comments so far should be addressed and responded to now. Please let me know other feedback you have and/or if anything's unclear.

@k8s-ci-robot
Copy link
Contributor

@luxas: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-enhancements-test 5a97afa link true /test pull-enhancements-test
pull-enhancements-verify 5a97afa link true /test pull-enhancements-verify

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@lmktfy
Copy link
Member

lmktfy commented Dec 4, 2025

See #5684 (comment) fory thoughts on being kind to aggregated API servers doing SubjectAccessReviews.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/auth Categorizes an issue or PR as relevant to SIG Auth. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

Status: Needs Triage

Development

Successfully merging this pull request may close these issues.

5 participants