Extend textual configuration support with the Datalayer's configuration #1914

Open

shmuelk wants to merge 14 commits into kubernetes-sigs:main from shmuelk:datalayer-config

+419 −22

Contributor

shmuelk commented Nov 30, 2025

What type of PR is this?
/kind feature

What this PR does / why we need it:
This PR removes the need to configure the new V2 DataLayer from code. It extends the existing textual configuration enabling the configuration of DataSources and their respective Extractors.

Does this PR introduce a user-facing change?:

- The Datalayer is now configured via the standard EPP text based configuration.

k8s-ci-robot added do-not-merge/work-in-progress kind/feature labels

Contributor

k8s-ci-robot commented Nov 30, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: shmuelk
Once this PR has been reviewed and has the lgtm label, please assign danehans for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot added the cncf-cla: yes label

k8s-ci-robot requested review from elevran and robscott

November 30, 2025 08:27

netlify bot commented Nov 30, 2025

✅ Deploy Preview for gateway-api-inference-extension ready!

Name	Link
🔨 Latest commit	`78d915c`
🔍 Latest deploy log	https://app.netlify.com/projects/gateway-api-inference-extension/deploys/6931b3ef5886f70008a1a278
😎 Deploy Preview	https://deploy-preview-1914--gateway-api-inference-extension.netlify.app

k8s-ci-robot added the size/L label

shmuelk force-pushed the datalayer-config branch from df1171b to a2a0cec Compare

December 1, 2025 16:29

shmuelk changed the title ~~WIP: Extent textual configuration support with the Datalayer's configuration~~ Extend textual configuration support with the Datalayer's configuration

k8s-ci-robot removed the do-not-merge/work-in-progress label

elevran reviewed

apix/config/v1alpha1/endpointpickerconfig_types.go Outdated

    
              func (cfg EndpointPickerConfig) String() string {
              	return fmt.Sprintf(
              -		"{Plugins: %v, SchedulingProfiles: %v, FeatureGates: %v, SaturationDetector: %v}",
              +		"{Plugins: %v, SchedulingProfiles: %v, Data: %v, FeatureGates: %v, SaturationDetector: %v}",

Contributor

elevran Dec 2, 2025

nit: put feature gates first?

Contributor Author

shmuelk Dec 2, 2025

done

pkg/epp/config/loader/configloader.go Resolved

site-src/guides/epp-configuration/config-text.md Resolved

site-src/guides/epp-configuration/config-text.md Outdated

    
              The saturationDetector section configures the saturation detector, which is used to determine if special
              action needs to be taken due to the system being overloaded or saturated. This section is described in more detail in the section [Saturation Detector configuration](#saturation-detector-configuration)

              The data section configures the data layer, which is used to gather metrics and other data used in making scheduling decisions.

Contributor

elevran Dec 2, 2025

Suggested change

      
            - The data section configures the data layer, which is used to gather metrics and other data used in making scheduling decisions.
            + The `data` section configures the data layer, which is used to gather information (such as metrics) used in making scheduling decisions.

Contributor

elevran Dec 2, 2025

In general, may want to use backticks to mark section names explicitly (i.e., `data` and not data)

Contributor Author

shmuelk Dec 2, 2025

Done

site-src/guides/epp-configuration/config-text.md Outdated

    
              ## Data Layer configuration

              The Data Layer collects metrics and other data used in scheduling decisions made by the various configured
              filters and plugins. The exact data collected varies by the DataSource and Extractors configured. The basic ones

Contributor

elevran Dec 2, 2025

"filters" is specific to scheduling, but "plugins" is generic. Maybe use "scheduling plugins" as the term?

Contributor Author

shmuelk Dec 2, 2025

Done

site-src/guides/epp-configuration/config-text.md Outdated

Comment on lines 341 to 342

    
              filters and plugins. The exact data collected varies by the DataSource and Extractors configured. The basic ones
              collect Prometheus metrics from the Model Servers in the InferencePool.

Contributor

elevran Dec 2, 2025

Suggested change

      
            - filters and plugins. The exact data collected varies by the DataSource and Extractors configured. The basic ones
            - collect Prometheus metrics from the Model Servers in the InferencePool.
            + filters and plugins. The exact data collected varies by the DataSource and Extractors configured. The baseline
            + provided in GAIE collect Prometheus metrics from the Model Servers in the InferencePool.

nit: do we want to reference MSP doc here? Not required.

Contributor Author

shmuelk Dec 2, 2025

I don't think so. If at all, it should be mentioned in the more detailed Datalayer documentation.

site-src/guides/epp-configuration/config-text.md Outdated

    
              filters and plugins. The exact data collected varies by the DataSource and Extractors configured. The basic ones
              collect Prometheus metrics from the Model Servers in the InferencePool.

              The Data Layer is configured via the data section of the overall configuration. It has the following form:

Contributor

elevran Dec 2, 2025

Suggested change

      
            - The Data Layer is configured via the data section of the overall configuration. It has the following form:
            + The Data Layer is configured via the `data` section of the overall configuration. It has the following form:

Contributor Author

shmuelk Dec 2, 2025

Done

site-src/guides/epp-configuration/config-text.md Outdated

    
              Each entry in the sources list has the following fields:

              - *pluginRef* is a reference to the name of the plugin instance to be used.
              - *extractors* specifies the list of the extractor plugin instances, by name, to be used with this DataSource.

Contributor

elevran Dec 2, 2025

nit:
I find having a required pluginRef to name the data source but no pluginRef for extractors somewhat inconsistent. If we can name extractors directly, can't we do the same for datasource... Logically map[string][]string
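
For readers comparing the two shapes discussed in this comment, here is a minimal Go sketch. It is not the PR's code: `DataLayerSource` and `toMap` are hypothetical names, the plugin instance names in `main` are invented, and the only details taken from the thread are the `pluginRef` and `extractors` fields and the `map[string][]string` alternative.

```go
package main

import "fmt"

// DataLayerSource mirrors the documented entry shape: a pluginRef naming the
// DataSource plugin instance plus the names of its extractor plugin instances.
// Hypothetical type for illustration; the PR's actual struct may differ.
type DataLayerSource struct {
	PluginRef  string   `json:"pluginRef"`
	Extractors []string `json:"extractors"`
}

// toMap converts the list form into the map[string][]string form suggested in
// the review: data source name -> extractor names. The list form can contain
// the same pluginRef twice, so the conversion reports that as an error.
func toMap(sources []DataLayerSource) (map[string][]string, error) {
	out := make(map[string][]string, len(sources))
	for _, s := range sources {
		if _, dup := out[s.PluginRef]; dup {
			return nil, fmt.Errorf("data source %q is listed more than once", s.PluginRef)
		}
		out[s.PluginRef] = s.Extractors
	}
	return out, nil
}

func main() {
	// Invented instance names, used only to exercise the conversion.
	sources := []DataLayerSource{
		{PluginRef: "metrics-source", Extractors: []string{"metrics-extractor"}},
	}
	m, err := toMap(sources)
	if err != nil {
		panic(err)
	}
	fmt.Println(m)
}
```

One trade-off the sketch makes visible: the map form rules out duplicate data sources by construction but drops the ordering of entries, while the list form keeps ordering and leaves room for extra per-source fields later.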

Contributor

elevran commented Dec 2, 2025

/lgtm
left some questions/style suggestions that you may wish to address (or not), so unhold at your discretion

k8s-ci-robot assigned elevran

k8s-ci-robot added the lgtm label

Contributor

elevran commented Dec 2, 2025

/hold

k8s-ci-robot added do-not-merge/hold and removed lgtm labels

Contributor Author

shmuelk commented Dec 2, 2025

/unhold

k8s-ci-robot removed the do-not-merge/hold label

Contributor

elevran commented Dec 2, 2025

/lgtm

k8s-ci-robot added the lgtm label

kfswain reviewed

apix/config/v1alpha1/endpointpickerconfig_types.go Outdated

    
              	return "{" + result + "}"
              }

              // DataLayerConfig contains the configuration of the V2 DataLayer feature

Collaborator

kfswain Dec 1, 2025

Not this PR, but we need to figure out what we want to do with this config API. It's essentially a full-fledged API, but not a CRD, and so is never 'applied' to k8s, and the control plane has no knowledge of it.

We essentially treat this just like a normal config file, but the YAML, and the way it's structured in the code base, makes it look like a CRD. I feel that if I were a newcomer, I would expect this file to be updatable and a part of the k8s ecosystem, which it is not.

Not a burning need, and again, not this PR. Just something to think about.

Contributor Author

shmuelk Dec 4, 2025

Originally the configuration wasn't a pseudo CRD. @ahg-g requested that I use the K8S machinery to parse the YAML text.

This can be backed off if desired. We simply need to come to a decision.

apix/config/v1alpha1/endpointpickerconfig_types.go Outdated

    
              	return "{" + result + "}"
              }

              // DataLayerConfig contains the configuration of the V2 DataLayer feature

Collaborator

kfswain Dec 1, 2025

super nit: remove v2, I don't think we have any configuration of the v1 data layer, so once v1 is finished implementation, it's the only data layer.

Contributor Author

shmuelk Dec 4, 2025

Done

pkg/epp/config/loader/configloader.go

    
              func loadDataLayerConfig(rawDataConfig *configapi.DataLayerConfig, rawFeatureGates configapi.FeatureGates, handle plugins.Handle) (*datalayer.Config, error) {
              	featureGates := loadFeatureConfig(rawFeatureGates)
              	if !featureGates[datalayer.FeatureGate] {

Collaborator

kfswain Dec 1, 2025

Should we return an error if the datalayer (or any feature) is configured with plugins but the feature gate is closed? A user may want to know about that early on.

Contributor Author

shmuelk Dec 4, 2025

Added a check and an error.

shmuelk force-pushed the datalayer-config branch from e2227e9 to 8300077 Compare

December 4, 2025 12:11

Contributor

k8s-ci-robot commented Dec 4, 2025

New changes are detected. LGTM label has been removed.

k8s-ci-robot removed the lgtm label

shmuelk added 4 commits

December 4, 2025 18:14


          Added datalayer config to text config definition

1826ebc

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>


          Added datalayer config to EPP config

f4651b3

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>


          Load datalayer config from text to EPP config

75c4073

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>


          Updated tests for datalayer config

b74444e

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

shmuelk added 10 commits

December 4, 2025 18:14


          Updated configuration documentation

aa75bfa

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>


          Refactored configuration DataLayer Extractor elements

4d74eab

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>


          Test data updates due to configuration refactoring

a09c18c

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>


          Documentation updates due to comments from the review

1a2e848

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>


          Removed V2 from errors and descriptions WRT the Datalayer

6c3709f

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>


          Ensure there is no Datalayer config if it isn't enabled

08644f8

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>


          Debug build failure

8d685af

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>


          Removed debugging statement

179567d

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>


          Add verbose to failing command

736d81f

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>


          Remove verbose from formerly failing command

78d915c

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

shmuelk force-pushed the datalayer-config branch from 2bc2edd to 78d915c Compare

December 4, 2025 16:16

Labels

cncf-cla: yes kind/feature size/L