From 6a5bc62b84b150181ec76b4e0f04a3bbf12b60c2 Mon Sep 17 00:00:00 2001
From: huizhang
Date: Thu, 22 Jan 2026 17:45:01 +0800
Subject: [PATCH 1/5] add ACP Registry Capacity Planning Guide doc

---
 ...Container_Pltform_Registry_Sizing_Guide.md | 58 +++++++++++++++++++
 1 file changed, 58 insertions(+)
 create mode 100644 docs/en/solutions/Alauda_Container_Pltform_Registry_Sizing_Guide.md

diff --git a/docs/en/solutions/Alauda_Container_Pltform_Registry_Sizing_Guide.md b/docs/en/solutions/Alauda_Container_Pltform_Registry_Sizing_Guide.md
new file mode 100644
index 0000000..79d0857
--- /dev/null
+++ b/docs/en/solutions/Alauda_Container_Pltform_Registry_Sizing_Guide.md
@@ -0,0 +1,58 @@
+---
+kind:
+ - Solution
+products:
+ - Alauda Container Platform
+ProductsVersion:
+ - 4.x
+---
+
+# Alauda Container Platform Registry & Registry Gateway Capacity Planning Guide
+
+## Introduction
+This document provides hardware resource specification recommendations for **Alauda Container Platform Registry** in Kubernetes environments. The stack consists of two core components:
+
+* **Alauda Container Platform Registry**: The OCI image registry server responsible for storing and distributing image layers and manifests. It is I/O and network-intensive.
+* **Registry Gateway**: A proxy middleware that enforces policies such as image size limits and repository tag count limits before requests reach the registry. It is primarily CPU and network-latency intensive.
+
+The recommendations are based on an analysis of component architectures, source code, and known performance characteristics, targeting three common deployment scales.
+
+## Component Analysis & Resource Profiles
+
+### Alauda Container Platform Registry
+
+**Resource Profile**:
+ * **I/O Intensive**: Performance is heavily dependent on storage backend speed (for layer push/pull operations).
+ * **Memory Sensitive**: Requires adequate memory for layer caching during pushes and pulls, and for handling concurrent connections.
+ * **Moderate CPU**: CPU is used for compression, hashing, and request handling.
+
+### Registry Gateway
+
+**Resource Profile**:
+ * **CPU Intensive**: Due to JSON parsing, size calculation, and request proxying.
+ * **Latency Sensitive**: Performance is tightly coupled with the response time of the backend Registry's tag listing endpoint.
+ * **Memory Sensitive**: Needs buffer for large manifest requests and maintains session cache.
+
+## Recommended Specifications by Cluster Scale
+The following tables provide baseline recommendations for resource requests and limits. Vertical scaling (increasing replica resources) and horizontal scaling (increasing replica count) should be combined.
+
+### Scenario 1: ~100 Concurrent Pods (Light Usage)
+| Component | Recommended Replicas | Container Resources (Requests / Limits) | Notes |
+| --------- | -------------------- | -------------------------------------- | ---------------- |
+| Alauda Container Platform Registry | 1-2 | CPU: `500m` / `1000m`<br>Memory: `512Mi` / `1Gi` | Single replica may suffice. |
+| Registry Gateway | 1-2 | CPU: `200m-300m` / `500m`<br>Memory: `256Mi-512Mi` / `1Gi` | Resources accommodate bursty image pushes requiring manifest parsing. |
+
+### Scenario 2: ~1000 Concurrent Pods (Medium Usage)
+| Component | Recommended Replicas | Container Resources (Requests / Limits) | Notes |
+| --------- | -------------------- | -------------------------------------- | ---------------- |
+| Alauda Container Platform Registry | 2-3 | CPU: `1000m` / `2000m`<br>Memory: `1Gi` / `2Gi` | Requires multiple replicas. |
+| Registry Gateway | 2-3 | CPU: `300m-500m` / `1000m-2000m`<br>Memory: `512Mi-1Gi` / `2Gi` | The synchronous tag-list check becomes a primary bottleneck. Higher CPU limits are needed. |
+
+### Scenario 3: ~5000 Concurrent Pods (Large Usage)
+| Component | Recommended Replicas | Container Resources (Requests / Limits) | Notes |
+| --------- | -------------------- | -------------------------------------- | ---------------- |
+| Alauda Container Platform Registry | 3-5+ | CPU: `2000m` / `4000m`<br>Memory: `2Gi` / `4Gi` | Requires significant horizontal scaling. |
+| Registry Gateway | 3-5+ | CPU: `500m-1000m` / `2000m-4000m`<br>Memory: `1Gi-2Gi` / `4Gi` | Tag validation latency can cause cascading delays. |
+
+## Final Recommendation
+Start with the baseline suggestions for your target scale, implement comprehensive monitoring, and iteratively adjust resources and replica counts based on observed performance metrics.

From d17703bff74a1115f069f4e0741c89b6e00fc628 Mon Sep 17 00:00:00 2001
From: huizhang
Date: Thu, 22 Jan 2026 19:04:19 +0800
Subject: [PATCH 2/5] add ACP Registry Capacity Planning Guide doc

---
 ...Alauda_Container_Platform_Registry_Capacity_Planning_Guide.md} | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename docs/en/solutions/{Alauda_Container_Pltform_Registry_Sizing_Guide.md => Alauda_Container_Platform_Registry_Capacity_Planning_Guide.md} (100%)

diff --git a/docs/en/solutions/Alauda_Container_Pltform_Registry_Sizing_Guide.md b/docs/en/solutions/Alauda_Container_Platform_Registry_Capacity_Planning_Guide.md
similarity index 100%
rename from docs/en/solutions/Alauda_Container_Pltform_Registry_Sizing_Guide.md
rename to docs/en/solutions/Alauda_Container_Platform_Registry_Capacity_Planning_Guide.md

From 47af98439e94279b9fac27cfe48f94776bb02fe3 Mon Sep 17 00:00:00 2001
From: huizhang
Date: Fri, 23 Jan 2026 12:25:52 +0800
Subject: [PATCH 3/5] add ACP Registry Capacity Planning Guide doc

---
 ...atform_Registry_Capacity_Planning_Guide.md | 51 ++++++++++++++-----
 1 file changed, 39 insertions(+), 12 deletions(-)

diff --git a/docs/en/solutions/Alauda_Container_Platform_Registry_Capacity_Planning_Guide.md b/docs/en/solutions/Alauda_Container_Platform_Registry_Capacity_Planning_Guide.md
index 79d0857..71cf25b 100644
--- a/docs/en/solutions/Alauda_Container_Platform_Registry_Capacity_Planning_Guide.md
+++ b/docs/en/solutions/Alauda_Container_Platform_Registry_Capacity_Planning_Guide.md
@@ -33,26 +33,53 @@ The recommendations are based on an analysis of component architectures, source
  * **Latency Sensitive**: Performance is tightly coupled with the response time of the backend Registry's tag listing endpoint.
  * **Memory Sensitive**: Needs buffer for large manifest requests and maintains session cache.

-## Recommended Specifications by Cluster Scale
-The following tables provide baseline recommendations for resource requests and limits. Vertical scaling (increasing replica resources) and horizontal scaling (increasing replica count) should be combined.
+## Core Evaluation Dimensions
+This guide provides resource configuration recommendations based on the following two dynamic load indicators:
+ * **Daily Average Access Traffic**: Reflects ongoing daily load levels.
+ * **Peak Access Traffic**: Reflects the maximum concurrent pressure the system needs to handle.
+These traffic flows primarily consist of two types of operations:
+ * **Push Operations**: Trigger image uploads, manifest parsing, and tag validation, placing higher demands on gateway CPU and memory.
+ * **Pull Operations**: Mainly generate pressure on registry I/O and network
+
+## Traffic Level Definitions
+| Traffic Level | Daily Pull/Push Operations | Peak Concurrent Pull/Push Operations | Typical Scenario |
+| --------- | -------------- | --------------------- | -------------------------- |
+| Low Traffic | <1,000 | < 50 | Small team development/testing, light usage |
+| Medium Traffic | 1,000-10,000 | 50-200 | Production environment with formal CI/CD pipelines |
+| High Traffic | >10,000 | >200 | Enterprise central registry, shared by multiple teams |
+
+## Recommended Resource Configurations
+
+### Scenario 1: Low Traffic
+Applicable: Small team development/testing, infrequent image updates

-### Scenario 1: ~100 Concurrent Pods (Light Usage)
 | Component | Recommended Replicas | Container Resources (Requests / Limits) | Notes |
 | --------- | -------------------- | -------------------------------------- | ---------------- |
-| Alauda Container Platform Registry | 1-2 | CPU: `500m` / `1000m`<br>Memory: `512Mi` / `1Gi` | Single replica may suffice. |
-| Registry Gateway | 1-2 | CPU: `200m-300m` / `500m`<br>Memory: `256Mi-512Mi` / `1Gi` | Resources accommodate bursty image pushes requiring manifest parsing. |
+| Alauda Container Platform Registry | 2 | CPU: `500m` / `1000m`<br>Memory: `512Mi` / `1Gi` | Basic configuration sufficient. |
+| Registry Gateway | 2 | CPU: `250m` / `500m`<br>Memory: `256Mi` / `512Mi` | Basic configuration sufficient. |
+
+### Scenario 2: Medium Traffic
+Applicable: Production environment with regular release processes, multiple pipelines

-### Scenario 2: ~1000 Concurrent Pods (Medium Usage)
 | Component | Recommended Replicas | Container Resources (Requests / Limits) | Notes |
 | --------- | -------------------- | -------------------------------------- | ---------------- |
-| Alauda Container Platform Registry | 2-3 | CPU: `1000m` / `2000m`<br>Memory: `1Gi` / `2Gi` | Requires multiple replicas. |
-| Registry Gateway | 2-3 | CPU: `300m-500m` / `1000m-2000m`<br>Memory: `512Mi-1Gi` / `2Gi` | The synchronous tag-list check becomes a primary bottleneck. Higher CPU limits are needed. |
+| Alauda Container Platform Registry | 3 | CPU: `1000m` / `2000m`<br>Memory: `1Gi` / `2Gi` | The use of object storage (S3-compatible) is advisable. |
+| Registry Gateway | 3 | CPU: `500m` / `1000m`<br>Memory: `512Mi` / `1Gi` | HPA required to handle push peaks. |
+
+### Scenario 3: High Traffic
+Applicable: Enterprise central registry serving multiple teams and all environments

-### Scenario 3: ~5000 Concurrent Pods (Large Usage)
 | Component | Recommended Replicas | Container Resources (Requests / Limits) | Notes |
 | --------- | -------------------- | -------------------------------------- | ---------------- |
-| Alauda Container Platform Registry | 3-5+ | CPU: `2000m` / `4000m`<br>Memory: `2Gi` / `4Gi` | Requires significant horizontal scaling. |
-| Registry Gateway | 3-5+ | CPU: `500m-1000m` / `2000m-4000m`<br>Memory: `1Gi-2Gi` / `4Gi` | Tag validation latency can cause cascading delays. |
+| Alauda Container Platform Registry | 5 | CPU: `2000m` / `4000m`<br>Memory: `2Gi` / `4Gi` | The use of object storage is advisable. |
+| Registry Gateway | 5 | CPU: `1000m` / `2000m`<br>Memory: `1Gi` / `2Gi` | HPA mandatory, scaling based on CPU and latency metrics. |
+
+## Considerations for Dedicated Node Deployment
+In production environments, deploying the `Alauda Container Platform Registry` and `Registry Gateway` on **dedicated nodes** (separate from core PaaS components) is strongly recommended when any of the following conditions apply:
+ * **High Concurrency/Throughput**: The registry handles over 10,000 daily operations, or experiences frequent batch image pulls during cluster scaling.
+ * **High Availability & Strict SLA Requirements**: Requires >99.9% availability, supports replication, or needs independent upgrade/disaster recovery procedures.
+ * **Resource Isolation & Security Compliance**: Mandated by multi-tenancy or regulatory audits, requiring separate security policies, logging, and data isolation.
+**Benefits**: Prevents resource contention with critical platform services (e.g., API Server), minimizes performance interference, and simplifies security management.

 ## Final Recommendation
-Start with the baseline suggestions for your target scale, implement comprehensive monitoring, and iteratively adjust resources and replica counts based on observed performance metrics.
+It is recommended to configure resources based on daily/peak traffic, with basic setup for low traffic and HPA for medium/high traffic. For production environments, high concurrency (over 10k daily operations), or scenarios requiring high availability and strong isolation, it is strongly advised to deploy the Registry and Gateway on dedicated nodes. This approach avoids resource contention with core PaaS components, minimizes performance fluctuations, and facilitates independent disaster recovery, security policies, and storage optimization, ensuring service stability and data isolation.

From aec50bc2e3506004b27431b36b77c338832df707 Mon Sep 17 00:00:00 2001
From: huizhang
Date: Fri, 23 Jan 2026 12:33:20 +0800
Subject: [PATCH 4/5] add ACP Registry Capacity Planning Guide doc

---
 ...atform_Registry_Capacity_Planning_Guide.md | 27 ++++++++++---------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/docs/en/solutions/Alauda_Container_Platform_Registry_Capacity_Planning_Guide.md b/docs/en/solutions/Alauda_Container_Platform_Registry_Capacity_Planning_Guide.md
index 71cf25b..1ef6d01 100644
--- a/docs/en/solutions/Alauda_Container_Platform_Registry_Capacity_Planning_Guide.md
+++ b/docs/en/solutions/Alauda_Container_Platform_Registry_Capacity_Planning_Guide.md
@@ -22,26 +22,27 @@ The recommendations are based on an analysis of component architectures, source
 ### Alauda Container Platform Registry

 **Resource Profile**:
- * **I/O Intensive**: Performance is heavily dependent on storage backend speed (for layer push/pull operations).
- * **Memory Sensitive**: Requires adequate memory for layer caching during pushes and pulls, and for handling concurrent connections.
- * **Moderate CPU**: CPU is used for compression, hashing, and request handling.
+* **I/O Intensive**: Performance is heavily dependent on storage backend speed (for layer push/pull operations).
+* **Memory Sensitive**: Requires adequate memory for layer caching during pushes and pulls, and for handling concurrent connections.
+* **Moderate CPU**: CPU is used for compression, hashing, and request handling.

 ### Registry Gateway

 **Resource Profile**:
- * **CPU Intensive**: Due to JSON parsing, size calculation, and request proxying.
- * **Latency Sensitive**: Performance is tightly coupled with the response time of the backend Registry's tag listing endpoint.
- * **Memory Sensitive**: Needs buffer for large manifest requests and maintains session cache.
+* **CPU Intensive**: Due to JSON parsing, size calculation, and request proxying.
+* **Latency Sensitive**: Performance is tightly coupled with the response time of the backend Registry's tag listing endpoint.
+* **Memory Sensitive**: Needs buffer for large manifest requests and maintains session cache.

 ## Core Evaluation Dimensions
 This guide provides resource configuration recommendations based on the following two dynamic load indicators:
- * **Daily Average Access Traffic**: Reflects ongoing daily load levels.
- * **Peak Access Traffic**: Reflects the maximum concurrent pressure the system needs to handle.
+* **Daily Average Access Traffic**: Reflects ongoing daily load levels.
+* **Peak Access Traffic**: Reflects the maximum concurrent pressure the system needs to handle.
 These traffic flows primarily consist of two types of operations:
- * **Push Operations**: Trigger image uploads, manifest parsing, and tag validation, placing higher demands on gateway CPU and memory.
- * **Pull Operations**: Mainly generate pressure on registry I/O and network
+* **Push Operations**: Trigger image uploads, manifest parsing, and tag validation, placing higher demands on gateway CPU and memory.
+* **Pull Operations**: Mainly generate pressure on registry I/O and network

 ## Traffic Level Definitions
+
 | Traffic Level | Daily Pull/Push Operations | Peak Concurrent Pull/Push Operations | Typical Scenario |
 | --------- | -------------- | --------------------- | -------------------------- |
 | Low Traffic | <1,000 | < 50 | Small team development/testing, light usage |
@@ -76,9 +77,9 @@ Applicable: Enterprise central registry serving multiple teams and all environme

 ## Considerations for Dedicated Node Deployment
 In production environments, deploying the `Alauda Container Platform Registry` and `Registry Gateway` on **dedicated nodes** (separate from core PaaS components) is strongly recommended when any of the following conditions apply:
- * **High Concurrency/Throughput**: The registry handles over 10,000 daily operations, or experiences frequent batch image pulls during cluster scaling.
- * **High Availability & Strict SLA Requirements**: Requires >99.9% availability, supports replication, or needs independent upgrade/disaster recovery procedures.
- * **Resource Isolation & Security Compliance**: Mandated by multi-tenancy or regulatory audits, requiring separate security policies, logging, and data isolation.
+* **High Concurrency/Throughput**: The registry handles over 10,000 daily operations, or experiences frequent batch image pulls during cluster scaling.
+* **High Availability & Strict SLA Requirements**: Requires >99.9% availability, supports replication, or needs independent upgrade/disaster recovery procedures.
+* **Resource Isolation & Security Compliance**: Mandated by multi-tenancy or regulatory audits, requiring separate security policies, logging, and data isolation.
 **Benefits**: Prevents resource contention with critical platform services (e.g., API Server), minimizes performance interference, and simplifies security management.

 ## Final Recommendation

From 331c71a1692a8e36fdd100b73b98519394cc5576 Mon Sep 17 00:00:00 2001
From: huizhang
Date: Fri, 23 Jan 2026 12:38:23 +0800
Subject: [PATCH 5/5] add ACP Registry Capacity Planning Guide doc

---
 ..._Container_Platform_Registry_Capacity_Planning_Guide.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/en/solutions/Alauda_Container_Platform_Registry_Capacity_Planning_Guide.md b/docs/en/solutions/Alauda_Container_Platform_Registry_Capacity_Planning_Guide.md
index 1ef6d01..c4cb6e3 100644
--- a/docs/en/solutions/Alauda_Container_Platform_Registry_Capacity_Planning_Guide.md
+++ b/docs/en/solutions/Alauda_Container_Platform_Registry_Capacity_Planning_Guide.md
@@ -13,7 +13,7 @@ ProductsVersion:
 This document provides hardware resource specification recommendations for **Alauda Container Platform Registry** in Kubernetes environments. The stack consists of two core components:

 * **Alauda Container Platform Registry**: The OCI image registry server responsible for storing and distributing image layers and manifests. It is I/O and network-intensive.
-* **Registry Gateway**: A proxy middleware that enforces policies such as image size limits and repository tag count limits before requests reach the registry. It is primarily CPU and network-latency intensive.
+* **Registry Gateway**: A proxy middleware that enforces policies such as image size limits and repository tag count limits before requests reach the registry. It is primarily CPU and network-latency-intensive.

 The recommendations are based on an analysis of component architectures, source code, and known performance characteristics, targeting three common deployment scales.

@@ -38,8 +38,8 @@ This guide provides resource configuration recommendations based on the followin
 * **Daily Average Access Traffic**: Reflects ongoing daily load levels.
 * **Peak Access Traffic**: Reflects the maximum concurrent pressure the system needs to handle.
 These traffic flows primarily consist of two types of operations:
-* **Push Operations**: Trigger image uploads, manifest parsing, and tag validation, placing higher demands on gateway CPU and memory.
-* **Pull Operations**: Mainly generate pressure on registry I/O and network
+ - **Push Operations**: Trigger image uploads, manifest parsing, and tag validation, placing higher demands on gateway CPU and memory.
+ - **Pull Operations**: Mainly generate pressure on registry I/O and network

 ## Traffic Level Definitions

@@ -80,6 +80,7 @@ In production environments, deploying the `Alauda Container Platform Registry` a
 * **High Concurrency/Throughput**: The registry handles over 10,000 daily operations, or experiences frequent batch image pulls during cluster scaling.
 * **High Availability & Strict SLA Requirements**: Requires >99.9% availability, supports replication, or needs independent upgrade/disaster recovery procedures.
 * **Resource Isolation & Security Compliance**: Mandated by multi-tenancy or regulatory audits, requiring separate security policies, logging, and data isolation.
+
 **Benefits**: Prevents resource contention with critical platform services (e.g., API Server), minimizes performance interference, and simplifies security management.

 ## Final Recommendation
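
The traffic-level table and the per-scenario resource tables that this series ends up with can be read as a simple lookup. The Python sketch below is only an illustration of that reading, not part of the patch or of any Alauda tooling. The names `Sizing`, `RECOMMENDED`, and `traffic_level` are hypothetical. Two rules are assumptions the guide does not state: the higher of the daily tier and the peak tier wins, and boundary values (exactly 10,000 daily operations or exactly 200 concurrent operations) fall into the medium tier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sizing:
    """Replica count plus container requests/limits for one component."""
    replicas: int
    cpu_request: str
    cpu_limit: str
    memory_request: str
    memory_limit: str


LEVELS = ("low", "medium", "high")

# Values copied from the Scenario 1-3 tables as of PATCH 3/5.
RECOMMENDED = {
    "low": {
        "registry": Sizing(2, "500m", "1000m", "512Mi", "1Gi"),
        "gateway": Sizing(2, "250m", "500m", "256Mi", "512Mi"),
    },
    "medium": {
        "registry": Sizing(3, "1000m", "2000m", "1Gi", "2Gi"),
        "gateway": Sizing(3, "500m", "1000m", "512Mi", "1Gi"),
    },
    "high": {
        "registry": Sizing(5, "2000m", "4000m", "2Gi", "4Gi"),
        "gateway": Sizing(5, "1000m", "2000m", "1Gi", "2Gi"),
    },
}


def _tier(value: int, low_upper: int, medium_upper: int) -> int:
    """Return 0 (low), 1 (medium) or 2 (high) for a single indicator."""
    if value < low_upper:
        return 0
    if value <= medium_upper:
        return 1
    return 2


def traffic_level(daily_ops: int, peak_concurrent: int) -> str:
    """Classify a workload; assumption: the more demanding indicator wins."""
    daily_tier = _tier(daily_ops, 1_000, 10_000)
    peak_tier = _tier(peak_concurrent, 50, 200)
    return LEVELS[max(daily_tier, peak_tier)]
```

For example, `RECOMMENDED[traffic_level(4_000, 30)]["gateway"]` selects the medium-traffic gateway row. Treat the result as the baseline the guide describes and adjust it from observed metrics.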