diff --git a/modules/install/pages/sizing-general.adoc b/modules/install/pages/sizing-general.adoc index d13381f291..006e88ec30 100644 --- a/modules/install/pages/sizing-general.adoc +++ b/modules/install/pages/sizing-general.adoc @@ -5,7 +5,7 @@ [abstract] {description} -When you plan to deploy a Couchbase Server cluster, the most common and important question that comes up is: how many nodes do I need and what size do they need to be? +The most common and important questions you need to ask when deploying a new Couchbase Server cluster are how many nodes you need, and what size they need to be. With the increasing number of Couchbase services and the flexibility of the Couchbase Data Platform, the answer to this question can be challenging. This guide aims to help you better size your deployment. @@ -24,39 +24,45 @@ There needs to be enough capacity in all areas to support everything the system === Multi-Dimensional Scaling -Couchbase Services are what allow you to access and maintain your data. -These services can be deployed, maintained, and provisioned independently of one another. -This independent service model allows you to take advantage of _Multi-Dimensional Scaling_, whereby a cluster can be fine-tuned for optimal handling of emergent workload-requirements, on a service-by-service basis. +Couchbase Services allow you to access and maintain your data. +You can deploy, maintain, and provision these services independently of each other. +This independent service model allows you to take advantage of Multi-Dimensional Scaling. -Since each service has different demands on hardware resources, Multi-Dimensional Scaling plays an important role when sizing your Couchbase cluster, both pre and post-deployment. -For example, core Data Service operations can often benefit greatly from the _scale out_ of smaller commodity nodes, whereas low latency operations with the Query Service may see a greater benefit from the _scale up_ of hardware resources on a given node. 
+Multi-Dimensional Scaling lets you fine-tune your cluster for optimal handling of changing workload requirements, for each individual Couchbase Service. -For more information about the nature and resource demands of each Couchbase Service, refer to xref:learn:services-and-indexes/services/services.adoc[Services]. +Every Service has different demands on hardware resources. +Multi-Dimensional Scaling plays an important role when sizing your Couchbase cluster, both before and after deployment. +For example, core Data Service operations can often benefit from scaling out smaller commodity nodes. +Low-latency operations with the Query Service might see a greater benefit from scaling up hardware resources on a given node. + +For more information about the nature and resource demands of each Couchbase Service, see xref:learn:services-and-indexes/services/services.adoc[Services]. == About Couchbase Server Resources This guide discusses four types of resources that you should consider when sizing a Couchbase Server cluster node: CPU:: -CPU refers to the number of cores and clock speed that are required to run your workload. +CPU refers to the number of cores and the clock speed required to run your workload. RAM:: -RAM is frequently one of the most crucial areas to size correctly. -Cached documents allow the reads to be served at low latency and consistently high throughput. +RAM is often one of the most crucial areas to size correctly. +Cached documents provide low-latency reads and consistently high throughput. + -This resource represents the main memory that you allocate to Couchbase Server and must be determined by the following factors: +Your RAM represents the main memory you allocate to Couchbase Server. 
+Determine your allocation based on the following factors: + -- -* How much free RAM is available beyond OS and other applications -* How much data do you want to store in main memory -* How much latency you expect from KV/indexing/query performance +* How much free RAM is available beyond your OS and other applications. +* How much data you want to store in main memory. +* How much latency you expect from your Data, Indexing, and Query Service performance. -- + + Some components that require RAM are: + -- ** All index storage types which need sufficient memory quota allocation for proper functioning. -** Full Text Search (FTS) +** The Search Service. -- + .Minimum RAM Quota for Couchbase Server Components @@ -76,7 +82,7 @@ Some components that require RAM are: | 256 MB minimum; 2048 MB and above recommended | Query Service -| No RAM-allocation is required for this service. +| The Query Service does not require a RAM allocation. | Eventing Service | 256 MB @@ -89,8 +95,8 @@ Storage (disk space):: Requirements for your disk subsystem are: + -- -* [.term]_Disk size_ — Refers to the amount of the disk storage space that is needed to hold your entire data set. -* [.term]_Disk I/O_ — Is a combination of your sustained read/write rate, the need for compacting the database files, and anything else that requires disk access. +* *Disk size* — Specifies the disk storage space needed to hold your entire dataset. +* *Disk I/O* — Combines your sustained read/write rate, database file compaction, and any other operations that require disk access. -- + To better support Couchbase Server, keep in mind the following: @@ -98,9 +104,9 @@ To better support Couchbase Server, keep in mind the following: -- * Disk space continues to grow if fragmentation ratio keeps climbing. To mitigate this, add enough buffer in your disk space to store all of the data. -Also, keep an eye on the fragmentation ratio in the Couchbase user interfaces and trigger compaction processes when needed. 
-* Solid State Drives (SSDs) are desired, but not required. -An SSD will give much better performance than a Hard Disk Drive (HDD) when it comes to disk throughput and latency. +Monitor your cluster's fragmentation ratio in the Couchbase Server Web Console and trigger compaction processes as needed. +* Couchbase recommends using Solid State Drives (SSD) when possible. +An SSD gives much better performance than a Hard Disk Drive (HDD) when it comes to disk throughput and latency. -- Network:: @@ -111,23 +117,23 @@ Most deployments can achieve optimal performance with 1 Gbps interconnects, but == Sizing Data Service Nodes -Data Service nodes store and perform data operations such as create/read/update/delete (CRUD). -The sizing information provided in this section applies to data stored in either Couchstore or Magma storage engines. -However, you should also consider the differences between these storage engines. -For more information, see xref:learn:buckets-memory-and-storage/storage-engines.adoc[]. +Data Service nodes handle data service operations, such as create/read/update/delete (CRUD). +The following sizing information applies to both the Couchstore and Magma storage engines. + +Couchbase recommends reviewing the differences between the available storage engines before attempting to size the Data Service nodes in your cluster. +For information, see xref:learn:buckets-memory-and-storage/storage-engines.adoc[]. It's important to keep use-cases and application workloads in mind since different application workloads have different resource requirements. -For example, if your working data set needs to be fully in memory, your cluster may need more RAM. -On the other hand, if your application requires only 10% of data in memory, you need disks with enough space to store all of the data. -Their read/write rate must also be fast enough to meet your performance goals. +For example, if your working set needs to be fully in-memory, your cluster might need more RAM. 
+If your application requires only 10% of data in-memory, you need disks with enough space to store all of the data, and that are fast enough for your read/write operations. === RAM Sizing for Data Service Nodes You can start sizing the Data Service nodes by answering the following questions: -. Is the application primarily (or even exclusively) using individual document access? -. Do you plan to use XDCR? -. What’s your working set size and what are your data operation throughput and latency requirements? +* Is the application primarily using individual document access? +* Do you plan to use XDCR? +* What's your working set size and what are your data operation throughput and latency requirements? Answers to the above questions can help you better understand the capacity requirement of your cluster and provide a better estimation for sizing. @@ -219,7 +225,7 @@ Based on the above formula, these are the suggested sizing guidelines: | = (312,000,000 + 4,000,000,000) * (1+0.25)/(0.85) = 6,341,176,470 bytes |=== -This tells you that the RAM requirement for the whole cluster is 7 GB. +This tells you that the RAM requirement for the whole cluster is 7{nbsp}GB. NOTE: This amount is in addition to the RAM requirements for the operating system and any other software that runs on the cluster nodes. @@ -402,27 +408,29 @@ When sizing, you must account for raw CPU overhead when using a high number of b This overhead does not account for any front-end workloads. You should allocate additional CPU cores for these workloads. -* xref:manage:monitor/monitor-intro.adoc[Monitoring] is recommended for CPU usage and System Limits. +* For more information about monitoring CPU usage and System Limits, see xref:manage:monitor/monitor-intro.adoc[]. == Sizing Index Service Nodes -A node running the Index Service must be sized properly to create and maintain secondary indexes and to perform index scan for {sqlpp} queries. 
+To create and maintain secondary indexes and perform index scans for {sqlpp} queries, you need to size your Index Service nodes. Similar to the nodes that run the Data Service, answer the following questions to take care of your application needs: -. What is the length oƒ the document key? -. Which fields need to be indexed? -. Will you be using simple or compound indexes? -. What is the minimum, maximum, or average value size of the index field? -. How many indexes do you need? -. How many documents need to be indexed? -. What is the working set percentage of index required memory? +-- +* What is the length of your document keys? +* Which fields need to be indexed? +* Will you be using simple or compound indexes? +* What is the minimum, maximum, or average value size of the indexed fields? +* How many indexes do you need? +* How many documents need to be indexed? +* What is the working set percentage of index required memory? +-- Answers to these questions can help you better understand the capacity requirement of your cluster, and provide a better estimation for sizing. *The following is an example use-case for sizing RAM for the Index service:* -The following sizing guide can be used to compute the memory requirement for each individual index and can be used to determine the total RAM quota required for the Index service. +Use the following sizing guide to compute the memory requirement for each individual index and to determine the total RAM quota required for the Index Service. .Input Variables for Sizing RAM |=== @@ -507,7 +515,8 @@ Based on the above formula, these are the suggested sizing guidelines: | (3200000000) * (1 + 0.25) * 0.2 = 800000000 bytes |=== -The above example shows the memory requirement of a secondary index with 10M index entries, each with 50 bytes size of secondary key and 30 bytes size of documentID. The memory usage requirements are 2.5GB(Nitro, 100% resident), 1GB(plasma, 20% resident), 800MB(Forestdb, 20% resident). 
+The previous example shows the memory requirement of a secondary index with 10M index entries, each with a 50-byte secondary key and a 30-byte document ID. +The memory usage requirements are 2.5{nbsp}GB (Nitro, 100% resident), 1{nbsp}GB (Plasma, 20% resident), and 800{nbsp}MB (ForestDB, 20% resident). NOTE: The storage engine used in the sizing calculation corresponds to the storage mode chosen for Index Service as explained in the table below. @@ -529,22 +538,22 @@ NOTE: The storage engine used in the sizing calculation corresponds to the stora A node that runs the Query Service executes queries for your application needs. -Since the Query Service doesn’t need to persist data to disk, there are very minimal resource requirements for disk space and disk I/O. +Since the Query Service does not need to persist data to disk, there are minimal resource requirements for disk space and disk I/O. You only need to consider CPU and memory. -There are a few questions that will help size the cluster: +Answer the following questions to help size the Query Service nodes on your cluster: -. What types of queries do you need to run? -. Do you need to run `stale=ok` or `stale=false` queries? -. Are the queries simple or complex (requiring JOINs, for example)? -. What are the throughput and latency requirements for your queries? +* What types of queries do you need to run? +* Do you need to run `stale=ok` or `stale=false` queries? +* Are the queries simple or complex? For example, do you need to use JOINs? +* What are the throughput and latency requirements for your queries? Different queries have different resource requirements. A simple query might return results within milliseconds while a complex query may require several seconds. -The number of queries that may be processed simultaneously may be approximated with the formula _CPU_cores * 4_. -The maximum queue-length for queries may be approximated with the formula _CPU_cores * 256_. 
-If either limit is reached, additional queries are rejected with a `503` error. +You can approximate the number of queries that can be processed simultaneously with the formula `CPU_cores * 4`. +You can approximate the maximum queue length for queries with the formula `CPU_cores * 256`. +If you reach either limit, the system rejects additional queries with a `503` error. == Sizing Analytics Service Nodes @@ -554,68 +563,87 @@ The Analytics Service is dependent on the Data Service and requires the Data ser === Data space -* Ensure that the data space for Analytics node takes into account metadata replicas. The Analytics Service currently only replicates metadata and not the actual data. There is a small overhead for metadata replicas as metadata is usually small. +* Make sure that the data space for your Analytics Service nodes takes into account metadata replicas. +The Analytics Service only replicates the metadata and not the actual data. +There's a small overhead for metadata replicas as metadata is generally small. -* When evaluating a query, the Analytics engine uses temporary disk space. The type of query being executed can impact the amount of temporary disk space required. For example, a query with heavy JOINs, aggregates, windowing, or more predicates will require more temporary disk space. Typically, the temporary disk space can be 2x the data space. +* When evaluating a query, the Analytics engine uses temporary disk space. +The type of query you want to run affects the required amount of temporary disk space. ++ +For example, queries with heavy JOINs, aggregates, windowing, or additional predicates require more temporary disk space. +Typically, the temporary disk space can be 2x the data space. * The percent of data shadowed, which is dependent on your use case. 
-* When ingesting data from the the Data Service into the Analytics Service a filter can be provided that reduces the size of the data that is ingested and also the storage size for the Analytics Service proportionally. +* When you load data from the Data Service into the Analytics Service, you can apply a filter to reduce both the loaded data size and the Analytics Service storage requirements proportionally. -=== Disk types and partioning +=== Disk Types and Partitioning -During query execution, Analytics’s query engine attempts to concurrently read and process data from all data partitions. Because of that, the Input/Output Operations per Second (IOPS) of the actual physical disk in which each data partition resides plays a major role in determining the query execution time. -Modern storage devices such as SSDs have much higher IOPS and can deal better with concurrent reads than HDDs. Therefore, having a single data partition on devices with high IOPS will not fully utilize their capabilities. +During query execution, the Analytics query engine concurrently reads and processes data from all partitions. +The Input/Output Operations per Second (IOPS) of the physical disk that hosts the data partitions plays a major role in determining the query execution time. +Modern storage devices such as SSDs have much higher IOPS and can deal better with concurrent reads than HDDs. +A single data partition underutilizes high IOPS devices. -To simplify the setup of the typical case of a node having a single modern storage device, the Analytics service automatically creates multiple data partitions within the same storage device if and only if a single “Analytics Disk Path” is specified during the node initialization. The number of automatically created data partitions is based on this formula: +To simplify setup for nodes with a single modern storage device, the Analytics Service creates multiple data partitions on the same storage device. 
+It does this only when you specify a single Analytics disk path during node initialization. +The Analytics Service determines the number of partitions using the following formula: * `Maximum partitions to create = Min((Analytics Memory in MB / 1024), 16)` * `Actual created partitions = Min(node virtual cores, Maximum partitions to create)` -For example, if a node has 8 virtual cores and the Analytics service was configured with memory >= 8GB, 8 data partitions will be created on that node. -Similarly, if a node has 32 virtual cores and was configured with memory >= 16GB, only 16 partitions will be created as 16 is the upper bound for automatic partitioning. +For example, if a node has 8 virtual cores and the Analytics Service has at least 8{nbsp}GB of memory, the system creates 8 data partitions on that node. +Similarly, for a node with 32 virtual cores and at least 16{nbsp}GB of memory, the system creates only 16 partitions, the maximum for automatic partitioning. -=== Index considerations +=== Index Considerations -The size of a secondary index is approximately the total size of indexed fields in the Analytics collection. For example, if a collection has 20 fields and only 1 of those fields appears in the secondary index, the secondary index size will be ~1/20 of the collection size. +The size of a secondary index is approximately the total size of indexed fields in the Analytics collection. +For example, if a collection has 20 fields and only 1 of those fields appears in the secondary index, the secondary index size is ~1/20 of the collection size. == Sizing Eventing Service Nodes -Eventing is a compute oriented service. By default, Eventing service has one worker and each worker has two threads of execution. You can scale eventing both vertically by adding more workers or horizontally by adding more nodes. The Eventing service will partition vBuckets across the number of available nodes. 
+By default, the Eventing Service has 1 worker and each worker has 2 threads of execution. +You can scale the Eventing Service vertically by adding more workers, or horizontally by adding more nodes. +The Eventing Service partitions the vBuckets across the number of available nodes. === CPU -Because Eventing allows arbitrary code, JavaScript, to be written and run, it is difficult to come up with a perfect sizing formula unless all Functions have been designed and their KV ops, Query ops, and cURLops are known along with the expected mutation rate. +Eventing runs arbitrary JavaScript code. +This flexibility makes it difficult to define a precise sizing formula. +You cannot define a precise formula unless you know the function designs, their KV operations, query operations, cURL operations, and the expected mutation rate. For example, if you process 100K mutations per second and only match 1 out of 1000 patterns, then perform some intense computation on the matched 100 items in your Eventing Function, you need 100X less compute than if you performed the intense computation on each mutation. -Eventing also can perform I/O to external REST endpoints via a synchronous HTTP/S cURL call. In this case, Eventing typically blocks on I/O and doesn’t need much CPU. However. if you want high throughput to overcome bandwidth, you will need more workers and thus more cores. +Eventing can also perform I/O to external REST endpoints through a synchronous HTTP/S cURL call. +In this case, Eventing typically blocks on I/O and requires little CPU. +However, achieving high throughput to overcome bandwidth requires additional workers and cores. -8 vCPUs or 4 physical cores should be considered a good start for running a few Eventing Functions. +Start with 8 vCPUs or 4 physical cores when running a few Eventing Functions. === RAM -In general, the Eventing memory quota of 256 MB is sufficient for almost all workloads. 
+For more information about how to size your Eventing memory quota, see xref:eventing:eventing-memory-quota.adoc[]. -When scaling up vertically by adding more workers (in the handler’s settings), you see a stall in processing when the number exceeds 48 workers. In this case, the memory quota can be increased to 384 MB or 512 MB. Do not add memory to the Eventing Service’s memory quota without a justified need as it can create resource issues. === Eventing Storage Collection (previously Metadata Bucket) -Eventing functions store less than 2048 docs per Function. If timers are not used or if you have less than a few thousand active timers, then the size of the Eventing storage collection can simply be in a bucket with a minimum size 100 MB. +Each Eventing function stores fewer than 2048 documents in the Eventing storage collection. +If timers are not used or if you have fewer than a few thousand active timers, store the Eventing storage collection in a bucket with a minimum size of 100 MB. -However, if you use timers you will have to allocate an additional space of about 800 bytes + the size of the passed context (which is the state passed to the function when it is called in the future) per active timer. +Using timers requires additional storage for each active timer. +Each active timer requires about 800 bytes, plus the size of the passed context, which represents the state supplied to the function at future execution. -If you have a context of 200 bytes (total 1K/timer), then for 100,000 active timers you'll need 100 MB of additional space in this bucket. +A 200-byte context results in about 1 KB of storage per active timer. +At that size, 100,000 active timers require 100 MB of additional bucket space. -As a best practice, it's recommended to keep this collection 100% resident, so that it's always available in-memory. +As a best practice, keep this collection fully resident in-memory to make sure you have constant availability. 
-NOTE: This collection is shared across all your Eventing Functions. +NOTE: All Eventing functions use this collection. == Sizing Backup Service Nodes The hardware requirements for running a backup cluster are as follows: - .Hardware requirements |=== ||Minimum |Recommended @@ -631,13 +659,14 @@ The hardware requirements for running a backup cluster are as follows: |=== - == Sizing for Replication (XDCR) Before setting up a replication, you must make sure your cluster is appropriately configured and provisioned. Your cluster must be properly sized to be able to handle new XDCR streams. -For example, XDCR needs 1-2 additional CPU cores per stream. In some cases, it also requires additional RAM and network resources. If a cluster is not sized to handle _both_ the existing workload _and_ the new XDCR streams, the performance of both XDCR and the cluster overall may be negatively impacted. +For example, XDCR needs 1-2 additional CPU cores per stream. +In some cases, it also requires additional RAM and network resources. +If a cluster is not sized to handle both the existing workload and the new XDCR streams, the performance of both XDCR and the cluster overall might be negatively impacted. -For information on preparing your cluster for replication, see xref:manage:manage-xdcr/prepare-for-xdcr.adoc[Prepare for XDCR]. +For information about preparing your cluster for replication, see xref:manage:manage-xdcr/prepare-for-xdcr.adoc[Prepare for XDCR]. 
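Review note on the sizing formulas touched by the page above: the arithmetic for automatic Analytics partitions, Query Service concurrency limits, and Eventing timer storage can be checked with a small script. The following is a minimal sketch, not part of the patch. The function names are hypothetical, the truncating division in the partition formula is an assumption (the page does not say how fractional results are handled), and the 800-byte figure is the approximate per-timer overhead quoted in the page.

```python
"""Illustrative sizing helpers for formulas quoted in sizing-general.adoc.

All function names are hypothetical; only the formulas come from the page.
"""
from typing import Tuple


def analytics_partitions(virtual_cores: int, analytics_memory_mb: int) -> int:
    """Number of automatically created Analytics data partitions on a node."""
    # Maximum partitions to create = Min((Analytics Memory in MB / 1024), 16)
    # Assumption: the division truncates to a whole number.
    max_partitions = min(analytics_memory_mb // 1024, 16)
    # Actual created partitions = Min(node virtual cores, Maximum partitions to create)
    return min(virtual_cores, max_partitions)


def query_limits(cpu_cores: int) -> Tuple[int, int]:
    """Approximate (concurrent queries, maximum queue length) for a Query node."""
    return cpu_cores * 4, cpu_cores * 256


def timer_storage_bytes(active_timers: int, context_bytes: int,
                        per_timer_overhead: int = 800) -> int:
    """Approximate extra storage for active Eventing timers, in bytes."""
    # Each active timer needs about 800 bytes plus the size of its context.
    return active_timers * (per_timer_overhead + context_bytes)


if __name__ == "__main__":
    print(analytics_partitions(8, 8 * 1024))    # 8
    print(analytics_partitions(32, 16 * 1024))  # 16
    print(query_limits(8))                      # (32, 2048)
    print(timer_storage_bytes(100_000, 200))    # 100000000
```

The printed values match the worked examples in the page: 8 and 16 partitions, and 100 MB (decimal) of additional space for 100,000 timers with a 200-byte context. The query limits are approximations per the original wording, so treat them as planning estimates and not hard guarantees.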
diff --git a/preview/DOC-13422-Eventing-Memory-Quota.yml b/preview/DOC-13422-Eventing-Memory-Quota.yml new file mode 100644 index 0000000000..941d8ddab5 --- /dev/null +++ b/preview/DOC-13422-Eventing-Memory-Quota.yml @@ -0,0 +1,29 @@ +sources: + docs-devex: + branches: DOC-13422-Eventing-Memory-Quota_FOR_DEVEX + + docs-analytics: + branches: release/8.0 + + couchbase-cli: + branches: morpheus + startPaths: docs/ + + backup: + branches: morpheus + startPaths: docs/ + + #analytics: + # url: ../../docs-includes/docs-analytics + # branches: HEAD + + cb-swagger: + url: https://github.com/couchbaselabs/cb-swagger + branches: release/8.0 + start_path: docs + + # Minimal SDK build + docs-sdk-common: + branches: [release/8.0] + docs-sdk-java: + branches: [3.8-api]