Tune reserved memory and cache eviction policy
This topic introduces the concept and the tuning method of reserved memory and cache eviction policy.
What is reserved memory and cache eviction policy?
A simplified description of our memory control mechanism will help you understand these two concepts more straightforwardly. In simple words, RisingWave manages memory usage through a three-step process:
- Monitoring: The system calculates and monitors the memory usage of each component.
- Reserving buffer: The system reserves 30% of the total memory as a buffer. This is the reserved memory. In case of a sudden spike in input data, RisingWave has enough time to adjust its memory usage.
- Eviction checks: The system regularly compares current memory usage against the remaining 70% (usable memory), and decides if it should evict data. RisingWave applies a cache eviction policy based on the memory usage levels:
- Above 70%: Begins gentle data eviction
- Above 80%: Increases eviction intensity
- Above 90%: Further intensifies eviction
We allow you to tune both reserved memory and cache eviction policy.
How to tune reserved memory
Starting from version 1.10, RisingWave calculates the reserved memory based on the following gradient:
- 30% of the first 16GB
- Plus 20% of the remaining memory
This calculation method ensures that in scenarios with less memory, the system reserves more memory for critical tasks. On the other hand, in scenarios with more memory, it reserves less memory, thus achieving a better balance between system performance and memory utilization.
However, this may not be suitable for all workloads and machine setups. To address this, you can use the startup option --reserved-memory-bytes
or the environment variable RW_RESERVED_MEMORY_BYTES
to override the reserved memory configuration for compute nodes. But note that the memory reserved should be at least 512MB.
For instance, suppose you are deploying a compute node on a machine or pod with 64GB of memory. By default, the reserved memory would be calculated as follows:
- 30% of the first 16GB is 4.8GB.
- 20% of the remaining 48GB (64GB - 16GB) is 9.6GB.
- The total reserved memory would be 4.8GB + 9.6GB, which equals 14.4GB.
However, if you find this excessive for your specific use case, you have the option to specify a different value. You can set either RW_RESERVED_MEMORY_BYTES=8589934592
as an environment variable or --reserved-memory-bytes=8589934592
as a command line parameter when starting up the compute node. This will allocate 8GB as the reserved memory instead.
How to tune cache enviction policy
You can tune the cache enviction policy by configuring the memory_controller_eviction_factor_XXX
variables in the configuration toml file.
Variables in the configuration toml file
RisingWave uses a tiered eviction strategy to manage cache based on the current memory usage:
- Stable threshold: When memory usage exceeds
memory_controller_threshold_stable
, RisingWave begins to evict data from the operator cache with mild intensity. - Graceful threshold: Exceeding the
memory_controller_threshold_graceful
triggers a more aggressive eviction strategy. - Aggressive threshold: If memory usage surpasses
memory_controller_threshold_aggressive
, the eviction process becomes most aggressive to free up memory.
Each threshold corresponds to a specific configuration variable in the toml
configuration file.
For more detailed information on how the policy adjusts the intensity of eviction, you can refer to the source code directly.
Examples of tuning cache eviction policy
Suppose the reserved memory is set to 30%, and memory_controller_threshold_stable
is 0.8. RisingWave will aim to use 56% of the total memory stably during normal operations, calculated as:
Then memory usage typically stays below 56%. It only exceeds this limit during sudden throughput spikes or when creating a new materialized view that processes large amounts of historical data. In these cases, more aggressive memory eviction strategies are triggered.
Tuning recommendations
The best configuration depends on your specific workload and data pattern. We recommend you tune the parameters based on the metrics on Grafana.
See also
- Memory usage: The memory control mechanism of RisingWave.
- What consists of the memory usage and disk usage?