Deploy RisingWave on Kubernetes with Operator
This article explains how to use the Kubernetes Operator for RisingWave (hereinafter ‘the Operator’) to deploy a RisingWave cluster in Kubernetes.
The Operator is a deployment and management system for RisingWave. It runs on top of Kubernetes and handles provisioning, upgrading, scaling, and destroying the RisingWave instances inside the cluster.
Prerequisites
- Install kubectl: Ensure that the Kubernetes command-line tool kubectl is installed in your environment.
- Install psql: Ensure that the PostgreSQL interactive terminal psql is installed in your environment.
- Install and run Docker: Ensure that Docker is installed in your environment and running.
- Ensure you allocate enough resources for the deployment. For details, see Hardware requirements.
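The tool prerequisites above can be checked with a short script. This is a minimal sketch rather than an official check; the helper name `missing_tools` is made up for illustration, and the script only verifies that the three commands are on your PATH and that the Docker daemon responds.

```shell
# Print every tool from the argument list that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing=$(missing_tools kubectl psql docker)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing tools:" $missing
else
  echo "kubectl, psql, and docker are installed."
fi

# `docker info` exits non-zero when the Docker daemon is not reachable.
if command -v docker >/dev/null 2>&1 && ! docker info >/dev/null 2>&1; then
  echo "Docker is installed but does not appear to be running."
fi
```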
Create a Kubernetes cluster
The steps in this section are intended for creating a Kubernetes cluster in your local environment. If you are using a managed Kubernetes service such as AKS, GKE, or EKS, refer to the corresponding documentation for instructions.
kind is a tool for running local Kubernetes clusters using Docker containers as cluster nodes. You can see the available tags of kind on Docker Hub.
Create a cluster.
Optional: Check if the cluster is created properly.
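The two steps above might look like the following with kind. This is a sketch under assumptions: the cluster name `kind` is kind's default, and the function names are illustrative wrappers so that nothing runs until you call them.

```shell
# Assumption: "kind" is the default cluster name; any other name works as well.
CLUSTER_NAME="kind"

# Create a local cluster whose nodes run as Docker containers.
create_cluster() {
  kind create cluster --name "$CLUSTER_NAME"
}

# kind registers a kubectl context named "kind-<cluster name>".
check_cluster() {
  kubectl cluster-info --context "kind-$CLUSTER_NAME"
}
```

Call `create_cluster` once Docker is running, then `check_cluster` to confirm that the control plane answers.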
Deploy the Operator
Before the deployment, ensure that the following requirements are satisfied.
- Docker version ≥ 18.09
- kubectl version ≥ 1.18
- For Linux, set the value of the sysctl parameter net.ipv4.ip_forward to 1.
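The requirements above can be verified roughly as follows. This sketch assumes `sort -V` (version sort, as in GNU coreutils) is available; the helper names and the sample version strings 20.10.7 and 1.17.0 are illustrative inputs, not values read from your system.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Assumption: `sort -V` is available for version-aware sorting.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Succeeds when net.ipv4.ip_forward is currently 1 (Linux only).
ip_forward_enabled() {
  [ "$(cat /proc/sys/net/ipv4/ip_forward 2>/dev/null)" = "1" ]
}

# Sample inputs; substitute the versions reported by your own tools.
version_ge "20.10.7" "18.09" && echo "Docker 20.10.7 satisfies >= 18.09"
version_ge "1.17.0" "1.18" || echo "kubectl 1.17.0 is older than 1.18"

if ! ip_forward_enabled; then
  echo "net.ipv4.ip_forward is not 1; on Linux, run: sudo sysctl -w net.ipv4.ip_forward=1"
fi
```

To compare a real Docker version, pass the output of `docker version --format '{{.Server.Version}}'` as the first argument.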
Install cert-manager and wait a minute to allow for initialization.
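One way to script this step is sketched below. The manifest URL is an assumption (the "latest" release asset of the cert-manager project; pin a specific release for reproducible installs), and `kubectl wait` is used here in place of waiting a fixed minute.

```shell
# Assumption: "latest" release asset URL of the cert-manager project.
CERT_MANAGER_MANIFEST="https://github.com/cert-manager/cert-manager/releases/latest/download/cert-manager.yaml"

install_cert_manager() {
  kubectl apply -f "$CERT_MANAGER_MANIFEST"
  # Block until every cert-manager Deployment reports Available, or time out.
  kubectl wait --for=condition=Available deployment --all \
    --namespace cert-manager --timeout=180s
}
```

Call `install_cert_manager` after kubectl points at the target cluster.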
Install the latest version of the Operator.
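A possible form of the install command is shown below. The URL is an assumption based on the release assets of the risingwavelabs/risingwave-operator repository; verify it against the repository before use.

```shell
# Assumption: "latest" release asset URL of the RisingWave Operator.
OPERATOR_MANIFEST="https://github.com/risingwavelabs/risingwave-operator/releases/latest/download/risingwave-operator.yaml"

install_operator() {
  # Server-side apply avoids the annotation size limit that client-side
  # apply can hit with large CRDs.
  kubectl apply --server-side -f "$OPERATOR_MANIFEST"
}
```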
Errors might occur if cert-manager is not fully initialized. If that happens, wait another minute and rerun the installation command.
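The wait-and-rerun advice can be automated with a small retry helper. This is a generic sketch; the function name `retry` and its argument order are choices made for this example, not part of kubectl or the Operator.

```shell
# retry <max_attempts> <delay_seconds> <command...>
# Runs the command until it succeeds or max_attempts is reached.
retry() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example: up to 3 attempts, 60 seconds apart.
# retry 3 60 kubectl apply --server-side -f <operator manifest URL>
```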
Optional: Check if the Pods are running.
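A sketch of the check follows. Both namespace names are assumptions: `cert-manager` is the namespace used by the standard cert-manager manifest, and `risingwave-operator-system` is assumed to be the namespace created by the Operator manifest.

```shell
check_operator_pods() {
  # Assumed namespaces; adjust if your manifests install elsewhere.
  kubectl get pods --namespace cert-manager
  kubectl get pods --namespace risingwave-operator-system
}
```

All listed Pods should reach the Running status before you continue.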
Deploy a RisingWave instance
The RisingWave Kubernetes Operator extends Kubernetes with CRDs (Custom Resource Definitions) to manage RisingWave. That means all you need to do is create a RisingWave resource in your Kubernetes cluster, and the Operator will take care of the rest.
Use the example resource files
The RisingWave resource is a custom resource that defines a RisingWave cluster. In this directory, you can find resource examples that deploy RisingWave with different configurations of metadata store and state backend. Based on your requirements, you can use these resource files directly or as a reference for your customization. The stable directory contains resource files that have been tested for compatibility with the latest released version of the RisingWave Operator.
The resource files are named using the convention of risingwave-<meta_store>-<state_backend>.yaml. For example, risingwave-postgresql-s3.yaml means that this manifest file uses PostgreSQL as the meta storage and AWS S3 as the state backend.
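The naming convention can be expressed as a pair of small helpers. The function names are illustrative, and the parser assumes that the metadata store name itself contains no hyphen.

```shell
# Build a manifest file name from a metadata store and a state backend.
manifest_name() {
  printf 'risingwave-%s-%s.yaml\n' "$1" "$2"
}

# Split a manifest file name back into its two parts.
# Assumption: the metadata store name contains no hyphen.
describe_manifest() {
  name=${1#risingwave-}
  name=${name%.yaml}
  printf 'meta store: %s, state backend: %s\n' "${name%%-*}" "${name#*-}"
}

manifest_name postgresql s3
describe_manifest risingwave-postgresql-s3.yaml
```

The first call prints risingwave-postgresql-s3.yaml, and the second prints "meta store: postgresql, state backend: s3".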
RisingWave supports the following systems or services as state backends:
- MinIO
- AWS S3
- S3-compatible object storage
- Google Cloud Storage
- Azure Blob Storage
- Alibaba Cloud OSS
- HDFS
You can customize the state backend, or customize the state store directory.
Optional: Customize the state backend
Download source file
If you intend to customize a resource file, download the file to a local path and edit it:
You can also create your own resource file from scratch if you are familiar with Kubernetes resource files.
Then, apply the resource file by using the following command:
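The download and apply steps might be scripted as follows. `MANIFEST_URL` is a hypothetical placeholder that you must set to the raw URL of the resource file you chose; no specific URL is assumed here. The local file name risingwave.yaml matches the name used elsewhere in this article.

```shell
# Hypothetical placeholder: export MANIFEST_URL before calling fetch_manifest.
MANIFEST_URL="${MANIFEST_URL:-}"
LOCAL_MANIFEST="risingwave.yaml"

fetch_manifest() {
  if [ -z "$MANIFEST_URL" ]; then
    echo "Set MANIFEST_URL to the raw URL of the chosen resource file." >&2
    return 1
  fi
  # -f fails on HTTP errors, -sS stays quiet but still reports errors,
  # -L follows redirects.
  curl -fsSL "$MANIFEST_URL" -o "$LOCAL_MANIFEST"
}

apply_manifest() {
  kubectl apply -f "$LOCAL_MANIFEST"
}
```

Run `fetch_manifest`, edit risingwave.yaml as needed, and then run `apply_manifest`.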
Customize the state backends
To customize the state backend of your RisingWave cluster, edit the spec:stateStore section under the RisingWave resource (kind: RisingWave).
Optional: Customize the state store directory
You can customize the directory for storing state data via the spec: stateStore: dataDirectory parameter in the risingwave.yaml file that you want to use to deploy a RisingWave instance. If you have multiple RisingWave instances, ensure the value of dataDirectory for the new instance is unique (the default value is hummock). Otherwise, the new RisingWave instance may crash. Save the changes to the risingwave.yaml file before running the kubectl apply -f <...risingwave.yaml> command. The directory path cannot be an absolute path, such as /a/b, and must be no longer than 180 characters.
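The two constraints on dataDirectory can be checked before applying the file. This is a sketch that covers only the rules stated above (not an absolute path, at most 180 characters); the function name is illustrative, and it does not check uniqueness across instances.

```shell
# Succeeds when $1 is not an absolute path and is at most 180 characters long.
valid_data_directory() {
  case "$1" in
    /*) return 1 ;;
  esac
  [ "${#1}" -le 180 ]
}

for dir in hummock hummock-dev /a/b; do
  if valid_data_directory "$dir"; then
    echo "$dir: ok"
  else
    echo "$dir: rejected"
  fi
done
```

Here hummock and hummock-dev pass, while /a/b is rejected because it is an absolute path.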
Validate the status of the instance
You can check the status of the RisingWave instance by running the following command.
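A sketch of the status check is shown below. It assumes that the Operator's CRDs register a resource type that kubectl can address as `risingwave`; the wrapper function name is illustrative.

```shell
check_risingwave() {
  # Lists RisingWave custom resources in the current namespace.
  kubectl get risingwave
}
```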
If the instance is running properly, the output should look like this:
Connect to RisingWave
By default, the Operator creates a service of type ClusterIP for the frontend component, through which you can interact with RisingWave. This service is not accessible from outside Kubernetes. Therefore, you need to create a standalone Pod for PostgreSQL (to run psql) inside Kubernetes.
Create a Pod.
Attach to the Pod so that you can execute commands inside the container.
Connect to RisingWave via `psql`.
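The three steps above could be scripted as in the sketch below. It makes several assumptions: the Pod is created with `kubectl run` from the public postgres:16 image rather than from a manifest, the Pod name psql-console is illustrative, and the service name, port, database, and user correspond to a RisingWave resource named risingwave with default settings. Adjust them to your deployment.

```shell
# Assumed connection settings; change them to match your RisingWave resource.
PSQL_POD="psql-console"
RW_HOST="risingwave-frontend"
RW_PORT="4567"
RW_DB="dev"
RW_USER="root"

# Start a long-lived Pod that ships the psql client, then wait until it is Ready.
create_psql_pod() {
  kubectl run "$PSQL_POD" --image=postgres:16 --restart=Never \
    --command -- sleep infinity
  kubectl wait --for=condition=Ready "pod/$PSQL_POD" --timeout=120s
}

# Open an interactive shell inside the Pod.
attach_psql_pod() {
  kubectl exec -it "$PSQL_POD" -- bash
}

# Alternatively, run psql in the Pod directly without attaching first.
connect_risingwave() {
  kubectl exec -it "$PSQL_POD" -- \
    psql -h "$RW_HOST" -p "$RW_PORT" -d "$RW_DB" -U "$RW_USER"
}
```

If you attach with `attach_psql_pod`, run the same psql command (everything after the `--`) from the shell inside the Pod.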
Now you can ingest and transform streaming data. See Quick start for details.