Operational
Sink data from RisingWave to Cassandra or ScyllaDB
This guide describes how to sink data from RisingWave to Cassandra or ScyllaDB using the Cassandra sink connector in RisingWave.
PUBLIC PREVIEW
This feature is currently in public preview, meaning it is nearing the final product but may not yet be fully stable. If you encounter any issues or have feedback, please reach out to us via our Slack channel. Your input is valuable in helping us improve this feature. For more details, see our Public Preview Feature List.
Prerequisites
- Ensure your Cassandra or ScyllaDB cluster is accessible from RisingWave.
- If you are running RisingWave locally from binaries and intend to use the native CDC source connectors or the JDBC sink connector, make sure that you have JDK 11 or later versions is installed in your environment.
Syntax
To sink data to Cassandra or ScyllaDB, create a Cassandra sink in RisingWave using the syntax below:
Once the sink is created, data changes will be streamed to the specified table.
Parameters
Parameter Names | Description |
---|---|
sink_name | Name of the sink to be created. |
sink_from | A clause that specifies the direct source from which data will be output. sink_from can be a materialized view or a table. Either this clause or select_query query must be specified. |
AS select_query | A SELECT query that specifies the data to be output to the sink. Either this query or a sink_from clause must be specified. See SELECT for the syntax and examples of the SELECT command. |
type | Required. Specify if the sink should be upsert or append-only. If creating an upsert sink, you must specify a primary key. |
primary_key | Optional. A string of a list of column names, separated by commas, that specifies the primary key of the Cassandra sink. |
force_append_only | If true, forces the sink to be append-only, even if it cannot be. |
cassandra.url | Required. The URL or IP address of the Cassandra or ScyllaDB cluster or node you want to connect to. |
cassandra.keyspace | Required. The name of the keyspace within the Cassandra database or ScyllaDB where you want to store the data. A keyspace is a logical container for organizing data in Cassandra. |
cassandra.table | Required. The name of the table in the specified keyspace where you want to insert or update the data. |
cassandra.datacenter | Required. The name of the datacenter within the Cassandra or ScyllaDB. You can set it in Cassandra or ScyllaDB. If not specified, the default value is datacenter1. |
cassandra.max_batch_rows | Optional. The number of batch rows sent at a time. The value must be between 1 and 65535. The default value is 512. |
cassandra.request_timeout_ms | Optional. The waiting time for each batch. The default value is 2000. It is recommended to reduce batch size first before trying to change the waiting time. |
cassandra.username | Optional. The username for Cassandra login. Ensure you have the necessary permissions. |
cassandra.password | Optional. The password for Cassandra login. Ensure that you have the required permissions. |
Data type mapping - RisingWave and Cassandra
RisingWave Data Type | Cassandra Data Type |
---|---|
boolean | boolean |
smallint | smallint |
integer | int |
bigint | bigint |
numeric | decimal |
real | float |
double precision | double |
character varying (varchar) | text |
bytea | blob |
date | date |
time without time zone | time |
timestamp without time zone | unsupported. You need to convert timestamp to timestamptz in RisingWave before sinking. |
timestamp with time zone | timestamp |
interval | duration |
struct | unsupported |
array | unsupported |
JSONB | unsupported |