From 545021b502551c4855c2c91ef75263f85500c2ea Mon Sep 17 00:00:00 2001 From: Richard Chien Date: Tue, 10 Dec 2024 14:06:28 +0800 Subject: [PATCH 1/4] complete source format table Signed-off-by: Richard Chien --- ingestion/supported-sources-and-formats.mdx | 50 +++++++++++++++------ 1 file changed, 36 insertions(+), 14 deletions(-) diff --git a/ingestion/supported-sources-and-formats.mdx b/ingestion/supported-sources-and-formats.mdx index 7079f957..5d563c72 100644 --- a/ingestion/supported-sources-and-formats.mdx +++ b/ingestion/supported-sources-and-formats.mdx @@ -12,17 +12,25 @@ To ingest data in formats marked with "T", you need to create tables (with conne | Connector | Version | Format | | :------------ | :------------ | :------------------- | -| [Kafka](/integrations/sources/kafka) | 3.1.0 or later versions | [Avro](#avro), [JSON](#json), [protobuf](#protobuf), [Debezium JSON](#debezium-json) (T), [Debezium AVRO](#debezium-avro) (T), [DEBEZIUM\_MONGO\_JSON](#debezium-mongo-json) (T), [Maxwell JSON](#maxwell-json) (T), [Canal JSON](#canal-json) (T), [Upsert JSON](#upsert-json) (T), [Upsert AVRO](#upsert-avro) (T), [Bytes](#bytes) | -| [Redpanda](/integrations/sources/redpanda) | Latest | [Avro](#avro), [JSON](#json), [protobuf](#protobuf) | -| [Pulsar](/integrations/sources/pulsar) | 2.8.0 or later versions | [Avro](#avro), [JSON](#json), [protobuf](#protobuf), [Debezium JSON](#debezium-json) (T), [Maxwell JSON](#maxwell-json) (T), [Canal JSON](#canal-json) (T) | -| [Kinesis](/integrations/sources/kinesis) | Latest | [Avro](#avro), [JSON](#json), [protobuf](#protobuf), [Debezium JSON](#debezium-json) (T), [Maxwell JSON](#maxwell-json) (T), [Canal JSON](#canal-json) (T) | -| [PostgreSQL CDC](/integrations/sources/postgresql-cdc) | 10, 11, 12, 13, 14 | [Debezium JSON](#debezium-json) (T) | -| [MySQL CDC](/integrations/sources/mysql-cdc) | 5.7, 8.0 | [Debezium JSON](#debezium-json) (T) | -| [CDC via Kafka](/ingestion/change-data-capture-with-risingwave) | | 
[Debezium JSON](#debezium-json) (T), [Maxwell JSON](#maxwell-json) (T), [Canal JSON](#canal-json) (T) | -| [Amazon S3](/integrations/sources/s3) | Latest | [JSON](#json), CSV | -| [Load generator](/ingestion/generate-test-data) | Built-in | [JSON](#json) | -| [Google Pub/Sub](/integrations/sources/google-pub-sub) | | [Avro](#avro), [JSON](#json), [protobuf](#protobuf), [Debezium JSON](#debezium-json) (T), [Maxwell JSON](#maxwell-json) (T), [Canal JSON](#canal-json) (T) | -| [Google Cloud Storage](/integrations/sources/google-cloud-storage) | | [JSON](#json) | +| [Kafka](/integrations/sources/kafka) | 3.1.0 or later versions | [JSON](#json), [Protobuf](#protobuf), [Avro](#avro), [Bytes](#bytes), [CSV](#csv), [Upsert JSON](#upsert-json) (T), [Upsert Avro](#upsert-avro) (T), Upsert Protobuf (T), [Debezium JSON](#debezium-json) (T), [Debezium Avro](#debezium-avro) (T), [Maxwell JSON](#maxwell-json) (T), [Canal JSON](#canal-json) (T), [Debezium Mongo JSON](#debezium-mongo-json) (T) | +| [Redpanda](/integrations/sources/redpanda) | Latest | [JSON](#json), [Protobuf](#protobuf), [Avro](#avro) | +| [Pulsar](/integrations/sources/pulsar) | 2.8.0 or later versions | [JSON](#json), [Protobuf](#protobuf), [Avro](#avro), [Bytes](#bytes), [Upsert JSON](#upsert-json) (T), [Upsert Avro](#upsert-avro) (T), [Debezium JSON](#debezium-json) (T), [Maxwell JSON](#maxwell-json) (T), [Canal JSON](#canal-json) (T) | +| [Kinesis](/integrations/sources/kinesis) | Latest | [JSON](#json), [Protobuf](#protobuf), [Avro](#avro), [Bytes](#bytes), [Upsert JSON](#upsert-json) (T), [Upsert Avro](#upsert-avro) (T), [Debezium JSON](#debezium-json) (T), [Maxwell JSON](#maxwell-json) (T), [Canal JSON](#canal-json) (T) | +| [PostgreSQL CDC](/integrations/sources/postgresql-cdc) | 10, 11, 12, 13, 14 | [Debezium JSON](#debezium-json) (T) | +| [MySQL CDC](/integrations/sources/mysql-cdc) | 5.7, 8.0 | [Debezium JSON](#debezium-json) (T) | +| [SQL Server CDC](/integrations/sources/sql-server-cdc) | 2019, 2022 
| [Debezium JSON](#debezium-json) (T) |
+| [MongoDB CDC](/integrations/sources/mongodb-cdc) | | [Debezium Mongo JSON](#debezium-mongo-json) (T) |
+| [Citus CDC](/integrations/sources/citus-cdc) | 10.2 | [Debezium JSON](#debezium-json) (T) |
+| [CDC via Kafka](/ingestion/change-data-capture-with-risingwave) | | [Debezium JSON](#debezium-json) (T), [Maxwell JSON](#maxwell-json) (T), [Canal JSON](#canal-json) (T) |
+| [Google Pub/Sub](/integrations/sources/google-pub-sub) | | [JSON](#json), [Protobuf](#protobuf), [Avro](#avro), [Bytes](#bytes), [Debezium JSON](#debezium-json) (T), [Maxwell JSON](#maxwell-json) (T), [Canal JSON](#canal-json) (T) |
+| [Amazon S3](/integrations/sources/s3) | Latest | [JSON](#json), [CSV](#csv), [Parquet](#parquet) |
+| [Google Cloud Storage](/integrations/sources/google-cloud-storage) | | [JSON](#json), [CSV](#csv), [Parquet](#parquet) |
+| [Azure Blob](/integrations/sources/azure-blob) | | [JSON](#json), [CSV](#csv), [Parquet](#parquet) |
+{/* | [POSIX File System]() | | [CSV](#csv) | */}
+| [NATS JetStream](/integrations/sources/nats-jetstream) | | [JSON](#json), [Protobuf](#protobuf), [Bytes](#bytes) |
+| [MQTT](/integrations/sources/mqtt) | | [JSON](#json), [Bytes](#bytes) |
+| [Apache Iceberg](/integrations/sources/apache-iceberg) | | No need to specify `FORMAT` |
+| [Load generator](/ingestion/generate-test-data) | Built-in | [JSON](#json) |
 
 When a source is created, RisingWave does not ingest data immediately. RisingWave starts to process data when a materialized view is created based on the source.
 
@@ -72,7 +80,7 @@ FORMAT PLAIN
 ENCODE BYTES
 ```
 
-### Debezium AVRO
+### Debezium Avro
 
 When creating a source from streams with Debezium Avro, the schema of the source does not need to be defined in the `CREATE TABLE` statement, as it can be inferred from the `SCHEMA REGISTRY`. This means that the schema file location must be specified.
 The schema file location can be an actual Web location, in `http://...`, `https://...`, or `s3://...` format, or a Confluent Schema Registry. For more details about using Schema Registry for Kafka data, see [Read schema from Schema Registry](/integrations/sources/kafka#read-schemas-from-confluent-schema-registry).
@@ -190,11 +198,26 @@ ENCODE JSON [ ( ) ]
 ```
 
+### CSV
+
+To consume data in CSV format, you can use `FORMAT PLAIN ENCODE CSV` with options. Configurable options include `delimiter` and `without_header`.
+
+Syntax:
+
+```sql
+FORMAT PLAIN
+ENCODE CSV (
+    delimiter = 'delimiter',
+    without_header = 'false' | 'true'
+)
+```
+
+The `delimiter` option is required, while the `without_header` option is optional, with a default value of `false`.
+
 ### Parquet
 
 Parquet format allows you to efficiently store and retrieve large datasets by utilizing a columnar storage architecture. RisingWave supports reading Parquet files from object storage systems including Amazon S3, Google Cloud Storage (GCS), and Azure Blob Storage.
 
-
 Syntax:
 
 ```sql
@@ -230,7 +253,6 @@ ENCODE PROTOBUF (
 )
 ```
 
 For more information on supported protobuf types, refer to [Supported protobuf types](/sql/data-types/supported-protobuf-types).
 
-
 ## General parameters for supported formats
 
 Here are some notes regarding parameters that can be applied to multiple formats supported by our systems.
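For illustration, the CSV options documented in the new section of this patch would appear inside a full `CREATE TABLE` statement. The sketch below is hypothetical: the table name, column list, bucket, and region are invented for this example and are not part of the patch.

```sql
-- Hypothetical example: ingest comma-delimited, headerless CSV files from S3.
-- Table name, columns, bucket, and region are invented for illustration.
CREATE TABLE user_events (
    user_id INT,
    event_name VARCHAR,
    event_time TIMESTAMP
)
WITH (
    connector = 's3',
    s3.region_name = 'us-east-1',
    s3.bucket_name = 'example-bucket',
    match_pattern = '*.csv'
)
FORMAT PLAIN
ENCODE CSV (
    delimiter = ',',
    without_header = 'true'
);
```

Because `without_header` is `'true'`, column names come from the table definition rather than from a header row in the files.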
From 8db8e602468effd9dda8bdca2bcbc064fe15c7c5 Mon Sep 17 00:00:00 2001 From: Richard Chien Date: Tue, 10 Dec 2024 14:08:41 +0800 Subject: [PATCH 2/4] fix Signed-off-by: Richard Chien --- ingestion/supported-sources-and-formats.mdx | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/ingestion/supported-sources-and-formats.mdx b/ingestion/supported-sources-and-formats.mdx index 5d563c72..7009ff67 100644 --- a/ingestion/supported-sources-and-formats.mdx +++ b/ingestion/supported-sources-and-formats.mdx @@ -26,10 +26,9 @@ To ingest data in formats marked with "T", you need to create tables (with conne | [Amazon S3](/integrations/sources/s3) | Latest | [JSON](#json), [CSV](#csv), [Parquet](#parquet) | | [Google Cloud Storage](/integrations/sources/google-cloud-storage) | | [JSON](#json), [CSV](#csv), [Parquet](#parquet) | | [Azure Blob](/integrations/sources/azure-blob) | | [JSON](#json), [CSV](#csv), [Parquet](#parquet) | -{/* | [POSIX File System]() | | [CSV](#csv) | */} | [NATS JetStream](/integrations/sources/nats-jetstream) | | [JSON](#json), [Protobuf](#protobuf), [Bytes](#bytes) | | [MQTT](/integrations/sources/mqtt) | | [JSON](#json), [Bytes](#bytes) | -| [Apache Iceberg](/integrations/sources/apache-iceberg) | | No need to specify `FORMAT` | +| [Apache Iceberg](/integrations/sources/apache-iceberg) | | No need to specify `FORMA | | [Load generator](/ingestion/generate-test-data) | Built-in | [JSON](#json) | From feab61fa75b55d260645c4ee34b1d9227ee25a96 Mon Sep 17 00:00:00 2001 From: Richard Chien Date: Tue, 10 Dec 2024 14:08:56 +0800 Subject: [PATCH 3/4] fix Signed-off-by: Richard Chien --- ingestion/supported-sources-and-formats.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ingestion/supported-sources-and-formats.mdx b/ingestion/supported-sources-and-formats.mdx index 7009ff67..ba98f083 100644 --- a/ingestion/supported-sources-and-formats.mdx +++ b/ingestion/supported-sources-and-formats.mdx @@ -28,7 +28,7 @@ To ingest 
data in formats marked with "T", you need to create tables (with conne | [Azure Blob](/integrations/sources/azure-blob) | | [JSON](#json), [CSV](#csv), [Parquet](#parquet) | | [NATS JetStream](/integrations/sources/nats-jetstream) | | [JSON](#json), [Protobuf](#protobuf), [Bytes](#bytes) | | [MQTT](/integrations/sources/mqtt) | | [JSON](#json), [Bytes](#bytes) | -| [Apache Iceberg](/integrations/sources/apache-iceberg) | | No need to specify `FORMA | +| [Apache Iceberg](/integrations/sources/apache-iceberg) | | No need to specify `FORMAT` | | [Load generator](/ingestion/generate-test-data) | Built-in | [JSON](#json) | From a9a1c483ab7313b59db8f90ad98dcc4ab52a47d5 Mon Sep 17 00:00:00 2001 From: Richard Chien Date: Tue, 10 Dec 2024 14:10:37 +0800 Subject: [PATCH 4/4] add version for azblob and gcs Signed-off-by: Richard Chien --- ingestion/supported-sources-and-formats.mdx | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/ingestion/supported-sources-and-formats.mdx b/ingestion/supported-sources-and-formats.mdx index ba98f083..a565f405 100644 --- a/ingestion/supported-sources-and-formats.mdx +++ b/ingestion/supported-sources-and-formats.mdx @@ -24,8 +24,8 @@ To ingest data in formats marked with "T", you need to create tables (with conne | [CDC via Kafka](/ingestion/change-data-capture-with-risingwave) | | [Debezium JSON](#debezium-json) (T), [Maxwell JSON](#maxwell-json) (T), [Canal JSON](#canal-json) (T) | | [Google Pub/Sub](/integrations/sources/google-pub-sub) | | [JSON](#json), [Protobuf](#protobuf), [Avro](#avro), [Bytes](#bytes), [Debezium JSON](#debezium-json) (T), [Maxwell JSON](#maxwell-json) (T), [Canal JSON](#canal-json) (T) | | [Amazon S3](/integrations/sources/s3) | Latest | [JSON](#json), [CSV](#csv), [Parquet](#parquet) | -| [Google Cloud Storage](/integrations/sources/google-cloud-storage) | | [JSON](#json), [CSV](#csv), [Parquet](#parquet) | -| [Azure Blob](/integrations/sources/azure-blob) | | [JSON](#json), [CSV](#csv), 
[Parquet](#parquet) | +| [Google Cloud Storage](/integrations/sources/google-cloud-storage) | Latest | [JSON](#json), [CSV](#csv), [Parquet](#parquet) | +| [Azure Blob](/integrations/sources/azure-blob) | Latest | [JSON](#json), [CSV](#csv), [Parquet](#parquet) | | [NATS JetStream](/integrations/sources/nats-jetstream) | | [JSON](#json), [Protobuf](#protobuf), [Bytes](#bytes) | | [MQTT](/integrations/sources/mqtt) | | [JSON](#json), [Bytes](#bytes) | | [Apache Iceberg](/integrations/sources/apache-iceberg) | | No need to specify `FORMAT` |
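As a closing illustration of the Debezium Avro behavior described in patch 1 (the schema is inferred from the schema registry, so no column list is declared), a table over a Kafka topic might look like the following sketch. The topic, broker, and registry addresses are invented for this example.

```sql
-- Hypothetical example: no column list is needed because the schema
-- is inferred from the (invented) schema registry address.
CREATE TABLE orders_cdc
WITH (
    connector = 'kafka',
    topic = 'dbserver1.inventory.orders',
    properties.bootstrap.server = 'message_queue:29092'
)
FORMAT DEBEZIUM ENCODE AVRO (
    schema.registry = 'http://message_queue:8081'
);
```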