In this hands-on workshop, we’ll learn how to process real-time streaming data using SQL in RisingWave, an open-source SQL database for processing and managing streaming data. RisingWave’s user experience should feel familiar, as it is fully wire-compatible with PostgreSQL.
We’ll cover the following topics in this workshop:
- Why Stream Processing?
- Stateless computation (Filters, Projections)
- Stateful computation (Aggregations, Joins)
- Time windowing
- Watermark
- Data Ingestion and Delivery
RisingWave in 10 Minutes: https://tutorials.risingwave.com/docs/intro
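As a rough preview of how several of the topics above look in SQL, here is a minimal sketch. The table name `trips`, its columns, the 5-second watermark delay, and the 1-minute window size are all hypothetical choices made for illustration; they are not taken from the workshop material, and the statements assume a running RisingWave instance that you are connected to with a PostgreSQL client.

```sql
-- Hypothetical append-only table standing in for a stream of taxi trips.
-- It declares a watermark on the event-time column pickup_time.
CREATE TABLE trips (
    trip_id     BIGINT,
    zone        VARCHAR,
    fare        DOUBLE PRECISION,
    pickup_time TIMESTAMP,
    WATERMARK FOR pickup_time AS pickup_time - INTERVAL '5 seconds'
) APPEND ONLY;

-- A filter (stateless) feeding an aggregation (stateful),
-- grouped into 1-minute tumbling windows on pickup_time.
CREATE MATERIALIZED VIEW fares_per_minute AS
SELECT
    zone,
    window_start,
    COUNT(*)  AS trip_count,
    SUM(fare) AS total_fare
FROM TUMBLE(trips, pickup_time, INTERVAL '1 minute')
WHERE fare > 0
GROUP BY zone, window_start;

-- Insert a few made-up rows, then read the view like an ordinary table.
INSERT INTO trips VALUES
    (1, 'Midtown', 12.50, '2024-01-01 10:00:05'),
    (2, 'Midtown',  8.00, '2024-01-01 10:00:40'),
    (3, 'Harlem',  20.00, '2024-01-01 10:01:10');
FLUSH;

SELECT * FROM fares_per_minute ORDER BY window_start, zone;
```

In the workshop itself the data would more likely arrive from an external system than from `INSERT` statements; a plain table is used here only to keep the sketch self-contained.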
The contents below will be covered during the workshop.
What is RisingWave meant to be used for?
- OLTP workloads.
- Ad hoc OLAP workloads.
- Stream Processing.
Which interface does RisingWave support?
- Java SDK.
- PostgreSQL-like interface.
- Rust SDK.
- Python SDK.
What if I want to run a custom function that RisingWave does not support?
- Sink the data out, run the function, and sink it back in.
- Write a Python / Java / WASM / JS UDF.
Is this statement True or False?
I cannot create materialized views on top of other materialized views.
- True
- False
How does RisingWave process ingested data?
- Incrementally, only on checkpoints.
- In batch, each time a user queries a materialized view.
- In batch, at fixed intervals.
- Incrementally, as new records are ingested.
Is the following statement True or False?
RisingWave is only for stream processing; it cannot serve any SELECT requests from applications.
- True
- False
Why can’t we use cross joins in RisingWave Materialized Views?
- Because they are not supported by the SQL standard.
- Because they are not supported by the Incremental View Maintenance algorithm.
- Because they are not supported by the PostgreSQL planner.
- Because they are too expensive, so they are banned in RisingWave’s stream engine.
What is the recommended way to view the progress of long-running SQL statements like CREATE MATERIALIZED VIEW in RisingWave?
- Using the EXPLAIN ANALYZE statement.
- Querying the rw_catalog.rw_ddl_progress table.
- Checking the RisingWave logs.
- It is not possible to view the progress of such statements.
How do I view the execution plan of my SQL query?
- SHOW <query>
- EXPLAIN <query>
- DROP <query>
- VIEW <query>
Which is used to ingest data from external systems?
1. CREATE SOURCE <...>
2. CREATE SINK <...>
What is the purpose of a watermark?
- To specify the time at which a record was ingested.
- To specify the time at which a record was updated.
- To specify the time at which a record was deleted.
- To specify the time at which a record is considered stale, and can be deleted.
- Form for submitting: TBA
- You can submit your homework multiple times. In this case, only the last submission will be used.
Deadline: TBA