Commit

Merge branch 'main' into fix/update-upstream-fragment-4-replace
yezizp2012 authored Jan 8, 2024
2 parents 8daca2f + 414c6ec commit baa3053
Showing 135 changed files with 3,647 additions and 4,162 deletions.
89 changes: 89 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions Cargo.toml
@@ -118,6 +118,7 @@ hashbrown = { version = "0.14.0", features = [
criterion = { version = "0.5", features = ["async_futures"] }
tonic = { package = "madsim-tonic", version = "0.4.1" }
tonic-build = { package = "madsim-tonic-build", version = "0.4.2" }
otlp-embedded = { git = "https://github.com/risingwavelabs/otlp-embedded", rev = "58c1f003484449d7c6dd693b348bf19dd44889cb" }
prost = { version = "0.12" }
icelake = { git = "https://github.com/icelake-io/icelake", rev = "3f7b53ba5b563524212c25810345d1314678e7fc", features = [
"prometheus",
24 changes: 18 additions & 6 deletions README.md
@@ -60,7 +60,8 @@
</a>
</div>
RisingWave is a distributed SQL streaming database that enables <b>simple</b>, <b>efficient</b>, and <b>reliable</b> processing of streaming data.
RisingWave is a distributed SQL streaming database engineered to provide the <i><b>simplest</b></i> and <i><b>most cost-efficient</b></i> approach for <b>processing</b> and <b>managing</b> streaming data with utmost reliability.


![RisingWave](https://github.com/risingwavelabs/risingwave-docs/blob/main/docs/images/new_archi_grey.png)

@@ -98,7 +99,7 @@ For **Kubernetes deployments**, please refer to [Kubernetes with Helm](https://d

## Why RisingWave for stream processing?

RisingWave specializes in providing **incrementally updated, consistent materialized views** — a persistent data structure that represents the results of stream processing. RisingWave significantly reduces the complexity of building stream processing applications by allowing developers to express intricate stream processing logic through cascaded materialized views. Furthermore, it allows users to persist data directly within the system, eliminating the need to deliver results to external databases for storage and query serving.
RisingWave provides users with a comprehensive set of frequently used stream processing features, including exactly-once consistency, [time window functions](https://docs.risingwave.com/docs/current/sql-function-time-window/), [watermarks](https://docs.risingwave.com/docs/current/watermarks/), and more. It specializes in providing **incrementally updated, consistent materialized views** — a persistent data structure that represents the results of stream processing. RisingWave significantly reduces the complexity of building stream processing applications by allowing developers to express intricate stream processing logic through cascaded materialized views. Furthermore, it allows users to persist data directly within the system, eliminating the need to deliver results to external databases for storage and query serving.
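The cascaded-materialized-view style described above can be sketched as follows. This is a minimal illustration, not code from the commit; all table, view, and column names are hypothetical.

```sql
-- Illustrative base table; in practice this would typically be a streaming source.
CREATE TABLE orders (order_id INT, customer_id INT, amount DECIMAL, created_at TIMESTAMP);

-- First materialized view: per-customer totals, incrementally maintained
-- as new rows arrive rather than recomputed from scratch.
CREATE MATERIALIZED VIEW customer_totals AS
SELECT customer_id, SUM(amount) AS total_spent
FROM orders
GROUP BY customer_id;

-- Second view cascades off the first, expressing more intricate logic in layers.
CREATE MATERIALIZED VIEW big_spenders AS
SELECT customer_id, total_spent
FROM customer_totals
WHERE total_spent > 1000;

-- Results are queried directly from RisingWave, with no external serving database.
SELECT * FROM big_spenders;
```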

![Real-time Data Pipelines without or with RisingWave](https://github.com/risingwavelabs/risingwave/assets/100685635/414afbb7-5187-410f-9ba4-9a640c8c6306)

@@ -122,14 +123,25 @@ Compared to existing stream processing systems like [Apache Flink](https://flink
* **Instant failure recovery**
* RisingWave's state management mechanism also allows it to recover from failure in seconds, not minutes or hours.

### RisingWave as a database
RisingWave is fundamentally a database that **extends beyond basic streaming data processing capabilities**. It excels in **the effective management of streaming data**, making it a trusted choice for data persistence and powering online applications. RisingWave offers an extensive range of database capabilities, which include:

* High availability
* Serving highly concurrent queries
* Role-based access control (RBAC)
* Integration with data modeling tools, such as [dbt](https://docs.risingwave.com/docs/current/use-dbt/)
* Integration with database management tools, such as [DBeaver](https://docs.risingwave.com/docs/current/dbeaver-integration/)
* Integration with BI tools, such as [Grafana](https://docs.risingwave.com/docs/current/grafana-integration/)
* Schema change
* Processing of semi-structured data
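As a sketch of the semi-structured data point above, RisingWave's Postgres-style JSONB support allows filtering and projecting nested fields directly; the schema and field names below are illustrative assumptions, not from the docs.

```sql
-- Hypothetical event stream carrying a JSON payload per row.
CREATE TABLE events (id INT, payload JSONB);

-- Extract and filter on nested fields using Postgres-style JSON operators.
CREATE MATERIALIZED VIEW clicks AS
SELECT id, payload->>'page' AS page
FROM events
WHERE payload->>'type' = 'click';
```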


## RisingWave's limitations
RisingWave isn’t a panacea for all data engineering hurdles. It has its own set of limitations:
* **No programmable interfaces**
* RisingWave does not provide low-level APIs in languages like Java and Scala, and does not allow users to manage internal states manually (unless you want to hack!). For coding in Java, Scala, and other languages, please consider using RisingWave's User-Defined Functions (UDF).
* RisingWave does not provide low-level APIs in languages like Java and Scala, and does not allow users to manage internal states manually (unless you want to hack!). _For coding in Java, Python, and other languages, please consider using RisingWave's [User-Defined Functions (UDF)](https://docs.risingwave.com/docs/current/user-defined-functions/)_.
* **No support for transaction processing**
* RisingWave isn’t cut out for transactional workloads, thus it’s not a viable substitute for operational databases dedicated to transaction processing. However, it supports read-only transactions, ensuring data freshness and consistency. It also comprehends the transactional semantics of upstream database Change Data Capture (CDC).
* **Not tailored for ad-hoc analytical queries**
* RisingWave's row store design is tailored for optimal stream processing performance rather than interactive analytical workloads. Hence, it's not a suitable replacement for OLAP databases. Yet, a reliable integration with many OLAP databases exists, and a collaborative use of RisingWave and OLAP databases is a common practice among many users.
* RisingWave isn’t cut out for transactional workloads, thus it’s not a viable substitute for operational databases dedicated to transaction processing. _However, it supports [read-only transactions](https://docs.risingwave.com/docs/current/transactions/#read-only-transactions), ensuring data freshness and consistency. It also comprehends the transactional semantics of upstream database [Change Data Capture (CDC)](https://docs.risingwave.com/docs/current/transactions/#transactions-within-a-cdc-table)_.
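The read-only transactions mentioned in the limitation above can be sketched like this; the table name is illustrative. The point is that both reads observe the same consistent snapshot.

```sql
-- A read-only transaction pins a consistent snapshot, so the two
-- aggregates below are computed over the same state.
START TRANSACTION READ ONLY;
SELECT COUNT(*) FROM orders;
SELECT SUM(amount) FROM orders;
COMMIT;
```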


## In-production use cases
2 changes: 2 additions & 0 deletions ci/scripts/notify.py
@@ -13,6 +13,7 @@
"test-notify-2": ["noelkwan", "noelkwan"],
"backfill-tests": ["noelkwan"],
"backwards-compat-tests": ["noelkwan"],
"sqlsmith-differential-tests": ["noelkwan"],
"fuzz-test": ["noelkwan"],
"e2e-test-release": ["zhi"],
"e2e-iceberg-sink-tests": ["renjie"],
@@ -98,6 +99,7 @@ def get_mock_test_status(test):
"e2e-clickhouse-sink-tests": "hard_failed",
"e2e-pulsar-sink-tests": "",
"s3-source-test-for-opendal-fs-engine": "",
"s3-source-tests": "",
"pulsar-source-tests": "",
"connector-node-integration-test": ""
}
2 changes: 1 addition & 1 deletion ci/scripts/s3-source-test-for-opendal-fs-engine.sh
@@ -30,7 +30,7 @@ cargo make ci-start ci-3cn-3fe-opendal-fs-backend

echo "--- Run test"
python3 -m pip install minio psycopg2-binary
python3 e2e_test/s3/$script.py
python3 e2e_test/s3/$script

echo "--- Kill cluster"
rm -rf /tmp/rw_ci
28 changes: 26 additions & 2 deletions ci/workflows/main-cron.yml
@@ -490,6 +490,7 @@ steps:
retry: *auto-retry

- label: "PosixFs source on OpenDAL fs engine (csv parser)"
key: "s3-source-test-for-opendal-fs-engine-csv-parser"
command: "ci/scripts/s3-source-test.sh -p ci-release -s 'posix_fs_source.py csv_without_header'"
if: |
!(build.pull_request.labels includes "ci/main-cron/skip-ci") && build.env("CI_STEPS") == null
@@ -507,7 +508,7 @@

- label: "S3 source on OpenDAL fs engine"
key: "s3-source-test-for-opendal-fs-engine"
command: "ci/scripts/s3-source-test-for-opendal-fs-engine.sh -p ci-release -s run"
command: "ci/scripts/s3-source-test-for-opendal-fs-engine.sh -p ci-release -s run.csv"
if: |
!(build.pull_request.labels includes "ci/main-cron/skip-ci") && build.env("CI_STEPS") == null
|| build.pull_request.labels includes "ci/run-s3-source-tests"
@@ -527,6 +528,29 @@
timeout_in_minutes: 20
retry: *auto-retry

# TODO(Kexiang): Enable this test after we have a GCS_SOURCE_TEST_CONF.
# - label: "GCS source on OpenDAL fs engine"
# key: "s3-source-test-for-opendal-fs-engine"
# command: "ci/scripts/s3-source-test-for-opendal-fs-engine.sh -p ci-release -s gcs.csv"
# if: |
# !(build.pull_request.labels includes "ci/main-cron/skip-ci") && build.env("CI_STEPS") == null
# || build.pull_request.labels includes "ci/run-s3-source-tests"
# || build.env("CI_STEPS") =~ /(^|,)s3-source-tests?(,|$$)/
# depends_on: build
# plugins:
# - seek-oss/aws-sm#v2.3.1:
# env:
# S3_SOURCE_TEST_CONF: ci_s3_source_test_aws
# - docker-compose#v4.9.0:
# run: rw-build-env
# config: ci/docker-compose.yml
# mount-buildkite-agent: true
# environment:
# - S3_SOURCE_TEST_CONF
# - ./ci/plugins/upload-failure-logs
# timeout_in_minutes: 20
# retry: *auto-retry

- label: "pulsar source check"
key: "pulsar-source-tests"
command: "ci/scripts/pulsar-source-test.sh -p ci-release"
@@ -626,6 +650,7 @@ steps:

# Sqlsmith differential testing
- label: "Sqlsmith Differential Testing"
key: "sqlsmith-differential-tests"
command: "ci/scripts/sqlsmith-differential-test.sh -p ci-release"
if: |
!(build.pull_request.labels includes "ci/main-cron/skip-ci") && build.env("CI_STEPS") == null
@@ -639,7 +664,6 @@
config: ci/docker-compose.yml
mount-buildkite-agent: true
timeout_in_minutes: 40
soft_fail: true

- label: "Backfill tests"
key: "backfill-tests"
@@ -3,11 +3,12 @@ import * as d3 from "d3"
import { Dag, DagLink, DagNode, zherebko } from "d3-dag"
import { cloneDeep } from "lodash"
import { useCallback, useEffect, useRef, useState } from "react"
import { Position } from "../lib/layout"

const nodeRadius = 5
const edgeRadius = 12

export default function DependencyGraph({
export default function FragmentDependencyGraph({
mvDependency,
svgWidth,
selectedId,
@@ -18,7 +19,7 @@
selectedId: string | undefined
onSelectedIdChange: (id: string) => void | undefined
}) {
const svgRef = useRef<any>()
const svgRef = useRef<SVGSVGElement>(null)
const [svgHeight, setSvgHeight] = useState("0px")
const MARGIN_X = 10
const MARGIN_Y = 2
@@ -47,7 +48,7 @@
// How to draw edges
const curveStyle = d3.curveMonotoneY
const line = d3
.line<{ x: number; y: number }>()
.line<Position>()
.curve(curveStyle)
.x(({ x }) => x + MARGIN_X)
.y(({ y }) => y)
@@ -85,8 +86,7 @@
sel
.attr(
"transform",
({ x, y }: { x: number; y: number }) =>
`translate(${x + MARGIN_X}, ${y})`
({ x, y }: Position) => `translate(${x + MARGIN_X}, ${y})`
)
.attr("fill", (d: any) =>
isSelected(d) ? theme.colors.blue["500"] : theme.colors.gray["500"]
