
add error_record_table #68

Open · wants to merge 2 commits into `main`
Conversation

@TennyZhuang (Contributor) commented Jul 31, 2023

Better names are welcome!

Signed-off-by: TennyZhuang <[email protected]>

## Summary

Our current streaming engine does not help users discover, debug, and handle errors well. When a user hits a data record error, they can only find a log record like ``ExprError: Parse error: expected `,` or `]` at line 1 column 10 (ProjectExecutor: fragment_id=19007)``.
Member:

FWIW, we currently also have error reporting to Prometheus (risingwavelabs/risingwave#7824), but it does not contain the original data (seemingly for the same reason described in the Alternatives section).

Member:

@neverchanje proposed similar ideas in risingwavelabs/risingwave#7803:

> But when it comes to production deployment, we will need to provide an option that allows users to persist lost messages somewhere (maybe in Hummock or a log store) so that they can search for what reason and what data were lost (maybe through ElasticSearch).

It was closed as "we are now able to find the errors from Grafana", but we haven't supported "allow users to persist lost messages" yet.

Member:

Yeah. According to our discussion today, I feel that the major motivation of this RFC is to provide a "dead letter table" per operator, so that users can find the raw records that caused the problem.

I totally agree with the idea, but the approach seems too difficult for a common user, especially when they face tens of tables and can hardly tell which table to look into.

Signed-off-by: TennyZhuang <[email protected]>

1. `id bigint`: The ID can be generated by a method similar to `row_id` (vnode + local monotonic ID).
2. `error_reason varchar`: A human-readable error message.
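
For illustration, here is a minimal sketch of what a single ERT's schema could look like with the two columns above, assuming an operator whose input has columns `v1 int` and `v2 varchar`; all names other than `id` and `error_reason` are hypothetical:

```sql
-- Hypothetical sketch only; the real ERT would be an internal table, not user-created.
CREATE TABLE operator_x_error_0 (
    id           bigint,   -- generated like row_id: vnode + local monotonic ID
    error_reason varchar,  -- human-readable error message
    v1           int,      -- the offending input record, kept column by column
    v2           varchar,
    PRIMARY KEY (id)
);
```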

@kwannoel (Contributor), Aug 2, 2023:

Do we also have a column for the SQL? Or store it in the `error_reason` column?

Contributor:

IMO we may store an error ID instead of a varchar here?

Contributor:

Then store error_reasons in a system table.

@kwannoel (Contributor), Aug 2, 2023:

> Do we also have a column for the SQL? Or store it in the `error_reason` column?

Store the mview ID.

Contributor:

> IMO we may store an error ID instead of a varchar here?

Yes we can, but we would have to deal with backward-compatibility issues, in case users construct their own mapping logic using the error IDs.
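
A hedged sketch of the idea in this thread (store a compact error ID plus the mview ID in the record table, and keep the human-readable reasons in a system table); every table and column name below is made up for illustration:

```sql
-- Hypothetical mapping table from error IDs to human-readable reasons.
CREATE TABLE rw_error_reasons (
    error_id     int PRIMARY KEY,
    error_reason varchar
);

-- Recovering the readable message for records in some hypothetical ERT:
SELECT e.id, e.mview_id, r.error_reason
FROM some_error_record_table AS e
JOIN rw_error_reasons AS r ON e.error_id = r.error_id;
```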


### Creating

The ERTs are automatically created as internal tables when an operator is created. In most cases, an operator will have n ERTs, where n corresponds to the number of inputs it has.
@fuyufjh (Member), Aug 2, 2023:

So how can a user get all errors that happened in an MV?

I think users are not supposed to understand the distributed execution plan...

Member:

Also, I think it sounds too heavy to keep one error table for each operator. I tend to prefer a single error table for the whole database.

@BugenZhao (Member), Aug 2, 2023:

+1. This reminds me that it would make stateless operators like Project and Filter contain state tables as well. We would even have to handle distributed-execution concerns for them, like scaling or plan evolution. From this perspective, it seems too invasive to me.

@kwannoel (Contributor), Aug 3, 2023:

> +1. This reminds me that it would make stateless operators like Project and Filter contain state tables as well. We would even have to handle distributed-execution concerns for them, like scaling or plan evolution. From this perspective, it seems too invasive to me.

For scaling, I think it should be less of a problem if we follow Eric's approach of uuid + singleton distribution.

Edit: Sorry, I realized that scale-out still needs to create new state tables, and scale-in needs to merge state tables.

Comment on lines +49 to +50

1. `id bigint`: The ID can be generated by a method similar to `row_id` (vnode + local monotonic ID).
2. `error_reason varchar`: A human-readable error message.
Contributor:

Also an error level: warn / error / critical.


### Naming

Same as other internal tables, but suffixed with `error_{seq}`.
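
A purely hypothetical illustration of that convention, assuming an operator with two inputs (so `error_0` and `error_1`) and assuming internal tables can be listed with something like `SHOW INTERNAL TABLES`; the name prefix below is invented for illustration, not taken from the RFC:

```sql
SHOW INTERNAL TABLES;
-- __internal_mv1_4_hashjoin_1001          (existing state table)
-- __internal_mv1_4_hashjoin_1001_error_0  (ERT for input 0)
-- __internal_mv1_4_hashjoin_1001_error_1  (ERT for input 1)
```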
Contributor:

What's the scope of an ERT (one ERT per executor, or per fragment)?


### Creating

The ERTs are automatically created as internal tables when an operator is created. In most cases, an operator will have n ERTs, where n corresponds to the number of inputs it has.
Member:

Do we have any implementation details in this RFC? For example, how will this change the interface of expression evaluation? Will we have to pass the new StateTable for error records everywhere?


### Data correction

The ERT could potentially be used to correct data; for example, users could clean up the data within the ERT and then re-import it into the source.
Member:

The record in the ERT could be an intermediate result of an arbitrary operator in the plan, so it might be difficult for users to recognize or understand the values (unless they are experts in RisingWave and know the execution details). So I'm afraid users are not always able to operate on these "discarded" records, let alone correct them, except for some special ones like errors from source parsing or type casting.

@fuyufjh (Member) commented Aug 2, 2023

After some extra thinking, I would like to propose using a single error_table rather than one error_table per operator.

As mentioned in the RFC, our targets include

  1. We can ensure that our storage engine can handle the volume of erroneous data, as it is of the same magnitude as the source.
  2. We can ensure the error records are durable.
  3. Users can view the error records directly over psql.
  4. Users can reproduce the error easily by the similar SQL.

In short, turning an error_table per operator into a single error_table will basically enhance 3 a lot and slightly impair 4. Additionally, it avoids introducing lots of error_tables, which would be confusing for most people who don't really care about errors, and it keeps backward compatibility.

I'll explain these pros and cons one by one.

"Enhance 3: Users can view the error records directly over psql"

In the single-table approach, users only need to query the rw_error_table and they will get all errors that happened recently. This is quite intuitive to anyone.
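
As a sketch of that experience, assuming a single `rw_error_table` with illustrative columns (none of these names are settled in this thread), a user could do something like:

```sql
-- Illustrative only: one global error table, queryable directly over psql.
SELECT occurred_at, fragment_id, error_reason, raw_record
FROM rw_error_table
WHERE occurred_at > now() - interval '1 hour'
ORDER BY occurred_at DESC
LIMIT 100;
```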

Instead, in the error_table-per-operator fashion, we might need to provide an extra 'view' for that purpose. Note that this is not an actual 'view' because when a new materialized view appears, the view needs to be updated accordingly, which is somewhat dirty to me.

"Slightly impair 4: Users can reproduce the error easily by the similar SQL."

In most cases, users don't really need to reproduce it in RisingWave (unless it's RW's bug, but that's another story). For example, as a user, if I encountered a "JSON Parse" exception, it would be enough to tell me the record ID so I can check the JSON in the upstream by myself. Then I'll try to fix that data and optionally emit the record into Kafka again to let RisingWave consume it. Reproducing it on the RisingWave side doesn't help me a lot.

"It avoids introducing lots of error_tables"

Introducing lots of error_tables would be ugly both from the user's side and for us.

From the user's side, our SHOW ALL TABLES command is supposed to show all internal state tables. It just looks confusing if more than half of them are xxx_error_record_tables.

For us, we currently assume a table is one of these 3 types: tables, materialized views, and streaming states. The xxx_error_record_tables would be a new kind and would create lots of related things: compaction groups, monitoring metrics, etc.

"Keeps backward compatibility"

Because there will be only one error_table, we can simply hard-code this table in the system catalog. That means we don't need to change the plan nodes of existing fragments, limiting the changes to the operators' code implementation.


Finally, since we have discussed that we will write into "error_record_tables" with blind writes and singleton distribution (which means the writers aren't aware of any distribution and simply write with a random unique ID), both approaches have similar complexity in terms of the core implementation (i.e. the write path).
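
To make that concrete, here is a minimal sketch of what such a single error table could look like under those assumptions (blind writes keyed by a random unique ID, singleton distribution); every name and type is illustrative rather than a proposed design:

```sql
-- Hypothetical schema; the writer just appends rows keyed by a random UUID,
-- so it needs no knowledge of data distribution.
CREATE TABLE rw_error_table (
    id           varchar PRIMARY KEY,  -- random unique ID (e.g. a UUID string)
    fragment_id  int,                  -- where the error happened
    error_reason varchar,              -- human-readable message (or an error ID)
    raw_record   jsonb,                -- the offending record, possibly truncated
    occurred_at  timestamptz
);
```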

@kwannoel (Contributor) commented Aug 3, 2023

"Enhance 3: Users can view the error records directly over psql"

In the single-table approach, the users only need to query the rw_error_table and he will get all errors happened recently. This is quite intuitive to anyone.

Instead, in the error_table-per-operator fashion, we might need to provide an extra 'view' for that purpose. Note that this is not an actual 'view' because when a new materialized view appears, the view needs to be updated accordingly, which is somehow dirty to me.

For 3, I didn't quite get it. Why don't we just CREATE VIEW to simulate the rw_error_table which contains all recent errors, rather than CREATE MATERIALIZED VIEW? That runs the computation ad hoc, rather than incrementally.

> Finally, since we have discussed that we will write into "error_record_tables" with blind writes and singleton distribution (which means the writers aren't aware of any distribution and simply write with a random unique ID), both approaches have similar complexity in terms of the core implementation (i.e. the write path).

How do we synchronize access to the state table? Or do we use uuid here, instead of row_id, to handle that? If so, that makes sense 👍

Also, I'm assuming the error record will be stored as jsonb? Is our jsonb implementation still fixed-size, such that it could lead to data loss?
I guess another advantage of one ERT per executor is that we wouldn't encounter this?

@fuyufjh (Member) commented Aug 3, 2023

> Instead, in the error_table-per-operator fashion, we might need to provide an extra 'view' for that purpose. Note that this is not an actual 'view' because when a new materialized view appears, the view needs to be updated accordingly, which is somewhat dirty to me.
>
> For 3, I didn't quite get it. Why don't we just CREATE VIEW to simulate the rw_error_table which contains all recent errors, rather than CREATE MATERIALIZED VIEW? That runs the computation ad hoc, rather than incrementally.

No, I'm not using MATERIALIZED VIEW.

A "View" in database is defined by a SQL query: create view <view_name> as <query>. However, here we have multiple error_tables with different schema:

| table name | schema |
| --- | --- |
| error_table_of_operator_a | col1 (int), col2 (varchar), error_message (varchar) |
| error_table_of_operator_b | col1 (int), col2 (decimal), col3 (varchar), error_message (varchar) |
| error_table_of_operator_c | col1 (varchar), col2 (varchar), col3 (int), error_message (varchar) |

If we want to query from all error tables, then the query would be like:

/* note the columns are lost because they can't be merged into one result set */
select error_message from error_table_of_operator_a
union all
select error_message from error_table_of_operator_b
union all
select error_message from error_table_of_operator_c

Furthermore, if a materialized view is created or dropped, you need to update the view accordingly:

select error_message from error_table_of_operator_a
union all
select error_message from error_table_of_operator_b
union all
select error_message from error_table_of_operator_c
union all
select error_message from error_table_of_operator_d -- d is a new operator

That's why I said it's kind of dirty: we need to hook into the DDL processes. Or, alternatively, give up the SQL approach and provide a hard-coded access path. This is also dirty because we need to construct such a query plan in a hard-coded way, instead of running the normal optimizer process. 😇

@fuyufjh (Member) commented Aug 3, 2023

> How do we synchronize access to the state table? Or do we use uuid here, instead of row_id, to handle that? If so, that makes sense 👍

Yes, uuid should be the most practical solution.

> Also, I'm assuming the error record will be stored as jsonb? Is our jsonb implementation still fixed-size, such that it could lead to data loss?

Yes, JSONB is one of the options. Technically there is no reason for data loss. Actually, I don't really care whether the record is kept losslessly; I think it's okay to truncate records to limit their size, because the error is only meant for human users to read.

> I guess another advantage of one ERT per executor is that we wouldn't encounter this?

True.
