Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Division by zero exception #17131

Closed
ravenfu opened this issue Jun 5, 2024 · 12 comments
Closed

Division by zero exception #17131

ravenfu opened this issue Jun 5, 2024 · 12 comments
Labels
type/bug Something isn't working user-feedback
Milestone

Comments

@ravenfu
Copy link

ravenfu commented Jun 5, 2024

Describe the bug

I generated a record with over 15 million entries and 49 columns into a RisingWave database. When I query it, sometimes it reports "division by zero," and other times it works well.

I am sure there is no number division by zero.

I found that when it reports the exception, then if I reduce the number of columns by half, the query works. And then, if I query all the columns again, it works.

sql 时好时不好, 报除零错误

Error message/log

SQL 错误 [XX000]: ERROR: Failed to execute the statement

Caused by these errors (recent errors listed first):
  1: gRPC request to batch service failed: Internal error
  2: Expr error
  3: Division by zero

To Reproduce

No response

Expected behavior

sql is:
select
symbol,
date_time,
(bid_price0 * bid_volume0 + offer_price0 * offer_volume0) / (bid_volume0 + offer_volume0) as wap0,
(bid_price1 * bid_volume1 + offer_price1 * offer_volume1) / (bid_volume1 + offer_volume1) as wap1,
(bid_price2 * bid_volume2 + offer_price2 * offer_volume2) / (bid_volume2 + offer_volume2) as wap2,
(bid_price3 * bid_volume3 + offer_price3 * offer_volume3) / (bid_volume3 + offer_volume3) as wap3,
(bid_price4 * bid_volume4 + offer_price4 * offer_volume4) / (bid_volume4 + offer_volume4) as wap4,
(bid_price5 * bid_volume5 + offer_price5 * offer_volume5) / (bid_volume5 + offer_volume5) as wap5,
(bid_price6 * bid_volume6 + offer_price6 * offer_volume6) / (bid_volume6 + offer_volume6) as wap6,
(bid_price7 * bid_volume7 + offer_price7 * offer_volume7) / (bid_volume7 + offer_volume7) as wap7,
(bid_price8 * bid_volume8 + offer_price8 * offer_volume8) / (bid_volume8 + offer_volume8) as wap8,
(bid_price9 * bid_volume9 + offer_price9 * offer_volume9) / (bid_volume9 + offer_volume9) as wap9,
(offer_price0 - bid_price0) * 2 / (offer_price0 + bid_price0) as price_spread0,
(offer_price1 - bid_price1) * 2 / (offer_price1 + bid_price1) as price_spread1,
(offer_price2 - bid_price2) * 2 / (offer_price2 + bid_price2) as price_spread2,
(offer_price3 - bid_price3) * 2 / (offer_price3 + bid_price3) as price_spread3,
(offer_price4 - bid_price4) * 2 / (offer_price4 + bid_price4) as price_spread4,
(offer_price5 - bid_price5) * 2 / (offer_price5 + bid_price5) as price_spread5,
(offer_price6 - bid_price6) * 2 / (offer_price6 + bid_price6) as price_spread6,
(offer_price7 - bid_price7) * 2 / (offer_price7 + bid_price7) as price_spread7,
(offer_price8 - bid_price8) * 2 / (offer_price8 + bid_price8) as price_spread8,
(offer_price9 - bid_price9) * 2 / (offer_price9 + bid_price9) as price_spread9,
(bid_price0 - bid_price1) as bid_spread1,
(bid_price1 - bid_price2) as bid_spread2,
(bid_price2 - bid_price3) as bid_spread3,
(bid_price3 - bid_price4) as bid_spread4,
(bid_price4 - bid_price5) as bid_spread5,
(bid_price5 - bid_price6) as bid_spread6,
(bid_price6 - bid_price7) as bid_spread7,
(bid_price7 - bid_price8) as bid_spread8,
(bid_price8 - bid_price9) as bid_spread9,
(offer_price0 - offer_price1) as offer_spread1,
(offer_price1 - offer_price2) as offer_spread2,
(offer_price2 - offer_price3) as offer_spread3,
(offer_price3 - offer_price4) as offer_spread4,
(offer_price4 - offer_price5) as offer_spread5,
(offer_price5 - offer_price6) as offer_spread6,
(offer_price6 - offer_price7) as offer_spread7,
(offer_price7 - offer_price8) as offer_spread8,
(offer_price8 - offer_price9) as offer_spread9,
(offer_volume0+offer_volume1+offer_volume2+offer_volume3+offer_volume4++offer_volume5+offer_volume6+offer_volume7+offer_volume8+offer_volume9
+bid_volume0+bid_volume1+bid_volume2+bid_volume3+bid_volume4++bid_volume5+bid_volume6+bid_volume7+bid_volume8+bid_volume9) as total_volume,
(offer_volume0+offer_volume1+offer_volume2+offer_volume3+offer_volume4++offer_volume5+offer_volume6+offer_volume7+offer_volume8+offer_volume9 -bid_volume0+bid_volume1+bid_volume2+bid_volume3+bid_volume4++bid_volume5+bid_volume6+bid_volume7+bid_volume8+bid_volume9) as volume_imbalance,
log10(offer_price1 / offer_price0) as log_return_offer_1,
log10(offer_price2 / offer_price1) as log_return_offer_2,
log10(offer_price3 / offer_price2) as log_return_offer_3,
log10(offer_price4 / offer_price3) as log_return_offer_4,
log10(offer_price5 / offer_price4) as log_return_offer_5,
log10(offer_price6 / offer_price5) as log_return_offer_6,
log10(offer_price7 / offer_price6) as log_return_offer_7,
log10(offer_price8 / offer_price7) as log_return_offer_8,
log10(offer_price9 / offer_price8) as log_return_offer_9,
log10(bid_price1 / bid_price0) as log_return_bid_1,
log10(bid_price2 / bid_price1) as log_return_bid_2,
log10(bid_price3 / bid_price2) as log_return_bid_3,
log10(bid_price4 / bid_price3) as log_return_bid_4,
log10(bid_price5 / bid_price4) as log_return_bid_5,
log10(bid_price6 / bid_price5) as log_return_bid_6,
log10(bid_price7 / bid_price6) as log_return_bid_7,
log10(bid_price8 / bid_price7) as log_return_bid_8,
log10(bid_price9 / bid_price8) as log_return_bid_9
from tick

data generator is:
tick_generator_optimized2.py.zip

How did you deploy RisingWave?

using docker composer file: docker-compose-distribution.yml

The version of RisingWave

1.8.0

Additional context

No response

@ravenfu ravenfu added the type/bug Something isn't working label Jun 5, 2024
@github-actions github-actions bot added this to the release-1.10 milestone Jun 5, 2024
@wangrunji0408
Copy link
Contributor

wangrunji0408 commented Jun 5, 2024

You can try the following query to find possible 0 divisors:

select 
    bid_volume0 + offer_volume0,
    bid_volume1 + offer_volume1,
    bid_volume2 + offer_volume2,
    bid_volume3 + offer_volume3,
    bid_volume4 + offer_volume4,
    bid_volume5 + offer_volume5,
    bid_volume6 + offer_volume6,
    bid_volume7 + offer_volume7,
    bid_volume8 + offer_volume8,
    bid_volume9 + offer_volume9,
    offer_price0 + bid_price0,
    offer_price1 + bid_price1,
    offer_price2 + bid_price2,
    offer_price3 + bid_price3,
    offer_price4 + bid_price4,
    offer_price5 + bid_price5,
    offer_price6 + bid_price6,
    offer_price7 + bid_price7,
    offer_price8 + bid_price8,
    offer_price9 + bid_price9,
    offer_price0,
    offer_price1,
    offer_price2,
    offer_price3,
    offer_price4,
    offer_price5,
    offer_price6,
    offer_price7,
    offer_price8,
    bid_price0,
    bid_price1,
    bid_price2,
    bid_price3,
    bid_price4,
    bid_price5,
    bid_price6,
    bid_price7,
    bid_price8
from tick
where
    bid_volume0 + offer_volume0 = 0
    or bid_volume1 + offer_volume1 = 0
    or bid_volume2 + offer_volume2 = 0
    or bid_volume3 + offer_volume3 = 0
    or bid_volume4 + offer_volume4 = 0
    or bid_volume5 + offer_volume5 = 0
    or bid_volume6 + offer_volume6 = 0
    or bid_volume7 + offer_volume7 = 0
    or bid_volume8 + offer_volume8 = 0
    or bid_volume9 + offer_volume9 = 0
    or offer_price0 + bid_price0 = 0
    or offer_price1 + bid_price1 = 0
    or offer_price2 + bid_price2 = 0
    or offer_price3 + bid_price3 = 0
    or offer_price4 + bid_price4 = 0
    or offer_price5 + bid_price5 = 0
    or offer_price6 + bid_price6 = 0
    or offer_price7 + bid_price7 = 0
    or offer_price8 + bid_price8 = 0
    or offer_price9 + bid_price9 = 0
    or offer_price0 = 0
    or offer_price1 = 0
    or offer_price2 = 0
    or offer_price3 = 0
    or offer_price4 = 0
    or offer_price5 = 0
    or offer_price6 = 0
    or offer_price7 = 0
    or offer_price8 = 0
    or bid_price0 = 0
    or bid_price1 = 0
    or bid_price2 = 0
    or bid_price3 = 0
    or bid_price4 = 0
    or bid_price5 = 0
    or bid_price6 = 0
    or bid_price7 = 0
    or bid_price8 = 0;

@wangrunji0408
Copy link
Contributor

#17083 would also be helpful if you use the latest nightly version.

@neverchanje
Copy link
Contributor

neverchanje commented Jun 6, 2024

(bid_price0 * bid_volume0 + offer_price0 * offer_volume0) / (bid_volume0 + offer_volume0) as wap0,
Is it possible that there're some expressions like bid_volume0 + offer_volume0 being 0?

@ravenfu
Copy link
Author

ravenfu commented Jun 7, 2024

  1. I did find some 0 in tick later, and this is indeed a slap in the face.
    But before that, I used the following sql to query, and the result was not found.

select * from
tick
where
offer_price0 = 0
or offer_price1 = 0
or offer_price2 = 0
or offer_price3 = 0
or offer_price4 = 0
or offer_price5 = 0
or offer_price6 = 0
or offer_price7 = 0
or offer_price8 = 0
or bid_price0 = 0
or bid_price1 = 0
or bid_price2 = 0
or bid_price3 = 0
or bid_price4 = 0
or bid_price5 = 0
or bid_price6 = 0
or bid_price7 = 0
or bid_price8 = 0;
2. What puzzles me is that my sql can sometimes succeed and sometimes fail.
3. I wrapped the query by selct count , and I have also found the result before, which took more than 2 minutes
select count(1) from (
select
symbol,
date_time,
(bid_price0 * bid_volume0 + offer_price0 * offer_volume0) / (bid_volume0 + offer_volume0) as wap0,
(bid_price1 * bid_volume1 + offer_price1 * offer_volume1) / (bid_volume1 + offer_volume1) as wap1,
(bid_price2 * bid_volume2 + offer_price2 * offer_volume2) / (bid_volume2 + offer_volume2) as wap2,
(bid_price3 * bid_volume3 + offer_price3 * offer_volume3) / (bid_volume3 + offer_volume3) as wap3,
(bid_price4 * bid_volume4 + offer_price4 * offer_volume4) / (bid_volume4 + offer_volume4) as wap4,
...
log10(bid_price7 / bid_price6) as log_return_bid_7,
log10(bid_price8 / bid_price7) as log_return_bid_8,
log10(bid_price9 / bid_price8) as log_return_bid_9
from tick
)
4. And this following sql would crash the risingwave cluster and can no longer connect, so it can only be restarted to return to normal;
If I reduce the query conditions, I can execute successfully

select
*
from tick
where
bid_volume0 + offer_volume0 = 0
or bid_volume1 + offer_volume1 = 0
or bid_volume2 + offer_volume2 = 0
or bid_volume3 + offer_volume3 = 0
or bid_volume4 + offer_volume4 = 0
or bid_volume5 + offer_volume5 = 0
or bid_volume6 + offer_volume6 = 0
or bid_volume7 + offer_volume7 = 0
or bid_volume8 + offer_volume8 = 0
or bid_volume9 + offer_volume9 = 0
or offer_price0 + bid_price0 = 0
or offer_price1 + bid_price1 = 0
or offer_price2 + bid_price2 = 0
or offer_price3 + bid_price3 = 0
or offer_price4 + bid_price4 = 0
or offer_price5 + bid_price5 = 0
or offer_price6 + bid_price6 = 0
or offer_price7 + bid_price7 = 0
or offer_price8 + bid_price8 = 0
or offer_price9 + bid_price9 = 0
or offer_price0 = 0
or offer_price1 = 0
or offer_price2 = 0
or offer_price3 = 0
or offer_price4 = 0
or offer_price5 = 0
or offer_price6 = 0
or offer_price7 = 0
or offer_price8 = 0
or bid_price0 = 0
or bid_price1 = 0
or bid_price2 = 0
or bid_price3 = 0
or bid_price4 = 0
or bid_price5 = 0
or bid_price6 = 0
or bid_price7 = 0
or bid_price8 = 0;

@ravenfu
Copy link
Author

ravenfu commented Jun 7, 2024

And After regenerating the data and ensuring that no zeros in it, the SQL query is normal.
I still think this unstable phenomenon still needs to be solved

@wangrunji0408
Copy link
Contributor

  1. And this following sql would crash the risingwave cluster and can no longer connect, so it can only be restarted to return to normal;

Did you find any error message in the log? This sounds like a bug.

@ravenfu
Copy link
Author

ravenfu commented Jun 7, 2024

I didn't find any exception logs for this sql, and don't know what the cluster doing, the cpu monitor is very low, under 4%

Front-end was outputting these logs, but it should no matter with this sql

2024-06-07T05:07:28.988642552Z ERROR pgwire::pg_server: error when reading message error=unexpected end of file
2024-06-07T05:07:30.029558191Z INFO pgwire::pg_server: accept connection peer_addr=127.0.0.1:46236
2024-06-07T05:07:30.029647855Z ERROR pgwire::pg_server: error when reading message error=unexpected end of file
2024-06-07T05:07:31.061669414Z INFO pgwire::pg_server: accept connection peer_addr=127.0.0.1:46252
2024-06-07T05:07:31.061762196Z ERROR pgwire::pg_server: error when reading message error=unexpected end of file
2024-06-07T05:07:32.096111237Z INFO pgwire::pg_server: accept connection peer_addr=127.0.0.1:46268
2024-06-07T05:07:32.09617817Z ERROR pgwire::pg_server: error when reading message error=unexpected end of file
2024-06-07T05:07:33.138831406Z INFO pgwire::pg_server: accept connection peer_addr=127.0.0.1:46286
2024-06-07T05:07:33.138897776Z ERROR pgwire::pg_server: error when reading message error=unexpected end of file
2024-06-07T05:07:34.172614848Z INFO pgwire::pg_server: accept connection peer_addr=127.0.0.1:46300
2024-06-07T05:07:34.172683396Z ERROR pgwire::pg_server: error when reading message error=unexpected end of file
2024-06-07T05:07:35.206957536Z INFO pgwire::pg_server: accept connection peer_addr=127.0.0.1:46316
2024-06-07T05:07:35.207022742Z ERROR pgwire::pg_server: error when reading message error=unexpected end of file
2024-06-07T05:07:36.241974459Z INFO pgwire::pg_server: accept connection peer_addr=127.0.0.1:46330
2024-06-07T05:07:36.242111131Z ERROR pgwire::pg_server: error when reading message error=unexpected end of file
2024-06-07T05:07:37.279862422Z INFO pgwire::pg_server: accept connection peer_addr=127.0.0.1:46346
2024-06-07T05:07:37.279939087Z ERROR pgwire::pg_server: error when reading message error=unexpected end of file
2024-06-07T05:07:38.318065723Z INFO pgwire::pg_server: accept connection peer_addr=127.0.0.1:46362
2024-06-07T05:07:38.318121505Z ERROR pgwire::pg_server: error when reading message error=unexpected end of file
2024-06-07T05:07:39.356312607Z INFO pgwire::pg_server: accept connection peer_addr=127.0.0.1:46378
2024-06-07T05:07:39.356369015Z ERROR pgwire::pg_server: error when reading message error=unexpected end of file
2024-06-07T05:07:40.392404724Z INFO pgwire::pg_server: accept connection peer_addr=127.0.0.1:46394
2024-06-07T05:07:40.392473698Z ERROR pgwire::pg_server: error when reading message error=unexpected end of file
2024-06-07T05:07:41.43673664Z INFO pgwire::pg_server: accept connection peer_addr=127.0.0.1:46414

@fuyufjh
Copy link
Member

fuyufjh commented Jun 19, 2024

FYI. In future versions, RW will print the function name and argument values along with the expression error #17083

@fuyufjh
Copy link
Member

fuyufjh commented Jun 19, 2024

  1. And this following sql would crash the risingwave cluster and can no longer connect, so it can only be restarted to return to normal; If I reduce the query conditions, I can execute successfully

select * from tick where bid_volume0 + offer_volume0 = 0 or bid_volume1 + offer_volume1 = 0 or bid_volume2 + offer_volume2 = 0 or bid_volume3 + offer_volume3 = 0 or bid_volume4 + offer_volume4 = 0 or bid_volume5 + offer_volume5 = 0 or bid_volume6 + offer_volume6 = 0 or bid_volume7 + offer_volume7 = 0 or bid_volume8 + offer_volume8 = 0 or bid_volume9 + offer_volume9 = 0 or offer_price0 + bid_price0 = 0 or offer_price1 + bid_price1 = 0 or offer_price2 + bid_price2 = 0 or offer_price3 + bid_price3 = 0 or offer_price4 + bid_price4 = 0 or offer_price5 + bid_price5 = 0 or offer_price6 + bid_price6 = 0 or offer_price7 + bid_price7 = 0 or offer_price8 + bid_price8 = 0 or offer_price9 + bid_price9 = 0 or offer_price0 = 0 or offer_price1 = 0 or offer_price2 = 0 or offer_price3 = 0 or offer_price4 = 0 or offer_price5 = 0 or offer_price6 = 0 or offer_price7 = 0 or offer_price8 = 0 or bid_price0 = 0 or bid_price1 = 0 or bid_price2 = 0 or bid_price3 = 0 or bid_price4 = 0 or bid_price5 = 0 or bid_price6 = 0 or bid_price7 = 0 or bid_price8 = 0;

and

I didn't find any exception logs for this sql, and don't know what the cluster doing, the cpu monitor is very low, under 4%

Suspecting this might be related to stack size, which is improved recently. #17224 #17342 Can you please take a try with the nightly image? Or try setting env var RUST_MIN_STACK=4194304.

@ravenfu
Copy link
Author

ravenfu commented Jun 21, 2024

RUST_MIN_STACK=4194304 added to my docker compose yml envs, and restarted, but didn't work
and this time I found that the frontend-node CPU is 100%, after 1 hour later, it still not return any result.

@BugenZhao
Copy link
Member

RUST_MIN_STACK=4194304 added to my docker compose yml envs, and restarted, but didn't work and this time I found that the frontend-node CPU is 100%, after 1 hour later, it still not return any result.

Hi. Would you please update to the latest version (v1.9.1) and try again? I guess #16861 (#16868) could help in this case. cc @chenzl25

@BugenZhao
Copy link
Member

Hi. Kindly reopen if you have any updates.

@BugenZhao BugenZhao closed this as not planned Won't fix, can't repro, duplicate, stale Jul 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bug Something isn't working user-feedback
Projects
None yet
Development

No branches or pull requests

5 participants