[BUG] Failed test_multi_tier_ast[DATAGEN_SEED=1701445668] on CI #9932

cindyyuanjiang · 2023-12-02T03:30:20Z

Describe the bug
Failed integration test. Pipeline: rapids_it-MT-egx06-standalone

Details:

FAILED ../../src/main/python/ast_test.py::test_multi_tier_ast[DATAGEN_SEED=1701445668]

[2023-12-01T22:00:53.723Z] --- CPU OUTPUT
[2023-12-01T22:00:53.723Z] +++ GPU OUTPUT
[2023-12-01T22:00:53.723Z] @@ -1,4 +1,3 @@
[2023-12-01T22:00:53.723Z] -Row(((id < x) = (id < (id + x)))=False)
[2023-12-01T22:00:53.723Z]  Row(((id < x) = (id < (id + x)))=False)
[2023-12-01T22:00:53.723Z]  Row(((id < x) = (id < (id + x)))=False)
[2023-12-01T22:00:53.723Z]  Row(((id < x) = (id < (id + x)))=False)
[2023-12-01T22:00:53.723Z] @@ -8,3 +7,4 @@
[2023-12-01T22:00:53.723Z]  Row(((id < x) = (id < (id + x)))=False)
[2023-12-01T22:00:53.723Z]  Row(((id < x) = (id < (id + x)))=False)
[2023-12-01T22:00:53.723Z]  Row(((id < x) = (id < (id + x)))=False)
[2023-12-01T22:00:53.723Z] +Row(((id < x) = (id < (id + x)))=False)

Steps/Code to reproduce bug
Integration test

Expected behavior
Should pass

gerashegalov · 2023-12-02T06:12:01Z

It does not reproduce easily for me in local mode. To reproduce with Spark 3.5.0 locally, run against a local-cluster with at least 2 executors using the command below. It may take several iterations, so the command presumes you have installed pytest-repeat:

DATAGEN_SEED=1701445707 \
NUM_LOCAL_EXECS=2 \
SPARK_HOME=~/dist/spark-3.5.0-bin-hadoop3 \
TEST_PARALLEL=0 \
  ./integration_tests/run_pyspark_from_build.sh -k test_multi_tier_ast --count=10

Reran the above 3 times: 4 (pass) / 6 (fail), 3 (pass) / 7 (fail), 5 (pass) / 5 (fail)

jlowe · 2023-12-04T15:11:26Z

The problem is one of ordering. It becomes very apparent when updating the projection to include the original input value and increasing the number of executors.

revans2 · 2023-12-04T15:14:33Z

I think it has to be because of the repartition(1) in the test. A project should maintain the order, but a repartition has races in it. Why do we have that in it?
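To illustrate the ordering point, here is a minimal sketch in plain Python, not the change made in #9946. The sample rows and the helper `same_rows_ignoring_order` are hypothetical; the rows carry the original inputs `id` and `x` alongside the result, and the two lists differ only in the order in which two simulated partitions arrive. Python integers do not overflow, so the result column here is only a stand-in for what Spark would compute.

```python
from collections import Counter

def expr(id_, x):
    # Same shape as the test's projection: (id < x) = (id < (id + x))
    return (id_ < x) == (id_ < id_ + x)

# Hypothetical (id, x) inputs, made up for illustration.
inputs = [(1, 5), (2, 0), (3, 1), (4, -1)]
cpu_rows = [(i, x, expr(i, x)) for i, x in inputs]

# Same rows, but the second simulated partition arrives first,
# as can happen after a repartition.
gpu_rows = cpu_rows[2:] + cpu_rows[:2]

def same_rows_ignoring_order(left, right):
    """Compare two row collections as multisets, so duplicates still count."""
    return Counter(left) == Counter(right)

ordered_match = cpu_rows == gpu_rows
unordered_match = same_rows_ignoring_order(cpu_rows, gpu_rows)
print(ordered_match, unordered_match)  # prints: False True
```

Comparing as multisets rather than sets matters here because the projected boolean repeats across many rows, and a set comparison would hide a missing or extra duplicate.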

cindyyuanjiang added the bug and ? - Needs Triage labels Dec 2, 2023

jlowe self-assigned this Dec 4, 2023

jlowe mentioned this issue Dec 4, 2023

Fix test_multi_tier_ast to ignore ordering of output rows #9946

Merged

ttnghia linked a pull request Dec 4, 2023 that will close this issue

Fix test_multi_tier_ast to ignore ordering of output rows #9946

Merged

jlowe closed this as completed Dec 4, 2023

mattahrens removed the ? - Needs Triage label Dec 15, 2023
