Enable specifying specific integration test methods via TESTS environment (NVIDIA#10564)

* WIP

Signed-off-by: Gera Shegalov <[email protected]>

* WIP

Signed-off-by: Gera Shegalov <[email protected]>

* Enable specifying pytest tests using file_or_dir args

```bash
TEST_PARALLEL=0 \
SPARK_HOME=~/dist/spark-3.1.1-bin-hadoop3.2 \
TEST_FILE_OR_DIR=~/gits/NVIDIA/spark-rapids/integration_tests/src/main/python/arithmetic_ops_test.py::test_addition  \
./integration_tests/run_pyspark_from_build.sh --collect-only

<Module src/main/python/arithmetic_ops_test.py>
  <Function test_addition[Byte]>
  <Function test_addition[Short]>
  <Function test_addition[Integer]>
  <Function test_addition[Long]>
  <Function test_addition[Float]>
  <Function test_addition[Double]>
  <Function test_addition[Decimal(7,3)]>
  <Function test_addition[Decimal(12,2)]>
  <Function test_addition[Decimal(18,0)]>
  <Function test_addition[Decimal(20,2)]>
  <Function test_addition[Decimal(30,2)]>
  <Function test_addition[Decimal(36,5)]>
  <Function test_addition[Decimal(38,10)]>
  <Function test_addition[Decimal(38,0)]>
  <Function test_addition[Decimal(7,7)]>
  <Function test_addition[Decimal(7,-3)]>
  <Function test_addition[Decimal(36,-5)]>
  <Function test_addition[Decimal(38,-10)]>
```

Signed-off-by: Gera Shegalov <[email protected]>
Co-authored-by: Raza Jafri <[email protected]>

* Changing to TESTS=module::method

Signed-off-by: Gera Shegalov <[email protected]>
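For example, with the final syntax (the `TESTS` value is taken from the README changes below; the `SPARK_HOME` path is illustrative), running all parametrizations of one test method plus a single parametrization of another looks like:

```bash
# Run every parametrization of arithmetic_ops_test.py::test_addition and
# one specific parametrization of array_test.py::test_array_exists.
# SPARK_HOME below is an example path; point it at your own Spark install.
SPARK_HOME=~/dist/spark-3.1.1-bin-hadoop3.2 \
TESTS="arithmetic_ops_test.py::test_addition array_test.py::test_array_exists[3VL:off-data_gen0]" \
./integration_tests/run_pyspark_from_build.sh
```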

---------

Signed-off-by: Gera Shegalov <[email protected]>
Co-authored-by: Raza Jafri <[email protected]>
gerashegalov and razajafri authored Mar 8, 2024
1 parent 11fde83 commit 785eaa5
Showing 2 changed files with 33 additions and 16 deletions.
33 changes: 21 additions & 12 deletions integration_tests/README.md
@@ -161,9 +161,13 @@
 at `$SPARK_HOME`. It will be very useful to read the contents of the
 [run_pyspark_from_build.sh](run_pyspark_from_build.sh) to get a better insight
 into what is needed as we constantly keep working on to improve and expand the plugin-support.
 
-The python tests run with pytest and the script honors pytest parameters. Some handy flags are:
-- `-k` <pytest-file-name>. This will run all the tests in that test file.
-- `-k` <test-name>. This will also run an individual test.
+The Python tests run with pytest, and the script honors pytest parameters:
+
+- Specific modules, methods, and their parametrizations can be specified
+explicitly via the `TESTS` environment variable instead of positional
+arguments on the pytest CLI.
+- `-k` <keyword_expression>. This will run all the tests satisfying the keyword
+expression.
 - `-s` Doesn't capture the output and instead prints to the screen.
 - `-v` Increase the verbosity of the tests
 - `-r fExXs` Show extra test summary info as specified by chars: (f)ailed, (E)rror, (x)failed, (X)passed, (s)kipped
@@ -175,7 +179,12 @@ Examples:
 ## running all integration tests for Map
 ./integration_tests/run_pyspark_from_build.sh -k map_test.py
 ## Running a single integration test in map_test
-./integration_tests/run_pyspark_from_build.sh -k test_map_integration_1
+./integration_tests/run_pyspark_from_build.sh -k 'map_test.py and test_map_integration_1'
+## Running tests matching the keyword "exist" from any module
+./integration_tests/run_pyspark_from_build.sh -k exist
+## Running all parametrizations of the method arithmetic_ops_test.py::test_addition
+## and a specific parametrization of array_test.py::test_array_exists
+TESTS="arithmetic_ops_test.py::test_addition array_test.py::test_array_exists[3VL:off-data_gen0]" ./integration_tests/run_pyspark_from_build.sh
 ```
 ### Spark execution mode
@@ -343,14 +352,14 @@
 integration tests. For example:
 $ DATAGEN_SEED=1702166057 SPARK_HOME=~/spark-3.4.0-bin-hadoop3 integration_tests/run_pyspark_from_build.sh
 ```
-Tests can override the seed used using the test marker:
+Tests can override the seed used via the test marker:
 ```
-@datagen_overrides(seed=<new seed here>, [condition=True|False], [permanent=True|False])`.
+@datagen_overrides(seed=<new seed here>, [condition=True|False], [permanent=True|False])
 ```
-This marker has the following arguments:
-- `seed`: a hard coded datagen seed to use.
+This marker has the following arguments:
+- `seed`: a hard-coded datagen seed to use.
 - `condition`: is used to gate when the override is appropriate, usually used to say that specific shims
 need the special override.
 - `permanent`: forces a test to ignore `DATAGEN_SEED` if True. If False, or if absent, the `DATAGEN_SEED` value always wins.
@@ -507,10 +516,10 @@ The marks you care about are all in marks.py
 For the most part you can ignore this file. It provides the underlying Spark session to operations that need it, but most tests should interact with
 it through `asserts.py`.
 
-All data generation and Spark function calls should occur within a Spark session. Typically
-this is done by passing a lambda to functions in `asserts.py` such as
-`assert_gpu_and_cpu_are_equal_collect`. However, for scalar generation like `gen_scalars`, you
-may need to put it in a `with_cpu_session`. It is because negative scale decimals can have
+All data generation and Spark function calls should occur within a Spark session. Typically
+this is done by passing a lambda to functions in `asserts.py` such as
+`assert_gpu_and_cpu_are_equal_collect`. However, for scalar generation like `gen_scalars`, you
+may need to put it in a `with_cpu_session`. This is because negative-scale decimals can have
 problems when calling `f.lit` from outside of `with_spark_session`.
 
 ## Guidelines for Testing
16 changes: 12 additions & 4 deletions integration_tests/run_pyspark_from_build.sh
@@ -191,10 +191,18 @@ else
     ## Under cloud environment, overwrite the '--std_input_path' param to point to the distributed file path
     INPUT_PATH=${INPUT_PATH:-"$SCRIPTPATH"}
 
-    RUN_TESTS_COMMAND=("$SCRIPTPATH"/runtests.py
-                       --rootdir
-                       "$LOCAL_ROOTDIR"
-                       "$LOCAL_ROOTDIR"/src/main/python)
+    RUN_TESTS_COMMAND=(
+        "$SCRIPTPATH"/runtests.py
+        --rootdir "$LOCAL_ROOTDIR"
+    )
+    if [[ "${TESTS}" == "" ]]; then
+        RUN_TESTS_COMMAND+=("${LOCAL_ROOTDIR}/src/main/python")
+    else
+        read -a RAW_TESTS <<< "${TESTS}"
+        for raw_test in "${RAW_TESTS[@]}"; do
+            RUN_TESTS_COMMAND+=("${LOCAL_ROOTDIR}/src/main/python/${raw_test}")
+        done
+    fi
 
     REPORT_CHARS=${REPORT_CHARS:="fE"} # default as (f)ailed, (E)rror
     TEST_COMMON_OPTS=(-v
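As a quick illustration of what the new branch does (a minimal sketch using the documented `TESTS` syntax; the echoed paths mirror how the script builds pytest file_or_dir arguments):

```bash
# Sketch: whitespace-separated TESTS entries become individual pytest
# file_or_dir arguments rooted under src/main/python.
TESTS="arithmetic_ops_test.py::test_addition array_test.py"
read -a RAW_TESTS <<< "${TESTS}"
for raw_test in "${RAW_TESTS[@]}"; do
    echo "src/main/python/${raw_test}"
done
# Output:
#   src/main/python/arithmetic_ops_test.py::test_addition
#   src/main/python/array_test.py
```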
