[DOC] Update docs for 23.12.0 release [skip ci] (#9943)
* update docs for 23.12 release

Signed-off-by: Suraj Aralihalli <[email protected]>

* update docs for 23.12 release

Signed-off-by: Suraj Aralihalli <[email protected]>

* drop pascal support - docs

Signed-off-by: Suraj Aralihalli <[email protected]>

* update download docs

Signed-off-by: Suraj Aralihalli <[email protected]>

* update download docs

Signed-off-by: Suraj Aralihalli <[email protected]>

* add scala 2.13 support

Signed-off-by: Suraj Aralihalli <[email protected]>

* address feedback for release notes

Signed-off-by: Suraj Aralihalli <[email protected]>

* address feedback for release notes

Signed-off-by: Suraj Aralihalli <[email protected]>

* update verify signature

Signed-off-by: Suraj Aralihalli <[email protected]>

* jars -> jar

Signed-off-by: Suraj Aralihalli <[email protected]>

---------

Signed-off-by: Suraj Aralihalli <[email protected]>
SurajAralihalli authored Dec 12, 2023
1 parent 1b7940e commit e8256f7
Showing 4 changed files with 115 additions and 28 deletions.
8 changes: 4 additions & 4 deletions CONTRIBUTING.md
@@ -130,15 +130,15 @@ mvn -pl dist -PnoSnapshots package -DskipTests
Verify that shim-specific classes are hidden from a conventional classloader.

```bash
$ javap -cp dist/target/rapids-4-spark_2.12-23.10.0-SNAPSHOT-cuda11.jar com.nvidia.spark.rapids.shims.SparkShimImpl
$ javap -cp dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar com.nvidia.spark.rapids.shims.SparkShimImpl
Error: class not found: com.nvidia.spark.rapids.shims.SparkShimImpl
```
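
As an additional, optional check, you can list the shim entries in the jar directly to see where the class actually lives. This is a minimal sketch; the jar name below assumes the current 23.12.0 snapshot build with a CUDA 11 classifier, so adjust it to match your build output:

```bash
# List all copies of SparkShimImpl packaged in the dist jar; the class is expected
# to appear only under shim-specific prefixes such as spark320/, not at the top level.
jar tf dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar | grep SparkShimImpl
```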

However, its bytecode can be loaded if the class name is prefixed with `spark3XY`, even though `spark3XY` is not contained in the package name:

```bash
$ javap -cp dist/target/rapids-4-spark_2.12-23.10.0-SNAPSHOT-cuda11.jar spark320.com.nvidia.spark.rapids.shims.SparkShimImpl | head -2
Warning: File dist/target/rapids-4-spark_2.12-23.10.0-SNAPSHOT-cuda11.jar(/spark320/com/nvidia/spark/rapids/shims/SparkShimImpl.class) does not contain class spark320.com.nvidia.spark.rapids.shims.SparkShimImpl
$ javap -cp dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar spark320.com.nvidia.spark.rapids.shims.SparkShimImpl | head -2
Warning: File dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar(/spark320/com/nvidia/spark/rapids/shims/SparkShimImpl.class) does not contain class spark320.com.nvidia.spark.rapids.shims.SparkShimImpl
Compiled from "SparkShims.scala"
public final class com.nvidia.spark.rapids.shims.SparkShimImpl {
```
@@ -181,7 +181,7 @@ mvn package -pl dist -am -Dbuildver=340 -DallowConventionalDistJar=true
Verify `com.nvidia.spark.rapids.shims.SparkShimImpl` is conventionally loadable:
```bash
$ javap -cp dist/target/rapids-4-spark_2.12-23.10.0-SNAPSHOT-cuda11.jar com.nvidia.spark.rapids.shims.SparkShimImpl | head -2
$ javap -cp dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar com.nvidia.spark.rapids.shims.SparkShimImpl | head -2
Compiled from "SparkShims.scala"
public final class com.nvidia.spark.rapids.shims.SparkShimImpl {
```
80 changes: 80 additions & 0 deletions docs/archive.md
@@ -5,6 +5,86 @@ nav_order: 15
---
Below are archived releases for RAPIDS Accelerator for Apache Spark.

## Release v23.10.0
### Hardware Requirements:

The plugin is tested on the following architectures:

GPU Models: NVIDIA P100, V100, T4, A10/A100, L4 and H100 GPUs

### Software Requirements:

OS: Ubuntu 20.04, Ubuntu 22.04, CentOS 7, or Rocky Linux 8

NVIDIA Driver*: R470+

Runtime:
Scala 2.12
Python and Java Virtual Machine (JVM) compatible with your Spark version.

* Check the Spark documentation for Python and Java version compatibility with your specific
Spark version. For instance, visit `https://spark.apache.org/docs/3.4.1` for Spark 3.4.1.
Please be aware that we do not currently support Spark builds with Scala 2.13.

Supported Spark versions:
Apache Spark 3.2.0, 3.2.1, 3.2.2, 3.2.3, 3.2.4
Apache Spark 3.3.0, 3.3.1, 3.3.2, 3.3.3
Apache Spark 3.4.0, 3.4.1
Apache Spark 3.5.0

Supported Databricks runtime versions for Azure and AWS:
Databricks 10.4 ML LTS (GPU, Scala 2.12, Spark 3.2.1)
Databricks 11.3 ML LTS (GPU, Scala 2.12, Spark 3.3.0)
Databricks 12.2 ML LTS (GPU, Scala 2.12, Spark 3.3.2)

Supported Dataproc versions:
GCP Dataproc 2.0
GCP Dataproc 2.1

*Some hardware may have a minimum driver version greater than R470. Check the GPU spec sheet
for your hardware's minimum driver version.

*For Cloudera and EMR support, please refer to the
[Distributions](https://docs.nvidia.com/spark-rapids/user-guide/latest/faq.html#which-distributions-are-supported) section of the FAQ.

#### RAPIDS Accelerator's Support Policy for Apache Spark
The RAPIDS Accelerator maintains support for Apache Spark versions available for download from [Apache Spark](https://spark.apache.org/downloads.html).

### Download v23.10.0
* Download the [RAPIDS
Accelerator for Apache Spark 23.10.0 jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.10.0/rapids-4-spark_2.12-23.10.0.jar)

This package is built against CUDA 11.8. It is tested on V100, T4, A10, A100, L4 and H100 GPUs with
CUDA 11.8 through CUDA 12.0.

### Verify signature
* Download the [RAPIDS Accelerator for Apache Spark 23.10.0 jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.10.0/rapids-4-spark_2.12-23.10.0.jar)
and [RAPIDS Accelerator for Apache Spark 23.10.0 jars.asc](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.10.0/rapids-4-spark_2.12-23.10.0.jar.asc)
* Download the [PUB_KEY](https://keys.openpgp.org/[email protected]).
* Import the public key: `gpg --import PUB_KEY`
* Verify the signature: `gpg --verify rapids-4-spark_2.12-23.10.0.jar.asc rapids-4-spark_2.12-23.10.0.jar`

The expected output of the signature verification is:

gpg: Good signature from "NVIDIA Spark (For the signature of spark-rapids release jars) <[email protected]>"
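
The steps above can also be scripted end to end. The following is a minimal sketch, assuming `wget` and `gpg` are installed and that PUB_KEY has already been downloaded from the link above:

```bash
# Download the 23.10.0 release jar and its detached signature from Maven Central
wget https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.10.0/rapids-4-spark_2.12-23.10.0.jar
wget https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.10.0/rapids-4-spark_2.12-23.10.0.jar.asc

# Import the NVIDIA Spark public key and verify the jar against the signature
gpg --import PUB_KEY
gpg --verify rapids-4-spark_2.12-23.10.0.jar.asc rapids-4-spark_2.12-23.10.0.jar
```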

### Release Notes
New functionality and performance improvements for this release include:
* Introduced support for Spark 3.5.0.
* Improved memory management for better control in YARN and K8s on CSP.
* Strengthened Parquet and ORC tests for enhanced stability and support.
* Reduced GPU out-of-memory (OOM) occurrences.
* Enhanced driver log with actionable insights.
* Qualification and Profiling tool:
  * Enhanced user experience with the availability of the 'ascli' tool for qualification and
    profiling across all platforms.
  * The qualification tool now accommodates CPU-fallback transitions and broadens the speedup factor coverage.
  * Extended diagnostic support for user tools to cover EMR, Databricks AWS, and Databricks Azure.
  * Introduced support for cluster configuration recommendations in the profiling tool for supported platforms.

For a detailed list of changes, please refer to the
[CHANGELOG](https://github.com/NVIDIA/spark-rapids/blob/main/CHANGELOG.md).

## Release v23.08.2
### Hardware Requirements:

4 changes: 2 additions & 2 deletions docs/dev/testing.md
@@ -5,5 +5,5 @@
parent: Developer Overview
---
An overview of testing can be found within the repository at:
* [Unit tests](https://github.com/NVIDIA/spark-rapids/tree/branch-23.10/tests#readme)
* [Integration testing](https://github.com/NVIDIA/spark-rapids/tree/branch-23.10/integration_tests#readme)
* [Unit tests](https://github.com/NVIDIA/spark-rapids/tree/branch-23.12/tests#readme)
* [Integration testing](https://github.com/NVIDIA/spark-rapids/tree/branch-23.12/integration_tests#readme)
51 changes: 29 additions & 22 deletions docs/download.md
@@ -18,12 +18,12 @@ cuDF jar, that is either preinstalled in the Spark classpath on all nodes or sub
that uses the RAPIDS Accelerator For Apache Spark. See the [getting-started
guide](https://docs.nvidia.com/spark-rapids/user-guide/latest/getting-started/overview.html) for more details.
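
For the "submitted with each job" option, the sketch below shows one possible way to attach a downloaded plugin jar to a job; the jar path and `my_app.py` are placeholders, and enabling the plugin through `spark.plugins=com.nvidia.spark.SQLPlugin` follows the getting-started guide:

```bash
# Illustrative only: submit an application with the RAPIDS Accelerator jar on the classpath.
# Replace the jar path and application with values for your environment.
spark-submit \
  --jars /opt/sparkRapidsPlugin/rapids-4-spark_2.12-23.12.0.jar \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.rapids.sql.enabled=true \
  my_app.py
```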

## Release v23.10.0
## Release v23.12.0
### Hardware Requirements:

The plugin is tested on the following architectures:

GPU Models: NVIDIA P100, V100, T4, A10/A100, L4 and H100 GPUs
GPU Models: NVIDIA V100, T4, A10/A100, L4 and H100 GPUs

### Software Requirements:

@@ -32,12 +32,11 @@ The plugin is tested on the following architectures:
NVIDIA Driver*: R470+

Runtime:
Scala 2.12
Scala 2.12, 2.13
Python and Java Virtual Machine (JVM) compatible with your Spark version.

* Check the Spark documentation for Python and Java version compatibility with your specific
Spark version. For instance, visit `https://spark.apache.org/docs/3.4.1` for Spark 3.4.1.
Please be aware that we do not currently support Spark builds with Scala 2.13.
Spark version. For instance, visit `https://spark.apache.org/docs/3.4.1` for Spark 3.4.1.

Supported Spark versions:
Apache Spark 3.2.0, 3.2.1, 3.2.2, 3.2.3, 3.2.4
@@ -53,47 +52,55 @@ The plugin is tested on the following architectures:
Supported Dataproc versions:
GCP Dataproc 2.0
GCP Dataproc 2.1

Supported Dataproc Serverless versions:
Spark runtime 1.1 LTS

*Some hardware may have a minimum driver version greater than R470. Check the GPU spec sheet
for your hardware's minimum driver version.

*For Cloudera and EMR support, please refer to the
[Distributions](https://docs.nvidia.com/spark-rapids/user-guide/latest/faq.html#which-distributions-are-supported) section of the FAQ.

#### RAPIDS Accelerator's Support Policy for Apache Spark
### RAPIDS Accelerator's Support Policy for Apache Spark
The RAPIDS Accelerator maintains support for Apache Spark versions available for download from [Apache Spark](https://spark.apache.org/downloads.html).

### Download v23.10.0
* Download the [RAPIDS
Accelerator for Apache Spark 23.10.0 jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.10.0/rapids-4-spark_2.12-23.10.0.jar)
### Download RAPIDS Accelerator for Apache Spark v23.12.0
- **Scala 2.12:**
- [RAPIDS Accelerator for Apache Spark 23.12.0 - Scala 2.12 jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.12.0/rapids-4-spark_2.12-23.12.0.jar)
- [RAPIDS Accelerator for Apache Spark 23.12.0 - Scala 2.12 jar.asc](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.12.0/rapids-4-spark_2.12-23.12.0.jar.asc)

- **Scala 2.13:**
- [RAPIDS Accelerator for Apache Spark 23.12.0 - Scala 2.13 jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.13/23.12.0/rapids-4-spark_2.13-23.12.0.jar)
- [RAPIDS Accelerator for Apache Spark 23.12.0 - Scala 2.13 jar.asc](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.13/23.12.0/rapids-4-spark_2.13-23.12.0.jar.asc)

This package is built against CUDA 11.8. It is tested on V100, T4, A10, A100, L4 and H100 GPUs with
CUDA 11.8 through CUDA 12.0.
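
If you prefer the command line, the four artifacts linked above can also be pulled straight from Maven Central; this is a simple sketch assuming `wget` is available:

```bash
# Scala 2.12 build and its detached signature
wget https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.12.0/rapids-4-spark_2.12-23.12.0.jar
wget https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.12.0/rapids-4-spark_2.12-23.12.0.jar.asc

# Scala 2.13 build and its detached signature
wget https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.13/23.12.0/rapids-4-spark_2.13-23.12.0.jar
wget https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.13/23.12.0/rapids-4-spark_2.13-23.12.0.jar.asc
```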

### Verify signature
* Download the [RAPIDS Accelerator for Apache Spark 23.10.0 jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.10.0/rapids-4-spark_2.12-23.10.0.jar)
and [RAPIDS Accelerator for Apache Spark 23.10.0 jars.asc](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.10.0/rapids-4-spark_2.12-23.10.0.jar.asc)
* Download the [PUB_KEY](https://keys.openpgp.org/[email protected]).
* Import the public key: `gpg --import PUB_KEY`
* Verify the signature: `gpg --verify rapids-4-spark_2.12-23.10.0.jar.asc rapids-4-spark_2.12-23.10.0.jar`
* Verify the signature for Scala 2.12 jar:
`gpg --verify rapids-4-spark_2.12-23.12.0.jar.asc rapids-4-spark_2.12-23.12.0.jar`
* Verify the signature for Scala 2.13 jar:
`gpg --verify rapids-4-spark_2.13-23.12.0.jar.asc rapids-4-spark_2.13-23.12.0.jar`

The expected output of the signature verification is:

gpg: Good signature from "NVIDIA Spark (For the signature of spark-rapids release jars) <[email protected]>"
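
The checks above can be run for both Scala builds in one pass; the loop below is a small sketch that assumes the jars, their `.asc` files, and PUB_KEY are all in the current directory:

```bash
# Import the NVIDIA Spark public key (a no-op if it is already in your keyring),
# then verify each release jar against its detached signature.
gpg --import PUB_KEY
for scala_ver in 2.12 2.13; do
  gpg --verify "rapids-4-spark_${scala_ver}-23.12.0.jar.asc" \
               "rapids-4-spark_${scala_ver}-23.12.0.jar"
done
```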

### Release Notes
New functionality and performance improvements for this release include:
* Introduced support for Spark 3.5.0.
* Improved memory management for better control in YARN and K8s on CSP.
* Strengthened Parquet and ORC tests for enhanced stability and support.
* Reduced GPU out-of-memory (OOM) occurrences.
* Enhanced driver log with actionable insights.
* Introduced support for chunked reading of ORC files.
* Enhanced support for additional time zones and added stack function support.
* Enhanced performance for join and aggregation operations.
* Kernel optimizations have been implemented to improve Parquet read performance.
* The RAPIDS Accelerator is now also built and tested with Scala 2.13.
* This is the last release to support Pascal-based NVIDIA GPUs; Pascal support will be dropped in the next release.
* Qualification and Profiling tool:
  * Enhanced user experience with the availability of the 'ascli' tool for qualification and
    profiling across all platforms.
  * The qualification tool now accommodates CPU-fallback transitions and broadens the speedup factor coverage.
  * Extended diagnostic support for user tools to cover EMR, Databricks AWS, and Databricks Azure.
  * Introduced support for cluster configuration recommendations in the profiling tool for supported platforms.
  * The Profiling tool now processes the Spark driver log for GPU runs, enhancing feature analysis.
  * Auto-tuner recommendations include AQE settings for optimized performance.
  * New configurations in the Profiling tool for enabling off-by-default features: udfCompiler, incompatibleDateFormats, hasExtendedYearValues.

For a detailed list of changes, please refer to the
[CHANGELOG](https://github.com/NVIDIA/spark-rapids/blob/main/CHANGELOG.md).
