-
Notifications
You must be signed in to change notification settings - Fork 53
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Signed-off-by: Gera Shegalov <[email protected]>
- Loading branch information
1 parent
8ad0b83
commit e89e50a
Showing
1 changed file
with
28 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
# TPC-DS Scale Factor 10 (GiB) - CPU Spark vs GPU Spark | ||
|
||
[TPC-DS](https://www.tpc.org/tpcds/) is a decision support benchmark often used to evaluate | ||
performance of OLAP Databases and Big Data systems. | ||
|
||
The notebook in this folder runs a user-specified subset of the TPC-DS queries on the | ||
Scale Factor 10 (GiB) dataset. It uses [TPCDS PySpark](https://github.com/cerndb/SparkTraining/blob/master/notebooks/TPCDS_PySpark_CERN_SWAN_getstarted.ipynb) | ||
to execute TPC-DS queries with SparkSQL on GPU and CPU capturing the metrics | ||
as a Pandas dataframe. It then plots a comparison bar chart visualizing | ||
the GPU acceleration achieved for the queries run with RAPIDS Spark in this | ||
very notebook. | ||
|
||
This notebook can be opened and executed using standard | ||
|
||
- Jupyter(Lab) | ||
- in VSCode with Jupyter [extension](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter) | ||
|
||
It can also be opened and evaluated on hosted Notebook environments. Use the link below to launch on | ||
Google Colab and connect it to a [GPU instance](https://research.google.com/colaboratory/faq.html). | ||
|
||
<a target="_blank" href="https://colab.research.google.com/github/NVIDIA/spark-rapids-examples/blob/b3b48f8d384581c211dedf714a48943561c25899/examples/SQL%2BDF-Examples/tpcds/notebooks/TPCDS-SF10.ipynb"> | ||
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> | ||
</a> | ||
|
||
Here is the bar chart from a recent execution on Google Colab's T4 High RAM instance using | ||
RAPIDS Spark 24.10.0 with Apache Spark 3.5.1 | ||
|
||
![tpcds-speedup](/docs/img/guides/tpcds.png) |