diff --git a/.github/workflows/markdown-links-check.yml b/.github/workflows/markdown-links-check.yml
new file mode 100644
index 000000000..833b99ec8
--- /dev/null
+++ b/.github/workflows/markdown-links-check.yml
@@ -0,0 +1,36 @@
+# Copyright (c) 2023, NVIDIA CORPORATION.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# A workflow to check if PR got broken hyperlinks
+name: Check Markdown links
+
+on:
+  pull_request:
+    types: [opened, synchronize, reopened]
+
+jobs:
+  markdown-link-check:
+    runs-on: ubuntu-latest
+    steps:
+      - name: work around permission issue
+        run: git config --global --add safe.directory /github/workspace
+      - uses: actions/checkout@master
+      - uses: gaurav-nelson/github-action-markdown-link-check@v1
+        with:
+          max-depth: -1
+          use-verbose-mode: 'yes'
+          config-file: '.github/workflows/markdown-links-check/markdown-links-check-config.json'
+          base-branch: 'dev'
+
+
\ No newline at end of file
diff --git a/.github/workflows/markdown-links-check/markdown-links-check-config.json b/.github/workflows/markdown-links-check/markdown-links-check-config.json
new file mode 100644
index 000000000..ec4af8ca8
--- /dev/null
+++ b/.github/workflows/markdown-links-check/markdown-links-check-config.json
@@ -0,0 +1,17 @@
+{
+  "ignorePatterns": [
+    {
+      "pattern": "https://github.com/NVIDIA/spark-rapids-tools/issues/*"
+    },
+    {
+      "pattern": "http://localhost*"
+    },
+    {
+      "pattern": "https://www.nvidia.com/en-us/security/pgp-key"
+    }
+  ],
+  "timeout": "15s",
+  "retryOn429": true,
+  "retryCount":30,
+  "aliveStatusCodes": [200, 403]
+}
\ No newline at end of file
diff --git a/README.md b/README.md
index 42f142176..ac1a5dcfb 100644
--- a/README.md
+++ b/README.md
@@ -4,12 +4,15 @@ This repo provides the tools to use [RAPIDS Accelerator for Apache Spark](https:
 
 ## Catalog
 
-- [RAPIDS core tools](/core): Tools that help developers getting the most out of their Apache Spark applications
+- [RAPIDS core tools](./core): Tools that help developers getting the most out of their Apache
+  Spark applications
   without any code change:
   - Report acceleration potential of RAPIDS Accelerator for Apache Spark on a set of Spark applications.
   - Generate comprehensive profiling analysis for Apache Sparks executing on accelerated GPU instances. This information
     can be used to further tune and optimize the application.
-- [spark-rapids-user-tools](/user_tools): A simple wrapper process around cloud service providers to run
-  [RAPIDS core tools](/core) across multiple cloud platforms. In addition, the output educates the users on
+- [spark-rapids-user-tools](./user_tools): A simple wrapper process around cloud service
+  providers to run
+  [RAPIDS core tools](./core) across multiple cloud platforms. In addition, the output educates
+  the users on
   the cost savings and acceleration potential of RAPIDS Accelerator for Apache Spark and makes recommendations to tune
   the application performance based on the cluster shape.
diff --git a/user_tools/docs/user-tools-databricks-aws.md b/user_tools/docs/user-tools-databricks-aws.md
index 8e94e654d..2e9198af4 100644
--- a/user_tools/docs/user-tools-databricks-aws.md
+++ b/user_tools/docs/user-tools-databricks-aws.md
@@ -43,7 +43,7 @@ Before running any command, you can set environment variables to specify configu
 - RAPIDS variables have a naming pattern `RAPIDS_USER_TOOLS_*`:
   - `RAPIDS_USER_TOOLS_CACHE_FOLDER`: specifies the location of a local directory that the RAPIDS-cli uses to store and cache the downloaded resources. The default is `/var/tmp/spark_rapids_user_tools_cache`. Note that caching the resources locally has an impact on the total execution time of the command.
   - `RAPIDS_USER_TOOLS_OUTPUT_DIRECTORY`: specifies the location of a local directory that the RAPIDS-cli uses to generate the output. The wrapper CLI arguments override that environment variable (`--local_folder` for Qualification).
-- For Databricks CLI, some environment variables can be set and picked by the RAPIDS-user tools such as: `DATABRICKS_CONFIG_FILE`, `DATABRICKS_HOST` and `DATABRICKS_TOKEN`. See the description of the variables in [Environment variables](https://docs.databricks.com/en/dev-tools/auth.html#environment-variables-and-fields-for-client-unified-authentication).
+- For Databricks CLI, some environment variables can be set and picked by the RAPIDS-user tools such as: `DATABRICKS_CONFIG_FILE`, `DATABRICKS_HOST` and `DATABRICKS_TOKEN`. See the description of the variables in [Environment variables](https://docs.databricks.com/en/dev-tools/auth/index.html#environment-variables-and-fields-for-client-unified-authentication).
 - For AWS CLI, some environment variables can be set and picked by the RAPIDS-user tools such as: `AWS_SHARED_CREDENTIALS_FILE`, `AWS_CONFIG_FILE`, `AWS_REGION`, `AWS_DEFAULT_REGION`, `AWS_PROFILE` and `AWS_DEFAULT_OUTPUT`. See the full list of variables in [aws-cli-configure-envvars](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html).
 
 ## Qualification command
diff --git a/user_tools/docs/user-tools-databricks-azure.md b/user_tools/docs/user-tools-databricks-azure.md
index 2605b70e8..96cf6888e 100644
--- a/user_tools/docs/user-tools-databricks-azure.md
+++ b/user_tools/docs/user-tools-databricks-azure.md
@@ -47,7 +47,7 @@ Before running any command, you can set environment variables to specify configu
 - RAPIDS variables have a naming pattern `RAPIDS_USER_TOOLS_*`:
   - `RAPIDS_USER_TOOLS_CACHE_FOLDER`: specifies the location of a local directory that the RAPIDS-cli uses to store and cache the downloaded resources. The default is `/var/tmp/spark_rapids_user_tools_cache`. Note that caching the resources locally has an impact on the total execution time of the command.
   - `RAPIDS_USER_TOOLS_OUTPUT_DIRECTORY`: specifies the location of a local directory that the RAPIDS-cli uses to generate the output. The wrapper CLI arguments override that environment variable (`--local_folder` for Qualification).
-- For Databricks CLI, some environment variables can be set and picked up by the RAPIDS-user tools such as: `DATABRICKS_CONFIG_FILE`, `DATABRICKS_HOST` and `DATABRICKS_TOKEN`. See the description of the variables in [Environment variables](https://docs.databricks.com/en/dev-tools/auth.html#environment-variables-and-fields-for-client-unified-authentication).
+- For Databricks CLI, some environment variables can be set and picked up by the RAPIDS-user tools such as: `DATABRICKS_CONFIG_FILE`, `DATABRICKS_HOST` and `DATABRICKS_TOKEN`. See the description of the variables in [Environment variables](https://docs.databricks.com/en/dev-tools/auth/index.html#environment-variables-and-fields-for-client-unified-authentication).
 - For Azure CLI, some environment variables can be set and picked up by the RAPIDS-user tools such as: `AZURE_CONFIG_FILE` and `AZURE_DEFAULTS_LOCATION`.
 
 ## Qualification command
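
Reviewer note: the `ignorePatterns` and `aliveStatusCodes` entries in `markdown-links-check-config.json` are easier to reason about with a small local sketch. The Python below is not part of the patch and does not call the link checker; it only mimics the filtering, assuming each pattern is applied as an unanchored regular expression (my understanding of how markdown-link-check treats `pattern`, so verify against its documentation). The helper names `is_ignored` and `is_alive` are hypothetical.

```python
import re

# Values copied from markdown-links-check-config.json in this diff.
IGNORE_PATTERNS = [
    "https://github.com/NVIDIA/spark-rapids-tools/issues/*",
    "http://localhost*",
    "https://www.nvidia.com/en-us/security/pgp-key",
]
ALIVE_STATUS_CODES = {200, 403}


def is_ignored(url: str) -> bool:
    """Return True if any ignore pattern matches somewhere in the URL.

    Assumption: patterns are unanchored regexes, so a trailing "/*"
    means "zero or more slashes" rather than a shell-style wildcard.
    The prefix still matches, which is why issue links are skipped.
    """
    return any(re.search(pattern, url) for pattern in IGNORE_PATTERNS)


def is_alive(status_code: int) -> bool:
    """Return True if the HTTP status counts as a working link."""
    return status_code in ALIVE_STATUS_CODES


if __name__ == "__main__":
    samples = [
        "https://github.com/NVIDIA/spark-rapids-tools/issues/123",
        "http://localhost:4040/jobs",
        "https://docs.databricks.com/en/dev-tools/auth/index.html",
    ]
    for url in samples:
        print(url, "ignored" if is_ignored(url) else "checked")
```

Under that assumption, the trailing `*` in the first two patterns is harmless but not doing what a glob would do. Treating 403 as alive means a link that rejects unauthenticated requests is reported as working, while a 404 is still reported as broken.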