Add notebook for using qualification user tools in Databricks #334

Merged

Conversation

parthosa
Collaborator

@parthosa parthosa commented Nov 21, 2023

Fixes NVIDIA/spark-rapids-tools#571. This PR adds a new example notebook for using the qualification user tools in Databricks.

Changes:

  1. Adds a new dropdown field, csp, to choose between aws and azure (defaults to aws).
  2. Creates a virtual environment and installs spark-rapids-user-tools in it.
  3. Runs the qualification tool through the user tools.
  4. Parses the logs to store the output folder as a variable.
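
Steps 2 to 4 above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the mapping from the csp value to a platform name, the exact command-line shape, and the log line that `find_output_folder` looks for are all assumptions made for the example, and the helper names are hypothetical.

```python
import re
import subprocess
import sys
import venv
from pathlib import Path
from typing import List, Optional

# Assumption: the csp dropdown value maps onto a user-tools platform name.
PLATFORMS = {"aws": "databricks-aws", "azure": "databricks-azure"}


def create_tools_env(env_dir: str) -> Path:
    """Create an isolated virtual environment and return its executables directory."""
    venv.EnvBuilder(with_pip=True).create(env_dir)
    return Path(env_dir) / ("Scripts" if sys.platform == "win32" else "bin")


def build_qualification_cmd(bin_dir: Path, csp: str, eventlogs: str) -> List[str]:
    """Build the qualification command line (assumed shape, check the tool's help)."""
    return [
        str(bin_dir / "spark_rapids_user_tools"),
        PLATFORMS[csp],  # raises KeyError for an unsupported csp value
        "qualification",
        "--eventlogs",
        eventlogs,
    ]


def find_output_folder(log_text: str) -> Optional[str]:
    """Extract the output folder from captured logs.

    Assumption: the logs contain a line like 'Output folder: <path>'.
    The real log format may differ, so adjust the pattern accordingly.
    """
    match = re.search(r"output folder:\s*(\S+)", log_text, flags=re.IGNORECASE)
    return match.group(1) if match else None


def run_qualification(env_dir: str, csp: str, eventlogs: str) -> Optional[str]:
    """Install the tools into a fresh environment, run them, and return the output folder."""
    bin_dir = create_tools_env(env_dir)
    subprocess.run(
        [str(bin_dir / "python"), "-m", "pip", "install", "spark-rapids-user-tools"],
        check=True,
    )
    result = subprocess.run(
        build_qualification_cmd(bin_dir, csp, eventlogs),
        capture_output=True,
        text=True,
        check=True,
    )
    return find_output_folder(result.stdout + result.stderr)
```

Installing into a dedicated environment keeps the tool's dependencies apart from the packages already present on the cluster, which is the motivation for step 2.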

Output:

image

Signed-off-by: Partho Sarthi [email protected]

NvTimLiu
NvTimLiu previously approved these changes Nov 22, 2023
Collaborator

@NvTimLiu NvTimLiu left a comment

LGTM

@nvliyuan
Collaborator

Hi @parthosa, thanks for contributing. Could you help retarget the PR to branch-23.12?

@parthosa parthosa changed the base branch from main to branch-23.12 November 22, 2023 08:46
@parthosa parthosa dismissed NvTimLiu’s stale review November 22, 2023 08:46

The base branch was changed.

@parthosa parthosa changed the base branch from branch-23.12 to main November 22, 2023 08:46
@nvliyuan
Collaborator

Hi @parthosa, it seems the base branch is still main.
Also, is the notebook specific to a particular Databricks runtime version?
I hit the error below while running on 10.4ML:
image

@nvliyuan
Collaborator

I hit the error below while running on 10.4ML:

The error does not show up if I run the cell a second time. Another issue is a FileNotFoundError: I checked the folder and confirmed that the output files exist in DBFS. Any idea?
image
image

@parthosa parthosa changed the base branch from main to branch-23.12 November 27, 2023 20:14
@parthosa parthosa force-pushed the qualification-user-tools-notebook branch from 335544c to 418ef95 Compare November 27, 2023 20:19
@parthosa
Collaborator Author

Thanks @nvliyuan. I have updated the base branch to branch-23.12 and fixed the issues related to the output folder: the notebook now creates a separate virtual environment and installs the spark-rapids-user-tools package in it. See the description for the updated UI.

@parthosa parthosa requested a review from NvTimLiu November 28, 2023 00:18
@nvliyuan
Collaborator

LGTM

@nvliyuan nvliyuan merged commit 4a46ddf into NVIDIA:branch-23.12 Nov 29, 2023
2 checks passed
@parthosa parthosa deleted the qualification-user-tools-notebook branch November 29, 2023 05:56
Development

Successfully merging this pull request may close these issues.

[FEA] Create example Databricks notebook for qualification tool usage with user tools
3 participants