Support extra arguments in new user tools CLI #646

Merged
merged 2 commits into from
Nov 2, 2023


Collaborator

@parthosa parthosa commented Nov 1, 2023

Fixes #574. This PR improves the new user tools CLI so that it accepts extra arguments and passes them through to the JAR (e.g., --csv, --print-plans).
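To illustrate the pass-through idea, here is a minimal sketch that separates flags a wrapper understands from the ones it forwards untouched. It uses `argparse.parse_known_args` purely for illustration; the function name `split_wrapper_and_tool_args` and the choice of `--eventlogs` and `--platform` as wrapper-owned flags are assumptions made for this example, not the implementation in this PR.

```python
import argparse


def split_wrapper_and_tool_args(argv):
    """Split argv into options the wrapper owns and extras to forward.

    Hypothetical helper for illustration only; the real CLI may parse
    its arguments differently.
    """
    # allow_abbrev=False keeps argparse from matching prefixes of wrapper flags
    parser = argparse.ArgumentParser(prog="profiling", allow_abbrev=False)
    parser.add_argument("--eventlogs")
    parser.add_argument("--platform")
    # parse_known_args returns unrecognized tokens instead of raising an error
    known, extra = parser.parse_known_args(argv)
    return known, extra


known, extra = split_wrapper_and_tool_args(
    ["--eventlogs", "test-eventlog", "--print-plans", "--csv",
     "--platform", "dataproc"]
)
print(known.eventlogs, known.platform)  # wrapper-owned options
print(extra)                            # flags that would be forwarded to the JAR
```

With the arguments from Sample Command 2 below, `extra` holds `['--print-plans', '--csv']`, which is the part a wrapper would append to the tool invocation.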

Testing

Sample Command 1

ascli profiling -- --help

Output

...
    -v, --verbose=VERBOSE
        Type: bool
        Default: False
        True or False to enable verbosity of the script.
    Additional flags are accepted.
        A list of valid Profiling tool options. Note that the wrapper ignores ["output-directory", "worker-info"] flags, and it does not support multiple "spark-property" arguments. For more details on Profiling tool options, please visit https://nvidia.github.io/spark-rapids/docs/spark-profiling-tool.html#profiling-tool-options
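The restrictions described in the help text above could be enforced along these lines. This is only a sketch: the helper name `filter_tool_options`, the dictionary representation of the extra flags, and the choice to raise an error on repeated "spark-property" values are assumptions for illustration; the help text only states that the two flags are ignored and that multiple "spark-property" arguments are not supported.

```python
# Flag names taken from the help text; everything else here is hypothetical.
IGNORED_FLAGS = {"output-directory", "worker-info"}


def filter_tool_options(options):
    """Drop flags the wrapper ignores and reject repeated spark-property.

    `options` maps a flag name (without leading dashes) to its value.
    Returns (accepted, ignored_names).
    """
    accepted = {}
    ignored = []
    for name, value in options.items():
        if name in IGNORED_FLAGS:
            # The wrapper sets these itself, so user-supplied values are skipped
            ignored.append(name)
            continue
        if name == "spark-property" and isinstance(value, (list, tuple)):
            # Assumption: surface the unsupported case as an error
            raise ValueError("multiple 'spark-property' arguments are not supported")
        accepted[name] = value
    return accepted, ignored


accepted, ignored = filter_tool_options(
    {"csv": True, "output-directory": "/tmp/out", "print-plans": True}
)
print(accepted)  # options that would be forwarded
print(ignored)   # options dropped by the wrapper
```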

Sample Command 2

ascli profiling --eventlogs test-eventlog --print-plans --csv --platform dataproc

Output

...
____________________________________________________________________________________________________
                                          PROFILING Report
____________________________________________________________________________________________________

Output:
--------------------
Profiling tool output: /work_dir/prof_20231101232938_036EBDBc/rapids_4_spark_profile
    prof_20231101232938_036EBDBc
    └── rapids_4_spark_profile
        └── application_1698358295332_0018
            ├── planDescriptions.log
            ├── job_information.csv
            ├── sql_plan_metrics_for_application.csv
            ├── rapids_accelerator_jar_and_cudf_jar.csv
            ├── profile.log
            ├── executor_information.csv
            ├── spark_properties.csv
            ├── application_information.csv
            ├── job_+_stage_level_aggregated_task_metrics.csv
            ├── sql_duration_and_executor_cpu_time_percent.csv
            ├── data_source_information.csv
            ├── sql_to_stage_information.csv
            ├── application_log_path_mapping.csv
            ├── io_metrics.csv
            ├── sql_level_aggregated_task_metrics.csv
            └── spark_rapids_parameters_set_explicitly.csv
    2 directories, 16 files
    - To learn more about the output details, visit https://nvidia.github.io/spark-rapids/docs/spark-profiling-tool.html#understanding-profiling-tool-detailed-output-and-examples

Signed-off-by: Partho Sarthi <[email protected]>
@parthosa parthosa added the user_tools Scope the wrapper module running CSP, QualX, and reports (python) label Nov 1, 2023
@parthosa parthosa self-assigned this Nov 1, 2023
Collaborator

@cindyyuanjiang cindyyuanjiang left a comment

LGTM

@parthosa parthosa merged commit ebca530 into NVIDIA:dev Nov 2, 2023
9 checks passed
@parthosa parthosa deleted the spark-rapids-tools-574 branch November 2, 2023 16:29
@parthosa parthosa added the feature request New feature or request label Dec 18, 2023
Labels
feature request New feature or request user_tools Scope the wrapper module running CSP, QualX, and reports (python)
Development

Successfully merging this pull request may close these issues.

[FEA] Support extra arguments in new user tools CLI
4 participants