Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix incorrect processing of short flags for user tools cli #677

Merged
merged 6 commits into from
Dec 7, 2023

Conversation

parthosa
Copy link
Collaborator

@parthosa parthosa commented Dec 5, 2023

Fixes #671. This PR fixes the bug due to incorrect handling of short flag arguments and keyword arguments in the fire library (#671 (comment)).

The problem occurs because the short flag arguments gets stored in the keyword argument dictionary (rapids_options in our case). The current resolution involves retrieving the value from this dictionary.

Design

  1. Check if the value is passed as an argument, else check if short flag is present in rapids_options dictionary, else return the default value.
  2. An alternative approach could be to do this as a pre-processing step at a single place. Since fire generates help text from the function documentation, executing the pre-processing function first would impact the generation of help text.

Code Changes

  • Define a method get_value_or_pop() to get the correct value of argument based on #1.
  • Call this method in each wrapper class' qualification() and profiling() method.
  • bootstrap() and diagnostic() do not need this change because they do not have rapids_option argument and hence short flag arguments are processed correctly.

User Facing Changes

  • No user facing changes

User Tools Short Flag Argument List

spark_rapids_user_tools CLI:

  • Below table consists of only arguments that have short flags.
  • - denotes argument does not exist for the given tool and CSP.
  • Conflict denotes short flag is not possible due to conflict with other argument
Tool Configs DB AWS DB Azure Dataproc GKE Dataproc EMR OnPrem
Profiling aws_profile a - - - - -
credentials_file c c - c - -
eventlogs e e - e e e
gpu_cluster g g - g g -
jvm_heap_size j j - j j j
local_folder l l - l l l
profile p p - - p -
remote_folder r r - r r -
tools_jar t t - t t t
verbose v v - v v v
worker_info w w - w w w
Qualification aws_profile a - - - - -
eventlogs e e e e e e
filter_apps f f f f f f
jvm_heap_size j j j j j j
local_folder l l l l l l
profile p p - - p -
remote_folder r r r r r -
tools_jar t t t t t Conflict
verbose v v v v v v

New CLI:

  • In the new CLI many of the above arguments are removed.
  • Short flags are now same for all CSPs.
Tool Configs ShortFlag
Profiling cluster c
eventlogs e
output_folder o
platform p
verbose v
Qualification filter_apps f
output_folder o
platform p
target_platform t
verbose v

@parthosa parthosa self-assigned this Dec 5, 2023
@parthosa parthosa added the user_tools Scope the wrapper module running CSP, QualX, and reports (python) label Dec 5, 2023
@parthosa parthosa marked this pull request as ready for review December 5, 2023 19:01
@nartal1
Copy link
Collaborator

nartal1 commented Dec 5, 2023

LGTM in general. Could you please post the results when using short flags for user-tools cli. I am not certain if we have any tests to process arguments.

Signed-off-by: Partho Sarthi <[email protected]>
@parthosa
Copy link
Collaborator Author

parthosa commented Dec 6, 2023

Thanks @nartal1. Adding sample input and output here:

Input/Output

  1. CMD with actual parameters (--eventlogs, --verbose) - No warning message:
psarthi@linux:~/work_dir$ spark_rapids_user_tools onprem qualification --eventlogs ~/event.log --verbose
2023-12-06 11:11:27,859 INFO rapids.tools.qualification: Using Spark RAPIDS user tools version 23.10.2
2023-12-06 11:11:27,859 INFO rapids.tools.qualification: ******* [Initialization]: Starting *******
  1. CMD with short flags only (-e, -v) - Warning message:
psarthi@linux:~/work_dir$ spark_rapids_user_tools onprem qualification -e ~/event.log -v
Warning: Instead of using short flags for argument, consider providing the value directly.
2023-12-06 11:12:01,463 INFO rapids.tools.qualification: Using Spark RAPIDS user tools version 23.10.2
2023-12-06 11:12:01,463 INFO rapids.tools.qualification: ******* [Initialization]: Starting *******
  1. CMD with short flag and actual param (-e, --verbose) - Warning message:
psarthi@linux:~/work_dir$ spark_rapids_user_tools onprem qualification -e ~/event.log --verbose
Warning: Instead of using short flags for argument, consider providing the value directly.
2023-12-06 10:30:43,391 INFO rapids.tools.qualification: Using Spark RAPIDS user tools version 23.10.2
2023-12-06 10:30:43,391 INFO rapids.tools.qualification: ******* [Initialization]: Starting *******
  1. CMD with short flag for core tools also (-e, -v, -n) - Warning message:
psarthi@linux:~/work_dir$ spark_rapids_user_tools onprem qualification -e ~/event.log -v -n 100
Warning: Instead of using short flags for argument, consider providing the value directly.
2023-12-06 11:13:48,956 INFO rapids.tools.qualification: Using Spark RAPIDS user tools version 23.10.2

....
2023-12-06 11:13:52,003 DEBUG rapids.tools.cmd: submitting system command: <java -Xmx24g -cp \
~/work_dir/spark_rapids.jar:~/work_dir/hadoop/jars/* \
com.nvidia.spark.rapids.tool.qualification.QualificationMain --output-directory \
~/work_dir/qual_20231206191348_Ea5CAbCf --platform onprem -n 100 ~/event.log>

Signed-off-by: Partho Sarthi <[email protected]>
Copy link
Collaborator

@nartal1 nartal1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested this PR with combination of short flags and arg values for both spark_rapids_user_tools and spark_rapids cli. LGTM. Thanks @parthosa !

Copy link
Collaborator

@cindyyuanjiang cindyyuanjiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix @parthosa! LGTM.

@parthosa parthosa merged commit 2fdf994 into NVIDIA:dev Dec 7, 2023
9 checks passed
@parthosa parthosa deleted the spark-rapids-tools-671 branch December 7, 2023 23:02
@parthosa parthosa added the bug Something isn't working label Dec 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working user_tools Scope the wrapper module running CSP, QualX, and reports (python)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] command line arguments are not processed correctly
3 participants