-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEA] Profiling tool should detect if there is "file.encoding=UTF-8" #713
Labels
Comments
amahussein
added
core_tools
Scope the core module (scala)
and removed
? - Needs Triage
labels
Jan 10, 2024
amahussein
changed the title
[FEA] Profiling tool should detect if there is "file.encoding=UTF-8 "
[FEA] Profiling tool should detect if there is "file.encoding=UTF-8"
Jan 10, 2024
This property is part of the environment information in the eventlog. For example {
"Event":"SparkListenerEnvironmentUpdate",
"JVM Information":{
"Java Home":"/usr/lib/jvm/java-8-openjdk-amd64/jre",
"Java Version":"1.8.0_232 (Private Build)",
"Scala Version":"version 2.12.10"
},
"Spark Properties":{
},
"Hadoop Properties":{
},
"System Properties":{
"file.encoding":"UTF-8"
},
"Classpath Entries":{
}
} |
@viadea How do you want this information to be reported? |
Does this mean we need to update the qualification to consider the regex not supported if the non UTF-8 files are used? |
amahussein
added a commit
to amahussein/spark-rapids-tools
that referenced
this issue
Jan 22, 2024
Signed-off-by: Ahmed Hussein (amahussein) <[email protected]> Fixes NVIDIA#713, Fixes NVIDIA#734 This PR changes the behavior of the AutTuner to display a comment when the file-encoding of an application is set to a value that is not "utf-8". The changes also improves the extraction of the RAPIDS jars values. *Changes for 713*: - Added a new field in `ApplicationSummaryInfo` that represents the systemProperties - Capture SystemProperties in the App in order to be able to check the file-encoding - Moved map properties to `CacheableProps` so that it can be used by the Qualification as well. - Moved String-conversion methods from AutTuner to StringUtils - Moved `getEventFromJsonMethod` to EventUtils object - Added a new UnitTest for the AutoTuner - Updated ApplicationInfoSuite unitTests *Changes for 734*: - Fixed the implementation of `CollectInformation.getRapidsJARInfo`
amahussein
added a commit
that referenced
this issue
Jan 23, 2024
…ons (#736) * [FEA] AutoTuner warns that non-utf8 may not support some GPU expressions Fixes #713, Fixes #734 This PR changes the behavior of the AutTuner to display a comment when the file-encoding of an application is set to a value that is not "utf-8". The changes also improves the extraction of the RAPIDS jars values. *Changes for 713*: - Added a new field in `ApplicationSummaryInfo` that represents the systemProperties - Capture SystemProperties in the App in order to be able to check the file-encoding - Moved map properties to `CacheableProps` so that it can be used by the Qualification as well. - Moved String-conversion methods from AutTuner to StringUtils - Moved `getEventFromJsonMethod` to EventUtils object - Added a new UnitTest for the AutoTuner - Updated ApplicationInfoSuite unitTests *Changes for 734*: - Fixed the implementation of `CollectInformation.getRapidsJARInfo` --------- Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
I wish Profiling tool should detect if there is "file.encoding=UTF-8 " from event log.
If it is empty or set to some other non-UTF8 encoding, we should raise the warning.
This is because, our regexp related functions might fallback to CPU if it is non-UTF8.
Such as below driver log message might show up:
The text was updated successfully, but these errors were encountered: