Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce the log noise caused by core report summary #1426

Merged
merged 1 commit into from
Nov 19, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -70,7 +70,7 @@ class ToolTextFileWriter(
// No need to close the outputStream.
// Java should handle nested streams automatically.
utf8Writer.foreach { writer =>
logInfo(s"$finalLocationText output location: $textOutputLoc")
logDebug(s"$finalLocationText output location: $textOutputLoc")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be unnecessary but should we consider adding a single INFO level log per app indicating successful output generation, instead of logging for each individual file?

Again, it is unnecessary because these metrics are primarily for our internal use.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if I get your comment.
Do you mean some sort of a summary of files that were generated but put them in a single message?

  • If that's the case, then I think we will hit the same issue that we have on the python side and Qualx. For example, when we generated tree of the output; it became too long. So, I believe if we put a single LogInfo message that contains list of all metrics generated, then the list will keep growing and I don't see that very useful information to put in the logging message.
  • In that change I changed the debug-level so that it does not get generated by default in production.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of a single log line like:

INFO: Successfully generated output files for app <app-id> at <metrics location dir>

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! For now this PR does exactly that.
We can later improve the implementation if we want the debug-level to be a single message as well.

writer.flush()
writer.close()
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -119,6 +119,7 @@ class Profiler(hadoopConf: Configuration, appArgs: ProfileArgs, enablePB: Boolea
progressBar.foreach(_.finishAll())

// Write status reports for all event logs to a CSV file
logOutputPath()
val reportResults = generateStatusResults(appStatusReporter.asScala.values.toSeq)
ProfileOutputWriter.writeCSVTable("Profiling Status", reportResults, outputDir)
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -237,13 +237,6 @@ class Qualification(outputPath: String, numRows: Int, hadoopConf: Configuration,
}
}

/**
* The outputPath of the current instance of the provider
*/
def getReportOutputPath: String = {
s"$outputDir/rapids_4_spark_qualification_output"
}

/**
* Generates a qualification report based on the provided summary information.
*/
Expand All @@ -263,6 +256,7 @@ class Qualification(outputPath: String, numRows: Int, hadoopConf: Configuration,
qWriter.writeStageReport(allAppsSum, order)
qWriter.writeUnsupportedOpsSummaryCSVReport(allAppsSum)
val appStatusResult = generateStatusResults(appStatusReporter.asScala.values.toSeq)
logOutputPath()
qWriter.writeStatusReport(appStatusResult, order)
if (mlOpsEnabled) {
if (allAppsSum.exists(x => x.mlFunctions.nonEmpty)) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,9 @@ trait RuntimeReporter extends Logging {
def generateRuntimeReport(hadoopConf: Option[Configuration] = None): Unit = {
RuntimeUtil.generateReport(outputDir, hadoopConf)
}
def logOutputPath(): Unit = {
logInfo(s"Tools output directory: $outputDir")
}

/**
* Updates the status of "SUCCESS" applications to "SKIPPED" if newer attempts with
Expand Down