Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-11367. Improve ozone balancing status command output #7139

Open
wants to merge 43 commits into
base: master
Choose a base branch
from

Conversation

juncevich
Copy link
Contributor

What changes were proposed in this pull request?

Minor improvements to command output:

  1. job task time consumption
  2. iteration time consumption
  3. unfinished iteration shouldn't be in iteration history section
  4. show data units, even it less than gb show it in mb

What is the link to the Apache JIRA

HDDS-11367

How was this patch tested?

Tested by robot test

Copy link
Contributor

@siddhantsangwan siddhantsangwan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@juncevich can you please elaborate on what these points mean?

job task time consumption
iteration time consumption

It will help reviewers understand what you're proposing.

@juncevich
Copy link
Contributor Author

@juncevich can you please elaborate on what these points mean?

job task time consumption
iteration time consumption

It will help reviewers understand what you're proposing.

Hi, @siddhantsangwan. Sure, i can explain. Balancing job constists from some iterations. Job task time - the whole balancing time, iteration time - particular iteration time. Job time = iteration time + iteration time ...

@juncevich
Copy link
Contributor Author

Hi siddhantsangwan. Can you review my PR?

Copy link
Contributor

@siddhantsangwan siddhantsangwan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@juncevich thanks for working on this. Other than my comments below, we definitely need more testing here. I'd like to at least see:

  1. Unit tests for balancer that verify the iteration duration and other properties are correct.
  2. Tests for the balancer status command that verify the code is correctly picking up the mocked values (mocked response from SCM).

We can also add verifications to the robot tests.

"Scheduled to move containers", containerMovesScheduled,
"Already moved containers", containerMovesCompleted,
"Failed to move containers", containerMovesFailed,
"Failed to move containers by timeout", containerMovesTimeout,
"Entered data to nodes", enteringDataNodeList,
"Exited data from nodes", leavingDataNodeList);
}

private String getPrettyIterationStatusInfo(long duration) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are probably existing utility methods to do this as well. See if you can find any? It's better to use them since they will be well tested.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find utility methods. I will extract this method to separate util class and write test for it.

@siddhantsangwan
Copy link
Contributor

@juncevich please let me know when you've addressed comments and this is ready for another review.

@juncevich
Copy link
Contributor Author

please let me know when you've addressed comments and this is ready for another review.

@siddhantsangwan I've not understood about comments, can you explain about? PR is ready for another review.

@juncevich
Copy link
Contributor Author

@siddhantsangwan, can you take another look at this PR?

@@ -40,6 +40,9 @@ public final class ContainerBalancerMetrics {
" in the latest iteration.")
private MutableCounterLong dataSizeMovedGBInLatestIteration;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this metric now? Or is it for backward compatibility?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For backward compatibility.

Copy link
Contributor

@myskov myskov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

few minor comments

@myskov
Copy link
Contributor

myskov commented Nov 22, 2024

it seems issues @siddhantsangwan mentioned have been addressed. @siddhantsangwan could you take look again, please?

@siddhantsangwan
Copy link
Contributor

@juncevich if you've addressed all comments and fixed tests, I can start the CI workflow.

@juncevich
Copy link
Contributor Author

@juncevich if you've addressed all comments and fixed tests, I can start the CI workflow.

Great, thanks! Just fixed the flaky test. Everything should be OK.

# Conflicts:
#	hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancerTask.java
@juncevich
Copy link
Contributor Author

@siddhantsangwan I've fixed your review notices. Could you look through the changed code? I have to fix the robot test and PR will be ready.

@juncevich juncevich force-pushed the HDDS-11367 branch 2 times, most recently from dec8e06 to 4a40429 Compare November 26, 2024 15:14
@juncevich
Copy link
Contributor Author

@siddhantsangwan robot test fixed. PR is ready for another review round.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants