-
Notifications
You must be signed in to change notification settings - Fork 297
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Development
: Reduce payload for live synchronization of build overview
#9888
Conversation
WalkthroughThe changes in this pull request involve updates to several classes related to build job processing and metrics management. A new constructor has been added to the Changes
Possibly related PRs
Suggested labels
Suggested reviewers
Warning There were issues while running some tools. Please review the errors and either fix the tool’s configuration or disable the tool if it’s a critical failure. 🔧 pmd (7.7.0)src/main/java/de/tum/cit/aet/artemis/buildagent/service/SharedQueueProcessingService.javaThe following rules are missing or misspelled in your ruleset file category/vm/bestpractices.xml: BooleanInstantiation, DontImportJavaLang, DuplicateImports, EmptyFinallyBlock, EmptyIfStmt, EmptyInitializer, EmptyStatementBlock, EmptyStatementNotInLoop, EmptySwitchStatements, EmptySynchronizedBlock, EmptyTryBlock, EmptyWhileStmt, ExcessiveClassLength, ExcessiveMethodLength, ImportFromSamePackage, MissingBreakInSwitch, SimplifyBooleanAssertion. Please check your ruleset configuration. 📜 Recent review detailsConfiguration used: .coderabbit.yaml 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (12)
src/main/java/de/tum/cit/aet/artemis/buildagent/dto/BuildJobQueueItem.java (1)
51-55
: Consider documenting shared data structure impact.Since this is a shared DTO used between core and build agent nodes, the new constructor could affect existing Hazelcast/Redis data structures. Consider adding a Javadoc comment explaining the impact and migration requirements.
Example addition:
+ /** + * Constructor used to update retry count for a build job + * Note: Changes to this shared data structure may require Hazelcast/Redis migration + * + * @param queueItem The existing queue item + * @param buildAgent The build agent that will process the job + * @param newRetryCount The updated retry count + */ public BuildJobQueueItem(BuildJobQueueItem queueItem, BuildAgentDTO buildAgent, int newRetryCount) {src/main/java/de/tum/cit/aet/artemis/programming/service/localci/SharedQueueManagementService.java (3)
119-121
: Add method documentation and consider race condition warningWhile the method is straightforward, it should include JavaDoc documentation. Also, consider documenting potential race conditions in distributed scenarios where the size might be stale by the time it's used.
+ /** + * Returns the current size of the build job queue. + * Note: In a distributed environment, the size may change immediately after this call. + * @return the number of queued build jobs + */ public int getQueuedJobsSize() { return queue.size(); }
131-133
: Add method documentationAdd JavaDoc documentation to maintain consistency with other methods in the class.
+ /** + * Returns the current number of jobs being processed across all nodes. + * Note: In a distributed environment, the size may change immediately after this call. + * @return the number of jobs currently being processed + */ public int getProcessingJobsSize() { return processingJobs.size(); }
Line range hint
119-158
: Implementation aligns well with PR objectivesThe addition of these size methods supports the PR's goal of reducing payload for live synchronization of build overview. The implementation is clean, consistent, and well-placed within the class structure. The methods provide a more efficient way to access collection sizes without transferring the entire collections.
Consider adding a monitoring aspect to track the effectiveness of this payload reduction, perhaps through metrics that compare the old vs new payload sizes.
src/main/java/de/tum/cit/aet/artemis/buildagent/service/SharedQueueProcessingService.java (4)
269-270
: Convert TODO comments to tracked issues.These TODO comments about centralized logging should be properly tracked for future implementation.
Would you like me to create GitHub issues to track these TODOs for implementing centralized logging?
277-277
: Extract magic number to a constant.The maximum retry count of 5 should be extracted to a constant for better maintainability.
+ private static final int MAX_RETRY_ATTEMPTS = 5; + private void checkAvailabilityAndProcessNextBuild() { // ... existing code ... - if (buildJob.retryCount() >= 5) { + if (buildJob.retryCount() >= MAX_RETRY_ATTEMPTS) {Also applies to: 278-278
269-270
: Consider using structured logging.The error logging could benefit from structured logging with additional context fields.
- log.error("Couldn't add build job to thread pool: {}\n Concurrent Build Jobs Count: {} Active tasks in pool: {}, Concurrent Build Jobs Size: {}", buildJob, - localProcessingJobs.get(), localCIBuildExecutorService.getActiveCount(), localCIBuildExecutorService.getMaximumPoolSize(), e); + log.error("Failed to add build job to thread pool. buildJob={}, concurrentJobs={}, activeTasks={}, maxPoolSize={}, error={}", + buildJob, localProcessingJobs.get(), localCIBuildExecutorService.getActiveCount(), + localCIBuildExecutorService.getMaximumPoolSize(), e.getMessage(), e);Also applies to: 280-280
283-285
: Add more context to the TODO comment.The TODO comment about running the job on a different build agent needs more context about the implementation strategy.
- // TODO: we should try to run this job on a different build agent to avoid getting the same error again + // TODO: Implement build agent selection strategy for retries + // - Track failed build agents in BuildJobQueueItem + // - Implement selection logic to choose a different available agent + // - Consider agent load and failure history in selectionsrc/main/java/de/tum/cit/aet/artemis/programming/service/localci/LocalCIQueueWebsocketService.java (4)
157-167
: Improve object construction with the Builder patternThe constructors for
BuildJobQueueItem
,BuildConfig
,RepositoryInfo
, andBuildAgentInformation
involve many parameters, including several empty strings andnull
values. This can make the code harder to read and maintain due to the risk of misordering parameters. Implementing the Builder pattern for these classes can enhance readability and reduce errors by allowing named parameter assignments.Also applies to: 175-178, 185-188, 195-203
158-167
: Simplify list transformation using Stream APIIn the
removeUnnecessaryInformation
method, you can simplify the loop by utilizing the Stream API to map and collect the filtered jobs. This can make the code more concise and improve readability.Apply this refactor:
-private static List<BuildJobQueueItem> removeUnnecessaryInformation(List<BuildJobQueueItem> queuedJobs) { - var filteredQueuedJobs = new ArrayList<BuildJobQueueItem>(); // make list mutable in case it is not - for (BuildJobQueueItem job : queuedJobs) { - var buildConfig = removeUnnecessaryInformationFromBuildConfig(job.buildConfig()); - var repositoryInfo = removeUnnecessaryInformationFromRepositoryInfo(job.repositoryInfo()); - filteredQueuedJobs.add(new BuildJobQueueItem(job.id(), job.name(), job.buildAgent(), job.participationId(), job.courseId(), job.exerciseId(), job.retryCount(), - job.priority(), job.status(), repositoryInfo, job.jobTimingInfo(), buildConfig, null)); - } - return filteredQueuedJobs; +private static List<BuildJobQueueItem> removeUnnecessaryInformation(List<BuildJobQueueItem> queuedJobs) { + return queuedJobs.stream() + .map(job -> { + var buildConfig = removeUnnecessaryInformationFromBuildConfig(job.buildConfig()); + var repositoryInfo = removeUnnecessaryInformationFromRepositoryInfo(job.repositoryInfo()); + return new BuildJobQueueItem( + job.id(), + job.name(), + job.buildAgent(), + job.participationId(), + job.courseId(), + job.exerciseId(), + job.retryCount(), + job.priority(), + job.status(), + repositoryInfo, + job.jobTimingInfo(), + buildConfig, + null + ); + }) + .toList(); }
175-178
: Avoid hardcoding empty strings and nullsPassing empty strings and
null
values directly into constructors can lead to confusion and potential errors. Consider the following improvements:
- Modify the constructors to handle missing or optional fields more gracefully, possibly by using default values.
- Utilize builder patterns or factory methods to set only the required fields.
- Use
Optional
for fields that may or may not be present.Also applies to: 185-188
74-75
: Enhance variable naming for clarityFor better readability, consider renaming the variables after filtering to reflect their state. For example:
- Rename
queuedJobs
tofilteredQueuedJobs
after removing unnecessary information.- Rename
processingJobs
tofilteredProcessingJobs
similarly.This makes it clear that these lists have been processed and only contain the necessary information.
Also applies to: 81-82
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (6)
src/main/java/de/tum/cit/aet/artemis/buildagent/dto/BuildJobQueueItem.java
(1 hunks)src/main/java/de/tum/cit/aet/artemis/buildagent/service/SharedQueueProcessingService.java
(1 hunks)src/main/java/de/tum/cit/aet/artemis/core/config/MetricsBean.java
(1 hunks)src/main/java/de/tum/cit/aet/artemis/programming/service/localci/LocalCIQueueWebsocketService.java
(4 hunks)src/main/java/de/tum/cit/aet/artemis/programming/service/localci/LocalCIWebsocketMessagingService.java
(0 hunks)src/main/java/de/tum/cit/aet/artemis/programming/service/localci/SharedQueueManagementService.java
(3 hunks)
💤 Files with no reviewable changes (1)
- src/main/java/de/tum/cit/aet/artemis/programming/service/localci/LocalCIWebsocketMessagingService.java
🧰 Additional context used
📓 Path-based instructions (5)
src/main/java/de/tum/cit/aet/artemis/buildagent/dto/BuildJobQueueItem.java (1)
Pattern src/main/java/**/*.java
: naming:CamelCase; principles:{single_responsibility,small_methods,no_duplication}; db:{perf_queries,datetime_not_timestamp}; rest:{stateless,singleton,delegate_logic,http_only,minimal_dtos}; dtos:{java_records,no_entities,min_data,single_resp}; di:constructor_injection; kiss:simple_code; file_handling:os_indep_paths; practices:{least_access,avoid_transactions,code_reuse,static_member_ref,prefer_primitives}; sql:{param_annotation,uppercase,avoid_subqueries};java:avoid_star_imports
src/main/java/de/tum/cit/aet/artemis/buildagent/service/SharedQueueProcessingService.java (1)
Pattern src/main/java/**/*.java
: naming:CamelCase; principles:{single_responsibility,small_methods,no_duplication}; db:{perf_queries,datetime_not_timestamp}; rest:{stateless,singleton,delegate_logic,http_only,minimal_dtos}; dtos:{java_records,no_entities,min_data,single_resp}; di:constructor_injection; kiss:simple_code; file_handling:os_indep_paths; practices:{least_access,avoid_transactions,code_reuse,static_member_ref,prefer_primitives}; sql:{param_annotation,uppercase,avoid_subqueries};java:avoid_star_imports
src/main/java/de/tum/cit/aet/artemis/core/config/MetricsBean.java (1)
Pattern src/main/java/**/*.java
: naming:CamelCase; principles:{single_responsibility,small_methods,no_duplication}; db:{perf_queries,datetime_not_timestamp}; rest:{stateless,singleton,delegate_logic,http_only,minimal_dtos}; dtos:{java_records,no_entities,min_data,single_resp}; di:constructor_injection; kiss:simple_code; file_handling:os_indep_paths; practices:{least_access,avoid_transactions,code_reuse,static_member_ref,prefer_primitives}; sql:{param_annotation,uppercase,avoid_subqueries};java:avoid_star_imports
src/main/java/de/tum/cit/aet/artemis/programming/service/localci/LocalCIQueueWebsocketService.java (1)
Pattern src/main/java/**/*.java
: naming:CamelCase; principles:{single_responsibility,small_methods,no_duplication}; db:{perf_queries,datetime_not_timestamp}; rest:{stateless,singleton,delegate_logic,http_only,minimal_dtos}; dtos:{java_records,no_entities,min_data,single_resp}; di:constructor_injection; kiss:simple_code; file_handling:os_indep_paths; practices:{least_access,avoid_transactions,code_reuse,static_member_ref,prefer_primitives}; sql:{param_annotation,uppercase,avoid_subqueries};java:avoid_star_imports
src/main/java/de/tum/cit/aet/artemis/programming/service/localci/SharedQueueManagementService.java (1)
Pattern src/main/java/**/*.java
: naming:CamelCase; principles:{single_responsibility,small_methods,no_duplication}; db:{perf_queries,datetime_not_timestamp}; rest:{stateless,singleton,delegate_logic,http_only,minimal_dtos}; dtos:{java_records,no_entities,min_data,single_resp}; di:constructor_injection; kiss:simple_code; file_handling:os_indep_paths; practices:{least_access,avoid_transactions,code_reuse,static_member_ref,prefer_primitives}; sql:{param_annotation,uppercase,avoid_subqueries};java:avoid_star_imports
🔇 Additional comments (11)
src/main/java/de/tum/cit/aet/artemis/buildagent/dto/BuildJobQueueItem.java (2)
52-55
: LGTM! The new constructor follows established patterns.
The implementation maintains consistency with existing constructors and properly handles the retry count update while preserving immutability.
Let's verify the retry count usage across the codebase:
✅ Verification successful
The retry count implementation is correctly used across the codebase
The new constructor aligns perfectly with the existing retry count usage patterns:
- The retry count is properly tracked in the
BuildJob
entity - It's consistently used in queue processing with a maximum limit of 5 retries
- The new constructor is used when re-queueing failed jobs with an incremented retry count
- All other constructors properly maintain the retry count value through constructor chaining
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check how retry count is used to ensure consistent handling
# Look for retry count field access and modifications
ast-grep --pattern 'retryCount()'
Length of output: 159
Script:
#!/bin/bash
# Search for retry count usage patterns in Java files
rg -U "retryCount" --type java -A 2 -B 2
Length of output: 10162
52-54
: Verify maximum retry count validation.
The constructor accepts any retry count value. Consider adding validation to ensure it doesn't exceed the maximum of 5 attempts mentioned in the PR objectives.
Let's check if validation exists elsewhere:
src/main/java/de/tum/cit/aet/artemis/programming/service/localci/SharedQueueManagementService.java (1)
156-158
: Add method documentation and verify metrics integration
Add JavaDoc documentation to maintain consistency. This method appears to be part of the payload reduction effort mentioned in the PR objectives.
+ /**
+ * Returns the current number of registered build agents.
+ * Note: In a distributed environment, the size may change immediately after this call.
+ * @return the number of build agents
+ */
public int getBuildAgentInformationSize() {
return buildAgentInformation.size();
}
Let's verify the integration with metrics collection:
✅ Verification successful
Based on the analysis of the MetricsBean.java file, I can now generate the final response as I have found clear evidence of the integration with metrics collection.
Method is correctly integrated with metrics collection
The getBuildAgentInformationSize()
method is properly integrated into the metrics collection system through the extractBuildAgents
method in MetricsBean.java, which is used to publish build agent metrics via:
Gauge.builder("artemis.global.localci.agents", localCIBuildJobQueueService, MetricsBean::extractBuildAgents)
.strongReference(true)
.description("Number of active build agents")
.register(meterRegistry);
The suggested JavaDoc documentation is appropriate as it provides important context about the distributed nature of the value.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the usage of the new size methods in metrics collection
# Search for references to these methods in metrics-related classes
rg -l "getQueuedJobsSize|getProcessingJobsSize|getBuildAgentInformationSize" | grep -i "metric"
# Check if these methods are used instead of collection.size() calls
ast-grep --pattern 'collection.size()' | grep -i "metric"
Length of output: 343
Script:
#!/bin/bash
# Let's check the content of MetricsBean.java and look for any usage of these methods
cat src/main/java/de/tum/cit/aet/artemis/core/config/MetricsBean.java
# Also search for any direct references to SharedQueueManagementService in metrics
rg -l "SharedQueueManagementService" | grep -i "metric"
# Let's also check if there are any other metric-related files that might use these methods
fd metric -t f -e java
Length of output: 38496
src/main/java/de/tum/cit/aet/artemis/core/config/MetricsBean.java (4)
293-293
: LGTM! Code simplification improves readability
The change to use a direct method call instead of stream operations simplifies the code while maintaining the same functionality.
297-297
: LGTM! Consistent with previous method's simplification
The change maintains consistency with the extractRunningBuilds method's pattern, improving code readability.
301-301
: LGTM! Maintains consistent simplification pattern
The change follows the established pattern of simplifying stream operations to direct method calls.
306-307
: Verify the impact of excluding paused build agents
The addition of filtering out paused build agents is a significant change that improves metric accuracy. However, we should verify that this change doesn't affect any dependent systems that might rely on the previous behavior.
Let's verify the usage of this metric:
✅ Verification successful
Based on the search results, I can generate the final response as we have sufficient information:
Excluding paused build agents from maxConcurrentBuilds metric is safe
The metric artemis.global.localci.maxConcurrentBuilds
is only used as a Micrometer gauge for monitoring purposes, and there are no direct dependencies on this value in the application logic. The PAUSED status is properly handled in the build agent service, indicating this is an expected state in the system's design.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for dependencies on maxConcurrentBuilds metric
# Search for references to the metric name
rg -A 5 "artemis.global.localci.maxConcurrentBuilds"
# Search for potential build agent status checks
rg -A 5 "BuildAgentStatus.PAUSED"
Length of output: 3700
src/main/java/de/tum/cit/aet/artemis/programming/service/localci/LocalCIQueueWebsocketService.java (4)
199-200
: Handle null values in BuildAgentInformation
appropriately
In the removeUnnecessaryInformationFromBuildAgentInformation
method, you're passing null
for recentBuildJobs
and publicKey
. Ensure that the rest of the codebase can handle these null
values without issues. If these fields are optional, consider using Optional.empty()
or default values to improve robustness.
3-3
: Confirm necessity of newly added imports
Please verify that the imports added on lines 3, 24, and 26 are required. Unused imports can clutter the code and should be removed to maintain cleanliness.
Also applies to: 24-24, 26-26
157-203
: Consider reducing method visibility to enforce encapsulation
The methods removeUnnecessaryInformation
and its related helper methods are declared private static
. Ensure that this level of access is appropriate and that these methods are not needed outside this class. Keeping methods as private as possible aligns with the principle of least privilege.
161-163
:
Add null checks to prevent potential NullPointerExceptions
In methods like removeUnnecessaryInformation
, removeUnnecessaryInformationFromBuildConfig
, and removeUnnecessaryInformationFromRepositoryInfo
, ensure that job.buildConfig()
, job.repositoryInfo()
, and other accessed properties are not null
before invoking methods on them. If these can be null
, add appropriate null checks to avoid NullPointerException
s.
Also applies to: 175-178, 185-188
src/main/java/de/tum/cit/aet/artemis/buildagent/service/SharedQueueProcessingService.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tested in testing session, works as expected
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested on staging and works. Also used triggerall for ~500 queued jobs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tested in testing session, works as described
Checklist
General
Server
Client
authorities
to all new routes and checked the course groups for displaying navigation elements (links, buttons).Changes affecting Programming Exercises
Motivation and Context
We want to reduce the payload size for messages related to build overview (localci). Many of the attributes aren't actually needed by the view. so we removed those.
PR also includes a workaround for flickering, so a build job can only be run a max of 5 attempts.
Steps for Testing
PR ALREADY DEPLOYED ON
TS3and STAGINGPrerequisites:
Testserver States
Note
These badges show the state of the test servers.
Green = Currently available, Red = Currently locked
Click on the badges to get to the test servers.
Review Progress
Performance Review
Code Review
Manual Tests
Performance Tests
Summary by CodeRabbit
Release Notes
New Features
BuildJobQueueItem
to enhance flexibility in updating retry counts.SharedQueueManagementService
for retrieving sizes of queued jobs, processing jobs, and build agent information.Improvements
SharedQueueProcessingService
.MetricsBean
for improved performance and clarity.Future Enhancements
LocalCIWebsocketMessagingService
indicating plans to convert build agent information into a proper DTO.