-
Notifications
You must be signed in to change notification settings - Fork 301
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integrated code lifecycle
: Improve reinitialization when pausing build agents
#9939
Conversation
# Conflicts: # src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildJobContainerService.java
Warning There were issues while running some tools. Please review the errors and either fix the tool’s configuration or disable the tool if it’s a critical failure. 🔧 pmd (7.8.0)src/main/java/de/tum/cit/aet/artemis/buildagent/service/SharedQueueProcessingService.javaThe following rules are missing or misspelled in your ruleset file category/vm/bestpractices.xml: BooleanInstantiation, DontImportJavaLang, DuplicateImports, EmptyFinallyBlock, EmptyIfStmt, EmptyInitializer, EmptyStatementBlock, EmptyStatementNotInLoop, EmptySwitchStatements, EmptySynchronizedBlock, EmptyTryBlock, EmptyWhileStmt, ExcessiveClassLength, ExcessiveMethodLength, ImportFromSamePackage, MissingBreakInSwitch, SimplifyBooleanAssertion. Please check your ruleset configuration. src/test/java/de/tum/cit/aet/artemis/shared/base/AbstractSpringIntegrationLocalCILocalVCTest.javaThe following rules are missing or misspelled in your ruleset file category/vm/bestpractices.xml: BooleanInstantiation, DontImportJavaLang, DuplicateImports, EmptyFinallyBlock, EmptyIfStmt, EmptyInitializer, EmptyStatementBlock, EmptyStatementNotInLoop, EmptySwitchStatements, EmptySynchronizedBlock, EmptyTryBlock, EmptyWhileStmt, ExcessiveClassLength, ExcessiveMethodLength, ImportFromSamePackage, MissingBreakInSwitch, SimplifyBooleanAssertion. Please check your ruleset configuration. WalkthroughThe pull request introduces significant modifications to the build agent configuration and service classes, focusing on centralizing Docker client and build job executor management. The changes primarily involve replacing direct dependencies on Changes
Sequence DiagramsequenceDiagram
participant App as Application
participant BAConfig as BuildAgentConfiguration
participant DockerClient as Docker Client
participant Executor as Thread Pool Executor
App->>BAConfig: onApplicationReady()
BAConfig->>Executor: createBuildExecutor()
BAConfig->>DockerClient: createDockerClient()
App->>BAConfig: getBuildExecutor()
App->>BAConfig: getDockerClient()
App->>BAConfig: closeBuildAgentServices()
BAConfig->>Executor: shutdownBuildExecutor()
BAConfig->>DockerClient: closeDockerClient()
Possibly related PRs
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (10)
src/main/java/de/tum/cit/aet/artemis/buildagent/BuildAgentConfiguration.java (2)
Line range hint
175-182
: Restrict visibility ofcreateDockerClient()
methodThe method
createDockerClient()
is only used within theBuildAgentConfiguration
class. To follow the principle of least privilege, consider changing its access modifier frompublic
toprivate
.Apply this diff to change the method's visibility:
-public DockerClient createDockerClient() { +private DockerClient createDockerClient() {
Line range hint
183-200
: Handle unexpected memory units inparseMemoryString
The
parseMemoryString
method does not handle cases where memory strings have unexpected units or formats (e.g., missing quotes, uppercase letters). Consider adding validation or default cases to prevent potentialNumberFormatException
.Apply this diff to enhance robustness:
else { + try { return Long.parseLong(memoryString); + } catch (NumberFormatException e) { + throw new LocalCIException("Invalid memory string format: " + memoryString, e); + } }src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildJobContainerService.java (2)
208-244
: Optimize executor service usage instopUnresponsiveContainer
Creating a new
ExecutorService
every timestopUnresponsiveContainer
is called may lead to resource overhead. Consider reusing a shared executor service or refactoring the method to avoid unnecessary executor creation.Apply this diff to use a shared executor service:
+private final ExecutorService executor = Executors.newSingleThreadExecutor(); public void stopUnresponsiveContainer(String containerId) { - try (ExecutorService executor = Executors.newSingleThreadExecutor()) { try { // Existing code... } finally { - executor.shutdown(); + // No need to shutdown the shared executor here } - } }Ensure the shared executor service is properly shut down when the application is stopped, possibly in a
@PreDestroy
method.
Line range hint
406-433
: Handle potential resource leaks inexecuteDockerCommand
In the
executeDockerCommand
method, if thedockerClient.execStartCmd
does not callonComplete
, theCountDownLatch
may not be decremented, leading to a potential indefinite wait inlatch.await()
. Consider adding a timeout or ensuring thatonComplete
is always called.Apply this diff to prevent indefinite blocking:
try { dockerClient.execStartCmd(execCreateCmdResponse.getId()).withDetach(detach).exec(new ResultCallback.Adapter<>() { @Override public void onNext(Frame item) { // Existing code... } + @Override + public void onError(Throwable throwable) { + latch.countDown(); + super.onError(throwable); + } + @Override + public void close() throws IOException { + latch.countDown(); + super.close(); + } }); } catch (ConflictException e) { throw new LocalCIException("Could not execute Docker command: " + String.join(" ", command), e); } try { - latch.await(); + if (!latch.await(60, TimeUnit.SECONDS)) { + throw new LocalCIException("Timeout while executing Docker command: " + String.join(" ", command)); + } } catch (InterruptedException e) { Thread.currentThread().interrupt(); throw new LocalCIException("Interrupted while executing Docker command: " + String.join(" ", command), e); }src/main/java/de/tum/cit/aet/artemis/buildagent/service/SharedQueueProcessingService.java (3)
340-342
: Enhance exception logging inupdateLocalBuildAgentInformation
The catch block logs a generic error without specific information about the exception. Consider adding the exception to the log statement to aid in debugging.
Apply this diff to include the exception in the log:
catch (Exception e) { - log.error("Error while updating build agent information for agent with address {}", memberAddress); + log.error("Error while updating build agent information for agent with address {}", memberAddress, e); }
272-275
: Avoid logging sensitive or excessive informationThe log statement in the catch block logs the build job and executor service details, which may contain sensitive information or clutter the logs. Review the logged data to ensure it complies with logging policies.
Apply this diff to adjust the log statement:
- log.error("Couldn't add build job to thread pool: {}\n Concurrent Build Jobs Count: {} Active tasks in pool: {}, Concurrent Build Jobs Size: {}", buildJob, + log.error("Couldn't add build job to thread pool. Concurrent Build Jobs Count: {} Active tasks in pool: {}, Maximum Pool Size: {}", localProcessingJobs.get(), buildExecutorService.getActiveCount(), buildExecutorService.getMaximumPoolSize(), e);
578-582
: Simplify condition innodeIsAvailable()
methodThe condition in
nodeIsAvailable()
checks bothlocalProcessingJobs
andbuildExecutorService
active count against the maximum pool size. Since both represent the number of running tasks, consider simplifying the condition to avoid redundancy.Apply this diff to simplify the condition:
return localProcessingJobs.get() < buildExecutorService.getMaximumPoolSize() - && buildExecutorService.getActiveCount() < buildExecutorService.getMaximumPoolSize() && buildExecutorService.getQueue().isEmpty();
src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildJobManagementService.java (1)
Line range hint
193-199
: Preserve interrupt status when handlingInterruptedException
In the
finishBuildJobExceptionally
method, when catchingInterruptedException
, consider resetting the thread's interrupt status to ensure proper thread interruption handling.Apply this diff to reset the interrupt status:
if (containerId != null) { buildJobContainerService.stopUnresponsiveContainer(containerId); } + if (exception instanceof InterruptedException) { + Thread.currentThread().interrupt(); + } }src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildAgentDockerService.java (2)
325-327
: Handle potential exceptions during Docker image removalWhile catching
NotFoundException
is appropriate, consider catching other exceptions that might occur during image removal, such asConflictException
, and log them accordingly to ensure reliable cleanup.Apply this diff to handle additional exceptions:
try { buildAgentConfiguration.getDockerClient().removeImageCmd(dockerImage).exec(); } catch (NotFoundException e) { log.warn("Docker image {} not found during cleanup", dockerImage); +} catch (ConflictException e) { + log.warn("Conflict occurred while removing Docker image {}: {}", dockerImage, e.getMessage()); +} catch (Exception e) { + log.error("Unexpected error while removing Docker image {}: {}", dockerImage, e.getMessage()); }
Line range hint
348-385
: Avoid potential race conditions in disk space checkIn
checkUsableDiskSpaceThenCleanUp()
, the method does not lock the Docker client or the image list during the cleanup process. Consider adding synchronization to prevent race conditions when multiple threads might modify the Docker images concurrently.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (5)
src/main/java/de/tum/cit/aet/artemis/buildagent/BuildAgentConfiguration.java
(8 hunks)src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildAgentDockerService.java
(8 hunks)src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildJobContainerService.java
(10 hunks)src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildJobManagementService.java
(4 hunks)src/main/java/de/tum/cit/aet/artemis/buildagent/service/SharedQueueProcessingService.java
(9 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildAgentDockerService.java (1)
Pattern src/main/java/**/*.java
: naming:CamelCase; principles:{single_responsibility,small_methods,no_duplication}; db:{perf_queries,datetime_not_timestamp}; rest:{stateless,singleton,delegate_logic,http_only,minimal_dtos}; dtos:{java_records,no_entities,min_data,single_resp}; di:constructor_injection; kiss:simple_code; file_handling:os_indep_paths; practices:{least_access,avoid_transactions,code_reuse,static_member_ref,prefer_primitives}; sql:{param_annotation,uppercase,avoid_subqueries};java:avoid_star_imports
src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildJobManagementService.java (1)
Pattern src/main/java/**/*.java
: naming:CamelCase; principles:{single_responsibility,small_methods,no_duplication}; db:{perf_queries,datetime_not_timestamp}; rest:{stateless,singleton,delegate_logic,http_only,minimal_dtos}; dtos:{java_records,no_entities,min_data,single_resp}; di:constructor_injection; kiss:simple_code; file_handling:os_indep_paths; practices:{least_access,avoid_transactions,code_reuse,static_member_ref,prefer_primitives}; sql:{param_annotation,uppercase,avoid_subqueries};java:avoid_star_imports
src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildJobContainerService.java (1)
Pattern src/main/java/**/*.java
: naming:CamelCase; principles:{single_responsibility,small_methods,no_duplication}; db:{perf_queries,datetime_not_timestamp}; rest:{stateless,singleton,delegate_logic,http_only,minimal_dtos}; dtos:{java_records,no_entities,min_data,single_resp}; di:constructor_injection; kiss:simple_code; file_handling:os_indep_paths; practices:{least_access,avoid_transactions,code_reuse,static_member_ref,prefer_primitives}; sql:{param_annotation,uppercase,avoid_subqueries};java:avoid_star_imports
src/main/java/de/tum/cit/aet/artemis/buildagent/service/SharedQueueProcessingService.java (1)
Pattern src/main/java/**/*.java
: naming:CamelCase; principles:{single_responsibility,small_methods,no_duplication}; db:{perf_queries,datetime_not_timestamp}; rest:{stateless,singleton,delegate_logic,http_only,minimal_dtos}; dtos:{java_records,no_entities,min_data,single_resp}; di:constructor_injection; kiss:simple_code; file_handling:os_indep_paths; practices:{least_access,avoid_transactions,code_reuse,static_member_ref,prefer_primitives}; sql:{param_annotation,uppercase,avoid_subqueries};java:avoid_star_imports
src/main/java/de/tum/cit/aet/artemis/buildagent/BuildAgentConfiguration.java (1)
Pattern src/main/java/**/*.java
: naming:CamelCase; principles:{single_responsibility,small_methods,no_duplication}; db:{perf_queries,datetime_not_timestamp}; rest:{stateless,singleton,delegate_logic,http_only,minimal_dtos}; dtos:{java_records,no_entities,min_data,single_resp}; di:constructor_injection; kiss:simple_code; file_handling:os_indep_paths; practices:{least_access,avoid_transactions,code_reuse,static_member_ref,prefer_primitives}; sql:{param_annotation,uppercase,avoid_subqueries};java:avoid_star_imports
src/main/java/de/tum/cit/aet/artemis/buildagent/BuildAgentConfiguration.java
Show resolved
Hide resolved
src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildJobManagementService.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildJobManagementService.java (1)
Line range hint
1-290
: Consider adding resilience patternsGiven that this service manages distributed build jobs and Docker containers, consider implementing additional resilience patterns:
- Circuit breaker pattern for Docker client operations
- Retry mechanism for failed executor submissions
- Health check mechanism for the build executor service
This would help handle temporary failures and improve system stability during connection issues.
src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildJobContainerService.java (2)
208-245
: Consider extracting timeout values as configurable constantsThe implementation handles container stopping and killing gracefully with proper resource management. However, the timeout values (15s for stop, 10s for kill) could be made configurable.
+ private static final int CONTAINER_STOP_TIMEOUT_SECONDS = 15; + private static final int CONTAINER_KILL_TIMEOUT_SECONDS = 10; + private static final int CONTAINER_STOP_COMMAND_TIMEOUT_SECONDS = 20; public void stopUnresponsiveContainer(String containerId) { try (ExecutorService executor = Executors.newSingleThreadExecutor()) { try { Future<Void> future = executor.submit(() -> { - buildAgentConfiguration.getDockerClient().stopContainerCmd(containerId).withTimeout(15).exec(); + buildAgentConfiguration.getDockerClient().stopContainerCmd(containerId) + .withTimeout(CONTAINER_STOP_TIMEOUT_SECONDS).exec(); return null; }); - future.get(20, TimeUnit.SECONDS); + future.get(CONTAINER_STOP_COMMAND_TIMEOUT_SECONDS, TimeUnit.SECONDS); } // ... existing catch blocks ... try { Future<Void> killFuture = executor.submit(() -> { buildAgentConfiguration.getDockerClient().killContainerCmd(containerId).exec(); return null; }); - killFuture.get(10, TimeUnit.SECONDS); + killFuture.get(CONTAINER_KILL_TIMEOUT_SECONDS, TimeUnit.SECONDS); }
Line range hint
445-447
: Consider enhancing path validation patternThe current regex pattern might be too restrictive. Consider allowing additional valid path characters while maintaining security.
- if (path == null || path.contains("..") || !path.matches("[a-zA-Z0-9_*./-]+")) { + if (path == null || path.contains("..") || !path.matches("^[a-zA-Z0-9_\\-.*/@\\s]+$")) { throw new LocalCIException("Invalid path: " + path); }This pattern adds support for
@
and spaces while maintaining protection against path traversal.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (2)
src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildJobContainerService.java
(10 hunks)src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildJobManagementService.java
(6 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildJobContainerService.java (1)
Pattern src/main/java/**/*.java
: naming:CamelCase; principles:{single_responsibility,small_methods,no_duplication}; db:{perf_queries,datetime_not_timestamp}; rest:{stateless,singleton,delegate_logic,http_only,minimal_dtos}; dtos:{java_records,no_entities,min_data,single_resp}; di:constructor_injection; kiss:simple_code; file_handling:os_indep_paths; practices:{least_access,avoid_transactions,code_reuse,static_member_ref,prefer_primitives}; sql:{param_annotation,uppercase,avoid_subqueries};java:avoid_star_imports
src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildJobManagementService.java (1)
Pattern src/main/java/**/*.java
: naming:CamelCase; principles:{single_responsibility,small_methods,no_duplication}; db:{perf_queries,datetime_not_timestamp}; rest:{stateless,singleton,delegate_logic,http_only,minimal_dtos}; dtos:{java_records,no_entities,min_data,single_resp}; di:constructor_injection; kiss:simple_code; file_handling:os_indep_paths; practices:{least_access,avoid_transactions,code_reuse,static_member_ref,prefer_primitives}; sql:{param_annotation,uppercase,avoid_subqueries};java:avoid_star_imports
🔇 Additional comments (7)
src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildJobManagementService.java (4)
33-33
: LGTM: Clean dependency injection changes
The replacement of direct ExecutorService with BuildAgentConfiguration aligns with the PR objective of managing connection lifecycles centrally.
Also applies to: 51-51
86-88
: LGTM: Constructor follows dependency injection best practices
The constructor properly initializes the BuildAgentConfiguration field through constructor injection, maintaining clean dependency management.
Line range hint 170-180
: LGTM: Improved exception handling and logging
The enhanced exception handling for cancelled jobs and the addition of the exception parameter to log.error improves error tracking and debugging capabilities.
Also applies to: 237-237
147-148
:
Check for uninitialized executor service
The direct use of buildAgentConfiguration.getBuildExecutor()
without a null check could lead to a NullPointerException.
Apply this diff to add a null check:
+if (buildAgentConfiguration.getBuildExecutor() == null) {
+ throw new LocalCIException("Build executor service is not initialized.");
+}
future = buildAgentConfiguration.getBuildExecutor().submit(buildJob);
src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildJobContainerService.java (3)
62-62
: LGTM: Clean dependency injection of BuildAgentConfiguration
The refactoring to use BuildAgentConfiguration instead of direct DockerClient injection improves modularity and aligns with the PR's objective of managing Docker client lifecycle.
Also applies to: 80-83
400-403
: LGTM: Proper Docker command execution handling
The Docker command execution methods have been properly updated to use BuildAgentConfiguration while maintaining robust error handling and resource cleanup.
Also applies to: 406-439
452-454
: LGTM: Efficient container lookup implementation
The container listing and filtering implementation is clean and efficient, properly handling null cases and container name matching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildJobExecutionService.java (2)
Line range hint
320-365
: Consider improving error handling and cleanup logicThe current implementation has several areas that could be enhanced:
- Extract the cleanup logic into a separate method to improve readability and maintainability
- Consider using a more structured approach for handling cleanup failures
- Implement proper exception chaining to preserve the root cause
Here's a suggested refactor:
finally { - buildJobContainerService.stopContainer(containerName); - - // Delete the cloned repositories - deleteCloneRepo(assignmentRepositoryUri, assignmentRepoCommitHash, buildJob.id(), assignmentRepositoryPath); - deleteCloneRepo(testRepositoryUri, assignmentRepoCommitHash, buildJob.id(), testsRepositoryPath); - // do not try to delete the temp repository if it does not exist or is the same as the assignment reposity - if (solutionRepositoryUri != null && !Objects.equals(assignmentRepositoryUri.repositorySlug(), solutionRepositoryUri.repositorySlug())) { - deleteCloneRepo(solutionRepositoryUri, assignmentRepoCommitHash, buildJob.id(), solutionRepositoryPath); - } - - for (int i = 0; i < auxiliaryRepositoriesUris.length; i++) { - deleteCloneRepo(auxiliaryRepositoriesUris[i], assignmentRepoCommitHash, buildJob.id(), auxiliaryRepositoriesPaths[i]); - } - - try { - deleteRepoParentFolder(assignmentRepoCommitHash, assignmentRepositoryPath, testRepoCommitHash, testsRepositoryPath); - } - catch (IOException e) { - msg = "Could not delete " + CHECKED_OUT_REPOS_TEMP_DIR + " directory"; - buildLogsMap.appendBuildLogEntry(buildJob.id(), msg); - log.error(msg, e); - } + cleanupResources(containerName, buildJob.id(), + new CleanupContext( + assignmentRepositoryUri, testRepositoryUri, solutionRepositoryUri, + auxiliaryRepositoriesUris, assignmentRepositoryPath, testsRepositoryPath, + solutionRepositoryPath, auxiliaryRepositoriesPaths, + assignmentRepoCommitHash, testRepoCommitHash + ) + ); } + private record CleanupContext( + VcsRepositoryUri assignmentRepositoryUri, VcsRepositoryUri testRepositoryUri, + VcsRepositoryUri solutionRepositoryUri, VcsRepositoryUri[] auxiliaryRepositoriesUris, + Path assignmentRepositoryPath, Path testsRepositoryPath, + Path solutionRepositoryPath, Path[] auxiliaryRepositoriesPaths, + String assignmentRepoCommitHash, String testRepoCommitHash + ) {} + + private void cleanupResources(String containerName, String buildJobId, CleanupContext ctx) { + List<Exception> suppressedExceptions = new ArrayList<>(); + + try { + buildJobContainerService.stopContainer(containerName); + } catch (Exception e) { + suppressedExceptions.add(e); + log.error("Failed to stop container {}", containerName, e); + } + + cleanupRepositories(buildJobId, ctx, suppressedExceptions); + + if (!suppressedExceptions.isEmpty()) { + LocalCIException ex = new LocalCIException("Cleanup failed with multiple errors"); + suppressedExceptions.forEach(ex::addSuppressed); + log.warn("Cleanup completed with {} errors", suppressedExceptions.size(), ex); + } + }
Line range hint
1-24
: Consider architectural improvements for better maintainabilityThe class has several areas that could benefit from architectural improvements:
- The class handles multiple concerns (build execution, Docker management, Git operations)
- The
runScriptAndParseResults
method has 12 parameters, making it hard to maintain- The logging pattern is duplicated across methods
Consider:
- Extracting Docker-specific operations into a dedicated
BuildDockerService
- Creating a
BuildContext
record to encapsulate build parameters- Implementing a proper logging facade to reduce duplication
Example of a
BuildContext
record:public record BuildContext( BuildJobQueueItem buildJob, String containerName, String containerId, RepositoryContext repositories, String assignmentRepoCommitHash, String testRepoCommitHash ) { public record RepositoryContext( VcsRepositoryUri assignmentUri, VcsRepositoryUri testUri, VcsRepositoryUri solutionUri, VcsRepositoryUri[] auxiliaryUris, Path assignmentPath, Path testsPath, Path solutionPath, Path[] auxiliaryPaths ) {} }Also applies to: 303-314
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (1)
src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildJobExecutionService.java
(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildJobExecutionService.java (1)
Pattern src/main/java/**/*.java
: naming:CamelCase; principles:{single_responsibility,small_methods,no_duplication}; db:{perf_queries,datetime_not_timestamp}; rest:{stateless,singleton,delegate_logic,http_only,minimal_dtos}; dtos:{java_records,no_entities,min_data,single_resp}; di:constructor_injection; kiss:simple_code; file_handling:os_indep_paths; practices:{least_access,avoid_transactions,code_reuse,static_member_ref,prefer_primitives}; sql:{param_annotation,uppercase,avoid_subqueries};java:avoid_star_imports
🔇 Additional comments (1)
src/main/java/de/tum/cit/aet/artemis/buildagent/service/BuildJobExecutionService.java (1)
303-306
: LGTM: Well-structured logging additions
The new logging statements enhance observability by tracking critical steps in the build process, following the established logging pattern.
Also applies to: 311-314
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (5)
src/main/java/de/tum/cit/aet/artemis/buildagent/service/SharedQueueProcessingService.java (3)
343-345
: Consider using more specific exception typesWhile the error handling is improved, catching a generic
Exception
might mask specific issues. Consider catching more specific exceptions that could occur during the update process.- catch (Exception e) { + catch (IllegalStateException | HazelcastInstanceNotActiveException e) { log.error("Error while updating build agent information for agent with address {}", memberAddress, e); }
559-563
: Consider adding error handling for container cleanupWhile the service reinitialization looks good, the container cleanup operation might fail silently. Consider adding error handling to log any issues during cleanup.
buildAgentConfiguration.openBuildAgentServices(); // Cleanup docker containers - buildAgentDockerService.cleanUpContainers(); + try { + buildAgentDockerService.cleanUpContainers(); + log.info("Successfully cleaned up docker containers"); + } catch (Exception e) { + log.error("Failed to clean up docker containers", e); + }
582-586
: Consider caching the executor referenceWhile the availability check is comprehensive, calling
getBuildExecutor()
multiple times could be optimized by storing the reference locally.- var buildExecutorService = buildAgentConfiguration.getBuildExecutor(); + final var buildExecutorService = buildAgentConfiguration.getBuildExecutor(); log.debug("Currently processing jobs on this node: {}, active threads in Pool: {}, maximum pool size of thread executor : {}", localProcessingJobs.get(), buildExecutorService.getActiveCount(), buildExecutorService.getMaximumPoolSize()); - return localProcessingJobs.get() < buildExecutorService.getMaximumPoolSize() && buildExecutorService.getActiveCount() < buildExecutorService.getMaximumPoolSize() - && buildExecutorService.getQueue().isEmpty(); + final int maxPoolSize = buildExecutorService.getMaximumPoolSize(); + return localProcessingJobs.get() < maxPoolSize + && buildExecutorService.getActiveCount() < maxPoolSize + && buildExecutorService.getQueue().isEmpty();src/test/java/de/tum/cit/aet/artemis/shared/base/AbstractSpringIntegrationLocalCILocalVCTest.java (2)
135-137
: Consider making the dockerClientMock field final.The
dockerClientMock
field is only assigned once in@BeforeAll
and never modified afterward. Making it final would better express its immutable nature and thread-safety intentions.- private static DockerClient dockerClientMock; + private static final DockerClient dockerClientMock;Also applies to: 226-227
287-293
: Consider extracting the callback logic into a separate method.The lambda expression in the
doAnswer
block could be more readable if extracted into a named method that clearly expresses its purpose.+private static void simulateCommandCompletion(ResultCallback.Adapter<?> callback) { + callback.onComplete(); +} -doAnswer(invocation -> { - ResultCallback.Adapter<?> callback = invocation.getArgument(0); - callback.onComplete(); - return null; -}).when(execStartCmd).exec(any()); +doAnswer(invocation -> { + simulateCommandCompletion(invocation.getArgument(0)); + return null; +}).when(execStartCmd).exec(any());
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/main/java/de/tum/cit/aet/artemis/buildagent/service/SharedQueueProcessingService.java
(9 hunks)src/test/java/de/tum/cit/aet/artemis/shared/base/AbstractSpringIntegrationLocalCILocalVCTest.java
(5 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
src/test/java/de/tum/cit/aet/artemis/shared/base/AbstractSpringIntegrationLocalCILocalVCTest.java (1)
Pattern src/test/java/**/*.java
: test_naming: descriptive; test_size: small_specific; fixed_data: true; junit5_features: true; assert_use: assertThat; assert_specificity: true; archunit_use: enforce_package_rules; db_query_count_tests: track_performance; util_service_factory_pattern: true; avoid_db_access: true; mock_strategy: static_mocks; context_restart_minimize: true
src/main/java/de/tum/cit/aet/artemis/buildagent/service/SharedQueueProcessingService.java (1)
Pattern src/main/java/**/*.java
: naming:CamelCase; principles:{single_responsibility,small_methods,no_duplication}; db:{perf_queries,datetime_not_timestamp}; rest:{stateless,singleton,delegate_logic,http_only,minimal_dtos}; dtos:{java_records,no_entities,min_data,single_resp}; di:constructor_injection; kiss:simple_code; file_handling:os_indep_paths; practices:{least_access,avoid_transactions,code_reuse,static_member_ref,prefer_primitives}; sql:{param_annotation,uppercase,avoid_subqueries};java:avoid_star_imports
🔇 Additional comments (7)
src/main/java/de/tum/cit/aet/artemis/buildagent/service/SharedQueueProcessingService.java (4)
131-140
: LGTM: Constructor changes align with dependency injection best practices
The addition of BuildAgentConfiguration
and BuildAgentDockerService
as constructor dependencies follows proper dependency injection patterns and supports the reinitialization improvements.
272-275
: LGTM: Improved error handling with comprehensive executor metrics
The error handling now includes detailed metrics about the executor's state, which will help with debugging rejected executions. The retry mechanism with a maximum of 5 attempts provides a good balance between reliability and avoiding infinite retries.
355-356
: LGTM: Robust handling of executor configuration
The code properly handles both cases where the executor is available and when it's not, falling back to the thread pool size configuration. This makes the system more resilient to different states of the build agent.
524-525
: LGTM: Proper cleanup of resources during pause
The code properly closes the build executor and docker client when pausing the build agent, which aligns with the PR's objective of improving reinitialization.
src/test/java/de/tum/cit/aet/artemis/shared/base/AbstractSpringIntegrationLocalCILocalVCTest.java (3)
12-19
: LGTM: Import statements are well-organized and specific.
The new Mockito-related imports are correctly added and properly organized.
342-346
: LGTM: Build agent services mock setup is concise and clear.
The mockBuildAgentServices
method correctly sets up the Docker client mock for each test.
228-340
: Add unit tests for the Docker client mock setup.
The extensive Docker client mocking setup should be verified with unit tests to ensure it behaves correctly across all scenarios.
...est/java/de/tum/cit/aet/artemis/shared/base/AbstractSpringIntegrationLocalCILocalVCTest.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested on TS3, always queued 10 jobs and paused artemis node 1 and 2. Jobs were successfully finished and no new jobs were queued to those nodes. After that i queued in 10 build jobs again and resumed one of the artemis nodes. Then 2 agents were working. Then I tried it again by using the resume all button. Everything works as described 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested on TS3:
- Submitted a programming exercise 5 times with build agents active; all submissions were built successfully.
- Paused the build agents and submitted 4 more times; these submissions went into the queue as expected.
- Resumed the agents and observed that all queued submissions were built without issues.
works as expected 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code reapprove
Checklist
General
Server
Changes affecting Programming Exercises
Motivation and Context
We sometimes get this error when a build agent has been running for some time
Description
We reinitialize Docker Client and ThreadPoolExecutor when pausing build agent.
Steps for Testing
Prerequisites:
Testserver States
Note
These badges show the state of the test servers.
Green = Currently available, Red = Currently locked
Click on the badges to get to the test servers.
Review Progress
Performance Review
Code Review
Manual Tests
Test Coverage
Screenshots
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests