-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add spotVM and maxRunDuration feature to VM provisioning #492
base: develop
Are you sure you want to change the base?
Add spotVM and maxRunDuration feature to VM provisioning #492
Conversation
src/main/java/com/google/jenkins/plugins/computeengine/InstanceConfiguration.java
Fixed
Show fixed
Hide fixed
src/main/java/com/google/jenkins/plugins/computeengine/InstanceConfiguration.java
Fixed
Show fixed
Hide fixed
src/main/java/com/google/jenkins/plugins/computeengine/InstanceConfiguration.java
Show resolved
Hide resolved
src/main/java/com/google/jenkins/plugins/computeengine/InstanceConfiguration.java
Outdated
Show resolved
Hide resolved
...es/com/google/jenkins/plugins/computeengine/InstanceConfiguration/help-provisioningType.html
Outdated
Show resolved
Hide resolved
src/main/resources/com/google/jenkins/plugins/computeengine/InstanceConfiguration/config.jelly
Outdated
Show resolved
Hide resolved
return "Spot VM"; | ||
} | ||
|
||
public FormValidation doCheckMaxRunDurationSeconds(@QueryParameter String value) { |
Check warning
Code scanning / Jenkins Security Scan
Stapler: Missing POST/RequirePOST annotation Warning
return "Spot VM"; | ||
} | ||
|
||
public FormValidation doCheckMaxRunDurationSeconds(@QueryParameter String value) { |
Check warning
Code scanning / Jenkins Security Scan
Stapler: Missing permission check Warning
return "Standard"; | ||
} | ||
|
||
public FormValidation doCheckMaxRunDurationSeconds(@QueryParameter String value) { |
Check warning
Code scanning / Jenkins Security Scan
Stapler: Missing POST/RequirePOST annotation Warning
return "Standard"; | ||
} | ||
|
||
public FormValidation doCheckMaxRunDurationSeconds(@QueryParameter String value) { |
Check warning
Code scanning / Jenkins Security Scan
Stapler: Missing permission check Warning
src/main/java/com/google/jenkins/plugins/computeengine/ui/helpers/Utils.java
Fixed
Show fixed
Hide fixed
src/main/java/com/google/jenkins/plugins/computeengine/ui/helpers/Utils.java
Fixed
Show fixed
Hide fixed
recording and screenshots moved to the summary at the top ⬆️ |
I will work on the testing as,
Edit: all these points are completed ✔️ (automated tests are written both unittests and integration test). When maxRunDuration deletes the VM in GCP side, the controller also deletes the slave, which is correct no issues for builds ✅ (already covered in the integration test) |
@@ -58,7 +58,7 @@ public static void teardown() throws IOException { | |||
|
|||
@Test | |||
public void testGetImage() throws Exception { | |||
Image image = client.getImage("debian-cloud", "debian-9-stretch-v20180820"); | |||
Image image = client.getImage("debian-cloud", "debian-12-bookworm-v20241210"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
debian-9
is already deprecated, this is throwing 404 image not found errors,
➜ ~ gcloud compute images list --project=debian-cloud --no-standard-images --show-deprecated | grep debian-9-stretch-v20180820
debian-9-stretch-v20180820 debian-cloud debian-9 DEPRECATED READY
? String.format( | ||
"projects/%s/global/images/%s", BOOT_DISK_PROJECT_ID, System.getenv("GOOGLE_BOOT_DISK_IMAGE_NAME")) | ||
: "projects/debian-cloud/global/images/family/debian-9"; | ||
: "projects/debian-cloud/global/images/family/debian-12"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is better to support the GOOGLE_BOOT_DISK_PROJECT_ID
and GOOGLE_BOOT_DISK_IMAGE_NAME
even for linux too, because the base debian images don't have java installed.
See the new document file being added in this PR integration-tests.md
...n/resources/com/google/jenkins/plugins/computeengine/ui/helpers/Standard/config-detail.jelly
Outdated
Show resolved
Hide resolved
…plugin into add-max-run-duration-to-vm-provisioning
} | ||
return null; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test credentials are created using this constructor, where it is passing an empty id
.
https://github.com/jenkinsci/google-oauth-plugin/blob/5367a8a1b9c97e14a5188150f897fa91da1d9777/src/main/java/com/google/jenkins/plugins/credentials/oauth/GoogleRobotPrivateKeyCredentials.java#L65-L70
which ends up at https://github.com/jenkinsci/credentials-plugin/blob/1e306e8f6c54f8eeef973b3387b8c223bd60ae1c/src/main/java/com/cloudbees/plugins/credentials/common/IdCredentials.java#L87-L96
thus having a random uuid.
seems like at somepoint the google-credential-plugin might have got this behavior (or sending blank id, instead of projectid).
As the integration tests in this repository are not run often, this never got brought up..
👋 Completed the PR from draft to ready, plz help review @jglick , @Vlatombe , @nevingeorgesunny |
src/main/java/com/google/jenkins/plugins/computeengine/InstanceConfiguration.java
Outdated
Show resolved
Hide resolved
src/main/java/com/google/jenkins/plugins/computeengine/InstanceConfiguration.java
Show resolved
Hide resolved
src/main/java/com/google/jenkins/plugins/computeengine/InstanceConfiguration.java
Outdated
Show resolved
Hide resolved
src/main/java/com/google/jenkins/plugins/computeengine/InstanceConfiguration.java
Outdated
Show resolved
Hide resolved
I am leaving my review as purely comments because I am going on vacation and don't want to block this while I'm off. |
(@Artmorse has also worked on this plugin recently.) |
@@ -66,8 +65,11 @@ | |||
*/ | |||
@Log | |||
public class ComputeEngineCloudRestartPreemptedIT { | |||
@ClassRule | |||
public static Timeout timeout = new Timeout(20 * TEST_TIMEOUT_MULTIPLIER, TimeUnit.MINUTES); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the Timeout
wasn't working, the JenkinsRule was still timing out at 180s, so fixing it with setting a property
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add @WithTimeout(1200)
at test level.
@@ -236,6 +245,19 @@ public void setCreateSnapshot(boolean createSnapshot) { | |||
this.createSnapshot = createSnapshot && this.oneShot; | |||
} | |||
|
|||
/** | |||
* Required for JCasC Compatibility, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JCasC never guarantees backwards compatibility, and users should be aware of that. IMO fix the test instead.
|
||
package com.google.jenkins.plugins.computeengine.ui.helpers; | ||
|
||
public enum ProvisioningTypeValue { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using an enum here kind of defeats the purpose of having extensibility. I think it is not needed anyway as it is only used for casc compatibility which is not usually kept as per Bobby. But just in case you face this use case in the future, I would recommend either adding a specific check method returning boolean on the base implementation, or a marker interface that PreemptibleVm could implement
@@ -590,6 +627,11 @@ public static SshConfiguration defaultSshConfiguration() { | |||
return SshConfiguration.builder().customPrivateKeyCredentialsId("").build(); | |||
} | |||
|
|||
@SuppressWarnings("unused") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SuppressWarnings("unused") | |
@SuppressWarnings("unused") // jelly |
@@ -943,6 +985,11 @@ public FormValidation doCheckNumExecutorsStr( | |||
return FormValidation.ok(); | |||
} | |||
|
|||
@SuppressWarnings("unused") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SuppressWarnings("unused") | |
@SuppressWarnings("unused") // jelly |
assertEquals("Zero configurations found", 1, cloud.getConfigurations().size()); | ||
InstanceConfiguration configuration = cloud.getConfigurations().get(0); | ||
assertEquals( | ||
"Provisioning type is wrong", | ||
ProvisioningTypeValue.PREEMPTIBLE, | ||
configuration.getProvisioningType().getValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming you get rid of the enum (or not)
assertEquals("Zero configurations found", 1, cloud.getConfigurations().size()); | |
InstanceConfiguration configuration = cloud.getConfigurations().get(0); | |
assertEquals( | |
"Provisioning type is wrong", | |
ProvisioningTypeValue.PREEMPTIBLE, | |
configuration.getProvisioningType().getValue()); | |
assertThat(cloud.getConfigurations(), contains(instanceOf(PreemptibleVm.class))); |
@@ -66,8 +65,11 @@ | |||
*/ | |||
@Log | |||
public class ComputeEngineCloudRestartPreemptedIT { | |||
@ClassRule | |||
public static Timeout timeout = new Timeout(20 * TEST_TIMEOUT_MULTIPLIER, TimeUnit.MINUTES); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add @WithTimeout(1200)
at test level.
Summary
This PR aims to enhance the VM provisioning capabilities of the
google-compute-engine-plugin
by introducing support for Spot VMs andmaxRunDuration
. This effort seeks to address the issues reported in #408, #473, and #358.Enhancements
maxRunDuration
feature, enabling the automatic deletion of VMs after a specified duration to optimize resource utilization and cost.Rationale
Although this plugin already has a background cleanup job for leftover orphan agent VMs, setting the
maxRunDuration
ensures that VMs deletion still happens even if the Jenkins controller itself has crashed. Besides this is an optional setting with no defaults enforced, so this feature is not trying to replace/interfere with the existing background job.Technical Background
The plugin now supports the following VM options:
maxRunDuration
.Why Maintain Support for Both Spot and Preemptible VMs?
Despite the similarities between Spot and Preemptible VMs, retaining support for both options is essential for the following reasons:
Testing Status
Manual Tests: Screen recording and screenshots
User experience configuration page
provisioning_type_impl_2.mov
VM Provisioned in GCP
→ Unit tests for the
scheduling()
method, ensuring the compatibility with the old and new configurations.New integration test for the Spot VM with Max Duration: logs
(cloudbees internal ticket link for back reference)
Submitter checklist