🌱Add initial Rosa machine pool integration tests #5214

Open
wants to merge 2 commits into base: main

Conversation

PanSpagetka

@PanSpagetka PanSpagetka commented Nov 12, 2024

What type of PR is this?
Adding tests. I am not sure what kind should apply, so I am not applying any.

What this PR does / why we need it:

Adding basic integration tests for ROSAMachinePoolReconciler: one test case for creating a new machine pool and one for deleting it.

To be able to mock OCM and STS calls, I had to do a small refactoring.

Checklist:

  • squashed commits
  • includes documentation
  • includes emojis
  • adds unit tests
  • adds or updates e2e tests

Release note:

NONE

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. labels Nov 12, 2024

linux-foundation-easycla bot commented Nov 12, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign richardcase for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added needs-priority needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Nov 12, 2024
@k8s-ci-robot
Contributor

Welcome @PanSpagetka!

It looks like this is your first PR to kubernetes-sigs/cluster-api-provider-aws 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/cluster-api-provider-aws has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Nov 12, 2024
@k8s-ci-robot
Contributor

Hi @PanSpagetka. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Nov 12, 2024
@PanSpagetka PanSpagetka force-pushed the rosa-initial-integration-test branch from 9e2fe4e to 2ce8f3d Compare November 12, 2024 09:49
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Nov 12, 2024
@PanSpagetka PanSpagetka force-pushed the rosa-initial-integration-test branch from 2ce8f3d to abce607 Compare November 12, 2024 10:03
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 12, 2024
@PanSpagetka PanSpagetka force-pushed the rosa-initial-integration-test branch 3 times, most recently from f2c32f1 to abd21bc Compare November 18, 2024 10:38
@PanSpagetka PanSpagetka changed the title WIP: 🌱Add initial Rosa machine pool integration tests 🌱Add initial Rosa machine pool integration tests Nov 18, 2024
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 18, 2024
@serngawy
Contributor

@nrb, would you add ok-to-test?

main.go Outdated
@@ -238,6 +239,8 @@ func main() {
WatchFilterValue: watchFilterValue,
WaitInfraPeriod: waitInfraPeriod,
Endpoints: awsServiceEndpoints,
NewOCMClient: rosa.NewOCMClient,
Contributor

It is not a good idea to create the clients here in main. Why do you need to do that? For the integration test you can just mock the clients.

Contributor

No need to expose and init the clients in main. We can init the clients (STS and OCM) internally in the struct, and for the integration test you can mock them similarly to this here.

Author

OK, I moved the initialization to SetupWithManager, which seems to me to be the only ROSAMachinePoolReconciler method where it fits. I am using the same mechanism to create the mock implementation as for the STS API.
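
The approach discussed in this thread (keep the client constructor as a field on the reconciler, default it during setup, and let tests inject a fake) can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the PR's actual code: `Reconciler`, `Setup`, `realClient`, `fakeClient`, and the one-method `OCMClient` interface are all hypothetical stand-ins.

```go
package main

import (
	"errors"
	"fmt"
)

// OCMClient is a hypothetical, trimmed-down stand-in for the mockable interface.
type OCMClient interface {
	DeleteNodePool(clusterID, nodePoolID string) error
}

// realClient stands in for a client backed by a real OCM connection.
type realClient struct{}

func (realClient) DeleteNodePool(clusterID, nodePoolID string) error {
	return errors.New("no real OCM connection in this sketch")
}

// newRealClient plays the role of the production constructor.
func newRealClient() (OCMClient, error) { return realClient{}, nil }

// Reconciler keeps the constructor as a field so tests can replace it.
type Reconciler struct {
	NewOCMClient func() (OCMClient, error)
}

// Setup installs the production constructor unless one was already injected.
func (r *Reconciler) Setup() {
	if r.NewOCMClient == nil {
		r.NewOCMClient = newRealClient
	}
}

// fakeClient records calls instead of talking to OCM.
type fakeClient struct{ deleted []string }

func (f *fakeClient) DeleteNodePool(clusterID, nodePoolID string) error {
	f.deleted = append(f.deleted, clusterID+"/"+nodePoolID)
	return nil
}

func main() {
	fake := &fakeClient{}
	r := &Reconciler{NewOCMClient: func() (OCMClient, error) { return fake, nil }}
	r.Setup() // the injected fake is kept because the field is already set

	c, err := r.NewOCMClient()
	if err != nil {
		panic(err)
	}
	if err := c.DeleteNodePool("cluster-1", "node-pool-1"); err != nil {
		panic(err)
	}
	fmt.Println("recorded calls:", fake.deleted)
}
```

With this shape, main.go does not need to know about the clients at all, which is what the reviewer asked for.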

pkg/rosa/idps.go Outdated
@@ -4,7 +4,6 @@ import (
"fmt"

cmv1 "github.com/openshift-online/ocm-sdk-go/clustersmgmt/v1"
"github.com/openshift/rosa/pkg/ocm"
Contributor

We are using github.com/openshift/rosa/pkg/ocm to avoid duplicating the effort of creating a library that communicates with OCM. Please keep using it for this first integration test iteration.

Author

We can't use github.com/openshift/rosa/pkg/ocm directly here, because it doesn't expose an interface that we could mock. So I had to create a mockable interface on our side to be able to mock the OCM calls.

The developers of github.com/openshift/rosa/pkg/ocm are planning to create (and I hope also expose) the OCM calls as an interface to increase the testability of OCM code, so we could get rid of this in the future. But I am not sure how long it will take them.
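
The workaround described above relies on Go interfaces being satisfied implicitly: the consumer can declare an interface listing only the methods it calls, and the library's concrete type satisfies it without any change on the library side. A rough sketch, assuming a hypothetical `ThirdPartyClient` in place of the real `ocm.Client` and a hypothetical `NodePoolGetter` interface:

```go
package main

import "fmt"

// ThirdPartyClient stands in for a concrete client from an external library
// that ships no interface (hypothetical, not the real ocm.Client).
type ThirdPartyClient struct{}

// GetNodePool would perform a network call in a real client.
func (c *ThirdPartyClient) GetNodePool(clusterID, nodePoolID string) (string, bool, error) {
	return "", false, fmt.Errorf("network call not available in this sketch")
}

// NodePoolGetter is declared on the consumer side and lists only what is used.
type NodePoolGetter interface {
	GetNodePool(clusterID, nodePoolID string) (string, bool, error)
}

// Compile-time check that the library type satisfies the local interface.
var _ NodePoolGetter = (*ThirdPartyClient)(nil)

// stubGetter is a hand-written test double backed by a map.
type stubGetter struct{ pools map[string]string }

func (s stubGetter) GetNodePool(clusterID, nodePoolID string) (string, bool, error) {
	p, ok := s.pools[clusterID+"/"+nodePoolID]
	return p, ok, nil
}

// nodePoolExists depends only on the interface, so it works with either type.
func nodePoolExists(c NodePoolGetter, clusterID, nodePoolID string) (bool, error) {
	_, found, err := c.GetNodePool(clusterID, nodePoolID)
	return found, err
}

func main() {
	stub := stubGetter{pools: map[string]string{"cluster-1/node-pool-1": "pool"}}
	found, err := nodePoolExists(stub, "cluster-1", "node-pool-1")
	fmt.Println(found, err)
}
```

If the upstream library later exports its own interface, the local one can be dropped without touching the callers.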

pkg/rosa/idps.go Outdated
@@ -14,7 +13,7 @@ const (

// CreateAdminUserIfNotExist creates a new admin user withe username/password in the cluster if username doesn't already exist.
// the user is granted admin privileges by being added to a special IDP called `cluster-admin` which will be created if it doesn't already exist.
func CreateAdminUserIfNotExist(client *ocm.Client, clusterID, username, password string) error {
Contributor

We are using github.com/openshift/rosa/pkg/ocm to avoid duplicating the effort of creating a library that communicates with OCM. Please keep using it for this first integration test iteration.

Author

Same as above

Contributor

okay, just github duplication

@AndiDog
Contributor

AndiDog commented Nov 19, 2024

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 19, 2024
Contributor

@serngawy serngawy left a comment

Please add the skeleton for the RosaControlPlane_controller integration test as well; it is as important as the RosaMachinePool integration test.

token, url, err := ocmCredentials(ctx, rosaScope)
if err != nil {
- return nil, err
+ return ocmclient{}, err
Contributor

return nil

}

// NewMockOCMClient creates a new empty ocm.Client without any real connection.
func NewMockOCMClient(ctx context.Context, rosaScope *scope.ROSAControlPlaneScope) (OCMClient, error) {
Contributor

No need for function params. In fact, it is not a function, just a variable:

c := ocmclient{ ocmClient: ocm.Client{}, }

@PanSpagetka PanSpagetka force-pushed the rosa-initial-integration-test branch from 2cbbf46 to cdc81e8 Compare November 26, 2024 08:14
@PanSpagetka
Author

/test pull-cluster-api-provider-aws-test

Contributor

@serngawy serngawy left a comment

I added some comments. Please add a test for rosacontrolplane_controller.go as well.

@@ -203,7 +211,7 @@ func (r *ROSAControlPlaneReconciler) reconcileNormal(ctx context.Context, rosaSc
}
}

- ocmClient, err := rosa.NewOCMClient(ctx, rosaScope)
+ ocmClient, err := r.NewOCMClient(ctx, rosaScope)
Contributor

Check for nil; r.NewOCMClient could be nil.

Contributor

You forgot this: if r.NewOCMClient != nil

@@ -186,7 +194,7 @@ func (r *ROSAMachinePoolReconciler) reconcileNormal(ctx context.Context,
}
}

- ocmClient, err := rosa.NewOCMClient(ctx, rosaControlPlaneScope)
+ ocmClient, err := r.NewOCMClient(ctx, rosaControlPlaneScope)
Contributor

same, check for nil

Template: clusterv1.MachineTemplateSpec{
Spec: clusterv1.MachineSpec{
ClusterName: ownerCluster.Name,
},
Contributor

We should add the infrastructureRef pointing to the RosaMachinePool to have a proper test.

Kind: "ROSAMachinePool",
APIVersion: expinfrav1.GroupVersion.String(),
},
Spec: expinfrav1.RosaMachinePoolSpec{},
Contributor

We should at least set the version; otherwise it should fail.


result, err := r.Reconcile(ctx, req)

g.Expect(err).ToNot(HaveOccurred())
Contributor

Please add more validations; check the RosaMachinePool.status.

defer teardown()

deleteTime := metav1.NewTime(time.Now().Add(5 * time.Second))
rosaMachinePool.ObjectMeta.Finalizers = []string{"finalizer-rosa"}
Contributor

Not sure what you are testing here, but the finalizer should be added by the RosaMachinePool controller.

Author


result, err := r.Reconcile(ctx, req)
g.Expect(err).ToNot(HaveOccurred())
g.Expect(result).To(Equal(ctrl.Result{}))
Contributor

You need to validate that the RosaMachinePool was deleted.

Author

I am testing the DeleteNodePool call. It seems that reconcileDelete doesn't modify the ROSAMachinePool object directly.

Contributor

The metadata.deletionTimestamp should be set and then the finalizer should be removed. Try to Get the rosamachinepool; it should give you a not-exist error. If you have an issue, it is most likely in the node delete here

@@ -20,16 +22,142 @@ const (
ocmAPIURLKey = "ocmApiUrl"
)

type ocmclient struct {
Contributor

Use OCMClient, not ocmclient, and please move the OCMClient struct, interface, and funcs to another file, e.g. ocmclient.go.

@PanSpagetka PanSpagetka force-pushed the rosa-initial-integration-test branch from cdc81e8 to 23e5a17 Compare November 28, 2024 12:27
@PanSpagetka PanSpagetka marked this pull request as draft November 28, 2024 12:56
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 28, 2024
@PanSpagetka PanSpagetka force-pushed the rosa-initial-integration-test branch 2 times, most recently from 975b9c4 to ee2cdfd Compare December 2, 2024 15:50
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Dec 2, 2024
@PanSpagetka PanSpagetka force-pushed the rosa-initial-integration-test branch 2 times, most recently from df4b31b to 8dd9c18 Compare December 9, 2024 11:35

"sigs.k8s.io/controller-runtime/pkg/client"

"sigs.k8s.io/cluster-api-provider-aws/v2/pkg/cloud/scope"
)

const (
Contributor

No need to move this part to the new ocmclient file. Leaving it in place avoids conflicts and a rebase with another in-progress effort that changes it.

@PanSpagetka PanSpagetka force-pushed the rosa-initial-integration-test branch 3 times, most recently from 59c2caf to 0034cfe Compare December 12, 2024 10:38
@PanSpagetka PanSpagetka marked this pull request as ready for review December 16, 2024 08:14
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 16, 2024
@PanSpagetka PanSpagetka force-pushed the rosa-initial-integration-test branch 3 times, most recently from 59adac6 to 55d1d29 Compare December 16, 2024 14:21
Contributor

@serngawy serngawy left a comment

We are almost there :) A few comments, and please run make lint-fix before pushing your changes to fix all the golang-ci job issues.

@@ -424,14 +432,16 @@ func (r *ROSAControlPlaneReconciler) reconcileClusterVersion(rosaScope *scope.RO
return nil
}

- scheduledUpgrade, err := rosa.CheckExistingScheduledUpgrade(ocmClient, cluster)
+ c := ocmClient.(*ocm.Client)
+ scheduledUpgrade, err := rosa.CheckExistingScheduledUpgrade(c, cluster)
Contributor

Better naming: rosaOCMClient := ocmClient.(*ocm.Client)
Or inline it: scheduledUpgrade, err := rosa.CheckExistingScheduledUpgrade(ocmClient.(*ocm.Client), cluster)
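
A side note on the assertion itself: the single-value form `x.(T)` panics when the interface holds a different dynamic type, which is what would happen if a mock were passed where `*ocm.Client` is expected. The comma-ok form avoids that. The sketch below uses hypothetical `concreteClient` and `mockClient` types in place of the real ones and is only meant to show the pattern, not what the PR does.

```go
package main

import "fmt"

// OCMClient is a hypothetical placeholder interface.
type OCMClient interface{ Name() string }

// concreteClient stands in for the library's concrete client type.
type concreteClient struct{}

func (*concreteClient) Name() string { return "concrete" }

// mockClient stands in for a test double that satisfies the same interface.
type mockClient struct{}

func (*mockClient) Name() string { return "mock" }

// asConcrete uses the comma-ok form so a mock yields an error, not a panic.
func asConcrete(c OCMClient) (*concreteClient, error) {
	rosaOCMClient, ok := c.(*concreteClient)
	if !ok {
		return nil, fmt.Errorf("expected *concreteClient, got %T", c)
	}
	return rosaOCMClient, nil
}

func main() {
	if _, err := asConcrete(&mockClient{}); err != nil {
		fmt.Println("assertion failed cleanly:", err)
	}
	c, err := asConcrete(&concreteClient{})
	if err != nil {
		panic(err)
	}
	fmt.Println("got:", c.Name())
}
```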

if err != nil {
return fmt.Errorf("failed to get existing scheduled upgrades: %w", err)
}

if scheduledUpgrade == nil {
ack := (rosaScope.ControlPlane.Spec.VersionGate == rosacontrolplanev1.Acknowledge || rosaScope.ControlPlane.Spec.VersionGate == rosacontrolplanev1.AlwaysAcknowledge)
- scheduledUpgrade, err = rosa.ScheduleControlPlaneUpgrade(ocmClient, cluster, version, time.Now(), ack)
+ c := ocmClient.(*ocm.Client)
Contributor

same as above

@@ -334,7 +340,8 @@ func (r *ROSAMachinePoolReconciler) reconcileMachinePoolVersion(machinePoolScope
}

if scheduledUpgrade == nil {
- scheduledUpgrade, err = rosa.ScheduleNodePoolUpgrade(ocmClient, clusterID, nodePool, version, time.Now())
+ c := ocmClient.(*ocm.Client)
Contributor

same

// This is set by CAPI MachinePool reconcile
test.old.OwnerReferences = []metav1.OwnerReference{
{
Name: ownerMachinePool(i).Name,
Contributor

Usually in k8s projects we use the rand lib to generate a random string. If you can change it, that would be better; otherwise add a TODO note to change it.

InstanceType: "m5.large",
},
}
oc := ownerCluster(9)
Contributor

So here you are passing 9 to match the rosaMachinePool naming. It's better to use the rand lib to generate a random string and pass it to all the test CRs.
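
One way to follow this suggestion is to generate a single random suffix per test and derive every resource name from it, so the related CRs stay recognizably paired without a hard-coded index. Kubernetes projects commonly use `rand.String(n)` from `k8s.io/apimachinery/pkg/util/rand` for this; the sketch below uses `math/rand` from the standard library to stay dependency-free. The `testNames` struct and the name prefixes are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// alphabet is an arbitrary lowercase set chosen for this sketch.
const alphabet = "bcdfghjklmnpqrstvwxz2456789"

// randomSuffix returns n random characters drawn from alphabet.
func randomSuffix(n int) string {
	b := make([]byte, n)
	for i := range b {
		b[i] = alphabet[rand.Intn(len(alphabet))]
	}
	return string(b)
}

// testNames groups the names of the CRs that belong to one test case.
type testNames struct {
	Cluster      string
	ControlPlane string
	MachinePool  string
}

// newTestNames derives all names from one shared random suffix.
func newTestNames() testNames {
	s := randomSuffix(5)
	return testNames{
		Cluster:      "cluster-" + s,
		ControlPlane: "rosa-control-plane-" + s,
		MachinePool:  "rosa-machinepool-" + s,
	}
}

func main() {
	n := newTestNames()
	fmt.Println(n.Cluster, n.ControlPlane, n.MachinePool)
}
```

Mock expectations can then refer to `n.ControlPlane` instead of repeating a string literal.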

g.Expect(err).NotTo(HaveOccurred())
return nodePool, true, nil
}).Times(1)
m.DeleteNodePool("rosa-control-plane-9", "node-pool-1").DoAndReturn(func(clusterId string, nodePoolID string) error {
Contributor

Where does the name "rosa-control-plane-9" come from? If it is the cp CR, then cp.Name is better.


cpPh, err := patch.NewHelper(cp, testEnv)
cp.Status.Ready = true
cp.Status.ID = "rosa-control-plane-9"
Contributor

"rosa-control-plane-9" is repeated in many places in the code. Please define a var or reference it from the RosaControlPlane CR.


machinePoolScope.Close()
time.Sleep(50 * time.Millisecond)
m := &expinfrav1.ROSAMachinePool{}
Contributor

Better naming: rosaMachinePool := &expinfrav1.ROSAMachinePool{}

m := &expinfrav1.ROSAMachinePool{}
key := client.ObjectKey{Name: mp.Name, Namespace: ns.Name}
err4 := testEnv.Get(ctx, key, m)
g.Expect(err4).ToNot(HaveOccurred())
Contributor

I don't understand this. After deleting the RosaMachinePool we expect an IsNotFound error. Please check that the error is specifically IsNotFound, using the API mentioned, and not any other error.

Author

In https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/main/exp/controllers/rosamachinepool_controller.go#L295 the only function that might delete the CR is DeleteNodePool, which I am mocking. I could mock it so that it deletes the machinePool CR, but is that right? Does calling OCM actually delete the CR?

@PanSpagetka PanSpagetka force-pushed the rosa-initial-integration-test branch from 55d1d29 to e027cf5 Compare December 17, 2024 12:56
Contributor

@serngawy serngawy left a comment

Thanks Robin, looks good as initial test cases; more test cases will be added later.

@serngawy
Contributor

@AndiDog @damdo, would you review and lgtm this PR?

Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-priority ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

4 participants