
[ci] Fix flexible-ipam e2e config #6324

Open · wants to merge 1 commit into main from ipam-e2e-k8s

Conversation

@gran-vmv (Contributor) commented May 13, 2024:

Use k8s version 1.30.1-1.1.
Suppress cleanup errors when initializing e2e.
Enhance the redeploy-k8s determination logic.
Unmask the kubelet service after installation.

@gran-vmv gran-vmv self-assigned this May 14, 2024
@gran-vmv gran-vmv added area/test/e2e Issues or PRs related to Antrea specific end-to-end testing. area/test/infra Issues or PRs related to test infrastructure (Jenkins configuration, Ansible playbook, Kind wrappers) labels May 14, 2024
@tnqn (Member) left a comment:

Could you describe what's being fixed here? I see 4 unrelated changes in the commit, but it's unclear to me what bugs are behind them.

clean_ns "monitoring"
clean_ns "antrea-ipam-test"
clean_ns "antrea-test"
clean_ns "monitoring" || true
Member:

clean_ns already ignores some delete errors; why does it need to further ignore errors here?

Contributor Author:

On the flexible-ipam testbed, K8s may not be deployed yet at this point, so we need to ignore the error.
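
A minimal sketch of the failure mode being worked around, assuming a simplified clean_ns (the real helper lives in ci/jenkins/test.sh and may differ): the `|| true` keeps a `set -e` CI script alive when the API server is not reachable yet.

```bash
#!/usr/bin/env bash
set -eo pipefail  # typical CI setting: abort on any command failure

# Hypothetical simplified clean_ns; the real helper already tolerates
# "not found" errors, but kubectl still exits non-zero when there is
# no reachable cluster at all.
clean_ns() {
    kubectl delete ns "$1" --ignore-not-found=true --timeout=60s
}

# On a flexible-ipam testbed where K8s is not deployed yet, each call
# fails with a connection error; "|| true" prevents that from aborting
# the whole initialization script.
clean_ns "monitoring" || true
clean_ns "antrea-ipam-test" || true
clean_ns "antrea-test" || true
```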

Comment on lines 856 to 860
if [[ -n $POD_CIDRS ]]; then
    for POD_CIDR in "${POD_CIDRS[@]}"; do
        if [[ $POD_CIDR =~ .*:.* ]]; then
            (( HAS_IPV6++ ))
        else
            (( HAS_IPV4++ ))
        fi
    done
    if [[ ${IP_MODE} == "ipv4" ]]; then
        (( HAS_IPV4-- ))
    elif [[ ${IP_MODE} == "ipv6" ]]; then
        (( HAS_IPV6-- ))
    else
        (( HAS_IPV4++ ))
        (( HAS_IPV4-- ))
        (( HAS_IPV6-- ))
    fi
    if [ ${HAS_IPV4} -eq ${INITIAL_VALUE} ] && [ ${HAS_IPV6} -eq ${INITIAL_VALUE} ]; then
        return 0
Member:
The change is fine, but I also don't see what was wrong with the previous code.

Contributor Author:
We added the guard if [[ -n $POD_CIDRS ]]; then around this code, so the check is skipped when kubectl cannot get POD_CIDRS. Previously, the redeployment would be skipped here if K8s was not deployed and POD_CIDRS was empty.
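
A sketch of the pitfall described above, under the assumption that POD_CIDRS is populated from kubectl output (the exact query in test.sh may differ):

```bash
#!/usr/bin/env bash
INITIAL_VALUE=0
HAS_IPV4=${INITIAL_VALUE}
HAS_IPV6=${INITIAL_VALUE}

# Assumed query; when the API server is unreachable this yields an
# empty array rather than an error that stops the script.
POD_CIDRS=($(kubectl get nodes -o jsonpath='{.items[*].spec.podCIDR}' 2>/dev/null))

if [[ -n $POD_CIDRS ]]; then   # the new guard: empty -> skip the check
    for POD_CIDR in "${POD_CIDRS[@]}"; do
        if [[ $POD_CIDR =~ .*:.* ]]; then
            (( HAS_IPV6++ ))   # a ':' in the CIDR marks it as IPv6
        else
            (( HAS_IPV4++ ))
        fi
    done
fi

# Without the guard, an empty POD_CIDRS left both counters at
# INITIAL_VALUE, which looked exactly like "IP mode unchanged" and
# caused the K8s redeployment to be skipped on an undeployed testbed.
echo "ipv4=${HAS_IPV4} ipv6=${HAS_IPV6}"
```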

@@ -896,7 +898,7 @@ function redeploy_k8s_if_ip_mode_changes() {
     else
         NODE_IP_STRING="${IPV4S[i]},${IPV6S[i]}"
     fi
-    ssh -o StrictHostKeyChecking=no -i "${WORKDIR}/id_rsa" -n ubuntu@${IPV4S[i]} "echo \"KUBELET_EXTRA_ARGS=--node-ip=${NODE_IP_STRING}\" | sudo tee /etc/default/kubelet; sudo systemctl restart kubelet"
+    ssh -o StrictHostKeyChecking=no -i "${WORKDIR}/id_rsa" -n ubuntu@${IPV4S[i]} "echo \"KUBELET_EXTRA_ARGS=--node-ip=${NODE_IP_STRING}\" | sudo tee /etc/default/kubelet; sudo systemctl unmask kubelet; sudo systemctl restart kubelet"
Member:
I don't find any code masking the service; why does it need to be unmasked?

Contributor Author:
After we installed kubelet 1.28.9, the service was left in a masked state, so we need to unmask it first.
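
For context, a hedged sketch of how this can be verified on a node (not part of the PR itself): systemctl reports "masked" for such units, and unmasking is idempotent, which is why running it unconditionally before the restart is safe.

```bash
# "systemctl is-enabled" prints "masked" (and exits non-zero) for a
# masked unit, so the state can be inspected before restarting.
state=$(systemctl is-enabled kubelet 2>/dev/null || true)
echo "kubelet unit state: ${state}"

# Unmasking is a no-op when the unit is not masked, so the PR simply
# runs it unconditionally before the restart.
sudo systemctl unmask kubelet
sudo systemctl restart kubelet
```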

Contributor:
Is this service always masked in newer K8s?

Contributor Author:
The service may be masked for the K8s package in the new repo. We changed the apt repo on the testbed VMs since the old K8s repo is unavailable:
https://apt.kubernetes.io/ -> https://pkgs.k8s.io/core:/stable:/v1.30/deb/
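
For reference, the upstream migration steps for switching to the community-owned repo look roughly like this (the keyring path and the pinned version are assumptions here; the actual testbed provisioning may differ):

```bash
# Install the signing key for the new community-owned repository.
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | \
    sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg

# Replace the dead apt.kubernetes.io entry with the new repo.
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | \
    sudo tee /etc/apt/sources.list.d/kubernetes.list

# Install the version pinned by this PR.
sudo apt-get update
sudo apt-get install -y kubelet=1.30.1-1.1 kubeadm=1.30.1-1.1 kubectl=1.30.1-1.1
```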

Contributor:
Can we include the K8s apt repository update in this script? If test.sh also re-installs K8s on the IPAM testbed, it seems unnecessary to require users to manually update https://pkgs.k8s.io/core:/stable:/v1.30/deb to a newer one on testbeds in the future.
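
A hypothetical sketch of that suggestion: derive the repo URL from the single pinned version the script already uses, so testbeds never need manual edits (variable names are invented for illustration):

```bash
# Single source of truth for the K8s version, e.g. "1.30.1-1.1".
K8S_VERSION="1.30.1-1.1"
# "1.30.1-1.1" -> "v1.30", the minor-release stream used in the repo URL.
K8S_MINOR="v$(echo "${K8S_VERSION}" | cut -d. -f1-2)"
K8S_APT_REPO="https://pkgs.k8s.io/core:/stable:/${K8S_MINOR}/deb/"
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] ${K8S_APT_REPO} /" | \
    sudo tee /etc/apt/sources.list.d/kubernetes.list
```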

@gran-vmv gran-vmv requested a review from tnqn May 16, 2024 05:40
ci/jenkins/test.sh: outdated review thread, resolved
@gran-vmv gran-vmv force-pushed the ipam-e2e-k8s branch 9 times, most recently from bf8726c to c277a71 on May 22, 2024 08:02
ci/jenkins/test.sh: outdated review thread, resolved
ci/jenkins/test.sh: outdated review thread, resolved
ci/jenkins/test.sh: outdated review thread, resolved
@gran-vmv gran-vmv force-pushed the ipam-e2e-k8s branch 2 times, most recently from e40a45b to f531d23 on May 23, 2024 05:42
@gran-vmv gran-vmv added the action/backport Indicates a PR that requires backports. label May 23, 2024
@gran-vmv gran-vmv force-pushed the ipam-e2e-k8s branch 3 times, most recently from f89026e to 904ad2e on May 30, 2024 03:05
@gran-vmv (Contributor Author):

/test-all
/test-flexible-ipam-e2e

@gran-vmv gran-vmv force-pushed the ipam-e2e-k8s branch 2 times, most recently from 594eb1c to cf81121 on June 6, 2024 03:45
@luolanzone (Contributor) commented Jun 17, 2024:

@gran-vmv the newly triggered test still failed, but I think it is caused by another PR (#6010), which should be fixed by #6452; you may rerun after that PR is merged.

@luolanzone (Contributor):

@gran-vmv is this fix related to the following failures?

--- FAIL: TestAntreaIPAM/testAntreaIPAMPodConnectivityDifferentNodes (691.76s)
--- FAIL: TestAntreaIPAM/testAntreaIPAMOVSRestartSameNode (56.44s)
--- FAIL: TestConnectivity/testPingLargeMTU (21.18s)

I saw these failures in the e2e jobs for both the v1.15.2 and v2.0.1 patch releases; I'm not sure whether they are flaky tests or caused by testbed upgrades.

Use k8s version 1.30.1-1.1.
Suppress cleanup errors when initializing e2e.
Enhance redeploy k8s determination logic.

Signed-off-by: gran <[email protected]>
@gran-vmv (Contributor Author):

/test-flexible-ipam-e2e

github-actions (bot):
This PR is stale because it has been open 90 days with no activity. Remove the stale label or comment, or this will be closed in 90 days.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 17, 2024
Labels
action/backport: Indicates a PR that requires backports.
area/test/e2e: Issues or PRs related to Antrea specific end-to-end testing.
area/test/infra: Issues or PRs related to test infrastructure (Jenkins configuration, Ansible playbook, Kind wrappers).
lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.
4 participants