generated from RedHatQuickCourses/template-showroom-rh12025
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent 01dd425, commit 7412576
Showing 3 changed files with 278 additions and 6 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,104 @@ | ||
= 3 | ||
= Reviewing Cluster Upgrades | ||
:prewrap!: | ||
|
||
temp | ||
When reviewing a must-gather, start with the output of the `omc get clusterversion` command. It shows the current cluster version, whether an update is currently progressing, any errors in the Status field, and whether earlier updates failed in a way that could still be causing issues. | |
|
||
[#gettingstarted] | ||
To get started, run the `omc get clusterversion` command, then run it a second time with the output set to YAML. The section to focus on is `history`, which records the upgrades applied to the cluster. In the example below, the three entries shown (`4.13.43`, `4.12.58`, and `4.12.56`) all have a state of `Partial`. | |
|
||
.clusterversion | ||
==== | ||
[source,bash] | ||
---- | ||
$ omc get clusterversion | ||
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS | ||
version 4.14.27 True False 17d Cluster version is 4.14.27 | ||
$ omc get clusterversion -o yaml | ||
[ | ||
history: | ||
- completionTime: null | ||
image: fr2.icr.io/armada-master/ocp-release:4.13.43-x86_64 | ||
startedTime: "2024-07-04T06:21:36Z" | ||
state: Partial | ||
verified: false | ||
version: 4.13.43 | ||
- completionTime: "2024-07-04T06:21:36Z" | ||
image: fr2.icr.io/armada-master/ocp-release:4.12.58-x86_64 | ||
startedTime: "2024-06-26T16:25:53Z" | ||
state: Partial | ||
verified: false | ||
version: 4.12.58 | ||
- completionTime: "2024-06-26T16:25:53Z" | ||
image: fr2.icr.io/armada-master/ocp-release:4.12.56-x86_64 | ||
startedTime: "2024-06-05T17:06:12Z" | ||
state: Partial | ||
verified: false | ||
version: 4.12.56 | ||
... | ||
---- | ||
==== | ||
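
When the history is long, it can help to filter it for entries that never completed. The following is a minimal sketch, not part of `omc`: the helper name `incomplete_updates` is hypothetical, and the sample data is copied by hand from the example above rather than read from a must-gather. It assumes each history entry has already been loaded into a dictionary with the same field names as the YAML output.

```python
# Hypothetical helper; the field names (state, version, startedTime,
# completionTime) are taken from the history example above.
def incomplete_updates(history):
    """Return history entries whose state is anything other than "Completed"."""
    return [entry for entry in history if entry.get("state") != "Completed"]


# Sample data copied from the example output above.
history = [
    {"version": "4.13.43", "state": "Partial",
     "startedTime": "2024-07-04T06:21:36Z", "completionTime": None},
    {"version": "4.12.58", "state": "Partial",
     "startedTime": "2024-06-26T16:25:53Z",
     "completionTime": "2024-07-04T06:21:36Z"},
    {"version": "4.12.56", "state": "Partial",
     "startedTime": "2024-06-05T17:06:12Z",
     "completionTime": "2024-06-26T16:25:53Z"},
]

for entry in incomplete_updates(history):
    print(f'{entry["version"]}: {entry["state"]} (started {entry["startedTime"]})')
```

Note that an entry can carry a `completionTime` and still be `Partial`: in the example, each completion time matches the start time of the next upgrade, which suggests the earlier upgrade was superseded rather than finished.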
|
||
[#partialupgrade] | ||
A Partial upgrade is the result of manifests failing to be applied, objects not being updated or deleted, or missing items that leave the upgrade looping as it tries to progress past the issue. This results in Cluster Operators remaining on an older version, as seen in this example from an actual customer case. | |
|
||
.clusteroperators | ||
==== | ||
[source,bash] | ||
---- | ||
$ omc get clusteroperators | ||
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE | ||
console 4.13.43 True False False 1d | ||
csi-snapshot-controller 4.13.43 True False False 239d | ||
dns 4.12.40 True False False 239d | ||
image-registry 4.13.43 True False False 134d | ||
ingress 4.13.43 True False False 8d | ||
insights 4.13.43 True False False 99d | ||
kube-apiserver 4.13.43 True False False 239d | ||
kube-controller-manager 4.13.43 True False False 239d | ||
kube-scheduler 4.13.43 True False False 239d | ||
kube-storage-version-migrator 4.13.43 True False False 8d | ||
marketplace 4.13.43 True False False 239d | ||
monitoring 4.13.43 True False False 159d | ||
network 4.12.40 True False False 239d | ||
node-tuning 4.13.43 True False False 8d | ||
openshift-apiserver 4.13.43 True False False 239d | ||
openshift-controller-manager 4.13.43 True False False 239d | ||
openshift-samples 4.13.43 True False False 8d | ||
operator-lifecycle-manager 4.13.43 True False False 239d | ||
operator-lifecycle-manager-catalog 4.13.43 True False False 239d | ||
operator-lifecycle-manager-packageserver 4.13.43 True False False 17h | ||
service-ca 4.13.43 True False False 239d | ||
storage 4.13.43 True False False 239d | ||
---- | ||
==== | ||
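
In a long operator list the lagging versions are easy to miss. The sketch below is one way to surface them from the plain-text table; it is an illustration, not an `omc` feature. The function name `lagging_operators` is hypothetical, and it assumes that the version reported by most operators is the one the upgrade was targeting, which may not hold early in an upgrade.

```python
from collections import Counter


def lagging_operators(table_text):
    """Return (name, version) pairs for operators not on the most common version."""
    rows = []
    for line in table_text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 2:
            rows.append((fields[0], fields[1]))
    if not rows:
        return []
    # Assumption: the version shared by most operators is the upgrade target.
    majority_version, _ = Counter(version for _, version in rows).most_common(1)[0]
    return [(name, version) for name, version in rows if version != majority_version]


# A shortened copy of the table from the example above.
sample = """
NAME      VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
console   4.13.43   True        False         False      1d
dns       4.12.40   True        False         False      239d
ingress   4.13.43   True        False         False      8d
network   4.12.40   True        False         False      239d
storage   4.13.43   True        False         False      239d
"""

for name, version in lagging_operators(sample):
    print(f"{name} is still on {version}")
```

Applied to the example above, this would single out `dns` and `network`, the two operators still reporting `4.12.40`.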
|
||
After reviewing the Cluster Version and the Cluster Operators, we next want to move to the `cluster-version-operator` pod located in the `openshift-cluster-version` namespace. There you can review the logs to see where the upgrade is stalling. In the example below, the logs show the upgrade process getting stuck on the DNS and Network operators, which matches what we see in the ClusterOperator status. | |
|
||
.cluster-version-operator | ||
==== | ||
[source,bash] | ||
---- | ||
I0718 14:20:33.366086 1 sync_worker.go:978] Precreated resource clusteroperator "network" (511 of 615) | ||
I0718 14:20:33.404169 1 sync_worker.go:978] Precreated resource clusteroperator "dns" (522 of 615) | ||
I0718 14:20:33.404237 1 sync_worker.go:708] Dropping status report from earlier in sync loop | ||
I0718 14:20:33.404254 1 sync_worker.go:987] Running sync for namespace "openshift-network-operator" (505 of 615) | ||
I0718 14:20:33.449849 1 sync_worker.go:1007] Done syncing for namespace "openshift-network-operator" (505 of 615) | ||
I0718 14:20:33.449956 1 sync_worker.go:708] Dropping status report from earlier in sync loop | ||
I0718 14:20:33.450181 1 sync_worker.go:987] Running sync for customresourcedefinition "networks.operator.openshift.io" (506 of 615) | ||
I0718 14:20:33.498554 1 sync_worker.go:1007] Done syncing for customresourcedefinition "networks.operator.openshift.io" (506 of 615) | ||
I0718 14:20:33.498601 1 sync_worker.go:708] Dropping status report from earlier in sync loop | ||
I0718 14:20:33.498614 1 sync_worker.go:987] Running sync for customresourcedefinition "egressrouters.network.operator.openshift.io" (507 of 615) | ||
I0718 14:20:33.545449 1 sync_worker.go:1007] Done syncing for customresourcedefinition "egressrouters.network.operator.openshift.io" (507 of 615) | ||
I0718 14:20:33.545495 1 sync_worker.go:708] Dropping status report from earlier in sync loop | ||
I0718 14:20:33.545507 1 sync_worker.go:987] Running sync for customresourcedefinition "operatorpkis.network.operator.openshift.io" (508 of 615) | ||
I0718 14:20:33.593751 1 sync_worker.go:1007] Done syncing for customresourcedefinition "operatorpkis.network.operator.openshift.io" (508 of 615) | ||
I0718 14:20:33.593790 1 sync_worker.go:708] Dropping status report from earlier in sync loop | ||
I0718 14:20:33.593799 1 sync_worker.go:987] Running sync for clusterrolebinding "default-account-cluster-network-operator" (509 of 615) | ||
I0718 14:20:33.641898 1 sync_worker.go:1007] Done syncing for clusterrolebinding "default-account-cluster-network-operator" (509 of 615) | ||
I0718 14:20:33.642013 1 sync_worker.go:708] Dropping status report from earlier in sync loop | ||
I0718 14:20:33.642033 1 sync_worker.go:987] Running sync for deployment "openshift-network-operator/network-operator" (510 of 615) | ||
I0718 14:20:33.696357 1 sync_worker.go:1007] Done syncing for deployment "openshift-network-operator/network-operator" (510 of 615) | ||
I0718 14:20:33.696477 1 sync_worker.go:987] Running sync for clusteroperator "network" (511 of 615) | ||
E0718 14:20:33.696795 1 task.go:117] error running apply for clusteroperator "network" (511 of 615): Cluster operator network is updating version | ||
---- | ||
==== |
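
Most lines in this log are informational, so it is often quicker to pull out only the error lines. The following sketch assumes the log has been saved as text and that failures follow the `error running apply for ...` format seen in the example above; the pattern and the function name `apply_errors` are illustrative, and other error formats in the log would not be matched.

```python
import re

# klog lines start with a severity letter (I, W, E, or F). This pattern matches
# only E lines in the "error running apply" format from the example log and
# captures the resource kind, resource name, and message.
APPLY_ERROR = re.compile(
    r'^E\d{4} .*error running apply for (?P<kind>\S+) "(?P<name>[^"]+)" '
    r'\(\d+ of \d+\): (?P<message>.*)$'
)


def apply_errors(log_text):
    """Return (kind, name, message) tuples for each apply error in the log."""
    results = []
    for line in log_text.splitlines():
        match = APPLY_ERROR.match(line.strip())
        if match:
            results.append((match["kind"], match["name"], match["message"]))
    return results


# Two lines copied from the example log above.
sample_log = """
I0718 14:20:33.696477 1 sync_worker.go:987] Running sync for clusteroperator "network" (511 of 615)
E0718 14:20:33.696795 1 task.go:117] error running apply for clusteroperator "network" (511 of 615): Cluster operator network is updating version
"""

for kind, name, message in apply_errors(sample_log):
    print(f"{kind}/{name}: {message}")
```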
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,170 @@ | ||
= 3 | ||
= Reviewing Installed Operators | ||
:prewrap!: | ||
|
||
temp | ||
[#operators] | ||
To view all of the operators installed on a cluster, we will use the `omc get operators` command, which lists them as shown in the following example. By default it shows all operators, but you can narrow the list by specifying a namespace with the `-n` option. | |
|
||
.operators | ||
==== | ||
[source,bash] | ||
NAME AGE | ||
ansible-automation-platform-operator.aap 1y | ||
citrix-ingress-controller-operator.openshift-operators 1y | ||
cluster-kube-descheduler-operator.openshift-kube-descheduler-op 1y | ||
datagrid.openshift-operators 256d | ||
falcon-operator-rhmp.falcon-operator 1y | ||
falcon-operator.falcon-operator 1y | ||
grafana-operator.openshift-operators 1d | ||
openshift-gitops-operator.openshift-operators 1y | ||
openshift-pipelines-operator-rh.openshift-operators 1y | ||
portworx-certified.openshift-operators 1y | ||
quay-operator.openshift-operators 1y | ||
==== | ||
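
Each name in this listing appears to follow a `<package>.<namespace>` pattern, so the output can be grouped by namespace when deciding where to look next. The sketch below relies on that assumption and on the fact that namespace names cannot contain dots; the function name `operators_by_namespace` is hypothetical. Long names may be cut short in the listing, as the `openshift-kube-descheduler-op` suffix in the example suggests, so treat the namespace part as a hint and confirm it before using it with `-n`.

```python
from collections import defaultdict


def operators_by_namespace(table_text):
    """Group operator package names by the namespace suffix of each entry."""
    grouped = defaultdict(list)
    for line in table_text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if not fields or "." not in fields[0]:
            continue
        # Split on the last dot: everything after it is treated as the namespace.
        package, namespace = fields[0].rsplit(".", 1)
        grouped[namespace].append(package)
    return dict(grouped)


# A shortened copy of the listing from the example above.
sample = """
NAME                                      AGE
ansible-automation-platform-operator.aap  1y
falcon-operator-rhmp.falcon-operator      1y
falcon-operator.falcon-operator           1y
quay-operator.openshift-operators         1y
"""

for namespace, packages in operators_by_namespace(sample).items():
    print(namespace, packages)
```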
|
||
[#csv] | ||
Next we will look at the ClusterServiceVersion (CSV), which represents a particular version of a running operator on a cluster. It includes metadata such as name, description, version, repository link, labels, and icon. It also declares owned and required CRDs, cluster requirements, and the install strategy that tells the Operator Lifecycle Manager (OLM) how to create the required resources and set up the operator as a deployment. | |
|
||
In this example we will look at the CSVs installed in the `aap` namespace. | ||
|
||
.csv | ||
==== | ||
[source,bash] | ||
$ omc get csv -n aap | |
NAME DISPLAY VERSION REPLACES PHASE | ||
aap-operator.v2.4.0-0.1692675723 Ansible Automation Platform 2.4.0+0.1692675723 aap-operator.v2.3.0-0.1692727374 Succeeded | ||
datagrid-operator.v8.5.1 Data Grid 8.5.1 datagrid-operator.v8.5.0 Succeeded | ||
falcon-operator.v0.6.2 CrowdStrike Falcon Platform - Operator 0.6.2 Succeeded | ||
grafana-operator.v5.12.0 Grafana Operator 5.12.0 grafana-operator.v5.11.0 Succeeded | ||
openshift-gitops-operator.v1.12.5 Red Hat OpenShift GitOps 1.12.5 openshift-gitops-operator.v1.12.4 Succeeded | ||
portworx-operator.v24.1.1 Portworx Enterprise 24.1.1 portworx-operator.v24.1.0 Succeeded | ||
quay-operator.v3.9.8 Red Hat Quay 3.9.8 quay-operator.v3.9.6 Succeeded | ||
==== | ||
|
||
[#subscription] | ||
A Subscription represents an intention to install an operator. It is the custom resource that relates an operator to a CatalogSource. Subscriptions describe which channel of an operator package to subscribe to, and whether to perform updates automatically or manually. If set to automatic, the Subscription ensures that OLM manages and upgrades the operator so that the latest version is always running in the cluster. | |
|
||
In this example we will look at the Subscription for the `ansible-automation-platform-operator`, which shows us the `channel`, `installPlanApproval`, `name`, `source`, `sourceNamespace`, and `startingCSV`. | |
|
||
Additionally, under `status`, it provides a reference to the InstallPlan. | |
|
||
.subscription | ||
==== | ||
[source,bash] | ||
$ omc get subscriptions -n aap ansible-automation-platform-operator -o yaml | ||
apiVersion: operators.coreos.com/v1alpha1 | ||
kind: Subscription | ||
metadata: | ||
creationTimestamp: "2023-06-29T21:11:28Z" | ||
generation: 5 | ||
labels: | ||
operators.coreos.com/ansible-automation-platform-operator.aap: "" | ||
name: ansible-automation-platform-operator | ||
namespace: aap | ||
resourceVersion: "700220891" | ||
uid: fe232c2b-5c33-405a-929b-419a4191aeee | ||
spec: | ||
channel: stable-2.4-cluster-scoped | ||
installPlanApproval: Manual | ||
name: ansible-automation-platform-operator | ||
source: redhat-operators | ||
sourceNamespace: openshift-marketplace | ||
startingCSV: aap-operator.v2.3.0-0.1686242173 | ||
status: | ||
catalogHealth: | ||
- catalogSourceRef: | ||
apiVersion: operators.coreos.com/v1alpha1 | ||
kind: CatalogSource | ||
name: certified-operators | ||
namespace: openshift-marketplace | ||
resourceVersion: "700220797" | ||
uid: bd368dbe-e081-42d9-b9e6-278ee26d372a | ||
healthy: true | ||
lastUpdated: "2024-04-28T06:45:35Z" | ||
- catalogSourceRef: | ||
apiVersion: operators.coreos.com/v1alpha1 | ||
kind: CatalogSource | ||
name: community-operators | ||
namespace: openshift-marketplace | ||
resourceVersion: "700220851" | ||
uid: 753a9d96-bc3c-4499-aecd-3f68e9420a3d | ||
healthy: true | ||
lastUpdated: "2024-04-28T06:45:35Z" | ||
- catalogSourceRef: | ||
apiVersion: operators.coreos.com/v1alpha1 | ||
kind: CatalogSource | ||
name: redhat-marketplace | ||
namespace: openshift-marketplace | ||
resourceVersion: "700220809" | ||
uid: 77a4db7f-6abf-4e49-9ed1-a30a89c46d2d | ||
healthy: true | ||
lastUpdated: "2024-04-28T06:45:35Z" | ||
- catalogSourceRef: | ||
apiVersion: operators.coreos.com/v1alpha1 | ||
kind: CatalogSource | ||
name: redhat-operators | ||
namespace: openshift-marketplace | ||
resourceVersion: "700220844" | ||
uid: a786e921-ac4c-4a06-a119-577591414821 | ||
healthy: true | ||
lastUpdated: "2024-04-28T06:45:35Z" | ||
conditions: | ||
- lastTransitionTime: "2024-04-28T06:45:35Z" | ||
message: all available catalogsources are healthy | ||
reason: AllCatalogSourcesHealthy | ||
status: "False" | ||
type: CatalogSourcesUnhealthy | ||
- lastTransitionTime: "2023-09-07T22:13:50Z" | ||
reason: RequiresApproval | ||
status: "True" | ||
type: InstallPlanPending | ||
currentCSV: aap-operator.v2.4.0-0.1693440031 | ||
installPlanGeneration: 5 | ||
installPlanRef: | ||
apiVersion: operators.coreos.com/v1alpha1 | ||
kind: InstallPlan | ||
name: install-pqvgf | ||
namespace: aap | ||
resourceVersion: "427979356" | ||
uid: e4880b85-62bc-4a9d-ba46-affeb6244577 | ||
installedCSV: aap-operator.v2.4.0-0.1692675723 | ||
installplan: | ||
apiVersion: operators.coreos.com/v1alpha1 | ||
kind: InstallPlan | ||
name: install-pqvgf | ||
uuid: e4880b85-62bc-4a9d-ba46-affeb6244577 | ||
lastUpdated: "2024-04-28T06:45:35Z" | ||
state: UpgradePending | ||
==== | ||
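
The fields that matter most in this output are `installedCSV`, `currentCSV`, and the `InstallPlanPending` condition: when the first two differ and the condition is `True`, an upgrade is waiting on manual approval. The sketch below summarizes that check. It assumes the Subscription has already been loaded into a dictionary, and the function name `pending_upgrade` is hypothetical; the sample data is trimmed from the example above.

```python
def pending_upgrade(subscription):
    """Summarize whether a Subscription has an upgrade waiting to be installed."""
    status = subscription.get("status", {})
    installed = status.get("installedCSV")
    current = status.get("currentCSV")
    # The InstallPlanPending condition is True while an InstallPlan has not run.
    awaiting_approval = any(
        cond.get("type") == "InstallPlanPending" and cond.get("status") == "True"
        for cond in status.get("conditions", [])
    )
    return {
        "installed": installed,
        "current": current,
        "upgrade_pending": installed != current,
        "awaiting_approval": awaiting_approval,
        "install_plan": status.get("installPlanRef", {}).get("name"),
    }


# Trimmed copy of the Subscription from the example above.
subscription = {
    "spec": {"installPlanApproval": "Manual"},
    "status": {
        "conditions": [
            {"type": "CatalogSourcesUnhealthy", "status": "False"},
            {"type": "InstallPlanPending", "status": "True",
             "reason": "RequiresApproval"},
        ],
        "currentCSV": "aap-operator.v2.4.0-0.1693440031",
        "installedCSV": "aap-operator.v2.4.0-0.1692675723",
        "installPlanRef": {"name": "install-pqvgf", "namespace": "aap"},
        "state": "UpgradePending",
    },
}

print(pending_upgrade(subscription))
```

For the example Subscription this reports that `install-pqvgf` is pending approval, which is consistent with `installPlanApproval: Manual` and the `UpgradePending` state.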
|
||
[#installplan] | ||
Finally, we will look at the InstallPlan which defines a set of resources to be created in order to install or upgrade to a specific version of a ClusterService defined by a CSV. | ||
|
||
.installplan | ||
==== | ||
[source,yaml] | |
apiVersion: operators.coreos.com/v1alpha1 | ||
kind: InstallPlan | ||
metadata: | ||
creationTimestamp: "2023-09-07T22:13:33Z" | ||
generateName: install- | ||
generation: 1 | ||
labels: | ||
operators.coreos.com/ansible-automation-platform-operator.aap: "" | ||
name: install-pqvgf | ||
namespace: aap | ||
ownerReferences: | ||
- apiVersion: operators.coreos.com/v1alpha1 | ||
blockOwnerDeletion: false | ||
controller: false | ||
kind: Subscription | ||
name: ansible-automation-platform-operator | ||
uid: fe232c2b-5c33-405a-929b-419a4191aeee | ||
resourceVersion: "427979731" | ||
uid: e4880b85-62bc-4a9d-ba46-affeb6244577 | ||
spec: | ||
approval: Manual | ||
approved: false | ||
clusterServiceVersionNames: | ||
- aap-operator.v2.4.0-0.1693440031 | ||
generation: 5 | ||
==== |