Merge pull request #207 from fmherschel/angi-ScaleOut
Angi scale out
fmherschel authored Oct 17, 2023
2 parents 6c379ab + c1f89d6 commit 0cb631a
Showing 112 changed files with 4,752 additions and 359 deletions.
2 changes: 1 addition & 1 deletion SAPHanaSR-angi.spec
@@ -21,7 +21,7 @@ License: GPL-2.0
Group: Productivity/Clustering/HA
AutoReqProv: on
Summary: Resource agents to control the HANA database in system replication setup
-Version: 1.2.0
+Version: 1.2.2
Release: 0
Url: https://www.suse.com/c/fail-safe-operation-of-sap-hana-suse-extends-its-high-availability-solution/

2 changes: 1 addition & 1 deletion SAPHanaSR-tester.spec
@@ -19,7 +19,7 @@ License: GPL-2.0
Group: Productivity/Clustering/HA
AutoReqProv: on
Summary: Test suite for SAPHanaSR clusters
-Version: 1.1.0
+Version: 1.2.1
Release: 0
Url: https://www.suse.com/c/fail-safe-operation-of-sap-hana-suse-extends-its-high-availability-solution/

13 changes: 11 additions & 2 deletions man/SAPHanaSR-ScaleOut.7
@@ -1,6 +1,6 @@
.\" Version: 1.001
.\"
-.TH SAPHanaSR-ScaleOut 7 "09 May 2023" "" "SAPHanaSR-angi"
+.TH SAPHanaSR-ScaleOut 7 "18 Sep 2023" "" "SAPHanaSR-angi"
.\"
.SH NAME
SAPHanaSR-ScaleOut \- Tools for automating SAP HANA system replication in
@@ -315,8 +315,17 @@ to return in time.
23. The SAP HANA Fast Restart feature on RAM-tmpfs as well as HANA on persistent
memory can be used, as long as they are transparent to SUSE HA.
.PP
-24. The SAPHanaController RA, the SUSE HA cluster and several SAP components
+24. The SAP HANA site name is from 2 up to 32 characters long. It starts with a
+character or number. Subsequent characters may contain dash and underscore.
+.PP
+25. The SAPHanaController RA, the SUSE HA cluster and several SAP components
need read/write access and sufficient space in the Linux /tmp filesystem.
+.PP
+26. SAP HANA Native Storage Extension (NSE) is supported.
+Important is that this feature does not change the HANA topology or interfaces.
+In opposite to Native Storage Extension, the HANA Extension Nodes are changing
+the topology and thus currently are not supported.
+Please refer to SAP documentation for details.
.PP
.\"
.SH BUGS
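
Note on requirement 24 in the hunk above: the site name rule (2 to 32 characters, first one a character or number, later ones optionally dash or underscore) can be checked mechanically. The following is a minimal Python sketch and not part of SAPHanaSR-angi. It assumes that "character" means an ASCII letter and that letters and digits stay allowed after the first position. The name `is_valid_site_name` is hypothetical.

```python
import re

# Assumption: "character" is read as ASCII letter.
# One leading letter/digit plus 1..31 further characters gives 2..32 in total.
_SITE_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]{1,31}")


def is_valid_site_name(name: str) -> bool:
    """Return True if name satisfies the site name rule as read above."""
    return _SITE_NAME.fullmatch(name) is not None
```

Under these assumptions a name with a leading dash or a single-character name is rejected, while a name such as `site_1-a` is accepted.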
2 changes: 1 addition & 1 deletion man/SAPHanaSR-ScaleOut_basic_cluster.7
@@ -32,7 +32,7 @@ If the cluster uses disk-less SBD, the no-quorum-policy 'suicide' is required.

The crm basic parameter default-resource-stickiness defines the 'stickiness'
score a resource gets on the node where it is currently running. This prevents
-the cluster from moving resources around whithout an urgent need during a
+the cluster from moving resources around without an urgent need during a
cluster transition. The correct value depends on number of resources, colocation
rules and resource groups. Particularly additional groups colocated to the
HANA primary master resource can affect cluster decisions.
55 changes: 55 additions & 0 deletions man/SAPHanaSR-showAttr-adoc.8.adoc
@@ -0,0 +1,55 @@
+= SAPHanaSR-showAttr-adoc(1)
+Lars Pinne, Fabian Herschel
+v1.001
+:doctype: manpage
+:manmanual: SAPHanaSR-showAttr-adoc
+:mansource: SAPHanaSR-showAttr-adoc
+:man-linkstyle: pass:[blue R < >]
+
+== Name
+
+SAPHanaSR-showAttr - Shows Linux cluster attributes for SAP HANA system replication.
+
+== Synopsis
+
+SAPHanaSR-showAttr [ --help | --version | --path2table ]
+
+SAPHanaSR-showAttr\fR [ --sid=SID[:INO] ] [ --select=SELECTION ] [ --sort=FIELD ] [ --format=FORMAT ] [ --cib=OFFLINE_CIB_FILE ]
+
+==
+
+SAPHanaSR-showAttr shows Linux cluster attributes for SAP HANA system replication automation.
+The overall system replication (SR) state is shown as well as the HANA state
+on each node.
+Because the HANA srHook methods srConnectionChanged() and preTakeover() are
+used, respective information shows up as well.
+The information is fetched from the Linux cluster information base (CIB), not
+from HANA directly.
+Fields to be shown can be specified by pre-defined selections via command line option.
+
+The output shows four sections, containing all or some of the listed
+fields:
+
+Global section
+
+*global (Global)*::
+constant
+
+*cib-time*::
+date and time of record
+
+== Exit status
+
+TODO
+
+*0*::
+TODO 0
+
+*1*::
+TODO 1
+
+== Resources
+
+== Copying
+
+Copyright (C) 2014 {author}. +
8 changes: 5 additions & 3 deletions man/SAPHanaSR-showAttr.8
@@ -1,6 +1,6 @@
.\" Version: 1.001
.\"
-.TH SAPHanaSR-showAttr 8 "08 May 2023" "" "SAPHanaSR"
+.TH SAPHanaSR-showAttr 8 "04 Oct 2023" "" "SAPHanaSR"
.\"
.SH NAME
SAPHanaSR-showAttr \- Shows Linux cluster attributes for SAP HANA system replication.
@@ -267,12 +267,14 @@ Value: [ 4 | 3 | 2 | 1 | 0 ]
This field contains the return code of landscapHostConfiguration.py. The
parameter does not tell you if the secondary system is ready for a takeover.
The meaning is different from common Linux return codes.
+The SAPHanaController and SAPHanaTopology RAs will interpret return code 1 as
+NOT-RUNNING (or ERROR) and return codes 2+3+4 as RUNNING.
.br
4 = OK - Everything looks perfect on the HANA primary.
.br
-3 = WARNING - A HANA Host Auto-Failover is taking place.
+3 = WARNING - An internal HANA action is ongoing, e.g. host auto-failover.
.br
-2 = INFO - The landscape is completely functional, but the actual role of the host differs from the configured role.
+2 = INFO - The landscape is completely functional, but the actual host role differs from the configured role.
.br
1 = DOWN - There are not enough active hosts.
.br
8 changes: 7 additions & 1 deletion man/SAPHanaSR.7
@@ -1,6 +1,6 @@
.\" Version: 1.001
.\"
-.TH SAPHanaSR 7 "08 May 2023" "" "SAPHanaSR-angi"
+.TH SAPHanaSR 7 "18 Sep 2023" "" "SAPHanaSR-angi"
.\"
.SH NAME
SAPHanaSR \- Tools for automating SAP HANA system replication in scale-up setups.
@@ -284,6 +284,12 @@ character or number. Subsequent characters may contain dash and underscore.
23. The SAPHanaController RA, the SUSE HA cluster and several SAP components
need read/write access and sufficient space in the Linux /tmp filesystem.
.PP
+24. SAP HANA Native Storage Extension (NSE) is supported.
+Important is that this feature does not change the HANA topology or interfaces.
+In opposite to Native Storage Extension, the HANA Extension Nodes are changing
+the topology and thus currently are not supported.
+Please refer to SAP documentation for details.
+.PP
.\"
.SH BUGS
.\" TODO
2 changes: 1 addition & 1 deletion man/SAPHanaSR_basic_cluster.7
@@ -22,7 +22,7 @@ configurations might match specific needs.

The crm basic parameter default-resource-stickiness defines the 'stickiness'
score a resource gets on the node where it is currently running. This prevents
-the cluster from moving resources around whithout an urgent need during a
+the cluster from moving resources around without an urgent need during a
cluster transition. The correct value depends on number of resources, colocation
rules and resource groups. Particularly additional groups colocated to the
HANA primary master resource can affect cluster decisions.
6 changes: 3 additions & 3 deletions man/SAPHanaSR_maintenance_examples.7
@@ -384,7 +384,7 @@ This procedure can be used to update RAs, HANA HADR provider hook scripts and re
\fB*\fR Remove left-over maintenance attribute from overall Linux cluster.

This could be done to avoid confusion caused by different maintenance procedures.
-See above overview on maintenance procedures whith running Linux cluster.
+See above overview on maintenance procedures with running Linux cluster.
Before doing so, check for cluster attribute maintenance-mode="false".
.PP
.RS 4
@@ -400,7 +400,7 @@ Before doing so, check for cluster attribute maintenance-mode="false".
\fB*\fR Remove left-over standby attribute from Linux cluster nodes.

This could be done to avoid confusion caused by different maintenance procedures.
-See above overview on maintenance procedures whith running Linux cluster.
+See above overview on maintenance procedures with running Linux cluster.
Before doing so for all nodes, check for node attribute standby="off" on all nodes.
.PP
.RS 4
@@ -416,7 +416,7 @@ Before doing so for all nodes, check for node attribute standby="off" on all nod
\fB*\fR Remove left-over maintenance attribute from resource.

This should usually not be needed.
-See above overview on maintenance procedures whith running Linux cluster.
+See above overview on maintenance procedures with running Linux cluster.
.PP
.RS 4
# SAPHanaSR-showAttr
4 changes: 2 additions & 2 deletions man/ocf_suse_SAPHana.7
@@ -1,6 +1,6 @@
.\" Version: 0.160.1
.\"
-.TH ocf_suse_SAPHana 7 "27 Jun 2022" "" "OCF resource agents"
+.TH ocf_suse_SAPHana 7 "03 Oct 2022" "" "OCF resource agents"
.\"
.SH NAME
SAPHana \- Manages takeover between two SAP HANA databases with system replication.
@@ -46,7 +46,7 @@ landscapeHostConfiguration.py has some detailed output about HANA system status
and node roles. For our monitor the overall status is relevant. This overall
status is reported by the return code of the script:
0: Internal Fatal, 1: ERROR, 2: WARNING, 3: INFO, 4: OK
-The SAPHana resource agent will interpret return code 0 as FATAL, 1 as not-running
+The SAPHana resource agent will interpret return code 0 as FATAL, 1 as NOT-RUNNING
(or ERROR) and return codes 2+3+4 as RUNNING.
.PP
3. \fBhdbnsutil\fR
4 changes: 2 additions & 2 deletions man/ocf_suse_SAPHanaController.7
@@ -1,6 +1,6 @@
.\" Version: 1.001
.\"
-.TH ocf_suse_SAPHanaController 7 "09 Aug 2022" "" "OCF resource agents"
+.TH ocf_suse_SAPHanaController 7 "04 Oct 2023" "" "OCF resource agents"
.\"
.SH NAME
SAPHanaController \- Manages takeover between two SAP HANA databases with system replication.
@@ -50,7 +50,7 @@ landscapeHostConfiguration.py has some detailed output about HANA system status
and node roles. For our monitor the overall status is relevant. This overall
status is reported by the return code of the script:
0: Internal Fatal, 1: ERROR, 2: WARNING, 3: INFO, 4: OK
-The SAPHanaController resource agent will interpret return code 0 as FATAL, 1 as not-running
+The SAPHanaController resource agent will interpret return code 0 as FATAL, 1 as NOT-RUNNING
(or ERROR) and return codes 2+3+4 as RUNNING.
.br
Note: Some conditions cause HANA stopping to work, but not reporting an error. E.g. filesystem filled up.
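
The interpretation of the landscapeHostConfiguration.py return code described in the hunk above (0 as FATAL, 1 as NOT-RUNNING or ERROR, 2+3+4 as RUNNING) can be written down as a small lookup. This is an illustrative Python sketch only. The resource agents are not implemented this way as far as this diff shows, and the function name `interpret_lhc_rc` is hypothetical.

```python
def interpret_lhc_rc(rc: int) -> str:
    """Map a landscapeHostConfiguration.py return code to the state
    named in the man page text: FATAL, NOT-RUNNING or RUNNING."""
    if rc == 0:
        return "FATAL"
    if rc == 1:
        return "NOT-RUNNING"  # the man page also calls this ERROR
    if rc in (2, 3, 4):
        return "RUNNING"
    # Assumption: codes outside 0..4 are not described, so reject them.
    raise ValueError(f"unexpected return code: {rc}")
```

Raising on unknown codes is a design choice of this sketch. The man page text does not say how other values are handled.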
42 changes: 36 additions & 6 deletions man/ocf_suse_SAPHanaFilesystem.7
@@ -223,9 +223,6 @@ clone cln_SAPHanaFil_SLE_HDB00 rsc_SAPHanaFil_SLE_HDB00 \\
meta clone-node-max="1" notify="true" interleave="true"
.RE
.PP
-* Example configuration for a SAPHanaFilesystem resource on HANA scale-out.
-.PP
-The HANA consists of two sites with several nodes each. An additional cluster node
* Example configuration for a SAPHanaFilesystem resource for HANA scale-out.
.PP
The HANA consists of two sites with several nodes each. An additional cluster node
@@ -249,7 +246,7 @@ clone cln_SAPHanaFil_SLE_HDB00 rsc_SAPHanaFil_SLE_HDB00 \\
.br
meta clone-node-max="1" notify="true" interleave="true"
.PP
-location SAPHanaFil_not_on_majority_maker cln_SAPHanaFIL_SLE_HDB00 -inf: vm-majority
+location SAPHanaFil_not_on_majority_maker cln_SAPHanaFil_SLE_HDB00 -inf: vm-majority
.RE
.PP
* Example on showing the current SAPHanaFilesystem rescource configuration on scale-out.
@@ -311,7 +308,7 @@ for the NFS server in use. See manual pages nfs(5) and fstab(5) for details.
nfs1:/export/SLE/shared/ /hana/shared/SLE/ auto defaults,rw,hard,proto=tcp,intr,noatime,vers=4,lock 0 0
.RE
.PP
-* Example for temporarily blocking HANA filesystems.
+* Example for temporarily blocking HANA access to local filesystems.
.PP
This could be done for testing the SAPHanaFilesystem RA integration.
Blocking the HANA filesystem is dangerous. This test should not be done on production
@@ -340,6 +337,37 @@ Note: Understand the impact before trying.
5. Check HANA and Linux cluster for clean idle state.
.RE
.PP
+* Example for temporarily blocking HANA access to NFS filesystems.
+.PP
+This could be done for testing the SAPHanaFilesystem RA integration.
+Blocking the HANA filesystem is dangerous. This test should not be done on production
+systems.
+Used TCP port is 2049. See also SUSE TID 7000524.
+.br
+Note: Understand the impact before trying.
+.PP
+.RS 2
+1. Check HANA and Linux cluster for clean idle state.
+.PP
+2. On secondary, block /hana/shared/SLE/ filesystem.
+.RS 2
+# sync /hana/shared/SLE/
+.br
+# iptables -I OUTPUT -p tcp -m multiport --ports 2049 -j ACCEPT
+.br
+Note: The ACCEPT needs to be replaced by appropriate action.
+.RE
+.PP
+3. Check system log for SAPHanaFilsystem entries.
+.PP
+4. On secondary, unblock /hana/shared/SLE/ filesystem.
+.RS 2
+# iptables -D OUTPUT -p tcp -m multiport --ports 2049 -j ACCEPT
+.RE
+.PP
+5. Check HANA and Linux cluster for clean idle state.
+.RE
+.PP
.\"
.SH FILES
.TP
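
In the NFS blocking example of the hunk above, the rule inserted with `iptables -I` is later removed with `iptables -D` using the same rule specification. If the two specifications differ, for example in the action, the delete will not match the inserted rule. The following Python sketch only builds the two argument lists so that they cannot drift apart. It does not execute anything. The function name `nfs_rule_pair` is hypothetical, and the action is left to the caller because the man page text says ACCEPT has to be replaced by an appropriate action.

```python
from typing import List, Tuple


def nfs_rule_pair(action: str, port: int = 2049) -> Tuple[List[str], List[str]]:
    """Return (block_cmd, unblock_cmd) as argv lists sharing one rule spec.

    The rule specification is copied from the man page example; only the
    action (-j target) and the port are parameters.
    """
    spec = ["OUTPUT", "-p", "tcp", "-m", "multiport",
            "--ports", str(port), "-j", action]
    return ["iptables", "-I"] + spec, ["iptables", "-D"] + spec
```

The returned lists could be passed to `subprocess.run` by a test harness. As the man page warns, this kind of test should not be done on production systems.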
@@ -409,7 +437,9 @@ Please report any other feedback and suggestions to [email protected].
.br
https://documentation.suse.com/sbp/sap/ ,
.br
-https://www.suse.com/support/kb/doc/?id=000019904
+https://www.suse.com/support/kb/doc/?id=000019904 ,
+.br
+https://www.suse.com/support/kb/doc/?id=000016649
.PP
.\"
.SH AUTHORS
5 changes: 2 additions & 3 deletions man/ocf_suse_SAPHanaTopology.7
@@ -23,11 +23,10 @@ The resource agent uses the following interfaces provided by SAP:
landscapeHostConfiguration.py has some detailed output about HANA system
status and node roles. For our monitor the overall status is relevant. This
overall status is reported by the return code of the script:
-0: Internal Fatal 1: ERROR 2: WARNING 3: INFO (maybe a switch of the resource
-running) 4: OK
+0: Internal Fatal, 1: ERROR, 2: WARNING, 3: INFO (e.g. host auto-failover happened), 4: OK
.br
The SAPHanaTopology resource agent will interpret return codes 1 as
-NOT-RUNNING (or 1 failure) and return codes 2+3+4 as RUNNING.
+NOT-RUNNING (or ERROR) and return codes 2+3+4 as RUNNING.
SAPHanaTopology scans the output table of landscapeHostConfiguration.py to
identify the roles of the cluster node. Roles means configured and current
role of the nameserver as well as the indexserver.
23 changes: 22 additions & 1 deletion man/susChkSrv.py.7
@@ -351,6 +351,27 @@ It does not touch any fencing device.
# crm_attribute -t status -N 'node2' -G -n terminate
.RE
.PP
+\fB*\fR Example for killing HANA hdbindexserver process.
+.PP
+This could be done for testing the HA/DR provider hook script integration.
+Killing HANA processes is dangerous. This test should not be done
+on production systems.
+Please refer to SAP HANA documentation. See also manual page killall(1).
+.br
+Note: Understand the impact before trying.
+.PP
+1. Check HANA and Linux cluster for clean idle state.
+.PP
+2. On secondary master name server, kill the hdbindexserver process.
+.RS 2
+# killall -9 hdbindexserver
+.RE
+.PP
+3. Check the nameserver tracefile for srServiceStateChanged() events.
+.PP
+4. Check HANA and Linux cluster for clean idle state.
+.RE
+.PP
.\"
.SH FILES
.TP
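
Step 3 of the example in the hunk above asks to check the nameserver tracefile for srServiceStateChanged() events. A minimal way to do that is a plain substring search over the trace text, sketched here in Python. The sketch assumes nothing about the trace line format or the file location, both of which are not given in this diff. The function name `find_hook_events` is hypothetical.

```python
from typing import List


def find_hook_events(trace_text: str,
                     marker: str = "srServiceStateChanged") -> List[str]:
    """Return all lines of trace_text that mention the hook method name."""
    return [line for line in trace_text.splitlines() if marker in line]
```

A caller would read the trace file content itself and pass it in as a string. An empty result after the kill test would suggest that the hook was not called, or that a different trace file needs to be inspected.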
@@ -423,7 +444,7 @@ Please report any other feedback and suggestions to [email protected].
\fBocf_suse_SAPHanaTopology\fP(7) , \fBocf_suse_SAPHanaController\fP(7) ,
\fBSAPHanaSR-hookHelper\fP(8) ,
\fBSAPHanaSR-manageProvider\fP(8) , \fBcrm\fP(8) , \fBcrm_attribute\fP(8) ,
-\fBpython3\fP(8) ,
+\fBpython3\fP(8) , \fBkillall\fP(1) ,
.br
https://help.sap.com/docs/SAP_HANA_PLATFORM?locale=en-US
.br