Should fence_mpath agent be utilized instead of the fence_scsi agent? #26
Comments
Description of fence_mpath agent and how it functions compared to fence_scsi: fence_mpath: new fence agent for dm-multipath based on mpathpersist
I'd still see if you can debug your specific issue. I don't know of anyone using fence_mpath for this type of setup, and there are plenty of folks using this guide with success. Please note what I mentioned about diverse heartbeat network paths.
Thanks @ewwhite, I will try to debug some more. I'm still trying to understand how the pcs resource start and stop timeouts affect failover, as the suggested 90 seconds seems like a very large value (IIRC the TCP session timeout for NFS is only about 60 seconds).
So I placed node #2 (cluster-nas2) into standby, then shut it down completely. When I subsequently start node #2 up again, it causes pacemaker to crash on node #1. Below is the excerpt from the syslog on node #1 showing the sequence:
Apr 8 01:35:41 svr-lf-nas1 crmd[2850]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE
Can you show me the pcs resource creation string you used for the fencing? Maybe also the cluster creation string... and also your hosts files?
Any updates? @rcproam
Thanks so much for following-up on this @ewwhite and my apologies for the delay. My spare time has been focused on tax preparations this week.
Anyhow, I did try configuring the fence_mpath agent devices, but unfortunately unfencing isn't working for me :-\
I will try to revert to the fence_scsi agent tonight and provide the info you requested.
BTW, are you receiving email at your @ewwhite.net address? I sent an email last week. If you’re located in Chicago, maybe we could meet up one day? I would like to learn more about your consulting business in case I have the opportunity to refer some new business to you.
This is not an issue with the current design. Possibly label as enhancement?
In particular, due to the documented issue "RHEL 7 High Availability and Resilient Storage Pacemaker cluster experiences a fence race condition between nodes during network outages while using fence_scsi with multipath storage", would it be more reliable to utilize the fence_mpath agent than the fence_scsi agent?
I've encountered an issue very similar to the issue described here: https://access.redhat.com/solutions/3201072
Red Hat recommends using the fence_mpath agent instead of fence_scsi to resolve this particular issue. However, fence_mpath is more complex to configure and may come with its own caveats and issues.
https://access.redhat.com/articles/3078811
Still need to test the fence_mpath agent with my particular buildout to confirm whether or not it resolves the fencing / scsi reservation issue I've encountered, but I'm opening this issue in case others might have time to test the fence_mpath agent before I can.
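For anyone who wants to compare the two configurations before testing, here is a rough sketch of how the `pcs stonith create` invocations differ between the two agents. It only builds and prints the command lines; it does not run them. The helper name `stonith_create_cmd`, the resource names, the node name `cluster-nas1`, the device path `/dev/mapper/mpatha`, and the key values are all placeholders I made up for illustration. The fence_mpath options (`pcmk_host_map`, `pcmk_host_argument=key`, `pcmk_reboot_action=off`) reflect my reading of the Red Hat article and the agent's man page, so please verify them against your own versions. As far as I understand, fence_mpath also expects each node to have its own `reservation_key` set in /etc/multipath.conf that matches the key mapped to that node.

```python
import shlex


def stonith_create_cmd(name, agent, devices, nodes, host_keys=None):
    """Build (but do not run) a `pcs stonith create` argv list.

    Hypothetical helper for illustration only; not part of pcs or the guide.
    """
    cmd = [
        "pcs", "stonith", "create", name, agent,
        "devices=" + ",".join(devices),
        "pcmk_monitor_action=metadata",
    ]
    if agent == "fence_scsi":
        # fence_scsi derives each node's key itself; only the host list is needed.
        cmd.append("pcmk_host_list=" + " ".join(nodes))
    elif agent == "fence_mpath":
        # fence_mpath needs an explicit per-node reservation key (assumed to
        # match reservation_key in that node's /etc/multipath.conf).
        if not host_keys:
            raise ValueError("fence_mpath requires a node-to-key mapping")
        missing = [n for n in nodes if n not in host_keys]
        if missing:
            raise ValueError("no key for node(s): " + ", ".join(missing))
        host_map = ";".join(f"{n}:{host_keys[n]}" for n in nodes)
        cmd += [
            "pcmk_host_map=" + host_map,
            "pcmk_host_argument=key",
            "pcmk_reboot_action=off",
        ]
    else:
        raise ValueError("unsupported agent: " + agent)
    # Both agents need unfencing so a node re-registers its key when it rejoins.
    cmd += ["meta", "provides=unfencing"]
    return cmd


# Placeholder values; substitute your own node names, devices, and keys.
nodes = ["cluster-nas1", "cluster-nas2"]
devices = ["/dev/mapper/mpatha"]

scsi_cmd = stonith_create_cmd("fence-scsi", "fence_scsi", devices, nodes)
mpath_cmd = stonith_create_cmd(
    "fence-mpath", "fence_mpath", devices, nodes,
    host_keys={"cluster-nas1": "1", "cluster-nas2": "2"},
)

print(shlex.join(scsi_cmd))
print(shlex.join(mpath_cmd))
```

The main practical difference this highlights is that fence_mpath needs a node-to-key map that has to stay in sync with multipath.conf on every node, which is probably where most of the extra configuration effort (and risk of mistakes) comes from.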