Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NRPE: add check: check_sriov_numvfs.py #670

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 12 additions & 0 deletions charmhelpers/contrib/charmsupport/nrpe.py
Original file line number Diff line number Diff line change
Expand Up @@ -520,3 +520,15 @@ def remove_deprecated_check(nrpe, deprecated_services):
for dep_svc in deprecated_services:
SPFZ marked this conversation as resolved.
Show resolved Hide resolved
log('Deprecated service: {}'.format(dep_svc))
nrpe.remove_check(shortname=dep_svc)


def add_sriov_numvfs_checks(nrpe, sriov_numvfs):
SPFZ marked this conversation as resolved.
Show resolved Hide resolved
"""
Add checks for openvswitch
SPFZ marked this conversation as resolved.
Show resolved Hide resolved
:param NRPE nrpe: NRPE object to add check to
:param str sriov_numvfs: space separated list of <interface>:<numvfs>
"""
nrpe.add_check(
shortname='sriov_numvfs',
description='Check SRIOV numvfs',
check_cmd='check_sriov_numvfs.py {}'.format(sriov_numvfs))
13 changes: 13 additions & 0 deletions charmhelpers/contrib/network/files/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
# Copyright 2014-2015 Canonical Limited.
SPFZ marked this conversation as resolved.
Show resolved Hide resolved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
101 changes: 101 additions & 0 deletions charmhelpers/contrib/network/files/check_sriov_numvfs.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,101 @@
#!/usr/bin/env python3
SPFZ marked this conversation as resolved.
Show resolved Hide resolved
"""
Check to verify the number of virtual functions (VFs) active on network
interfaces. Based on /sys/class/net/<interface>/device/sriov_numvfs and
/sys/class/net/<interface>/device/sriov_totalvfs.

usage: check_sriov_numvfs.py [-h] sriov_numvfs [sriov_numvfs ...]

Check sriov numvfs configuration

positional arguments:
sriov_numvfs format: <interface>:<numvfs>

For each interfaces:numvfs pair given it verifies that:
1. VFs are enabled
2. VFs are configured are the same as specified in the check parameter
3. VFs configured do not exceed the maximum available on the interface

Example: ./check_sriov_numvfs.py ens3f0:32 ens3f1:32 ens6f0:32 ens6f1:32
"""

import argparse
import os.path
import sys


DEVICE_TEMPLATE = "/sys/class/net/{0}"
SRIOV_NUMVFS_TEMPLATE = "/sys/class/net/{0}/device/sriov_numvfs"
SRIOV_TOTALVFS_TEMPLATE = "/sys/class/net/{0}/device/sriov_totalvfs"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm - why are we checking this one? This is the total number of VFs that the PF supports? Customers could have valid reasons for configuring less VFs than the number the PF supports, e.g. the less VFs you have, the more channels/queues you have per VF.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the number of VFs that the PF supports.
We are checking this to make sure we are not trying to configure more VFs than the PF supports. This could happen if we have different NICs in a group of servers with different capabilities.
Currently, if the number of VFs set on the PF is higher than totalvfs, it just sets it to totalvfs (silent fail).

If numvfs > totalvfs then numvfs != sriov_numvfs will also fail, however the help text will report a more accurate reason and therefore hopefully help to resolve it faster.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate a bit more on why this is a critical alert when the requested VFs cannot be provided by the card? It seems conceivable to me that 2 cards with different capabilities across different units have the same interface name but one can serve 128 VFs and another 64. Setting the value to 128 maxes out both available VFs but it is fewer than 128*2.



def get_interface_setting(file):
""" Return the value content of a setting file as int """
SPFZ marked this conversation as resolved.
Show resolved Hide resolved
with open(file) as f:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to guard against potential errors here (e.g. permission denied or the file not existing)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

L66 checks if the files do exist. But I added an exception catching wrapper around the check now to be safe.

value = f.read()
return int(value)


def check_interface_numvfs(iface, numvfs):
""" Check SRIOV numvfs config """
SPFZ marked this conversation as resolved.
Show resolved Hide resolved
sriov_numvfs_path = SRIOV_NUMVFS_TEMPLATE.format(iface)
sriov_totalvfs_path = SRIOV_TOTALVFS_TEMPLATE.format(iface)
msg = []
if os.path.exists(DEVICE_TEMPLATE.format(iface)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this works, the structure feels a little unwieldly. I would personally suggest refactoring it, so that rather than getting to 3-4 levels of indentation deep, you instead return early where possible. e.g. if the device template path doesn't exist, just return msg at that point and then the rest of the function can be de-dented by a level?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improved, should be more straightforward now

if (not os.path.exists(sriov_totalvfs_path) or
not os.path.exists(sriov_numvfs_path)):
msg.append("{}: VFs are disabled or not-available".format(iface))
else:
sriov_numvfs = get_interface_setting(sriov_numvfs_path)
sriov_totalvfs = get_interface_setting(sriov_totalvfs_path)
if numvfs != sriov_numvfs:
msg.append(
"{}: Number of VFs on interface ({}) does not match check ({})"
SPFZ marked this conversation as resolved.
Show resolved Hide resolved
.format(iface, sriov_numvfs, numvfs)
)
if numvfs > sriov_totalvfs:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment on or around line 29.

msg.append(
"{}: Maximum number of VFs available on interface ({}) is lower than the check ({})"
.format(iface, sriov_totalvfs, numvfs)
)
return msg


def parse_sriov_numvfs(device_numvfs):
""" Parse parameters and check format """
SPFZ marked this conversation as resolved.
Show resolved Hide resolved
msg = "parameter format must be '<interface>:<numvfs>', e.g. ens3f0:32, given; {}".format(device_numvfs)
SPFZ marked this conversation as resolved.
Show resolved Hide resolved
assert device_numvfs != '', msg
parts = device_numvfs.split(':')
assert len(parts) == 2, msg
assert len(parts[0]) > 0, msg
assert int(parts[1]) > 0, msg
iface = str(device_numvfs.split(':')[0])
numvfs = int(device_numvfs.split(':')[1])
return (iface, numvfs)


def parse_args():
"""Parse command-line options."""
parser = argparse.ArgumentParser(description="Check sriov numvfs configuration")
parser.add_argument("sriov_numvfs", nargs='+', help="format: <interface>:<numvfs>")
args = parser.parse_args()
return args


def main():
"""Parse args and check the sriov numvfs."""
args = parse_args()
error_msg = []

for device_numvfs in args.sriov_numvfs:
iface, numvfs = parse_sriov_numvfs(device_numvfs)
error_msg += check_interface_numvfs(iface, numvfs)
if error_msg:
print("CRITICAL: {} problems detected\n".format(len(error_msg)) + "\n".join(error_msg))
sys.exit(2)

print("OK: sriov_numvfs set to " + ", ".join(args.sriov_numvfs))
SPFZ marked this conversation as resolved.
Show resolved Hide resolved


if __name__ == "__main__":
main()
9 changes: 9 additions & 0 deletions tests/contrib/charmsupport/test_nrpe.py
Original file line number Diff line number Diff line change
Expand Up @@ -494,3 +494,12 @@ def test_copy_nrpe_checks_nrpe_files_dir(self):
self.patched['copy2'].assert_called_once_with(
'filea',
'/usr/local/lib/nagios/plugins/filea')

@patch.object(nrpe.NRPE, 'add_check')
def test_add_sriov_numvfs_checks(self, mock_nrpe_add_check):
foo = nrpe.NRPE()
nrpe.add_sriov_numvfs_checks(foo, 'ens3f0:32 ens3f1:32')
mock_nrpe_add_check.assert_called_once_with(
shortname='sriov_numvfs',
description='Check SRIOV numvfs',
check_cmd='check_sriov_numvfs.py ens3f0:32 ens3f1:32')
13 changes: 13 additions & 0 deletions tests/contrib/network/files/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
# Copyright 2014-2015 Canonical Limited.
SPFZ marked this conversation as resolved.
Show resolved Hide resolved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
82 changes: 82 additions & 0 deletions tests/contrib/network/files/test_check_sriov_numvfs.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,82 @@
import mock
import os
import tempfile
import unittest

from charmhelpers.contrib.network.files import check_sriov_numvfs

__author__ = 'Stephan Pampel <[email protected]>'


class TestCheckSriovNumfs(unittest.TestCase):

interface_folder = os.path.join(tempfile.gettempdir(), 'ens3f0')
sriov_numvfs_file = os.path.join(interface_folder, 'sriov_numvfs')
sriov_totalvfs_file = os.path.join(interface_folder, 'sriov_totalvfs')

def __init__(self, *args, **kwargs):
super(TestCheckSriovNumfs, self).__init__(*args, **kwargs)
if not os.path.exists(self.interface_folder):
os.mkdir(self.interface_folder)
with open(self.sriov_numvfs_file, "w") as f:
f.write("32")
with open(self.sriov_totalvfs_file, "w") as f:
f.write("64")

def __del__(self):
if os.path.exists(self.interface_folder):
for file in os.listdir(self.interface_folder):
os.remove(os.path.join(self.interface_folder, file))
os.removedirs(self.interface_folder)

def test_parameter_parsing(self):
""" check if the sriov_numvfs parameter is parsed correctly """
SPFZ marked this conversation as resolved.
Show resolved Hide resolved
iface, numvfs = check_sriov_numvfs.parse_sriov_numvfs('ens3f0:32')
self.assertEqual(iface, 'ens3f0')
self.assertEqual(numvfs, 32)

for param in ['', ':', ':test', ':32', 'ens3f2', 'a:1:b']:
self.assertRaises(AssertionError, check_sriov_numvfs.parse_sriov_numvfs, param)

for param in ['ens3f2:', 'ens3f2:test']:
self.assertRaises(ValueError, check_sriov_numvfs.parse_sriov_numvfs, param)

def test_check_interface_numvfs_no_interface(self):
""" check should report nothing if the interface does not exists """
SPFZ marked this conversation as resolved.
Show resolved Hide resolved
self.assertListEqual(
check_sriov_numvfs.check_interface_numvfs('no-interface', 0), []
)

@mock.patch('charmhelpers.contrib.network.files.check_sriov_numvfs.DEVICE_TEMPLATE', interface_folder)
def test_check_interface_numvfs_vfs_disabled(self):
""" check if virtual functions are disabled """
SPFZ marked this conversation as resolved.
Show resolved Hide resolved
self.assertListEqual(
check_sriov_numvfs.check_interface_numvfs('ens3f0', 0),
['ens3f0: VFs are disabled or not-available']
)

@mock.patch('charmhelpers.contrib.network.files.check_sriov_numvfs.DEVICE_TEMPLATE', interface_folder)
@mock.patch('charmhelpers.contrib.network.files.check_sriov_numvfs.SRIOV_NUMVFS_TEMPLATE', sriov_numvfs_file)
@mock.patch('charmhelpers.contrib.network.files.check_sriov_numvfs.SRIOV_TOTALVFS_TEMPLATE', sriov_totalvfs_file)
def test_check_interface_numvfs_vfs_enabled(self):
""" check if virtual functions are enabled """
SPFZ marked this conversation as resolved.
Show resolved Hide resolved

# check numvfs correct
self.assertListEqual(
check_sriov_numvfs.check_interface_numvfs('ens3f0', 32),
[]
)

# check numvfs != check
self.assertListEqual(
check_sriov_numvfs.check_interface_numvfs('ens3f0', 16),
['ens3f0: Number of VFs on interface (32) does not match check (16)']
)

# check numvfs > sriov_totalvfs
with open(self.sriov_numvfs_file, "w") as f:
f.write("128")
self.assertListEqual(
check_sriov_numvfs.check_interface_numvfs('ens3f0', 128),
['ens3f0: Maximum number of VFs available on interface (64) is lower than the check (128)']
)