-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DPE-4859] Apply sysctl from the charm #369
Conversation
"vm.swappiness": "0", | ||
"net.ipv4.tcp_retries2": "5", | ||
} | ||
if OpenSearchDistribution.running_as_lxc(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is better to check sysctl anyway, it doesn't matter on which platform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I cannot apply all the options above for LXC; but I can if we have a VM or host.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is better to check the sysctl on all platforms after setting. It doesn't matter what platform or exit codes are received.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@delgod, so to clarify, it is better to keep the original behavior, where we check the settings and then, block if the user did not apply it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@phvalguima I think the following flow can work in our case:
- check the current values of each sysctl setting (regardless of the platform)
- make the distinction between:
- the settings that need to be
equal
to a value (tcp_retries2
, orswappiness
) - the settings that need to be
greater
than a value (max_map_count >= 262144
, file-max)- we should only set these if that's the case, otherwise we are decreasing the quality of the settings already existing in the environment
- the settings that need to be
- This should result in a set of
invalid
settings. - attempt to set these settings (all the invalid set or one at a time), and re-check the new values, so you can accurately add each in the missing_requirements if any:
- I prefer the missing_requirements we had, as the message is more tailored towards the values that need to be set, not just that the setting could not be set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Pedro - I have a few comments
Configure the system settings required by [OpenSearch](https://opensearch.org/docs/latest/install-and-configure/install-opensearch/index/), | ||
we'll do that by creating and setting a [`cloudinit-userdata.yaml` file](https://juju.is/docs/olm/juju-model-config) on the model. | ||
As well as setting some kernel settings on the host machine. | ||
``` | ||
cat <<EOF > cloudinit-userdata.yaml | ||
cloudinit-userdata: | | ||
postruncmd: | ||
- [ 'echo', 'vm.max_map_count=262144', '>>', '/etc/sysctl.conf' ] | ||
- [ 'echo', 'vm.swappiness=0', '>>', '/etc/sysctl.conf' ] | ||
- [ 'echo', 'net.ipv4.tcp_retries2=5', '>>', '/etc/sysctl.conf' ] | ||
- [ 'echo', 'fs.file-max=1048576', '>>', '/etc/sysctl.conf' ] | ||
- [ 'sysctl', '-p' ] | ||
EOF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did we remove this? This will still set userdata on EC2 instances - where swappiness etc can be set directly on the VM. (on LXD it will only set tcp_retries)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would we need it then?
Running within LXC
We cannot set vm.max_map_count
, vm.swappiness
and fs.file-max
as they must be set on the underlying host. So, the cloud-init is useless there.
Running within VMs
We will set all these values via charm.
So, in which situation I need the cloud-init back?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am keeping the commands to set the configs in sysctl for the LXC case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on LXD:
- tcp_retries can be set using userdata through the model.
The rest will not be set, but will not fail.
on EC2 / VM / Baremetal:
- all sysctl can be applied using userdata through the model
- users may choose to set better values than the ones we suggest, e.g higher value of max_map_count than our minimum set by the charm.
On the code:
- we ensure that the values are correct, (they could have been changed manually in machines), if they're not - we set them
- sysctl calls can fail for whatever reason, having them on the userdata already is 1 additional layer of safety.
fs.file-max=1048576 | ||
EOT | ||
|
||
sudo sysctl -p | ||
|
||
juju model-config --file=./cloudinit-userdata.yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above
sysctl.Config(name).configure(config) | ||
except sysctl.CommandError as e: | ||
logger.error(f"Failed to apply sysctl config: {e}") | ||
missing_requirements.append(str(e)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we go with a bulk set, This will result in the BlockedStatus message, as follows:
Unable to set params: vm.swappiness, vm.max_map_count.
In which case we don't need missing_requirements
"vm.swappiness": "0", | ||
"net.ipv4.tcp_retries2": "5", | ||
} | ||
if OpenSearchDistribution.running_as_lxc(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@phvalguima I think the following flow can work in our case:
- check the current values of each sysctl setting (regardless of the platform)
- make the distinction between:
- the settings that need to be
equal
to a value (tcp_retries2
, orswappiness
) - the settings that need to be
greater
than a value (max_map_count >= 262144
, file-max)- we should only set these if that's the case, otherwise we are decreasing the quality of the settings already existing in the environment
- the settings that need to be
- This should result in a set of
invalid
settings. - attempt to set these settings (all the invalid set or one at a time), and re-check the new values, so you can accurately add each in the missing_requirements if any:
- I prefer the missing_requirements we had, as the message is more tailored towards the values that need to be set, not just that the setting could not be set
Production deployments will be using VMs instead of containers. In this case, we do not need to delegate the
sysctl
setup to the user but could rather do it on the charm.Containers, on the other hand, cannot set all the types of sysctl:
This PR adds the sysctl lib and uses it to set all the parameters if we are running directly on the host, or only
net.ipv4.tcp_retries2
otherwise.Closes #359