Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #37367 - Switch to 'network' directive instead of ifcfg #9961

Merged
merged 1 commit into from
Apr 19, 2024

Conversation

ShimShtein
Copy link
Member

Use kickstart's network directive to create interfaces according to foreman's DB instead of using ifcfg files.

Copy link
Member

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't think of introducing a macro for this. Previously used a snippet.

app/services/foreman/renderer/scope/macros/kickstart.rb Outdated Show resolved Hide resolved
app/services/foreman/renderer/scope/macros/kickstart.rb Outdated Show resolved Hide resolved
app/services/foreman/renderer/scope/macros/kickstart.rb Outdated Show resolved Hide resolved
app/services/foreman/renderer/scope/macros/kickstart.rb Outdated Show resolved Hide resolved
app/services/foreman/renderer/scope/macros/kickstart.rb Outdated Show resolved Hide resolved
app/services/foreman/renderer/scope/macros/kickstart.rb Outdated Show resolved Hide resolved
app/services/foreman/renderer/scope/macros/kickstart.rb Outdated Show resolved Hide resolved
app/services/foreman/renderer/scope/macros/kickstart.rb Outdated Show resolved Hide resolved
app/services/foreman/renderer/scope/macros/kickstart.rb Outdated Show resolved Hide resolved
app/services/foreman/renderer/scope/macros/kickstart.rb Outdated Show resolved Hide resolved
@ekohl
Copy link
Member

ekohl commented Dec 19, 2023

Also, isn't there a Redmine issue for this?

@ShimShtein
Copy link
Member Author

Had to fix snapshot tests and added a meets_requirement method to deal with requirements a bit more cleaner.

@ekohl, ready for another round.

app/models/operatingsystems/redhat.rb Outdated Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really wise to put so much kickstart logic into macros? They make it harder to maintain for users if they need changes. Why can't this be a snippet?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are putting in the macro file logic about how to translate foreman objects into kickstart language. I think the translation is not use-case dependent. For example iface.ip will always be translated to network --ip=#{iface.ip}, so this logic can be hard-coded into our macros. I do think that we could add modification flags later on, if we see that there are parts of the Interface object that should not always be translated. This will leave enough flexibility to the user, but will maintain integrity with the kickstart language.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should get rid of this template. RHV is deprecated and going away and oVirt doesn't look too healthy these days.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's an issue that should be properly discussed in the community (started a thread: https://community.theforeman.org/t/proposal-to-drop-support-for-ovirt/36324) and anyway it's a matter for a separate PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant to only drop the kicjstart tenplate. The whole of oVirt is a separate discussion. I don't know if you even still can provision with this template or if we provide functionality for something that no longer exists.

Having said that, I also noticed I can no longer compile the oVirt SDK on Fedora so that alone can be a reason to drop it.

I'll weigh in on the RFC later, when I'm home

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I'm not too keen on mapping Anaconda capabilities into the operating system. Anaconda has complex cherry picking and if anything, it deserves its own class.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that what I implemented here is in tune with what you are saying: I didn't create a hard-coded capability matrix with Anaconda (which is problematic, I agree).
I have implemented only a version check, which is not directly related to Anaconda. Theoretically we could implement such a check in the base class, but I have avoided it, since it can be a challenge on its own, and currently I don't see too much value for it.
The usage of the version check is implemented inside the kickstart template, and there I am using the versions for both Fedora and Rhel-compatible systems, according to the documentation. This way the logic for each compatibility check is kept with the usage.

@ShimShtein
Copy link
Member Author

@ekohl ready for another round.
Thanks a lot for keeping me honest and well-documented :)

@ShimShtein
Copy link
Member Author

[test unit]

Copy link
Member

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having thought about it more, I'm against a macro and instead want to see a snippet. This allows small fixes by users and experimentation with new features by copying. A macro doesn't allow that.

app/models/operatingsystems/redhat.rb Outdated Show resolved Hide resolved
@stejskalleos stejskalleos self-assigned this Jan 29, 2024
end

test 'should skip non-managed interfaces' do
iface = FactoryBot.build(:nic_base, :primary => true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be actually non-managed?

Suggested change
iface = FactoryBot.build(:nic_base, :primary => true)
iface = FactoryBot.build(:nic_base, :primary => true, :managed => false)

@ShimShtein
Copy link
Member Author

@ekohl Ready for the next round. Refactored the helper to a snippet

end

def render_template(iface, host:, use_slaac:, static:, static6:)
@snippet ||= File.read(File.expand_path('../../../../../app/views/unattended/provisioning_templates/snippet/kickstart_network_interface.erb', __dir__))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't Rails.root.join() be easier?

@@ -105,6 +105,7 @@ def host4dhcp
name: 'snapshot-ipv4-dhcp-el7',
subnet: FactoryBot.build(:subnet_ipv4_dhcp_for_snapshots),
interfaces: [ipv4_interface])
host.stubs(:managed_interfaces).returns(host.interfaces)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to build a host with the right interfaces. I think it means ipv4_interface needs to be marked as managed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

managed_interfaces is based on DB scopes, it does not work correctly for non-saved hosts:

      host = FactoryBot.build(:managed_host,
        name: 'snapshot-ipv6-dhcp-el7',
        interfaces: [FactoryBot.build(:managed_interface, managed: true)])
      host.managed_interfaces
      # => []

      host.save!
      host.managed_interfaces
      # => [<Nic::Managed ... >]

else
network_options.push("--nodns")
end
# start with provisioning interface, then other non-bond interfaces and the bonds at the end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why this is needed? Is it a requirement by Anaconda?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. By default only the first interface is activated, and I want it to be the provisioning interface.
  2. I can't find it now, but I have read somewhere in the docs, that the commands are executed in the order they are written, so I would prefer to have all the "daughter" interfaces defined before I define them in the bonds.

Comment on lines 39 to 40
if @iface.respond_to?(:mtu)
network_options.push("--mtu=#{@iface.mtu}") if @iface.mtu
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if @iface.respond_to?(:mtu)
network_options.push("--mtu=#{@iface.mtu}") if @iface.mtu
if @iface.respond_to?(:mtu) && @iface.mtu
network_options.push("--mtu=#{@iface.mtu}")

end

# dual stack MTU check
raise("IPv4 and IPv6 subnets have different MTU") if subnet4 && subnet6 && subnet4.mtu.present? && subnet6.mtu.present? && subnet4.mtu != subnet6.mtu
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should implement this check in @iface.mtu instead. Perhaps not in this PR because I don't know how big the implications are.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 268 to 270
assert_match(/--nameserver/, actual)
assert_match(/192.168.42.2/, actual)
assert_match(/192.168.42.3/, actual)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally a test only has a single assertion. This is a much stricter check. It's showing up more often in this file, so only commenting once.

Suggested change
assert_match(/--nameserver/, actual)
assert_match(/192.168.42.2/, actual)
assert_match(/192.168.42.3/, actual)
assert_includes(actual, '--nameserver=192.168.42.2,192.168.42.3')

@@ -72,4 +76,21 @@ def pxe_kernel_options(params)
def minor_version_help
'0, 6.1810'
end

def fedora?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My biggest concern with these is that they're no longer a pure implementation of the OperatingSystem interface. So you can end up with a template that calls it on a Debian OS or something else. Perhaps make fedora? and rhel_compatible? private methods and implement meets_requirement? on OperatingSystem? For Debian/Ubuntu I can see similar uses.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have moved the meets_requirement to the base OperatingSytem class, although each family would need to override it with the correct keyword arguments to make it more readable. In the current state, the code will not throw an error, but would return false for families that are not overriding the default implementation.
I do prefer this method of implementation, since it will make throw meaningful exceptions when incorrect arguments are used, unless you would like to see something like:

host.operatingsystem = ubuntu_20_04
do_something if host.operatingsystem.meets_requirement(rhel: 8)

As for fedora? and rhel_compatible?: I do think that these methods should be public, but on the derived class. So I would like to allow the following code:

do_something if host.operatingsystem.is_a(RedHat) && host.operatingsystem.fedora?

app/models/operatingsystems/redhat.rb Outdated Show resolved Hide resolved
@ShimShtein ShimShtein force-pushed the rhel9_networking branch 3 times, most recently from 159687d to f7cf7ba Compare February 29, 2024 10:19
@ShimShtein
Copy link
Member Author

Now the tests should pass

ekohl
ekohl previously requested changes Mar 19, 2024
Copy link
Member

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not a fan of the meets_requirement method. Getting the API right is just too hard IMHO. What if you also need to specify ranges, like upper bounds? I'd prefer to get rid of it and make the templates slightly more verbose. It's only used twice in your patch, so writing that out is probably easier for long term support.

name == 'Fedora'
end

def rhel_compatible?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do we place Amazon Linux here? It's within the Red Hat ecosystem, but it's based on Fedora with its own independent release cycle.

Since you're including this in core, we're on the hook unlike if we kept the logic in the templates.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently the templates ignore Amazon Linux distinction.
If we want to do such a distinction, we should add a new method amazon? and have it in the meets_requirement method too.

I don't think it should be done in this PR though. If you think it's important, I'll open a new PR that will be based on this one.

@ShimShtein
Copy link
Member Author

I'm still not a fan of the meets_requirement method. Getting the API right is just too hard IMHO. What if you also need to specify ranges, like upper bounds? I'd prefer to get rid of it and make the templates slightly more verbose. It's only used twice in your patch, so writing that out is probably easier for long term support.

We have quite a lot of version specifications in the code, just search for os_major in the code. Once we have this infra in place, I can refactor the other templates too.
As for upper bounds, we can add modifiers if that is needed:

os.meets_requirement(:at_most, rhel: 10)

I think it's readable, and will make a lot of sense.

@stejskalleos
Copy link
Contributor

Failing unit tests might be related:

NicTest#test_0033_nic with both subnet and subnet6 should be valid if MTU is consistent between subnets [test/models/nic_test.rb:526]

[2024-03-20T11:59:47.998Z] Minitest::Assertion: #<Nic::Managed id: nil, mac: "00:33:45:ab:01:96", ip: "241.84.94.35", type: "Nic::Managed", name: "nick", host_id: 114, subnet_id: 1018350845, domain_id: nil, attrs: {}, created_at: nil, updated_at: nil, provider: nil, username: nil, password: nil, virtual: false, link: true, identifier: "eth946", tag: "", attached_to: "", managed: true, mode: "balance-rr", attached_devices: "", bond_options: "", primary: false, provision: false, compute_attributes: {}, ip6: nil, subnet6_id: 1018350846> errors: Ip does not match selected subnet

@ShimShtein
Copy link
Member Author

[test unit]

@ShimShtein
Copy link
Member Author

rerunning the tests solved it

Copy link
Member

@ares ares left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just one question from my side

@ShimShtein
Copy link
Member Author

Ready for hopefully final round of review

@ekohl
Copy link
Member

ekohl commented Apr 8, 2024

I think it's readable, and will make a lot of sense.

I don't think it does and really think it deserves its own PR with its own discussion. Introducing it here makes the whole code inconsistent and doesn't give a good idea of whether the API really is a good one. Once again, please modify the PR to not include it.

@ShimShtein
Copy link
Member Author

@ekohl switched to explicit version limits.

@ShimShtein
Copy link
Member Author

[test unit]

@ares
Copy link
Member

ares commented Apr 12, 2024

@ShimShtein does this fix the issue for RHEL 8 and 9? in the recent version I didn't see OS condition. I believe Anaconda behaves differently when it comes to configuration of interfaces in 7,8 and 9. In 8 it still generates ifcfg files from network commands, in 9 it generates nm connection files.

So in 8 we had issue with sometimes the files created by Anaconda conflicting with the ones we generate through our snippet in %post.
In 9 we ended up with nm connection (from Anaconda) and ifcfg files (from our %post).
In 7 Anaconda didn't create the files at all so our %post snippet was the only thing creating this.

Needless to say, we need to have a solution that works for all 7,8,9

@ShimShtein
Copy link
Member Author

does this fix the issue for RHEL 8 and 9?

@ares Yes, this is the whole idea of this PR. Instead of always generating ifcfg files in the %post step, we switch to requesting anaconda to define the interfaces for us. Anaconda is the one to choose the proper way to do it - for RHEL 7 it will use the ifcfg files, and for RHEL 8+ it can use the nm records for defining the interfaces. In either way, instead of having the installation procedure (anaconda) creating a default record for us and then overriding the behavior manually, we request a correct configuration from anaconda.
Effectively I am removing the old ifcfg template altogether from the %post step here and instead I am adding the network anaconda command for each interface that is defined in Foreman.

@stejskalleos
Copy link
Contributor

QEs confirmed that the changes are OK and work as expected.
@ShimShtein Please rebase, squash commits, and add Fixes #... to the commit message.
@ares @ekohl, I will merge the PR today if there are no other objections.

@ShimShtein
Copy link
Member Author

Fixed the new rocky snapshots too. Should be OK

@stejskalleos stejskalleos dismissed ekohl’s stale review April 19, 2024 06:59

The requested changes have been done or dropped meanwhile

@stejskalleos stejskalleos merged commit ca34010 into theforeman:develop Apr 19, 2024
54 checks passed
@stejskalleos
Copy link
Contributor

Thanks @ShimShtein, @ekohl, and rest for the help!

@ares
Copy link
Member

ares commented Apr 23, 2024

Is there a reason why this does not have redmine issue?

@ekohl
Copy link
Member

ekohl commented Apr 23, 2024

The commit has one, just the PR doesn't. The bot doesn't check for that

@ares ares changed the title Switch to 'network' directive instead of ifcfg Fixes #37367 - Switch to 'network' directive instead of ifcfg Apr 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants