feat: rewrite snapshot.py as an Ansible module / add support for thin origins #58

trgill · 2024-05-20T03:46:20Z

Enhancement: rewrite snapshot.py as an Ansible module / add support for thin origins

Reason:
#43
#57

Result:
Referenced issues fixed

Issue Tracker Tickets (Jira or BZ if any):

library/snapshot.py

tasks/main.yml

spetrosi · 2024-05-20T08:03:43Z

tests/tests_set_thin_basic.yml

+
+        - name: Assert no changes for create snapset
+          assert:
+            that: not snapshot_cmd["changed"]


This check won't work. It's impossible to check if include_role resulted in a change in Ansible. There is no proper way to check idempotency other than looking at the playbook output.

@richm added a boolean response from the module to indicate if any action was taken. Rich, please let me know what changes I should make.

https://docs.ansible.com/ansible/latest/dev_guide/developing_modules_general.html#creating-a-module
The module should return a dict. That dict should have a field called changed which is true if the module changed something or false otherwise.

This check won't work. It's impossible to check if include_role resulted in a change in Ansible. There is no proper way to check idempotency other than looking at the playbook output.

But in this case, when the role calls this module, it uses register: snapshot_cmd, so the test playbook can check. Otherwise, you are correct - there is no way to get at the internal state. Several of the roles register the module output or otherwise use some sort of internal variable to indicate the changed state of the role for the purpose of testing for idempotency.

spetrosi · 2024-05-20T08:05:17Z

library/snapshot.py

@@ -1,22 +1,146 @@
-from __future__ import print_function
+#!/usr/bin/python


Do you plan to add python unit tests for this module?

Yes. I'll add some unit tests.

library/snapshot.py

spetrosi · 2024-05-22T10:31:26Z

[citest]

tasks/main.yml

library/snapshot.py

richm · 2024-05-29T19:59:01Z

library/snapshot.py

+
+
+EXAMPLES = r"""
+# Create Snapshots of all VGs


This should be an example of using the module directly, not using the role.
Note that using the module directly is not supported, but module scanning tools such as ansible-test and ansible-doc may throw an error if this documentation is not in the correct format.

Ok. The errors are a bit difficult to understand. I was following the example in the storage role. Currently I'm seeing a lot of errors for the docs: https://github.com/linux-system-roles/snapshot/actions/runs/9502379067/job/26190204498 - it looks like the storage role is using a different "-collection-version 1.78.2" vs the snapshot role is using "--collection-version 1.79.0"

I also tried following the examples in: https://docs.ansible.com/ansible/latest/dev_guide/developing_modules_general.html

But had trouble with errors with that format too.

trgill · 2024-05-29T20:33:51Z

tasks/main.yml

+        snapshot_lvm_mount_options: "{{ snapshot_lvm_mount_options | d(omit) }}"
+        snapshot_lvm_fstype: "{{ snapshot_lvm_fstype | d(omit) }}"
+        snapshot_lvm_mountpoint: "{{ snapshot_lvm_mountpoint | d(omit) }}"
+        snapshot_lvm_set: "{{ snapshot_lvm_set | to_json }}"


@richm is there a better way to pass snapshot_lvm_set?

I'd like it to be None if it is not set, but the way I've done it here I'm getting "{}\0" when I read it in the python.

Then if you define snapshot_lvm_set as a list of dict in the module, you can just use

snapshot_lvm_set: "{{ snapshot_lvm_set | d(omit) }}"

So, on the module side, I will remove the code that expects JSON and use dict()?

When I define it as you suggested, the module gets snapshot_lvm_set set to a string, right? But the string is no longer JSON (the values have single quotes).

So, on the module side, I will remove the code that expects JSON and use dict()?

yes

When I define it as you suggested, the module gets snapshot_lvm_set set to a string, right?

No, the module should get snapshot_lvm_set as a dict.

But the string is no longer JSON (the values have single quotes).

in defaults/main.yml you should have

snapshot_lvm_set: {}

richm · 2024-05-29T22:54:00Z

library/snapshot.py

+        snapshot_lvm_snapset_name=dict(type="str"),
+        snapshot_lvm_mount_options=dict(type="str"),
+        snapshot_lvm_mountpoint=dict(type="str"),
+        snapshot_lvm_set=dict(type="str"),


This should be a list of dict - see the storage role for examples e.g. https://github.com/linux-system-roles/storage/blob/main/library/blivet.py#L2131

snapshot_lvm_set=dict(type="list", elements="dict", options=dict(name=dict(type="str"), vg=dict(type="str"), lv=dict(type="str"), percent_space_required=dict(type="int"), ....

Sorry - not a list of dict but just a dict

richm · 2024-06-10T15:56:48Z

ping - any updates?

trgill · 2024-06-11T13:15:09Z

tests/verify-role-failed.yml

@@ -14,7 +14,7 @@
          __snapshot_failed_params.get('snapshot_lvm_percent_space_required')
          }}"
        snapshot_lvm_all_vgs: "{{
-          __snapshot_failed_params.get('snapshot_all')
+          __snapshot_failed_params.get('snapshot_lvm_all_vgs') | d(false)


@richm how should I default this to false when snapshot_lvm_all_vgs is not defined. Currently it defaults to "" and that causes an error when the module expects it to be a boolean.

Suggested change

__snapshot_failed_params.get('snapshot_lvm_all_vgs') | d(false)

__snapshot_failed_params.get('snapshot_lvm_all_vgs', false)

richm · 2024-06-13T16:51:47Z

tasks/main.yml

        verbosity: 2

-    - name: Parse raw output
+    - name: Set result


I don't think you need to do this - I believe the register: snapshot_cmd in the snapshot task creates the global variable snapshot_cmd

ok - fixed.

richm · 2024-06-18T00:48:38Z

@trgill This PR should fix all of those ansible-test (and flake8, pylint, black) issues. trgill#6 I didn't test it but hopefully the errors will be obvious

A big change was using module.run_command - which means most of the methods required the addition of the module argument - the way other modules handle this is by declaring a module class e.g. SnapshotModule(AnsibleModule) and putting all of the methods in this class - then you can just use self.module.run_command everywhere instead of having to pass module as a method argument.

richm · 2024-06-18T13:12:50Z

[citest]

trgill · 2024-06-19T13:56:44Z

[citest bad]

Signed-off-by: Todd Gill <[email protected]>

linux-system-roles#43 Signed-off-by: Todd Gill <[email protected]>

trgill · 2024-07-02T01:33:41Z

[citest bad]

trgill · 2024-07-02T01:37:39Z

@richm I rebased on your changes, will that include the fixes needed?

richm · 2024-07-02T13:21:17Z

@richm I rebased on your changes, will that include the fixes needed?

You'll also need to merge this PR - trgill#7

richm · 2024-07-02T14:50:56Z

[citest] - note that centos7 tests will fail due to the move to vault.centos.org

richm · 2024-07-02T17:02:30Z

tests/tests_thin_basic.yml

+            thin_pool_size: '3g'
+            size: "1g"
+            thin: true
+          - name: lv2


Is it enough to test with 1 thin pool? There seems to be a problem using multiple thin pools with the storage subsystem in el7 - perhaps something to do with blivet on el7? I get the following error:

Failed to commit changes to disk: Process reported exit code 3: --virtualsize may not be zero. Run `lvcreate --help' for more information.

@vojtechtrefny @japokorn any ideas? The storage role thin pool test uses only 1 volume - https://github.com/linux-system-roles/storage/blob/main/tests/tests_create_thinp_then_remove.yml#L37-L43 - which is why it passes on el7

If you really want to test with multiple volumes, then we'll need to define one list of volumes for el7, and another list for el8 and later.

yes, I'll update the test to only use 1 pool.

I'll test multiple pools manually.

@vojtechtrefny @japokorn it looks like if I add:

thin_pool_size: '3g'

whenever I allocate a volume that uses the thin pool it works as expected on RHEL 7 and Centos 7.

Is it enough to test with 1 thin pool? There seems to be a problem using multiple thin pools with the storage subsystem in el7 - perhaps something to do with blivet on el7?

We have quite an old version of blivet on EL7 so there could be a lot of bugs that were fixed between 7 and 8, but I don't remember anything specific about thin provisioning that could be causing this.

The storage role thin pool test uses only 1 volume - https://github.com/linux-system-roles/storage/blob/main/tests/tests_create_thinp_then_remove.yml#L37-L43 - which is why it passes on el7

Looks like something we should add to our test suite.

it looks like if I add: thin_pool_size: '3g' whenever I allocate a volume that uses the thin pool it works as expected on RHEL 7 and Centos 7.

We have some logic in the storage role to calculate size of logical volumes and thin pools when not specified here so it's possible this is somehow broken with older blivet. Can you please check @japokorn?

I'll try to test this scenario with the storage role and I'll report an upstream issue for us to track this.

Looks like the problem was actually invalid size for the second LV:

- name: lv2 thin_pool_name: thin_pool size: "15"

size: "15" is interpreted as 15 B and on CentOS 7 this is rounded down to 0 KiB so that's why lvcreate is complaining about --virtualsize may not be zero. With newer blivet on Fedora and CentOS 9/10 we automatically adjust this size to the minimal size which explains why it was failing only on CentOS 7.

trgill · 2024-07-08T20:00:33Z

[citest]

trgill · 2024-07-09T12:48:15Z

[citest bad]

richm · 2024-07-09T12:53:26Z

@trgill centos-7 testing is broken currently - it is being worked on - the only way to test that is locally with tox-lsr

trgill · 2024-07-09T12:54:55Z

@richm Ok - I'll run centos-7 tests locally

richm · 2024-07-09T23:21:12Z

In the latest commit, I still get errors with the thin pool tests on centos-7:

tests/tests_set_thin_basic.yml - Failed to commit changes to disk: (FSError('format failed: 1',), '/dev/mapper/test_vg2-lv7')

and tests_thin_basic.yml doesn't have thin_pool_size for all of the test volumes

Update to ignore thinpool LVs and support think provisioned sources. Signed-off-by: Todd Gill <[email protected]>

it returns rc, is_snapshot and should not be called as a boolean function. Signed-off-by: Todd Gill <[email protected]>

Signed-off-by: Todd Gill <[email protected]>

trgill · 2024-07-10T13:51:38Z

I updated the thin tests to only use 1 source VG. I tested locally on centos-7 and it works.

trgill · 2024-07-11T13:00:54Z

@richm I think this is ready to merge?

richm · 2024-07-13T21:47:59Z

[citest]

trgill · 2024-07-15T12:56:58Z

[citest bad]

richm · 2024-07-15T14:53:28Z

not sure what the problem is with centos-9, but tests pass locally. centos-7 passes locally also.

trgill requested review from spetrosi and richm as code owners May 20, 2024 03:46

trgill changed the title ~~Module work~~ rewrite snapshot.py as an Ansible module / add support for thin origins May 20, 2024

github-advanced-security bot found potential problems May 20, 2024

View reviewed changes

library/snapshot.py Fixed Show fixed Hide fixed

library/snapshot.py Fixed Show fixed Hide fixed

library/snapshot.py Fixed Show fixed Hide fixed

spetrosi reviewed May 20, 2024

View reviewed changes

trgill force-pushed the module_work branch from 27e8d00 to d6d9975 Compare May 22, 2024 01:03

github-advanced-security bot found potential problems May 22, 2024

View reviewed changes

library/snapshot.py Fixed Show fixed Hide fixed

library/snapshot.py Fixed Show fixed Hide fixed

library/snapshot.py Fixed Show fixed Hide fixed

trgill changed the title ~~rewrite snapshot.py as an Ansible module / add support for thin origins~~ feat: rewrite snapshot.py as an Ansible module / add support for thin origins May 22, 2024

trgill force-pushed the module_work branch from d6d9975 to f8531ae Compare May 22, 2024 13:05

trgill force-pushed the module_work branch from f8531ae to 2a807af Compare May 29, 2024 15:13