Failed to import ansible collections with "duplicate key value violates unique" error #1986

Closed
hao-yu opened this issue Oct 1, 2024 · 1 comment · Fixed by #1992 · May be fixed by #1987


hao-yu commented Oct 1, 2024

Version
satellite-6.15.3.1-2.el8sat.noarch
python3.11-pulpcore-3.39.15-1.el8pc.noarch
python3.11-pulp-ansible-0.20.8-1.el8pc.noarch

Description of problem:
Pulp raised the following error when performing an incremental import for Ansible collections.

Errors:
 {"traceback"=>"  File \"/usr/lib/python3.11/site-packages/pulpcore/tasking/tasks.py\", line 61, in _execute_task
   result = func(*args, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^
 File \"/usr/lib/python3.11/site-packages/pulpcore/app/tasks/importer.py\", line 380, in import_repository_version
   for a_result in _import_file(os.path.join(rv_path, filename), res_class, retry=True):
 File \"/usr/lib/python3.11/site-packages/pulpcore/app/tasks/importer.py\", line 268, in _import_file
   a_result = resource.import_data(data, raise_errors=True)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File \"/usr/lib/python3.11/site-packages/import_export/resources.py\", line 813, in import_data
   result = self.import_data_inner(
            ^^^^^^^^^^^^^^^^^^^^^^^
 File \"/usr/lib/python3.11/site-packages/import_export/resources.py\", line 882, in import_data_inner
   raise row_result.errors[-1].error
 File \"/usr/lib/python3.11/site-packages/import_export/resources.py\", line 748, in import_row
   self.save_instance(instance, new, using_transactions, dry_run)
 File \"/usr/lib/python3.11/site-packages/import_export/resources.py\", line 491, in save_instance
   instance.save()
 File \"/usr/lib/python3.11/site-packages/pulpcore/app/models/base.py\", line 160, in save
   return super().save(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File \"/usr/lib64/python3.11/contextlib.py\", line 81, in inner
   return func(*args, **kwds)
          ^^^^^^^^^^^^^^^^^^^
 File \"/usr/lib/python3.11/site-packages/django_lifecycle/mixins.py\", line 169, in save
   save(*args, **kwargs)
 File \"/usr/lib/python3.11/site-packages/django/db/models/base.py\", line 814, in save
   self.save_base(
 File \"/usr/lib/python3.11/site-packages/django/db/models/base.py\", line 877, in save_base
   updated = self._save_table(
             ^^^^^^^^^^^^^^^^^
 File \"/usr/lib/python3.11/site-packages/django/db/models/base.py\", line 1020, in _save_table
   results = self._do_insert(
             ^^^^^^^^^^^^^^^^
 File \"/usr/lib/python3.11/site-packages/django/db/models/base.py\", line 1061, in _do_insert
   return manager._insert(
          ^^^^^^^^^^^^^^^^
 File \"/usr/lib/python3.11/site-packages/django/db/models/manager.py\", line 87, in manager_method
   return getattr(self.get_queryset(), name)(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File \"/usr/lib/python3.11/site-packages/django/db/models/query.py\", line 1805, in _insert
   return query.get_compiler(using=using).execute_sql(returning_fields)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File \"/usr/lib/python3.11/site-packages/django/db/models/sql/compiler.py\", line 1822, in execute_sql
   cursor.execute(sql, params)
 File \"/usr/lib/python3.11/site-packages/django/db/backends/utils.py\", line 67, in execute
   return self._execute_with_wrappers(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File \"/usr/lib/python3.11/site-packages/django/db/backends/utils.py\", line 80, in _execute_with_wrappers
   return executor(sql, params, many, context)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File \"/usr/lib/python3.11/site-packages/django/db/backends/utils.py\", line 84, in _execute
   with self.db.wrap_database_errors:
 File \"/usr/lib/python3.11/site-packages/django/db/utils.py\", line 91, in __exit__
   raise dj_exc_value.with_traceback(traceback) from exc_value
 File \"/usr/lib/python3.11/site-packages/django/db/backends/utils.py\", line 89, in _execute
   return self.cursor.execute(sql, params)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File \"/usr/lib/python3.11/site-packages/psycopg/cursor.py\", line 723, in execute
   raise ex.with_traceback(None)\n", "description"=>"duplicate key value violates unique constraint \"unique_is_highest\"\nDETAIL:  Key (collection_id, is_highest)=(019245d8-6ab2-7bfc-9015-fa80c3082eee, t) already exists."} 

How reproducible:
Tricky

Is this issue a regression from an earlier version:
No

Steps to Reproduce:

To simulate the issue, we need to hack the Pulp code so that Pulp generates the rows in a certain order.

  1. Edit "/usr/lib/python3.11/site-packages/pulp_ansible/app/modelresource.py" and add the "def export" method to class "CollectionVersionContentResource".

Why not change "def set_up_queryset" instead? See the "Additional info" section.

class CollectionVersionContentResource(BaseContentResource):
    <snip>
    def export(self, queryset, *args, **kwargs):
        # Export rows with is_highest=True first so that the import hits the bug.
        if queryset:
            queryset = queryset.order_by("-is_highest")

        return super().export(queryset, *args, **kwargs)

  2. Restart pulpcore services.
systemctl restart pulpcore*

  3. Create an ansible collection repository and sync the following collection and version.
collections:
- name: ansible.posix
  version: 1.5.4

  4. Create a content view, attach the ansible collection repo and then publish version 1.0.

  5. Export the content view version 1.0.

  6. Sync the ansible collection repository again with new versions.
collections:
- name: ansible.posix
  version: ">=1.5.4"

  7. Publish the content view version 2.0.

  8. Perform an incremental export for the content view version 2.0.

  9. Import the content view version 1.0 to another Satellite.

  10. Import the content view version 2.0 to another Satellite.

Actual behavior:
The import fails with the following error:
duplicate key value violates unique constraint "unique_is_highest"
DETAIL:  Key (collection_id, is_highest)=(019245d8-6ab2-7bfc-9015-fa80c3082eee, t) already exists.

Expected behavior:
Import successfully.

Additional info:
This is the root cause:

After the first complete import, the "ansible_collectionversion" table in the disconnected Satellite should have the following row:

pulpcore=# select name, version, is_highest from ansible_collectionversion;
 name  | version | is_highest 
-------+---------+------------
 posix | 1.5.2   | t
(1 row) 

During the incremental import, if the incremental JSON file lists the rows in the following order, the row (1.6.0, t) is inserted first and causes the error. This is because the existing row (1.5.4, t) in the disconnected Satellite has not yet been updated to is_highest = f at that point.

{
  "namespace": "ansible",
  "name": "posix",
  "version": "1.6.0",
  "is_highest": "1"
}
...
{
  "namespace": "ansible",
  "name": "posix",
  "version": "1.5.4",
  "is_highest": "0"
}
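
The ordering problem described above can be reproduced outside Pulp. The following is only a minimal sketch, assuming that "unique_is_highest" behaves like a partial unique index on (collection, is_highest) for rows where is_highest is true; it uses an in-memory SQLite database in place of PostgreSQL, and the table name "cv" and the helper functions are hypothetical, not Pulp code.

```python
import sqlite3


def make_db():
    """Create a toy table holding the state after the first (complete) import."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE cv (collection TEXT, version TEXT, is_highest INTEGER)")
    # Assumption: at most one row per collection may have is_highest = 1.
    db.execute(
        "CREATE UNIQUE INDEX unique_is_highest "
        "ON cv (collection, is_highest) WHERE is_highest = 1"
    )
    db.execute("INSERT INTO cv VALUES ('ansible.posix', '1.5.4', 1)")
    return db


def import_rows(db, rows):
    """Apply rows one at a time: update an existing version, else insert it."""
    for collection, version, is_highest in rows:
        cur = db.execute(
            "UPDATE cv SET is_highest = ? WHERE collection = ? AND version = ?",
            (is_highest, collection, version),
        )
        if cur.rowcount == 0:
            db.execute(
                "INSERT INTO cv VALUES (?, ?, ?)",
                (collection, version, is_highest),
            )


def try_import(rows):
    """Return 'ok' or 'failed' depending on whether the constraint is violated."""
    db = make_db()
    try:
        import_rows(db, rows)
        return "ok"
    except sqlite3.IntegrityError:
        return "failed"


# Order from the incremental JSON file above: the new highest version comes first.
bad_order = [("ansible.posix", "1.6.0", 1), ("ansible.posix", "1.5.4", 0)]
# Rows with is_highest = 0 first, so the old row is demoted before the insert.
good_order = sorted(bad_order, key=lambda row: row[2])

print(try_import(bad_order))
print(try_import(good_order))
```

In this sketch the only difference between the two runs is the row order, which suggests that processing the is_highest = false rows first would be enough to avoid the violation.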

While looking for a solution, I tried ordering the rows by "is_highest" as shown below, but the ordering applied in "set_up_queryset" is not honoured.

    def set_up_queryset(self):
        """
        :return: CollectionVersion content specific to a specified repo-version.
        """
        return CollectionVersion.objects.filter(pk__in=self.repo_version.content).order_by("is_highest")

This is because the queryset result is later processed in batches while it is written to a JSON file, in order to save memory. The batch processing code re-fetches the data by primary key, so the ordering is lost. See https://github.com/pulp/pulpcore/blob/main/pulpcore/app/importexport.py#L55-L56

The workaround is to wrap the "export()" method of the model resource to perform the re-ordering.
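
To illustrate why the ordering is lost and why re-ordering inside "export()" helps, here is a simplified pure-Python sketch. It is not the pulpcore implementation: TABLE, fetch_by_pks and export_in_batches are hypothetical stand-ins, and the sketch assumes that a membership lookup by primary key (like "pk__in") returns rows in storage order rather than in the order of the keys passed in.

```python
# Toy "table": primary key -> row, kept in primary-key (storage) order.
TABLE = {
    1: {"version": "1.6.0", "is_highest": True},
    2: {"version": "1.5.4", "is_highest": False},
    3: {"version": "1.5.5", "is_highest": False},
}


def fetch_by_pks(pks):
    """Mimic a pk membership filter: the order of `pks` is ignored."""
    wanted = set(pks)
    return [row for pk, row in TABLE.items() if pk in wanted]


def export_in_batches(ordered_pks, batch_size, reorder_key=None):
    """Re-fetch rows batch by batch; optionally re-sort each batch on export."""
    exported = []
    for start in range(0, len(ordered_pks), batch_size):
        batch = fetch_by_pks(ordered_pks[start:start + batch_size])
        if reorder_key is not None:
            # Corresponds to re-ordering inside the wrapped export() method.
            batch = sorted(batch, key=reorder_key)
        exported.extend(batch)
    return exported


# Ordering requested up front (is_highest=False first), as in set_up_queryset.
ordered_pks = sorted(TABLE, key=lambda pk: TABLE[pk]["is_highest"])

lost = export_in_batches(ordered_pks, batch_size=10)
kept = export_in_batches(
    ordered_pks, batch_size=10, reorder_key=lambda row: row["is_highest"]
)

print([row["version"] for row in lost])
print([row["version"] for row in kept])
```

Note that in this sketch the re-ordering only applies within each batch, so it illustrates the idea of the workaround rather than a guarantee about ordering across batches.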

mdellweg (Member) commented Oct 1, 2024

I wonder how much "is_latest" is a red herring here.

hao-yu added a commit to hao-yu/pulp_ansible that referenced this issue Oct 3, 2024
Avoid "duplicate key value violates unique" error
when importing ansible collections.

fixes pulp#1986
mdellweg added a commit to mdellweg/pulp_ansible that referenced this issue Oct 9, 2024
@mdellweg mdellweg linked a pull request Oct 9, 2024 that will close this issue
mdellweg added a commit that referenced this issue Oct 9, 2024
mdellweg added a commit to mdellweg/pulp_ansible that referenced this issue Oct 28, 2024
This field is not correct anyway and it is removed in newer versions. We
only set it to False here hoping it will not be worse but also as the
minimal possible change.

Fixes pulp#1986
mdellweg added a commit to mdellweg/pulp_ansible that referenced this issue Oct 29, 2024
This field is not correct anyway and it is removed in newer versions. We
only set it to False here hoping it will not be worse but also as the
minimal possible change.

Fixes pulp#1986
mdellweg added a commit to mdellweg/pulp_ansible that referenced this issue Oct 29, 2024
This field is not correct anyway and it is removed in newer versions. We
only set it to False here hoping it will not be worse but also as the
minimal possible change.

This is a backportable version of the fix.

relates to pulp#1986
mdellweg added a commit that referenced this issue Oct 29, 2024
This field is not correct anyway and it is removed in newer versions. We
only set it to False here hoping it will not be worse but also as the
minimal possible change.

This is a backportable version of the fix.

relates to #1986
patchback bot pushed a commit that referenced this issue Oct 29, 2024
This field is not correct anyway and it is removed in newer versions. We
only set it to False here hoping it will not be worse but also as the
minimal possible change.

This is a backportable version of the fix.

relates to #1986

(cherry picked from commit 0474b11)
mdellweg added a commit that referenced this issue Oct 29, 2024
This field is not correct anyway and it is removed in newer versions. We
only set it to False here hoping it will not be worse but also as the
minimal possible change.

This is a backportable version of the fix.

relates to #1986

(cherry picked from commit 0474b11)
mdellweg added a commit that referenced this issue Oct 29, 2024
This field is not correct anyway and it is removed in newer versions. We
only set it to False here hoping it will not be worse but also as the
minimal possible change.

This is a backportable version of the fix.

relates to #1986

[noissue]

(cherry picked from commit 0474b11)