Metadata version 2.0 not supported, probably unintentionally #17168

Open
dnicolodi opened this issue Nov 25, 2024 · 1 comment · May be fixed by #17174
Labels
bug 🐛 requires triaging maintainers need to do initial inspection of issue

Comments

@dnicolodi

I was reading the code in warehouse.forklift.metadata and I found that it tries to support metadata version 2.0:

SUPPORTED_METADATA_VERSIONS = {"1.0", "1.1", "1.2", "2.0", "2.1", "2.2", "2.3", "2.4"}

def _validate_metadata(metadata: Metadata, *, backfill: bool = False):
    # Add our own custom validations ontop of the standard validations from
    # packaging.metadata.
    errors: list[InvalidMetadata] = []
    # We restrict the supported Metadata versions to the ones that we've implemented
    # support for.
    if metadata.metadata_version not in SUPPORTED_METADATA_VERSIONS:
        errors.append(
            InvalidMetadata(
                "metadata-version",
                f"{metadata.metadata_version!r} is not a valid metadata version",
            )
        )

However, this additional validation function is called only after the metadata validation in the packaging metadata parser has already been done:

# Validate the metadata using our custom rules, which we layer ontop of the
# built in rules to add PyPI specific constraints above and beyond what the
# core metadata requirements are.
_validate_metadata(metadata, backfill=backfill)

packaging does not support metadata version 2.0, which technically does not exist as a standard. Therefore, trying to parse metadata version 2.0 raises an exception:

>>> import warehouse.forklift.metadata
>>> warehouse.forklift.metadata.parse(b'''\
... Metadata-Version: 2.0
... Name: foo
... Version: 1.2.3
... ''')

    | packaging.metadata.InvalidMetadata: '2.0' is not a valid metadata version
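The ordering described above can be modeled without either library. This is only a minimal sketch: `upstream_parse`, `downstream_validate`, the local `InvalidMetadata` class, and the contents of `UPSTREAM_VALID_VERSIONS` are hypothetical stand-ins for illustration, not the actual packaging or warehouse code.

```python
# Assumed stand-in for the version list enforced by the upstream parser.
# The values are illustrative; 2.0 is deliberately absent, as in the issue.
UPSTREAM_VALID_VERSIONS = ["1.0", "1.1", "1.2", "2.1", "2.2", "2.3", "2.4"]

# The downstream set quoted in the issue, which still lists 2.0.
SUPPORTED_METADATA_VERSIONS = {"1.0", "1.1", "1.2", "2.0", "2.1", "2.2", "2.3", "2.4"}


class InvalidMetadata(ValueError):
    """Hypothetical stand-in for the parser's error type."""


def upstream_parse(version: str) -> str:
    # First layer: rejects anything the upstream list does not contain.
    if version not in UPSTREAM_VALID_VERSIONS:
        raise InvalidMetadata(f"{version!r} is not a valid metadata version")
    return version


def downstream_validate(version: str) -> None:
    # Second layer: only ever sees versions the first layer accepted.
    if version not in SUPPORTED_METADATA_VERSIONS:
        raise InvalidMetadata(f"{version!r} is not a valid metadata version")


def parse(version: str) -> str:
    parsed = upstream_parse(version)
    downstream_validate(parsed)
    return parsed


# Entries of the downstream set that can never be reached through parse().
unreachable = SUPPORTED_METADATA_VERSIONS - set(UPSTREAM_VALID_VERSIONS)
```

In this model, `unreachable` contains only `"2.0"`: the second layer lists it, but the first layer raises before the second one runs.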

Apparently, supporting metadata version 2.0 is no longer necessary: AFAIK, anyone who tried to upload a package using metadata 2.0 to PyPI would have encountered an error, and this seems to have gone unnoticed. However, if it is decided to keep metadata 2.0 support, monkeypatching packaging may be the easiest way forward. This seems to work:

>>> import packaging.metadata
>>> packaging.metadata._VALID_METADATA_VERSIONS = ['1.0', '1.1', '1.2', '2.0', '2.1', '2.2', '2.3', '2.4']
>>> import warehouse.forklift.metadata
>>> warehouse.forklift.metadata.parse(b'''\
... Metadata-Version: 2.0
... Name: foo
... Version: 1.2.3
... ''')
<packaging.metadata.Metadata object at 0x10228c2d0>

but I haven't verified that this approach works as intended in all aspects.
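If the monkeypatch route were taken, one way to limit its reach would be to scope it with `unittest.mock.patch.object` instead of reassigning the module attribute for the whole process. The sketch below uses a hypothetical `fake_packaging_metadata` namespace with assumed values in place of the real module, so it only illustrates the scoping and says nothing about how packaging behaves internally.

```python
from types import SimpleNamespace
from unittest import mock

# Hypothetical stand-in for a module that keeps a private version list.
# The values are assumed for illustration.
fake_packaging_metadata = SimpleNamespace(
    _VALID_METADATA_VERSIONS=["1.0", "1.1", "1.2", "2.1", "2.2", "2.3", "2.4"]
)


def is_valid_version(version: str) -> bool:
    # The list is looked up at call time, so a patch is visible here.
    return version in fake_packaging_metadata._VALID_METADATA_VERSIONS


before = is_valid_version("2.0")

# Temporarily extend the list; the original value is restored on exit.
patched_versions = [*fake_packaging_metadata._VALID_METADATA_VERSIONS, "2.0"]
with mock.patch.object(
    fake_packaging_metadata, "_VALID_METADATA_VERSIONS", patched_versions
):
    during = is_valid_version("2.0")

after = is_valid_version("2.0")
```

Here "2.0" is accepted only inside the `with` block. Whether this is preferable to a permanent patch depends on where parsing is invoked, which is not something this sketch can answer.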

@dnicolodi dnicolodi added bug 🐛 requires triaging maintainers need to do initial inspection of issue labels Nov 25, 2024
@di
Member

di commented Nov 25, 2024

Indeed, looks like we have inadvertently dropped support for Metadata 2.0:

SELECT
    FORMAT_DATE('%Y-%m', upload_time) AS upload_month,
    count(*) AS count
FROM
    `bigquery-public-data.pypi.distribution_metadata`
WHERE metadata_version = '2.0'
    AND DATE(upload_time) >= DATE_SUB(CURRENT_DATE(), INTERVAL 24 MONTH)
GROUP BY 1
ORDER BY 1

upload_month count
2022-11 23
2022-12 266
2023-01 132
2023-02 106
2023-03 159
2023-04 219
2023-05 174
2023-06 157
2023-07 95
2023-08 196
2023-09 115
2023-10 143
2023-11 182
2023-12 61
2024-01 177
2024-02 51
2024-03 75

(Note: there is nothing after March 2024, which correlates with #15631.)

I think, given the low historic counts here & lack of outcry from users, we can just consider this deprecated and drop it from SUPPORTED_METADATA_VERSIONS rather than special-case it.
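For a rough sense of scale, the monthly counts above can be tallied directly. The sketch below only restates the query result from this comment; the variable names are illustrative.

```python
# Monthly upload counts for Metadata-Version 2.0, copied from the query result.
uploads_by_month = {
    "2022-11": 23, "2022-12": 266, "2023-01": 132, "2023-02": 106,
    "2023-03": 159, "2023-04": 219, "2023-05": 174, "2023-06": 157,
    "2023-07": 95, "2023-08": 196, "2023-09": 115, "2023-10": 143,
    "2023-11": 182, "2023-12": 61, "2024-01": 177, "2024-02": 51,
    "2024-03": 75,
}

total_uploads = sum(uploads_by_month.values())
month_count = len(uploads_by_month)
average_per_month = total_uploads / month_count

# "YYYY-MM" keys sort chronologically, so max() gives the last month with uploads.
last_month_with_uploads = max(uploads_by_month)

print(f"{total_uploads} uploads over {month_count} months "
      f"(about {average_per_month:.0f} per month), last in {last_month_with_uploads}")
```

That comes to 2331 uploads over the 17 months listed, with 2024-03 as the last month that has any.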

dnicolodi added a commit to dnicolodi/warehouse that referenced this issue Nov 25, 2024
The metadata version is first validated by `packaging` and then by the
additional validation code in warehouse.forklift.metadata. Therefore,
only metadata versions supported by `packaging` can be supported.

Metadata version 2.0 has never officially been codified. `packaging`
does not accept 2.0 as a valid metadata version. To avoid confusion,
remove 2.0 from the list of supported metadata versions.

While at it, describe this fact in a comment, and add an assert.

Fixes pypi#17168.
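
The referenced commit is not shown in this thread. As a rough sketch of what such an assert could look like, the check below verifies that every version accepted downstream is also accepted upstream. `check_supported_versions` and `UPSTREAM_VALID_VERSIONS` are hypothetical names with assumed values, not the code from the commit.

```python
# Assumed stand-in for the list of versions the upstream parser accepts.
UPSTREAM_VALID_VERSIONS = ["1.0", "1.1", "1.2", "2.1", "2.2", "2.3", "2.4"]

# The downstream set with 2.0 removed, as the commit message describes.
SUPPORTED_METADATA_VERSIONS = {"1.0", "1.1", "1.2", "2.1", "2.2", "2.3", "2.4"}


def check_supported_versions(supported: set[str], upstream: list[str]) -> None:
    # A version listed downstream but rejected upstream can never be reached,
    # so fail loudly as soon as the two lists drift apart.
    unreachable = set(supported) - set(upstream)
    assert not unreachable, (
        f"versions not accepted by the upstream parser: {sorted(unreachable)}"
    )


check_supported_versions(SUPPORTED_METADATA_VERSIONS, UPSTREAM_VALID_VERSIONS)
```

Running the same check with "2.0" added back to the downstream set raises `AssertionError`, which is the situation this issue reports.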