SBOM #403

Open
lackhove opened this issue Dec 4, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@lackhove

lackhove commented Dec 4, 2024

While the standalone builds already contain licensing info in the PYTHON.json file, it would be great if they could also contain a full SBOM in a standard format such as SPDX, similar to the binaries on python.org.

@Edward-Knight
Contributor

Second this. I'm currently working on manually writing a CycloneDX SBOM for a specific build of PBS we use downstream, following which I'll be working on some automated tooling. I'll share both as and when I can to hopefully get the ball rolling on upstreaming this 👍

@zanieb
Member

zanieb commented Dec 11, 2024

thank you!

@Edward-Knight
Contributor

I've attached an SBOM I've made for one of the past builds of PBS. It's the first SBOM I've written and I don't have particular expertise in this area, but I believe it is a good starting point.

Notable exclusions

The SBOM is mostly "complete", with a few exceptions I've outlined below. Of course one could always include more information and fill every possible field, but that way lies madness.

  • It does not include "dependency" information (under /dependencies)
    • This just allows you to add relationships between the already declared "components". It should be simple to say that all of the libraries are dependencies of CPython and that CPython itself is a dependency of PBS. However, I'm not sure whether some of the libraries are subdependencies of others (e.g. some of the X or tk related libs)
  • It does not include license or copyright information (either for the top-level component under /metadata/component/{licenses,copyright} or for dependencies under /components[]/{licenses,copyright})
    • This information is already tracked in this project, so shouldn't be too difficult to include
  • It does not include build information (e.g. compiler name and version, tooling versions, information about the CI system etc)
    • It isn't very clear where this information would be included. I believe the /formulation section would be most appropriate, but it also seems to be excluded from most SBOMs and left to a separate document (e.g. an MBOM)
      • This information isn’t included in https://github.com/CycloneDX/bom-examples
      • From what I can tell both the v1 cyclonedx-conan and v2 conan cyclonedx plugin don't include this information
      • Adoptium OpenJDK builds use metadata/tools and metadata/properties for this (incorrectly)
        • E.g. metadata/tools: {"name": "MacOS Compiler", "version": "clang (clang/LLVM from Xcode 15.2)"}
        • E.g. metadata/properties: {"name": "OS version", "value": "Darwin 23.6.0"}
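
For the missing "dependency" information, a minimal sketch of what could populate /dependencies is below. It assumes the flat graph described above (PBS depends on CPython, CPython depends on every library) and does not attempt to model subdependencies between libraries. The function name and the bom-ref values in the usage note are hypothetical.

```python
def build_dependencies(root_ref, cpython_ref, library_refs):
    """Return a CycloneDX `dependencies` array for a flat dependency graph.

    Assumption: every library is a direct dependency of CPython and has no
    dependencies of its own. Sub-dependencies (e.g. among X or tk libs) are
    not modelled here.
    """
    libs = sorted(library_refs)
    deps = [
        {"ref": root_ref, "dependsOn": [cpython_ref]},
        {"ref": cpython_ref, "dependsOn": libs},
    ]
    # Declare each library explicitly with an empty dependency list.
    deps.extend({"ref": ref, "dependsOn": []} for ref in libs)
    return deps
```

Calling build_dependencies("pbs-ref", "cpython-ref", ["zlib-ref", "openssl-ref"]) would give four entries, each keyed by the bom-ref of an already declared component.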

Guide to the format

Since a large blob of JSON can be intimidating at first, I'll quickly go over the general structure. At the top level there is some generic boilerplate to do with the SBOM itself:

{
    "$schema": "https://cyclonedx.org/schema/bom-1.6.schema.json",
    "bomFormat": "CycloneDX",
    "specVersion": "1.6",
    "serialNumber": "urn:uuid:f6a24d2b-f989-426f-a9af-19324d8b6949",
    "version": 1,
    "metadata": {
        "timestamp": "2024-12-16T16:00:00+00:00",
        "component": {...}
    },
    "components": [...]
}
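
As a rough sketch of how automated tooling might emit this boilerplate, the following builds the same top-level structure with a fresh serial number and a UTC timestamp. The function name new_bom is hypothetical; the field names and schema URL are taken from the example above.

```python
import json
import uuid
from datetime import datetime, timezone


def new_bom(metadata_component, components):
    """Wrap a metadata component and a component list in the top-level SBOM."""
    return {
        "$schema": "https://cyclonedx.org/schema/bom-1.6.schema.json",
        "bomFormat": "CycloneDX",
        "specVersion": "1.6",
        # UUID.urn renders as "urn:uuid:<uuid>", the form used above.
        "serialNumber": uuid.uuid4().urn,
        "version": 1,
        "metadata": {
            "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
            "component": metadata_component,
        },
        "components": list(components),
    }


if __name__ == "__main__":
    print(json.dumps(new_bom({"name": "python-build-standalone"}, []), indent=4))
```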

There is a "metadata component" and another list of "components". These all follow the same format. The "metadata component" is the one the SBOM is about, i.e. a particular build of PBS:

{
    "type": "application",
    "mime-type": "application/zstd",
    "bom-ref": "pkg:generic/python-build-standalone@20231002?download_url=https://github.com/indygreg/python-build-standalone/releases/download/20231002/cpython-3.10.13+20231002-x86_64_v2-unknown-linux-gnu-pgo-full.tar.zst&checksum=sha256:f1121cc0fccb1c5e867923f39e3e7d6413720554ec079eac022f5fc69e7ee83a",
    "authors": [
        {
            "name": "Gregory Szorc",
            "email": "[email protected]"
        }
    ],
    "name": "python-build-standalone",
    "version": "20231002",
    "description": "This project produces self-contained, highly-portable Python distributions. These Python distributions contain a fully-usable, full-featured Python installation: most extension modules from the Python standard library are present and their library dependencies are either distributed with the distribution or are statically linked.",
    "hashes": [
        {
            "alg": "SHA-256",
            "content": "f1121cc0fccb1c5e867923f39e3e7d6413720554ec079eac022f5fc69e7ee83a"
        }
    ],
    "purl": "pkg:generic/python-build-standalone@20231002?download_url=https://github.com/indygreg/python-build-standalone/releases/download/20231002/cpython-3.10.13+20231002-x86_64_v2-unknown-linux-gnu-pgo-full.tar.zst&checksum=sha256:f1121cc0fccb1c5e867923f39e3e7d6413720554ec079eac022f5fc69e7ee83a",
    "externalReferences": [
        {
            "url": "https://github.com/indygreg/python-build-standalone",
            "type": "vcs"
        },
        {
            "url": "https://github.com/indygreg/python-build-standalone/releases/tag/20231002",
            "type": "release-notes"
        },
        {
            "url": "https://github.com/indygreg/python-build-standalone/releases/download/20231002/cpython-3.10.13+20231002-x86_64_v2-unknown-linux-gnu-pgo-full.tar.zst",
            "type": "distribution",
            "hashes": [
                {
                    "alg": "SHA-256",
                    "content": "f1121cc0fccb1c5e867923f39e3e7d6413720554ec079eac022f5fc69e7ee83a"
                }
            ]
        }
    ]
}

This SBOM is for a specific release tarball and the mime-type and hashes reflect this. There are some self-explanatory fields like name, version, description, and authors (which I imagine will change to manufacturer and point to "Astral Software Inc." for future releases). The bom-ref, purl, and "distribution" type external reference all have duplicate information pointing to the exact build (as python-build-standalone@20231002 on its own does not uniquely define a release).
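
To illustrate how the duplicated identifier could be derived rather than typed by hand, here is a small sketch that hashes an artifact and builds the generic purl in the same shape as above. The helper names are hypothetical. The qualifier values are left unencoded to match the SBOM shown here; a stricter generator might percent-encode them.

```python
import hashlib


def sha256_hex(chunks):
    """Return the hex SHA-256 digest of an iterable of byte chunks."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def generic_purl(name, version, download_url, sha256):
    """Build a pkg:generic purl that pins one exact downloadable artifact."""
    return (
        f"pkg:generic/{name}@{version}"
        f"?download_url={download_url}&checksum=sha256:{sha256}"
    )
```

For a tarball on disk, the chunks could come from something like iter(lambda: f.read(1 << 20), b"") on a file opened in binary mode, so the whole archive is never held in memory at once.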

The array of components is then a list of the dependencies. I wrote the CPython one by hand, and have included a short bit of Python code that I used to generate the rest from the downloads.py list. As an example, here is the one for CPython (which has more detail than the other dependencies):

{
    "type": "application",
    "mime-type": "application/x-xz",
    "bom-ref": "pkg:generic/[email protected]?download_url=https://www.python.org/ftp/python/3.10.13/Python-3.10.13.tar.xz&checksum=sha256:5c88848668640d3e152b35b4536ef1c23b2ca4bd2c957ef1ecbb053f571dd3f6",
    "manufacturer": {
        "name": "Python Software Foundation",
        "address": {
            "country": "US",
            "region": "Oregon",
            "locality": "Beaverton",
            "postalCode": "OR 97008",
            "streetAddress": "9450 SW Gemini Dr. ECM# 90772"
        },
        "url": [
            "https://www.python.org"
        ]
    },
    "name": "CPython",
    "version": "3.10.13",
    "hashes": [
        {
            "alg": "SHA-256",
            "content": "5c88848668640d3e152b35b4536ef1c23b2ca4bd2c957ef1ecbb053f571dd3f6"
        }
    ],
    "purl": "pkg:generic/[email protected]?download_url=https://www.python.org/ftp/python/3.10.13/Python-3.10.13.tar.xz&checksum=sha256:5c88848668640d3e152b35b4536ef1c23b2ca4bd2c957ef1ecbb053f571dd3f6",
    "externalReferences": [
        {
            "url": "https://www.python.org/ftp/python/3.10.13/Python-3.10.13.tar.xz",
            "type": "distribution",
            "hashes": [
                {
                    "alg": "SHA-256",
                    "content": "5c88848668640d3e152b35b4536ef1c23b2ca4bd2c957ef1ecbb053f571dd3f6"
                }
            ]
        }
    ]
}

As noted above, all the "components" have a similar structure; the only difference is that I've set the manufacturer field instead of the authors field.
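
For readers who want a feel for generating the remaining components, the sketch below turns one download entry into a component of the same shape. It is not the script attached to this comment. It assumes each entry is a dict with "url", "sha256" and "version" keys; the function name, the default "library" type, and the suffix-to-MIME table are illustrative assumptions.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical mapping from archive suffix to the mime-type field.
MIME_BY_SUFFIX = {
    ".xz": "application/x-xz",
    ".gz": "application/gzip",
    ".bz2": "application/x-bzip2",
    ".zst": "application/zstd",
}


def component_from_download(name, entry, component_type="library"):
    """Build a CycloneDX component from an assumed download entry dict."""
    url, sha256, version = entry["url"], entry["sha256"], entry["version"]
    purl = f"pkg:generic/{name}@{version}?download_url={url}&checksum=sha256:{sha256}"
    hash_entry = {"alg": "SHA-256", "content": sha256}
    component = {
        "type": component_type,
        "bom-ref": purl,
        "name": name,
        "version": version,
        "hashes": [dict(hash_entry)],
        "purl": purl,
        "externalReferences": [
            {"url": url, "type": "distribution", "hashes": [dict(hash_entry)]}
        ],
    }
    # Only set mime-type when the archive suffix is one we recognise.
    suffix = PurePosixPath(urlparse(url).path).suffix
    if suffix in MIME_BY_SUFFIX:
        component["mime-type"] = MIME_BY_SUFFIX[suffix]
    return component
```

Fields such as manufacturer would still need to be filled in by hand or from another data source, since a download entry of this shape does not carry them.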

Attachments and useful links

@Edward-Knight
Contributor

I've been looking at software that consumes SBOMs (in this case Dependency-Track), and it seems like the generic purls (package URLs) aren't used for matching against vulnerability databases. My testing shows that CPE (Common Platform Enumeration) IDs do work though: using a CPE of cpe:2.3:a:openssl:openssl:3.0.11:*:*:*:*:*:*:* for openssl does correctly link to some vulnerabilities. We can't construct these automatically as we can with generic purls, but adding them to the metadata for a component and using them where available is probably a good idea.
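
One way this could look in tooling is a small hand-maintained table of CPE vendor and product names, with the cpe field set only for components that have an entry. This is a sketch under that assumption; the table and function names are hypothetical, and only the openssl row comes from the test described above.

```python
# Hand-maintained mapping from component name to CPE (vendor, product).
# Only the openssl row is taken from the comment above; other rows would
# need to be looked up and verified individually.
CPE_VENDOR_PRODUCT = {
    "openssl": ("openssl", "openssl"),
}


def cpe_for(name, version):
    """Return a CPE 2.3 string for a known component, else None."""
    known = CPE_VENDOR_PRODUCT.get(name)
    if known is None:
        return None
    vendor, product = known
    return f"cpe:2.3:a:{vendor}:{product}:{version}:*:*:*:*:*:*:*"


def add_cpe(component):
    """Set the component's `cpe` field in place when a mapping is available."""
    cpe = cpe_for(component["name"], component["version"])
    if cpe is not None:
        component["cpe"] = cpe
    return component
```

Components without a table entry are left untouched, so they keep only their generic purl.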

@charliermarsh charliermarsh added the enhancement New feature or request label Dec 18, 2024