Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new spec for go package URLs #338

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

maceonthompson
Copy link

The current PURL specification for Go was created before Go 1.11 modules and thus has namespace inconsistencies and lacks semantic versioning.

Although in many cases a module path corresponds directly to the URL of the hosting repository, that is not always true. The URL formed from the module path may be an endpoint that serves a redirect to the true host. This indirection protects projects that for whatever reason must change their hosting provider: their module names will continue to work. Consequently, it is undesirable to encode any aspect of the underlying hosting system as part of the PURL.

In essence, all Go modules form a single namespace. Since it is used by the majority of Go programmers, we propose to represent this namespace by the empty string. Though not included in this commit, other namespaces could be possible and would represent package managers and/or build tools that are alternatives to the go command.

The go type proposed here fixes the current issues by removing the namespace, using valid Go module versions (including pseudoversions), and adds some extra functionality to encode optional information about specific builds (GOOS, GOARCH, etc).

If accepted, all tools maintained by the Go project (such as govulncheck and pkg.go.dev) that surface PURLs will use this new type to provide canonical PURLs for Go modules and packages

Copy link
Contributor

@matt-phylum matt-phylum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also #196 #294 #308

This is a breaking change that affects all software utilizing PURL for Go. Personally, I don't think there's anything fundamentally wrong with pkg:golang except that the description is outdated, and I'm sure it can be fixed without making this level of breaking change. Maintaining the separation of namespace and name and putting the entire Go package ID into the PURL name makes PURLs difficult for human users to work with.

------
``go`` for Go modules:

- The ``namespace`` field is empty and implies the go mod proxy.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the field empty or does it imply the go mod proxy? It can't be both.

- The ``name`` will be the full module path.
- The ``subpath`` will represent the package path within a module.
- The ``version`` will be a valid go version or pseudoversion, or empty.
- Additional Build information for binaries can be included as ``qualifiers`` (i.e VCS info, go version info, GoArch/GoOS info etc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The additional information should be explicitly defined here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exactlty. be specific in the spec, so we all are on the same page.

``go`` for Go modules:

- The ``namespace`` field is empty and implies the go mod proxy.
- The ``name`` will be the full module path.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably specify that it is case sensitive. pkg:golang incorrectly states that it is not case sensitive and must be lowercased.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly. this is what the whole #308 is about.
Please don't repeat the mistakes from the past.

@maceonthompson
Copy link
Author

See also #196 #294 #308

Thanks for pointing at these! This is essentially a combination of #196 and #308 (with the addition of qualifiers for build info). They go into more detail than this proposal, but especially in the case of namespaces #63 (comment) is a good example as to why dropping name in favor of an entirely coded namespace would be more useful. I understand that having a bunch of %2F in the PURL is ugly for humans, but is (we feel) necessary to ensure that go PURLs are consistent (which is to say that go module -> PURL is injective, a go module cannot be represented by different PURLs).

Say you have a module with the path host.com/maybeuser/module.
With the current type definition, both pkg:golang/host.com/maybeuser/module and pkg:golang/host.com/maybeuser%2Fmodule, could represent that module. In order for PURLs to canonically and uniquely define go modules in the way that they are defined on pkg.go.dev or the go module proxy, they must be unique as well.

@matt-phylum
Copy link
Contributor

Say you have a module with the path host.com/maybeuser/module.
With the current type definition, both pkg:golang/host.com/maybeuser/module and pkg:golang/host.com/maybeuser%2Fmodule, could represent that module. In order for PURLs to canonically and uniquely define go modules in the way that they are defined on pkg.go.dev or the go module proxy, they must be unique as well.

I think the better solution to this problem is that pkg:golang/host.com/maybeuser%2Fmodule stays illegal. It'd be better if the documentation explicitly stated it were illegal, but based on the examples and test cases the correct form is pkg:golang/host.com/maybeuser/module, and based on the reference parsing and formatting algorithms it's clear that these PURLs are distinct.

However, "a go module cannot be represented by different PURLs" is not generally the case:

  • The PURL spec describes a canonical format for PURLs, but users and even commonly used PURL implementations often get this wrong and produce non-canonical PURLs which must still be considered equal. For example, pkg:golang/host%2Ecom/maybeuser/module is a non-canonical, valid, PURL which refers to the same package.
  • A PURL may have qualifiers which may or may not be critical to the PURL. A PURL with a ?goarch is a different PURL which refers to the same module, but a PURL with a ?repository_url (or however the module proxy is specified) is a different PURL which may refer to a different module (probably more likely in other ecosystems).

@jkowalleck jkowalleck added Proposed new type type: golang Proposed new type as well as component discussions labels Nov 8, 2024
@jkowalleck
Copy link
Member

jkowalleck commented Nov 8, 2024

This is a breaking change that affects all software utilizing PURL for Go.

I'd disagree. In fact, it is non-breaking, as it adds a completely new purl type. Therefore, no breaking changes are introduced.

@matt-phylum
Copy link
Contributor

It is breaking because no existing PURL software expects pkg:go, and new PURL software will not expect pkg:golang. This creates a compatibility problem where either the PURL is rejected as an unrecognized type or software on different sides of the breakage don't understand each other. If this is merged, all software that works with Go PURLs will need to be updated to accept both types of Go PURL and convert before they interoperate again.

@jkowalleck
Copy link
Member

It is breaking because no existing PURL software expects pkg:go [...]

this is true to every newly proposed PURL Type :-)
And none of them is a breaking change - neither in spec nor in behaviour.

this PR is trying to add a new type go. the existing golang is not touched at all.

@matt-phylum
Copy link
Contributor

The problem is that this is not a new type. The go type is intended to replace golang.

@jkowalleck
Copy link
Member

The problem is that this is not a new type.

it is not? Could you point me to the existing go type?

The go type is intended to replace golang.

I wonder how you come to this conclusion.
this very PR adds a new type, it does neither obsolete nor deprecate the existing golang type.

@matt-phylum
Copy link
Contributor

I wonder how you come to this conclusion.
this very PR adds a new type, it does neither obsolete nor deprecate the existing golang type.

From the PR description:

If accepted, all tools maintained by the Go project (such as govulncheck and pkg.go.dev) that surface PURLs will use this new type to provide canonical PURLs for Go modules and packages

golang is the type currently used for Go modules and packages. For example: https://github.com/anchore/syft/blob/3c070e0ad9d69c0f2191be52e2f2fb4904bcd558/syft/pkg/cataloger/golang/package_test.go#L24 . This PR is introducing a second, more preferred type for the same purpose.

@jkowalleck
Copy link
Member

I wonder how you come to this conclusion.
this very PR adds a new type, it does neither obsolete nor deprecate the existing golang type.

From the PR description:

If accepted, all tools maintained by the Go project (such as govulncheck and pkg.go.dev) that surface PURLs will use this new type to provide canonical PURLs for Go modules and packages

which is a behavioural change in a downstream application. This is out of scope of this spec, and not in our hands at all - we have no authority there.

golang is the type currently used for Go modules and packages. For example: https://github.com/anchore/syft/blob/3c070e0ad9d69c0f2191be52e2f2fb4904bcd558/syft/pkg/cataloger/golang/package_test.go#L24 . This PR is introducing a second, more preferred type for the same purpose.

exactly this paragraph makes it clear: this is a non-breaking change.

Causing no breaking change is the whole point of introducing a new purl type, instead of modifying an exising one.

@jkowalleck
Copy link
Member

jkowalleck commented Nov 8, 2024

I'm sure it can be fixed without making this level of breaking change.

i don't think so. #308 makes this clear: the existing spec has flaws that require breaking changes to fix them

The only way to fix golang is

  • a) introduce breaking changes in the existing purl-type << undesired !!!
  • b) introduce a new purl-type << feasible
  • c)
    1. have the PURL spec modified to allow versioning of purl-types << burocratic efforts that might lead to nothing
    2. if c)1. was successful: craft a purl-type golang version 2
    3. else fall back to a) or b)

@matt-phylum
Copy link
Contributor

Introducing a new type for an existing type is a breaking change to the PURL ecosystem. Implementations that use golang can continue to use golang and their golang PURLs will still be golang PURLs, but PURL has no negotiation mechanism where all the software that's going to read the PURLs agrees with the software that writes the PURLs on whether to use go or golang to describe Go dependencies.

If you start writing SBOMs that have go, they will be processed incorrectly by software that doesn't support go. If you continue writing SBOMs that have golang, they will be processed incorrectly by software that doesn't support golang. If you combine SBOMs using software that doesn't understand that go and golang are really the same type, the dependencies will be duplicated in the output. If you query go or golang packages against a vulnerability database, you have a 50/50 chance of finding the vulnerabilities unless the database understands both and converts golang to go.

Keeping golang is incompatible with the "a go module cannot be represented by different PURLs" goal of this PR.

You cannot just fix a PURL type by introducing a new type. Even if PURL libraries are updated to support transparently upgrading the old type into the new type on read, any software that is comparing pre-canonicalized PURL strings will need updates.

the existing spec has flaws that require breaking changes to fix them

What are the flaws that require breaking changes? #308 is about the path being incorrectly converted to lowercase, which is much more easily fixed by just not doing that.

@jkowalleck
Copy link
Member

jkowalleck commented Nov 8, 2024

Introducing a new type for an existing type is a breaking change to the PURL ecosystem.

how?

If a tool that produced purls would change it's behaviour by using the new purl-type, where they've used the other one before - this would be a breaking change in that very tool.
This is out of the scope of the purl spec -- we do not have authority there.

Implementations that use golang can continue to use golang and their golang PURLs will still be golang PURLs, but PURL has no negotiation mechanism where all the software that's going to read the PURLs agrees with the software that writes the PURLs on whether to use go or golang to describe Go dependencies.

So?
This is true to every purl type that is added over time.
An implementation written 2 years ago might not know the purl type that was defined yesterday.
This is by design and was never an issue. This is out of the scope of the purl spec -- we do not have authority there.

Keeping golang is incompatible with the "a go module cannot be represented by different PURLs" goal of this PR.

A PR tells a story, and the effective patch gets updated along with the discussions on a PR.
the initial PR description is usually not updated in accordance with the effective patch.

(PS: I review the content of the PR. and at the time of review, I saw no breaking change.
I was starting the "breaking" discussion in expectation that you'd agree that is no longer a breaking change, based on the current state of the PR.
I am happy we are discussing the topic anyway, i might be wrong, and I still need to learn.)

You cannot just fix a PURL type by introducing a new type. Even if PURL libraries are updated to support transparently upgrading the old type into the new type on read, any software that is comparing pre-canonicalized PURL strings will need updates.

how comes?

the existing spec has flaws that require breaking changes to fix them

What are the flaws that require breaking changes? #308 is about the path being incorrectly converted to lowercase, which is much more easily fixed by just not doing that.

the curerent golang spec says: the path MUST be lowercased.
This is wrong in terms of actual go dependency management: the path MUST NOT be lowercased.
Changing MUST to MUST NOT in golang purl-type is a breaking change of the specification.

``go`` for Go modules:

- The ``namespace`` field is empty and implies the go mod proxy.
- The ``name`` will be the full module path.
Copy link
Member

@jkowalleck jkowalleck Nov 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- The ``name`` will be the full module path.
- The ``name`` is the full module path. It MUST be unmodified, and follow the `Go Module Reference <https://go.dev/ref/mod#go-mod-file-ident>`_.

this change would close #308

Copy link
Contributor

@matt-phylum matt-phylum Nov 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

- - The ``name`` will be the full module path.
+ - The ``name`` is the full module path. In case of an URL: protocol MUST be lowercased; host-part MUST be lowercased; path-part MUSTbe unmodified, as it is case-sensitive.

this change would close #308

I don't think this is correct.

  1. I don't think it's legal to include a protocol in the module path. Go makes some HTTPS requests to resolve a VCS URL to download the package from (usually this is delegated to the proxy).
  2. The host part is also part of the case sensitive module path. It should not be lowercased. Uppercase characters are currently forbidden by Go for modules. I don't think it's worthwhile or really correct for the PURL spec to be specifying how to convert an invalid module path into a valid module path, I don't think it's worthwhile for the PURL spec to be specifying how to validate Go module paths, this doesn't cover all the restrictions, and this may cause problems if Go ever changes the restrictions for some reason.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re 1: I see. i was wrong there. Adjusted my suggestion for the protocol.
re 2: the host-part is, per URL-spec case-insensitive, and is normalized to lowercase.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as Go is concerned, it's usually a host-part but it has additional restrictions and it is case sensitive: https://go.dev/ref/mod#go-mod-file-ident

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I will modify my change-suggestion accordingly. does it fit better, now?


- The ``namespace`` field is empty and implies the go mod proxy.
- The ``name`` will be the full module path.
- The ``subpath`` will represent the package path within a module.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- The ``subpath`` will represent the package path within a module.
- The ``subpath`` is the unmodified package path within a module.

- The ``namespace`` field is empty and implies the go mod proxy.
- The ``name`` will be the full module path.
- The ``subpath`` will represent the package path within a module.
- The ``version`` will be a valid go version or pseudoversion, or empty.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- The ``version`` will be a valid go version or pseudoversion, or empty.
- The ``version`` may be a valid go version or pseudoversion, omitted when empty.

@matt-phylum
Copy link
Contributor

Adding a new type for a new type is much different than adding a new type for an existing type. An old tool not recognizing a truly new type is expected, but an old tool not recognizing Go PURLs anymore because a tool producing the data says that golang is now spelled go is a breaking change. You can argue that this isn't a breaking change in the PURL spec itself because it doesn't change golang, but it necessitates a breaking change in every current implementation of Go PURLs and complicates implementations of Go PURL consuming software as long as there are both go and golang PURLs going around.

Changing "MUST be lowercased" to "MUST NOT be lowercased" is a much less impactful change than this. From what I've seen, names with uppercase characters are uncommon, and an outdated implementation that is incorrectly lowercasing is still working correctly for all names that do not contain uppercase characters to lowercase. I would even say that on a larger scale it is not a breaking change because:

  • An outdated PURL producer that incorrectly lowercases an ID containing capitals produces the wrong PURL, but today those producers are producing exactly the same PURL and calling it correct despite referring to the wrong package.
  • An outdated PURL consumer that incorrectly lowercases an ID containing capitals reads the wrong ID, but today those consumers are already reading exactly the same ID and calling it correct despite referring to the wrong package.

In both cases, the PURL is still parsed successfully and the meaning of the PURL is unchanged with respect to the current "MUST be lowercased" spec. The only differences would be that the canonical form changes¹ and a new consumer receiving a PURL from an old producer might be more likely to expect that the ID refers to the correct package, but since there is no good way for an outdated consumer to recover the correct ID after an outdated producer lowercases it, any consumer that relies on getting the correct ID (eg to resolve the package files) is likely already broken and not lowercasing the name can only improve the behavior in that situation.

This causes the same alignment problems as introducing a go type, except that if the correct ID is lowercase, no problem occurs because lowercasing is already producing the correct PURL.

¹ Due to underspecification in the text and tests, I wouldn't trust incoming PURLs to be in the canonical form as my implementation understands it. There are numerous minor differences in which characters are escaped when (and sometimes how), so if you're accepting PURLs from an external source, even if you don't expect user-entered, non-canonical PURLs in that source, you should be canonicalizing those PURLs yourself if your application depends on them all being canonical for the same definition of canonical.

@matt-phylum
Copy link
Contributor

Go isn't the only ecosystem that has this problem of incorrect name normalization rules in this repo. I'm also aware of:

@zpavlinovic
Copy link

zpavlinovic commented Nov 14, 2024

Introducing a new type for an existing type is a breaking change to the PURL ecosystem.

If this is indeed true, then there is something really wrong with PURL: it does not allow for evolution. On the one hand, we cannot add modifications to the existing specification that could introduce breaking changes. On the other hand, we cannot introduce a new type because somehow that is a breaking change as well. So one is pretty much stuck with slight variations of the initial spec. Specs should be allowed to evolve just the way the software does.

There should really be a way to add versioning on top of PURL itself. What is being proposed here might in essence be just that for the go spec.

@pombredanne
Copy link
Member

@maceonthompson Thanks for putting this together! this makes a lot sense, and we have an issue with Go alright. Let me look at the comments in details and come back with my 2 cents!

@pombredanne
Copy link
Member

@matt-phylum re:

Introducing a new type for an existing type is a breaking change to the PURL ecosystem.

I am not sure that's hte case, but a new type vs. updating the existing type demands some careful thinking :)


pkg:go/google.golang.org%2Fgenproto#googleapis/api/annotations
pkg:go/github.com%2Fjmorion%[email protected]#api
pkg:go/golang.org%2Fx%2Fvuln?goversion=1.23.2&vcs=git&vcs_modified=true#cmd/govulncheck
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a likely problem with the use of subpath: there is no way to determine where the module ends and the package starts in the general case, is there?
For instance, in the path google.golang.org/genproto/googleapis/api/annotations how can I determine safely that google.golang.org/genproto is a module and that googleapis/api/annotations is a package inside this module? I need either a go proxy lookup or a full filesystem to locate a go.mod/go.sum file, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a way, if the module's code is available to you, to determine from a package import where module path ends and the package path begins by making HTTP requests.

I think the use of the subpath here is good because it puts the burden of determining this on whatever generates the PURL, which is likely aware of Go and either has the module paths or is most likely to be able to find the module path from the full package path. Then if you want to use a tool that checks PURLs against a database of information about modules (eg vulnerabilities), the tool already has all the information it needs. Otherwise, either the tool would need to make external API calls to figure out the module path of the PURL or the database would need to have an entry for every package in the module.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to add to @matt-phylum's comment. If a tool is producing a PURL for a Go artifact, then it can use go version, Debug.BuildInfo, or packages.Load to get information about the package and its corresponding module. The encoding proposed here then makes it clear what the modules and packages are.

@pombredanne
Copy link
Member

BTW, an elephant in the room is whether the distinction between a namespace and name makes sense not only here, but also in the whole spec, globally.

I found myself using a variable with a "namespace/name" substring more often than not.
Then, how to split this in optional namespace and name could become a type-specific distinction, but the general concept would be that of "namespace/name", which could look like:

With this the whole google.golang.org/genproto/googleapis/api/annotations would be the namespace/name and would not have a specific split in Go, all would be in the name?
(and the same could apply where relevant to other package types)

It could have a minimal impact on the spec.


pkg:go/google.golang.org%2Fgenproto#googleapis/api/annotations
pkg:go/github.com%2Fjmorion%[email protected]#api
pkg:go/golang.org%2Fx%2Fvuln?goversion=1.23.2&vcs=git&vcs_modified=true#cmd/govulncheck
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the plan to include all the buildinfo structure as qualifiers?
If so, this would only apply in a built binary?

Copy link
Member

@jkowalleck jkowalleck Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point.
If so, all the qualifiers MUST be documented in the type-spec.

currently it reads:

Additional Build information for binaries can be included as qualifiers (i.e VCS info, go version info, GoArch/GoOS info etc)

I am afraid this documentation is insufficient.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will expand on this.

pkg:go/google.golang.org%2Fgenproto#googleapis/api/annotations
pkg:go/github.com%2Fjmorion%[email protected]#api
pkg:go/golang.org%2Fx%2Fvuln?goversion=1.23.2&vcs=git&vcs_modified=true#cmd/govulncheck
pkg:go/golang.org%2Fx%[email protected]?goversion=1.23.2#cmd/govulncheck
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the Go module versions always to be prefixed with a v?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A version identifies an immutable snapshot of a module, which may be either a release or a pre-release. Each version starts with the letter v, followed by a semantic version.
-- https://go.dev/ref/mod#versions

version could also be a pseudo-version -- a git-tag, a git-commit-hash, or something like this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A pseudoversion is a special kind of version that also starts with a v: https://go.dev/doc/modules/version-numbers#pseudo-version-number

I think for Go modules, including when using the Go module system to refer to something that predates modules, the version always starts with a v. In which case, versions that don't start with v would only be used with older tools like Dep?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a version exists, it should be a valid Go module version. It should start with a v.

Note that hashes should not be permitted, they are not a valid Go version (resolution of hash commits in go tooling is a convenience feature).

@pombredanne
Copy link
Member

@matt-phylum you wrote

See also:

This is a breaking change that affects all software utilizing PURL for Go. Personally, I don't think there's anything fundamentally wrong with pkg:golang except that the description is outdated, and I'm sure it can be fixed without making this level of breaking change.

Thanks for the links! I tend to think along the same lines, and we can likely salvage the golang type.

Maintaining the separation of namespace and name and putting the entire Go package ID into the PURL name makes PURLs difficult for human users to work with.

I need to pounder this. See my other comment wrt. the namespace/name above in #338 (comment)

@zpavlinovic
Copy link

zpavlinovic commented Nov 18, 2024

However, "a go module cannot be represented by different PURLs" is not generally the case:

  • The PURL spec describes a canonical format for PURLs, but users and even commonly used PURL implementations often get this wrong and produce non-canonical PURLs which must still be considered equal. For example, pkg:golang/host%2Ecom/maybeuser/module is a non-canonical, valid, PURL which refers to the same package.
  • A PURL may have qualifiers which may or may not be critical to the PURL. A PURL with a ?goarch is a different PURL which refers to the same module, but a PURL with a ?repository_url (or however the module proxy is specified) is a different PURL which may refer to a different module (probably more likely in other ecosystems).

It is fine that PURL spec allows for more flexibility, but there should be only one way the Go module and package information is encoded. This simplifies the work for clients. It is easy to drop qualifiers from a PURL. It is annoying to generate multiple module+package encodings to see if the incoming PURL applies to your code.

In general, this proposal tries to make it simple and clear to generate and accurately check against PURLs. It might not be the most user-friendly solution, but tools that render PURLs can easily prettify the output. We believe this is worth the sacrifice.

@rhalar
Copy link

rhalar commented Nov 19, 2024

Could it also be clarified how standard library packages are to be represented?

Go has special handling for these, and the 'module' is never explicitly required when using them. But the module does exist for std and cmd
https://github.com/golang/go/blob/master/src/go.mod#L1
https://github.com/golang/go/blob/master/src/cmd/go.mod#L1

Go uses stdlib when reporting vulnerabilities though
https://vuln.go.dev/ID/GO-2024-3105.json

but the exact module name would make more sense we believe.

@matt-phylum
Copy link
Contributor

stdlib is probably good, if it needs to be specified (maybe specifying the compiler is enough). If it were std or cmd it would be a special case for tools that are not necessarily Go-centric to handle the case that pkg:golang/[email protected] is related to stdlib v1.2.3. Maybe pkg:golang/[email protected]#cmd is best if somebody is going to care that it's using the cmd package. However it's done it should be documented.

OSV is already using pkg:golang/stdlib, but that doesn't mean it's necessarily right. https://osv.dev/vulnerability/GO-2024-3105

@rhalar
Copy link

rhalar commented Nov 19, 2024

I'm not sure I follow how it would make it easier for tools?
The problem I have with pkg:golang/[email protected]#cmd is that cmd and std aren't packages, but modules.

@matt-phylum
Copy link
Contributor

Are they modules? They have go.mod files in their source code, but they aren't included in your go.mod and when you import their packages the compiler recognizes that you're trying to use the standard library and provides its own copy of the code from $GOROOT/src instead of downloading a module: https://github.com/golang/go/blob/8f22369136b264567955fb86cff491c247b45b8b/src/cmd/go/internal/modload/build.go#L42-L46

@zpavlinovic
Copy link

I'm not sure I follow how it would make it easier for tools? The problem I have with pkg:golang/[email protected]#cmd is that cmd and std aren't packages, but modules.

The cmd and std should not appear as modules in any of the actual go tools/artifacts. These modules are used internally to do vendoring of trusted packages developed independently elsewhere.

I think we can safely use stdlib for standard library or go tools (which are again in standard library) artifacts. We will add this to the proposal.

@rhalar
Copy link

rhalar commented Nov 20, 2024

I believe I get the argument, and I will defer to your judgement as the more knowledgeable about Go concepts and internals.

However, I do still feel that it would make sense for PURLs to use the "modules". PURLs are primarily unique identifiers so does it really matter what Go tooling does and how the internals work?
stdlib comes a bit out of nowhere and it doesn't seem that different from just using std, which seems more consistent at least, even if it technically doesn't follow the same rules. It still identifies the artifact in a unique way. Omitting any of the two is also fine, which would result in just, e.g.

pkg:golang/crypto/ecdsa@...

But I'm guessing that would cause issues, and it also doesn't follow the current convention for PURLs.

As I said, I'll defer to better judgements, just giving a mostly uneducated opinion. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Proposed new type type: golang Proposed new type as well as component discussions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants