Mono-Repo with different types not scanned correctly #1576

Open
pschichtel opened this issue Jan 17, 2025 · 18 comments

Comments

@pschichtel

I have tested two projects layouts:

  1. projects in separate folders like this:

    • backend/build.gradle.kts
    • frontend/package.json
  2. projects nested within each other like this:

    • build.gradle.kts
    • frontend/package.json

Both the documentation and the ChatGPT assistant suggest that a command as simple as cdxgen . should automatically find all projects and generate a combined SBOM for them.

The command does seem to find the projects in both cases, at least the log contains things related to npm and gradle, however:

In case (1) the resulting bom.json contains components for all dependencies of both projects; however, when importing the BOM into a Dependency-Track project, only one of the projects is part of the dependency tree.
In case (2) the resulting bom.json only contained components for one of the projects, the root project.
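
One way to confirm the symptom in case (1) without Dependency-Track is to walk the `dependencies` graph of the generated bom.json from `metadata.component` and list the components that are never reached. The sketch below assumes a standard CycloneDX JSON layout and only looks at top-level `components`; the sample BOM and its versions are made-up placeholders, not output from cdxgen.

```javascript
// Sketch: list components that are not reachable from the root component
// (metadata.component) through the `dependencies` graph of a CycloneDX JSON BOM.
function unreachableComponents(bom) {
  const rootRef = bom.metadata?.component?.["bom-ref"];
  // Map each ref to the refs it depends on.
  const edges = new Map(
    (bom.dependencies ?? []).map((d) => [d.ref, d.dependsOn ?? []]),
  );
  const seen = new Set();
  const stack = rootRef ? [rootRef] : [];
  // Depth-first walk starting at the root component.
  while (stack.length > 0) {
    const ref = stack.pop();
    if (seen.has(ref)) continue;
    seen.add(ref);
    for (const next of edges.get(ref) ?? []) stack.push(next);
  }
  // Only top-level components are checked; nested components are ignored here.
  return (bom.components ?? [])
    .map((c) => c["bom-ref"])
    .filter((ref) => ref && !seen.has(ref));
}

// Hypothetical miniature BOM: the npm project is the root, the Gradle
// dependency is listed as a component but nothing links to it.
const sampleBom = {
  metadata: { component: { "bom-ref": "pkg:npm/frontend@1.0.0" } },
  components: [
    { "bom-ref": "pkg:npm/left-pad@1.3.0" },
    { "bom-ref": "pkg:maven/io.ktor/ktor-server-core@3.0.0" },
  ],
  dependencies: [
    { ref: "pkg:npm/frontend@1.0.0", dependsOn: ["pkg:npm/left-pad@1.3.0"] },
    { ref: "pkg:npm/left-pad@1.3.0", dependsOn: [] },
    { ref: "pkg:maven/io.ktor/ktor-server-core@3.0.0", dependsOn: [] },
  ],
};

console.log(unreachableComponents(sampleBom));
```

A non-empty result would match what is described here: the components are present, but a viewer that starts at the root component never gets to them.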

@prabhu
Collaborator

prabhu commented Jan 17, 2025

Can you share a sample repo to reproduce the issue? You must be facing two limitations:

  1. There must be a build.gradle in the root, since there is some hardcoded path in a few places.
  2. Automatic installation for npm (when there are no lock files) is limited to just one I think.

A range of samples will help improve this feature significantly.

@pschichtel
Author

I can provide example projects next week

@pschichtel
Author

https://github.com/pschichtel/cdxgen-reproducer

The repo contains two projects, each once in the nested structure and once in the side-by-side structure described above.

When importing either BOM into Dependency-Track, it shows the dependency tree with only the npm project, but the components still include e.g. ktor from the gradle project.

@pschichtel
Author

For reference, here are screenshots of the side-by-side project version in our Dependency-Track installation (the nested version is identical):

Image

Image

@prabhu
Collaborator

prabhu commented Jan 21, 2025

Thank you for the samples. This hits exactly two different limitations in gradle and npm. Fixing this is a non-trivial task, especially the testing, since every single line of change could break something somewhere for someone. Would you be interested in contributing a PR, working with us? Or we can keep this open and see if anyone is willing to sponsor.

@pschichtel
Author

I personally would be willing to give it a shot; however, this would definitely need some initial pointers, since I have absolutely no clue where exactly these limits are and where to start. I'd also have to check with my company whether time can be allocated to this, especially since workarounds exist (e.g. scanning each project individually and merging the boms).

@pschichtel
Author

I've had a look at the codebase with a colleague. 11.1.2 btw fixes part of the problem: The dependency tree is now complete, just incorrectly structured (npm project always becomes the root project, no matter where it lives in the folder tree).

I think we know where this would need changes to work differently, but together with all the other issues we faced with the tool today we are unlikely to continue with it. We will be focusing our effort on using more specialized sbom generators for the various different types of projects and merging the individual sboms using cyclonedx-cli. This seems to be the conclusion in other departments as well.

@prabhu
Collaborator

prabhu commented Jan 23, 2025

Have you tried running cdxgen with ordered types? Example: -t gradle -t npm. cdxgen does outperform most specialized generators btw. Feel free to generate various sboms and use this tool to compare them.

https://github.com/AppThreat/custom-json-diff

@pschichtel
Author

When restricting it to just these two types the resulting sbom cannot be imported into dependency track (haven't checked why specifically).

@prabhu
Collaborator

prabhu commented Jan 23, 2025

If there are no validation errors on the cdxgen side, it's possible there are some uncaught validation errors on the DT side. Is that reproducible using your sample repo?

@prabhu
Collaborator

prabhu commented Jan 26, 2025

Is Dependency Track the only platform not supporting dangling trees? I think we can remove a whole category of bugs by simply not having a root node for dependency trees representing monorepos.

@prabhu
Collaborator

prabhu commented Jan 27, 2025

@malice00 any ideas how we can approach the parent component problem in monorepos?

@malice00
Contributor

> Is Dependency Track the only platform not supporting dangling trees? I think we can remove a whole category of bugs by simply not having a root node for dependency trees representing monorepos.

I'm not familiar with anything other than DT, but yes, DT only shows a tree starting at the root-component. We could leave it out for monorepos, but in DT that would mean no tree at all... Also, isn't a root-component mandatory? Haven't checked that in the specs yet...

> @malice00 any ideas how we can approach the parent component problem in monorepos?

Maybe we could use the parameters --project-XX to create a root-project in case there are multiple projects found in the working tree? If you agree, I could give that a try...

@pschichtel
Author

For the nested case you could rely on the directory structure to infer the relation between projects; for the side-by-side case you'd definitely need a synthetic root.

What about introducing a "synthetic" project type that would look for a special file (e.g. cdxgen.yaml or so) and simply produce a component without any dependencies, based on a description inside the file? Then, together with the folder structure, dependencies between those could be inferred.

@malice00
Contributor

malice00 commented Jan 27, 2025

That was another idea I had, but might take a little longer to implement. We could then have the layout dictated by the user:

group: xx
name: xx
type: synthetic
subs:
  - path: component1
    group: xx_1   # in case overriding is wanted
    name: xx_1   # in case overriding is wanted
    type: gradle
  - path: component2
    type: npm
  - path: dir3/another_component
    type: npm

Maybe something like this?

The question is, though, whether this could be added to the current configuration format, or whether a v2 is necessary for this...
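
To make the proposed layout a bit more concrete, here is a rough sketch of how optional `group` and `name` overrides could be resolved once such a file has been parsed. Everything in it is an assumption for illustration: the function name `resolveLayout`, the default of inheriting the parent's group, and the default of using the last path segment as the name are not part of cdxgen. YAML parsing is left out so the sketch stays dependency-free.

```javascript
// The layout from the comment above, already parsed into a plain object.
const layout = {
  group: "xx",
  name: "xx",
  type: "synthetic",
  subs: [
    { path: "component1", group: "xx_1", name: "xx_1", type: "gradle" },
    { path: "component2", type: "npm" },
    { path: "dir3/another_component", type: "npm" },
  ],
};

// Hypothetical helper: fill in defaults for sub-projects that do not
// override group or name.
function resolveLayout(config) {
  return {
    group: config.group,
    name: config.name,
    type: config.type,
    subs: (config.subs ?? []).map((sub) => ({
      path: sub.path,
      type: sub.type,
      // Assumed default: inherit the group of the synthetic parent.
      group: sub.group ?? config.group,
      // Assumed default: use the last path segment as the component name.
      name: sub.name ?? sub.path.split("/").filter(Boolean).pop(),
    })),
  };
}

console.log(JSON.stringify(resolveLayout(layout), null, 2));
```

Whether defaults like these are sensible is exactly the open question about "incomplete" configurations; the sketch only shows that the overrides can stay optional.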

@prabhu
Collaborator

prabhu commented Jan 28, 2025

How about we add a split mode to cdxgen to generate separate SBOM files per type and optionally create an index file to link them together? Such a feature is also useful for ML users since each individual BOM would be smaller.

Aggregate could then become a separate command or the users could feel free to use other cli tools?
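
As a rough illustration of what a split mode could do, the sketch below groups the components of one combined BOM by purl type and builds one smaller BOM per type plus a simple index. This is not an existing cdxgen feature; the function names, the `bom-<type>.json` file naming, and the shape of the index are all hypothetical, and the sample versions are placeholders.

```javascript
// Extract the purl type, e.g. "pkg:npm/left-pad@1.3.0" -> "npm".
function typeFromPurl(purl) {
  const match = /^pkg:([^/]+)\//.exec(purl ?? "");
  return match ? match[1] : "unknown";
}

// Hypothetical split: one BOM object per purl type, keyed by file name.
function splitByType(bom) {
  const groups = new Map();
  for (const component of bom.components ?? []) {
    const type = typeFromPurl(component.purl);
    if (!groups.has(type)) groups.set(type, []);
    groups.get(type).push(component);
  }
  const boms = {};
  for (const [type, components] of groups) {
    const refs = new Set(components.map((c) => c["bom-ref"]));
    boms[`bom-${type}.json`] = {
      bomFormat: bom.bomFormat,
      specVersion: bom.specVersion,
      components,
      // Keep only dependency entries that start at a component of this type.
      dependencies: (bom.dependencies ?? []).filter((d) => refs.has(d.ref)),
    };
  }
  // Hypothetical index: just the sorted file names, one per type.
  return { index: Object.keys(boms).sort(), boms };
}

// Placeholder input with one npm and one maven component.
const mixedBom = {
  bomFormat: "CycloneDX",
  specVersion: "1.6",
  components: [
    { "bom-ref": "pkg:npm/left-pad@1.3.0", purl: "pkg:npm/left-pad@1.3.0" },
    {
      "bom-ref": "pkg:maven/io.ktor/ktor-server-core@3.0.0",
      purl: "pkg:maven/io.ktor/ktor-server-core@3.0.0",
    },
  ],
  dependencies: [
    { ref: "pkg:npm/left-pad@1.3.0", dependsOn: [] },
    { ref: "pkg:maven/io.ktor/ktor-server-core@3.0.0", dependsOn: [] },
  ],
};

console.log(splitByType(mixedBom).index);
```

Note that gradle dependencies carry `pkg:maven` purls, so a split keyed on the purl type would group gradle and maven together; splitting by the project type that was scanned would need information that is not in the components themselves.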

@prabhu
Collaborator

prabhu commented Jan 28, 2025

Or, we get rid of logic like below that attempts to create a single parent.

parentComponent = parentComponent.components[0];

In postgen, we create the parent by using parent-ref from the cli arguments, or have some kind of project type hierarchy to decide which type becomes the parent. For example, if there are java and npm packages, we make the java (maven) one the parent?
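
A minimal sketch of that selection, assuming the candidate parents are available as component objects with a `purl`: an explicit reference wins, otherwise a type priority list decides. The names `pickParent` and `TYPE_PRIORITY`, the priority order itself, and the example components are hypothetical and do not reflect cdxgen's actual postgen code.

```javascript
// Hypothetical priority order; earlier entries win. Gradle projects use
// "pkg:maven" purls, so "maven" covers both.
const TYPE_PRIORITY = ["maven", "npm"];

// "pkg:maven/group/name@1.0.0" -> "maven"
function purlType(component) {
  const match = /^pkg:([^/]+)\//.exec(component.purl ?? "");
  return match ? match[1] : undefined;
}

// Pick the parent among candidate project components.
// `parentRef` stands in for a value passed on the command line.
function pickParent(candidates, parentRef) {
  if (parentRef) {
    const explicit = candidates.find((c) => c["bom-ref"] === parentRef);
    if (explicit) return explicit;
  }
  // Unknown types rank after every listed type.
  const rank = (c) => {
    const i = TYPE_PRIORITY.indexOf(purlType(c));
    return i === -1 ? TYPE_PRIORITY.length : i;
  };
  // Copy before sorting so the caller's array is left untouched.
  return [...candidates].sort((a, b) => rank(a) - rank(b))[0];
}

// Placeholder project components.
const exampleNpmApp = {
  "bom-ref": "pkg:npm/frontend@1.0.0",
  purl: "pkg:npm/frontend@1.0.0",
};
const exampleGradleApp = {
  "bom-ref": "pkg:maven/com.example/backend@1.0.0",
  purl: "pkg:maven/com.example/backend@1.0.0",
};

console.log(pickParent([exampleNpmApp, exampleGradleApp])["bom-ref"]);
console.log(
  pickParent([exampleNpmApp, exampleGradleApp], "pkg:npm/frontend@1.0.0")["bom-ref"],
);
```

With the assumed order, the first call selects the gradle project and the second keeps the npm project because it was requested explicitly.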

@malice00
Contributor

Another way to go is to check the paths of the projects and, if they are not nested, add a synthetic component (e.g. using the dir-name for the component) and plug the projects into that.
It does raise the question of what to do with the --project-XX parameters, in case they are set... Where should these be set? On the synthetic component? And what if we would like to override the sub-projects as well?
I do really like the idea of the config-file (had been playing with something like that in my head even before this issue came up), but I think that needs to be thought out some more and have some good default if the configuration is 'incomplete'... But I'm open to give something a go -- am stuck a bit on (repo-)testing my implementation of cocoapods, so a change of subject would be welcome... 😉
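
The path check described in this comment could look roughly like the sketch below: if one project directory contains all the others, it becomes the root; otherwise a synthetic root named after the working directory is used. The function names, the input shape (`name` plus a relative `dir`), and the returned object are assumptions for illustration only, and paths are treated as POSIX-style relative paths.

```javascript
// Normalise a relative directory: forward slashes, no leading "./",
// no trailing slash; the working directory itself becomes "".
function normaliseDir(dir) {
  const cleaned = dir
    .replace(/\\/g, "/")
    .replace(/^\.\//, "")
    .replace(/\/+$/, "");
  return cleaned === "." ? "" : cleaned;
}

// True when childDir lies strictly below parentDir.
function isInside(parentDir, childDir) {
  if (parentDir === "") return childDir !== "";
  return childDir.startsWith(parentDir + "/");
}

// Decide between a natural root (nested layout) and a synthetic one
// (side-by-side layout).
function planRoot(projects, workingDirName) {
  const items = projects.map((p) => ({ ...p, dir: normaliseDir(p.dir) }));
  // A project is a natural root when every other project lives below it.
  const natural = items.find((candidate) =>
    items.every(
      (other) => other === candidate || isInside(candidate.dir, other.dir),
    ),
  );
  if (natural) {
    return {
      root: natural.name,
      synthetic: false,
      children: items.filter((p) => p !== natural).map((p) => p.name),
    };
  }
  return {
    root: workingDirName,
    synthetic: true,
    children: items.map((p) => p.name),
  };
}

// Nested layout: build.gradle.kts in the root, frontend/package.json below it.
console.log(
  planRoot(
    [
      { name: "app", dir: "." },
      { name: "frontend", dir: "frontend" },
    ],
    "cdxgen-reproducer",
  ),
);

// Side-by-side layout: backend/ and frontend/ next to each other.
console.log(
  planRoot(
    [
      { name: "backend", dir: "backend" },
      { name: "frontend", dir: "frontend" },
    ],
    "cdxgen-reproducer",
  ),
);
```

The sketch leaves open where --project-XX values would apply; one option would be to use them for the synthetic root only when `synthetic` is true, but that is a design decision rather than something the path check can answer.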
