Go binary analysis to find source packages #2
I've recently described a few workflows that I use when invoking GoReSym on an unknown Go executable: http://www.williballenthin.com/post/analyzing-go-programs-with-goresym/ Notably: |
@williballenthin Thank you ++ for dropping by! 🙇 Your insights are super useful... Here the goal would be to integrate GoReSym as a library in a larger Go executable (or maybe reuse it as-is?). The initial usage would be: given some Go binary found in a codebase that we analyze for origin using a ScanCode.io pipeline, I want to:
Later:
NB: overall this is similar to related "binary" analysis we are working on for Java, JS and ELFs. |
I'm not sure that GoReSym is set up to be used as a library today, but I suspect that @stevemk14ebr (the primary author) might be convinced to add support. If you decide to go this route using GoReSym and need any support or changes, please don't hesitate to open an issue at that repository so we can plan and implement. |
Hi @williballenthin, I used https://github.com/goretk/gore but I will try GoReSym as well. If you want, I could also contribute to GoReSym to add the functionality we need. |
@pombredanne, we can use https://github.com/goretk/gore and the GetPackages method from this library to implement this functionality |
I will be working on making GoReSym easy to use as a library soon for integration with other tools we use on flare. Until that occurs, if you decide to use it, you could run it as a subprocess and parse the JSON output. An advantage it may have over other projects is its focus on recovery of information even with obfuscated and malformed binaries. |
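The subprocess approach suggested above could look roughly like the following. This is a minimal sketch, not GoReSym's documented interface: it assumes a `GoReSym` executable is on the `PATH`, that it prints a single JSON document to stdout, and that this document has a `UserFunctions` list whose entries carry a `PackageName` field. Those key names and the `package_names` helper are assumptions to verify against the output of the GoReSym version you actually run.

```python
import json
import subprocess


def run_goresym(binary_path, goresym="GoReSym"):
    """Run the GoReSym CLI on a binary and return its stdout as text.

    Assumes the executable name/path given in `goresym` is correct for
    your installation and that the JSON report is written to stdout.
    """
    result = subprocess.run(
        [goresym, binary_path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


def package_names(goresym_json):
    """Return the sorted, de-duplicated package names found in a report.

    The "UserFunctions" and "PackageName" keys are assumptions about the
    report layout; adjust them to match the real output.
    """
    data = json.loads(goresym_json)
    names = set()
    for func in data.get("UserFunctions") or []:
        name = func.get("PackageName")
        if name:
            names.add(name)
    return sorted(names)
```

Keeping the parsing separate from the subprocess call means the parser can be tested against saved JSON reports without needing the tool or any binaries at hand.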
PR: #3 |
@CatalinStratu IMHO the next steps are to:
@williballenthin @stevemk14ebr What was your rationale for starting GoReSym when gore was already there? @stevemk14ebr I reckon you mentioned redress as an inspiration in https://www.mandiant.com/resources/blog/golang-internals-symbol-recovery |
@CatalinStratu one thing that may be missing is a proper test suite.
Your initial intuition to collect existing pre-built binaries as test cases is IMHO a good one, but these cannot be in the main git repo here as they would be too big. We could use a git submodule for these... So IMHO you should start by collecting a few test binaries, prebuilt by others and with a well-known origin, source code and license, for testing. |
@CatalinStratu for tests, there may be several pre-built Go binaries available in Linux distros (Fedora, Alpine and Debian) for container-related things. The key here is to carefully track the origin and license of ALL these test files, ideally using an ABOUT file for each; see https://github.com/nexB/scancode-toolkit/blob/develop/src/packagedcode/bashlex.py.ABOUT for an example. Note that a test suite in an external repo will also be usable by GoReSym. |
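To make the per-file tracking concrete, here is a minimal sketch of a helper that renders the text of an ABOUT file for one test binary. The `render_about` function and the sample values in the usage note are hypothetical. The field names used (`about_resource`, `name`, `version`, `download_url`, `license_expression`) are meant to follow the AboutCode ABOUT format, but check them against the linked example and the spec before relying on them.

```python
def render_about(about_resource, name, version, download_url, license_expression):
    """Render simple "key: value" ABOUT text for a single test binary.

    Values are written as-is, so they are assumed to be plain one-line
    strings that need no YAML quoting.
    """
    fields = [
        ("about_resource", about_resource),
        ("name", name),
        ("version", version),
        ("download_url", download_url),
        ("license_expression", license_expression),
    ]
    return "".join(f"{key}: {value}\n" for key, value in fields)
```

For a hypothetical binary `hello-linux-amd64`, the result of `render_about("hello-linux-amd64", "hello", "1.0.0", "https://example.com/hello-1.0.0.tar.gz", "apache-2.0")` would be saved next to the binary as `hello-linux-amd64.ABOUT`.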
Motivations for GoReSym were primarily to enable better support of obfuscated binaries as we see them often, and additionally to base our code off of the Go runtime itself (hence why GoReSym is in Go) which enables faster updates for new runtime versions and more confidence in correctness of parsing. |
@stevemk14ebr, could I contribute to GoReSym somehow? I made some improvements on my fork (https://github.com/CatalinStratu/GoReSym). Would you be open to a PR? |
I would love additional contributions. Looking at your commit some of the changes involve renames and removing casts from files I copied from the upstream go source. As I copied these directly from upstream I'm not willing to change those for maintenance reasons, if it's good enough for Go it's good enough for us. I would accept contributions to the modified parts of the runtime (mostly within objfile) or main. |
I made a PR, I will be grateful for your comments. |
@CatalinStratu I provided a bunch of comments there... I am not sure if you have the time to complete this? @TG1999 See also mandiant/GoReSym#49 |
At this stage I think I can call this done based on these two PRs:
|
I would like to analyze a Go binary and find which source packages were used to build it, and more. The goal is to recognize a binary as built from Go, extract data from it, map that data back to sources and open source repos, and eventually feed that into the flow that creates an SBOM in ScanCode/ScanCode.io.
Go binaries can be ELF, PE or Mach-O files. The initial focus should be on ELFs.
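As a first triage step, the container format can be told apart from the leading magic bytes. The sketch below is only an illustration with hypothetical helper names: it identifies the container format, not whether the file was built with Go, and it deliberately skips universal (fat) Mach-O files because their magic is shared with Java class files.

```python
# Thin Mach-O magics as they appear on disk, 32-bit and 64-bit, both byte orders.
MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce",
    b"\xce\xfa\xed\xfe",
    b"\xfe\xed\xfa\xcf",
    b"\xcf\xfa\xed\xfe",
}


def detect_binary_format(header):
    """Return "ELF", "PE", "Mach-O" or None based on leading magic bytes."""
    if header[:4] == b"\x7fELF":
        return "ELF"
    if header[:2] == b"MZ":
        # DOS stub signature found at the start of PE files.
        return "PE"
    if header[:4] in MACHO_MAGICS:
        return "Mach-O"
    return None


def detect_file_format(path):
    """Read the first four bytes of a file and classify its format."""
    with open(path, "rb") as f:
        return detect_binary_format(f.read(4))
```

A file that passes this check still needs a Go-specific test afterwards, for example looking for Go build information or the pclntab.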
To get the details of what's in a binary there are a couple avenues:
The first step is to determine the list of all third-party Go modules included in a binary. I would like a CLI tool with a UI similar to that of ScanCode Toolkit, python-inspector and nuget-inspector that would:
Some candidate libraries include:
Beyond this, for Go strings, see mandiant/flare-floss#845 by @Arker123 and mandiant/flare-floss#807
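For the first step described above (listing the third-party modules in a binary), one low-effort baseline is the Go toolchain itself: `go version -m <binary>` prints the module information embedded at build time. The sketch below is not one of the candidate libraries from this issue. It assumes a `go` command is installed, and it only works when the binary still carries that embedded build information, which stripped or obfuscated binaries may not. The helper names are hypothetical.

```python
import subprocess


def parse_go_version_m(output):
    """Extract dependency modules from `go version -m` text output.

    Only tab-separated "dep" lines are collected; replacement ("=>")
    lines and build settings are ignored in this sketch.
    """
    deps = []
    for line in output.splitlines():
        fields = line.strip().split("\t")
        if fields[0] == "dep" and len(fields) >= 3:
            deps.append({
                "path": fields[1],
                "version": fields[2],
                # The checksum column can be absent, e.g. for replaced modules.
                "sum": fields[3] if len(fields) > 3 else None,
            })
    return deps


def list_modules(binary_path, go="go"):
    """Run `go version -m` on a binary and return its parsed dependencies."""
    result = subprocess.run(
        [go, "version", "-m", binary_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_go_version_m(result.stdout)
```

The list this produces could also serve as a reference to compare against what tools such as gore or GoReSym recover from the same binary.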