Module Idea: TruffleHog #922
-
This is a great idea, however there are some challenges in implementing it. Right now, our github module searches GitHub for files containing the target domain and raises them to httpx, which pulls down the raw file and raises it as an HTTP_RESPONSE. By default this event type is not displayed, but it is distributed internally to modules such as excavate and secretsdb, which parse it for URLs, domain names, secrets, etc. Looting the target's entire GitHub is something I've wanted to do for a while, but the difficulty is that it's very hard to know for sure whether the GitHub org is truly owned by the company. This is the same difficulty we have with buckets. We sort of loot them as we see them rather than searching for them, because it's very easy to step out of scope and generate garbage data. So basically, we are searching GitHub for secrets etc. via the pipeline described above, rather than looting entire orgs.
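To make that flow concrete, here is a minimal, illustrative sketch of the kind of scanning that happens once an HTTP_RESPONSE body reaches a module like secretsdb. The regex set and function names are hypothetical, not BBOT's actual internals:

```python
import re

# Hypothetical signature set; a real module loads many more patterns.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "github_pat": re.compile(r"ghp_[A-Za-z0-9]{36}"),
    "slack_token": re.compile(r"xox[baprs]-[A-Za-z0-9-]{10,}"),
}

def scan_response_body(body: str):
    """Yield (signature_name, match) tuples for any secrets found in the body."""
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(body):
            yield name, match.group(0)

# Example usage with a fake response body:
if __name__ == "__main__":
    body = "config = {'key': 'AKIAABCDEFGHIJKLMNOP'}"
    for name, secret in scan_response_body(body):
        print(f"[FINDING] {name}: {secret}")
```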
-
Hi @TheTechromancer, for either module a good way to validate would be: a list of potential organization names could be created using all of the discovered DNS_NAME events that are in scope.
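As a rough illustration of that idea (not an actual BBOT module), candidate organization names could be derived from in-scope DNS_NAME events by stripping subdomains and TLDs; the helper below is hypothetical:

```python
def candidate_org_names(dns_names):
    """Derive candidate GitHub organization names from in-scope DNS names.

    Naive approach: take the label just left of the TLD, e.g.
    "mail.corp.example.com" -> "example". A real implementation would use a
    public-suffix list rather than assuming a single-label TLD.
    """
    candidates = set()
    for dns_name in dns_names:
        labels = dns_name.lower().rstrip(".").split(".")
        if len(labels) >= 2:
            candidates.add(labels[-2])
    return sorted(candidates)

print(candidate_org_names(["www.example.com", "mail.corp.example.com", "example.org"]))
# -> ['example']
```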
-
Ohh interesting. I like that approach; try the API and if it returns any in-scope domains, we can assume it's in-scope. If we're doing it that way we can also try:
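A rough sketch of the "try the API" validation described above: query the GitHub org endpoint and check whether any in-scope domain shows up in the response. The field handling and approach here are assumptions, not a finalized design:

```python
import json
import requests  # assumed available

def org_references_scope(org_name: str, in_scope_domains: list[str]) -> bool:
    """Return True if the GitHub org's public metadata mentions an in-scope domain."""
    resp = requests.get(f"https://api.github.com/orgs/{org_name}", timeout=10)
    if resp.status_code != 200:
        return False
    # Serialize the whole response so fields like "blog" and "email" are covered
    # without hard-coding which keys to inspect.
    blob = json.dumps(resp.json()).lower()
    return any(domain.lower() in blob for domain in in_scope_domains)

# Example usage:
# org_references_scope("github", ["github.com"])  -> likely True
```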
-
@domwhewell-sage I have just noticed a bug in this area. Anyway, it's a small bug and an easy fix, but it highlights the need for some polishing here. Secrets-patterns-db was a cool idea but it's not being actively maintained. Rather than maintaining our own regexes, I would ideally like to have a weekly CI pipeline that aggregates the latest signatures from all the competing "secrets mining" tools, cleans/dedupes them, and publishes them in a JSON file for the BBOT module to consume. The only reason I haven't done this already is simply that I haven't found the time. I am not inherently opposed to implementing TruffleHog directly, but I think there's the potential for us to create something much more powerful if we can build it natively into BBOT's recursion.
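As a very rough sketch of what that weekly aggregation job could look like (the source URLs and rule format are placeholders, not an agreed design):

```python
import json
import re
import requests  # assumed available

# Placeholder sources; a real pipeline would pull rule files from the various
# secrets-scanning projects in whatever formats they publish.
RULE_SOURCES = [
    "https://example.invalid/tool-a/rules.json",
    "https://example.invalid/tool-b/rules.json",
]

def aggregate_rules(sources):
    """Fetch rule lists, keep only compilable regexes, and dedupe by pattern."""
    merged = {}
    for url in sources:
        for rule in requests.get(url, timeout=30).json():
            name, pattern = rule["name"], rule["regex"]
            try:
                re.compile(pattern)
            except re.error:
                continue  # drop broken patterns
            merged.setdefault(pattern, name)  # dedupe on the regex itself
    return [{"name": n, "regex": p} for p, n in merged.items()]

if __name__ == "__main__":
    with open("signatures.json", "w") as f:
        json.dump(aggregate_rules(RULE_SOURCES), f, indent=2)
```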
-
Ok that sounds good. By implementing TruffleHog we may lose other goodies that could be in the HTTP_RESPONSE events. Also, an interesting side note from Black Hat: at a GitHub Copilot talk they mentioned that they would be using AI to detect generic secrets https://docs.github.com/en/enterprise-cloud@latest/code-security/secret-scanning/enabling-ai-powered-generic-secret-detection
-
Migrating to discussion.
-
Not sure, but maybe this helps in any way:
-
If GitHub and Docker are being raised as …
-
TruffleHog could still be used to ingest these. I'm thinking of Docker here, as TruffleHog is good and fast at pulling secrets out of Docker image layers.
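For example, a module could shell out to TruffleHog's Docker scanner and ingest its JSON output. This is only a sketch; the exact flags and output format should be checked against the installed TruffleHog version:

```python
import json
import subprocess

def trufflehog_docker(image: str):
    """Scan a Docker image's layers with TruffleHog and yield parsed findings."""
    # Assumed CLI: `trufflehog docker --image=<image> --json`
    proc = subprocess.run(
        ["trufflehog", "docker", f"--image={image}", "--json"],
        capture_output=True, text=True,
    )
    for line in proc.stdout.splitlines():
        try:
            yield json.loads(line)  # one JSON object per finding
        except json.JSONDecodeError:
            continue

# Example usage (field names assumed from TruffleHog's JSON output):
# for finding in trufflehog_docker("registry.example.com/app:latest"):
#     print(finding.get("DetectorName"), finding.get("Raw"))
```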
-
Is there any way we could have a module that downloads the code repos and then gives them to TruffleHog offline?
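Something along those lines might work: clone each discovered repo locally and point TruffleHog's filesystem scanner at it. This is a sketch under the assumption that the `filesystem` subcommand and `--json` flag behave as described in TruffleHog's docs:

```python
import subprocess
import tempfile

def scan_repo_offline(repo_url: str) -> str:
    """Shallow-clone a repo and run TruffleHog against the local copy."""
    workdir = tempfile.mkdtemp(prefix="bbot_trufflehog_")
    subprocess.run(["git", "clone", "--depth=1", repo_url, workdir], check=True)
    result = subprocess.run(
        ["trufflehog", "filesystem", workdir, "--json"],
        capture_output=True, text=True,
    )
    return result.stdout  # JSON lines, one finding per line

# Example usage:
# print(scan_repo_offline("https://github.com/example-org/example-repo"))
```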
-
Closing as completed.
-
Description
I see BBOT discovers and reports GitHub repos, S3 buckets, Azure Storage, etc.
In my testing I haven't seen it scan the GitHub repos it discovers for secrets. TruffleHog has an organization flag which will enumerate all repos belonging to an organization and search them for secrets. It also has flags to include members and their repositories, and to include forks, which may be useful.
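For reference, this is roughly how a module might drive TruffleHog's organization scan from Python; the flag names reflect my understanding of the TruffleHog CLI and should be verified against the installed version:

```python
import subprocess

def trufflehog_scan_org(org: str, include_members: bool = True, include_forks: bool = True) -> str:
    """Enumerate an organization's repos with TruffleHog and return raw JSON-lines output."""
    cmd = ["trufflehog", "github", f"--org={org}", "--json"]
    if include_members:
        cmd.append("--include-members")
    if include_forks:
        cmd.append("--include-forks")
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout

# Example usage:
# print(trufflehog_scan_org("example-org"))
```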