Usage
python -m pip install --upgrade pyredactkit
Note: If you are using the GovTech version, you can run redactor from the PowerShell terminal instead, e.g.:
redactor 'this is my ip:127.0.0.1. my email is [email protected]. secret link is github.com'
To redact a string of text from the terminal:
pyredactkit 'this is my ip:127.0.0.1. my email is [email protected]. secret link is github.com'
This creates a redacted file and a hashshadow file. The hashshadow file maps each replacement ID back to the original value, so you can use it later to unredact the file.
// .hashshadow.json
{
"5f7aa522-86e5-4ca7-83ae-09fbb5a1044b": "[email protected]",
"983b017a-98a5-4763-aa6d-a8ad69db20bc": "github.com",
"a9581c73-05cb-428e-8c62-8bf1521a8aa1": "127.0.0.1"
}
<!-- redacted.txt -->
this is my ip:a9581c73-05cb-428e-8c62-8bf1521a8aa1. my email is 5f7aa522-86e5-4ca7-83ae-09fbb5a1044b. secret link is 983b017a-98a5-4763-aa6d-a8ad69db20bc
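The masking idea shown above can be sketched in a few lines of Python. This is a minimal illustration, not pyredactkit's actual implementation: the `PATTERNS` list and the `redact` helper are hypothetical, and the two regexes are deliberately simplified stand-ins for the tool's built-in patterns.

```python
import re
import uuid

# Hypothetical, simplified patterns for illustration only.
PATTERNS = [
    re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),                     # IPv4-like addresses
    re.compile(r"\b[a-z0-9-]+\.(?:com|org|net)\b", re.IGNORECASE),  # a few domain suffixes
]


def redact(text):
    """Replace each match with a fresh UUID and record UUID -> original value."""
    shadow = {}

    def mask(match):
        token = str(uuid.uuid4())
        shadow[token] = match.group(0)
        return token

    for pattern in PATTERNS:
        text = pattern.sub(mask, text)
    return text, shadow


redacted, shadow = redact("this is my ip:127.0.0.1. secret link is github.com")
print(redacted)  # sensitive values replaced by random UUIDs
print(shadow)    # the mapping a hashshadow-style file would store
```

Because the tokens are random UUIDs rather than hashes of the data, the redacted text reveals nothing about the originals without the mapping.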
To redact a single file from the terminal:
pyredactkit -f test.txt
To unredact a redacted file:
pyredactkit -f redacted_test.txt -u .hashshadow_test.txt.json
This will create an unredacted file which contains the original unmasked data.
<!-- unredacted.txt -->
this is my ip:127.0.0.1. my email is [email protected]. secret link is github.com
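Conceptually, unredaction is a reverse lookup: every ID found in the redacted text is swapped back for the value stored in the hashshadow mapping. The sketch below assumes the mapping has the ID-to-value JSON layout shown earlier; the `unredact` function is a hypothetical helper, not part of pyredactkit's API.

```python
import json


def unredact(text, shadow):
    """Swap every ID in `text` back to its original value from the mapping."""
    for token, original in shadow.items():
        text = text.replace(token, original)
    return text


# Two entries taken from the sample hashshadow file above.
shadow = json.loads("""
{
  "983b017a-98a5-4763-aa6d-a8ad69db20bc": "github.com",
  "a9581c73-05cb-428e-8c62-8bf1521a8aa1": "127.0.0.1"
}
""")

redacted = (
    "this is my ip:a9581c73-05cb-428e-8c62-8bf1521a8aa1. "
    "secret link is 983b017a-98a5-4763-aa6d-a8ad69db20bc"
)
restored = unredact(redacted, shadow)
print(restored)
```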
To redact multiple files in a folder (including subdirectories) and write the results to a newly created directory, consider the folder below, which contains several log files and a subdirectory.
tree foldertoredact
foldertoredact
├── cctest.txt
├── ip_test 3.txt
├── ip_test 4.txt
├── ip_test.txt
├── nric.txt
└── subdir
1 directory, 5 files
To redact all of the log files in the folder and place them in a new folder:
pyredactkit -f foldertoredact -d newfolder
tree newfolder
newfolder
├── redacted_cctest.txt
├── redacted_ip_test 3.txt
├── redacted_ip_test 4.txt
├── redacted_ip_test.txt
└── redacted_nric.txt
0 directories, 5 files
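The naming scheme in the listing above (each output file prefixed with `redacted_` and written to the new folder) can be sketched as follows. This is only an illustration: `plan_outputs` is a hypothetical helper, and the flat output layout is an assumption inferred from the `0 directories` line, not a documented guarantee of the tool. The sketch does not handle files with the same name in different subdirectories.

```python
from pathlib import Path


def plan_outputs(source_dir, dest_dir):
    """Map every file under source_dir (recursively) to dest_dir/redacted_<name>."""
    source, dest = Path(source_dir), Path(dest_dir)
    return {
        path: dest / f"redacted_{path.name}"
        for path in sorted(source.rglob("*"))
        if path.is_file()  # skip directories such as subdir
    }
```

Calling `plan_outputs("foldertoredact", "newfolder")` on the example folder would pair `foldertoredact/nric.txt` with `newfolder/redacted_nric.txt`, and so on for the other files.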
Besides the built-in regex patterns for SG NRIC numbers, domain names, email addresses, IP addresses, base64 strings and credit card numbers, you can define custom regex patterns to redact from your log files.
To redact using custom regex patterns, create a custom JSON file in the format below.
// customregex.json
[
{
"pattern": "^([a-zA-Z0-9_-]*:[a-zA-Z0-9_-][email protected]*)$",
"type": [
"API Keys",
"Credentials",
"Bug Bounty",
"GitHub"
]
},
{
"pattern": "(?i)^(arn:(?P<Partition>[^:\\n]*):(?P<Service>[^:\\n]*):(?P<Region>[^:\\n]*):(?P<AccountID>[^:\\n]*):(?P<Ignore>(?P<ResourceType>[^:\\/\\n]*)[:\\/])?(?P<Resource>.*))$",
"type": [
"Identifiers",
"Networking",
"AWS",
"Bug Bounty"
]
},
{
"pattern": "(?i)^((facebook|fb)(.{0,20})?['\\\"][0-9a-f]{32}['\\\"])$",
"type": [
"API Keys",
"Bug Bounty",
"Credentials",
"Facebook"
]
}
]
pyredactkit -f file -c customregex.json
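Before passing a custom file to the tool, it can help to check that each pattern compiles and matches what you expect. The sketch below loads the AWS ARN entry from the example above with Python's standard `json` and `re` modules; `load_patterns` and `classify` are hypothetical helpers for local testing, and how pyredactkit itself applies these patterns is not shown here.

```python
import json
import re

# The AWS ARN entry from customregex.json, kept as raw JSON text.
CUSTOM_REGEX_JSON = r'''
[
  {
    "pattern": "(?i)^(arn:(?P<Partition>[^:\\n]*):(?P<Service>[^:\\n]*):(?P<Region>[^:\\n]*):(?P<AccountID>[^:\\n]*):(?P<Ignore>(?P<ResourceType>[^:\\/\\n]*)[:\\/])?(?P<Resource>.*))$",
    "type": ["Identifiers", "Networking", "AWS", "Bug Bounty"]
  }
]
'''


def load_patterns(raw_json):
    """Compile every entry; re.error is raised here if a pattern is invalid."""
    return [(re.compile(entry["pattern"]), entry["type"]) for entry in json.loads(raw_json)]


def classify(candidate, patterns):
    """Return the type labels of the first pattern that matches the candidate."""
    for regex, labels in patterns:
        if regex.search(candidate):
            return labels
    return []


patterns = load_patterns(CUSTOM_REGEX_JSON)
labels = classify("arn:aws:s3:::my_bucket", patterns)
print(labels)
```

Note that the example patterns are anchored with `^` and `$`, so in this sketch a candidate only matches when the whole string is the sensitive value.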