rootware edited this page Jun 26, 2022 · 8 revisions

Usage

Quick Installation and basic usage

python -m pip install --upgrade pyredactkit

Note: If you are using the GovTech version, you can run redactor from the PowerShell terminal instead, e.g.:

redactor 'this is my ip:127.0.0.1. my email is [email protected]. secret link is github.com'

To redact a block of text from the terminal:

pyredactkit 'this is my ip:127.0.0.1. my email is [email protected]. secret link is github.com'

This creates a redacted file and a hashshadow file. The hashshadow file maps each replacement token to the original value, so you can use it later to unredact the data.

// .hashshadow.json
{
  "5f7aa522-86e5-4ca7-83ae-09fbb5a1044b": "[email protected]",
  "983b017a-98a5-4763-aa6d-a8ad69db20bc": "github.com",
  "a9581c73-05cb-428e-8c62-8bf1521a8aa1": "127.0.0.1"
}

<!-- redacted.txt -->
this is my ip:a9581c73-05cb-428e-8c62-8bf1521a8aa1. my email is 5f7aa522-86e5-4ca7-83ae-09fbb5a1044b. secret link is 983b017a-98a5-4763-aa6d-a8ad69db20bc
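
The token-and-mapping idea shown above can be sketched in a few lines of Python. This is a minimal illustration, not pyredactkit's actual implementation: the two patterns are deliberately simplified assumptions, and the `redact` helper is a hypothetical name.

```python
import re
import uuid

# Simplified, illustrative patterns (assumptions); pyredactkit's own patterns may differ.
PATTERNS = [
    re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),      # IPv4-like addresses
    re.compile(r"\b[a-z0-9-]+\.(?:com|org|net)\b"),  # a few common domain suffixes
]


def redact(text):
    """Replace each match with a fresh UUID and record token -> original value."""
    shadow = {}

    def swap(match):
        token = str(uuid.uuid4())
        shadow[token] = match.group(0)
        return token

    for pattern in PATTERNS:
        text = pattern.sub(swap, text)
    return text, shadow


redacted, shadow = redact("this is my ip:127.0.0.1. secret link is github.com")
print(redacted)
print(shadow)
```

Because every token is a random UUID, the redacted text reveals nothing about the original values; only the mapping can restore them, so the mapping file should be stored as carefully as the original data.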

To redact a single file from the terminal:

pyredactkit -f test.txt 

To unredact a redacted file:

pyredactkit -f redacted_test.txt -u .hashshadow_test.txt.json 

This will create an unredacted file which contains the original unmasked data.

<!-- unredacted.txt -->
this is my ip:127.0.0.1. my email is [email protected]. secret link is github.com
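
Conceptually, unredaction is a lookup in the hashshadow mapping. The sketch below reuses two of the tokens from the example above and restores them with plain string replacement. The `unredact` function is a hypothetical helper for illustration, not part of pyredactkit's documented API.

```python
import json


def unredact(redacted_text, shadow):
    """Replace every token found in the mapping with its original value."""
    for token, original in shadow.items():
        redacted_text = redacted_text.replace(token, original)
    return redacted_text


# Two entries in the same shape as the hashshadow file shown earlier.
shadow = json.loads("""
{
  "983b017a-98a5-4763-aa6d-a8ad69db20bc": "github.com",
  "a9581c73-05cb-428e-8c62-8bf1521a8aa1": "127.0.0.1"
}
""")

redacted = (
    "this is my ip:a9581c73-05cb-428e-8c62-8bf1521a8aa1. "
    "secret link is 983b017a-98a5-4763-aa6d-a8ad69db20bc"
)
restored = unredact(redacted, shadow)
print(restored)
```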

Advanced usage

You can redact multiple files in a folder that has subdirectories and write the results to a newly created directory.

Consider the folder below, which contains multiple log files and a subdirectory.

tree foldertoredact 
foldertoredact
├── cctest.txt
├── ip_test 3.txt
├── ip_test 4.txt
├── ip_test.txt
├── nric.txt
└── subdir

1 directory, 5 files

To redact all of the log files in the folder and place them in a new folder:

pyredactkit -f foldertoredact -d newfolder
tree newfolder
newfolder
├── redacted_cctest.txt
├── redacted_ip_test 3.txt
├── redacted_ip_test 4.txt
├── redacted_ip_test.txt
└── redacted_nric.txt

0 directories, 5 files
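
The naming convention in the listing above (each output file gets a `redacted_` prefix) can be sketched as follows. This only plans target paths and does no redaction. It assumes that files are collected recursively and written flat into the output folder, which the listing suggests but does not confirm; `plan_outputs` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def plan_outputs(source_dir, output_dir):
    """Pair every file under source_dir with a redacted_<name> path in output_dir."""
    source, output = Path(source_dir), Path(output_dir)
    return [
        (path, output / f"redacted_{path.name}")
        for path in sorted(source.rglob("*"))
        if path.is_file()  # directories such as subdir are skipped
    ]


# Build a small throwaway folder to demonstrate the mapping.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "foldertoredact"
    (src / "subdir").mkdir(parents=True)
    for name in ["cctest.txt", "ip_test.txt", "nric.txt"]:
        (src / name).write_text("sample log line\n")
    planned = [target.name for _, target in plan_outputs(src, Path(tmp) / "newfolder")]

print(planned)
```

A flat output layout like this one can produce name collisions if two subdirectories contain files with the same name, so a real tool would need to handle that case.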

Besides the core regex patterns for SG NRIC, domain names, emails, IP addresses, base64 strings and credit cards, you can also define custom regex patterns to redact from your log files.

To redact using custom regex patterns, create a custom JSON file in the format below.

// customregex.json
[
    {
        "pattern": "^([a-zA-Z0-9_-]*:[a-zA-Z0-9_-]+@github.com*)$",
        "type": [
            "API Keys",
            "Credentials",
            "Bug Bounty",
            "GitHub"
        ]
    },
    {
        "pattern": "(?i)^(arn:(?P<Partition>[^:\\n]*):(?P<Service>[^:\\n]*):(?P<Region>[^:\\n]*):(?P<AccountID>[^:\\n]*):(?P<Ignore>(?P<ResourceType>[^:\\/\\n]*)[:\\/])?(?P<Resource>.*))$",
        "type": [
            "Identifiers",
            "Networking",
            "AWS",
            "Bug Bounty"
        ]
    },
    {
        "pattern": "(?i)^((facebook|fb)(.{0,20})?['\\\"][0-9a-f]{32}['\\\"])$",
        "type": [
            "API Keys",
            "Bug Bounty",
            "Credentials",
            "Facebook"
        ]
    }
]

Then pass the file to pyredactkit with the -c option:

pyredactkit -f file -c customregex.json
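
Before passing a custom pattern file to the tool, it can help to check that the JSON parses and that every pattern compiles. The sketch below is a standalone pre-flight check using only the Python standard library. `load_custom_patterns` and the sample ticket pattern are hypothetical, and the check assumes the tool's regex dialect is compatible with Python's `re` module.

```python
import json
import re


def load_custom_patterns(raw_json):
    """Parse the custom pattern list and compile each regex, failing early."""
    compiled = []
    for index, entry in enumerate(json.loads(raw_json)):
        if "pattern" not in entry or "type" not in entry:
            raise ValueError(f"entry {index} needs 'pattern' and 'type' keys")
        try:
            compiled.append(re.compile(entry["pattern"]))
        except re.error as exc:
            raise ValueError(f"entry {index} has an invalid regex: {exc}") from exc
    return compiled


# Hypothetical entry in the same shape as customregex.json.
sample = '[{"pattern": "^(ticket-[0-9]{6})$", "type": ["Hypothetical"]}]'
patterns = load_custom_patterns(sample)
matched = bool(patterns[0].match("ticket-123456"))
print(matched)
```

Note that the sample patterns in the file above are anchored with `^` and `$`, so they match only when the whole string (or line, depending on how the tool applies them) is the secret.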