diff --git a/README.md b/README.md
index 10f5acbb..42e89bd2 100644
--- a/README.md
+++ b/README.md
@@ -21,23 +21,26 @@ There is also built-in support for using **headless Chrome** to efficiently meas
 
 ### Requirements
 
-`domain-scan` requires **Python 3.5 and up**. To install dependencies:
+`domain-scan` requires **Python 3.5 and up**.
+
+To install **core dependencies**:
 
 ```bash
 pip install -r requirements.txt
 ```
 
-This will automatically allow the use of two scanners:
+You can install scanner- or gatherer-specific dependencies yourself. Or, you can "quick start" by just **installing all dependencies for all scanners and/or all gatherers**:
 
-* `pshtt` - A scanner that uses the [`pshtt`](https://github.com/dhs-ncats/pshtt) Python package from the [Department of Homeland Security's NCATS team](https://github.com/dhs-ncats).
-* `sslyze` - A scanner that uses the [`sslyze`](https://github.com/nabla-c0d3/sslyze) Python package maintained by Alban Diquet.
+```bash
+pip install -r requirements-scanners.txt
+pip install -r requirements-gatherers.txt
+```
 
-Other individual scanners will require additional externally installed dependencies:
-
-* `trustymail`: The `trustymail` command, available from the [`trustymail`](https://github.com/dhs-ncats/trustymail) Python package from the [Department of Homeland Security's NCATS team](https://github.com/dhs-ncats). (Override path by setting the `TRUSTYMAIL_PATH` environment variable.)
-* `a11y`: The `pa11y` command, available from the [`pa11y`](https://www.npmjs.com/package/pa11y) Node package. (Override path by setting the `PA11Y_PATH` environment variable.)
-* `third_parties`: The `phantomas` command, available from the [`phantomas`](https://www.npmjs.com/package/phantomas) Node package. (Override path by setting the `PHANTOMAS_PATH` environment variable.)
+If you plan on **developing/testing domain-scan itself**, install development requirements:
 
+```bash
+pip install -r requirements-dev.txt
+```
 
 ### Usage
 
@@ -65,7 +68,16 @@ Append columns to each row with metadata about the scan itself, such as how long
 ./scan example.com --scan=pshtt --meta
 ```
 
-##### Parallelization
+### Scanners
+
+* `pshtt` - A scanner that uses the [`pshtt`](https://github.com/dhs-ncats/pshtt) Python package from the [Department of Homeland Security's NCATS team](https://github.com/dhs-ncats).
+* `sslyze` - A scanner that uses the [`sslyze`](https://github.com/nabla-c0d3/sslyze) Python package maintained by Alban Diquet.
+* `trustymail`: The `trustymail` command, available from the [`trustymail`](https://github.com/dhs-ncats/trustymail) Python package from the [Department of Homeland Security's NCATS team](https://github.com/dhs-ncats). (Override path by setting the `TRUSTYMAIL_PATH` environment variable.)
+* `third_parties` - What third party web services are in use, using [headless Chrome](https://developers.google.com/web/updates/2017/04/headless-chrome) to trap outgoing requests. (See documentation for [using](#headless-chrome) or [writing](#developing-chrome-scanners) Chrome-based scanners.)
+* `a11y` - Accessibility issues, using [`pa11y`](https://github.com/pa11y/pa11y).
+* `noop` - Test scanner (no-op) used for development and debugging. Does nothing.
+
+### Parallelization
 
 It's important to understand that **scans run in parallel by default**, and **data is streamed to disk immediately** after each scan is done.
 
@@ -117,15 +129,6 @@ See [`docs/lambda.md`](`docs/lambda.md`) for how to build and deploy Lambda-base
 
 ### Options
 
-**Scanners:**
-
-* `pshtt` - HTTP/HTTPS/HSTS configuration, using [`pshtt`](https://github.com/dhs-ncats/pshtt).
-* `trustymail` - MX/SPF/STARTTLS/DMARC configuration, using [`trustymail`](https://github.com/dhs-ncats/trustymail).
-* `sslyze` - TLS/SSL configuration, using [`sslyze`](https://github.com/nabla-c0d3/sslyze).
-* `third_parties` - What third party web services are in use, using [headless Chrome](https://developers.google.com/web/updates/2017/04/headless-chrome) to trap outgoing requests. (See documentation for [using](#headless-chrome) or [writing](#developing-chrome-scanners) Chrome-based scanners.)
-* `a11y` - Accessibility issues, using [`pa11y`](https://github.com/pa11y/pa11y).
-* `noop` - Test scanner (no-op) used for development and debugging. Does nothing.
-
 **General options:**
 
 * `--scan` - **Required.** Comma-separated names of one or more scanners.
diff --git a/lambda/remote_build.sh b/lambda/remote_build.sh
index 4c3e2ef0..c5edde8e 100755
--- a/lambda/remote_build.sh
+++ b/lambda/remote_build.sh
@@ -42,7 +42,7 @@ pip install .
 cd ..
 
 cd domain-scan
-pip install -r requirements.txt
+pip install -r lambda/requirements-lambda.txt
 cd ..
 
 deactivate
diff --git a/lambda/requirements-lambda.txt b/lambda/requirements-lambda.txt
index 18061732..1ca82012 100644
--- a/lambda/requirements-lambda.txt
+++ b/lambda/requirements-lambda.txt
@@ -3,4 +3,3 @@
 
 strict-rfc3339
 publicsuffix
-
diff --git a/requirements-gatherers.txt b/requirements-gatherers.txt
index e374b3c1..91b850da 100644
--- a/requirements-gatherers.txt
+++ b/requirements-gatherers.txt
@@ -1,3 +1,6 @@
+###
+# Requirements used by specific gatherers.
+
 # censys
 google-cloud-bigquery
 google-auth-oauthlib
diff --git a/requirements-scanners.txt b/requirements-scanners.txt
index d519ecac..44ec8416 100644
--- a/requirements-scanners.txt
+++ b/requirements-scanners.txt
@@ -1,7 +1,5 @@
-
-# a11y
-pyyaml
-requests
+###
+# Requirements used by specific scanners.
 
 # pshtt
 git+https://github.com/dhs-ncats/pshtt.git#egg=pshtt
@@ -12,3 +10,7 @@ git+https://github.com/dhs-ncats/trustymail.git#egg=trustymail
 # sslyze
 sslyze>=1.3.4,<1.4.0
 cryptography
+
+# a11y / csp
+pyyaml
+requests
diff --git a/requirements.txt b/requirements.txt
index f462502f..37e38281 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -8,6 +8,6 @@
 # invocation.
 boto3
 
-# Used in Lanbda functions. Also copied to lambda/requirements-lambda.txt.
+# Used in Lambda functions. Also copied to lambda/requirements-lambda.txt.
 strict-rfc3339
 publicsuffix