Architecture
The architecture of pkilint is rather different from other publicly available X.509 linters. This page describes how pkilint works internally and how this design contrasts with other projects.
The architecture of pkilint consists of six major components:
- The front end. These are user-facing applications, such as the command-line linters and the REST API. The high-level Python API can also be considered part of the front end if the consumer is a Python application.
- The document loader and decoder. This component reads DER or PEM-encoded documents and creates the node structure that is traversed.
- The document type detector. This component consumes the decoded document and determines the type of document.
- The validation engine. This component traverses the node tree produced by the document loader and produces a set of findings. The set of validations that is executed is determined by the document type detector.
- The finding filter. This component filters out superseded findings or findings that should be suppressed and not reported.
- The report generator. This component produces reports on findings in a variety of formats.
pkilint was designed to support a variety of front ends. Among the primary front ends are the command-line tools bundled with the pkilint Python package; each command-line linter wraps a high-level linting Python API. The REST API front end is similar: implemented as an ASGI application, it deserializes linting requests and invokes the same high-level Python API.
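As a sketch of this layering, a command-line front end can be a thin wrapper that parses arguments and delegates to a shared high-level linting function. All names below are hypothetical illustrations of the pattern, not pkilint's actual API:

```python
import argparse

def lint_certificate(pem_data: str) -> list:
    # Hypothetical stand-in for the high-level Python API that all
    # front ends (CLI, REST, direct Python consumers) would share.
    # A real implementation would return a list of findings.
    return []

def main(argv=None) -> int:
    # The CLI front end only handles argument parsing and exit codes;
    # all linting logic lives behind the shared API.
    parser = argparse.ArgumentParser(description="Lint a PEM-encoded certificate")
    parser.add_argument("pem_file", help="path to the PEM file to lint")
    args = parser.parse_args(argv)

    with open(args.pem_file) as f:
        findings = lint_certificate(f.read())

    # Non-zero exit status signals that findings were reported.
    return 1 if findings else 0
```

A REST front end would follow the same shape: deserialize the request body, call the shared linting function, and serialize the findings into the response.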
The document loader and decoder accepts a PEM-, DER-, or Base64-encoded "document" (X.509 certificate, X.509 CRL, PKIX OCSP response, etc.) and decodes it using the ASN.1 schema for the specified document type. The actual bit-twiddling of DER decoding is performed by the pyasn1 library. The decoded schema is then used to generate a tree of "nodes", each of which represents a field within the decoded ASN.1 document.
This design differs from that of other linters, which decode the substrate into an "object" (as represented in the programming language) and thereby lose semantic information about the encoding of the document. For example, several linters are unable to exhaustively check the correctness of the encoding because that information is discarded during decoding. By tying the representation of the document closely to its encoding, pkilint can perform exhaustive checks that flag encoding errors. Additionally, leveraging libraries that provide programmatically consumable ASN.1 modules allows pkilint to check fields and other elements that cannot easily be tested in frameworks that do not support a given field out of the box.
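The node tree can be pictured as follows. This is a simplified, hypothetical structure for illustration, not pkilint's actual node class; in pkilint, each node would also hold a reference to the decoded pyasn1 value so that encoding details remain available to validators:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class DocumentNode:
    """Hypothetical node in a decoded-document tree.

    Each node corresponds to one field of the decoded ASN.1 document,
    rather than flattening the document into a plain language object.
    """
    name: str                                   # ASN.1 field name, e.g. "tbsCertificate"
    pdu: object                                 # the decoded value for this field
    parent: Optional["DocumentNode"] = None
    children: Dict[str, "DocumentNode"] = field(default_factory=dict)

    @property
    def path(self) -> str:
        # Dotted path from the document root, useful when reporting findings
        return self.name if self.parent is None else f"{self.parent.path}.{self.name}"

# Build a tiny fragment of a certificate's node tree
root = DocumentNode("certificate", pdu=None)
tbs = DocumentNode("tbsCertificate", pdu=None, parent=root)
root.children["tbsCertificate"] = tbs
serial = DocumentNode("serialNumber", pdu=1234, parent=tbs)
tbs.children["serialNumber"] = serial
```

Because validators receive nodes rather than a flattened object, a finding can always report the exact path of the offending field, such as `certificate.tbsCertificate.serialNumber`.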
There may be scenarios where the user is not certain of the type of the document that is to be validated. For example, when mass-linting a corpus of CA/Browser Forum TLS certificates, it is not possible to explicitly specify the certificate type for each certificate. In these cases, the document type detector runs a set of checks based on the content of the document (and, optionally, other data external to the document) to determine the set of validations to execute.
It should be noted that it is not always possible to determine the document type with complete accuracy, as the information required to make a definitive determination may not be available.
It is possible for users to skip the document type detector and instead explicitly specify the type of document. This is useful in scenarios where the document type does not change or is already known. For example, if a CA is using pkilint for pre-issuance linting of sponsored-validated legacy generation S/MIME certificates, then the CA can explicitly specify the SPONSORED-LEGACY certificate type, and pkilint will use the set of validations appropriate for that certificate type regardless of the content of the certificate. This is analogous to the quality assurance function in a hammer factory: the workers on the factory floor know that a certain machine will always produce hammers and need to be alerted if it starts producing screwdrivers. pkilint makes this possible, whereas other linters will not raise any alarm that the hammer-making machine is producing perfectly formed screwdrivers.
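The detection logic can be sketched as a set of content-based checks with an explicit override. The type names below are illustrative, not pkilint's actual identifiers; the extended key usage OIDs are the standard PKIX values:

```python
# Standard PKIX extended key usage OIDs (RFC 5280 / RFC 4262)
SMIME_EKU_OID = "1.3.6.1.5.5.7.3.4"       # id-kp-emailProtection
TLS_SERVER_EKU_OID = "1.3.6.1.5.5.7.3.1"  # id-kp-serverAuth

def detect_certificate_type(eku_oids, explicit_type=None):
    """Hypothetical sketch of content-based document type detection."""
    # An explicitly specified type skips content-based detection entirely:
    # the "hammer factory" pre-issuance linting scenario.
    if explicit_type is not None:
        return explicit_type

    # Otherwise, inspect the document's content to choose a validation set.
    if SMIME_EKU_OID in eku_oids:
        return "SMIME"
    if TLS_SERVER_EKU_OID in eku_oids:
        return "TLS_SERVER"

    # Not always possible to make a definitive determination.
    return "UNKNOWN"
```

A real detector would inspect far more than the extended key usage extension (certificate policies, issuer, and so on), but the dispatch shape is the same.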
The validation engine has two sub-components: the recursive node visitor (the "validator container") and the visitors themselves (the "validators").
The validator container contains one or more validators. It accepts the document node tree and calls each validator twice for each node. The first call determines whether the validator is applicable to the current node (i.e., whether it "matches"). Matching can be done using any criteria deemed appropriate, but it is generally performed by determining whether the given node is a specific ASN.1 PDU. If the first call succeeds, the validator's validation logic is invoked on the matched node. The validator may then return zero or more validation findings, which are wrapped in a validation result that provides additional metadata.
This application of the visitor pattern is one of the primary differentiating features of pkilint. Other publicly available linters take a "top-down" approach in which each lint accepts an X.509 document object and explicitly extracts the relevant components. While this is beneficial from a performance standpoint, it requires that the implementor consider every possible field where a specific check should be executed. For example, in a top-down design, a test for internal domain names in URIs would require the implementor to exhaustively enumerate every possible field where a domain name can be encoded. This sharply contrasts with the visitor pattern design adopted by pkilint, where validators can be written to match and validate fields regardless of where they appear in the document. For example, a lint for internal domain names can be implemented by matching on every uniformResourceIdentifier GeneralName field in the document.
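The two-call match/validate protocol and the location-independence of validators can be sketched as follows. This is a simplified model using plain dictionaries as nodes, not pkilint's actual classes:

```python
from urllib.parse import urlparse

class Finding:
    def __init__(self, code, node_path):
        self.code = code
        self.node_path = node_path

class UriInternalNameValidator:
    """Matches any uniformResourceIdentifier GeneralName, wherever it
    appears in the document, and flags internal domain names."""
    def match(self, node):
        # First call: is this validator applicable to this node?
        return node["type"] == "uniformResourceIdentifier"

    def validate(self, node):
        # Second call: run validation logic on the matched node.
        host = urlparse(node["value"]).hostname or ""
        if host.endswith(".internal"):
            return [Finding("internal_domain_name", node["path"])]
        return []

class ValidatorContainer:
    def __init__(self, validators):
        self.validators = validators

    def visit(self, node):
        findings = []
        for validator in self.validators:
            if validator.match(node):
                findings.extend(validator.validate(node))
        for child in node.get("children", []):
            findings.extend(self.visit(child))  # recurse into the tree
        return findings

# The validator fires no matter where in the tree the URI appears.
tree = {
    "type": "certificate", "path": "certificate", "children": [
        {"type": "uniformResourceIdentifier",
         "value": "http://ca.internal/crl",
         "path": "certificate.tbsCertificate.extensions.cRLDistributionPoints.uri",
         "children": []},
    ],
}
findings = ValidatorContainer([UriInternalNameValidator()]).visit(tree)
```

Because the container, not the validator, walks the tree, the same lint would also fire on a URI buried in an authorityInfoAccess extension without any extra code.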
Once a document has been validated using the validation engine, the set of validation results is then optionally filtered using a finding filter. Several standards, such as the CA/Browser Forum TLS and S/MIME certificate profiles, are additive to the profile requirements specified in the PKIX documents (RFC 5280, etc.). Thus, to reduce complexity and duplication of validation logic, the CA/Browser Forum linters invoke the PKIX set of validators as well as the CA/Browser Forum-specific validators. However, the standards are not perfectly aligned, and one standard may permit (or even require) something that another prohibits. In such cases, to reduce the potential for confusion caused by reporting contradictory information, a finding filter can remove specific findings from the set of results.
This is another differentiating feature of pkilint: other linters either re-implement validation logic for every standard they support or do not filter out superseded findings.
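A minimal sketch of such a filter, assuming hypothetical finding codes, might map each PKIX finding to the profile-specific finding that supersedes it:

```python
# Hypothetical mapping: a PKIX finding is dropped when the finding that
# supersedes it (from the more specific profile) is also present.
SUPERSEDED = {
    "pkix.certificate_validity": "cabf.tls.certificate_validity",
}

def filter_findings(findings):
    present = {f["code"] for f in findings}
    return [
        f for f in findings
        if not (f["code"] in SUPERSEDED and SUPERSEDED[f["code"]] in present)
    ]

results = [
    {"code": "pkix.certificate_validity", "severity": "WARNING"},
    {"code": "cabf.tls.certificate_validity", "severity": "ERROR"},
]
filtered = filter_findings(results)
```

Here only the CA/Browser Forum finding survives, so the user is never shown two contradictory verdicts about the same field.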
Once the set of results has been collected and optionally filtered, the results are then formatted for human or machine consumption. The report generator is format agnostic and is easily extensible to add new formats.
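One common way to achieve this kind of extensibility is a registry of formatter functions keyed by format name; adding a new output format then requires only registering one more function. A minimal sketch (not pkilint's actual report API):

```python
import json

FORMATTERS = {}

def formatter(name):
    # Decorator that registers a formatter function under a format name.
    def register(fn):
        FORMATTERS[name] = fn
        return fn
    return register

@formatter("text")
def text_report(results):
    return "\n".join(f"{r['severity']}: {r['code']} @ {r['path']}" for r in results)

@formatter("json")
def json_report(results):
    return json.dumps(results)

def generate_report(results, fmt="text"):
    # The generator itself is format agnostic; it just dispatches.
    return FORMATTERS[fmt](results)

results = [{"severity": "ERROR", "code": "internal_domain_name",
            "path": "certificate.tbsCertificate.extensions"}]
```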