add proposal to allow continuous vulnerability scanning #212

50 changes: 50 additions & 0 deletions proposals/new/continuous-vulnerability-scanning.md
@@ -0,0 +1,50 @@
# Proposal: `Allow vulnerability scans to be configured to scan continuously`

Author:
- Thomas O'Brien / [@slushysnowman](https://github.com/slushysnowman)

## Abstract

Proposal to allow vulnerability scans to be configured to run continuously on the basis of 'last scan time', rather than the current setup, which only allows scheduled scans of all stored images.

## Background

Currently, the vulnerability scanners can be configured via cron notation to scan all stored artifacts. This mostly works well, but as the number of artifacts stored in the registry rises, the time taken to complete all the scans also rises, and a large backlog of artifacts waiting to be scanned builds up.

This can result in:
- periods of very heavy database load, far above normal levels, because all scan results are stored in the database
- scans triggered by a push being delayed, because they are queued behind the scheduled scans

The scan queue throughput can be increased by scaling the vulnerability scanner horizontally, but this only increases the load on the database further.

## Proposal

Implement a mechanism that will scan an image on the basis of the last scan time.

For example, if Harbor is configured to scan all images weekly, it should not run a single scheduled weekly scan in which all artifacts are scanned at the same time. Instead, Harbor should look at when each image was last scanned and ensure that it is scanned again at or around one week later.
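
The selection step described above could look roughly like the following. This is a minimal sketch, not Harbor's implementation: the `Artifact` record, the `due_for_scan` function, and the `batch_size` cap are all hypothetical names introduced for illustration, and it assumes each artifact carries a last-scan timestamp.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Artifact:
    # Hypothetical record; not Harbor's actual data model.
    digest: str
    last_scanned: Optional[datetime]  # None means the artifact was never scanned


def due_for_scan(artifacts: List[Artifact], now: datetime,
                 interval: timedelta, batch_size: int) -> List[Artifact]:
    """Return up to batch_size artifacts whose last scan is at least `interval` old."""
    due = [a for a in artifacts
           if a.last_scanned is None or now - a.last_scanned >= interval]
    # Never-scanned artifacts come first, then the oldest scans.
    due.sort(key=lambda a: (a.last_scanned is not None,
                            a.last_scanned or datetime.min))
    return due[:batch_size]


now = datetime(2023, 1, 15, 12, 0)
artifacts = [
    Artifact("sha256:aaa", now - timedelta(days=8)),
    Artifact("sha256:bbb", now - timedelta(days=2)),
    Artifact("sha256:ccc", None),
    Artifact("sha256:ddd", now - timedelta(days=7)),
]
selected = due_for_scan(artifacts, now, timedelta(weeks=1), batch_size=10)
print([a.digest for a in selected])
```

Capping each pass with `batch_size` is one possible way to bound how much work is queued at once; a periodic job could call `due_for_scan` every few minutes rather than once per week.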

## Non-Goals

Explicitly not included in this proposal:
- The ability to exclude certain artifacts from scheduled scans

## Rationale

The advantage of this approach is that load is spread out over time, whereas the current approach produces large spikes in scanning and database usage.
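
To illustrate the difference in peak load, the sketch below compares the two approaches for a weekly scan interval. The registry size is a made-up number, and the continuous case assumes the idealized situation in which last-scan times are evenly spread across the week; real distributions would follow push and scan history.

```python
from collections import Counter

HOURS_PER_WEEK = 7 * 24


def peak_hourly_scans(scan_hours):
    """Largest number of scans queued in any single hour of the week."""
    return max(Counter(scan_hours).values())


total_artifacts = 16800  # hypothetical registry size

# Scheduled: every artifact is queued in the same hour of the week.
scheduled = [0] * total_artifacts

# Continuous (idealized): due times are spread evenly over the week.
continuous = [i % HOURS_PER_WEEK for i in range(total_artifacts)]

print(peak_hourly_scans(scheduled))   # 16800
print(peak_hourly_scans(continuous))  # 100
```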

The disadvantage of this approach is that it gives up a benefit of the current setup, in which vulnerability scans can be scheduled for 'off-hours', when usage is expected to be low and the performance impact is reduced. In the proposed setup, there may be times during 'on-hours' when the scan backlog is high.

## Compatibility

Depending on whether this is implemented alongside the existing scheduled functionality or separately, this could be a breaking change.

## Implementation


## Open issues (if applicable)
- Potentially related, may mitigate this issue - https://github.com/goharbor/harbor/issues/17538
- https://github.com/goharbor/harbor/issues/17505
- https://github.com/goharbor/harbor/issues/17125
- Could mitigate this - https://github.com/goharbor/harbor/issues/12140