Add Phishing Detection Embedded Data Update Workflow (#3403)
Task/Issue URL:
https://app.asana.com/0/72649045549333/1208270234071172/f
Tech Design URL:
https://app.asana.com/0/481882893211075/1207483114414814
CC: 

**Description**:
In [✓ Tech Design: Phishing Protection Data
Updates](https://app.asana.com/0/481882893211075/1207483114414814/f) we
defined an approach to get embedded data for phishing protection into
the app builds. This pattern was implemented, but two small components
still need merging:
- The update script:
  - A small bash script that pulls data from the API into the repo in JSON
    format and updates the checksums and revision values in
    `PhishingDetection.swift`.
  - This has already been implemented; it just needs testing and merging:
    - #3243
- A GitHub Action workflow that executes this script once a week and
  creates a PR that merges this data into the new release build.
  - [Example
    Script](https://github.com/duckduckgo/macos-browser/blob/0fb680211ad05dacb621b351a8f7b266e7239b7d/.github/workflows/update_phishing_detection_data.yml)
  - The secrets and workflow have already been defined; they just need to
    be tested, reviewed, and merged.
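
The checksum and revision update described above can be sketched roughly as follows. This is a minimal illustration, not the script from #3243: the function names `sha256_of` and `update_swift_value` are hypothetical, and the sketch assumes each value in `PhishingDetection.swift` is declared on a single line of the form `name: Type = value,`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real update script is in #3243 and may differ.

# Print the SHA-256 hex digest of a file, using whichever tool is installed
# (sha256sum on most Linux systems, shasum on macOS).
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# Replace the default value of a named parameter in a Swift file.
# Usage: update_swift_value <file> <parameter name> <new value>
# Assumes the declaration sits on one line: `<name>: <Type> = <value>,`
update_swift_value() {
  local file="$1" name="$2" value="$3" tmp
  tmp="$(mktemp)"
  sed -E "s|(${name}: [A-Za-z]+ = )[^,]*,|\1${value},|" "$file" > "$tmp"
  mv "$tmp" "$file"
}
```

Under these assumptions, a call such as `update_swift_value PhishingDetection.swift filterSetDataSHA "\"$(sha256_of filterSet.json)\""` would rewrite the checksum default, and `update_swift_value PhishingDetection.swift revision 1682412` would rewrite the revision.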

**Note**
After testing, but before merging, I'd like to update the GH action to
run on a schedule once per week using cron:

```
on:
  schedule:
    - cron: '0 0 * * 0'  # Midnight UTC every Sunday
```

This way it can be reviewed just once per week by whoever is on
maintenance that week as part of the weekly maintenance rota.

**Steps to test this PR**:
1. Test the script locally: `bash scripts/update_phishing_detection_data.sh`
2. Ensure the script runs, then check the changes in git:
   - `DuckDuckGo/PhishingDetection/PhishingDetection.swift` - sha256 and
     revision values updated correctly
   - `DuckDuckGo/PhishingDetection/filterSet.json` - not empty
   - `DuckDuckGo/PhishingDetection/hashPrefixes.json` - not empty
3. Check the GH action has executed and created a PR with a name like
   `Update phishing protection datasets to 1681795`:
   - #3404
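
The manual file checks above could be scripted along these lines. This is only a sketch: `file_sha256` and `check_dataset` are hypothetical helpers that are not part of this PR, and it assumes the checksum stored in `PhishingDetection.swift` is the SHA-256 of the raw bytes of each JSON file.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the manual checks; not part of the PR.

# Print the SHA-256 hex digest of a file (sha256sum on Linux, shasum on macOS).
file_sha256() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# Succeed only if the JSON file is non-empty and its SHA-256 digest
# appears somewhere in the Swift file.
# Usage: check_dataset <swift file> <json file>
check_dataset() {
  local swift_file="$1" json_file="$2" sha
  if [ ! -s "$json_file" ]; then
    echo "empty or missing: $json_file" >&2
    return 1
  fi
  sha="$(file_sha256 "$json_file")"
  if ! grep -q "$sha" "$swift_file"; then
    echo "checksum not found for: $json_file" >&2
    return 1
  fi
  echo "ok: $json_file"
}
```

From the repository root, it would be called once per dataset, for example `check_dataset DuckDuckGo/PhishingDetection/PhishingDetection.swift DuckDuckGo/PhishingDetection/filterSet.json`.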

<!--
Tagging instructions
If this PR isn't ready to be merged for whatever reason it should be
marked with the `DO NOT MERGE` label (particularly if it's a draft)
If it's pending Product Review/PFR, please add the `Pending Product
Review` label.

If at any point it isn't actively being worked on/ready for
review/otherwise moving forward (besides the above PR/PFR exception)
strongly consider closing it (or not opening it in the first place). If
you decide not to close it, make sure it's labelled to make it clear the
PRs state and comment with more information.
-->

**Definition of Done**:

* [ ] Does this PR satisfy our [Definition of
Done](https://app.asana.com/0/1202500774821704/1207634633537039/f)?

---
###### Internal references:
[Pull Request Review
Checklist](https://app.asana.com/0/1202500774821704/1203764234894239/f)
[Software Engineering
Expectations](https://app.asana.com/0/59792373528535/199064865822552)
[Technical Design
Template](https://app.asana.com/0/59792373528535/184709971311943)
[Pull Request
Documentation](https://app.asana.com/0/1202500774821704/1204012835277482/f)
not-a-rootkit authored Oct 22, 2024
1 parent eaf5cd7 commit 798425d
Showing 5 changed files with 112 additions and 761,041 deletions.
35 changes: 35 additions & 0 deletions .github/workflows/update_phishing_detection_data.yml
@@ -0,0 +1,35 @@
name: Update Phishing Detection Datasets
on:
  schedule:
    - cron: '0 0 * * 0' # Midnight UTC every Sunday
jobs:
  update_data:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          repository: duckduckgo/macos-browser
          path: macos/
          token: ${{ secrets.DAX_MACOS_BROWSER_PHISHING_AUTOMATION }}
      - name: Execute Update Script
        run: |
          cd ./macos
          REVISION="$(bash ./scripts/update_phishing_detection_data.sh | grep -oP 'Updated revision from \K\d+')"
          echo "REVISION=$REVISION" >> $GITHUB_ENV
          TEMPLATE="$(bash ./scripts/update_phishing_detection_data.sh pr-body)"
          PR_BODY_MACOS="${TEMPLATE//\{\{revision\}\}/$REVISION}"
          echo "PR_BODY_MACOS<<EOF" >> $GITHUB_ENV
          echo "$PR_BODY_MACOS" >> $GITHUB_ENV
          echo "EOF" >> $GITHUB_ENV
      - name: Create PR for macOS
        uses: peter-evans/create-pull-request@88bf0de51c7487d91e1abbb4899332e602c58bbf
        id: create-pr
        with:
          path: macos/
          add-paths: |
            ./DuckDuckGo/PhishingDetection/
          commit-message: Update phishing detection data to revision ${{ env.REVISION }}
          branch: update-phishing-protection-${{ env.REVISION }}
          title: Update phishing protection datasets to ${{ env.REVISION }}
          body: "${{ env.PR_BODY_MACOS }}"
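
The `Execute Update Script` step relies on two shell techniques: bash pattern substitution to fill the `{{revision}}` placeholder, and the `NAME<<DELIMITER` form for writing a multi-line value to `$GITHUB_ENV`. The sketch below isolates both. The function names are hypothetical, and any template text or revision passed in is illustrative, since the real PR body template comes from the update script.

```shell
#!/usr/bin/env bash
# Illustration only; function names are hypothetical.

# Replace every literal {{revision}} placeholder in a template string.
render_template() {
  local template="$1" revision="$2"
  printf '%s\n' "${template//\{\{revision\}\}/$revision}"
}

# Append a possibly multi-line value to an env file using the
# NAME<<EOF ... EOF form that the workflow writes to $GITHUB_ENV.
# Usage: append_multiline_env <name> <value> <env file>
append_multiline_env() {
  local name="$1" value="$2" env_file="$3"
  {
    echo "${name}<<EOF"
    echo "$value"
    echo "EOF"
  } >> "$env_file"
}
```

One design point worth noting: a fixed `EOF` delimiter would end the value early if the PR body ever contained a line consisting only of `EOF`, so a unique delimiter may be a safer choice for arbitrary text.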
6 changes: 3 additions & 3 deletions DuckDuckGo/PhishingDetection/PhishingDetection.swift
@@ -47,11 +47,11 @@ public class PhishingDetection: PhishingSiteDetecting {
     private let hashPrefixDataSHA: String

     private init(
-        revision: Int = 1653367,
+        revision: Int = 1682412,
         filterSetURL: URL = Bundle.main.url(forResource: "filterSet", withExtension: "json")!,
-        filterSetDataSHA: String = "edd913cb0a579c2b163a01347531ed78976bfaf1d14b96a658c4a39d34a70ffc",
+        filterSetDataSHA: String = "c18cccf9dab535f88c1e4570a9bf2f6477b53614f6494de090393cdfea6bee67",
         hashPrefixURL: URL = Bundle.main.url(forResource: "hashPrefixes", withExtension: "json")!,
-        hashPrefixDataSHA: String = "c61349d196c46db9155ca654a0d33368ee0f33766fcd63e5a20f1d5c92026dc5",
+        hashPrefixDataSHA: String = "f2a43e57eba01beb6ae6e69406d5a0b769015871f50e62ac0f2cd15afb3ae7a8",
         detectionClient: PhishingDetectionAPIClient = PhishingDetectionAPIClient(),
         dataProvider: PhishingDetectionDataProvider? = nil,
         dataStore: PhishingDetectionDataSaving? = nil,