Fixes
jarelllama authored Apr 3, 2024
1 parent 2ff1bfd commit cf0ee73
Showing 9 changed files with 78 additions and 9,005 deletions.
10 changes: 5 additions & 5 deletions .github/workflows/build_deploy.yml
@@ -41,22 +41,22 @@ jobs:

remove-dead:
needs: [test-functions, build]
if: ${{ always() && needs.test-functions.result == 'success' }}
if: ${{ ! cancelled() && needs.test-functions.result == 'success' }}
uses: ./.github/workflows/check_dead.yml

validate:
needs: [test-functions, remove-dead]
if: ${{ always() && needs.test-functions.result == 'success' }}
if: ${{ ! cancelled() && needs.test-functions.result == 'success' }}
uses: ./.github/workflows/validate_entries.yml

deploy:
needs: [test-functions, validate]
if: ${{ always() && needs.test-functions.result == 'success' }}
if: ${{ ! cancelled() && needs.test-functions.result == 'success' }}
uses: ./.github/workflows/build_lists.yml

prune-logs:
needs: deploy
if: always()
if: ${{ ! cancelled() }}
runs-on: ubuntu-latest
steps:
- name: Checkout
@@ -80,5 +80,5 @@ jobs:
update-readme:
needs: [deploy, prune-logs]
if: ${{ always() && needs.deploy.result == 'success' }}
if: ${{ ! cancelled() && needs.deploy.result == 'success' }}
uses: ./.github/workflows/update_readme.yml
1 change: 1 addition & 0 deletions SOURCES.md
@@ -16,5 +16,6 @@ Source | Type | Inactive | Excluded from light
[ScamAdvisor](https://www.scamadviser.com/) | Any | |
[Stop 419 Scams and Scammers](https://www.stop419scams.com/) | Any | Yes | -
[StopGunScams.com](https://stopgunscams.com/) | Firearm | |
[openSquat](https://github.com/atenreiro/opensquat) | Phishing | | Yes
[r/CryptoScamBlacklist](https://www.reddit.com/r/CryptoScamBlacklist/) | Crypto | Yes | -
[r/Scams](https://www.reddit.com/r/Scams/) | Any | Yes | -
1 change: 1 addition & 0 deletions config/blacklist.txt
@@ -5,6 +5,7 @@ alliedaffirm.com
alphafx24.com
alphahorizonarmory.com
ammoforsale.com
gluckbox.com
ammomarsh.com
ammunitioncanada.ca
antrush.com
8,875 changes: 0 additions & 8,875 deletions config/domain_log.csv

Large diffs are not rendered by default.

136 changes: 40 additions & 96 deletions config/opensquat_keywords.txt
@@ -1,169 +1,113 @@
-outdoor
-usps
abercrombie
adidas
fedex
alfresco
amazon-
ammodepot
ammunition
bassethound
bernesemountain
billabong
blackmarket
bloodhound
bordercollie
bostonterrier
breeder
capitalinvest
capitaltrade
capitaltrust
cargo
cattery
chanel
charterarm
chartergun
chemical
chihuahua
chowchow
clearance
cockapoo
cockerspaniel
coinmin
cointrade
colt
copypaper
corgis
corteiz
counterfeit
courier
creditunion
cryptoaid
nicehash
cryptoinvest
cryptomine
cryptotrade
czarm
czgun
czusa
dachshund
dalmatian
danwesson
delivery
deserteagle
deutschland
diamondback
aliexpress
shopee
lazada
discount
discreet
dispensary
doberman
doodle
equitytrust
euronics
official
redcross
exotic
exporters
express
facebook
familyraised
firearm
firstflight
fluffypaw
fnamerica
fngun
foradoption
forsale
freight
frenchbull
frenchbulldog.world
frenchie
gabor
gamingplaza
frenchbulldog
gardentool
germanshepherd
globalarm
globalcrypto
globaldel
globaldoc
globalinvest
globallin
globalmail
globalmed
globaltr
glock
goldengate
goldernretriever
goldhandel
goldmin
google
gucci
gunstore
handraised
havanesepup
hermes
hkarm
hkgun
hkusa
huntergummistiefel
husky
illuminati
imrpowder
intercontinental
jackrussell
kimber
kingcharles
kitten
knives-
labrador
lekaren
levis
livestockfarm
livingroom
logistic
lululemon
magnumresearch
maltese
marlin
martens
microsoft
miningtrad
monkeyhome
monkeyshome
mossberg
gunshop
motorrad
nike
oncloud
onrunning
onschuhe
nike-
outdoor-
outdoors-
outlet
parcel
parcel-
-parcel
paypal
petshome
pinballmachine
pomeranian
poodle
poultryfarm
prioritymail
psychedelic
puppies
redwings
adoption
precisionrifle
rottweiler
samoyed
shipping
sigsauer
skechers
supply-
swarovski
swiftdel
swiftsend
alliance-
tactical
telegram
trackingservice
tradeinvest
transglobal
trustbank
upsglobal
usps-
vivobarefoot
vuitton
whatsapp
worldwide
yeezy
pornhub
torbrowser
onlyfans
bitwarden
protonmail
spotify
freevpn
expressvpn
nordvpn
bitcoin
fortnight
upsexpress
alibaba
walmart
mastercard
unitedhealth
homedepot
goldmansachs
banking
dbsgroup
ocbcbank
standardchartered
dbsbank
citibank
4 changes: 2 additions & 2 deletions config/search_terms.csv
@@ -48,7 +48,7 @@ We source designer style products. And have been able to do this by building gre
"We are one of the world’s leading Asset Management firms with approximately $500 billion in Assets under management that creates lasting impact for our investors, teams, businesses and the communities in which we live.",y,Low count (0)
"is the cryptocurrency trading platform equipped with the high-tech blockchain technology. We believe this technology will prosper our lives and increase the value of assets. Our aim is to provide more customers with a better online cryptocurrency trading environment, and to create the wise investment environment.",,Low count (24)
"you join a community of millions of people who choose to share their opinions and complete offers in exchange for rewards.",y,Low count (7)
"we are passionate about providing high-quality, stylish T-shirts that allow our customers to express themselves creatively. Whether you're looking for bold designs, personalized prints, or trendy graphics, we have something for everyone.",,Low count (7)
"we are passionate about providing high-quality, stylish T-shirts that allow our customers to express themselves creatively. Whether you're looking for bold designs, personalized prints, or trendy graphics, we have something for everyone.",y,Low count (9)
"Before we can show you nude pics of horny women in your area that want to fuck right now, we need to ask a few quick questions.",y,Low count (13)
"You will see hot nudes! Please be discreet.",y,Low count (2)
"There is no shortage of outstanding casinos in Australia. But for travelers and those who want to get away for a few days and enjoy a loaded gambling spree, an in-and-out casino may not always be enough. Thankfully, casino hotels offer the opportunity to hop out of a comfortable bed, have a delicious breakfast on-site, and take but a few steps to slot machines and table games.",,
"There is no shortage of outstanding casinos in Australia. But for travelers and those who want to get away for a few days and enjoy a loaded gambling spree, an in-and-out casino may not always be enough. Thankfully, casino hotels offer the opportunity to hop out of a comfortable bed, have a delicious breakfast on-site, and take but a few steps to slot machines and table games.",,Low count (6)
6 changes: 3 additions & 3 deletions functions/opensquat.sh
@@ -18,7 +18,7 @@ opensquat() {

# Collate fresh NRD list and exit with status 1 if any link is broken
{
wget -qO - 'https://raw.githubusercontent.com/shreshta-labs/newly-registered-domains/main/nrd-1m.csv' \
wget -qO - 'https://raw.githubusercontent.com/shreshta-labs/newly-registered-domains/main/nrd-1w.csv' \
|| exit 1
wget -qO - 'https://cdn.jsdelivr.net/gh/hagezi/dns-blocklists@latest/wildcard/nrds.10-onlydomains.txt' \
| grep -vF '#' || exit 1
@@ -34,8 +34,8 @@ opensquat() {
mkdir -p data/pending

# Run openSquat and collect results
python3 opensquat/opensquat.py -k "$KEYWORDS" \
-o data/pending/domains_opensquat.tmp -d new_nrd.tmp
python3 opensquat/opensquat.py -k "$KEYWORDS" -c 0 \
-d new_nrd.tmp -o data/pending/domains_opensquat.tmp
}

# Function 'format_file' calls a shell wrapper to standardize the format
47 changes: 24 additions & 23 deletions functions/retrieve_domains.sh
@@ -85,9 +85,9 @@ process_source() {
# Remove common subdomains
local domains_with_subdomains # Declare local variable in case while loop does not run
while read -r subdomain; do # Loop through common subdomains
# Find domains and skip to next subdomain if none found
domains_with_subdomains="$(grep "^${subdomain}\." <<< "$domains")" \
|| continue
# Find domains with subdomains and skip to next subdomain if none found
domains_with_subdomains="$(grep '\..*\.' <<< "$domains" \
| grep "^${subdomain}\." )" || continue

# Keep only root domains
domains="$(printf "%s" "$domains" | sed "s/^${subdomain}\.//" | sort -u)"
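The reworked check in this hunk only treats an entry as carrying a subdomain when it contains at least two dots, so a root domain such as `shop.com` is not mistaken for `shop.` plus a TLD. A minimal sketch of that idea in isolation, assuming a hypothetical helper `strip_subdomain` that is not part of this repository and that reads one domain per line on stdin:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the repository): strip a given subdomain
# prefix from each domain on stdin, but only when the domain contains at
# least two dots. Entries with a single dot are passed through unchanged.
strip_subdomain() {
    local subdomain="$1"
    local domain
    while read -r domain; do
        # '*.*.*' matches only strings with two or more dots
        if [[ "$domain" == *.*.* && "$domain" == "${subdomain}."* ]]; then
            # Remove the leading '<subdomain>.' prefix
            printf '%s\n' "${domain#"${subdomain}".}"
        else
            printf '%s\n' "$domain"
        fi
    done
}

# Example usage:
#   printf '%s\n' 'shop.com' 'shop.example.org' | strip_subdomain shop
```

With `shop` as the subdomain, `shop.com` passes through unchanged while `shop.example.org` becomes `example.org`.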
@@ -431,8 +431,8 @@ source_aa419() {
local url='https://api.aa419.org/fakesites'
local query_params
query_params="1/500?fromadd=$(date +'%Y')-01-01&Status=active&fields=Domain"
curl -sH "Auth-API-Id:${AA419_API_ID}" "${url}/${query_params}" |
jq -r '.[].Domain' >> "$results_file" # Trailing slash breaks API call
curl -sH "Auth-API-Id:${AA419_API_ID}" "${url}/${query_params}" \
| jq -r '.[].Domain' >> "$results_file" # Trailing slash breaks API call

process_source
}
@@ -445,9 +445,9 @@ source_guntab() {
[[ "$USE_EXISTING" == true ]] && { process_source; return; }

local url='https://www.guntab.com/scam-websites'
curl -s "${url}/" |
grep -zoE '<table class="datatable-list table">.*</table>' |
grep -aoE '[[:alnum:].-]+\.[[:alnum:]-]{2,}$' > "$results_file"
curl -s "${url}/" \
| grep -zoE '<table class="datatable-list table">.*</table>' \
| grep -aoE '[[:alnum:].-]+\.[[:alnum:]-]{2,}$' > "$results_file"
# Note results are not sorted by time added

process_source
@@ -461,9 +461,9 @@ source_petscams() {

local url="https://petscams.com"
for page in {2..21}; do # Loop through 20 pages
curl -s "${url}/" |
grep -oE '<a href="https://petscams.com/[[a-z]-]+-[[a-z]-]+/[[:alnum:].-]+-[[:alnum:]-]{2,}/">' |
sed 's/<a href="https:\/\/petscams.com\/[[:alpha:]-]\+\///;
curl -s "${url}/" \
| grep -oE '<a href="https://petscams.com/[[:alpha:]-]+/[[:alnum:].-]+-[[:alnum:]-]{2,}/">' \
| sed 's/<a href="https:\/\/petscams.com\/[[:alpha:]-]\+\///;
s/-\?[0-9]\?\/">//; s/-/./g' >> "$results_file"
url="https://petscams.com/page/${page}" # Add '/page' after first run
done
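The `grep` pattern in this hunk replaces `[[a-z]-]+` with `[[:alpha:]-]+`. The commit does not state the reason, so the following is only a plausible reading: in a POSIX bracket expression, `[[a-z]` ends at the first `]`, which makes `[[a-z]-]+` mean "one of `[` or `a` to `z`, then `-`, then one or more `]`". The sketch below compares both patterns on a made-up slug, using a hypothetical `matches` helper:

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if the whole string ($2) matches the ERE ($1).
matches() {
    grep -qE "^($1)\$" <<< "$2"
}

slug='pet-scams'   # made-up sample value

if matches '[[a-z]-]+' "$slug"; then
    echo "old pattern matches ${slug}"
else
    echo "old pattern does not match ${slug}"
fi

if matches '[[:alpha:]-]+' "$slug"; then
    echo "new pattern matches ${slug}"
else
    echo "new pattern does not match ${slug}"
fi
```

Under this reading, the old pattern would only accept odd strings such as `a-]`, whereas the character class form accepts ordinary hyphenated slugs.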
@@ -478,9 +478,9 @@ source_scamdirectory() {
[[ "$USE_EXISTING" == true ]] && { process_source; return; }

local url='https://scam.directory/category'
curl -s "${url}/" |
grep -oE 'href="/[[:alnum:].-]+-[[:alnum:]-]{2,}" title' |
sed 's/href="\///; s/" title//; s/-/./g; 301,$d' > "$results_file"
curl -s "${url}/" \
| grep -oE 'href="/[[:alnum:].-]+-[[:alnum:]-]{2,}" title' \
| sed 's/href="\///; s/" title//; s/-/./g; 301,$d' > "$results_file"
# Keep only first 300 results

process_source
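To see what the `grep`/`sed` stages of this pipeline do without fetching anything, they can be run against a small inline sample. This is a sketch only: `extract_domains` is a hypothetical wrapper, and the sample markup is invented, so the real page layout may differ.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the grep/sed stages shown in the hunk above.
# Reads HTML on stdin and prints one candidate domain per matching link.
extract_domains() {
    grep -oE 'href="/[[:alnum:].-]+-[[:alnum:]-]{2,}" title' \
        | sed 's/href="\///; s/" title//; s/-/./g'
}

# Invented sample markup (assumption: links use hyphenated slugs)
sample='<li><a href="/example-com" title="example.com">example.com</a></li>
<li><a href="/my-shop-net" title="my-shop.net">my-shop.net</a></li>'

extract_domains <<< "$sample"
```

Because every hyphen in the slug is turned into a dot, a domain that itself contains a hyphen (the second sample line) comes out as `my.shop.net` rather than `my-shop.net`. Whether that matters depends on how the results are validated later.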
@@ -494,9 +494,10 @@ source_scamadviser() {

local url='https://www.scamadviser.com/articles'
for page in {1..20}; do # Loop through pages
curl -s "${url}?p=${page}" | # Trailing slash breaks curl
grep -oE '<div class="articles">.*<div>Read more</div>'
grep -oE '[A-Z][[:alnum:].-]+\.[[:alnum:]-]{2,}' >> "$results_file"
# Trailing slash breaks curl
curl -s "${url}?p=${page}" \
| grep -oE '<div class="articles">.*<div>Read more</div>' \
| grep -oE '[A-Z][[:alnum:].-]+\.[[:alnum:]-]{2,}' >> "$results_file"
done

process_source
@@ -509,9 +510,9 @@ source_dfpi() {
[[ "$USE_EXISTING" == true ]] && { process_source; return; }

local url='https://dfpi.ca.gov/crypto-scams'
curl -s "${url}/" |
grep -oE '<td class="column-5">(<a href=")?(https?://)?[[:alnum:].-]+\.[[:alnum:]-]{2,}' |
sed 's/<td class="column-5">//; s/<a href="//; 31,$d' > "$results_file"
curl -s "${url}/" \
| grep -oE '<td class="column-5">\s*(<a href=")?(https?://)?[[:alnum:].-]+\.[[:alnum:]-]{2,}' \
| sed 's/<td class="column-5">//; s/<a href="//; 31,$d' > "$results_file"
# Keep only first 30 results

process_source
@@ -525,9 +526,9 @@ source_stopgunscams() {

local url='https://stopgunscams.com'
for page in {1..5}; do
curl -s "${url}/?page=${page}/" |
grep -oE '<h4 class="-ih"><a href="/[[:alnum:].-]+-[[:alnum:]-]{2,}' |
sed 's/<h4 class="-ih"><a href="\///; s/-/./g' >> "$results_file"
curl -s "${url}/?page=${page}/" \
| grep -oE '<h4 class="-ih"><a href="/[[:alnum:].-]+-[[:alnum:]-]{2,}' \
| sed 's/<h4 class="-ih"><a href="\///; s/-/./g' >> "$results_file"
done

process_source