Adding Jarelllama's Scam Blocklists & other contributions #184
-
Hey! This looks neat, and I'm actually surprised I haven't found it yet. I'll need some time to look over things thoroughly at some point. I'm using PhishStats already, and do something similar to what you are with my maintained version of the "Not on my Shift" blocklist. My present thoughts since you asked:
I'll need time to check the list content but it looks good so far. I'd also be open to integrating your project's functionality into Black Mirror if you'd like to join on as a contributor. If the contributing documentation isn't helpful I'm happy to answer any questions.
-
Thanks @T145 for the code review. Your input is incredibly valuable, since I'm the sole maintainer and reviewing my own code is less effective than getting input from someone else. Feel free to give feedback on any of the other scripts/workflows. Regarding contributing, please do let me know how else I could help besides lending my blocklist as a source. If it's helpful, I can open a pull request for data/v2/manifest.json. Thanks again!
-
Sure, if you'd like to make a PR, go for it. My thoughts on contributing more to the project involve adding what you do with some sources directly to Black Mirror. This would mean making an entry in the manifest with the source URL in the mirrors field, then designating a filter in my scripts to process the text. This is what you're doing with many of your
-
I moved the issue to a discussion so the conversation can feel more natural and other interested parties can feel more welcome to join in or give feedback.
-
Hi again. I gave it some thought after reviewing my sources and code. As this is my first big project, my code is rather rigid and leaves little room for modularity. Much of the source retrieval code does not share the same filters. For example, some sources are limited to their first 100 results or first 5 pages, depending on how often each source updates. Most annoying is handling the edge cases, such as whether a trailing slash makes curl return nothing, or sources that add modifiers to their domains. This is despite me spending the better part of last night trying to reduce redundancy and implement mawk wherever feasible. I would love to contribute in whatever ways are practical, if you have any suggestions. In the meantime, I see you have already merged my manifest.json commit. Thanks for that!
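To make the edge cases above concrete, here is a rough Python sketch of a shared normalizer that each source could pass through. It is only an illustration, not how my scripts (which are shell and mawk) actually work: the function names `normalize_entry` and `collect_domains` are hypothetical, and I'm assuming "modifiers" means adblock-style wrapping such as `||domain^$option`.

```python
from itertools import islice
from typing import Iterable, List, Optional
from urllib.parse import urlsplit


def normalize_entry(raw: str) -> Optional[str]:
    """Reduce one raw source line to a bare lowercase domain, or None."""
    entry = raw.strip().lower()
    # Skip blanks and comment lines ("#" hosts-style, "!" adblock-style).
    if not entry or entry.startswith(("#", "!")):
        return None
    # Assumption: modifiers look like ||example.com^$third-party.
    if entry.startswith("||"):
        entry = entry[2:]
    entry = entry.split("^", 1)[0].split("$", 1)[0]
    # urlsplit only recognises the host when "//" precedes it.
    if "//" not in entry:
        entry = "//" + entry
    try:
        host = urlsplit(entry).hostname
    except ValueError:
        return None
    # Drop anything that is not a dotted name (e.g. "localhost").
    if not host or "." not in host:
        return None
    return host.rstrip(".")


def collect_domains(lines: Iterable[str], limit: Optional[int] = None) -> List[str]:
    """Normalize lines, drop duplicates, and keep at most `limit` domains."""
    seen = set()

    def unique():
        for raw in lines:
            domain = normalize_entry(raw)
            if domain and domain not in seen:
                seen.add(domain)
                yield domain

    # islice(…, None) yields everything, so limit=None means "no cap".
    return list(islice(unique(), limit))
```

A per-source cap like "first 100 results" would then just be `collect_domains(lines, limit=100)`, while the trailing-slash and modifier handling stays in one place instead of being repeated per source.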
-
Contact Details
No response
What's your idea?
Hi, I'm the maintainer of Jarelllama's Scam Blocklist, a blocklist for newly created scam and phishing domains that are automatically retrieved daily using the Google Search API, automated NRD detection, and other public sources.
This blocklist aims to be an alternative to blocking all newly registered domains (NRDs) seeing how many, but not all, NRDs are malicious. A variety of sources are integrated to detect new malicious domains within a short time span of their registration date.
Taken from my README, this is the current filtering process:
Dead domains and parked domains are automatically removed daily as well. More about the blocklist's retrieval and filtering process can be found in the README.
These are the formats I currently offer:
Please do let me know your thoughts!