Bug fixes on regexes containing : and default acts array; Add cli for ease of use
PRO-2684 committed Sep 7, 2024
1 parent be20509 commit f19c2e3
Showing 4 changed files with 48 additions and 10 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -24,7 +24,7 @@ Purify URL: Remove redundant tracking parameters, skip redirecting pages, and ex

### 🚀 Quick Start

-Visit our [demo page](https://pro-2684.github.io/?page=purlfy), or try it out with our [Tampermonkey script](https://greasyfork.org/scripts/492480)!
+Visit our [demo page](https://pro-2684.github.io/?page=purlfy), try out our [Tampermonkey script](https://greasyfork.org/scripts/492480), or simply `node cli.js <url[]> [<options>]` to purify a list of URLs (For more information, please refer to the comments in the script).

```js
// Somewhat import `Purlfy` class from https://cdn.jsdelivr.net/gh/PRO-2684/pURLfy@latest/purlfy.min.js
@@ -336,7 +336,7 @@ If URL `https://example.com/?key=123` matches this rule, the `key` parameter wil

### 🖇️ Processors

-Some processors support parameters, simply append them to the function name separated by a colon (`:`): `func:arg1:arg2...:argn`. The following processors are currently supported:
+Some processors support parameters, simply append them to the function name separated by a colon (`:`): `func:arg`. The following processors are currently supported:

- `url`: `string->string`, URL decoding (`decodeURIComponent`)
- `base64`: `string->string`, Base64 decoding (`decodeURIComponent(escape(atob(s.replaceAll('_', '/').replaceAll('-', '+'))))`)
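
The `base64` expression quoted in the list above can be tried on its own. The following is a minimal sketch, not code from the repository: `decodeUrlSafeBase64` is a hypothetical name, and the sample URL is made up for illustration. It assumes a runtime where `atob`, `escape`, and `Buffer` are available as globals (for example a recent Node.js).

```javascript
// Hypothetical standalone copy of the `base64` processor expression.
function decodeUrlSafeBase64(s) {
    // Map the URL-safe alphabet ("-" and "_") back to the standard one ("+" and "/").
    const standard = s.replaceAll("_", "/").replaceAll("-", "+");
    // atob yields one character per byte; escape + decodeURIComponent
    // reinterprets those bytes as UTF-8 text.
    return decodeURIComponent(escape(atob(standard)));
}

// Build a URL-safe Base64 sample (illustrative input only).
const original = "https://example.com/?q=你好";
const encoded = Buffer.from(original, "utf8")
    .toString("base64")
    .replaceAll("+", "-")
    .replaceAll("/", "_");

console.log(decodeUrlSafeBase64(encoded) === original);
```

The `escape`/`decodeURIComponent` pair is what lets non-ASCII text survive the round trip; `atob` alone would return the raw UTF-8 bytes as separate characters.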
5 changes: 2 additions & 3 deletions README_zh.md
@@ -24,7 +24,7 @@

### 🚀 Quick Start

-Visit our [demo page](https://pro-2684.github.io/?page=purlfy) or try it out with our [Tampermonkey script](https://greasyfork.org/scripts/492480)!
+Visit our [demo page](https://pro-2684.github.io/?page=purlfy), try out our [Tampermonkey script](https://greasyfork.org/scripts/492480), or simply run `node cli.js <url[]> [<options>]` to purify a list of URLs (for more information, please refer to the comments in the script).

```js
// Import the `Purlfy` class in some way from https://cdn.jsdelivr.net/gh/PRO-2684/pURLfy@latest/purlfy.min.js
@@ -101,7 +101,6 @@ new Purlfy({

- `Purlfy.version: string`: The version of pURLfy


## 📖 Rules

Community-contributed rule files are hosted on GitHub; you can find them in [pURLfy-rules](https://github.com/PRO-2684/pURLfy-rules). The format of a rule file is as follows:
@@ -337,7 +336,7 @@ new Purlfy({

### 🖇️ Processors

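-Some processors accept parameters; simply separate them with `:`: `func:arg1:arg2...:argn`. The following processors are currently supported:
+Some processors accept parameters; simply separate them with `:`: `func:arg`. The following processors are currently supported: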

- `url`: `string->string`, URL decoding (`decodeURIComponent`)
- `base64`: `string->string`, Base64 decoding (`decodeURIComponent(escape(atob(s.replaceAll('_', '/').replaceAll('-', '+'))))`)
36 changes: 36 additions & 0 deletions cli.js
@@ -0,0 +1,36 @@
// `node cli.js <url[]> [<options>]`
// `url` is the URL to purify. You can pass multiple URLs to purify them all.
// `options` can contain:
// - `--rules <enabled-rules>`, where `enabled-rules` is a comma-separated list of rules to enable. Default is all rules. Short-hand `-r`.

const Purlfy = require("./purlfy");
const { parseArgs } = require("node:util");

const options = {
    rules: {
        type: "string",
        short: "r",
        default: ""
    }
};
const args = process.argv.slice(2);
const {
    values,
    positionals: urls,
} = parseArgs({ args, options, allowPositionals: true });
const { rules: rulesStr } = values;
const enabledRules = rulesStr.trim().length ? rulesStr.split(",").map((rule) => rule.trim()).filter(Boolean) : require("./rules/list.json");
console.log("Enabled rules:", enabledRules);
console.log("---");

const purifier = new Purlfy({
    fetchEnabled: true,
    lambdaEnabled: true,
});
const rules = enabledRules.map((rule) => require(`./rules/${rule}.json`));
purifier.importRules(...rules);
for (const url of urls) {
    purifier.purify(url).then((purified) => {
        console.log(url, "=>", purified.url);
    });
}
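
The option handling in `cli.js` can be exercised without loading any rule files. The sketch below is an illustration rather than part of the commit: `parseCli` is a hypothetical helper, `allRules` stands in for the contents of `./rules/list.json` (not shown here), and the rule names `rule-a`, `rule-b`, and `rule-c` are placeholders.

```javascript
const { parseArgs } = require("node:util");

// Hypothetical helper mirroring the option handling in cli.js.
// `allRules` is a stand-in for the contents of ./rules/list.json.
function parseCli(args, allRules) {
    const options = {
        rules: { type: "string", short: "r", default: "" },
    };
    const { values, positionals: urls } = parseArgs({ args, options, allowPositionals: true });
    const rulesStr = values.rules;
    // An empty or whitespace-only value falls back to every known rule.
    const enabledRules = rulesStr.trim().length
        ? rulesStr.split(",").map((rule) => rule.trim()).filter(Boolean)
        : allRules;
    return { urls, enabledRules };
}

const known = ["rule-a", "rule-b", "rule-c"]; // placeholder rule names
const parsed = parseCli(["https://example.com/?utm_source=x", "-r", "rule-a, rule-b"], known);
console.log("Enabled rules:", parsed.enabledRules);
console.log("URLs:", parsed.urls);
```

Passing the argument array explicitly, instead of reading `process.argv`, keeps the parsing step easy to check in isolation.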
13 changes: 8 additions & 5 deletions purlfy.js
@@ -31,7 +31,10 @@ class Purlfy extends EventTarget {
     static #acts = {
         "url": decodeURIComponent,
         "base64": s => decodeURIComponent(escape(atob(s.replaceAll('_', '/').replaceAll('-', '+')))),
-        "slice": (s, start, end) => s.slice(parseInt(start), end ? parseInt(end) : undefined),
+        "slice": (s, startEnd) => {
+            const [start, end] = startEnd.split(":");
+            return s.slice(parseInt(start), end ? parseInt(end) : undefined)
+        },
         "regex": (s, regex) => {
             const r = new RegExp(regex);
             const m = s.match(r);
@@ -162,16 +165,16 @@ class Purlfy extends EventTarget {
     static #applyActs(input, acts, logFunc) {
         let dest = input;
         for (const cmd of (acts)) {
-            const args = cmd.split(":");
-            const name = args[0];
+            const name = cmd.split(":")[0];
+            const arg = cmd.slice(name.length + 1);
             const act = Purlfy.#acts[name];
             if (!act) {
                 logFunc("Invalid act:", cmd);
                 dest = null;
                 break;
             }
             try {
-                dest = act(dest, ...args.slice(1));
+                dest = act(dest, arg);
             } catch (e) {
                 logFunc(`Error processing input with act "${name}":`, e);
                 dest = null;
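
To see why the act name is now separated from its argument at the first colon only, consider a `regex` act whose pattern itself contains a colon. The sketch below is a standalone illustration: `splitAct` and `splitActOld` are hypothetical names, not functions in `purlfy.js`, and the sample pattern is made up.

```javascript
// Hypothetical helper: split "name:arg" at the first colon only (new behaviour).
function splitAct(cmd) {
    const name = cmd.split(":")[0];
    const arg = cmd.slice(name.length + 1);
    return { name, arg };
}

// Old behaviour, for comparison: every colon started a new argument.
function splitActOld(cmd) {
    const [name, ...args] = cmd.split(":");
    return { name, args };
}

const cmd = String.raw`regex:https?:\/\/example\.com\/.*`;
console.log(splitActOld(cmd).args); // the pattern is cut in two after "https?"
console.log(splitAct(cmd).arg);     // the whole pattern survives

// `slice` now receives "start:end" as one string and splits it itself.
const slice = (s, startEnd) => {
    const [start, end] = startEnd.split(":");
    return s.slice(parseInt(start), end ? parseInt(end) : undefined);
};
console.log(slice("abcdef", splitAct("slice:1:4").arg)); // "bcd"
```

Moving the splitting into `slice` means each processor decides how to interpret its own argument, so a processor such as `regex` can accept colons freely.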
@@ -459,7 +462,7 @@ class Purlfy extends EventTarget {
                         logFunc("Visit mode, but got redirected to:", r.url);
                         urlObj = new URL(r.headers.get("location"), urlObj.href);
                     } else {
-                        const dest = Purlfy.#applyActs(html, rule.acts?.length ? rule.acts : ["regex:https?:\/\/.(?:www\.)?[-a-zA-Z0-9@%._\+~#=]{2,256}\.[a-z]{2,6}\b(?:[-a-zA-Z0-9@:%_\+.~#?!&\/\/=]*)"], logFunc);
+                        const dest = Purlfy.#applyActs(html, rule.acts?.length ? rule.acts : [String.raw`regex:https?:\/\/.(?:www\.)?[-a-zA-Z0-9@%._\+~#=]{2,256}\.[a-z]{2,6}\b(?:[-a-zA-Z0-9@:%_\+.~#?!&\/\/=]*)`], logFunc);
                         if (dest && URL.canParse(dest, urlObj.href)) { // Valid URL
                             urlObj = new URL(dest, urlObj.href);
                         } else { // Invalid URL
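
The switch to `String.raw` in the default acts array matters because an ordinary string literal processes backslash escapes before the text ever reaches `new RegExp`. The following sketch uses a shortened fragment of the pattern for illustration; the variable names are made up.

```javascript
// In an ordinary string literal, "\." collapses to "." and "\b" becomes a
// backspace character (U+0008), so the regex source is silently altered.
const plain = "\.[a-z]{2,6}\b";

// String.raw keeps the backslashes, so RegExp sees \. and \b as written.
const raw = String.raw`\.[a-z]{2,6}\b`;

console.log(JSON.stringify(plain)); // shows the embedded backspace as "\b" in JSON form
console.log(JSON.stringify(raw));   // shows doubled backslashes, i.e. real backslashes

console.log(new RegExp(plain).test(".com")); // false: a literal backspace is required
console.log(new RegExp(raw).test(".com"));   // true: "\b" is a word boundary
```

An equivalent fix would be to double every backslash in the ordinary literal; `String.raw` keeps the pattern readable and identical to how it would appear in a regex literal.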