-
Notifications
You must be signed in to change notification settings - Fork 55
/
Copy pathquickstart.goggle
145 lines (125 loc) · 6.46 KB
/
quickstart.goggle
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
! name: Quickstart
! description: A quick tour of Goggles
! public: false
! author: Goggles 101
! Goggles are simple, self-contained text files which can be hosted in Github or
! Gitlab. These files contain instructions allowing you to tell Brave Search how
! you'd like your results to be ranked. You can target specific URL patterns
! (and, soon, website titles and other aspects of Web pages) and indicate how
! their ranking should be altered (e.g. boosted, downranked, or completely
! discarded from the results).
! A Goggle source file consists of instructions, one per line. Empty lines, or
! comments (starting with an exclamation mark: '!') are ignored. If your Goggle
! contains invalid instructions, submitting the Goggle on Brave Search will fail
! and you will get feedback regarding what went wrong.
! You probably already noticed the header of this file which contains some
! metadata about your Goggle, such as: name, description, public and author.
! These are *mandatory*.
! Additionally, you can specify the following optional metadata attributes:
! * homepage — specifies a homepage URL displayed on your Goggle's profile.
! * issues — specifies a URL where users can report issues for your Goggle.
! * transferred_to — Allows to transfer ownership of a Goggle.
! * avatar — specifies a *valid* HEX color code for your Goggle.
! * license — specifies the license of a Goggle's instructions.
! The simplest instruction is a plain-text pattern which can be found in URLs.
! The following would match any search result whose URL contains the pattern
! as-is:
/this/is/a/pattern
! It is also possible to use some limited "globbing" capabilities such as the
! '*' character which will match zero, one or more characters (note: the number
! of wildcards allowed in a given instruction is limited):
/this/is/*/pattern
! The special character '^' can be used to indicate that an URL delimiter such
! as '/' or end-of-url can be matched. More specifically, '^' is equivalent to
! the regexp: [^\w\d._%-]|$
!
! This means that it can match either the end of the URL or any character that is
! not a letter, digit, dot, underscore, percent sign or dash. Here are a few
! examples of characters that are matched by '^': / (slash), = (equal),
! [ (bracket), : (colon), etc.
!
! In practice, you will usually want to use '/' (or any other specific
! separator like '=' or '?') most of the time in your patterns, except at the end
! of a pattern in cases where you want to be a little bit more generic, and
! express that your pattern should be either matching at the end of the URL or be
! followed by a separator and then arbitrary URL components.
!
! For example, |https://example.org^ will match: 'https://example.org',
! 'https://example.org/' or 'https://example.org/path'; but it will *not* match
! 'https://example.org.ac', which is also a valid domain name starting with
! 'https://example.org'.
!
! Another example, /foo.js^ will match: 'https://example.org/foo.js',
! 'https://example.org/foo.js?param=42', 'https://example.org/foo.js/' but it
! will *not* match 'https://example.org/foo.jsx' (because it is not followed by a
! separator).
!
! Also note that the maximum number of carets allowed in a given instruction is
! limited.
/this/is/a/pattern^
|https://example.org^
/foo.js^
! By default, a pattern can match anywhere in the URL, but there are specific
! characters which can be used to indicate prefix or suffix matches: we call them
! "anchors".
!
! The '|' character can be used at the beginning or end of an instruction to
! indicate anchoring. The following instruction will match a prefix of the URL:
|https://en.
! The following will match a suffix of the URL:
/some/path.html|
! Both can be used at the same time:
|https://brave.com|
! Additionally, each instruction can specify a list of options, following the
! '$' character and separated by commas (','). Options can be used to more finely
! target specific search results, or to indicate how a matched result's ranking
! should be altered.
! The most basic option is 'site=', which can be used to limit a instruction to
! a specific website, based on its domain. Options can be specified on their own
! (e.g. if you want to target any page of a site) or in conjunction with a
! pattern:
$site=brave.com
/blog/$site=brave.com
! Another set of options can be used to indicate what you want your instruction
! to target. By default any instruction will apply to a URL, but we will add the
! ability to match other aspects of a page too, in the future:
!
! web3$inurl
! web3$intitle
! web3$indescription
! web3$incontent
! Finally, you can specify an 'action', which indicates how the ranking of a
! matched result should be changed by your instruction. This is the mechanism
! through which you can customize the ranking of results to your liking. You can
! use one of three possible actions in your instructions, and by default, any
! instruction without an action will be considered as 'boost':
/r/brave_browser/
/r/brave_browser/$boost
/r/brave_browser/$boost=2
/r/brave_browser/$boost=3
! The value associated with the option indicate the 'strength' with which you
! want to alter the ranking (note: that it is currently limited to a maximum
! value of 10). It can be used to boost results differently, even inside of the
! same Goggle (e.g. some results should be favored more than others).
!
! You can also downrank results:
/r/google/$downrank
/r/google/$downrank=2
/r/google/$downrank=3
! Last but not least, you can discard results completely:
$discard,site=idontwanttobepartoftheresults.com
/this/is/spam/$discard
! Individually, each instruction can either target a very narrow set of pages (or
! even a single page), or a wider range of them, to apply reranking to a bigger
! set of results. In combination, hundreds or more instructions can allow you to
! express complex reranking functions.
! Although the Goggles language could express instructions to search through a
! small set of websites or act as a blocklist, Goggles really shine when used to
! express boosting and downranking across many domains and pages.
! Goggles also impacts the behavior of features such as Discussions and News or
! Videos clusters, giving you full control on your search experience.
!
! We are already using Goggles internally, and we can't wait to see what you will
! do with them!
! To learn more about Goggles, we recomend that you read the "Learn by example"
! section of our "Getting Started" page here: https://github.com/brave/goggles-quickstart/blob/main/getting-started.md#learn-by-example