Update search syntax description #539

Merged
merged 2 commits into from
Nov 8, 2024
104 changes: 72 additions & 32 deletions en/finding-sorting-and-cleaning-entries/search.md
@@ -13,15 +13,6 @@ To make the cursor jump to the search field, you can:

To view the search history, right-click in the search field. Only the ten most recent searches are displayed in the sub-menu. The button for clearing the history is located below the list of searches.

## Search settings

At the right of the search text field, 2 buttons allow for selecting some settings:

* Regular expressions
* Whether or not the search query uses [regular expressions](search.md#regular-expressions).
* Case sensitivity
* Whether or not the search query is case sensitive.

## Simple search <a href="#simple-search" id="simple-search"></a>

In a normal search, the program searches your library for all occurrences of the words in your search string once you have entered it. Only entries containing all words are considered matches. To search for sequences of words, enclose the sequences in double quotes. For instance, the query **progress "marine aquaculture"** will match entries containing both the word "progress" and the phrase "marine aquaculture".
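
The "all words must match, quoted phrases stay together" rule can be sketched in a few lines of Python. This is only an illustration of the rule described above, not JabRef's implementation: the function name `simple_match`, the dictionary representation of an entry, and the case-insensitive comparison are assumptions made for the example.

```python
import shlex


def simple_match(query: str, entry: dict[str, str]) -> bool:
    """Return True if every query term occurs in at least one field of the entry."""
    # shlex.split keeps a double-quoted phrase together as a single term,
    # e.g. 'progress "marine aquaculture"' -> ['progress', 'marine aquaculture']
    terms = shlex.split(query)
    values = [value.lower() for value in entry.values()]
    # Every term must be found in some field (assumed: case-insensitive).
    return all(any(term.lower() in value for value in values) for term in terms)


# Hypothetical entry used only for illustration
entry = {"title": "Progress in marine aquaculture", "year": "2019"}
print(simple_match('progress "marine aquaculture"', entry))
print(simple_match('progress "aquaculture marine"', entry))
```

The second query fails because the quoted words must appear as one contiguous phrase, whereas unquoted words may appear anywhere in the entry.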
@@ -30,25 +21,30 @@ All entries that do not match are hidden, leaving for display the matching entri

To stop displaying the search results, just clear the search field, press Esc or click on the "Clear" (`X`) button.

## Search using regular expressions <a href="#regular-expressions" id="regular-expressions"></a>
## Search within specific fields

{% hint style="warning" %}
Make sure that the button "regular expressions" is activated
{% endhint %}
To search for entries whose author contains **miller**, enter: `author = miller`. The `=` sign is actually a shorthand for `contains`. Searching for an exact match is possible using `matches` or `==`.

### General syntax
If a field is not given, all fields are searched and one can mix the selection:
`video and year == 1932` will search for entries with any field containing `video` and the field `year` being exactly `1932`.
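
As a rough sketch of how the mixed query `video and year == 1932` is evaluated, the following Python snippet combines an unfielded "contains" test with a fielded exact match. The helper names and the dictionary representation of an entry are hypothetical; the case-insensitive comparison mirrors the default behavior of `=` and `==`, but the code is not taken from JabRef.

```python
def contains(value: str, term: str) -> bool:
    """Case-insensitive substring test, the meaning of `=` / `contains`."""
    return term.lower() in value.lower()


def exact(value: str, term: str) -> bool:
    """Case-insensitive whole-value test, the meaning of `==` / `matches`."""
    return value.lower() == term.lower()


def matches_video_1932(entry: dict[str, str]) -> bool:
    """Evaluate the query `video and year == 1932` against one entry."""
    # Unfielded term: any field may contain "video"
    any_field_has_video = any(contains(value, "video") for value in entry.values())
    # Fielded term: the year field must be exactly "1932"
    year_is_1932 = exact(entry.get("year", ""), "1932")
    return any_field_has_video and year_is_1932


# Hypothetical entries used only for illustration
print(matches_video_1932({"title": "Early video experiments", "year": "1932"}))
print(matches_video_1932({"title": "Early video experiments", "year": "1933"}))
```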

In order to only search for content within specific fields and/or to include logical operators in the search expression, a special syntax is available in which these can be specified. Both the field specification and the search term support [regular expressions](search.md#regular-expressions).
### Pseudo fields

#### Search within specific Fields
JabRef defines the following pseudo fields:

To search for entries whose author contains **miller**, enter: `author = miller`. The `=` sign is actually a shorthand for `contains`. Searching for an exact match is possible using `matches` or `==`.
| | | |
| ---------------- | ------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| **Pseudo field** | **Purpose** | **Example** |
| `anyfield` / `any` | Search in any field | `anyfield contains fruit`: search for entries having one of its fields containing the word **fruit**. This is identical to just writing `fruit`. It may be more useful as `anyfield matches apple`, where one field must be exactly `apple` for a match. |
| `anykeyword` | Search among the keywords | `anykeyword contains apple`: search for entries which have the word **apple** among its keywords. However, as this also matches `pineapple`, it may be more useful in searches of the type `anykeyword matches apple`, which will not match `apples` or `pineapple`. |
| `key` | Search for citation keys | `citationkey == miller2005`: search for an entry whose citation key is **miller2005** |
| `entrytype` | Search for entries of a certain type | `entrytype = thesis`: search entries whose type (as displayed in the `entrytype` column) contains the word **thesis** (which would be **phdthesis** and **mastersthesis**) |

#### Search for terms containing spaces
## Search for terms containing spaces

If the search term contains spaces, enclose it in quotes. Do _not_ use spaces in the field specification! E.g to search for entries with the title "image processing", type: `title = "image processing"`
If the search term contains spaces, enclose it in quotes. Do _not_ use spaces in the field specification! E.g., to search for entries with the title "image processing", type: `title = "image processing"`

#### Search using parentheses, `and`, `or` and `not`
## Search using parentheses, `and`, `or` and `not`

To search for entries with the title _or_ the keyword "image processing", type: `title|keywords = "image processing"`. To search for entries _without_ the title or the keyword "image processing", type: `title|keywords != "image processing"`. It is also possible to chain search expressions. In general, you can use `and`, `or`, `not`, and parentheses as intuitively expected:

@@ -61,23 +57,67 @@ To search for entries with the title _or_ the keyword "image processing", type:
| (X) | X, as a capturing group |
| != | tests if the search term is _not_ contained in the field (equivalent to `not ... contains ...`) |

#### Regular Expression search and Field Types
## Search settings

The selection of field types to search (required, optional, all) is always overruled by the field specification in the search expression. If a field is not given, all fields are searched. For example, `video and year == 1932` will search for entries with any field containing `video` and the field `year` being exactly `1932`.
At the right of the search text field, two buttons allow for selecting some settings:

### Pseudo fields
* Regular expressions
* Whether the search query uses [regular expressions](search.md#regular-expressions).
* Case sensitivity
* Whether the search query is case-sensitive.

JabRef defines the following pseudo fields:
These settings apply to all "unfielded" search terms, that is, to all search terms that do not specify a field (e.g., `title`).

| | | |
| ---------------- | ------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| **Pseudo field** | **Purpose** | **Example** |
| `anyfield` | Search in any field | `anyfield contains fruit`: search for entries having one of its fields containing the word **fruit**. This is identical to just writing `apple`. It may be more useful as `anyfield matches apple`, where one field must be exactly `apple` for a match. |
| `anykeyword` | Search among the keywords | `anykeyword matches apple`: search for entries which have the word **apple** among its keywords. However, as this also matches `pineapple`, it may be more useful in searches of the type `anykeyword matches apple`, which will not match `apples` or `pineapple` |
| `key` | Search for citation keys | `citationkey == miller2005`: search for an entry whose citation key is **miller2005** |
| `entrytype` | Search for entries of a certain type | `entrytype = thesis`: search entries whose type (as displayed in the `entrytype` column) contains the word **thesis** (which would be **phdthesis** and **mastersthesis**) |
## Modifiers for fields

{% hint style="warning" %}
This has changed with JabRef v6
{% endhint %}

JabRef offers operators for the fielded search.
The general idea is to have `=` for contains search and `==` for exact matches.
Then, `!` forces case-sensitive matching when placed at the end of the operator and negates the operator when placed in front of it.
Finally, the `~` sign is used to enable regular-expression-based search.

### Advanced use of regular expressions <a href="#regular-expressions-advanced" id="regular-expressions-advanced"></a>
This leads to the following operator combinations:

| Operator | Explanation                            |
|----------|----------------------------------------|
| `=`      | Case-insensitive contains              |
| `=!`     | Case-sensitive contains                |
| `==`     | Exact match, case-insensitive          |
| `==!`    | Exact match, case-sensitive            |
| `=~`     | Regex check, case-insensitive          |
| `=~!`    | Regex check, case-sensitive            |
| `!=`     | Negated case-insensitive contains      |
| `!=!`    | Negated case-sensitive contains        |
| `!==`    | Negated exact match, case-insensitive  |
| `!==!`   | Negated exact match, case-sensitive    |
| `!=~`    | Negated regex check, case-insensitive  |
| `!=~!`   | Negated regex check, case-sensitive    |
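
The table follows a regular pattern: an optional leading `!` for negation, a core operator (`=`, `==`, or `=~`), and an optional trailing `!` for case sensitivity. The following Python sketch decodes an operator string along these lines. The names `Operator` and `decode_operator` are hypothetical and chosen for this illustration; JabRef's own parser may be structured differently.

```python
from typing import NamedTuple


class Operator(NamedTuple):
    negated: bool         # leading "!"
    mode: str             # "contains", "exact", or "regex"
    case_sensitive: bool  # trailing "!"


# Core operators from the table, without the "!" modifiers
MODES = {"=": "contains", "==": "exact", "=~": "regex"}


def decode_operator(op: str) -> Operator:
    """Split an operator such as "!=~!" into its three components."""
    core = op
    negated = core.startswith("!")
    if negated:
        core = core[1:]
    case_sensitive = core.endswith("!")
    if case_sensitive:
        core = core[:-1]
    if core not in MODES:
        raise ValueError(f"unknown search operator: {op!r}")
    return Operator(negated, MODES[core], case_sensitive)


for op in ["=", "=!", "==", "=~!", "!=", "!==!", "!=~"]:
    print(op, decode_operator(op))
```

Because negation is only recognized at the front and case sensitivity only at the end, `!=` is read as "negated contains" and `=!` as "case-sensitive contains", which matches the two corresponding rows of the table.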

Remember, the regex option has no effect on "field = value" expressions.
To use regex with field names, the expression must have the form "field =~ value", which will apply the regular expression regardless of the ".*" regex option.
To put it another way, using `field = myterm` explicitly disables regex while `field =~ myterm` explicitly enables it, _on this term only without affecting the rest of the search._ Note that the "abc" case-sensitive option follows the same principle.

The idea makes sense, because it allows regex and non-regex terms to coexist in the same search.

However, in practice this is totally unintuitive and not worth the trade-off. My suggestion for the maintainers is to keep "field =~ value" explicit (always apply regex syntax for this term) and make "field = value" apply standard or regex syntax, depending on the regex button/checkmark. In other words, `=` and `=~` should be treated as equivalent when the regex option is enabled.

Personally, I keep regex enabled all the time, so adding escape characters as needed has become second nature.

This is how the search currently works in the development version.

| Search expression | Regex option | Term 1 (`title ...`) | Term 2 (`1.0`) |
|--------|--------|---------|---------|
| `title =~ pa*ediatric AND 1.0` | Off | Matches "paediatric", "pediatric" | Matches "1.0" |
| `title =~ pa*ediatric AND 1.0` | On | Matches "paediatric", "pediatric" | Matches "1.0", "1+0", "1/0", "1q0", ... |
| `title = pa*ediatric AND 1.0` | Off | No match. Regex is disabled | Matches "1.0" |
| `title = pa*ediatric AND 1.0` | On | No match. Regex is disabled for this term | Matches "1.0", "1+0", "1/0", "1q0", ... |
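
The difference between treating a term as a regular expression and treating it as literal text can be reproduced with a short sketch. It uses Python's `re` module rather than Java's regex engine (which JabRef uses); for the simple patterns in the table, `a*` and `.`, both behave the same way. The function `term_matches`, the substring-style matching, and the case-insensitive comparison are assumptions made for this illustration.

```python
import re


def term_matches(term: str, text: str, regex: bool) -> bool:
    """Check whether `term` is found in `text`, as a regex or as literal text."""
    # With regex disabled, escape the term so "*" and "." lose their special meaning.
    pattern = term if regex else re.escape(term)
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


# Term 1: "a*" means "zero or more a", so both spellings match as a regex
print(term_matches("pa*ediatric", "paediatric", regex=True))
print(term_matches("pa*ediatric", "pediatric", regex=True))
# As literal text, only the exact characters "pa*ediatric" would match
print(term_matches("pa*ediatric", "pediatric", regex=False))

# Term 2: "." matches any single character when regex is enabled
print(term_matches("1.0", "1+0", regex=True))
print(term_matches("1.0", "1+0", regex=False))
print(term_matches("1.0", "1.0", regex=False))
```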

## Search using regular expressions <a href="#regular-expressions" id="regular-expressions"></a>

In order to only search for content within specific fields and/or to include logical operators in the search expression, a special syntax is available in which these can be specified. Both the field specification and the search term support [regular expressions](search.md#regular-expressions).

Regular expressions (RegEx for short) define a language for representing patterns matching text, for example when searching. There are different types of RegEx languages. JabRef uses regular expressions as defined in Java. For extensive advanced information about Java's RegEx patterns, please have a look at the [Java documentation](https://docs.oracle.com/en/java/javase/16/docs/api/java.base/java/util/regex/Pattern.html) and at the [Java tutorial](https://docs.oracle.com/javase/tutorial/essential/regex/).
