initial commit
ganigeorgiev committed Jan 22, 2022
0 parents commit c85b34e
Showing 8 changed files with 975 additions and 0 deletions.
29 changes: 29 additions & 0 deletions LICENSE.md
@@ -0,0 +1,29 @@
BSD 3-Clause License

Copyright (c) 2022, Gani Georgiev
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
95 changes: 95 additions & 0 deletions README.md
@@ -0,0 +1,95 @@
fexpr
[![Go Report Card](https://goreportcard.com/badge/github.com/ganigeorgiev/fexpr)](https://goreportcard.com/report/github.com/ganigeorgiev/fexpr)
[![GoDoc](https://godoc.org/github.com/ganigeorgiev/fexpr?status.svg)](https://pkg.go.dev/github.com/ganigeorgiev/fexpr)
================================================================================

**fexpr** is a filter query language parser that generates an easy-to-work-with AST structure, so that you can safely build SQL, Elasticsearch, etc. queries from user input.

In other words, it transforms the string `"id > 1"` into the struct `[{&& {{identifier id} > {number 1}}}]`.

It supports parentheses and various conditional expression operators (see [Grammar](https://github.com/ganigeorgiev/fexpr#grammar)).


## Example usage

```
go get github.com/ganigeorgiev/fexpr
```

```go
package main

import (
	"fmt"

	"github.com/ganigeorgiev/fexpr"
)

func main() {
	result, err := fexpr.Parse("id=123 && status='active'")
	fmt.Println(result, err)
	// result: [{&& {{identifier id} = {number 123}}} {&& {{identifier status} = {text active}}}]
}
```

> Note that each parsed expression statement carries its own join/union operator (`&&` or `||`), so the result can be consumed in small chunks without relying on the group/nesting context.
> See the [package documentation](https://pkg.go.dev/github.com/ganigeorgiev/fexpr) for more details and examples.

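
As a rough illustration of consuming the result chunk by chunk, the sketch below walks a parsed tree and prints it back as a filter string. To stay self-contained it does not import the package: `Token`, `Expr` and `ExprGroup` here are local stand-ins that only mirror the shape of the fexpr types (plain strings replace the package's named types), and `operand` and `render` are hypothetical helpers, not part of fexpr.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Token, Expr and ExprGroup are local stand-ins that mirror the shape of
// the fexpr types; plain strings replace the package's own named types.
type Token struct {
	Type    string // "identifier", "number" or "text"
	Literal string
}

type Expr struct {
	Left  Token
	Op    string
	Right Token
}

type ExprGroup struct {
	Join string      // "&&" or "||"
	Item interface{} // Expr or []ExprGroup
}

// operand prints a token, re-quoting text literals.
func operand(t Token) string {
	if t.Type == "text" {
		return strconv.Quote(t.Literal)
	}
	return t.Literal
}

// render walks the groups in order; the join of the first group in each
// slice is skipped because nothing precedes it.
func render(groups []ExprGroup) string {
	parts := make([]string, 0, len(groups)*2)
	for i, g := range groups {
		if i > 0 {
			parts = append(parts, g.Join)
		}
		switch item := g.Item.(type) {
		case Expr:
			parts = append(parts, operand(item.Left)+" "+item.Op+" "+operand(item.Right))
		case []ExprGroup:
			parts = append(parts, "("+render(item)+")")
		}
	}
	return strings.Join(parts, " ")
}

func main() {
	// Hand-built equivalent of the AST for `(a=1 && 2!=3) || "b"=a`.
	ast := []ExprGroup{
		{Join: "&&", Item: []ExprGroup{
			{Join: "&&", Item: Expr{Token{"identifier", "a"}, "=", Token{"number", "1"}}},
			{Join: "&&", Item: Expr{Token{"number", "2"}, "!=", Token{"number", "3"}}},
		}},
		{Join: "||", Item: Expr{Token{"text", "b"}, "=", Token{"identifier", "a"}}},
	}
	fmt.Println(render(ast)) // (a = 1 && 2 != 3) || "b" = a
}
```

Because every group carries its own join, the loop only needs the current element and never has to look back at its neighbours; nesting is handled by the single recursive call.
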
## Grammar

**fexpr** grammar resembles the SQL `WHERE` expression syntax. It recognizes several token types (identifiers, numbers, quoted text, expression operators, whitespace, etc.).

> You can find all supported tokens in [`scanner.go`](https://github.com/ganigeorgiev/fexpr/blob/master/scanner.go).

#### Operators

- **`=`** Equal operator (e.g. `a=b`)
- **`!=`** NOT Equal operator (e.g. `a!=b`)
- **`>`** Greater than operator (e.g. `a>b`)
- **`>=`** Greater than or equal operator (e.g. `a>=b`)
- **`<`** Less than operator (e.g. `a<b`)
- **`<=`** Less than or equal operator (e.g. `a<=b`)
- **`~`** Like/Contains operator (e.g. `a~b`)
- **`!~`** NOT Like/Contains operator (e.g. `a!~b`)
- **`&&`** AND join operator (e.g. `a=b && c=d`)
- **`||`** OR join operator (e.g. `a=b || c=d`)
- **`()`** Parentheses (e.g. `(a=1 && b=2) || (a=3 && b=4)`)
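
One possible way to turn these operators into SQL is sketched below. Everything in it is an assumption made for illustration: the `sqlOps` table, the `condition` helper and the column allow-list are hypothetical and not part of fexpr, and the `?` placeholder style depends on the database driver.

```go
package main

import "fmt"

// sqlOps is a hypothetical mapping from fexpr sign operators to SQL ones.
var sqlOps = map[string]string{
	"=": "=", "!=": "<>", ">": ">", ">=": ">=",
	"<": "<", "<=": "<=", "~": "LIKE", "!~": "NOT LIKE",
}

// condition builds one parameterized SQL condition. The column must be in
// the allowed set, and the value is only ever passed as a bound argument.
func condition(allowed map[string]bool, column, op, value string) (string, []interface{}, error) {
	if !allowed[column] {
		return "", nil, fmt.Errorf("unknown column %q", column)
	}
	sqlOp, ok := sqlOps[op]
	if !ok {
		return "", nil, fmt.Errorf("unsupported operator %q", op)
	}
	if op == "~" || op == "!~" {
		value = "%" + value + "%" // contains-style match
	}
	return column + " " + sqlOp + " ?", []interface{}{value}, nil
}

func main() {
	allowed := map[string]bool{"title": true, "views": true}

	query, args, err := condition(allowed, "title", "~", "go")
	fmt.Println(query, args, err)

	// A column outside the allow-list is rejected instead of reaching the query.
	_, _, err = condition(allowed, "password", "=", "x")
	fmt.Println(err)
}
```

Keeping user-supplied values out of the query text and checking identifiers against an allow-list is what makes this kind of translation safe; the operator table alone is not enough.
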


#### Numbers

Number tokens are any integer or decimal numbers.
**Example**: `123`, `10.50`.


#### Identifiers

Identifier tokens are literals that start with a letter, `_`, `@` or `#`, followed by any number of letters, digits or `.` characters (the dot is usually used as a separator).
**Example**: `id`, `a.b.c`, `@request.method`, `field2`.
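
The rule above can be approximated with a regular expression, as in the sketch below. This is only a reading of the description in this section, not the scanner's actual implementation: `identifierRe` and `looksLikeIdentifier` are hypothetical names, letters are assumed to be ASCII, and `scanner.go` remains the authority on edge cases (for example whether `_` may appear after the first character).

```go
package main

import (
	"fmt"
	"regexp"
)

// identifierRe approximates the description of identifier tokens: a leading
// ASCII letter, `_`, `@` or `#`, then any number of letters, digits or dots.
var identifierRe = regexp.MustCompile(`^[A-Za-z_@#][A-Za-z0-9.]*$`)

// looksLikeIdentifier reports whether s matches the approximation above.
func looksLikeIdentifier(s string) bool {
	return identifierRe.MatchString(s)
}

func main() {
	for _, s := range []string{"id", "a.b.c", "@request.method", "field2", "1abc", "a b"} {
		fmt.Printf("%q %v\n", s, looksLikeIdentifier(s))
	}
}
```
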


#### Quoted text

Text tokens are any literals that are wrapped by `'` or `"` quotes.
**Example**: `'Lorem ipsum dolor 123!'`, `"escaped \"word\""`, `"mixed 'quotes' are fine"`.


## Using only the scanner

The tokenizer (a.k.a. `fexpr.Scanner`) can be used without the parser's state machine, so you can write your own custom token processing:

```go
s := fexpr.NewScanner(strings.NewReader("id > 123"))

// scan single token at a time until EOF or error is reached
for {
	t, err := s.Scan()
	if t.Type == fexpr.TokenEOF || err != nil {
		break
	}

	fmt.Println(t)
}

// Output:
// {identifier id}
// {whitespace }
// {sign >}
// {whitespace }
// {number 123}
```
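
A simple form of custom processing is dropping the whitespace tokens before doing anything else, which is also what the parser does. The sketch below assumes a local `Token` stand-in with the same two fields as the scanner's tokens and works on a hand-built slice instead of a live scanner; `significant` is a hypothetical helper name.

```go
package main

import "fmt"

// Token is a local stand-in that mirrors the shape of the scanner's tokens.
type Token struct {
	Type    string
	Literal string
}

// significant returns the tokens that are not whitespace, in order.
func significant(tokens []Token) []Token {
	out := make([]Token, 0, len(tokens))
	for _, t := range tokens {
		if t.Type == "whitespace" {
			continue
		}
		out = append(out, t)
	}
	return out
}

func main() {
	// The five tokens printed by the scanner example for "id > 123".
	scanned := []Token{
		{"identifier", "id"}, {"whitespace", " "}, {"sign", ">"},
		{"whitespace", " "}, {"number", "123"},
	}
	fmt.Println(significant(scanned)) // [{identifier id} {sign >} {number 123}]
}
```
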
40 changes: 40 additions & 0 deletions examples_test.go
@@ -0,0 +1,40 @@
package fexpr_test

import (
	"fexpr"
	"fmt"
	"strings"
)

func ExampleNewScanner() {
	fexpr.NewScanner(strings.NewReader("id"))
}

func ExampleScanner_Scan() {
	s := fexpr.NewScanner(strings.NewReader("id > 123"))

	for {
		t, err := s.Scan()
		if t.Type == fexpr.TokenEOF || err != nil {
			break
		}

		fmt.Println(t)
	}

	// Output:
	// {identifier id}
	// {whitespace }
	// {sign >}
	// {whitespace }
	// {number 123}
}

func ExampleParse() {
	result, _ := fexpr.Parse("id > 123")

	fmt.Println(result)

	// Output:
	// [{&& {{identifier id} > {number 123}}}]
}
3 changes: 3 additions & 0 deletions go.mod
@@ -0,0 +1,3 @@
module fexpr

go 1.18
116 changes: 116 additions & 0 deletions parser.go
@@ -0,0 +1,116 @@
package fexpr

import (
	"errors"
	"fmt"
	"strings"
)

// Expr represents an individual tokenized expression consisting
// of left operand, operator and a right operand.
type Expr struct {
	Left  Token
	Op    SignOp
	Right Token
}

// ExprGroup represents a wrapped expression and its join type.
//
// The group's Item could be either an `Expr` instance or `[]ExprGroup` slice (for nested expressions).
type ExprGroup struct {
	Join JoinOp
	Item interface{}
}

// parser's state machine steps
const (
	stepBeforeSign = iota
	stepSign
	stepAfterSign
	StepJoin
)

// Parse parses the provided text and returns its processed AST
// in the form of `ExprGroup` slice(s).
func Parse(text string) ([]ExprGroup, error) {
	result := []ExprGroup{}
	scanner := NewScanner(strings.NewReader(text))
	step := stepBeforeSign
	join := JoinAnd

	var expr Expr

	for {
		t, err := scanner.Scan()
		if err != nil {
			return nil, err
		}

		if t.Type == TokenEOF {
			break
		}

		if t.Type == TokenWS {
			continue
		}

		if t.Type == TokenGroup {
			groupResult, err := Parse(t.Literal)
			if err != nil {
				return nil, err
			}

			// append only if non-empty group
			if len(groupResult) > 0 {
				result = append(result, ExprGroup{Join: join, Item: groupResult})
			}

			step = StepJoin
			continue
		}

		switch step {
		case stepBeforeSign:
			if t.Type != TokenIdentifier && t.Type != TokenText && t.Type != TokenNumber {
				return nil, fmt.Errorf("Expected left operand (identifier, text or number), got %q (%s)", t.Literal, t.Type)
			}

			expr = Expr{Left: t}

			step = stepSign
		case stepSign:
			if t.Type != TokenSign {
				return nil, fmt.Errorf("Expected a sign operator, got %q (%s)", t.Literal, t.Type)
			}

			expr.Op = SignOp(t.Literal)
			step = stepAfterSign
		case stepAfterSign:
			if t.Type != TokenIdentifier && t.Type != TokenText && t.Type != TokenNumber {
				return nil, fmt.Errorf("Expected right operand (identifier, text or number), got %q (%s)", t.Literal, t.Type)
			}

			expr.Right = t
			result = append(result, ExprGroup{Join: join, Item: expr})

			step = StepJoin
		case StepJoin:
			if t.Type != TokenJoin {
				return nil, fmt.Errorf("Expected && or ||, got %q (%s)", t.Literal, t.Type)
			}

			join = JoinAnd
			if t.Literal == "||" {
				join = JoinOr
			}

			step = stepBeforeSign
		}
	}

	// the input must end right after a complete expression or group
	if step != StepJoin {
		return nil, errors.New("Invalid formatted filter expression.")
	}

	return result, nil
}
97 changes: 97 additions & 0 deletions parser_test.go
@@ -0,0 +1,97 @@
package fexpr

import (
	"fmt"
	"testing"
)

func TestParse(t *testing.T) {
	testScenarios := []struct {
		input         string
		expectedError bool
		expectedPrint string
	}{
		{`> 1`, true, "[]"},
		{`a >`, true, "[]"},
		{`a > >`, true, "[]"},
		{`a > %`, true, "[]"},
		{`a ! 1`, true, "[]"},
		{`a - 1`, true, "[]"},
		{`a + 1`, true, "[]"},
		{`> a 1`, true, "[]"},
		{`a || 1`, true, "[]"},
		{`a && 1`, true, "[]"},
		{`test > 1 &&`, true, `[]`},
		{`|| test = 1`, true, `[]`},
		{`test = 1 && ||`, true, "[]"},
		{`test = 1 && a`, true, "[]"},
		{`test = 1 && a`, true, "[]"},
		{`test = 1 && "a"`, true, "[]"},
		{`test = 1 a`, true, "[]"},
		{`test = 1 a`, true, "[]"},
		{`test = 1 "a"`, true, "[]"},
		{`test = 1@test`, true, "[]"},
		{`test = .@test`, true, "[]"},
		// mismatched text quotes
		{`test = "demo'`, true, "[]"},
		{`test = 'demo"`, true, "[]"},
		{`test = 'demo'"`, true, "[]"},
		{`test = 'demo''`, true, "[]"},
		{`test = "demo"'`, true, "[]"},
		{`test = "demo""`, true, "[]"},
		{`test = ""demo""`, true, "[]"},
		{`test = ''demo''`, true, "[]"},
		{"test = `demo`", true, "[]"},
		// valid simple expression and sign operators check
		{`1=12`, false, `[{&& {{number 1} = {number 12}}}]`},
		{` 1 = 12 `, false, `[{&& {{number 1} = {number 12}}}]`},
		{`"demo" != test`, false, `[{&& {{text demo} != {identifier test}}}]`},
		{`a~1`, false, `[{&& {{identifier a} ~ {number 1}}}]`},
		{`a !~ 1`, false, `[{&& {{identifier a} !~ {number 1}}}]`},
		{`test>12`, false, `[{&& {{identifier test} > {number 12}}}]`},
		{`test > 12`, false, `[{&& {{identifier test} > {number 12}}}]`},
		{`test >="test"`, false, `[{&& {{identifier test} >= {text test}}}]`},
		{`test<@demo.test2`, false, `[{&& {{identifier test} < {identifier @demo.test2}}}]`},
		{`1<="test"`, false, `[{&& {{number 1} <= {text test}}}]`},
		{`1<="te'st"`, false, `[{&& {{number 1} <= {text te'st}}}]`},
		{`demo='te\'st'`, false, `[{&& {{identifier demo} = {text te'st}}}]`},
		{`demo="te\'st"`, false, `[{&& {{identifier demo} = {text te\'st}}}]`},
		{`demo="te\"st"`, false, `[{&& {{identifier demo} = {text te"st}}}]`},
		// invalid parenthesis
		{`(a=1`, true, `[]`},
		{`a=1)`, true, `[]`},
		{`((a=1)`, true, `[]`},
		{`{a=1}`, true, `[]`},
		{`[a=1]`, true, `[]`},
		{`((a=1 || a=2) && c=1))`, true, `[]`},
		// valid parenthesis
		{`()`, true, `[]`},
		{`(a=1)`, false, `[{&& [{&& {{identifier a} = {number 1}}}]}]`},
		{`(a="test(")`, false, `[{&& [{&& {{identifier a} = {text test(}}}]}]`},
		{`(a="test)")`, false, `[{&& [{&& {{identifier a} = {text test)}}}]}]`},
		{`((a=1))`, false, `[{&& [{&& [{&& {{identifier a} = {number 1}}}]}]}]`},
		{`a=1 || 2!=3`, false, `[{&& {{identifier a} = {number 1}}} {|| {{number 2} != {number 3}}}]`},
		{`a=1 && 2!=3`, false, `[{&& {{identifier a} = {number 1}}} {&& {{number 2} != {number 3}}}]`},
		{`a=1 && 2!=3 || "b"=a`, false, `[{&& {{identifier a} = {number 1}}} {&& {{number 2} != {number 3}}} {|| {{text b} = {identifier a}}}]`},
		{`(a=1 && 2!=3) || "b"=a`, false, `[{&& [{&& {{identifier a} = {number 1}}} {&& {{number 2} != {number 3}}}]} {|| {{text b} = {identifier a}}}]`},
		{`((a=1 || a=2) && (c=1))`, false, `[{&& [{&& [{&& {{identifier a} = {number 1}}} {|| {{identifier a} = {number 2}}}]} {&& [{&& {{identifier c} = {number 1}}}]}]}]`},
	}

	for i, scenario := range testScenarios {
		v, err := Parse(scenario.input)

		if scenario.expectedError && err == nil {
			t.Errorf("(%d) Expected error, got nil (%q)", i, scenario.input)
		}

		if !scenario.expectedError && err != nil {
			t.Errorf("(%d) Did not expect error, got %q (%q).", i, err, scenario.input)
		}

		vPrint := fmt.Sprintf("%v", v)

		if vPrint != scenario.expectedPrint {
			t.Errorf("(%d) Expected %s, got %s", i, scenario.expectedPrint, vPrint)
		}
	}
}