Commit Graph

5 Commits

Author SHA1 Message Date
evazion
04551b8154 autocomplete: replace calls to PostQueryBuilder with PostQuery. 2022-03-30 02:12:25 -05:00
evazion
6edff247f2 search: replace calls to PostQueryBuilder#fast_count with PostQuery#fast_count.
Prepare a few more places for the new tag search parser.
2022-03-30 01:37:11 -05:00
evazion
fb0a7851bf BURs: fix mass update A -> B not being allowed.
Fix mass updates of the form `mass update A -> B` not being allowed.

This was originally because after `rename` was introduced, we wanted to
prevent people from using mass updates to move tags. Now, `mass update A -> B`
adds B to all posts tagged A instead of moving A to B. So `mass update A -> B`
should no longer be disallowed.

This also makes it so that it's an error to create a mass update with a
syntax error in the search. Before searches couldn't have syntax errors,
but now with the new query parser it's possible.
2022-03-29 21:31:28 -05:00
evazion
823fa5c6e9 search: switch to new tag search parser in a few places. 2022-03-29 18:21:47 -05:00
evazion
4c7cfc73c6 search: add new tag search parser.
Add a new tag tag search parser that supports full boolean expressions, including `and`,
`or`, and `not` operators and parenthesized subexpressions.

This is only the parser itself, not the code for converting the search into SQL. The new
parser isn't used yet for actual searches. Searches still use the old parser.

Some example syntax:

* `1girl 1boy`
* `1girl and 1boy` (same as `1girl 1boy`)
* `1girl or 1boy`
* `~1girl ~1boy` (same as `1girl or 1boy`)
* `1girl and ((blonde_hair blue_eyes) or (red_hair green_eyes))`
* `1girl ~(blonde_hair blue_eyes) ~(red_hair green_eyes)` (same as above)
* `1girl -(blonde_hair blue_eyes)`
* `*_hair *_eyes`
* `*_hair or *_eyes`
* `user:evazion or fav:evazion`
* `~user:evazion ~fav:evazion`

Rules:

AND is implicit between terms, but may be written explicitly:

* `a b c` is `a and b and c`

AND has higher precedence (binds tighter) than OR:

* `a or b and c or d` is `a or (b and c) or d`
* `a or b c or d e` is `a or (b and c) or (d and e)`

All `~` operators in the same subexpression are combined into a single OR:

* `a b ~c ~d` is `a b (c or d)`
* `~a ~b and ~c ~d` is `(a or b) (c or d)`
* `(~a ~b) (~c ~d)` is `(a or b) (c or d)`

A single `~` operator in a subexpression by itself is ignored:

* `a ~b` is `a b`
* `~a and ~b` is `a and b`, which is `a b`
* `(~a) ~b` is `a ~b`, which is `a b`

The parser is written as a backtracking recursive descent parser built on top of
StringScanner and a handful of parser combinators. The parser generates an AST, which is
then simplified using Boolean algebra to remove redundant nodes and to convert the
expression to conjunctive normal form (that is, a product of sums, or an AND of ORs).
2022-03-29 18:21:46 -05:00