Fix metatags not showing autocomplete results until after the first
letter was typed. For example, typing `filetype:` didn't show any
completions until another letter was typed. Now typing `filetype:` shows
all available file types.
This was because `filetype:` by itself wasn't recognized as a valid
search before, since metatags always required a value. Now it is a valid
search, so it's technically possible to search for `filetype:` by
itself. In this case the metatag value will be the empty string, which
will return no results because there are no posts where the filetype is
the empty string.
This sounds nonsensical, but it's potentially useful for metatags like
the `source:` metatag, where searching for posts with an empty source
does make sense. It was also technically possible before by searching
for `source:""`, so making the value optional doesn't change anything.
Fix queries like `(fate_(series) saber)` being parsed as `fate_(series` + `saber)`
instead of `fate_(series)` + `saber`.
This is pretty hacky. We assume that parentheses in tags are balanced.
So the rule is that trailing parentheses are part of the tag as long as
they're balanced, and not part of the tag if they're unbalanced.
This is necessary for the `commentary:` metatag, which has different
behavior depending on whether the metatag value is quoted. For example,
`commentary:translated` finds translated commentaries, while
`commentary:"translated"` finds commentaries containing the literal word
"translated".
Refactor the `PostQuery::AST#simplify` method to split it into three
methods: `#trim` to eliminate redundant AND and OR clauses, `#simplify`
to expand deeply nested subexpressions, and `#sort` to sort the query
into alphabetical order.
This is so we can normalize queries written by users by parsing and
rewriting them, but without expanding out nested subexpressions, which
can substantially alter the way the query is written.
Add a new tag tag search parser that supports full boolean expressions, including `and`,
`or`, and `not` operators and parenthesized subexpressions.
This is only the parser itself, not the code for converting the search into SQL. The new
parser isn't used yet for actual searches. Searches still use the old parser.
Some example syntax:
* `1girl 1boy`
* `1girl and 1boy` (same as `1girl 1boy`)
* `1girl or 1boy`
* `~1girl ~1boy` (same as `1girl or 1boy`)
* `1girl and ((blonde_hair blue_eyes) or (red_hair green_eyes))`
* `1girl ~(blonde_hair blue_eyes) ~(red_hair green_eyes)` (same as above)
* `1girl -(blonde_hair blue_eyes)`
* `*_hair *_eyes`
* `*_hair or *_eyes`
* `user:evazion or fav:evazion`
* `~user:evazion ~fav:evazion`
Rules:
AND is implicit between terms, but may be written explicitly:
* `a b c` is `a and b and c`
AND has higher precedence (binds tighter) than OR:
* `a or b and c or d` is `a or (b and c) or d`
* `a or b c or d e` is `a or (b and c) or (d and e)`
All `~` operators in the same subexpression are combined into a single OR:
* `a b ~c ~d` is `a b (c or d)`
* `~a ~b and ~c ~d` is `(a or b) (c or d)`
* `(~a ~b) (~c ~d)` is `(a or b) (c or d)`
A single `~` operator in a subexpression by itself is ignored:
* `a ~b` is `a b`
* `~a and ~b` is `a and b`, which is `a b`
* `(~a) ~b` is `a ~b`, which is `a b`
The parser is written as a backtracking recursive descent parser built on top of
StringScanner and a handful of parser combinators. The parser generates an AST, which is
then simplified using Boolean algebra to remove redundant nodes and to convert the
expression to conjunctive normal form (that is, a product of sums, or an AND of ORs).