This is necessary for the `commentary:` metatag, which has different
behavior depending on whether the metatag value is quoted. For example,
`commentary:translated` finds translated commentaries, while
`commentary:"translated"` finds commentaries containing the literal word
"translated".
Count the number of commented and noted posts directly, instead of
indirectly by counting the number of posts in a `commenter:<name>` or
`noteupdater:<name>` search. This is faster because it generates
better SQL.
Refactor the `PostQuery::AST#simplify` method to split it into three
methods: `#trim` to eliminate redundant AND and OR clauses, `#simplify`
to expand deeply nested subexpressions, and `#sort` to sort the query
into alphabetical order.
This is so we can normalize queries written by users by parsing and
rewriting them, but without expanding out nested subexpressions, which
can substantially alter the way the query is written.
Fix artist URLs still showing old cached site icons because the URL
didn't change when the file was updated. Use `image_pack_tag` so that
the filename includes the hash, so that the URL changes when the file
changes.
Stop the last remaining uses of the `artist_urls.normalized_url` column.
It's already no longer used by the artist finder. The only remaining
uses were by API users. Those users should use the `url` column instead.
Sometimes when a Newgrounds post is part of a set, there is a list of
links to other posts in the set in the artist's commentary. Exclude
these links because they're not really part of the commentary.
Example: https://www.newgrounds.com/art/view/boxofwant/annie-hughes-1 (NSFW)
Fix searches like this:
https://danbooru.donmai.us/artist_urls?search[url_not_regex]=https://www/.artstation/.com/[a-zA-Z0-9-.]
returning a HTTP 500 error with a blank page.
Here the problem is that Postgres raises an error because `[a-zA-Z0-9-.]`
is an invalid character class (it should be `[a-ZA-Z0-9.-]`). This error
happens on the error page itself, when rendering the <link rel="next"> /
<link rel="prev"> links in the <head>, so it causes the error page to fail.
Fix mass updates of the form `mass update A -> B` not being allowed.
This was originally because after `rename` was introduced, we wanted to
prevent people from using mass updates to move tags. Now, `mass update A -> B`
adds B to all posts tagged A instead of moving A to B. So `mass update A -> B`
should no longer be disallowed.
This also makes it so that it's an error to create a mass update with a
syntax error in the search. Before searches couldn't have syntax errors,
but now with the new query parser it's possible.
Fix the Tags field in the BUR search form not finding all BURs
mentioning that tag. Specifically, tags that were part of a mass update,
and that were prefixed with `~` or `-` (OR tags and NOT tags), weren't
indexed as tags affected by the BUR.
This requires re-running script/fixes/064_initialize_bulk_update_request_tags.rb
to fix old BURs.
Add a new tag tag search parser that supports full boolean expressions, including `and`,
`or`, and `not` operators and parenthesized subexpressions.
This is only the parser itself, not the code for converting the search into SQL. The new
parser isn't used yet for actual searches. Searches still use the old parser.
Some example syntax:
* `1girl 1boy`
* `1girl and 1boy` (same as `1girl 1boy`)
* `1girl or 1boy`
* `~1girl ~1boy` (same as `1girl or 1boy`)
* `1girl and ((blonde_hair blue_eyes) or (red_hair green_eyes))`
* `1girl ~(blonde_hair blue_eyes) ~(red_hair green_eyes)` (same as above)
* `1girl -(blonde_hair blue_eyes)`
* `*_hair *_eyes`
* `*_hair or *_eyes`
* `user:evazion or fav:evazion`
* `~user:evazion ~fav:evazion`
Rules:
AND is implicit between terms, but may be written explicitly:
* `a b c` is `a and b and c`
AND has higher precedence (binds tighter) than OR:
* `a or b and c or d` is `a or (b and c) or d`
* `a or b c or d e` is `a or (b and c) or (d and e)`
All `~` operators in the same subexpression are combined into a single OR:
* `a b ~c ~d` is `a b (c or d)`
* `~a ~b and ~c ~d` is `(a or b) (c or d)`
* `(~a ~b) (~c ~d)` is `(a or b) (c or d)`
A single `~` operator in a subexpression by itself is ignored:
* `a ~b` is `a b`
* `~a and ~b` is `a and b`, which is `a b`
* `(~a) ~b` is `a ~b`, which is `a b`
The parser is written as a backtracking recursive descent parser built on top of
StringScanner and a handful of parser combinators. The parser generates an AST, which is
then simplified using Boolean algebra to remove redundant nodes and to convert the
expression to conjunctive normal form (that is, a product of sums, or an AND of ORs).
Change mass updates to not automatically remove the left-hand side tags
from the post. This won't work with full boolean searches in the future
and already doesn't work with complex searches involving metatags or OR-tags.