`normalize_query` is used in certain places on the post index page where
we don't want to pay the cost of looking up tag aliases (namely inside
fast_count, in post_search_count_js, and in tag change notices). Don't
normalize aliases by default unless we need to.
Several fixes for the "This tag is under discussion" notice on the post
index page:
* Fix the notice appearing for BURs that aren't pending.
* Fix the notice never going away because the cache never expired.
* List all topics when a tag is involved in multiple BURs.
* Link to the forum post instead of the forum topic (fix #4421).
* Optimization: don't check for BURs when the search isn't a simple
single tag search.
* Add a `tags` field to the bulk update requests table for tracking all
tags involved in the request (excluding tags in mass updates that are
negated/optional/wildcards). Known issue: doesn't handle tag type
prefixes in mass updates correctly (e.g. `mass update foo -> artist:bar`
doesn't detect the tag `bar`).
* Allow searching the /bulk_update_requests page by tags.
We don't really need to cache the notice here, but we do it anyway to
reduce queries on the post index page.
Don't use pretty names (spaces instead of underscores) for pools in post
tooltips. This is for consistency with tags (which have underscores
here) and for easier copy & pasting.
Treat the following searches as literal text searches instead of as
special keywords:
* source:none
* commentary:true
* commentary:false
* commentary:translated
* commentary:untranslated
* Eliminate the `parse_query` method.
* Move all the metatag handling logic from the `build` method
to `metatag_matches` and helper methods.
This is to get all the main metatag handling logic in one place, inside
`metatag_matches`, so that it's easier to add new metatags and to handle
things like negated metatags more consistently.
Fix order:custom not working. Also change order:custom to return no
posts under the following error conditions:
* {{order:custom}} (id metatag isn't present)
* {{id:42 order:custom}} (id metatag isn't a list)
* {{id:>42 order:custom}} (id metatag isn't a list)
* {{id:1,2 id:2,3 order:custom}} (id metatag is present twice)
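The error conditions above can be sketched as a small validation step. This is an illustration only, not the actual PostQueryBuilder code: `custom_order_ids` is a hypothetical helper, the whitespace split stands in for the real parser, and the helper is assumed to be called only for searches that contain order:custom.

```ruby
# Hypothetical helper: returns the ids to order by, or nil when the
# search should return no posts.
def custom_order_ids(query)
  # Collect the value of every id: term in the search.
  id_values = query.split.map { |term| term[/\Aid:(.*)\z/, 1] }.compact

  # The id metatag must be present exactly once.
  return nil unless id_values.size == 1

  # The value must be a comma-separated list of at least two ids.
  return nil unless id_values.first.match?(/\A\d+(,\d+)+\z/)

  id_values.first.split(",").map(&:to_i)
end
```

A search like `id:3,1,2 order:custom` yields the ids in the order given, while each of the four cases listed above yields nil.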
Bug: If a Member had the hide_deleted_posts option turned on and did a
two tag search, no pages would show up.
Cause: The hide_deleted_posts option implicitly adds the -status:deleted
tag, but this tag wasn't considered a free metatag, so this caused
Post.fast_count to fail and return zero because the search was treated
as a three tag search.
ref: https://danbooru.donmai.us/forum_topics/16829
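A rough sketch of the counting problem, under stated assumptions: the two-tag limit, the contents of the free-term lists, and the helper names are all hypothetical and only illustrate how an implicitly added term can push a two tag search over the limit.

```ruby
# Assumption: at most two non-free tags are allowed per search.
TAG_LIMIT = 2

# Terms that do not count against the limit (illustrative lists).
FREE_TERMS_BEFORE = ["status:deleted", "rating:s"].freeze
FREE_TERMS_AFTER  = (FREE_TERMS_BEFORE + ["-status:deleted"]).freeze

# Tags that count against the limit after free terms are removed.
def counted_tags(query, free_terms)
  query.split.reject { |term| free_terms.include?(term) }
end

def within_limit?(query, free_terms)
  counted_tags(query, free_terms).size <= TAG_LIMIT
end
```

With hide_deleted_posts enabled, a search for `touhou 1girl` effectively becomes `touhou 1girl -status:deleted`, which counts as three tags under the first list and as two under the second.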
* Move various search parser helper methods (`has_metatag?`,
`is_single_tag?` et al) from PostSets and the Tag model to
PostQueryBuilder.
* Fix various minor bugs stemming from trying to check if a search query
contains certain metatags using regexes or other ad hoc techniques.
* Make scan_query, parse_query, normalize_query into instance methods
instead of class methods. This is to a) clean up the API and b)
prepare for moving certain tag utility methods into PostQueryBuilder.
* Fix a few cases where a caller used scan_query when they should have
used split_query or parse_tag_edit.
* Support negating the child: and embedded: metatags.
* Fix approver:<any|none>, disapproved:<reason>, commentary:<type> being
case sensitive.
* Fix child:garbage, locked:garbage, embedded:garbage returning all
posts instead of no posts.
* Fix not being able to use source:, locked:, or -id: twice in the same
search.
Fix a severe performance regression on the posts/index page introduced
by 6ca42947.
Short answer: scan_query dynamically allocated a regex inside an
inner loop that was called thousands of times per pageload.
Long answer:
* The post index page checks each post to see if it's tagged loli/shota.
* This triggers a call to Post#tag_array for every post.
* Post#tag_array called scan_query to split the tag string.
* scan_query loops over the tag string, checking if each tag matches the
regex /#{METATAGS.join("|")}:/.
* This regex uses string interpolation, which makes Ruby treat it as a
dynamic value rather than a static value. Ruby doesn't know the
interpolation is static here. This causes the regex to be reallocated
on every iteration of the loop, or in other words, for every tag in
the tag string.
* This caused us to do thousands of regex allocations per pageload. On
average, a posts/index pageload contains 20 posts with ~35 tags per
post, or 700+ total tags. Doing this many allocations killed performance.
The fix:
* Don't use scan_query for Post#tag_array. We don't have to fully parse
the tag_string here, we can use a simple split.
* Use the /o regex flag to tell Ruby to treat the regex as static and
only evaluate the interpolation once.
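Both fixes can be illustrated with a small standalone sketch. The metatag list is an illustrative subset, the `\A(?:...)` wrapping is added here for clarity, and `metatag_pattern` is a hypothetical stand-in for `METATAGS.join("|")` that counts how often the interpolation is evaluated.

```ruby
METATAGS = %w[id user rating status].freeze # illustrative subset

$interpolations = 0

# Stand-in for METATAGS.join("|") that records each evaluation.
def metatag_pattern
  $interpolations += 1
  METATAGS.join("|")
end

# Without /o, the interpolated expression is evaluated on every call.
def metatag_dynamic?(tag)
  tag.match?(/\A(?:#{metatag_pattern}):/)
end

# With /o, the interpolation is evaluated only the first time this
# literal is reached; later calls reuse the same regex.
def metatag_static?(tag)
  tag.match?(/\A(?:#{metatag_pattern}):/o)
end

# Splitting a tag string on whitespace needs no query parsing at all.
def tag_array(tag_string)
  tag_string.split
end
```

Calling `metatag_dynamic?` once per tag increments the counter once per tag, whereas `metatag_static?` increments it a single time no matter how many tags are checked.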
* Fix not being able to use the status: metatag twice in the same search.
* Fix status:active excluding banned posts.
* Fix status:garbage returning all posts.
Support exclusive ranges for numeric metatags. For example, `id:5...10`
is equivalent to `id:>=5 id:<10`. Useful for splitting searches into id
ranges without the endpoints overlapping: id:100...200, id:200...300,
id:300...400.
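The exclusive form maps naturally onto Ruby's own three-dot ranges. A minimal sketch, assuming integer values only; `parse_int_range` is a hypothetical helper, not the parser used by the site.

```ruby
# Hypothetical parser for the range forms of an integer metatag value.
# Returns nil for values that are not ranges.
def parse_int_range(value)
  case value
  when /\A(\d+)\.\.\.(\d+)\z/ then ($1.to_i...$2.to_i) # end excluded
  when /\A(\d+)\.\.(\d+)\z/   then ($1.to_i..$2.to_i)  # end included
  end
end
```

Because the end is excluded, consecutive chunks such as `100...200` and `200...300` share no id.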
Support using the same numeric-valued metatag twice in the same search.
Numeric-valued metatags are those taking an integer, float, filesize, or
date argument. Previously using the same metatag twice would cause the
second metatag to overwrite the first metatag.
Examples:
* "id:>5 id:<10"
* "width:>500 width:<1000"
* "date:>2019-01-01 date:<2020-01-01"
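The difference between overwriting and accumulating can be sketched as follows. All helper names are hypothetical, and the evaluator only understands the `>N`, `<N`, and exact integer forms.

```ruby
# Collect every condition per metatag instead of keeping only the last one.
def collect_conditions(query)
  conditions = Hash.new { |hash, key| hash[key] = [] }
  query.split.each do |term|
    name, value = term.split(":", 2)
    conditions[name] << value
  end
  conditions
end

# Hypothetical evaluator for a single integer condition.
def satisfies?(number, condition)
  case condition
  when /\A>(\d+)\z/ then number > $1.to_i
  when /\A<(\d+)\z/ then number < $1.to_i
  else number == condition.to_i
  end
end

# An id matches only if it satisfies every id: condition in the search.
def id_matches?(id, query)
  collect_conditions(query)["id"].all? { |condition| satisfies?(id, condition) }
end
```

If the second `id:` term replaced the first, `id:>5 id:<10` would behave like `id:<10` alone and match ids below 5 as well.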
Support using quoted values with all metatags. For example: user:"blah blah",
pool:"blah blah", commentary:"blah blah", etc. Things like rating:"safe",
id:"42" also work. Both single and double quotes are supported.
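A minimal sketch of scanning metatags with quoted values. The regex and `scan_metatags` are illustrative assumptions rather than the real scanner, and escaped quotes inside a value are not handled.

```ruby
# Metatag name, then a double-quoted, single-quoted, or bare value.
METATAG_TERM = /(\w+):(?:"([^"]*)"|'([^']*)'|(\S+))/

# Returns [name, value] pairs with the surrounding quotes removed.
def scan_metatags(query)
  query.scan(METATAG_TERM).map do |name, double, single, bare|
    [name, double || single || bare]
  end
end
```

Quoted and bare forms yield the same value, so `rating:"safe"` and `rating:safe` are treated alike.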
Also make the status: and rating: metatags fully free. Before only
status:deleted and rating:s were free.
Refactor so that all metatags that use parse_helper (which is most
integer-valued metatags) support the special "any" and "none" keywords.
This is mainly for consistency, since it's really only useful for using
width:none and height:none to find certain old unsupported filetypes
that have null width/height values.
Fold parse_helper_fudged into parse_helper. This is used to make
mpixels:N and filesize:N do an inexact match with a 5% error tolerance.
For example, filesize:1mb is really equivalent to filesize:0.95mb..1.05mb
because an exact filesize:1mb search is unlikely to match anything.
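The inexact match amounts to widening a single value into a range. A minimal sketch, assuming the value has already been converted to a plain number (bytes here); `fudged_range` is a hypothetical helper.

```ruby
# Widen an exact value into a range with the given relative tolerance.
def fudged_range(value, tolerance = 0.05)
  (value * (1 - tolerance))..(value * (1 + tolerance))
end
```

Assuming 1mb means 1024 * 1024 bytes, `fudged_range(1_048_576)` covers a file of 1,000,000 bytes but not one of 900,000 bytes.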
Bug: extremely long sources cause the Tags column to become extremely
wide. Caused by the source link not having the word-break property set.
Fix: use post_source_tag, which lets the `a[rel=external] { word-break: break-word; }`
rule take effect.
Example: https://danbooru.donmai.us/post_versions?search%5Bpost_id%5D=3809742
Bug: searching for "filetype:jpg -user:evazion" would negate the
filetype:jpg metatag too. This was because the -user:<name> metatag was
negating the entire cumulative relation instead of just the user:<name>
clause.
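The bug can be illustrated without a database by using plain predicates. This sketch assumes the buggy query had the shape "NOT (everything so far AND user clause)"; the struct, lambdas, and method names are hypothetical.

```ruby
PostRow = Struct.new(:filetype, :user)

FILETYPE_JPG = ->(post) { post.filetype == "jpg" }
USER_EVAZION = ->(post) { post.user == "evazion" }

# Buggy shape: the negation wraps the whole accumulated condition.
def buggy_match?(post)
  !(FILETYPE_JPG.call(post) && USER_EVAZION.call(post))
end

# Fixed shape: only the user clause is negated.
def fixed_match?(post)
  FILETYPE_JPG.call(post) && !USER_EVAZION.call(post)
end
```

A png post by another user passes the buggy check even though the search asked for filetype:jpg, while the fixed check rejects it.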
Add:
* commentary:true (posts with commentary)
* commentary:false (posts without commentary)
* commentary:translated (posts with translated commentary)
* commentary:untranslated (posts with untranslated commentary)
* commentary:"text" (posts where any commentary field matches "text")
Known issues:
* There's no way to escape the true, false, translated, or
untranslated keywords to do a literal text search for commentaries
containing one of these keywords.
* Negated searches may be slow. Using a left outer join instead of a
subquery would be faster in most cases, but negating it is harder.