* Rename the `#negate` and `#and` methods that we monkey patch into
ActiveRecord::Relation. These methods are now defined in Rails 6.1, but
they shadow our methods and have slightly different behavior.
* Fix a call to `invert`. It no longer accepts an argument.
* Fix #4552: Multiple quoted search terms not parsed correctly.
* Allow quotes to be escaped in quoted metatags.
* Allow spaces to be escaped in unquoted metatags.
* Allow the empty string to be used in metatags.
Examples:
* `source:""` and `source:''` (same as `source:none`)
* `source:foo\ bar\ baz` (same as `source:"foo bar baz"`)
* `source:"don't say \"lazy\""` (use \" to write a literal ")
* `source:'don\'t say "lazy"'` (use \' to write a literal ')
* `source:"C:\\Windows"` (use \\ to write a literal \)
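The escaping rules in the examples above can be sketched with Ruby's `StringScanner`. This is only an illustration of the rules, not the actual parser; the method name `parse_metatag_value` is hypothetical, and it only handles the value part of a metatag (the text after `source:`).

```ruby
require "strscan"

# Hypothetical helper (not the real parser): reads a metatag value and
# returns it with quotes removed and backslash escapes resolved.
def parse_metatag_value(text)
  scanner = StringScanner.new(text)

  if (quote = scanner.scan(/["']/))
    # Quoted value: read up to the matching unescaped quote.
    body = scanner.scan(/(?:\\.|[^\\#{quote}])*/)
    scanner.skip(/#{quote}/)
  else
    # Unquoted value: read up to the first unescaped whitespace.
    body = scanner.scan(/(?:\\.|[^\\\s])*/)
  end

  # Drop the backslash from each escape sequence (\" \' \\ and "\ ").
  body.gsub(/\\(.)/) { $1 }
end
```

With this sketch, `source:""` yields the empty string, and `foo\ bar\ baz` yields the same value as `"foo bar baz"`.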
Replace references to the `is_resolved` field with the `status` field.
Post flags were marked as resolved when a post was approved (but not
when the post was deleted because it went unapproved). The status field
supersedes the resolved field.
* Include appealed posts in the modqueue.
* Add `status` field to appeals. Appeals start out as `pending`, then
become `rejected` if the post isn't approved within three days. If the
post is approved, the appeal's status becomes `succeeded`.
* Add `status` field to flags. Flags start out as `pending`, then become
`rejected` if the post is approved within three days. If the post
isn't approved, the flag's status becomes `succeeded`.
* Leave behind an "Unapproved in three days" dummy flag when an appeal
goes unapproved, just like when a pending post is unapproved.
* Only allow deleted posts to be appealed. Don't allow flagged posts to be appealed.
* Add `status:appealed` metatag. `status:appealed` is separate from `status:pending`.
* Include appealed posts in `status:modqueue`. Search `status:modqueue order:modqueue`
to view the modqueue as a normal search.
* Retroactively set old flags and appeals as succeeded or rejected. This
may not be correct for posts that were appealed or flagged multiple
times. This is difficult to set correctly because we don't have
approval records for old posts, so we can't tell the actual outcome of
old flags and appeals.
* Deprecate the `is_resolved` field on post flags. A resolved flag is a
flag that isn't pending.
* Known bug: appealed posts have a black border instead of a blue
border. Checking whether a post has been appealed would require either
an extra query on the posts/index page, or an is_appealed flag on
posts, neither of which is very desirable.
* Known bug: you can't use `status:appealed` in blacklists, for the same
reason as above.
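The status transitions described above for flags and appeals mirror each other, which can be sketched as follows. This is a minimal illustration, not the actual model code; the method name `resolved_status` and the `:flag` / `:appeal` symbols are hypothetical.

```ruby
# Hypothetical helper: the final status of a flag or appeal once the
# three-day window ends, given whether the post was approved.
def resolved_status(kind, post_approved:)
  case kind
  when :appeal then post_approved ? "succeeded" : "rejected"
  when :flag   then post_approved ? "rejected" : "succeeded"
  else raise ArgumentError, "unknown kind: #{kind.inspect}"
  end
end
```

Both kinds start out as `pending`; an approval makes an appeal succeed and a flag fail, and the reverse holds when the post goes unapproved.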
Fixes bug described in d3e4ac7c17 (commitcomment-39049351)
When dealing with searches, there are several variables we have to keep
in mind:
* Whether tag aliases should be applied.
* Whether search terms should be sorted.
* Whether the rating:s and -status:deleted metatags should be added by
safe mode and the hide deleted posts setting.
Which of these things we need to do depends on the context:
* We want to apply aliases when actually doing the search, calculating
the count, looking up the wiki excerpt, recording missed/popular
searches in Reportbooru, and calculating related tags for the sidebar,
but not when displaying the raw search as typed by the user (for
example, in the page title or in the tag search box).
* We want to sort the search when calculating cache keys for fast_count
or related tags, and when recording missed/popular searches, but not
in the page title or when displaying the raw search.
* We want to add rating:s and -status:deleted when performing the
search, calculating the count, or recording missed/popular searches,
but not when calculating related tags for the sidebar, or when
displaying the page title or raw search.
Here we introduce normalized_query and try to use it in contexts where
query normalization is necessary. When to use the normalized query
versus the raw unnormalized query is still subtle and prone to error.
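As a rough sketch of the three variables listed above, each one can be thought of as an independent switch on a normalization step. The method name `normalize_search`, its keyword arguments, and the alias table passed in are all hypothetical, and the real code works on parsed terms rather than whitespace-split strings.

```ruby
# Hypothetical sketch: each context enables only the steps it needs.
def normalize_search(query, aliases: {}, sort: false, safe_mode: false, hide_deleted: false)
  terms = query.split
  terms = terms.map { |term| aliases.fetch(term, term) } # apply tag aliases
  terms << "rating:s" if safe_mode                       # added by safe mode
  terms << "-status:deleted" if hide_deleted             # added by the hide deleted posts setting
  terms = terms.uniq.sort if sort                        # stable order for cache keys
  terms.join(" ")
end

# Performing the search: aliases and implicit metatags, no sorting needed.
normalize_search("kitty dog", aliases: { "kitty" => "cat" }, safe_mode: true)

# Page title: show the raw search exactly as typed.
normalize_search("kitty dog")
```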
Some searches, such as searches for private favorites or for the
status:unmoderated tag, return different results for different users.
These searches need to have their counts cached separately for each user
so that we don't return incorrect page counts when two different users
perform the same search.
This can also potentially leak private information, such as the number
of posts flagged, downvoted, or disapproved by a given user.
Partial fix for #4280.
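One way to picture the per-user caching is a cache key that includes the user only when the search is user-dependent. This is a sketch under assumptions: the method name `count_cache_key`, the `count:` prefix, and the `user_dependent` flag are hypothetical, and deciding which searches are user-dependent is left to the caller.

```ruby
# Hypothetical sketch: searches whose results depend on the user get a
# separate cache entry per user; all other searches share one entry.
def count_cache_key(normalized_query, user_id:, user_dependent:)
  if user_dependent
    "count:#{user_id}:#{normalized_query}"
  else
    "count:#{normalized_query}"
  end
end
```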
* Refactor fast_count to return nil instead of 1,000,000 if the exact count times out.
* Remove the estimate_post_counts and blank_tag_search_fast_count global config options.
* Replace the hardcoded post count estimates inside fast_count with a
method that parses Postgres's estimated row count from EXPLAIN.
* /counts/posts.json:
** Remove the `raise_on_timeout` parameter.
** Add an `estimate_count=<true|false>` parameter.
** Return null instead of 1,000,000 if the exact count times out.
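For reference, Postgres prints its row estimate in the plan line of `EXPLAIN` output as `rows=N`, so the estimate can be pulled out with a regex. The following is a minimal sketch, not the actual implementation; the method name `estimated_row_count` and the sample plan line are made up for illustration.

```ruby
# Hypothetical helper: extracts the planner's row estimate from the first
# `rows=N` in EXPLAIN output. Returns nil if no estimate is found.
def estimated_row_count(explain_output)
  match = explain_output.match(/rows=(\d+)/)
  match && match[1].to_i
end

# Sample plan line in the format Postgres uses for EXPLAIN.
plan = "Seq Scan on posts  (cost=0.00..458.00 rows=10000 width=244)"
estimated_row_count(plan)
```

Returning `nil` when nothing matches is consistent with the change above of returning nil rather than a placeholder number when no count is available.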
Change PostQueryBuilder to add rating:s and -status:deleted to the
search inside the constructor instead of inside `#build` and
`#fast_count`. This lets us clean up `#fast_count` so it doesn't have to
reparse the query after adding these tags. Reparsing caused aliases to
be evaluated more than once on the post index page.
Make PostQueryBuilder apply aliases earlier, immediately after parsing
the search.
On the post index page there are multiple places where we need to apply
aliases:
* When running the search with PostQueryBuilder#build.
* When calculating the search count with PostQueryBuilder#fast_count.
* When calculating the related tags for the sidebar.
* When tracking missed searches and popular searches for Reportbooru.
* When looking up wiki excerpts.
Applying aliases after parsing ensures we only have to apply aliases
once for all of these things.
We also normalize the order of tags in searches and strip repeated tags.
This is so that we have consistent cache keys for fast_count.
* Fixes searches for aliased tags being counted as missed searches (fixes #4433).
* Fixes wiki excerpts not showing up when searching for aliased tags.
When doing a tag search, we have to be careful about which user we're
running the search as because the results depend on the current user.
Specifically, things like private favorites, private favorite groups,
post votes, saved searches, and flagger names depend on the user's
permissions, and whether non-safe or deleted posts are filtered out
depends on whether the user has safe mode on or the hide deleted posts
setting enabled.
* Refactor internal searches to explicitly state whether they're
running as the system user (DanbooruBot) or as the current user.
* Explicitly pass in the current user to PostQueryBuilder instead of
implicitly relying on the CurrentUser global.
* Get rid of CurrentUser.admin_mode? (used to ignore the hide deleted
post setting) and CurrentUser.without_safe_mode (used to ignore safe
mode).
* Change the /counts/posts.json endpoint to ignore safe mode and the
hide deleted posts settings when counting posts.
* Fix searches not correctly overriding the hide deleted posts setting
when multiple status: metatags were used (e.g. `status:banned status:active`).
* Fix fast_count not respecting the hide deleted posts setting when the
status:banned metatag was used.
On the post index page, show the wiki excerpt if the search includes a
single tag, even if the tag is negated or the search includes other
metatags.
If the search includes a single pool: or ordpool: metatag, show the pool
excerpt even if the search includes other metatags.
* Add unaliased:<tag> metatag. This allows you to search for a tag
without applying aliases. This is mainly useful for debugging purposes
and for searching for large tags that are in the process of being
aliased but haven't had all their posts moved yet.
* Remove the "raw" url param from the posts index page. The "raw" param
also caused the search to ignore aliases, but it was undocumented and
exploitable. It was possible to use the raw param to view private
favorites since favorites are treated like a hidden tag.
Forgot to account for negated metatags in normalize_query after e987f070.
Fixes a bug where wrong page counts were displayed for searches
involving negated metatags due to incorrect query normalization.
Fix not being able to negate the following metatags:
* id (didn't support ranges)
* md5
* width
* height
* mpixels
* ratio
* score
* favcount
* filesize
* date
* age
* tagcount
* pixiv
Bug: in some cases searching for multiple metatags would cause one
metatag to be ignored. For example, a search for {{user:1 pool:2}} would
be treated as a search for {{pool:2}}.
Cause: we used `ActiveRecord::Relation#merge` to combine two relations,
which was wrong because `merge` doesn't combine `column IN (?)` clauses
correctly. If there are two `column IN (?)` clauses on the same column,
then `#merge` takes only the second clause and ignores the first.
Fix: write our own half-baked `#and` method to work around Rails'
broken-by-design `#merge` method.
ref: https://github.com/rails/rails/issues/33501.
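The difference between the two behaviors can be illustrated with a toy model in plain Ruby. This is not ActiveRecord code: here a "relation" is just a hash from a column name to the list of allowed values, and both method names are hypothetical.

```ruby
# Toy model of the bug: when both sides constrain the same column,
# the second clause replaces the first.
def last_clause_wins(a, b)
  a.merge(b)
end

# Toy model of the fix: keep both constraints by intersecting them.
def and_clauses(a, b)
  a.merge(b) { |_column, left, right| left & right }
end

by_user = { "id" => [1, 2, 3] } # stand-in for a `column IN (?)` clause
in_pool = { "id" => [2, 3, 4] } # a second clause on the same column

last_clause_wins(by_user, in_pool) # only the pool constraint survives
and_clauses(by_user, in_pool)      # both constraints apply
```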
`normalize_query` is used in certain places on the post index page where
we don't want to pay the cost of looking up tag aliases (namely inside
fast_count, in post_search_count_js, and in tag change notices). Don't
normalize aliases by default unless we need to.
Treat the following searches as literal text searches instead of as
special keywords:
* source:none
* commentary:true
* commentary:false
* commentary:translated
* commentary:untranslated
* Eliminate the `parse_query` method.
* Move all the metatag handling logic from the `build` method
to `metatag_matches` and helper methods.
This is to get all the main metatag handling logic in one place, inside
`metatag_matches`, so that it's easier to add new metatags and to handle
things like negated metatags more consistently.
Fix order:custom not working. Also change order:custom to return no
posts under the following error conditions:
* {{order:custom}} (id metatag isn't present)
* {{id:42 order:custom}} (id metatag isn't a list)
* {{id:>42 order:custom}} (id metatag isn't a list)
* {{id:1,2 id:2,3 order:custom}} (id metatag is present twice)
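The error conditions above amount to requiring exactly one id metatag whose value is a comma-separated list. A minimal sketch of that check, assuming a hypothetical `custom_order_ids` helper that receives the values of all id metatags in the search and signals "no posts" with an empty list:

```ruby
# Hypothetical helper: returns the ids in the order they should be
# displayed, or [] when order:custom can't be applied.
def custom_order_ids(id_metatag_values)
  return [] unless id_metatag_values.size == 1   # missing or present twice

  value = id_metatag_values.first
  return [] unless value.match?(/\A\d+(,\d+)+\z/) # must be a list like 3,1,2

  value.split(",").map(&:to_i)
end
```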
* Move various search parser helper methods (`has_metatag?`,
`is_single_tag?` et al) from PostSets and the Tag model to
PostQueryBuilder.
* Fix various minor bugs stemming from trying to check if a search query
contains certain metatags using regexes or other adhoc techniques.
* Make scan_query, parse_query, normalize_query into instance methods
instead of class methods. This is to a) clean up the API and b)
prepare for moving certain tag utility methods into PostQueryBuilder.
* Fix a few cases where a caller used scan_query when they should have
used split_query or parse_tag_edit.
* Support negating the child: and embedded: metatags.
* Fix approver:<any|none>, disapproved:<reason>, commentary:<type> being
case sensitive.
* Fix child:garbage, locked:garbage, embedded:garbage returning all
posts instead of no posts.
* Fix not being able to use source:, locked:, or -id: twice in the same
search.
Fix a severe performance regression on the posts/index page introduced
by 6ca42947.
Short answer: scan_query dynamically allocated a regex inside an
inner loop that was called thousands of times per pageload.
Long answer:
* The post index page checks each post to see if it's tagged loli/shota.
* This triggers a call to Post#tag_array for every post.
* Post#tag_array called scan_query to split the tag string.
* scan_query loops over the tag string, checking if each tag matches the
regex /#{METATAGS.join("|")}:/.
* This regex uses string interpolation, which makes Ruby treat it as a
dynamic value rather than a static value. Ruby doesn't know the
interpolation is static here. This causes the regex to be reallocated
on every iteration of the loop, or in other words, for every tag in
the tag string.
* This caused us to do thousands of regex allocations per pageload. On
average, a posts/index pageload contains 20 posts with ~35 tags per
post, or 7000+ total tags. Doing this many allocations killed performance.
The fix:
* Don't use scan_query for Post#tag_array. We don't have to fully parse
the tag_string here, we can use a simple split.
* Use the /o regex flag to tell Ruby to treat the regex as static and
only evaluate the interpolation once.
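The effect of the /o flag can be observed directly by counting how often the interpolation runs. This is a standalone demonstration rather than the actual scan_query code; the `metatags` list and the counters are made up for illustration.

```ruby
metatags = %w[user pool source]
tags = %w[user:1 touhou pool:2 solo]

dynamic_builds = 0 # times the interpolation runs without /o
static_builds = 0  # times the interpolation runs with /o

tags.each do |tag|
  # Without /o the interpolation is re-evaluated on every iteration.
  tag.match?(/\A(?:#{dynamic_builds += 1; metatags.join("|")}):/)

  # With /o the interpolation is evaluated once and the regex is reused.
  tag.match?(/\A(?:#{static_builds += 1; metatags.join("|")}):/o)
end

puts "without /o: #{dynamic_builds}, with /o: #{static_builds}"
```

After the loop, `dynamic_builds` equals the number of tags, while `static_builds` is 1.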
* Fix not being able to use the status: metatag twice in the same search.
* Fix status:active excluding banned posts.
* Fix status:garbage returning all posts.
Support exclusive ranges for numeric metatags. For example, `id:5...10`
is equivalent to `id:>=5 id:<10`. Useful for splitting searches into id
ranges without the endpoints overlapping: id:100...200, id:200...300,
id:300...400.
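Ruby's own range syntax has the same inclusive/exclusive distinction, which makes for a compact sketch. The method name `parse_numeric_range` is hypothetical and only handles the two range forms; it is not the actual parser.

```ruby
# Hypothetical helper: turns "5...10" into an exclusive Range and
# "5..10" into an inclusive Range. Returns nil for anything else.
def parse_numeric_range(value)
  case value
  when /\A(\d+)\.\.\.(\d+)\z/ then ($1.to_i...$2.to_i)
  when /\A(\d+)\.\.(\d+)\z/   then ($1.to_i..$2.to_i)
  end
end

# Adjacent exclusive ranges share an endpoint in the text but not in the ids.
chunks = %w[100...200 200...300 300...400].map { |v| parse_numeric_range(v) }
```

Here each id from 100 through 399 falls in exactly one chunk, which is the property that makes exclusive ranges convenient for splitting a search.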
Support using the same numeric-valued metatag twice in the same search.
Numeric-valued metatags are those taking an integer, float, filesize, or
date argument. Previously, using the same metatag twice would cause the
second metatag to overwrite the first metatag.
Examples:
* "id:>5 id:<10"
* "width:>500 width:<1000"
* "date:>2019-01-01 date:<2020-01-01"
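One plausible way such an overwrite happens is storing parsed metatags in a hash keyed by name; this is an assumption for illustration, not a description of the old code. Collecting the values per name keeps both constraints:

```ruby
# Parsed search for "id:>5 id:<10" as [name, value] pairs (illustrative).
terms = [["id", ">5"], ["id", "<10"]]

# Keyed by name with one value per key: the second value replaces the first.
overwritten = terms.to_h

# Keyed by name with a list of values: both constraints are kept.
collected = terms.group_by(&:first).transform_values { |pairs| pairs.map(&:last) }
```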
Support using quoted values with all metatags. For example: user:"blah blah",
pool:"blah blah", commentary:"blah blah", etc. Things like rating:"safe",
id:"42" also work. Both single and double quotes are supported.
Also make the status: and rating: metatags fully free. Before only
status:deleted and rating:s were free.
Refactor so that all metatags that use parse_helper (which is most
integer-valued metatags) support the special "any" and "none" keywords.
This is mainly for consistency, since it's really only useful for using
width:none and height:none to find certain old unsupported filetypes
that have null width/height values.
Fold parse_helper_fudged into parse_helper. This is used to make
mpixels:N and filesize:N do an inexact match with a 5% error tolerance.
For example, filesize:1mb is really equivalent to filesize:0.95mb..1.05mb
because an exact filesize:1mb search is unlikely to match anything.
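The 5% tolerance can be sketched as a range around the requested value. The method name `fudged_range` and its `tolerance` keyword are hypothetical, and treating 1mb as 1,048,576 bytes is an assumption made for the example.

```ruby
# Hypothetical helper: expands an exact value into a range with the
# given relative tolerance on each side.
def fudged_range(value, tolerance: 0.05)
  (value * (1 - tolerance))..(value * (1 + tolerance))
end

one_mb = 1_048_576 # assumption: 1mb means 1024 * 1024 bytes
range = fudged_range(one_mb)

range.cover?(1_000_000) # within 5% of 1mb, so it matches
range.cover?(900_000)   # more than 5% below 1mb, so it doesn't
```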