* Move various search parser helper methods (`has_metatag?`,
`is_single_tag?` et al) from PostSets and the Tag model to
PostQueryBuilder.
* Fix various minor bugs stemming from trying to check if a search query
contains certain metatags using regexes or other ad hoc techniques.
* Make scan_query, parse_query, and normalize_query instance methods
instead of class methods. This is to a) clean up the API and b)
prepare for moving certain tag utility methods into PostQueryBuilder.
* Fix a few cases where a caller used scan_query when they should have
used split_query or parse_tag_edit.
* Support negating the child: and embedded: metatags.
* Fix approver:<any|none>, disapproved:<reason>, commentary:<type> being
case sensitive.
* Fix child:garbage, locked:garbage, embedded:garbage returning all
posts instead of no posts.
* Fix not being able to use source:, locked:, or -id: twice in the same
search.
Fix a severe performance regression on the posts/index page introduced
by 6ca42947.
Short answer: scan_query dynamically allocated a regex inside an
inner loop that was called thousands of times per pageload.
Long answer:
* The post index page checks each post to see if it's tagged loli/shota.
* This triggers a call to Post#tag_array for every post.
* Post#tag_array called scan_query to split the tag string.
* scan_query loops over the tag string, checking if each tag matches the
regex /#{METATAGS.join("|")}:/.
* This regex uses string interpolation, which makes Ruby treat it as a
dynamic value rather than a static value. Ruby doesn't know the
interpolation is static here. This causes the regex to be reallocated
on every iteration of the loop, or in other words, for every tag in
the tag string.
* This caused us to do thousands of regex allocations per pageload. On
average, a posts/index pageload contains 20 posts with ~35 tags per
post, or 700+ total tags (20 x 35). Doing this many allocations killed
performance.
The fix:
* Don't use scan_query for Post#tag_array. We don't have to fully parse
the tag_string here, we can use a simple split.
* Use the /o regex flag to tell Ruby to treat the regex as static and
only evaluate the interpolation once.
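A minimal sketch of the difference, assuming a small made-up METATAGS
list and hypothetical method names (this is not the real
PostQueryBuilder code). Without /o, an interpolated regex literal builds
a new Regexp object each time it is evaluated; with /o, the interpolation
runs once and the same object is reused.

```ruby
# Hypothetical subset of metatag names, for illustration only.
METATAGS = %w[id user rating status].freeze

# Interpolated literal: a new Regexp is allocated on every call.
def dynamic_metatag_regex
  /\A(?:#{METATAGS.join("|")}):/
end

# Same literal with /o: the interpolation is evaluated once and cached.
def cached_metatag_regex
  /\A(?:#{METATAGS.join("|")}):/o
end

# Splitting a tag string on whitespace needs no metatag parsing at all.
def tag_array(tag_string)
  tag_string.split
end
```

Comparing object identity shows the effect:
`dynamic_metatag_regex.equal?(dynamic_metatag_regex)` is false, while
`cached_metatag_regex.equal?(cached_metatag_regex)` is true.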
* Fix not being able to use the status: metatag twice in the same search.
* Fix status:active excluding banned posts.
* Fix status:garbage returning all posts.
Support exclusive ranges for numeric metatags. For example, `id:5...10`
is equivalent to `id:>=5 id:<10`. Useful for splitting searches into id
ranges without the endpoints overlapping: id:100...200, id:200...300,
id:300...400.
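As a rough illustration, the two range forms map onto Ruby's own
inclusive (`..`) and exclusive (`...`) ranges. The `parse_id_range`
helper below is hypothetical and only handles plain integers; it is not
the actual parser.

```ruby
# Hypothetical helper: turn "5...10" or "5..10" into a Ruby Range.
def parse_id_range(value)
  case value
  when /\A(\d+)\.\.\.(\d+)\z/ then ($1.to_i...$2.to_i) # end excluded
  when /\A(\d+)\.\.(\d+)\z/   then ($1.to_i..$2.to_i)  # end included
  else raise ArgumentError, "not a range: #{value}"
  end
end

# Adjacent exclusive ranges share an endpoint but never overlap.
chunks = %w[100...200 200...300 300...400].map { |v| parse_id_range(v) }
```

Here id 200 falls in exactly one chunk (the second), which is what makes
exclusive ranges convenient for splitting a search.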
Support using the same numeric-valued metatag twice in the same search.
Numeric-valued metatags are those taking an integer, float, filesize, or
date argument. Previously using the same metatag twice would cause the
second metatag to overwrite the first metatag.
Examples:
* "id:>5 id:<10"
* "width:>500 width:<1000"
* "date:>2019-01-01 date:<2020-01-01"
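The idea can be sketched as collecting every condition for a metatag
into a list and requiring all of them to hold, instead of storing one
value per metatag name. The helpers below are hypothetical, handle only
integer comparisons, and are not the real query builder.

```ruby
# Hypothetical parser: collect every comparison for one metatag
# instead of keeping only the last one.
def parse_comparisons(query, name)
  pattern = /\A#{Regexp.escape(name)}:(>=|<=|>|<)(\d+)\z/
  query.split.filter_map do |term|
    match = pattern.match(term)
    [match[1].to_sym, match[2].to_i] if match
  end
end

# A value matches only if it satisfies all collected comparisons.
def satisfies_all?(value, comparisons)
  comparisons.all? { |operator, bound| value.public_send(operator, bound) }
end
```

With "id:>5 id:<10" both comparisons are kept, so 7 matches and 12 does
not.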
Support using quoted values with all metatags. For example: user:"blah blah",
pool:"blah blah", commentary:"blah blah", etc. Things like rating:"safe",
id:"42" also work. Both single and double quotes are supported.
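A simplified sketch of how quoted values could be tokenized, assuming
hypothetical `split_terms` and `metatag_value` helpers. It does not
handle escaped quotes inside a value and is not the real scanner.

```ruby
# Hypothetical tokenizer: keep name:"..." and name:'...' together,
# otherwise split on whitespace.
def split_terms(query)
  query.scan(/-?\w+:"[^"]*"|-?\w+:'[^']*'|\S+/)
end

# Split a term into [name, value], stripping one pair of matching quotes.
# Plain tags without a colon return a nil value.
def metatag_value(term)
  name, value = term.split(":", 2)
  value = value[1..-2] if value&.match?(/\A(["']).*\1\z/m)
  [name, value]
end
```

For example, `split_terms` keeps `user:"blah blah"` as one term, and
`metatag_value` then yields the name `user` with the value `blah blah`.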
Also make the status: and rating: metatags fully free, meaning they
don't count against the search tag limit. Before only status:deleted
and rating:s were free.
Refactor so that all metatags that use parse_helper (which is most
integer-valued metatags) support the special "any" and "none" keywords.
This is mainly for consistency, since it's really only useful for using
width:none and height:none to find certain old unsupported filetypes
that have null width/height values.
Fold parse_helper_fudged into parse_helper. This is used to make
mpixels:N and filesize:N do an inexact match with a 5% error tolerance.
For example, filesize:1mb is really equivalent to filesize:0.95mb..1.05mb
because an exact filesize:1mb search is unlikely to match anything.
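The 5% tolerance amounts to widening an exact value into a range. A
minimal sketch, assuming a hypothetical `fudged_range` helper and
assuming 1mb means 1024 * 1024 bytes (the real unit handling may
differ):

```ruby
# Hypothetical helper: widen an exact value into a +/-5% range.
def fudged_range(value, tolerance = 0.05)
  (value * (1 - tolerance))..(value * (1 + tolerance))
end

# Assumption: 1mb is 1,048,576 bytes.
one_mb = 1024 * 1024
filesize_range = fudged_range(one_mb)
```

A file of exactly `one_mb` bytes falls inside `filesize_range`, as does
one a few percent larger or smaller, while a file 10% smaller does not.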
Bug: extremely long sources cause the Tags column to become extremely
wide. Caused by the source link not having the word-break property set.
Fix: use post_source_tag, which lets the `a[rel=external] { word-break: break-word; }`
rule take effect.
Example: https://danbooru.donmai.us/post_versions?search%5Bpost_id%5D=3809742
Bug: searching for "filetype:jpg -user:evazion" would negate the
filetype:jpg metatag too. This was because the -user:<name> metatag was
negating the entire cumulative relation instead of just the user:<name>
clause.
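The bug is a matter of where the negation is applied. The sketch below
models the two shapes with plain Ruby predicates rather than database
relations; `SamplePost` and all method names are hypothetical.

```ruby
# Hypothetical stand-in for a post record.
SamplePost = Struct.new(:file_ext, :uploader)

def jpg?(post)
  post.file_ext == "jpg"
end

def by_evazion?(post)
  post.uploader == "evazion"
end

# Buggy shape: the negation wraps everything accumulated so far.
def buggy_search(post)
  !(jpg?(post) && by_evazion?(post))
end

# Fixed shape: only the user clause is negated.
def fixed_search(post)
  jpg?(post) && !by_evazion?(post)
end
```

A png uploaded by someone else shows the difference: the buggy shape
returns it even though it is not a jpg, while the fixed shape does not.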
Add:
* commentary:true (posts with commentary)
* commentary:false (posts without commentary)
* commentary:translated (posts with translated commentary)
* commentary:untranslated (posts with untranslated commentary)
* commentary:"text" (posts where any commentary field matches "text")
Known issues:
* There's no way to escape the true, false, translated, or
untranslated keywords to do a literal text search for commentaries
containing one of these keywords.
* Negated searches may be slow. Using a left outer join instead of a
subquery would be faster in most cases, but negating it is harder.
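The first known issue follows from how the value is dispatched. A
minimal sketch, assuming a hypothetical `commentary_search_type` helper
that matches keywords case-insensitively:

```ruby
COMMENTARY_KEYWORDS = %w[true false translated untranslated].freeze

# Hypothetical dispatcher: keywords select a special search,
# anything else is treated as a text search.
def commentary_search_type(value)
  keyword = value.downcase
  COMMENTARY_KEYWORDS.include?(keyword) ? keyword.to_sym : :text
end
```

Since quotes are stripped before the value gets here, commentary:"true"
and commentary:true both reach this point as `true`, so the keyword
always wins and a literal text search for "true" is not possible.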
* Fix fav:<user> searches to return no results instead of raising a
UserPrivilege error when the user has private favorites.
* Fix fav:<nonexistent_user> raising a UserPrivilege error instead of
returning no results.
* Fix -ordfav:<user> not being supported.
Change favgroup:<name> searches to return no results instead of raising
a UserPrivilege error when an unpermitted user searches for a private
favgroup.
Partial fix for #4389.
* Fix invalid username searches returning all posts instead of no posts.
* Fix "user:A user:B" returning results for user:B instead of no results.
* Fix "approver:A approver:B" returning results for approver:B instead of no results.
* Add support for negated -commenter, -noter, -noteupdater, -upvote, -downvote metatags.
* Add support for "any" and "none" values for all username metatags,
including negated metatags that didn't support "any" or "none" before.
* Change noter:any and commenter:any to include posts with deleted notes
or comments. Note that commenter:<username> already included deleted
comments before. This is so that commenter:any has the same behavior
as commenter:<username>.
* Fix corrupted image detection. We were shelling out to vips and trying
to grep for error messages, but the error message for jpeg files changed.
Now we load the file with ruby-vips, which raises an error on failure.
* Don't attempt to redownload corrupted images. If a download completes
without any errors yet the downloaded file is corrupt, then something is
wrong at the source and redownloading is unlikely to help. Let the
upload fail and the user retry if necessary.
* Validate that all uploads are uncorrupted, including files uploaded
from a computer, not just files uploaded from a source.