Rewrite the implementation of related tags to be simpler, faster, and
more accurate:
* The related tags are now calculated by taking a random sample of 1000
posts, finding the top 250 most frequent tags among those posts, then
ordering those tags by cosine similarity.
* Related tags can generally be calculated in 50-300ms at these sample
sizes. Very high sample sizes (25000+ posts) are still relatively fast
(1-3 seconds), but generally they don't improve accuracy much.
* Related tags are now cached in redis rather than in the tags table.
The related_tags column in the tags table is no longer used.
* Only the related tags in the search taglist are cached. The related
tags returned by the 'Related tags' button are not cached.
* The cache lifetime is a fixed 4 hours.
* The 'Related tags' button now works with metatags.
* The /related_tag page now works with metatags and multitag searches.
Fixes#4134, #4146.
In multi-tag searches, we calculated the tags in the sidebar by sampling
300 random posts from within the search and finding the most frequently
used tags. This meant we were effectively doing two searches on every
page load, one for the actual search and one for the sidebar. This is
not so bad for fast searches, but very bad for slow searches.
Instead, now we calculate the related tags from the current page of
results. This is much faster, at the cost of slightly lower accuracy and
the tag list changing slightly as you browse between pages.
We could use caching here, which would help when browsing between pages,
but we would still have to calculate the tags on the first page load,
which can be very slow in the worst case.
Refactor tag_list_html, split_tag_list_html, and inline_tag_list_html to
take the `show_extra_links` and `current_query` options explicitly,
rather than implicitly relying on CurrentUser or taking `params[:tags]`
from the template.
Fail loudly if we forget to whitelist a param instead of silently
ignoring it.
misc models: convert to strong params.
artist commentaries: convert to strong params.
* Disallow changing or setting post_id to a nonexistent post.
artists: convert to strong params.
* Disallow setting `is_banned` in create/update actions. Changing it
this way instead of with the ban/unban actions would leave the artist in
a partially banned state.
bans: convert to strong params.
* Disallow changing the user_id after the ban has been created.
comments: convert to strong params.
favorite groups: convert to strong params.
news updates: convert to strong params.
post appeals: convert to strong params.
post flags: convert to strong params.
* Disallow users from setting the `is_deleted` / `is_resolved` flags.
ip bans: convert to strong params.
user feedbacks: convert to strong params.
* Disallow users from setting `disable_dmail_notification` when creating feedbacks.
* Disallow changing the user_id after the feedback has been created.
notes: convert to strong params.
wiki pages: convert to strong params.
* Also fix non-Builders being able to delete wiki pages.
saved searches: convert to strong params.
pools: convert to strong params.
* Disallow setting `post_count` or `is_deleted` in create/update actions.
janitor trials: convert to strong params.
post disapprovals: convert to strong params.
* Factor out quick-mod bar to shared partial.
* Fix quick-mod bar to use `Post#is_approvable?` to determine visibility
of Approve button.
dmail filters: convert to strong params.
password resets: convert to strong params.
user name change requests: convert to strong params.
posts: convert to strong params.
users: convert to strong params.
* Disallow setting password_hash, last_logged_in_at, last_forum_read_at,
has_mail, and dmail_filter_attributes[user_id].
* Remove initialize_default_image_size (dead code).
uploads: convert to strong params.
* Remove `initialize_status` because status already defaults to pending
in the database.
tag aliases/implications: convert to strong params.
tags: convert to strong params.
forum posts: convert to strong params.
* Disallow changing the topic_id after creating the post.
* Disallow setting is_deleted (destroy/undelete actions should be used instead).
* Remove is_sticky / is_locked (nonexistent attributes).
forum topics: convert to strong params.
* merges https://github.com/evazion/danbooru/tree/wip-rails-5.1
* lock pg gem to 0.21 (1.0.0 is incompatible with rails 5.1.4)
* switch to factorybot and change all references
Co-authored-by: r888888888 <r888888888@gmail.com>
Co-authored-by: evazion <noizave@gmail.com>
add diffs
Remove the category constraint option from RelatedTagCalculator.calculate_from_posts.
It slows things down and isn't used.
This method is used to calculate the related tags sidebar during
searches for single metatags. Using Tag.category_for in the inner loop
caused a memcache call on every iteration. At 100 posts per page and
20-30 tags per post, this led to up to 2000-3000 total memcache calls,
which significantly slowed pageloads.