Regenerate posts asynchronously using a delayed job.
Regenerating a post can be slow because it involves downloading the
original file, regenerating the thumbnails, and redistributing the new
thumbnails back to the image servers. It's better to run this in the
background, especially if a user is trying to regenerate posts in bulk.
The downside is there's no notification to the user when the regeneration
is complete. You have to check the modactions log to see when it's finished.
Don't log mod actions when aliases, implications, or mass updates are
processed.
Originally aliases and implications were logged because they could be
approved outside of a BUR. Mass updates could also be performed by mods
without making a forum request. This is no longer the case.
They were also logged for debugging purposes. This is no longer needed.
This generated a lot of spam in the mod action logs when a large BUR was
approved.
Remove the error status from aliases and implications. Aliases and
implications normally shouldn't fail because they're validated
beforehand. If they do, just let the delayed job itself record the
failure.
Also disable the delayed job from retrying if the alias/implication
somehow fails.
Remove the pending status from tag aliases and implications.
Previously aliases would be created first in the pending state then
changed to active when the alias was later processed in a delayed job.
This meant that BURs weren't processed completely sequentially; first
all the aliases in a BUR would be created in one go, then later they
would be processed and set to active sequentially.
This was problematic in complex BURs that tried to reverse or swap
around aliases, since new pending aliases could be created before old
conflicting aliases were removed.
Bug: if a BUR contained a mass update followed by an alias, then the
alias would become active before the mass update, which could cause
the mass update to return incorrect results if both the alias and mass
update touched the same tags.
This happened because all aliases and implications in the BUR were set
to a queued state before the mass update was processed, but putting an
alias in the queued state effectively made it active.
The fix is to remove the queued state. This was only used anyway as a
debugging tool anyway to monitor the state of BURs as they were being
processed.
* Add a `rename A -> B` command for bulk update requests.
* Change mass updates to only retag the posts, not to move saved
searches or blacklists.
A tag rename does the same thing an alias does, except it doesn't
create a permanent alias. More precisely, a tag rename:
* Moves the wiki.
* Moves the artist entry.
* Moves saved searches.
* Moves blacklists.
* Merges the wikis, if both tags have wiki pages.
* Merges the artist entries, if both tags have artist pages.
* Fixes links in wiki pages to point to the new tag.
* Retags the posts.
Make PostQueryBuilder apply aliases earlier, immediately after parsing
the search.
On the post index page there are multiple places where we need to apply
aliases:
* When running the search with PostQueryBuilder#build.
* When calculating the search count with PostQueryBuilder#fast_count.
* When calculating the related tags for the sidebar.
* When tracking missed searches and popular searches for Reportbooru.
* When looking up wiki excerpts.
Applying aliases after parsing ensures we only have to apply aliases
once for all of these things.
We also normalize the order of tags in searches and strip repeated tags.
This is so that we have consistent cache keys for fast_count.
* Fixes searches for aliased tags being counted as missed searches (fixes#4433).
* Fixes wiki excerpts not showing up when searching for aliased tags.
When doing a tag search, we have to be careful about which user we're
running the search as because the results depend on the current user.
Specifically, things like private favorites, private favorite groups,
post votes, saved searches, and flagger names depend on the user's
permissions, and whether non-safe or deleted posts are filtered out
depend on whether the user has safe mode on or the hide deleted posts
setting enabled.
* Refactor internal searches to explicitly state whether they're
running as the system user (DanbooruBot) or as the current user.
* Explicitly pass in the current user to PostQueryBuilder instead of
implicitly relying on the CurrentUser global.
* Get rid of CurrentUser.admin_mode? (used to ignore the hide deleted
post setting) and CurrentUser.without_safe_mode (used to ignore safe
mode).
* Change the /counts/posts.json endpoint to ignore safe mode and the
hide deleted posts settings when counting posts.
* Fix searches not correctly overriding the hide deleted posts setting
when multiple status: metatags were used (e.g. `status:banned status:active`)
* Fix fast_count not respecting the hide deleted posts setting when the
status:banned metatag was used.
* Make scan_query, parse_query, normalize_query into instance methods
instead of class methods. This is to a) clean up the API and b)
prepare for moving certain tag utility methods into PostQueryBuilder.
* Fix a few cases where a caller used scan_query when they should have
used split_query or parse_tag_edit.
Fix exception when submitting an upload and an in-progress preprocessed
upload already exists. In this case we forgot to pass the upload params
when calling UploadService#delayed_start.
Remove code for updating forum topics when an alias or implication is
approved or rejected. This code was only used when approving single
alias or implication requests. This is no longer used now that all
alias/implication requests are done through BURs.
Remove the post update count estimate from BUR show pages. This was
complex, slow, and usually inaccurate since it assumed that requests in
a BUR had no overlap with each other, which usually wasn't the case.
Add a dedicated queue for bulk update requests and process it using a
single worker. This prevents bulk updates from consuming all available
workers and preventing other job types from running.
This also effectively serializes bulk updates so that they're processed
one-at-a-time instead of in parallel. This will be slower overall but
may avoid some of the issues with indeterminate update order under
parallel updates.
https://danbooru.donmai.us/forum_topics/9127?page=283#forum_post_160508
There was a recent outage that was caused by the read replica
(yukinoshita.donmai.us) being temporarily unavailable. The pg driver in
rails got hardstuck trying to connect to the replica, which brought down
the whole site. The app servers stopped responding and could only be
brought down with SIGKILL. Even try to boot the rails console didn't
work.
We only really used this to calculate tag counts inside Post.fast_count,
which wasn't really beneficial since the read replica is slower than the
main database.
* Automatically fix all tags with incorrect counts during daily
maintenance (previously only tags with negative counts were fixed).
* Log fixed tags to NewRelic.
* Remove the ability to manually fix tag counts with the "Fix" button on
the /tags listing. This is no longer necessary now that tags are
fixed automatically.
Stop maintaining pool category pseudo tags (pool:series, pool:collection)
in pool strings. They're no longer used and the changes to the
`Post#pools` method in dc4d2e54b caused issues with this.
Also allow Members to change the category of large pools again. This was
only restricted because maintaining these pseudotags forced us to update
every post in the pool whenever a pool's category was changed.
Rewrite the implementation of related tags to be simpler, faster, and
more accurate:
* The related tags are now calculated by taking a random sample of 1000
posts, finding the top 250 most frequent tags among those posts, then
ordering those tags by cosine similarity.
* Related tags can generally be calculated in 50-300ms at these sample
sizes. Very high sample sizes (25000+ posts) are still relatively fast
(1-3 seconds), but generally they don't improve accuracy much.
* Related tags are now cached in redis rather than in the tags table.
The related_tags column in the tags table is no longer used.
* Only the related tags in the search taglist are cached. The related
tags returned by the 'Related tags' button are not cached.
* The cache lifetime is a fixed 4 hours.
* The 'Related tags' button now works with metatags.
* The /related_tag page now works with metatags and multitag searches.
Fixes#4134, #4146.
Also fixes a bug where mod actions weren't logged on mass updates.
Creating the mod action silently failed because it was called when
CurrentUser wasn' set.
* Fix tests to run the searches for real instead of mocking everything out.
* Fix SavedSearch.populate to only use the read only database in
production because in breaks things in tests. Specifically:
the posts get created in one db connection but searched for in
another, but the second transaction doesn't see the uncommitted posts
in the first transaction, so the search doesn't work.