Switch the ActiveJob backend from DelayedJob to GoodJob. Differences:
* The job worker is run with `bin/good_job start` instead of `bin/delayed_job`.
* Jobs have an 8 hour timeout instead of a 4 hour timeout.
* Jobs don't automatically retry on failure.
* Finished jobs are preserved and pruned after 7 days.
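A minimal sketch of the corresponding setup, using GoodJob's standard
Rails configuration keys (how the 8 hour timeout is enforced isn't shown
here):

    # config/application.rb
    config.active_job.queue_adapter = :good_job

    # No retry_on declarations anywhere: failed jobs aren't retried
    # automatically. Keep finished jobs and prune them after 7 days.
    config.good_job.preserve_job_records = true
    config.good_job.cleanup_preserved_jobs_before_seconds_ago = 7.days.to_i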
Fix maintenance tasks failing to run in production. They were losing the
database connection and not re-establishing it, so they couldn't queue
jobs. `ApplicationRecord.verify!` checks whether the connection has been
lost and re-establishes it if it has.
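A plausible shape for the helper, assuming it delegates to ActiveRecord's
built-in connection verification (the change above only names the method):

    class ApplicationRecord < ActiveRecord::Base
      self.abstract_class = true

      # Reconnect if the connection was silently dropped (e.g. closed by
      # an idle timeout); does nothing if the connection is still alive.
      def self.verify!
        connection.verify!
      end
    end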
The database connection was being lost because in production we use a
Kubernetes service IP for the database IP, which is essentially a
virtual IP that maps to the real IP. This mapping is implemented with
IPVS[1][2], which has a default idle connection timeout of 5 minutes. If
the connection isn't used for more than 5 minutes, then it's closed.
Since maintenance only runs once an hour, the database connection would
be lost because it was idle for too long.
1: https://kubernetes.io/docs/concepts/services-networking/service/#proxy-mode-ipvs
2: https://kubernetes.io/blog/2018/07/09/ipvs-based-in-cluster-load-balancing-deep-dive/
Break the hourly/daily/weekly/monthly maintenance tasks down into
individual delayed jobs. This way if one task fails, it won't prevent
other tasks from running. Also, jobs can be run in parallel, and can be
individually retried if they fail.
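A hypothetical sketch of the split; the job and task names are
illustrative, not the actual ones:

    # Each task becomes its own job, so one failure can't block the rest,
    # and tasks can run in parallel and be retried independently.
    class RegeneratePostCountsJob < ApplicationJob
      def perform
        Tag.regenerate_post_counts!
      end
    end

    module DanbooruMaintenance
      def self.daily
        RegeneratePostCountsJob.perform_later
        PruneUploadsJob.perform_later
        # ...one perform_later per task, instead of one monolithic method.
      end
    end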
* Export daily public database dumps to BigQuery and Google Cloud Storage.
* Only data visible to anonymous users is exported. Some tables have
null or missing fields because of this.
* The bans table is excluded because some bans have an expires_at
timestamp set beyond year 9999, which BigQuery doesn't support.
* The favorites table is excluded because it's too slow to dump (it
doesn't have an id index, which is needed by find_each).
* Version tables are excluded because dumping them every day is
  inefficient; streaming inserts should be used instead.
Links:
* https://console.cloud.google.com/bigquery?project=danbooru1
* https://console.cloud.google.com/storage/browser/danbooru_public
* https://storage.googleapis.com/danbooru_public/data/posts.json
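A rough sketch of the export path, assuming the google-cloud-bigquery
gem; the `visible_to_anonymous` scope and the file layout are
illustrative, not the actual implementation:

    require "google/cloud/bigquery"

    bigquery = Google::Cloud::Bigquery.new(project_id: "danbooru1")
    dataset = bigquery.dataset("danbooru_public", skip_lookup: true)

    # Dump each row as newline-delimited JSON, which BigQuery loads natively.
    File.open("posts.json", "w") do |file|
      # Only data visible to anonymous users; find_each needs an id index.
      Post.visible_to_anonymous.find_each do |post|
        file.puts(post.to_json)
      end
    end

    dataset.load("posts", "posts.json", format: "json")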
Rework the rate limit implementation to make it more flexible:
* Allow setting different rate limits for different actions. Before we
had a single rate limit for all write actions. Now different
controller endpoints can have different limits.
* Allow actions to be rate limited by user ID, by IP address, or both.
  Before, actions were limited only by user ID, which meant non-logged-in
  actions like creating new accounts or attempting to log in couldn't be
  rate limited. It also meant you could use multiple accounts with the
  same IP to get around limits (a possible declaration style is sketched
  below).
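A hypothetical DSL for per-endpoint limits, to illustrate the idea (this
is not the actual API):

    class SessionsController < ApplicationController
      # Hypothetical: limit login attempts by both user ID and IP, with a
      # steady refill rate plus a small burst allowance.
      rate_limit :create, rate: 1.0 / 1.minute, burst: 10, key: [:user, :ip]
    end

    class UsersController < ApplicationController
      # Hypothetical: account creation limited by IP only, since there is
      # no logged-in user yet.
      rate_limit :create, rate: 1.0 / 5.minutes, burst: 5, key: :ip
    end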
Other changes:
* Remove the API Limit field from user profile pages.
* Remove the `remaining_api_limit` field from the `/profile.json` endpoint.
* Rename the `X-Api-Limit` header to `X-Rate-Limit` and change it from a
number to a JSON object containing all the rate limit info
(including the refill rate, the burst factor, the cost of the call,
and the current limits).
* Fix a potential race condition where, if you flooded requests fast
enough, you could exceed the rate limit. This was because we checked
and updated the rate limit in two separate steps, which was racy;
simultaneous requests could pass the check before the update happened.
The new code uses some tricky SQL to check and update multiple limits
in a single statement.
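A hedged sketch of the single-statement token bucket, against a
hypothetical rate_limits(key, points, updated_at) table (not the actual
schema). Because the refill, the check, and the debit all happen in one
UPDATE, concurrent requests serialize on the row lock instead of racing
between a read and a write:

    def debit!(key, cost:, burst:, refill_rate:)
      sql = ApplicationRecord.sanitize_sql_array(
        [<<~SQL, key: key, cost: cost, burst: burst, rate: refill_rate]
          UPDATE rate_limits SET
            points = LEAST(:burst, points + :rate * EXTRACT(EPOCH FROM (now() - updated_at))) - :cost,
            updated_at = now()
          WHERE key = :key
            AND LEAST(:burst, points + :rate * EXTRACT(EPOCH FROM (now() - updated_at))) >= :cost
          RETURNING points
        SQL
      )

      # No row updated means the bucket had too few points: the request
      # is over the limit and nothing was debited.
      ApplicationRecord.connection.exec_query(sql).any?
    end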
Possible fix for this exception that happens in production for unclear
reasons:
NoMethodError: undefined method `is_admin?' for nil:NilClass
…oru/releases/20200805193154/app/policies/tag_policy.rb: 3:in `can_change_category?'
…www/danbooru/releases/20200805193154/app/models/tag.rb: 216:in `find_or_create_by_name'
…www/danbooru/releases/20200805193154/app/models/tag.rb: 193:in `block in create_for_list'
…www/danbooru/releases/20200805193154/app/models/tag.rb: 193:in `map'
…www/danbooru/releases/20200805193154/app/models/tag.rb: 193:in `create_for_list'
…ww/danbooru/releases/20200805193154/app/models/post.rb: 463:in `normalize_tags'
Change how uploads are shown to other users and how long they're kept:
* Show completed uploads to other users.
* Don't show failed or incomplete uploads to other users.
* Don't show tags to other users.
* Delete completed uploads after 1 hour.
* Delete incomplete uploads after 1 day.
* Delete failed uploads after 3 days.
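A hedged sketch of the retention rules above; the scope names are
assumptions:

    class UploadPruner
      def prune!
        Upload.completed.where("created_at < ?", 1.hour.ago).destroy_all
        Upload.incomplete.where("created_at < ?", 1.day.ago).destroy_all
        Upload.failed.where("created_at < ?", 3.days.ago).destroy_all
      end
    end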
Stop sending dmail notifications to uploaders when an upload is
disapproved. These messages are usually confusing and frustrating to
uploaders: they don't know who sent them, and they often feel insulted
by negative messages from anonymous users.
The old password reset flow:
* User requests a password reset.
* Danbooru generates a password reset nonce.
* Danbooru emails user a password reset confirmation link.
* User follows link to password reset confirmation page.
* The link contains a nonce authenticating the user.
* User confirms password reset.
* Danbooru resets user's password to a random string.
* Danbooru emails user their new password in plaintext.
The new password reset flow:
* User requests a password reset.
* Danbooru emails user a password reset link.
* User follows link to password edit page.
* The link contains a signed_user_id param authenticating the user.
* User changes their own password.
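A minimal sketch of generating and verifying the signed_user_id param
with Rails' message verifier; only the param name comes from the flow
above, and the purpose, expiry, and route helper are assumptions:

    verifier = Rails.application.message_verifier(:password_reset)

    # In the mailer: a link that authenticates the user without storing
    # a nonce server-side.
    signed_user_id = verifier.generate(user.id, expires_in: 1.hour)
    url = edit_password_url(signed_user_id: signed_user_id)

    # On the password edit page: returns nil if the param was tampered
    # with or has expired.
    user_id = verifier.verified(params[:signed_user_id])
    user = user_id && User.find(user_id)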
Send weekly warning dmails to approvers in danger of losing their
approver permissions. Don't send warnings if we're more than three weeks
away from demotion so that approvers aren't warned prematurely.
Remove forum subscriptions. Few people used them (only around 100), and
even fewer were subscribed to active threads. Most subscriptions were
for old threads that will never be bumped again. The implementation also
had a few problems:
* Unsubscribe links in emails didn't work (they unset the user's
receive_email_notifications flag, but forum subscriptions didn't
respect this flag).
* Some users had invalid email addresses, which caused notifications to
bounce. There was no mechanism for preventing bounces.
* The implementation wasn't scalable. It involved a daily linear scan
over _all_ forum subscriptions looking for any topics that had been updated.
* Move the Curated pool updater from Reportbooru to Danbooru.
* Change the process for selecting curated posts. Previously it was
every post from the last week with at least three supervotes. This was
flawed because it included both super-upvotes and super-downvotes. Now
it's the top 100 posts from the last week, ordered from most super-upvoted
to least.
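A hedged sketch of the new selection, assuming super-upvotes are stored
as post votes with a score of 3 (the association name and score encoding
are assumptions):

    # Top 100 posts from the last week, most super-upvoted first.
    Post.where("posts.created_at >= ?", 1.week.ago)
        .joins(:votes)
        .where(post_votes: { score: 3 })   # super-upvotes only
        .group("posts.id")
        .order(Arel.sql("COUNT(*) DESC"))
        .limit(100)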
Disable database timeouts during daily maintenance. Fixes
`regenerate_post_counts!` timing out. Remove calls to without_timeout
because otherwise it re-enables the timeout when trying to restore the
old timeout (see 97cc873a3f).
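The replacement amounts to disabling the timeout once at the start of
the run; a sketch, assuming Postgres' statement_timeout is what enforces
the limit:

    # At the start of the daily maintenance run, instead of wrapping
    # individual steps in without_timeout:
    ApplicationRecord.connection.execute("SET statement_timeout = 0")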
* Automatically fix all tags with incorrect counts during daily
maintenance (previously only tags with negative counts were fixed).
* Log fixed tags to NewRelic.
* Remove the ability to manually fix tag counts with the "Fix" button on
the /tags listing. This is no longer necessary now that tags are
fixed automatically.
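A hedged sketch of the daily task; the return value of
regenerate_post_counts! and the event name are assumptions:

    # Fix every tag whose cached count has drifted, and record each fix.
    fixed_tags = Tag.regenerate_post_counts!
    fixed_tags.each do |tag|
      NewRelic::Agent.record_custom_event("FixedTagCount", tag: tag.name)
    end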
Drop support for https://danbooru.donmai.us/cache/tags.json. This was a
nightly dump of the tags table that was originally added in #1012. It
was never documented and never really used, except by the DanbooruUp
extension.
Setting the statement timeout at the beginning didn't work because
`PostPruner.new.prune!` clobbers the timeout (it calls `without_timeout`,
which doesn't restore the timeout properly if the timeout was zero).
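A sketch of why without_timeout misbehaves, inferred from the
description above (the exact implementation may differ): it restores a
fixed default rather than the caller's previous setting, so a caller
that had already disabled timeouts gets them turned back on:

    def without_timeout
      connection.execute("SET statement_timeout = 0")
      yield
    ensure
      # Restores an assumed default (3s here, illustrative) instead of
      # the previous value; if the timeout was already zero, this
      # re-enables it mid-task.
      connection.execute("SET statement_timeout = 3000")
    end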