The `pry` gem was removed in e698bf91 because we replaced `pry-byebug` with the standard debugger.
Add it back because `pry` is better than `irb`, and we can still use the standard debugger with it.
Add ability to group reports by various columns. For example, you can see
the posts by the top 10 uploaders over time, or posts grouped by rating
over time.
Rewrite db/populate.rb:
* Fix broken code.
* Pull random posts from Danbooru for more realistic data.
* Generate more data (wiki pages, artist commentaries, artist urls).
* Make the amount of data generated configurable with environment variables.
* Use FFaker to generate better random text and usernames (see the sketch below).
Usage:
* docker-compose exec danbooru bin/rails runner db/populate.rb # with Docker
* bin/rails runner db/populate.rb # without Docker
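As a rough sketch of the approach (the environment variable names and model calls below are illustrative, not the actual names used by db/populate.rb):

```ruby
require "ffaker"

# Illustrative only: the real script also pulls random posts from Danbooru
# and reads its own environment variable names.
users = ENV.fetch("USERS", "20").to_i
wiki_pages = ENV.fetch("WIKI_PAGES", "50").to_i

users.times do
  User.create!(name: FFaker::Internet.user_name)  # realistic random username
end

wiki_pages.times do
  WikiPage.create!(title: FFaker::Lorem.word, body: FFaker::Lorem.paragraph)
end
```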
Add `#basename`, `#filename`, and `#file_ext` utility methods to
Danbooru::URL and use them in a few places. This simplifies parsing
filenames from source URLs.
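The helpers behave roughly like the stdlib equivalents below; this is a sketch for illustration (the real definitions live in Danbooru::URL and may differ in edge cases):

```ruby
require "uri"

url = URI.parse("https://cdn.example.com/images/1234/foo.jpg")

File.basename(url.path)                    # => "foo.jpg"  (~ #basename)
File.basename(url.path, ".*")              # => "foo"      (~ #filename)
File.extname(url.path).delete_prefix(".")  # => "jpg"      (~ #file_ext)
```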
Fixes a bug where the Foundation source strategy failed because http.rb
automatically sent a `Content-Length: 0` header with all GET requests,
which caused Foundation to return a 400 Bad Request error. This behavior
was fixed in http.rb 5.x.
http.rb 5.x has a breaking change where it now includes the request object
inside the response object, which we have to handle in a few places.
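A sketch of the change, based on the behavior described above (exact method availability depends on the http.rb 5.x release notes):

```ruby
require "http"

response = HTTP.get("https://example.com/image.jpg")  # no Content-Length header sent on GET in 5.x
response.status        # => 200 OK
response.request       # => #<HTTP::Request ...>, carried inside the response in 5.x
response.request.uri   # callers that inspected the request after the fact go through here now
```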
Remove the DelayedJob gem and database table. Completes the transition
to GoodJob started in c06bfa64f and f4953549a.
Downstream users can upgrade as follows:
* Stop the Rails server.
* Stop the DelayedJob worker (normally running as `bin/delayed_job` or `bin/rails jobs:work`).
* Run `bin/rails jobs:work` to finish any pending delayed jobs.
* Run `bin/rails db:migrate` to create the good_jobs table and drop the delayed_jobs table.
* Start the Rails server again.
* Start the GoodJob worker with `bin/good_job start`.
This is the first step towards replacing DelayedJob with GoodJob. Compared to
DelayedJob:
* GoodJob supports Rails 7 (DelayedJob is currently a blocker for Rails 7
because it has a version bound on ActiveRecord <6.2).
* GoodJob has a builtin admin dashboard.
* GoodJob supports threaded job workers.
* GoodJob supports scheduled cronjobs (see the config sketch below).
* GoodJob supports healthchecks for workers.
* GoodJob uses Postgres notifications instead of polling to pick up new
jobs. This allows jobs to be picked up faster and scales better with
large numbers of workers.
https://github.com/bensheldon/good_job
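For reference, a sketch of the standard GoodJob setup (not necessarily Danbooru's exact configuration; the job class in the cron entry is hypothetical):

```ruby
# config/application.rb
config.active_job.queue_adapter = :good_job

# Optional cron-style scheduling, one of the features listed above.
config.good_job.enable_cron = true
config.good_job.cron = {
  daily_maintenance: { cron: "0 0 * * *", class: "DailyMaintenanceJob" }  # hypothetical job class
}
```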
Remove the SFTP file storage backend. Downstream users can use either
sshfs (which is what Danbooru now uses in production) or rclone instead.
The Ruby SFTP gem was much slower than sshfs.
Add a Ruby wrapper library around the libseccomp library. Seccomp is
used to restrict the syscalls a program can make. See comments in
app/logical/seccomp.rb for further details.
This is not used for anything yet. It's simply adding part of the
sandboxing infrastructure for later use.
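A purely hypothetical usage sketch; the method names below are invented for illustration, and the real API is documented in app/logical/seccomp.rb:

```ruby
# Hypothetical: restrict the process to a small syscall whitelist before
# doing untrusted work; any other syscall would kill the process.
Seccomp.allow!(:read, :write, :close, :exit_group)
resize_image(file)  # hypothetical sandboxed work
```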
Only load the `listen` and `solargraph` gems in the development
environment, not the test environment. The `listen` gem spawns background
threads that watch for file changes so code can be reloaded automatically,
which we don't want or need in test mode. These threads can also interfere
with sandboxing, because they prevent us from calling unshare(2), which
only works in a single-threaded process.
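A sketch of the Gemfile change (grouping may differ slightly in the actual Gemfile):

```ruby
# Gemfile: development only, no longer loaded in the test environment.
group :development do
  gem "listen"
  gem "solargraph"
end
```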
No longer used now that we use Puma in production. If you still use
Unicorn in your install, switch to `bin/rails server` instead. See
config/puma.rb for config settings.
No longer used now that we use Kubernetes to deploy the site instead of
Capistrano.
If you run your own installation of Danbooru, and you used Capistrano to
deploy your site, it is recommended that you switch to either the Docker
Compose file (for personal installs), the Procfile (for non-Dockerized
development environments), or Kubernetes (for production environments;
see https://github.com/danbooru/danbooru-infrastructure/tree/master/k8s
for Danbooru's production configuration).
When processing an alias, rename, implication, mass update, or nuke,
update the posts in parallel. This means that if we alias foo to bar,
for example, then we use four processes at once to retag the posts from
foo to bar.
This doesn't mean that if we have two aliases in a BUR, we process both
aliases in parallel. It simply means that when processing an alias, we
update the posts in parallel for that alias.
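A simplified sketch of the idea using the parallel gem (the actual retagging logic is more involved, and forked workers need their own database connections in practice):

```ruby
require "parallel"

posts = Post.tag_match("foo")                   # all posts tagged foo
Parallel.each(posts, in_processes: 4) do |post| # four worker processes per alias
  post.update!(tag_string: post.tag_string.sub(/\bfoo\b/, "bar"))
end
```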
Unlike Unicorn, Puma doesn't have a builtin HTTP request timeout
mechanism, so we have to use Rack::Timeout instead.
See the caveats in the Rack::Timeout documentation [1]. In Unicorn, a
timeout would send a SIGKILL to the worker, immediately killing it. This
would result in a dropped connection and a Cloudflare 502 error to the
user. In Puma, it raises an exception, which we can catch and return a
better error to the user. On the other hand, raising an exception can
potentially corrupt application state if it's sent at the wrong time, or
be delayed indefinitely if the app is stuck in IO or C extension code.
The default request timeout is 65 seconds, which gives operations with
their own 60-second timeouts, such as outgoing HTTP requests, enough time
to complete first. Set the RACK_REQUEST_TIMEOUT environment variable to
change the timeout.
1: https://github.com/sharpstone/rack-timeout#further-documentation
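A sketch of how the middleware and env var might be wired up (Danbooru's actual insertion point may differ):

```ruby
# config/application.rb
require "rack/timeout/base"

config.middleware.insert_before Rack::Runtime, Rack::Timeout,
  service_timeout: ENV.fetch("RACK_REQUEST_TIMEOUT", "65").to_i
```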
Fix an issue where the New Relic agent always started in the production
environment, even when a license key wasn't configured.
Also make the New Relic agent log to stdout instead of log/newrelic_agent.log.
Remove a workaround added in 2c06766c9. meta_request had a bug that
prevented the app from booting under Rails 6.1. The fix was finally
merged upstream:
https://github.com/dejan/rails_panel/pull/177
* Export daily public database dumps to BigQuery and Google Cloud Storage.
* Only data visible to anonymous users is exported. Some tables have
null or missing fields because of this.
* The bans table is excluded because some bans have an expires_at
timestamp set beyond year 9999, which BigQuery doesn't support.
* The favorites table is excluded because it's too slow to dump (it
  doesn't have an id index, which find_each needs; see the sketch below).
* Version tables are excluded because dumping them every day is
  inefficient; streaming inserts should be used instead.
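A minimal sketch of a find_each-based dump to newline-delimited JSON, which is the style of export described above (not the actual export code):

```ruby
# Stream one table to NDJSON. This is why tables without an id index
# (like favorites) are excluded: find_each walks the table in id order.
File.open("posts.json", "w") do |file|
  Post.find_each do |post|
    file.puts(post.as_json.to_json)  # one JSON object per line
  end
end
```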
Links:
* https://console.cloud.google.com/bigquery?project=danbooru1
* https://console.cloud.google.com/storage/browser/danbooru_public
* https://storage.googleapis.com/danbooru_public/data/posts.json