Don't record most HTTP request and response headers in the APM, except
for the User-Agent, Referer, Save-Data, X-Forwarded-For, Accept-Language,
and Content-Type headers. Recording every HTTP header for every request
takes up a lot of space and most of them aren't very useful.
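A minimal sketch of the idea (the helper name and the call into the APM are illustrative, not Danbooru's actual code): build the recorded set from an explicit allowlist instead of dumping every header.

    ALLOWED_HEADERS = %w[User-Agent Referer Save-Data X-Forwarded-For Accept-Language Content-Type]

    # Returns only the allowlisted headers that are actually present on the request.
    # (index_with is from ActiveSupport.)
    def recorded_headers(request)
      ALLOWED_HEADERS.index_with { |name| request.headers[name] }.compact
    end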
Fix artist URLs still showing old cached site icons because the icon URL
didn't change when the file was updated. Use `image_pack_tag` so that the
filename includes a content hash and the URL changes whenever the file
changes.
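For example (filename illustrative), the Webpacker helper emits a fingerprinted URL, so a changed file gets a new URL and stale caches are bypassed:

    <%# Renders something like /packs/media/images/pixiv-2f9d3c1a8b.png %>
    <%= image_pack_tag("pixiv.png") %>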
Add a new tag search parser that supports full boolean expressions, including `and`,
`or`, and `not` operators and parenthesized subexpressions.
This is only the parser itself, not the code for converting the search into SQL. The new
parser isn't used yet for actual searches. Searches still use the old parser.
Some example syntax:
* `1girl 1boy`
* `1girl and 1boy` (same as `1girl 1boy`)
* `1girl or 1boy`
* `~1girl ~1boy` (same as `1girl or 1boy`)
* `1girl and ((blonde_hair blue_eyes) or (red_hair green_eyes))`
* `1girl ~(blonde_hair blue_eyes) ~(red_hair green_eyes)` (same as above)
* `1girl -(blonde_hair blue_eyes)`
* `*_hair *_eyes`
* `*_hair or *_eyes`
* `user:evazion or fav:evazion`
* `~user:evazion ~fav:evazion`
Rules:
AND is implicit between terms, but may be written explicitly:
* `a b c` is `a and b and c`
AND has higher precedence (binds tighter) than OR:
* `a or b and c or d` is `a or (b and c) or d`
* `a or b c or d e` is `a or (b and c) or (d and e)`
All `~` operators in the same subexpression are combined into a single OR:
* `a b ~c ~d` is `a b (c or d)`
* `~a ~b and ~c ~d` is `(a or b) (c or d)`
* `(~a ~b) (~c ~d)` is `(a or b) (c or d)`
A `~` operator that appears alone in a subexpression is ignored:
* `a ~b` is `a b`
* `~a and ~b` is `a and b`, which is `a b`
* `(~a) ~b` is `a ~b`, which is `a b`
The parser is written as a backtracking recursive descent parser built on top of
StringScanner and a handful of parser combinators. The parser generates an AST, which is
then simplified using Boolean algebra to remove redundant nodes and to convert the
expression to conjunctive normal form (that is, a product of sums, or an AND of ORs).
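A minimal sketch of the approach (this is not the actual Danbooru parser: `~`, wildcards, and metatags are not handled specially, only `-` negation is shown, and the AST node names are illustrative):

    require "strscan"

    # A toy backtracking recursive descent parser for boolean tag expressions.
    # Grammar:
    #   or_expr  := and_expr ("or" and_expr)*
    #   and_expr := not_expr (["and"] not_expr)*    # AND is implicit between terms
    #   not_expr := "-" not_expr | primary
    #   primary  := "(" or_expr ")" | tag
    class TagExpressionParser
      def parse(input)
        @scanner = StringScanner.new(input)
        ast = parse_or
        skip_space
        raise "unexpected input at position #{@scanner.pos}" unless @scanner.eos?
        ast
      end

      private

      def parse_or
        node = parse_and
        node = [:or, node, parse_and] while keyword("or")
        node
      end

      def parse_and
        node = parse_not
        loop do
          if keyword("and")
            node = [:and, node, parse_not]                          # explicit AND
          elsif !peek_keyword?("or") && (rhs = attempt { parse_not })
            node = [:and, node, rhs]                                # implicit AND
          else
            break
          end
        end
        node
      end

      def parse_not
        token("-") ? [:not, parse_not] : parse_primary
      end

      def parse_primary
        if token("(")
          node = parse_or
          raise "expected ')'" unless token(")")
          node
        else
          tag = scan(/[^\s()]+/) or raise "expected a tag"
          [:tag, tag]
        end
      end

      def skip_space
        @scanner.skip(/\s*/)
      end

      def scan(regexp)
        skip_space
        @scanner.scan(regexp)
      end

      def token(str)
        scan(/#{Regexp.escape(str)}/)
      end

      def keyword(word)
        scan(/#{word}\b/i)
      end

      def peek_keyword?(word)
        skip_space
        !@scanner.match?(/#{word}\b/i).nil?
      end

      # Try a sub-parse; rewind the scanner if it fails (the backtracking part).
      def attempt
        pos = @scanner.pos
        yield
      rescue RuntimeError
        @scanner.pos = pos
        nil
      end
    end

    TagExpressionParser.new.parse("1girl and (blonde_hair or red_hair)")
    # => [:and, [:tag, "1girl"], [:or, [:tag, "blonde_hair"], [:tag, "red_hair"]]]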
Support grabbing the full image for Tinami uploads, rather than the sample.
Getting the full image requires making a request like this:
curl -X POST \
-H 'Referer: https://www.tinami.com/' \
-H 'Content-Type: application/x-www-form-urlencoded' \
-H 'Cookie: Tinami2SESSID=<redacted>;' \
--data-raw 'action_view_original=true&cont_id=1087268&ethna_csrf=<redacted>' \
https://www.tinami.com/view/1087268
Then scraping the <img> tag from the resulting HTML page.
If the post has multiple images, then we need to scrape and pass the
`sub_id` of the image too.
Fixes #2818.
Fix this deprecation warning:
DEPRECATION WARNING: ActiveSupport::TimeWithZone.name has been deprecated
and from Rails 7.1 will use the default Ruby implementation. You can set
`config.active_support.remove_deprecated_time_with_zone_name = true` to
enable the new behavior now.
Triggered by the XML serializer in the API.
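The warning itself names the forward-compatible setting, so the fix presumably boils down to opting into the new behavior (exact file placement may differ):

    # config/application.rb
    config.active_support.remove_deprecated_time_with_zone_name = true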
Add `foreman` to the base Docker image. This way you can do this:
docker run --rm -it -v $PWD:/danbooru ghcr.io/danbooru/danbooru foreman start
to start everything needed to run Danbooru in development mode (except
for the Postgres database). This will start everything listed in the
Procfile:
bin/rails server
bin/good_job start
bin/rails danbooru:cron
bin/webpack-dev-server
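For reference, a Procfile covering those four commands would look roughly like this (the process names are illustrative, not necessarily the ones Danbooru uses):

    web: bin/rails server
    jobs: bin/good_job start
    cron: bin/rails danbooru:cron
    webpack: bin/webpack-dev-server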
Use the GoodJob job adapter instead of the default Rails async job
adapter in development mode.
The default async adapter runs jobs in a background thread in the
`bin/rails server` process, but this sometimes has problems with jobs
blocking the main server thread. The job queue interface at `/jobs` also
didn't work with this.
This means that you now have to run `bin/good_job start` in development
mode for background jobs to run. This is required for uploads to work.
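The adapter switch itself is the standard ActiveJob setting; a sketch of what that looks like in the development config:

    # config/environments/development.rb
    config.active_job.queue_adapter = :good_job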
NicoSeiga changed it so that on every login, you must enter a 2FA code
sent by email. This broke the NicoSeiga strategy. The fix is to just use
a static session cookie instead (and hope it doesn't expire, and isn't
tied to an IP).
The `nico_seiga_login` and `nico_seiga_password` config settings have
been removed from config/danbooru_default_config.rb and replaced by
`nico_seiga_user_session`. If you run your own Danbooru instance, you
will have to update your config file manually.
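A sketch of what the local override might look like, following the same pattern as the settings in config/danbooru_default_config.rb (the value is the `user_session` cookie from a logged-in NicoSeiga session):

    # config/danbooru_local_config.rb (sketch)
    module Danbooru
      class CustomConfiguration < Configuration
        def nico_seiga_user_session
          "<your user_session cookie value>"
        end
      end
    end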
Fix the "My Uploads" page showing Admins all uploads, not just their own
uploads.
Changes the URL of the My Uploads page from /uploads to /users/:id/uploads.
This page shows each individual file you've uploaded. This is different
from the regular uploads page because files in multi-file uploads are
not grouped together.
Add more sensitive attributes to the filtered parameters list so that
they aren't shown in exception messages, and aren't logged in log files
or to NewRelic.
Only do this in production so that in testing and development, you can
still see these things when inspecting objects on the console.
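The mechanism is Rails' standard parameter filtering; a sketch of the idea (the attribute names here are illustrative, not the full list):

    # config/initializers/filter_parameter_logging.rb (sketch)
    if Rails.env.production?
      Rails.application.config.filter_parameters += [:password, :api_key, :session_id]
    end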
Rework the upload process so that files are saved to Danbooru first
before the user starts tagging the upload.
The main user-visible change is that you have to select the file first
before you can start tagging it. Saving the file first lets us fix a
number of problems:
* We can check for dupes before the user tags the upload.
* We can perform dupe checks and show preview images for users not using the bookmarklet.
* We can show preview images without having to proxy images through Danbooru.
* We can show previews of videos and ugoira files.
* We can reliably show the filesize and resolution of the image.
* We can let the user save files to upload later.
* We can get rid of a lot of spaghetti code related to preprocessing
uploads. This was the cause of most weird "md5 confirmation doesn't
match md5" errors.
(Not all of these are implemented yet.)
Internally, uploading is now a two-step process: first we create an upload
object, then we create a post from the upload. This is how it works (a client-side sketch follows the list):
* The user goes to /uploads/new and chooses a file or pastes a URL into
the file upload component.
* The file upload component calls `POST /uploads` to create an upload.
* `POST /uploads` immediately returns a new upload object in the `pending` state.
* Danbooru starts processing the upload in a background job (downloading,
resizing, and transferring the image to the image servers).
* The file upload component polls `/uploads/$id.json`, checking the
upload `status` until it returns `completed` or `error`.
* When the upload status is `completed`, the user is redirected to /uploads/$id.
* On the /uploads/$id page, the user can tag the upload and submit it.
* The upload form calls `POST /posts` to create a new post from the upload.
* The user is redirected to the new post.
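A hedged sketch of that flow from an API client's point of view (the parameter names are illustrative and authentication is omitted):

    require "net/http"
    require "json"

    BASE = "https://danbooru.donmai.us"

    # Step 1: create the upload; the API returns it immediately in the `pending` state.
    response = Net::HTTP.post_form(URI("#{BASE}/uploads.json"), "upload[source]" => "https://example.com/image.png")
    upload = JSON.parse(response.body)

    # Step 2: poll the upload until the background job finishes processing it.
    until %w[completed error].include?(upload["status"])
      sleep 1
      upload = JSON.parse(Net::HTTP.get(URI("#{BASE}/uploads/#{upload["id"]}.json")))
    end

    # Step 3: once the status is `completed`, POST /posts with the tags to turn the upload into a post.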
This is the data model (sketched in code after this section):
* An upload represents a set of files uploaded to Danbooru by a user.
Uploaded files don't have to belong to a post. An upload has an
uploader, a status (pending, processing, completed, or error), a
source (unless uploading from a file), and a list of media assets
(image or video files).
* There is a has-and-belongs-to-many relationship between uploads and
media assets. An upload can have many media assets, and a media asset
can belong to multiple uploads. Uploads are joined to media assets
through an upload_media_assets table.
An upload could potentially have multiple media assets if it's a Pixiv
or Twitter gallery. This is not yet implemented (at the moment all
uploads have one media asset).
A media asset can belong to multiple uploads if multiple people try
to upload the same file, or if the same user tries to upload the same
file more than once.
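Sketched as ActiveRecord associations (class names assumed to mirror the table names above):

    class Upload < ApplicationRecord
      belongs_to :uploader, class_name: "User"
      has_many :upload_media_assets
      has_many :media_assets, through: :upload_media_assets
    end

    class UploadMediaAsset < ApplicationRecord
      belongs_to :upload
      belongs_to :media_asset
    end

    class MediaAsset < ApplicationRecord
      has_many :upload_media_assets
      has_many :uploads, through: :upload_media_assets
    end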
New features:
* On the upload page, you can press Ctrl+V to paste a URL and immediately upload it.
* You can save files for upload later. Your saved files are at /uploads.
Fixes:
* Improved error messages when uploading invalid files or bad URLs, and when
the rating is missing.
Fix regression introduced in 0db20e0ca. Setting `format: false` on the
wiki pages resource disabled format negotiation on all wiki page routes,
not just the show page, which meant /wiki_pages.json no longer worked.
The fix is to monkey patch the internal Rails method that parses the file
extension from the URL, and have it ignore everything but the .html, .json,
.js, and .xml extensions. This is really hacky and may break in future Rails
releases.
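A heavily hedged sketch of the shape of such a patch (the private method name comes from Rails internals and is exactly the kind of thing that may change between releases; this is not the literal patch):

    module RestrictedPathFormats
      ALLOWED_FORMATS = [Mime[:html], Mime[:json], Mime[:js], Mime[:xml]]

      private

      # Only honor a handful of known file extensions when Rails guesses the
      # response format from the request path.
      def format_from_path_extension
        format = super
        format if ALLOWED_FORMATS.include?(format)
      end
    end

    ActionDispatch::Request.prepend(RestrictedPathFormats)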
* Add ability to mark moderation reports as 'handled' or 'rejected'.
* Automatically mark reports as handled when the comment or forum post
is deleted.
* Send a dmail to the reporter when their report is handled.
* Don't show the report notice on comments or forum posts when all
reports against them have been handled or rejected.
* Add a fix script to mark all existing reports for deleted comments,
forum posts, or dmails as handled.
Fix wiki pages like this returning 406 errors:
* https://danbooru.donmai.us/wiki_pages/rnd.jpg
Caused by Rails parsing the .jpg part as a file extension and trying to
return a JPEG in response. This happens deep in Rails' MIME negotiation
code, so it's hard to override. The fix is to pass `format: false` in
the route to disable all special handling of file extensions by Rails,
and then handle it ourselves in the controller. Ugly.
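In routing terms the fix looks roughly like this (a sketch; the real route declaration has more to it):

    # config/routes.rb
    resources :wiki_pages, format: false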
This only affected two tags: `rnd.jpg` and `haru.jpg`.
Set the MALLOC_CONF environment variable in the Docker image to tune the
Jemalloc configuration. Configuring Jemalloc to use two memory arenas
reduces memory fragmentation, and using background threads and low decay
times allows freed memory to be returned to the OS sooner.
Previously we set this environment variable at runtime in Kubernetes,
but baking it into the image is simpler.
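The values below illustrate the settings described (two arenas, background threads, short decay times); the exact string baked into the image may differ:

    # Dockerfile (illustrative values)
    ENV MALLOC_CONF="narenas:2,background_thread:true,dirty_decay_ms:1000,muzzy_decay_ms:1000"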
Return a Server-Timing header on all HTTP responses with details on how long
the server took to render the response.
Browsers can show this timing information in the devtools. In Chrome, go
to the Network panel, then click an HTTP request, then click the Timing tab.
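For example, a response might carry a header like this (the metric names are illustrative):

    Server-Timing: sql;dur=12.3, cache;dur=1.8, total;dur=57.0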
* Update framework files with `bin/rails app:update`.
* Update to use new Rails 7.0 default settings, except for a couple
things regarding new cookie and cache formats that would prevent us
from rolling back to Rails 6.1 if necessary.
Add ability to search jobs on the /jobs page by job type or by status.
Fixes #2577 (Search filters for delayed jobs). This wasn't possible
before with DelayedJob because it stored the job data in a YAML string,
which made it difficult to search jobs by type. GoodJob stores job data
in a JSON object, which is easier to search in Postgres.
Remove the DelayedJob gem and database table. Completes the transition
to GoodJob started in c06bfa64f and f4953549a.
Downstream users can upgrade as follows:
* Stop the Rails server.
* Stop the DelayedJob worker (normally running as `bin/delayed_job` or `bin/rails jobs:work`).
* Run `bin/rails jobs:work` to finish any pending delayed jobs.
* Run `bin/rails db:migrate` to create the good_jobs table and drop the delayed_jobs table.
* Start the Rails server again.
* Start the GoodJob worker with `bin/good_job start`.
Switch the ActiveJob backend from DelayedJob to GoodJob. Differences:
* The job worker is run with `bin/good_job start` instead of `bin/delayed_job`.
* Jobs have an 8 hour timeout instead of a 4 hour timeout.
* Jobs don't automatically retry on failure.
* Finished jobs are preserved and pruned after 7 days.
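The retry and retention behavior roughly corresponds to GoodJob settings like these (a sketch based on GoodJob's documented options, not necessarily the exact values used):

    config.good_job.retry_on_unhandled_error = false
    config.good_job.preserve_job_records = true
    config.good_job.cleanup_preserved_jobs_before_seconds_ago = 7.days.to_i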
Switch the Ruby memory allocator from Glibc malloc to Jemalloc. Jemalloc
supposedly uses less memory than Glibc malloc because it's better at
handling memory fragmentation. It also has detailed internal statistics
to help monitor allocator behavior.
We use the LD_PRELOAD method of loading Jemalloc instead of building it
into Ruby so that we can switch allocators at runtime.
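A sketch of the Docker side of this (the package name and library path are Debian-specific and may differ):

    RUN apt-get update && apt-get install -y libjemalloc2
    ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2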
Do a few micro-optimizations to reduce the number of memory allocations
during thumbnail generation.
This commit, combined with freezing string literals in a7dc05 and
67b961, reduces the number of allocations on the front page from 180,000
to 150,000, and the number of retained objects from 8,000 to 4,000.
Monkey-patch Symbol#to_s to return a frozen (immutable) string instead
of a mutable string.
This should reduce string allocations, and thereby reduce memory usage
and garbage collector pressure, but it may be incompatible with
libraries that expect Symbol#to_s to return a mutable string.
https://bugs.ruby-lang.org/issues/16150
https://github.com/Shopify/symbol-fstring
Remove the SFTP file storage backend. Downstream users can use either
sshfs (which is what Danbooru now uses in production) or rclone instead.
The Ruby SFTP gem was much slower than sshfs.