Break the hourly/daily/weekly/monthly maintenance tasks down into
individual delayed jobs. This way if one task fails, it won't prevent
other tasks from running. Also, jobs can be run in parallel, and can be
individually retried if they fail.
Fix a bug where where PNG images could be incorrectly detected as
exif-rotated. This would happen when a PNG contained the
IFD0:Orientation flag. It's technically possible for a PNG to contain
this flag, but it's ignored by libvips and by browsers.
post #3762340 (nsfw) is an example of a PNG like this.
The fix is to use `autorot` to let libvips apply the rotation instead of
trying to interpret the exif data ourselves. Note that libvips-8.9 has a
bug where it doesn't strip the orientation flag after applying
`autorot`, which leads to the image being incorrectly rotated a second
time when generating the thumbnail. Use libvips-8.11 instead.
Hourly pruning of expired uploads was failing because of nil deference
errors in `media_asset.destroy!`. There are various cases where an
upload doesn't have a media asset, for example when the source url
fails to download or when the upload is of an invalid filetype.
Rotate the image based on the EXIF orientation flag when generating
thumbnails and samples.
Also fix the width and height to be calculated correctly for rotated
images. Vips gives us the unrotated width and height of the image; we
have to detect whether the image is rotated and swap the width and
height manually to correct them. For example, if an image with the
"Rotate 90 CW" flag is 100x500 before rotation, then after rotation it's
500x100. This should fix#4883 (Exif rotation breaks Javascript fit-to-window)
We also have to fix it so that regenerating a post updates the width and
height of the post, in the event that it's a rotated image.
Finally we set `image-orientation: from-image;` even though it's
probably not necessary.
Fix the test suite failing when trying to run it in the default state
with no config file or API keys configured. Most source sites require
API keys or login credentials to be set in order to work. Skip these
tests when credentials aren't configured.
Autotag `greyscale`, `non-repeating_animation`, and `exif_rotation`.
Note that this does not detect all (or even most) greyscale images.
Artists often save greyscale images as RGB instead of as greyscale.
Automatically tag animated_gif and animated_png when a post is edited.
Add them back if the user tries to remove them from an animated post,
or remove them if the user tries to add them to a non-animated post.
Before we added these tags at upload time, but it was possible for users
to remove them after upload, or to incorrectly add them to non-animated
posts. They were added at upload time because we couldn't afford to open
the file and parse the metadata on every tag edit. Now that we save the
metadata in the database, we can do this.
This also makes it so you can't tag ugoira on non-ugoira files.
Known bug: it's possible to have an animated GIF where every frame is
identical. Post #3770975 is an example. This will be detected as an
animated GIF even though visually it doesn't appear to be animated.
Fixes#4041: Animated_gif tag not added to preprocessed uploads
Fix it so that you can reapprove a failed BUR to run it again. Before
this would fail because it would end up trying to create the aliases
or implications again, which would fail because they already existed.
Now it ignores when an alias or implication already exists. It will
however finish tagging the posts if they haven't been fully moved.
When processing an alias, rename, implication, mass update, or nuke,
update the posts in parallel. This means that if we alias foo to bar,
for example, then we use four processes at once to retag the posts from
foo to bar.
This doesn't mean that if we have two aliases in a BUR, we process both
aliases in parallel. It simply means that when processing an alias, we
update the posts in parallel for that alias.
When a BUR is approved, put it in a `processing` state. After it
successfully finishes processing, put it in the `approved` state. If it
fails processing, put it in the `failed` state.
If approving the BUR fails with a validation error, for example because
the alias already exists or an implication lacks a wiki, then leave the
BUR in the `pending` state. The `failed` state is only for unexpected
errors during processing.
Change the way BURs are processed. Before, we spawned a background job
for each line of the BUR, then processed each job sequentially. Now, we
process the entire BUR sequentially in a single background job.
This means that:
* BURs are truly sequential now. Before certain things like removing
aliases weren't actually performed in a background job, so they were
performed out-of-order before everything else in the BUR.
* Before, if an alias or implication line failed, then subsequent alias
or implication lines would still be processed. This was because each
alias or implication line was queued as a separate job, so a failure
of one job didn't block another. Now, if any alias or implication
fails, the entire BUR will fail and stop processing after that line.
This may be good or bad, depending on whether we actually need the BUR
to be processed in order or not.
* Before, BURs were processed inside a database transaction (except for the
actual updating of posts). Now they're not. This is because we can't
afford to hold transactions open while processing long-running aliases
or implications. This means that if BUR fails in the middle when it is
initially approved, it will be left in a half-complete state. Before
it would be rolled back and left in a pending state with no changes
performed.
* Before, only one BUR at a time could be processed. If multiple BURs
were approved at the same time, then they would queue up and be
processed one at a time. Now, multiple BURs can be processed at the
same time. This may be undesirable when processing large BURs, or BURs
that must be approved in a specific order.
* Before, large tag category changes could time out. This was because
they weren't actually performed in a background job. Now they are, so
they shouldn't time out.
Bug: if ExifTool exited with status 1 because it thought the file was
corrupt, then we didn't record any of the metadata, even though it was
able to read most of it. It turns out there are thousands of posts with
minorly corrupt metadata that ExifTool is still able to read, but will
complain about.
Fix: ignore the exit code of ExifTool and always save whatever metadata
ExifTool is able to return. It will return an `ExifTool:Error` tag in
the event of errors.
Note that there are some (many?) files that are considered corrupt by
ExifTool but not by Vips, and vice versa. Probably because ExifTool only
parses the metadata while Vips only parses the image data.
Fix Exiftool not being able to get the metadata for compressed SWF
files. Exiftool requires Compress::Zlib as an optional dependency to
decompress compressed SWF files, but it wasn't in the Docker image.
Archive::Zip is required for Zip files and Digest::MD5 for certain other
metadata (see "DEPENDENCIES" in exiftool README).
Fix a bug where if you did a slow search that took too long to calculate
the page count, and you had 200 posts per page, then we would show page
5000 as the last page of the search.
This was because we were artificially returning 1,000,000 as the post
count to signal that the count timed out, but at 200 posts per page this
would show 5000 as the last page of the search.
Add a model for storing image and video metadata for uploaded files.
Metadata is extracted using ExifTool. You will need to install ExifTool
after this commit. ExifTool 12.22 is the minimum required version
because we use the `--binary` option, which was added in this release.
The MediaMetadata model is separate from the MediaAsset model because
some files contain tons of metadata, and most of it is non-essential.
The MediaAsset model represents an uploaded file and contains essential
metadata, like the file's size and type, while the MediaMetadata model
represents all the other non-essential metadata associated with a file.
Metadata is stored as a JSON column in the database.
ExifTool returns all the file's metadata, not just the EXIF metadata.
EXIF is one of several types of image metadata, hence why we call
it MediaMetadata instead of EXIFMetadata.
Fix a bug where generating thumbnails failed for certain images when
using libvips 8.10. Specifically, it failed for single-channel greyscale
images and four-channel CMYK images without an embedded color profile.
In these cases we specified an sRGB fallback profile, but under libvips
8.10 this failed because the sRGB profile was incompatible with
single-channel and four-channel images. Before libvips 8.10 this worked,
but as of 8.10 it's a hard error.
The way libvips handles fallback color profiles differs across versions,
so we have to use different arguments for different versions. In 8.7,
vips doesn't have builtin color profiles, so we have to specify our own
manually. In 8.9, it has builtin profiles, so we can omit the import
profile, but we're still required to set the export profile to sRGB,
otherwise it will leave CMYK images as CMYK when generating thumbnails.
In 8.10, we have to _not_ to set the import or export profile to sRGB,
otherwise it will fail with an incompatible profile error when it tries
to convert CMYK images to RGB.
The builtin sRGB profile used by libvips[1] is different than the one we
used previously[2]. The builtin one comes from LCMS[3], whereas ours
came from ArgyllCMS.[4] Not all sRGB profiles are created the same[5],
so this may result in some imperceptible differences in thumbnail
output. The ArgyllCMS profile was used before because it seemed to be
the best one[6], but realistically it probably doesn't matter.
1: https://github.com/libvips/libvips/blob/v8.10.6/libvips/colour/profiles/sRGB.icm
2: 906eec190d/config/sRGB.icm
3: https://www.littlecms.com/
4: https://www.argyllcms.com/
5: https://ninedegreesbelow.com/photography/srgb-profile-comparison.html
6: https://ninedegreesbelow.com/photography/srgb-profile-comparison.html#addendum
Account upgrades are now logged on the /user_upgrades page, so they
no longer need to be recorded as mod actions. The mod actions log should
be reserved for privileged actions performed by Builders and above. They
also tended to spam the mod actions log.
Hardcode the list of nondisposable email providers instead of making it
a config option. Also add a few new providers.
This was previously a config option to keep it secret, but there's not
much need for secrecy here.
A Restricted user's email must be on this list to unrestrict their
account. If a user is Restricted and their email is not in this list,
then it's assumed to be disposable and can't be used to unrestrict their
account even if they verify their email address.
Fix regression in ef2857667 that caused animated GIFs and PNGs to
generate thumbnails that were larger than 150x150.
Also fix a bug with cropped previews not being generated for animated
GIFs and PNGs.
Remove StorageManager::Hybrid and StorageManager::Match. These were used
to store uploads on different servers based on the post ID or file
sample type. This is no longer used in production because in hindsight
it's a lot more difficult to manage uploads when they're fragmented
across different servers.
If you need this, you can do tricks with network filesystems to get the
same effect. For example, if you want to store some files on server A
and others on server B, then mount servers A and B as network
filesystems (with e.g. sshfs, Samba, NFS, etc), and use symlinks to
point subdirectories at either server A or B.
A MediaAsset represents an image or video file uploaded to Danbooru. It
stores the metadata associated with the image or video. This is to work
on decoupling files from posts so that images can be uploaded separately
from posts.
Allow moderators to search `disapproved:<username>` with any user.
Before mods could only search for their own disapprovals, even though
they could see disapprovals by others.
Fix an exploit that let you determine the flagger of a post using
`flagger:<username>` saved searches. Saved searches were performed as
DanbooruBot, but since DanbooruBot is a moderator, it let unprivileged
users do `flagger:<username>` searches. Saved searches were done as a
moderator to avoid tag limits, but this is no longer necessary since the
last PostQueryBuilder refactor.
fred get out
Fix exception during https://danbooru.donmai.us/posts/random?tags=ordfav:nonamethanks
Before we were doing a query like this:
SELECT
"posts".*
FROM
"posts"
INNER JOIN
"favorites" ON "favorites"."post_id" = "posts"."id"
WHERE
(favorites.user_id % 100 = 64 AND favorites.user_id = 52664)
AND "posts"."id" = 343894
ORDER BY
favorites.id DESC,
posts.id DESC,
ID=343894 DESC
but `ID=? DESC` is ambiguous during an ordfav: search because of the
join on the favorites table. The fix is to qualify the reference as
`posts.id`.
Pundit 2.1.1 changed it so that if the first argument to `authorize` is
an Array, then the `authorize` call returns the last element of the
array. This broke order:random, because in that case we returned an
Array of posts. The fix is to return an ActiveRecord::Relation of posts,
which is more correct anyway.