Add a model for storing image and video metadata for uploaded files.
Metadata is extracted using ExifTool. You will need to install ExifTool
after this commit. ExifTool 12.22 is the minimum required version
because we use the `--binary` option, which was added in that release.
The MediaMetadata model is separate from the MediaAsset model because
some files contain tons of metadata, and most of it is non-essential.
The MediaAsset model represents an uploaded file and contains essential
metadata, like the file's size and type, while the MediaMetadata model
represents all the other non-essential metadata associated with a file.
Metadata is stored as a JSON column in the database.
ExifTool returns all the file's metadata, not just the EXIF metadata.
EXIF is one of several types of image metadata, which is why we call
it MediaMetadata instead of EXIFMetadata.
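As a rough sketch of the shape this takes (column and method names are
illustrative, not necessarily the actual schema):

    require "json"
    require "shellwords"

    # Essential metadata lives on MediaAsset; everything else ExifTool
    # reports goes into MediaMetadata's JSON column.
    class MediaAsset < ApplicationRecord
      has_one :media_metadata
    end

    class MediaMetadata < ApplicationRecord
      belongs_to :media_asset

      # `metadata` is the JSON column holding ExifTool's full output.
      def self.extract(file_path)
        # `exiftool -json` prints a JSON array with one object per file.
        json = `exiftool -json #{Shellwords.escape(file_path)}`
        JSON.parse(json).first
      end
    end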
Fix a bug where generating thumbnails failed for certain images when
using libvips 8.10. Specifically, it failed for single-channel greyscale
images and four-channel CMYK images without an embedded color profile.
In these cases we specified an sRGB fallback profile, but under libvips
8.10 this failed because the sRGB profile was incompatible with
single-channel and four-channel images. Earlier libvips versions
tolerated this mismatch, but as of 8.10 it's a hard error.
The way libvips handles fallback color profiles differs across versions,
so we have to use different arguments for different versions. In 8.7,
vips doesn't have builtin color profiles, so we have to specify our own
manually. In 8.9, it has builtin profiles, so we can omit the import
profile, but we're still required to set the export profile to sRGB,
otherwise it will leave CMYK images as CMYK when generating thumbnails.
In 8.10, we must _not_ set the import or export profile to sRGB,
otherwise it fails with an incompatible profile error when it tries
to convert CMYK images to RGB.
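In ruby-vips terms, the version switch might look roughly like this (a
sketch only, using the `thumbnail` operation's profile options; the
real code and profile path will differ):

    require "vips"

    # Placeholder path to our own sRGB profile, needed on libvips < 8.9.
    SRGB_PROFILE_PATH = "config/sRGB.icm"

    # Pick fallback profile options based on the libvips version, per
    # the rules described above.
    def profile_options
      if Vips.at_least_libvips?(8, 10)
        {} # 8.10+: let libvips choose fallback profiles itself.
      elsif Vips.at_least_libvips?(8, 9)
        { export_profile: "srgb" } # builtin profile; forces CMYK -> sRGB.
      else
        # 8.7: no builtin profiles; supply our own .icm file.
        { import_profile: SRGB_PROFILE_PATH, export_profile: SRGB_PROFILE_PATH }
      end
    end

    thumb = Vips::Image.thumbnail("input.jpg", 150, height: 150, **profile_options)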
The builtin sRGB profile used by libvips[1] differs from the one we
used previously[2]. The builtin one comes from LCMS[3], whereas ours
came from ArgyllCMS[4]. Not all sRGB profiles are created the same[5],
so this may result in some imperceptible differences in thumbnail
output. The ArgyllCMS profile was used before because it seemed to be
the best one[6], but realistically it probably doesn't matter.
1: https://github.com/libvips/libvips/blob/v8.10.6/libvips/colour/profiles/sRGB.icm
2: 906eec190d/config/sRGB.icm
3: https://www.littlecms.com/
4: https://www.argyllcms.com/
5: https://ninedegreesbelow.com/photography/srgb-profile-comparison.html
6: https://ninedegreesbelow.com/photography/srgb-profile-comparison.html#addendum
Account upgrades are now logged on the /user_upgrades page, so they
no longer need to be recorded as mod actions. The mod actions log should
be reserved for privileged actions performed by Builders and above.
Upgrade entries also tended to spam the log.
Fix regression in ef2857667 that caused animated GIFs and PNGs to
generate thumbnails that were larger than 150x150.
Also fix a bug where cropped previews were not generated for animated
GIFs and PNGs.
Remove StorageManager::Hybrid and StorageManager::Match. These were used
to store uploads on different servers based on the post ID or file
sample type. This is no longer used in production because in hindsight
it's a lot more difficult to manage uploads when they're fragmented
across different servers.
If you need this, you can do tricks with network filesystems to get the
same effect. For example, if you want to store some files on server A
and others on server B, then mount servers A and B as network
filesystems (using e.g. sshfs, Samba, or NFS) and use symlinks to
point subdirectories at either server A or B.
A MediaAsset represents an image or video file uploaded to Danbooru. It
stores the metadata associated with the image or video. This is a step
toward decoupling files from posts, so that images can be uploaded
separately from posts.
Allow moderators to search `disapproved:<username>` for any user.
Previously, mods could only search for their own disapprovals, even
though they could see disapprovals by others.
Fix an exploit that let you determine the flagger of a post using
`flagger:<username>` saved searches. Saved searches were performed as
DanbooruBot, but since DanbooruBot is a moderator, it let unprivileged
users do `flagger:<username>` searches. Saved searches were done as a
moderator to avoid tag limits, but this is no longer necessary since the
last PostQueryBuilder refactor.
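The gist of the fix, sketched in pseudocode (PostQueryBuilder's real
signature may differ):

    # Before (exploitable): saved searches ran with DanbooruBot's
    # moderator privileges, so flagger:<name> metatags were allowed.
    #
    # After: run the search as the user who owns it, and lift the tag
    # limit explicitly instead of inheriting it from a moderator.
    PostQueryBuilder.new(query, user, tag_limit: nil).posts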
Parse the user agent and log whether it seems like a known bot or a
human to NewRelic under the `user.bot` request attribute. This is so
that known bots can be filtered out of search traffic analytics. Bots
and search crawlers make up a significant portion of search traffic.
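For illustration, the logging could look roughly like this (the
`user.bot` attribute name is from this change; the regex and method
are stand-ins):

    require "new_relic/agent"

    BOT_REGEX = /bot|crawler|spider|Googlebot|bingbot|YandexBot/i

    # Tag the current NewRelic transaction with whether the request's
    # user agent looks like a known bot or crawler.
    def log_bot_attribute(user_agent)
      is_bot = user_agent.to_s.match?(BOT_REGEX)
      NewRelic::Agent.add_custom_attributes({ "user.bot" => is_bot })
    end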
Replace the old IQDB API client with a new client for the new forked
version of IQDB at https://github.com/danbooru/iqdb.
Changes:
* The /iqdb_queries endpoint now returns `hash` and `signature` fields.
The `signature` is the full decoded Haar signature, while the `hash`
is an encoded version of the signature.
* The /iqdb_queries endpoint no longer returns `width` and `height`
fields in the response (these were always 128x128).
* We no longer need the IQDBs frontend server; we now talk to the IQDB
instance directly.
* We no longer send add/remove image commands to IQDB through AWS SQS;
now we send them to IQDB directly. They are sent in a delayed job so
that uploading images is still possible if IQDB is down; the add image
commands simply queue up (see the sketch after this list).
* Fix a bug where regenerating an image's thumbnails didn't update
IQDB, because IQDB silently ignored add image commands when the image
already existed in the database.
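A sketch of the delayed add (the job and client names here are
illustrative, not the actual class names):

    # Enqueue instead of calling IQDB synchronously, so uploads still
    # succeed while IQDB is down; the queued jobs run once it's back.
    class IqdbAddImageJob < ApplicationJob
      def perform(post_id)
        IqdbClient.new.add(post_id) # hypothetical client for the IQDB server
      end
    end

    IqdbAddImageJob.perform_later(post.id)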
Make it so that when a user removes their own vote, the vote is soft
deleted (the is_deleted flag is set) instead of hard deleted.
Changes:
* Add is_deleted flag to comment votes.
* Relax the uniqueness constraint so you can have multiple deleted
votes on the same comment; you can still have only one active vote per
comment.
* Add `soft_delete` method to the Deletable concern (sketched below).
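Roughly, the concern and the relaxed constraint look like this (a
sketch; names follow the description above):

    module Deletable
      extend ActiveSupport::Concern

      # Mark the record deleted instead of destroying the row.
      def soft_delete
        update!(is_deleted: true)
      end
    end

    # Uniqueness applies only to active votes, so a user can vote again
    # after removing a vote, while deleted votes accumulate harmlessly.
    add_index :comment_votes, [:comment_id, :user_id],
      unique: true, where: "is_deleted = false"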
When a POST request returns a 302 redirect, follow the redirect with a
GET request instead of with a POST request.
HTTP standards leave it unspecified whether a POST request that returns
a 302 redirect should be followed with a GET or with a POST. A GET is
what most browsers use, which means it's what most servers expect.
Fixes the /tagme Discord command, which was broken because the POST
request that uploaded the image to DeepDanbooru returned a 302 redirect
that the server expected us to follow with a GET, not with a POST.
Ref:
* https://stackoverflow.com/questions/17605915/what-is-the-correct-behavior-expected-of-an-http-post-302-redirect-to-get
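As a sketch with the http gem (assuming that client; the URL and file
handling are illustrative):

    require "http"

    response = HTTP.post(url, form: { file: HTTP::FormData::File.new(path) })

    # Follow a 302 after a POST with a GET, like browsers do, instead
    # of replaying the POST against the redirect target.
    if response.status.redirect?
      response = HTTP.get(response.headers["Location"])
    end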
Always store original files in `public/data/original` instead of directly in
`public/data`. Previously this was optional and defaulted to off.
Downstream boorus will need to either move all images in the
`public/data` directory to `public/data/original`, or symlink the
`public/data/original` directory to the top-level `public/data` directory:
ln -s . /path/to/danbooru/public/data/original
This is to simplify the file layout. This option existed because in the
past we stored original files in different locations on different
servers (for no particular reason).
Overhaul bans.
Changes:
* Change the `expires_at` field to `duration` (see the sketch after
this list).
* Make moderators choose from a fixed set of standard ban lengths,
instead of allowing arbitrary ban lengths.
* List `duration` in seconds in the /bans.json API.
* Dump bans to BigQuery.
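A hypothetical migration sketch for the field change (not the actual
migration):

    # Backfill duration (in seconds) from the old expires_at column.
    # For the 2013-era bans noted below, expires_at predates created_at,
    # yielding a negative duration.
    Ban.find_each do |ban|
      ban.update_columns(duration: ban.expires_at - ban.created_at)
    end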
Note that some old bans have a negative duration. Their expiration
date was before their creation date because bans were migrated to
Danbooru 2 in 2013 and the original ban creation dates were lost.
* Export daily public database dumps to BigQuery and Google Cloud Storage.
* Only data visible to anonymous users is exported. Some tables have
null or missing fields because of this.
* The bans table is excluded because some bans have an expires_at
timestamp set beyond year 9999, which BigQuery doesn't support.
* The favorites table is excluded because it's too slow to dump (it
doesn't have an id index, which is needed by find_each; see the sketch
after this list).
* Version tables are excluded because dumping them every day is
inefficient; streaming insertions should be used instead.
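For illustration, the dump loop looks something like this (the file
name and batch size are placeholders):

    # find_each batches rows by primary key, which is why tables
    # without an id index (like favorites) are too slow to dump.
    File.open("posts.json", "w") do |file|
      Post.find_each(batch_size: 5_000) do |post|
        file.puts(post.to_json)
      end
    end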
Links:
* https://console.cloud.google.com/bigquery?project=danbooru1
* https://console.cloud.google.com/storage/browser/danbooru_public
* https://storage.googleapis.com/danbooru_public/data/posts.json