Commit Graph

2456 Commits

Author SHA1 Message Date
evazion
595e02ab45 posts: add duration:<x> and order:duration metatags.
Add duration:<x> and order:duration metatags for searching animated
posts by duration.

https://danbooru.donmai.us/posts?tags=animated+duration:<5.0
https://danbooru.donmai.us/posts?tags=animated+duration:>60
https://danbooru.donmai.us/posts?tags=animated+order:duration
2021-10-07 03:21:08 -05:00
evazion
2595f18b2f posts: fix calculation of animated PNG duration.
Fix certain animated PNGs returning NaN as the duration because the
frame rate was being reported as "0/0" by FFMpeg. This happens when the
animation has zero delay between frames. This is supposed to mean a PNG
with an infinitely fast frame rate, but in practice browsers limit it to
around 10FPS. The exact frame rate browsers will use is unknown and
implementation defined.
2021-10-06 21:04:36 -05:00
evazion
950bc608c2 maintenance: add job to check for database corruption.
Add a job to run pg_amcheck hourly to check for corrupt database indexes.

https://www.postgresql.org/docs/14/app-pgamcheck.html
2021-10-06 08:08:52 -05:00
nonamethanks
45313c56a6 Lofter: fix tag extraction 2021-10-04 14:21:07 +02:00
evazion
627e8e1013 cron: fix undefined variable in exception handler (again)
Fixup for a1d4408c2.
2021-09-29 06:04:28 -05:00
evazion
4a525c7473 cron: fix maintenance tasks failing to run.
Fix maintenance tasks failing to run in production. In production they
were losing the database connection and not re-establishing it, so they
couldn't queue jobs. `ApplicationRecord.verify!` will check if the
connection is lost and re-establish it if it is.

The database connection was being lost because in production we use a
Kubernetes service IP for the database IP, which is essentially a
virtual IP that maps to the real IP. This mapping is implemented with
IPVS[1][2], which has a default idle connection timeout of 5 minutes. If
the connection isn't used for more than 5 minutes, then it's closed.
Since maintenance only runs once an hour, the database connection would
be lost because it was idle for too long.

1: https://kubernetes.io/docs/concepts/services-networking/service/#proxy-mode-ipvs
2: https://kubernetes.io/blog/2018/07/09/ipvs-based-in-cluster-load-balancing-deep-dive/
2021-09-28 18:06:57 -05:00
evazion
a1d4408c29 cron: fix undefined variable in exception handler. 2021-09-28 16:34:05 -05:00
evazion
126046cb69 posts: remove rating, note, and status locks.
Remove the ability for users to lock ratings, note, and post statuses.

Historically the majority of locked posts were from 10+ years ago when
certain users habitually locked ratings and notes on every post they
touched for no reason. Nowadays most posts have been unlocked. Only a
handful of locked posts are left, none of which deserve to be locked.

The is_rating_locked, is_note_locked, and is_status_locked columns still
exist in the database, but aren't used.
2021-09-27 22:32:30 -05:00
evazion
738be825ff twitter: include artist name in source URLs on post pages.
Show Twitter sources on post pages like this:

    https://twitter.com/BOW999/status/1261877313349640194

and not like this:

    https://twitter.com/i/web/status/1261877313349640194

We originally removed the artist name because the link would be broken
when the artist changed their name. This is no longer the case.
2021-09-27 11:07:25 -05:00
evazion
3835a63771 cron: fix tasks not being logged.
Fix maintenance tasks not being logged by the cron daemon because
in production the log level defaulted to `error` instead of `info`.
2021-09-27 10:43:06 -05:00
evazion
0e901b2f84 media file: get duration of animated GIFs, PNGs, and ugoiras.
Add methods to MediaFile to calculate the duration, frame count, and
frame rate of animated GIFs, PNGs, Ugoiras, and videos.

Some considerations:

* It's possible to have a GIF or PNG that's technically animated but
  just has one frame. These are treated as non-animated images.

* It's possible to have an animated GIF that has an unspecified
  frame rate. In this case we assume the frame rate is 10 FPS; this is
  browser dependent and may not be correct.

* Animated GIFs, PNGs, and Ugoiras all support variable frame rates.
  Technically, each frame has a separate delay, and the delays can be
  different frame-to-frame. We report only the average frame rate.

* Getting the duration of an APNG is surprisingly hard. Most tools don't
  have good support for APNGs since it's a rare and non-standardized
  format. The best we can do is get the frame count using ExifTool and the
  frame rate using ffprobe, then calculate the duration from that.
2021-09-27 05:18:25 -05:00
evazion
f8d52e6758 /status: add more information to /status page.
Add the following:

* Container name, machine name, worker id.
* Container uptime, puma uptime, worker uptime.
* Number of requests processed by current worker.
* ExifTool version.

Also change /status page to show information in tables instead of lists.
2021-09-26 23:11:08 -05:00
evazion
52bf4a3a6b maintenance: break maintenance tasks into individual jobs.
Break the hourly/daily/weekly/monthly maintenance tasks down into
individual delayed jobs. This way if one task fails, it won't prevent
other tasks from running. Also, jobs can be run in parallel, and can be
individually retried if they fail.
2021-09-26 20:38:30 -05:00
evazion
ab3f35580f metadata: move metadata parsing into ExifTool::Metadata.
Move the metadata parsing code from MediaAsset to ExifTool::Metadata so
we can use it outside the context of a MediaAsset, in particular when
dealing with a MediaFile that hasn't been saved to disk yet.
2021-09-26 07:19:36 -05:00
evazion
6f02918676 Merge pull request #4887 from nonamethanks/update-artist-finder
Update artist finder blacklist
2021-09-24 08:54:48 -05:00
nonamethanks
16f76fe396 Update artist finder blacklist 2021-09-24 14:50:19 +02:00
evazion
74b03a7bd0 posts: fix incorrect exif rotation for PNGs.
Fix a bug where where PNG images could be incorrectly detected as
exif-rotated. This would happen when a PNG contained the
IFD0:Orientation flag. It's technically possible for a PNG to contain
this flag, but it's ignored by libvips and by browsers.

post #3762340 (nsfw) is an example of a PNG like this.

The fix is to use `autorot` to let libvips apply the rotation instead of
trying to interpret the exif data ourselves. Note that libvips-8.9 has a
bug where it doesn't strip the orientation flag after applying
`autorot`, which leads to the image being incorrectly rotated a second
time when generating the thumbnail. Use libvips-8.11 instead.
2021-09-23 00:10:00 -05:00
evazion
b378785582 Fix #3692: Rotate pictures based on metadata
Rotate the image based on the EXIF orientation flag when generating
thumbnails and samples.

Also fix the width and height to be calculated correctly for rotated
images. Vips gives us the unrotated width and height of the image; we
have to detect whether the image is rotated and swap the width and
height manually to correct them. For example, if an image with the
"Rotate 90 CW" flag is 100x500 before rotation, then after rotation it's
500x100. This should fix #4883 (Exif rotation breaks Javascript fit-to-window)

We also have to fix it so that regenerating a post updates the width and
height of the post, in the event that it's a rotated image.

Finally we set `image-orientation: from-image;` even though it's
probably not necessary.
2021-09-22 11:12:50 -05:00
evazion
8738d8a645 Remove deletion appeal thread updater.
It was already disabled before, remove it completely now.
2021-09-22 01:57:45 -05:00
evazion
d5981754c4 posts: automatically tag animated_gif & animated_png on tag edit.
Automatically tag animated_gif and animated_png when a post is edited.
Add them back if the user tries to remove them from an animated post,
or remove them if the user tries to add them to a non-animated post.

Before we added these tags at upload time, but it was possible for users
to remove them after upload, or to incorrectly add them to non-animated
posts. They were added at upload time because we couldn't afford to open
the file and parse the metadata on every tag edit. Now that we save the
metadata in the database, we can do this.

This also makes it so you can't tag ugoira on non-ugoira files.

Known bug: it's possible to have an animated GIF where every frame is
identical. Post #3770975 is an example. This will be detected as an
animated GIF even though visually it doesn't appear to be animated.

Fixes #4041: Animated_gif tag not added to preprocessed uploads
2021-09-21 08:26:02 -05:00
evazion
4d0580b160 MediaFile: remove old libvips compatibility code.
Remove code for working with older versions of libvips. This makes
libvips 8.10+ a hard requirement. Older versions were already broken and
failed certain tests in the test suite.
2021-09-21 07:47:48 -05:00
evazion
ae7d964bf1 MediaFile: replace APNGInspector with ExifTool.
Replace our own handwritten APNG parser with ExifTool. This makes
ExifTool a hard requirement for handling APNGs.
2021-09-21 07:47:45 -05:00
evazion
d854bf6b53 BURs: update posts in parallel.
When processing an alias, rename, implication, mass update, or nuke,
update the posts in parallel. This means that if we alias foo to bar,
for example, then we use four processes at once to retag the posts from
foo to bar.

This doesn't mean that if we have two aliases in a BUR, we process both
aliases in parallel. It simply means that when processing an alias, we
update the posts in parallel for that alias.
2021-09-20 01:12:14 -05:00
evazion
21f0c2acc3 BURs: add processing and failed states.
When a BUR is approved, put it in a `processing` state. After it
successfully finishes processing, put it in the `approved` state. If it
fails processing, put it in the `failed` state.

If approving the BUR fails with a validation error, for example because
the alias already exists or an implication lacks a wiki, then leave the
BUR in the `pending` state. The `failed` state is only for unexpected
errors during processing.
2021-09-20 01:12:14 -05:00
evazion
9ba84efc07 BURs: process BURs sequentially in a single job.
Change the way BURs are processed. Before, we spawned a background job
for each line of the BUR, then processed each job sequentially. Now, we
process the entire BUR sequentially in a single background job.

This means that:

* BURs are truly sequential now. Before certain things like removing
  aliases weren't actually performed in a background job, so they were
  performed out-of-order before everything else in the BUR.

* Before, if an alias or implication line failed, then subsequent alias
  or implication lines would still be processed. This was because each
  alias or implication line was queued as a separate job, so a failure
  of one job didn't block another. Now, if any alias or implication
  fails, the entire BUR will fail and stop processing after that line.
  This may be good or bad, depending on whether we actually need the BUR
  to be processed in order or not.

* Before, BURs were processed inside a database transaction (except for the
  actual updating of posts). Now they're not. This is because we can't
  afford to hold transactions open while processing long-running aliases
  or implications. This means that if BUR fails in the middle when it is
  initially approved, it will be left in a half-complete state. Before
  it would be rolled back and left in a pending state with no changes
  performed.

* Before, only one BUR at a time could be processed. If multiple BURs
  were approved at the same time, then they would queue up and be
  processed one at a time. Now, multiple BURs can be processed at the
  same time. This may be undesirable when processing large BURs, or BURs
  that must be approved in a specific order.

* Before, large tag category changes could time out. This was because
  they weren't actually performed in a background job. Now they are, so
  they shouldn't time out.
2021-09-20 01:12:14 -05:00
Lily
22430f2ec1 Update post_embed.rb 2021-09-17 18:39:56 -03:00
evazion
313257b771 posts: add exif:<value> search metatags.
Examples:

* https://danbooru.donmai.us/posts?tags=exif:File:ColorComponents
* https://danbooru.donmai.us/posts?tags=exif:GIF:GIFVersion
* https://danbooru.donmai.us/posts?tags=exif:PNG:ColorType

* https://danbooru.donmai.us/posts?tags=exif:PNG:ColorType=RGB
* https://danbooru.donmai.us/posts?tags=exif:GIF:GIFVersion=89a
* https://danbooru.donmai.us/posts?tags=exif:File:ColorComponents=3
2021-09-16 02:13:15 -05:00
evazion
ea6e47125e metadata: add ability to search exif metadata.
Usage:

* https://danbooru.donmai.us/media_metadata?search[has_metadata]=true
* https://danbooru.donmai.us/media_metadata?search[has_metadata]=false
* https://danbooru.donmai.us/media_metadata?search[metadata_has_key]=GIF:GIFVersion
* https://danbooru.donmai.us/media_metadata?search[metadata][GIF:GIFVersion]=89a
* https://danbooru.donmai.us/media_metadata?search[metadata][GIF:GIFVersion]&search[metadata][GIF:BackgroundColor]=0
2021-09-16 00:25:21 -05:00
evazion
9cc8d8aa4a metadata: add CLI script for printing image metadata
Add a utility script for printing image metadata from the command line.

Usage: `bin/lsmetadata 1.jpg 2.jpg`
2021-09-15 21:39:56 -05:00
evazion
822f72387e metadata: record metadata for corrupt files.
Bug: if ExifTool exited with status 1 because it thought the file was
corrupt, then we didn't record any of the metadata, even though it was
able to read most of it. It turns out there are thousands of posts with
minorly corrupt metadata that ExifTool is still able to read, but will
complain about.

Fix: ignore the exit code of ExifTool and always save whatever metadata
ExifTool is able to return. It will return an `ExifTool:Error` tag in
the event of errors.

Note that there are some (many?) files that are considered corrupt by
ExifTool but not by Vips, and vice versa. Probably because ExifTool only
parses the metadata while Vips only parses the image data.
2021-09-15 20:26:35 -05:00
evazion
34de3b4d18 Merge pull request #4879 from nonamethanks/fix-artist-name
Sources: fix artist_name not being caught in skeb and weibo
2021-09-14 05:39:06 -05:00
nonamethanks
a845477cba Sources: fix artist_name not being caught in skeb and weibo 2021-09-14 11:32:24 +02:00
nonamethanks
9a6a6e52ea Lofter: raise timeout for file download 2021-09-10 13:10:29 +02:00
evazion
55d00fc40c paginator: fix showing page 5000 when page count is unknown
Fix a bug where if you did a slow search that took too long to calculate
the page count, and you had 200 posts per page, then we would show page
5000 as the last page of the search.

This was because we were artificially returning 1,000,000 as the post
count to signal that the count timed out, but at 200 posts per page this
would show 5000 as the last page of the search.
2021-09-08 18:33:28 -05:00
evazion
3d660953d4 Add MediaMetadata model.
Add a model for storing image and video metadata for uploaded files.

Metadata is extracted using ExifTool. You will need to install ExifTool
after this commit. ExifTool 12.22 is the minimum required version
because we use the `--binary` option, which was added in this release.

The MediaMetadata model is separate from the MediaAsset model because
some files contain tons of metadata, and most of it is non-essential.
The MediaAsset model represents an uploaded file and contains essential
metadata, like the file's size and type, while the MediaMetadata model
represents all the other non-essential metadata associated with a file.

Metadata is stored as a JSON column in the database.

ExifTool returns all the file's metadata, not just the EXIF metadata.
EXIF is one of several types of image metadata, hence why we call
it MediaMetadata instead of EXIFMetadata.
2021-09-08 05:00:54 -05:00
evazion
fb5078836e Fix #4612: Input profile error with greyscale jpg images.
Fix a bug where generating thumbnails failed for certain images when
using libvips 8.10. Specifically, it failed for single-channel greyscale
images and four-channel CMYK images without an embedded color profile.
In these cases we specified an sRGB fallback profile, but under libvips
8.10 this failed because the sRGB profile was incompatible with
single-channel and four-channel images. Before libvips 8.10 this worked,
but as of 8.10 it's a hard error.

The way libvips handles fallback color profiles differs across versions,
so we have to use different arguments for different versions. In 8.7,
vips doesn't have builtin color profiles, so we have to specify our own
manually. In 8.9, it has builtin profiles, so we can omit the import
profile, but we're still required to set the export profile to sRGB,
otherwise it will leave CMYK images as CMYK when generating thumbnails.
In 8.10, we have to _not_ to set the import or export profile to sRGB,
otherwise it will fail with an incompatible profile error when it tries
to convert CMYK images to RGB.

The builtin sRGB profile used by libvips[1] is different than the one we
used previously[2]. The builtin one comes from LCMS[3], whereas ours
came from ArgyllCMS.[4] Not all sRGB profiles are created the same[5],
so this may result in some imperceptible differences in thumbnail
output. The ArgyllCMS profile was used before because it seemed to be
the best one[6], but realistically it probably doesn't matter.

1: https://github.com/libvips/libvips/blob/v8.10.6/libvips/colour/profiles/sRGB.icm
2: 906eec190d/config/sRGB.icm
3: https://www.littlecms.com/
4: https://www.argyllcms.com/
5: https://ninedegreesbelow.com/photography/srgb-profile-comparison.html
6: https://ninedegreesbelow.com/photography/srgb-profile-comparison.html#addendum
2021-09-06 23:04:26 -05:00
evazion
bd4665886f replacements: fix updater in replacement comments.
Second try at fixing replacement comments showing the wrong updater (826736caa).
2021-09-06 03:25:03 -05:00
evazion
d03b150180 BURs: fix tag nukes not removing antecedent implications.
Fix a bug where if A implied B, and A was nuked, then the A->B
implication wasn't removed.
2021-09-06 03:25:03 -05:00
evazion
4dcfd1d141 aliases/implications: log manual deletions by admins.
Log when an admin manually deletes an alias or implication outside of a
BUR. This is usually only necessary when a BUR is bugged.
2021-09-06 03:25:02 -05:00
evazion
28edd5a22a emails: hardcode nondisposable email list.
Hardcode the list of nondisposable email providers instead of making it
a config option. Also add a few new providers.

This was previously a config option to keep it secret, but there's not
much need for secrecy here.

A Restricted user's email must be on this list to unrestrict their
account. If a user is Restricted and their email is not in this list,
then it's assumed to be disposable and can't be used to unrestrict their
account even if they verify their email address.
2021-09-06 03:24:53 -05:00
evazion
19f01d4554 emails: update canonical domains list. 2021-09-05 17:56:24 -05:00
evazion
d7d3439d79 ffmpeg: generate smart previews as .png instead of .jpg
Generate smart previews as .png so we don't suffer recompression losses
when we convert the preview frame from a full size image down to a
150x150 .jpg thumbnail.
2021-09-05 07:53:12 -05:00
evazion
13f98c02e3 media file: fix overly large thumbnails for animated GIFs.
Fix regression in ef2857667 that caused animated GIFs and PNGs to
generate thumbnails that were larger than 150x150.

Also fix a bug with cropped previews not being generated for animated
GIFs and PNGs.
2021-09-05 07:53:00 -05:00
evazion
540a3e111a Replace streamio-ffmpeg library.
Replace the streamio-ffmpeg library with our own very thin FFmpeg wrapper.
2021-09-05 06:54:56 -05:00
evazion
ef28576673 Fix #3400: Smarter thumbnail generation for videos 2021-09-05 06:10:18 -05:00
evazion
8b85bbe8ea storage manager: remove 'hybrid' and 'match' manager.
Remove StorageManager::Hybrid and StorageManager::Match. These were used
to store uploads on different servers based on the post ID or file
sample type. This is no longer used in production because in hindsight
it's a lot more difficult to manage uploads when they're fragmented
across different servers.

If you need this, you can do tricks with network filesystems to get the
same effect. For example, if you want to store some files on server A
and others on server B, then mount servers A and B as network
filesystems (with e.g. sshfs, Samba, NFS, etc), and use symlinks to
point subdirectories at either server A or B.
2021-09-03 22:45:27 -05:00
evazion
b068c113a8 Add MediaAsset model.
A MediaAsset represents an image or video file uploaded to Danbooru. It
stores the metadata associated with the image or video. This is to work
on decoupling files from posts so that images can be uploaded separately
from posts.
2021-09-02 06:07:52 -05:00
evazion
1bb383703d iqdb: allow searching images by iqdb hash.
Example:

    https://danbooru.donmai.us/iqdb_queries?search[hash]=iqdb_3fe4c0bd5a88fab53fbe5104efdc43ba3fa094416e6e039cf880f9f1fafafafffc80fd7bfd7dfd80fe72fe79fe80fefafefdfeffff75ff79ff7aff7dfff1fffb000100020003000400080082008a00940184018802000287028c028f030004000600078308001080ef80f800f8fdf900fafdfc00fc7dfcfafe00fe72fefdff7effdffff0fff8fffcfffd0007000f001f003e0080008f0106018d02800282038003810403040806800683070b078007870d800d830f801f00f800f96cfa76fb00fb7efbf4fc00fcfdfd00fd74fdfbfe78fe7efef5fef9fefbfefeff6cff76ff7efffcfffe00070080008300860087008b01030106018002820283028503800402040a0502058b0d80

The hash may be obtained from a regular IQDB search, or by calculating
it yourself (an exercise for the reader).
2021-09-02 03:37:36 -05:00
evazion
ed600f4829 iqdb: fix non-jpeg files not working with direct file upload.
Convert non-JPEGs to JPEG before sending them to IQDB.
2021-09-02 03:16:04 -05:00
evazion
19c0027d1f hentai foundry: fix 'Document tree depth exceeded' when parsing commentaries.
Fix a regression in 38c9559fe that caused #4657 to fail again.
2021-09-01 01:40:01 -05:00