Commit Graph

21 Commits

Author SHA1 Message Date
evazion
e935f01358 uploads: fix temp files not being cleaned up quickly enough.
Fix temp files generated during the upload process not being cleaned up quickly enough. This included
downloaded files, generated preview images, and Ugoira video conversions.

Before we relied on `Tempfile` cleaning up files automatically. But this only happened when the
Tempfile object was garbage collected, which could take a long time. In the meantime we could have
hundreds of megabytes of temp files hanging around.

The fix is to explicitly close temp files when we're done with them. But the standard `Tempfile`
class doesn't immediately delete the file when it's closed. So we also have to introduce a
Danbooru::Tempfile wrapper that deletes the tempfile as soon as it's closed.
2022-11-15 18:50:50 -06:00
evazion
0c1e9a1618 Add Danbooru::Archive library for handling .zip and .rar files.
Introduce a new Danbooru::Archive library. This is a wrapper around libarchive that lets us extract
.zip, .rar, .7z, and other archive formats. Replace the rubyzip library in MediaFile::Ugoira with
the new Danbooru::Archive library.

This is a step towards fixing #5340: Add support for extracting archive attachments from certain sources.

This adds a new dependency on libarchive. Downstream users should `apt-get install libarchive13` if
they're not using Docker.

https://github.com/chef/ffi-libarchive
https://github.com/libarchive/libarchive
https://www.rubydoc.info/gems/ffi-libarchive/0.4.2
https://github.com/libarchive/libarchive/wiki/Examples#a-complete-extractor
2022-11-14 20:14:37 -06:00
evazion
3172031caa media assets: track corrupted files in media metadata.
If a media asset is corrupt, include the error message from libvips or
ffmpeg in the "Vips:Error" or "FFmpeg:Error" fields in the media
metadata table.

Corrupt files can't be uploaded nowadays, but they could be in the past,
so we have some old corrupted files that we can't generate thumbnails
for. This lets us mark these files in the metadata so they're findable
with the tag search `exif:Vips:Error`.

Known bug: Vips has a single global error buffer that is shared between
threads and that isn't cleared between operations. So we can't reliably
get the actual error message because it may pick up errors from other
threads, or from previous operations in the same thread.
2022-11-02 20:48:15 -05:00
evazion
48ecb80d6b Fix #5230: video upload 500 error (StatementInvalid) & empty error panel on page
Fix StatementInvalid exception when uploading https://files.catbox.moe/vxoe2p.mp4.

This was a result of multiple bugs:

* First, generating thumbnails for the video failed. This was because
  the video uses the AV1 codec, which FFmpeg failed to decode. It failed
  because our version of FFmpeg was built without the `--enable-libdav1d`
  flag, so it uses the builtin AV1 decoder, which apparently can't
  handle this particular video (it spews a bunch of errors about "Failed
  to get pixel format" and "missing sequence header" and "failed to get
  reference frame").

* Because generating the thumbnails failed, an exception was raised. We
  tried to save the error message in the upload_media_assets.error
  field. However, this also failed because the error message was 77kb
  long (it contained the entire output of the ffmpeg command), but the
  `upload_media_assets` table had a btree index on the `error` column,
  which meant the maximum length of the error column was limited to
  ~2.7kb. This lead to a StatementInvalid exception being raised.

* Because the StatementInvalid exception was raised while we were trying
  to set the upload media asset's status to `failed`, the upload was
  left stuck in the `processing` state rather than being set to the
  `failed` state.

* Because the upload was stuck in the `processing` state, the upload
  page would hang forever waiting for the upload to complete.

The fixes are to:

* Build FFmpeg with `--enable-libdav1d` to use libdav1d for decoding AV1
  videos instead of the builtin AV1 decoder.

* Remove the index on the `upload_media_assets.error` column so that
  setting overly long error messages won't fail.

* Catch unexpected exceptions in ProcessUploadMediaAssetJob so we can
  mark uploads as failed, even if `process_upload!` itself fails because
  it raises an unexpected exception inside its own exception handler.

* Check that the video is playable with `MediaFile::Video#is_corrupt?` before
  allowing it to be uploaded. This way we can return a better error
  message if we can't generate thumbnails because the video isn't
  playable. This requires decoding the entire video, so it means uploads
  may take several seconds longer for long videos. It's also a security
  risk in case ffmpeg has any bugs.

* Define `MediaAsset#preview!` as raising an exception on error, so
  it's clear that generating thumbnails can fail. Define `MediaAsset#preview`
  as returning nil on error for when we don't care about the cause of
  the error.
2022-10-26 22:49:55 -05:00
evazion
c2adf279ee ugoira: remove the PixivUgoiraFrameData model.
Remove the last remaining uses of the PixivUgoiraFrameData model. As of
32bfb8407, Ugoira frame data is now stored in the MediaMetadata model,
under the `Ugoira:FrameDelays` EXIF field.

The pixiv_ugoira_frame_data table still exists, but it can be removed
after this commit is deployed.

Fixes #5264: Error when replacing with ugoira.
2022-10-10 18:21:30 -05:00
evazion
01d10a54f8 ugoira: store frame delays in MediaMetadata model.
Store Ugoira frame delays in the MediaMetadata model as a fake EXIF
field instead of in the PixivUgoiraFrameData model. This way we can get
rid of the PixivUgoiraFrameData model completely. This is a step towards
fixing #5264.
2022-10-09 22:25:20 -05:00
evazion
99221af855 ugoiras: fix regression in 7031fd13d.
Fix `Cannot write log file 'ffmpeg2pass-0.log' for pass-1 encoding: Permission denied` error
when uploading ugoira files. Caused by the fact that 2-pass encoding tries to write a log file in
the current directory by default, which fails in production because the default working directory in
the Docker image is /danbooru, which is read-only.
2022-03-01 00:16:55 -06:00
evazion
7031fd13d7 ugoiras: encode .webm samples using VP9 instead of VP8.
Switch the codec for .webm samples from VP8 to VP9. All modern browsers
support VP9 (Safari was the last to add support in ~2020), so it should
be safe to provide only VP9 .webms without a fallback.

VP9 lets us use two-pass encoding, which should offer better compression.

Fixes ugoira samples still having poor quality even after 4c652cf3e.
4c652cf3e tried to remove the max bitrate limit by setting `-b:v 0`, but
this only worked in FFmpeg 4.2. In production Danbooru uses FFmpeg 4.4,
and apparently in 4.4 `-b:v 0` means "use the default max bitrate of
256kb/s" instead of "no bitrate limit".

https://trac.ffmpeg.org/wiki/Encode/VP9
https://developers.google.com/media/vp9/bitrate-modes
https://developers.google.com/media/vp9/settings/vod
http://wiki.webmproject.org/ffmpeg/vp9-encoding-guide
https://www.reddit.com/r/AV1/comments/k7colv/encoder_tuning_part_1_tuning_libvpxvp9_be_more/
2022-02-28 22:02:56 -06:00
evazion
4c652cf3ec ugoiras: increase quality of webm samples.
Fix certain ugoiras having very low quality webm samples. This was
because we had a target bitrate of 5 Mbps, but this wasn't enough for
videos that were high resolution or that had choppy, hard-to-compress
motion, such as post 5081776 (nsfw).
2022-02-23 02:50:14 -06:00
evazion
1c5786d20f posts: remove cropped thumbnails. 2021-12-16 15:58:29 -06:00
evazion
a7dc05ce63 Enable frozen string literals.
Make all string literals immutable by default.
2021-12-14 21:33:27 -06:00
evazion
dab31a3ef0 media assets: fix exception when generating thumbnails for videos/ugoiras. 2021-12-06 19:01:34 -06:00
evazion
bc506ed1b8 uploads: refactor to simplify ugoira-handling and replacements:
* Make it so replacing a post doesn't generate a dummy upload as a side effect.
* Make it so you can't replace a post with itself (the post should be regenerated instead).
* Refactor uploads and replacements to save the ugoira frame data when
  the MediaAsset is created, not when the post is created. This way it's
  possible to view the ugoira before the post is created.
* Make `download_file!` in the Pixiv source strategy return a MediaFile
  with the ugoira frame data already attached to it, instead of returning it
  in the `data` field then passing it around separately in the `context`
  field of the upload.
2021-10-18 05:18:46 -05:00
evazion
0e901b2f84 media file: get duration of animated GIFs, PNGs, and ugoiras.
Add methods to MediaFile to calculate the duration, frame count, and
frame rate of animated GIFs, PNGs, Ugoiras, and videos.

Some considerations:

* It's possible to have a GIF or PNG that's technically animated but
  just has one frame. These are treated as non-animated images.

* It's possible to have an animated GIF that has an unspecified
  frame rate. In this case we assume the frame rate is 10 FPS; this is
  browser dependent and may not be correct.

* Animated GIFs, PNGs, and Ugoiras all support variable frame rates.
  Technically, each frame has a separate delay, and the delays can be
  different frame-to-frame. We report only the average frame rate.

* Getting the duration of an APNG is surprisingly hard. Most tools don't
  have good support for APNGs since it's a rare and non-standardized
  format. The best we can do is get the frame count using ExifTool and the
  frame rate using ffprobe, then calculate the duration from that.
2021-09-27 05:18:25 -05:00
evazion
ef28576673 Fix #3400: Smarter thumbnail generation for videos 2021-09-05 06:10:18 -05:00
evazion
00ca7526bb docs: add remaining docs for classes in app/logical. 2021-06-24 01:31:41 -05:00
evazion
8a2ae91ff2 tests: skip video file tests if ffmpeg isn't installed. 2020-06-10 18:07:54 -05:00
evazion
f94c52478c media file: memoize expensive methods in subclasses. 2020-05-27 14:31:39 -05:00
evazion
364343453c uploads: factor out remaining image methods to MediaFile. 2020-05-19 02:42:19 -05:00
evazion
45064853de uploads: move thumbnail generation code to MediaFile.
* Move image thumbnail generation code to MediaFile::Image.
* Move video thumbnail generation code to MediaFile::Video.
* Move ugoira->webm conversion code to MediaFile::Ugoira.

This separates thumbnail generation from the upload process so that it's
possible to generate thumbnails outside of uploads.
2020-05-18 04:19:04 -05:00
evazion
e477232e02 uploads: factor out image dimension and filetype detection code.
* Add MediaFile abstraction. A MediaFile represents an image or video file.
* Move filetype detection and dimension parsing code from uploads to MediaFile.
2020-05-06 00:33:35 -05:00