Fix temp files generated during the upload process not being cleaned up quickly enough. This included
downloaded files, generated preview images, and Ugoira video conversions.
Previously, we relied on `Tempfile` to clean up files automatically, but this only happened
when the `Tempfile` object was garbage collected, which could take a long time. In the
meantime we could have hundreds of megabytes of temp files hanging around.
The fix is to explicitly close temp files when we're done with them. But the standard
`Tempfile` class doesn't immediately delete the file when it's closed, so we also introduce
a `Danbooru::Tempfile` wrapper that deletes the tempfile as soon as it's closed.
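A minimal sketch of the idea behind the wrapper (the real `Danbooru::Tempfile` may differ):
Ruby's `Tempfile#close` already accepts an `unlink_now` flag, so the wrapper just defaults
it to true.

    require "tempfile"

    module Danbooru
      # A Tempfile that deletes the underlying file as soon as it's closed,
      # instead of waiting for the object to be garbage collected.
      class Tempfile < ::Tempfile
        def close(unlink_now = true)
          super(unlink_now)
        end
      end
    end

    file = Danbooru::Tempfile.new("download")
    file.write("data")
    file.close # the file is removed from disk immediately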
Automatically add the `sound` tag if the post has sound. Remove the tag if the post doesn't have sound.
A video is considered to have sound if its peak loudness is greater than -70 dB. The current quietest post
on Danbooru has a peak loudness of -62 dB (post #3470668), but sound can still be audible at -80 dB
or even lower. It's hard to draw a clear line between "silent" and "barely audible".
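A sketch of how such a check might work (not necessarily Danbooru's exact command):
FFmpeg's volumedetect filter reports the peak volume on stderr.

    require "open3"

    # Returns the peak loudness in dB, or nil if the file has no audio stream.
    def peak_loudness(path)
      _out, err, _status = Open3.capture3("ffmpeg", "-i", path, "-vn", "-af", "volumedetect", "-f", "null", "-")
      err[/max_volume: (-?[\d.]+) dB/, 1]&.to_f
    end

    loudness = peak_loudness("video.mp4")
    has_sound = !loudness.nil? && loudness > -70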
If a media asset is corrupt, include the error message from libvips or
ffmpeg in the "Vips:Error" or "FFmpeg:Error" fields in the media
metadata table.
Corrupt files can't be uploaded nowadays, but they could be uploaded in the past, so we
have some old corrupted files that we can't generate thumbnails for. This lets us mark
these files in the metadata so they're findable with the tag search `exif:Vips:Error`.
Known bug: Vips has a single global error buffer that is shared between
threads and that isn't cleared between operations. So we can't reliably
get the actual error message because it may pick up errors from other
threads, or from previous operations in the same thread.
Add a `MediaAsset#regenerate!` method that regenerates everything about
the asset, including the metadata, thumbnails, IQDB, cached Cloudflare
URLs, and AI tags.
Fix it so that a) it's possible to regenerate media assets that aren't attached to posts,
and b) regenerating a post regenerates everything. Before, it didn't regenerate the
metadata, AI tags, or all of the cached URLs.
Fix it so that trying to regenerate AI tags for a Flash file doesn't
fail because Flash files have no image preview.
Also let `MediaFile.open` take a block argument.
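Presumably the block form works like Ruby's `File.open`, closing the file when the block
returns. A sketch (the `width`/`height` accessors are assumed):

    MediaFile.open("image.jpg") do |file|
      puts "#{file.width}x#{file.height}"
    end # the file is closed automatically here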
Fix certain corrupt GIFs returning dimensions of 0x0. This happened
when the GIF was too corrupt for libvips to read. Fixed by using
ExifTool to read the dimensions instead.
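A sketch of the fallback (the helper name is hypothetical); ExifTool can often recover
dimensions from files too corrupt for libvips to decode:

    require "json"
    require "shellwords"

    def dimensions_via_exiftool(path)
      json = `exiftool -json -ImageWidth -ImageHeight #{Shellwords.escape(path)}`
      data = JSON.parse(json).first
      [data["ImageWidth"].to_i, data["ImageHeight"].to_i]
    end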
Also add validations to ensure that it's not possible to have media
assets with a width or height of 0.
Add a script to go through every media asset and check the metadata
(width, height, duration, filesize, md5, EXIF metadata) and update it
if it's changed. This is necessary after upgrading ExifTool because the
metadata it returns may have changed.
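The core loop of such a script might look like this (a sketch; the file accessor and
column names are assumptions):

    MediaAsset.find_each do |asset|
      MediaFile.open(asset.original_file_path) do |file| # hypothetical accessor
        asset.update!(
          image_width: file.width,
          image_height: file.height,
          duration: file.duration,
          file_size: file.file_size
        )
      end
    end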
Fix StatementInvalid exception when uploading https://files.catbox.moe/vxoe2p.mp4.
This was a result of multiple bugs:
* First, generating thumbnails for the video failed. This was because
the video uses the AV1 codec, which FFmpeg failed to decode. It failed
because our version of FFmpeg was built without the `--enable-libdav1d`
flag, so it uses the builtin AV1 decoder, which apparently can't
handle this particular video (it spews a bunch of errors about "Failed
to get pixel format" and "missing sequence header" and "failed to get
reference frame").
* Because generating the thumbnails failed, an exception was raised. We
tried to save the error message in the upload_media_assets.error
field. However, this also failed because the error message was 77kb
long (it contained the entire output of the ffmpeg command), but the
`upload_media_assets` table had a btree index on the `error` column,
which meant the maximum length of the error column was limited to
~2.7kb. This led to a StatementInvalid exception being raised.
* Because the StatementInvalid exception was raised while we were trying
to set the upload media asset's status to `failed`, the upload was
left stuck in the `processing` state rather than being set to the
`failed` state.
* Because the upload was stuck in the `processing` state, the upload
page would hang forever waiting for the upload to complete.
The fixes are to:
* Build FFmpeg with `--enable-libdav1d` to use libdav1d for decoding AV1
videos instead of the builtin AV1 decoder.
* Remove the index on the `upload_media_assets.error` column so that
setting overly long error messages won't fail.
* Catch unexpected exceptions in ProcessUploadMediaAssetJob so we can
mark uploads as failed, even if `process_upload!` raises an unexpected
exception inside its own exception handler (see the sketch after this list).
* Check that the video is playable with `MediaFile::Video#is_corrupt?` before
allowing it to be uploaded. This way we can return a better error
message if we can't generate thumbnails because the video isn't
playable. This requires decoding the entire video, so it means uploads
may take several seconds longer for long videos. It's also a security
risk in case ffmpeg has any bugs.
* Define `MediaAsset#preview!` as raising an exception on error, so
it's clear that generating thumbnails can fail. Define `MediaAsset#preview`
as returning nil on error for when we don't care about the cause of
the error.
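A minimal sketch of the job-level safety net from the third fix above (attribute names
are assumptions, not Danbooru's actual code):

    class ProcessUploadMediaAssetJob < ApplicationJob
      def perform(upload_media_asset)
        upload_media_asset.process_upload!
      rescue Exception => error
        # Even if process_upload!'s own error handling blew up, make sure
        # the asset doesn't stay stuck in the `processing` state.
        upload_media_asset.update_columns(status: "failed", error: error.message)
        raise
      end
    end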
Add a JPEG conversion for .avif and .webp files. The `full` variant is
the .avif or .webp file converted to JPEG format, with the same
resolution as the original file (full resolution).
Known bug: When converting an HDR .avif file to .jpeg, the resulting
image is too bright compared to the original image as rendered by
Firefox or Chrome.
Add ability to upload .webp images.
Animated WebP images aren't supported. This is because they aren't
supported by FFmpeg yet[1], so generating thumbnails and samples for
them would be more complicated than for other formats.
[1]: https://trac.ffmpeg.org/ticket/4907
Add ability to upload .avif images.
Features of AVIF include:
* Lossless and lossy compression.
* High dynamic range (HDR) images.
* Wide color gamut images (i.e. 10- and 12-bit color depths).
* Transparency (through alpha planes).
* Animations (with an optional cover image).
* Auxiliary image sequences, where the file contains a single primary
image and a short secondary video, like Apple's Live Photos.
* Metadata rotation, mirroring, and cropping.
The AVIF format is still relatively new and some of these features aren't well
supported by browsers or other software:
* Animated AVIFs aren't supported by Firefox or by libvips.
* HDR images aren't supported by Firefox.
* Rotated, mirrored, and cropped AVIFs aren't supported by Firefox or Chrome.
* Image grids, where the file contains multiple images that are tiled
together into one big image, aren't supported by Firefox.
* AVIF as a whole has only been supported for a year or two by Chrome
and Firefox, and less than a year by Safari.
For these reasons, only basic AVIFs that don't use animation, rotation,
cropping, or image grids can be uploaded.
Add config options to customize where uploads are stored, and how image URLs are generated.
* Add `media_asset_file_path` option to customize where uploads are stored.
* Add `media_asset_file_url` option to customize how image URLs are generated.
* Remove the `enable_seo_post_urls` config option. The `media_asset_file_url` option
should be used instead to include the tags in the image URL.
* Replace the "Download" placeholder thumbnail for Flash files with a
new placeholder that specifically says it's a Flash file.
* Fix a bug where the Flash placeholder thumbnail was too small when
using larger thumbnail sizes.
* Fix it so that media assets don't falsely consider Flash files to have
thumbnails. This could potentially cause errors if someone tried to
expunge, replace, or regenerate a Flash post.
Remove the last remaining uses of the PixivUgoiraFrameData model. As of
32bfb8407, Ugoira frame data is now stored in the MediaMetadata model,
under the `Ugoira:FrameDelays` EXIF field.
The pixiv_ugoira_frame_data table still exists, but it can be removed
after this commit is deployed.
Fixes #5264: Error when replacing with ugoira.
Automatically add the AI-generated tag to posts that have the
`PNG:Software=NovelAI` EXIF attribute.
This is not foolproof because this metadata may get removed if an
AI-generated post is resaved or uploaded to a site that strips EXIF
metadata. It also only works for NovelAI. Currently it detects 29 out of
177 AI-generated uploads on Danbooru.
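A sketch of the check (the metadata accessor is an assumption):

    class MediaAsset
      def ai_generated?
        media_metadata.metadata["PNG:Software"] == "NovelAI"
      end
    end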
Add ability to search the /media_assets index by AI tags. Multi-tag
searches are supported, including AND/OR/NOT operators, but metatags
aren't supported. Multi-tag searches will probably be slow.
The default AI tag confidence threshold is 50%. There's a hidden
search[min_score] URL param that lets you change this.
Add a database model for storing AI-predicted tags, and add a UI for browsing and searching these tags.
AI tags are generated by the Danbooru Autotagger (https://github.com/danbooru/autotagger). See that
repo for details about the model.
The database schema is `ai_tags (media_asset_id integer, tag_id integer, score smallint)`. This is
designed to be as space-efficient as possible, since in production we have over 300 million
AI-generated tags (6 million images × 50 tags per image). This amounts to over 10GB in size, plus
indexes.
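A sketch of the migration implied by this schema (the index layout is an assumption;
`limit: 2` maps to a PostgreSQL smallint):

    class CreateAiTags < ActiveRecord::Migration[7.0]
      def change
        create_table :ai_tags, id: false do |t|
          t.integer :media_asset_id, null: false
          t.integer :tag_id, null: false
          t.integer :score, limit: 2, null: false
        end

        add_index :ai_tags, [:tag_id, :media_asset_id]
      end
    end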
You can search for AI tags using e.g. `ai:scenery`. You can do `ai:scenery -scenery` to find posts
where the scenery tag is potentially missing, or `scenery -ai:scenery` to find posts that are
potentially mistagged (or more likely where the AI missed the tag).
You can browse AI tags at https://danbooru.donmai.us/ai_tags. On this page you can filter by
confidence level. You can also search unposted media assets by AI tag.
To generate tags, use the `autotag` script from the Autotagger repo, something like this:
docker run --rm -v ~/danbooru/public/data/360x360:/images ghcr.io/danbooru/autotagger ./autotag -c -f /images | gzip > tags.csv.gz
To import tags, use the fix script in script/fixes/. Expect a Danbooru-size dataset to take
hours to days to generate tags, then 20-30 minutes to import. Currently this all has to be done by hand.
Add `file_key` and `is_public` columns to media assets.
`file_key` is a random 9-character base-62 string that will be used as the image filename
in the future.
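A sketch of how such a key might be generated (the actual generator may differ):

    require "securerandom"

    BASE62 = [*"0".."9", *"A".."Z", *"a".."z"].freeze

    def generate_file_key
      Array.new(9) { BASE62[SecureRandom.random_number(62)] }.join
    end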
`is_public` indicates whether the image can be viewed without authentication.
Users running downstream boorus must run `bin/rails db:migrate` and
`script/fixes/109_generate_media_asset_file_keys.rb` after this commit.
Followup to 093a808a3. Using a NOT EXISTS clause is much faster than the
`LEFT OUTER JOIN posts WHERE posts.id IS NULL` clause generated by
`.where.missing(:post)`.
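Roughly (the join condition on md5 is an assumption about how posts reference media
assets):

    # Fast: NOT EXISTS.
    MediaAsset.where("NOT EXISTS (SELECT 1 FROM posts WHERE posts.md5 = media_assets.md5)")

    # Slow: what `.where.missing(:post)` generates.
    #   SELECT media_assets.* FROM media_assets
    #   LEFT OUTER JOIN posts ON posts.md5 = media_assets.md5
    #   WHERE posts.id IS NULL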
Fix an error when trying to upload a file larger than the file size limit. In this case we
tried to dump the whole HTTP response into the error message, including the binary file
itself, which raised an exception because the message contained null bytes.
Make the upload page automatically detect when a source URL has multiple images
and let the user choose which images to post.
For example, when uploading a Twitter or Pixiv post with more than one image, we
direct the user to a page showing a thumbnail for each image and letting
them choose which ones to post.
This is similar to the batch upload page, except we actually download each image
in the background, instead of just hotlinking or proxying the thumbnails through
our servers. This avoids various problems with proxying and makes new features
possible, like showing which images in the batch have already been posted.
Make uploads faster by generating and saving thumbnails in parallel.
We generate each thumbnail in parallel, then send each thumbnail to the
backend image servers in parallel.
Most images have 5 variants: 'preview' (150x150), 180x180, 360x360,
720x720, and 'sample' (850px width). Plus the original file, that's 6
files we have to save. In production we have 2 image servers, so each file has to be
saved to both remote servers. Doing all this in parallel
should make uploads significantly faster.
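A minimal sketch of the parallelism using threads (the `generate` and `store` methods
are hypothetical stand-ins for the real variant and server APIs):

    def store_variants_in_parallel(variants, image_servers, original_file)
      # Generate all the thumbnail variants concurrently.
      thumbnails = variants.map { |variant| Thread.new { variant.generate } }.map(&:value)

      # Then save every file to every image server concurrently.
      (thumbnails + [original_file]).product(image_servers).map do |file, server|
        Thread.new { server.store(file) }
      end.each(&:join)
    end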
Add a thumbnail view to the /media_assets page. This page lets you see
all images uploaded to Danbooru by all users (although you can't see who
the uploader is). Also add a link to this page in the subnav bar on the
upload page.
Automatically merge tags when uploading a duplicate.
There are two cases:
* You try to upload an image, but it's already on Danbooru. In this case
you'll be immediately redirected to the original post, before you
can start tagging the upload.
* You're uploading an image, it wasn't a dupe when you first opened the
upload page, but you got sniped while tagging it. In this case your tags
will be merged with the original post, and you will be redirected to the
original post.
There are a few corner cases:
* If you don't have permission to edit the original post, for example
because it's banned or has a censored tag, then your tags won't be
merged and will be silently ignored.
* Only the tags, rating, and parent ID will be merged. The source and
artist commentary won't be merged. This is so that if an artist uploads
the exact same file to multiple sites, the new source won't override
the original source.
* Some tags might be contradictory. For example, the new post might
be tagged translation_request, but the original post might already be
translated. It's up to the user to fix these things afterwards.
* Fix broken upload tests.
* Fix uploads to return an error if both a file and a source are given
at the same time, or if neither is given. Also fix the error message
in this case so that it doesn't include "base" at the start of the string.
* Fix uploads to percent-encode any Unicode characters in the source URL (see the sketch after this list).
* Add a max filesize validation to media assets.
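For the URL encoding fix, normalization along these lines does the job (whether Danbooru
uses the Addressable gem for this exact step is an assumption):

    require "addressable/uri"

    Addressable::URI.parse("https://example.com/画像.jpg").normalize.to_s
    # => "https://example.com/%E7%94%BB%E5%83%8F.jpg"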
Rework the upload process so that files are saved to Danbooru first
before the user starts tagging the upload.
The main user-visible change is that you have to select the file first
before you can start tagging it. Saving the file first lets us fix a
number of problems:
* We can check for dupes before the user tags the upload.
* We can perform dupe checks and show preview images for users not using the bookmarklet.
* We can show preview images without having to proxy images through Danbooru.
* We can show previews of videos and ugoira files.
* We can reliably show the filesize and resolution of the image.
* We can let the user save files to upload later.
* We can get rid of a lot of spaghetti code related to preprocessing
uploads. This was the cause of most weird "md5 confirmation doesn't
match md5" errors.
(Not all of these are implemented yet.)
Internally, uploading is now a two-step process: first we create an upload
object, then we create a post from the upload. This is how it works:
* The user goes to /uploads/new and chooses a file or pastes a URL into
the file upload component.
* The file upload component calls `POST /uploads` to create an upload.
* `POST /uploads` immediately returns a new upload object in the `pending` state.
* Danbooru starts processing the upload in a background job (downloading,
resizing, and transferring the image to the image servers).
* The file upload component polls `/uploads/$id.json`, checking the
upload `status` until it returns `completed` or `error`.
* When the upload status is `completed`, the user is redirected to /uploads/$id.
* On the /uploads/$id page, the user can tag the upload and submit it.
* The upload form calls `POST /posts` to create a new post from the upload.
* The user is redirected to the new post.
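From an API client's point of view, the flow looks roughly like this (endpoints as
described above; authentication is omitted and the JSON field names are assumptions):

    require "net/http"
    require "json"
    require "uri"

    base = "https://danbooru.donmai.us"

    # Step 1: create the upload.
    response = Net::HTTP.post_form(URI("#{base}/uploads.json"), "upload[source]" => "https://example.com/image.jpg")
    upload = JSON.parse(response.body)

    # Step 2: poll until processing finishes.
    loop do
      upload = JSON.parse(Net::HTTP.get(URI("#{base}/uploads/#{upload["id"]}.json")))
      break if %w[completed error].include?(upload["status"])
      sleep 1
    end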
This is the data model:
* An upload represents a set of files uploaded to Danbooru by a user.
Uploaded files don't have to belong to a post. An upload has an
uploader, a status (pending, processing, completed, or error), a
source (unless uploading from a file), and a list of media assets
(image or video files).
* There is a has-and-belongs-to-many relationship between uploads and
media assets. An upload can have many media assets, and a media asset
can belong to multiple uploads. Uploads are joined to media assets
through an `upload_media_assets` table.
An upload could potentially have multiple media assets if it's a Pixiv
or Twitter gallery. This is not yet implemented (at the moment all
uploads have one media asset).
A media asset can belong to multiple uploads if multiple people try
to upload the same file, or if the same user tries to upload the same
file more than once.
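In Rails terms, the data model corresponds to associations like these (a sketch
following the table names above):

    class Upload < ApplicationRecord
      has_many :upload_media_assets
      has_many :media_assets, through: :upload_media_assets
    end

    class MediaAsset < ApplicationRecord
      has_many :upload_media_assets
      has_many :uploads, through: :upload_media_assets
    end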
New features:
* On the upload page, you can press Ctrl+V to paste a URL and immediately upload it.
* You can save files for upload later. Your saved files are at /uploads.
Fixes:
* Improved error messages when uploading invalid files, bad URLs, and
when forgetting the rating.
Do a few micro-optimizations to reduce the number of memory allocations
during thumbnail generation.
This commit, combined with freezing string literals in a7dc05 and
67b961, reduces the number of allocations on the front page from 180,000
to 150,000, and the number of retained objects from 8,000 to 4,000.
Fix an error being raised when trying to generate thumbnails for corrupt
files. If the original image is corrupt, then ignore any errors and let
libvips try to generate a thumbnail as best it can. This will usually
result in an incomplete thumbnail.
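With ruby-vips, tolerant decoding looks roughly like this (`fail: false` is the loader
option that tells libvips to keep going on decode errors; whether this is exactly how
Danbooru does it is an assumption):

    require "vips"

    # Decode as much of the corrupt image as possible instead of raising.
    image = Vips::Image.new_from_file("corrupt.gif", fail: false)
    thumbnail = image.thumbnail_image(360)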
Add a 720x720 WebP thumbnail variant. This is used to provide higher resolution
thumbnails for high pixel density displays, such as phones or laptops. If your screen
has a 2x pixel density ratio, then 360x360 thumbnails will be rendered at 720x720
resolution.
We use WebP here because it's about 15% smaller than the equivalent
JPEG, and because if a device has a high enough pixel density to use
this, then it probably supports WebP.
720x720 thumbnails average about 36kb in size, compared to 20.35kb for
360x360 thumbnails and 7.55kb for 180x180 thumbnails.
Calculate the dimensions of thumbnails ourselves instead of letting
libvips calculate them for us. This way we know the exact size of
thumbnails, so we can set the right width and height for <img> tags. If
we let libvips calculate thumbnail sizes for us, then we can't predict
the exact size of thumbnails, because sometimes libvips rounds numbers
differently than us.
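A sketch of the calculation (the exact rounding rule is an assumption): scale to fit the
target box, preserving the aspect ratio and never upscaling.

    def thumbnail_dimensions(width, height, max_width, max_height)
      scale = [max_width.to_f / width, max_height.to_f / height, 1.0].min
      [(width * scale).round, (height * scale).round]
    end

    thumbnail_dimensions(1920, 1080, 360, 360) # => [360, 203]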
Bug: If a media asset got stuck in the 'processing' state during upload,
then it would stay stuck forever and the file couldn't be uploaded again
later.
Fix: Mark stuck assets as failed before raising the "Upload failed"
error. Once the asset is marked as failed, it can be uploaded again
later. Also, only wait for assets to finish processing if they were
uploaded less than 5 minutes ago. If a processing asset is more than 5
minutes old, consider it stuck and mark it as failed immediately.
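A sketch of the staleness rule (the status values and timestamp column are assumptions):

    def fail_if_stuck!(media_asset)
      if media_asset.status == "processing" && media_asset.created_at < 5.minutes.ago
        # The original upload presumably died without running its error
        # handler; mark the asset failed so the file can be uploaded again.
        media_asset.update!(status: "failed")
      end
    end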
Assets getting stuck in the processing state is a 'this should never
happen' error. Normally if any kind of exception is raised while
uploading the asset, the asset will be set to the 'failed' state. The
only way an asset can get stuck is if it fails and the exception handler
doesn't run, or the exception handler itself fails. This might happen if
the process is unexpectedly killed, or possibly if the HTTP request
times out and a TimeoutError is raised at an inopportune time. See the links below
for discussion of the problems with Ruby's Timeout.
[1]: https://vaneyckt.io/posts/the_disaster_that_is_rubys_timeout_method/
[2]: https://jvns.ca/blog/2015/11/27/why-rubys-timeout-is-dangerous-and-thread-dot-raise-is-terrifying/
[3]: https://adamhooper.medium.com/in-ruby-dont-use-timeout-77d9d4e5a001
[4]: https://ruby-doc.org/core-3.0.2/Thread.html#method-c-handle_interrupt-label-Guarding+from+Timeout-3A-3AError