Fix the null source strategy so that it no longer sets the page URL. The page URL is
expected to be nil when we can't determine the page containing the image URL.
Fixes the upload_media_assets.page_url field being filled for uploads
from unknown sites.
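As a rough sketch of the intended behavior (the class and method names here are
illustrative, not the actual Danbooru code):

class NullSourceStrategy
  def initialize(url)
    @url = url
  end

  # The image URL is whatever URL the user gave us.
  def image_url
    @url
  end

  # For an unrecognized site we can't determine the page containing the
  # image, so return nil rather than echoing the image URL back.
  def page_url
    nil
  end
end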
Raise the timeout for downloading files from the source to 60 seconds globally.
We previously used a lower timeout because uploads were processed in the
foreground when not using the bookmarklet, and we didn't want to tie up
Puma worker processes with slow downloads. Now that all uploads are
processed in the background, we can afford a higher timeout.
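For reference, with the http.rb gem a 60-second global timeout looks roughly like
this (a sketch; Danbooru's actual HTTP wrapper and call site may differ, and the
source URL is just the one from the example below):

require "http"

source_url = "https://i.pximg.net/img-original/img/2022/02/13/01/20/19/96198438_p0.jpg"

# Apply a single 60-second limit to the whole request (connect, write, read)
# and follow redirects while downloading the file from the source.
response = HTTP.timeout(60).follow.get(source_url)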
Make the upload page automatically detect when a source URL has multiple images
and let the user choose which images to post.
For example, when uploading a Twitter or Pixiv post with more than one image, we
direct the user to a page that shows a thumbnail for each image and lets them
choose which ones to post.
This is similar to the batch upload page, except we actually download each image
in the background, instead of just hotlinking or proxying the thumbnails through
our servers. This avoids various problems with proxying and makes new features
possible, like showing which images in the batch have already been posted.
This page shows each individual file you've uploaded. This is different
from the regular uploads page because files in multi-file uploads are
not grouped together.
* Make thumbnails on the "My Uploads" page show an icon with an image
count when an upload contains multiple files.
* Make the "My Uploads" page show each upload, not each individual file.
If an upload contains multiple files, they're shown grouped together
under a single upload. This does mean that failed or duplicate uploads now
show up on this page, because it lists each upload attempt rather than each
uniquely uploaded file.
Make media assets show a placeholder thumbnail when the image is
missing. This can happen if the upload is still processing, or if the
media asset's image was expunged, or if the asset failed during upload
(usually because of some temporary network failure when trying to
distribute thumbnails to the backend image servers).
Fixes a problem where new images on the My Uploads or All Uploads pages
could have broken thumbnails if they were still in the uploading phase.
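In view terms this amounts to a fallback along these lines (an illustrative
helper; the placeholder path and method names are assumptions):

# Show a placeholder when the asset has no usable image yet: still
# processing, expunged, or failed during upload.
def media_asset_thumbnail_url(media_asset)
  if media_asset.nil? || media_asset.status != "active"
    "/images/thumbnail-placeholder.png"   # placeholder path (assumption)
  else
    media_asset.thumbnail_url             # assumed method on MediaAsset
  end
end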
Include media assets in /uploads.json and /uploads/:id.json API responses, like this:
{
  "id": 4983629,
  "source": "https://www.pixiv.net/en/artworks/96198438",
  "uploader_id": 52664,
  "status": "completed",
  "created_at": "2022-02-12T16:26:04.680-06:00",
  "updated_at": "2022-02-12T16:26:08.071-06:00",
  "referer_url": "",
  "error": null,
  "media_asset_count": 1,
  "upload_media_assets": [
    {
      "id": 9370,
      "created_at": "2022-02-12T16:26:08.068-06:00",
      "updated_at": "2022-02-12T16:26:08.068-06:00",
      "upload_id": 4983629,
      "media_asset_id": 5206552,
      "status": "pending",
      "source_url": "https://i.pximg.net/img-original/img/2022/02/13/01/20/19/96198438_p0.jpg",
      "error": null,
      "page_url": "https://www.pixiv.net/artworks/96198438",
      "media_asset": {
        "id": 5206552,
        "created_at": "2022-02-12T16:26:07.980-06:00",
        "updated_at": "2022-02-12T16:26:08.061-06:00",
        "md5": "90a85a5fae5f0e86bdb2501229af05b7",
        "file_ext": "jpg",
        "file_size": 1055775,
        "image_width": 1052,
        "image_height": 1545,
        "duration": null,
        "status": "active"
      }
    }
  ]
}
This is needed so you can check for upload errors in the API, since in a multi-file
upload, each asset can have a separate error message. This is a stopgap solution until
something like /uploads.json?include=upload_media_assets.media_asset works.
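For example, an API client can check each file's status and error like this (a
rough sketch using Ruby's standard library; authentication is omitted and the
upload ID is a placeholder):

require "net/http"
require "json"

upload_id = 4983629   # placeholder; use the ID returned when you created the upload

uri = URI("https://danbooru.donmai.us/uploads/#{upload_id}.json")
upload = JSON.parse(Net::HTTP.get(uri))

upload["upload_media_assets"].each do |asset|
  if asset["error"]
    puts "asset #{asset["id"]} failed: #{asset["error"]}"
  else
    puts "asset #{asset["id"]} is #{asset["status"]}"
  end
end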
* Save the filename for files uploaded from disk. This could be used in
the future to extract source data if the filename is from a known site.
* Save both the image URL and the page URL for files uploaded from
source. This is needed for multi-file uploads. The image URL is the
URL of the file actually downloaded from the source. This can be
different from the URL given by the user, if the user tried to upload
a sample URL and we automatically changed it to the original URL. The
page URL is the URL of the page containing the image. We don't always
know this, for example if someone uploads a Twitter image without the
bookmarklet, then we can't find the page URL.
* Add a fix script to backfill URLs for existing uploads. For file
uploads, the filename will be set to "unknown.jpg". For source
uploads, we fetch the source data again to get the image and page
URLs. This may fail for uploads that have been deleted from the
source since uploading.
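A sketch of that backfill (the model names, field used for the filename, and
strategy lookup are assumptions; the real fix script handles errors and batching
more carefully):

# Illustrative backfill, not the actual fix script.
UploadMediaAsset.where(source_url: nil).find_each do |asset|
  upload = asset.upload

  if upload.source.blank?
    # File upload from disk: the original filename wasn't saved, so use a
    # placeholder. (Which field stores the filename is an assumption here.)
    asset.update!(source_url: "unknown.jpg")
  else
    # Source upload: re-fetch the source to recover the image and page URLs.
    # This can fail if the work has been deleted from the source site.
    strategy = Sources::Strategies.find(upload.source)   # lookup name is an assumption
    asset.update!(source_url: strategy.image_url, page_url: strategy.page_url)
  end
rescue StandardError => e
  Rails.logger.warn("backfill failed for upload_media_asset ##{asset.id}: #{e.message}")
end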
Fix an exception thrown by app/logical/pixiv_ajax_client.rb:406 when a
Pixiv API call failed with a network error. In that case we tried to log
the response body, but logging failed because we had returned a faked HTTP
response with an empty string for the body, which the http.rb library
rejected because it expects an IO-like object for the body.
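The gist of the fix, sketched with placeholder arguments (the exact constructor
keywords depend on the http.rb version, and `request` here is a placeholder for
the original HTTP::Request):

require "stringio"

# http.rb expects the body to be IO-like (it reads from it), so wrap the
# empty body in a StringIO instead of passing a bare "" when faking a
# response for a request that failed at the network level.
fake_response = ::HTTP::Response.new(
  status: 598,                # placeholder status used for network errors
  version: "1.1",
  body: StringIO.new(""),     # IO-like empty body instead of ""
  request: request            # the original HTTP::Request (placeholder here)
)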
This is needed for multi-file uploads. We need to know both the image
URL and the page URL to set the post's source correctly when converting
an upload media asset into a post.
Make upload_media_assets.media_asset_id nullable in order to support
multi-file uploads. The media asset will be null while the image is
still being downloaded from the source.
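In migration terms this is essentially the following (the Rails version tag and
migration name are assumptions):

class MakeUploadMediaAssetIdNullable < ActiveRecord::Migration[6.1]
  def change
    # Allow NULL while the file is still being downloaded; the column is
    # filled in once the media asset exists.
    change_column_null :upload_media_assets, :media_asset_id, true
  end
end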
* uploads.media_asset_count - the number of media assets attached to this upload.
* upload_media_assets.status - the status of each media asset attached to this upload (processing, active, failed)
* upload_media_assets.source_url - the source of each media asset attached to this upload
* upload_media_assets.error - the error message if uploading the media asset failed
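Assuming these are plain database columns, a migration adding them might look
roughly like this (types, defaults, and the migration name are assumptions):

class AddStatusFieldsToUploads < ActiveRecord::Migration[6.1]
  def change
    add_column :uploads, :media_asset_count, :integer, null: false, default: 0

    add_column :upload_media_assets, :status, :string, null: false, default: "pending"  # default is an assumption
    add_column :upload_media_assets, :source_url, :string
    add_column :upload_media_assets, :error, :string
  end
end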
* `float` is used for MediaAsset durations.
* `interval` is used for Ban durations.
* `uuid` is used for GoodJob IDs.
* `datetime` is used for created_at/updated_at timestamps. The format is
`2022-02-04 08:33:36 -0800`.
On the post index page, show the "Artist" tab instead of the "Wiki" tab when searching for
an artist tag that doesn't have an artist entry. This way the user is prompted to create a
new artist entry instead of a new wiki.
* Group URLs by site.
* List most important URLs first and dead URLs last.
* Add site icons next to URLs.
* Put other names and group name beneath the artist name, instead of beneath the wiki.
Fix URLs being normalized after checking for duplicates rather than
before, which meant that URLs that differed in capitalization weren't
detected as duplicates.
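Conceptually the ordering matters because normalization can make two URLs equal
that weren't equal before; a minimal illustration (the normalizer here is
deliberately simplified to lowercasing):

normalize = ->(url) { url.downcase }   # real normalization does more than this

urls = ["https://Example.com/art/1", "https://example.com/art/1"]

urls.uniq.size                    # => 2, duplicate check before normalizing misses it
urls.map(&normalize).uniq.size    # => 1, normalize first, then check for duplicates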
Followup to ef0d8151d. Add symlinks from app/components/**/*.js to
app/javascript/src/javascripts/*.js so you can still see a component's
Javascript inside the component.
Move Javascript files from app/components/**/*.js back to app/javascript/src/javascripts/*.js.
This way Javascript files are in one place, which simplifies import paths and makes it
easier to see all Javascript at once.
Fix requests for non-existent .js pages, for example https://danbooru.donmai.us/oaisfj.js,
raising AbstractController::DoubleRenderError when trying to render the 404 response.
Fix two issues that could lead to duplicate errors when creating posts:
* Fix the submit button on the upload form to disable itself on submit, to prevent
accidental double submit errors.
* Fix a race condition when checking for MD5 duplicates. MD5 uniqueness is checked on both
the Rails level, with a uniqueness validation, and on the database level, with a unique
index on the md5 column. Creating a post could fail with an ActiveRecord::RecordNotUnique
error if the uniqueness validation in Rails passed, but the uniqueness constraint in the
database failed. In this case, we catch the RecordNotUnique error and convert it to a
Rails validation error so we can treat it like a normal validation failure.
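The conversion boils down to this pattern (a simplified sketch, not the exact
Danbooru code):

def save_post(post)
  post.save   # runs the Rails-level uniqueness validation
rescue ActiveRecord::RecordNotUnique
  # The database's unique index on md5 fired first: report it like a
  # normal validation failure instead of raising a 500 error.
  post.errors.add(:md5, "has already been taken")
  false
end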
Fix buttons appearing to be clickable when in the disabled state.
Submit buttons are normally disabled after a form is submitted. Before,
these buttons still looked clickable. Now disabled buttons are greyed
out instead of looking the same as normal buttons.
Delete all old upload records from before the upload rework in abdab7a0a
/ f11c46b4f. Uploads from before the rework don't have any attached
media assets, so they're not valid under the new system because we can't
find which files they were for.
Before the rework, completed uploads were only saved for 1 hour, and
failed uploads were only saved for 3 days, so deleting this data
doesn't really lose anything that wouldn't have been deleted before.
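The cleanup amounts to something like this (a sketch assuming the association is
named `upload_media_assets`; the real data migration may batch or log the
deletes):

# Pre-rework uploads are exactly the ones with no attached media assets.
Upload.where.missing(:upload_media_assets).find_each(&:destroy!)   # Rails 6.1+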
Fix a potential exploit where private information could be leaked if
it was contained in the error message of an unexpected exception.
For example, NoMethodError contains a raw dump of the object in the
error message, which could leak private user data if you could force a
User object to raise a NoMethodError.
Fix the error page to only show known-safe error messages from expected
exceptions, not unknown error messages from unexpected exceptions.
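The underlying idea is an allowlist of expected exception classes (a simplified
sketch; which classes are actually on the list is an assumption, apart from
PaginationExtension::PaginationError, which appears in the XML example below):

# Only exceptions we raise deliberately have messages that are safe to show.
EXPECTED_EXCEPTIONS = [
  ActiveRecord::RecordNotFound,
  PaginationExtension::PaginationError,
].freeze

def safe_error_message(exception)
  if EXPECTED_EXCEPTIONS.any? { |klass| exception.is_a?(klass) }
    exception.message
  else
    ""   # unknown exception: don't leak the raw message
  end
end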
API changes:
* JSON errors now have a `message` param. The message will be blank for unknown exceptions.
* XML errors have a new format. This is a breaking change. They now look like this:
<result>
  <success type="boolean">false</success>
  <error>PaginationExtension::PaginationError</error>
  <message>You cannot go beyond page 5000.</message>
  <backtrace type="array">
    <backtrace>app/logical/pagination_extension.rb:54:in `paginate'</backtrace>
    <backtrace>app/models/application_record.rb:17:in `paginate'</backtrace>
    <backtrace>app/logical/post_query_builder.rb:529:in `paginated_posts'</backtrace>
    <backtrace>app/logical/post_sets/post.rb:95:in `posts'</backtrace>
    <backtrace>app/controllers/posts_controller.rb:22:in `index'</backtrace>
  </backtrace>
</result>
instead of like this:
<result success="false">You cannot go beyond page 5000.</result>
Add more sensitive attributes to the filtered parameters list so that
they aren't shown in exception messages, and aren't logged in log files
or to NewRelic.
Only do this in production, so that in testing and development you can
still see these values when inspecting objects in the console.
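In Rails terms this is roughly the following (the parameter names are examples,
not the full list):

# config/initializers/filter_parameter_logging.rb (sketch)
if Rails.env.production?
  Rails.application.config.filter_parameters += [
    :password, :api_key, :session_id, :email_address,
  ]
end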