Change how artist URLs are normalized in artist entries. Don't try to secretly
convert image URLs to profile URLs in artist entries. For example, if someone puts a
Pixiv image URL in an artist entry, don't secretly try to fetch the source and
convert it into a profile URL in the `normalized_url` field.
We did this because years ago, it was standard practice to put image URLs in artist
entries. Pixiv image URLs used to contain the artist's username, so we used to put
image URLs in artist entries for artist finding purposes. But Pixiv changed it so
that image URLs no longer contained the username, so we dealt with it by adding a
`normalized_url` column to artist_urls and secretly converting image URLs to profile
URLs in this field. But this is no longer necessary because now we don't normally put
image URLs in artist entries in the first place.
Now the `profile_url` method in `Source::URL` is used to normalize URLs in artist
entries. This lets us parse various profile URL formats and normalize them into a
single canonical form.
This also removes the `normalize_for_artist_finder` method from source strategies.
Instead the `profile_url` method is used for artist finding purposes. So the profile
URL returned by the source strategy needs to be the same as the URL in the artist
entry in order for artist finding to work.
Bug: We assumed the referer URL was from the same site as the target
URL. We tried to call methods on the referer only supported by the
target URL.
Fix: Ignore the referer URL when it's from a different site than the
target URL.
Remove the `preview_urls` method from strategies. The only place this was used was
when doing IQDB searches, to download the thumbnail image from the source instead of
the full image.
This wasn't worth it for a few reasons:
* Thumbnails on other sites are sometimes not the size we want, which could affect
IQDB results.
* Grabbing thumbnails is complex for some sites. You can't always just rewrite the
image URL. Sometimes it requires extra API calls, which can be slower than just
grabbing the full image.
* For videos and animations, thumbnails from other sites don't always match our
thumbnails. We do smart thumbnail generation to try to avoid blank thumbnails, which
means we don't always pick the first frame, which could affect IQDB results.
API changes:
* /iqdb_queries?search[file_url] now downloads the URL as is without any modification.
Before it tried to change thumbnail and sample size image URLs to the full version.
* /iqdb_queries?search[url] now returns an error if the URL is for a HTML page that
contains multiple images. Before it would grab only the first image and silently
ignore the rest.
Remove the `image_url` method from source strategies. This method would
return only the first image if a source had multiple images. The
`image_urls` method should be used instead. Tests were the main place
that still used `image_url` instead of `image_urls`.
Also make post replacements return an error if replacing with a source
that contains multiple images, instead of just blindly replacing the
post with the first image in the source.
* Fix error when uploading non-ugoira files.
* Fix sample image URLs not being rewritten to full images correctly. We
have to get the full image URL from the API because given an
/img-master/ URL, we don't know what the original file extension is.
* Drop support for preview_urls. This means that IQDB lookups may be
slower, especially for ugoiras, since we have to download the full
ugoira now. However, ugoira lookups should produce better results,
since the ugoira thumbnail chosen by Pixiv wasn't necessarily the same
as the thumbnail chosen by Danbooru.
* Drop support for uploading single manga pages:
http://www.pixiv.net/member_illust.php?mode=manga_big&illust_id=18557054&page=2
Previously uploading an URL like this would only upload a single image
out of a multi-image work. Now it will upload all images in the work.
Pixiv no longer supports URLs like this, so we don't either.
* Add support for parsing URLs like this:
https://i.pximg.net/c/360x360_70/custom-thumb/img/2022/03/08/00/00/56/96755248_p0_custom1200.jpg
Apparently artists can choose a custom thumbnail now (not like anyone
will try to upload one though).
Add upload support for Pixiv Sketch. Fetch tags, commentary, and artist,
and rewrite sample images to full images.
Authentication isn't required. R18 images are hidden in the browser but
visible in the API.
Additionally, fixed some broken tests and changed normalization for urls
of album type to point to the mobile version instead, because they're
only visible to logged-in users.
Add stricter username rules:
* Only allow usernames to contain basic letters, numbers, CJK characters, underscores, dashes and periods.
* Don't allow names to start or end with punctuation.
* Don't allow names to have multiple underscores in a row.
* Don't allow active users to have names that look like deleted users (e.g. "user_1234").
* Don't allow emoji or any other Unicode characters except for Chinese, Japanese, and Korean
characters. CJK characters are currently grandfathered in but will be disallowed in the future.
Users with an invalid name will be shown a permanent sitewide banner until they change their name.
Also fixes the uploader uploading all images when trying to upload only a
single image in a multi-image work. Caused by `image_urls` incorrectly
returning all images when the source strategy was given a url for a
single image.
* Remove unnecessary trailing slashes when artist URLs are saved.
* Automatically add `http://` to new artist URLs if it's missing (before
this was an error; now it's automatically fixed).
Fixes a bug where the Foundation source strategy failed because http.rb
automatically sent a `Content-Length: 0` header with all GET requests,
which caused Foundation to return a 400 Bad Request error. This behavior
was fixed in http.rb 5.x.
http.rb 5.x has a breaking change where it now includes the request object
inside the response object, which we have to handle in a few places.
Fix uploads for NicoSeiga sources not working because the strategy
returned URLs like the one below in the list of image_urls, which
require a login to download:
https://seiga.nicovideo.jp/image/source/10315315
Also fix certain URLs like https://dic.nicovideo.jp/oekaki/52833.png not
working, because they didn't contain an image ID and the image_urls
method returned an empty list in this case.
This exception was thrown by app/logical/pixiv_ajax_client.rb:406 when a
Pixiv API call failed with a network error. In this case we tried to log
the response body, but this failed because we returned a faked HTTP
response with an empty string for the body, which the http.rb library
didn't like because it was expecting an IO-like object for the body.
Fix URLs being normalized after checking for duplicates rather than
before, which meant that URLs that differed in capitalization weren't
detected as duplicates.
`string.mb_chars.downcase` was used to correctly downcase Unicode
characters when downcasing strings in Ruby <2.4. This hasn't been needed
since Ruby 2.4.
* Fix a bug where creating posts failed if IQDB wasn't configured.
* Fix broken Skeb test caused by changed URL.
* Fix broken IP geolocation tests caused by API returning different data.
* Fix broken post regeneration tests.
* Move replacement tests from test/unit/upload_service_test.rb to
test/functional/post_replacement_controller_test.rb
* Move UploadService::Replacer to PostReplacementProcessor.
* Fix a minor bug where if you used the API to replace a post with a file,
the replacement would fail unless you passed an empty string for the
replacement_url.