Fix `image_urls` returning `[nil]` when fetching data for an image URL
that was bad_id. In that case `original_urls` is empty, so we fall back
to using the deleted image URL as-is.
* Fix error when uploading non-ugoira files.
* Fix sample image URLs not being rewritten to full images correctly. We
have to get the full image URL from the API because given an
/img-master/ URL, we don't know what the original file extension is.
* Drop support for preview_urls. This means that IQDB lookups may be
slower, especially for ugoiras, since we have to download the full
ugoira now. However, ugoira lookups should produce better results,
since the ugoira thumbnail chosen by Pixiv wasn't necessarily the same
as the thumbnail chosen by Danbooru.
* Drop support for uploading single manga pages:
http://www.pixiv.net/member_illust.php?mode=manga_big&illust_id=18557054&page=2
Previously, uploading a URL like this would only upload a single image
out of a multi-image work. Now it will upload all images in the work.
Pixiv no longer supports URLs like this, so we don't either.
* Add support for parsing URLs like this:
https://i.pximg.net/c/360x360_70/custom-thumb/img/2022/03/08/00/00/56/96755248_p0_custom1200.jpg
Apparently artists can choose a custom thumbnail now (not like anyone
will try to upload one though).
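As a rough illustration of what parsing such a URL involves, the sketch below pulls the work ID and page number out of the filename. The `parse_pixiv_image_url` helper and its regex are hypothetical and only cover filenames shaped like the example above; this is not Danbooru's actual parser.

```ruby
require "uri"

# Hypothetical helper, not part of Danbooru. Extracts the work ID and page
# number from Pixiv image URLs whose filename looks like
# "<id>_p<page>[_suffix].<ext>", e.g. "96755248_p0_custom1200.jpg".
def parse_pixiv_image_url(url)
  basename = File.basename(URI.parse(url).path)
  match = basename.match(/\A(?<work_id>\d+)_p(?<page>\d+)(?:_\w+)?\.\w+\z/)
  return nil if match.nil?

  { work_id: match[:work_id].to_i, page: match[:page].to_i }
end

url = "https://i.pximg.net/c/360x360_70/custom-thumb/img/2022/03/08/00/00/56/96755248_p0_custom1200.jpg"
parse_pixiv_image_url(url) # => { work_id: 96755248, page: 0 }
```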
Add upload support for Pixiv Sketch. Fetch tags, commentary, and artist,
and rewrite sample images to full images.
Authentication isn't required. R18 images are hidden in the browser but
visible in the API.
Additionally, fixed some broken tests and changed normalization for URLs
of album type to point to the mobile version instead, because they're
only visible to logged-in users.
Add stricter username rules:
* Only allow usernames to contain basic letters, numbers, CJK characters, underscores, dashes and periods.
* Don't allow names to start or end with punctuation.
* Don't allow names to have multiple underscores in a row.
* Don't allow active users to have names that look like deleted users (e.g. "user_1234").
* Don't allow emoji or any other Unicode characters except for Chinese, Japanese, and Korean
characters. CJK characters are currently grandfathered in but will be disallowed in the future.
Users with an invalid name will be shown a permanent sitewide banner until they change their name.
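The rules above could be checked roughly as follows. This is a minimal sketch: the `UsernameRules` module, the error messages, and the exact character classes (in particular which Unicode scripts count as CJK) are assumptions, not the validation Danbooru actually ships.

```ruby
# Hypothetical validator; names and patterns are illustrative assumptions.
module UsernameRules
  # Basic letters, numbers, CJK scripts, underscores, dashes, and periods.
  ALLOWED = /\A[a-zA-Z0-9\p{Han}\p{Hiragana}\p{Katakana}\p{Hangul}_.\-]+\z/
  # Punctuation at the very start or very end of the name.
  PUNCTUATION_EDGE = /\A[_.\-]|[_.\-]\z/
  # Names that look like deleted users, e.g. "user_1234".
  DELETED_USER = /\Auser_\d+\z/i

  # Returns a list of rule violations; an empty list means the name passes.
  def self.errors(name)
    errors = []
    errors << "contains disallowed characters" unless name.match?(ALLOWED)
    errors << "starts or ends with punctuation" if name.match?(PUNCTUATION_EDGE)
    errors << "has consecutive underscores" if name.include?("__")
    errors << "looks like a deleted user" if name.match?(DELETED_USER)
    errors
  end
end

UsernameRules.errors("foo_bar")   # => []
UsernameRules.errors("foo__bar")  # => ["has consecutive underscores"]
UsernameRules.errors("user_1234") # => ["looks like a deleted user"]
```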
The median username length is 8 characters. The 99th percentile is 18
characters. The 99.9th percentile is 24 characters. About 750 users have
a name more than 24 characters long.
This doesn't do anything about existing users with long usernames.
Note that this is the length in Unicode codepoints, not grapheme
clusters. Some Unicode characters and emoji may be a single glyph but
composed of multiple codepoints.
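The difference between the two ways of counting can be seen with plain Ruby strings. The helper names below are made up for the illustration; `String#length` and `String#grapheme_clusters` are standard Ruby.

```ruby
# Hypothetical helpers for comparing the two ways of measuring a name.
def codepoint_length(name)
  name.length # counts Unicode codepoints
end

def grapheme_length(name)
  name.grapheme_clusters.length # counts user-perceived characters
end

# "e" followed by U+0301 COMBINING ACUTE ACCENT renders as a single "é".
accented = "e\u0301"
codepoint_length(accented) # => 2
grapheme_length(accented)  # => 1

# A family emoji: three emoji joined by U+200D ZERO WIDTH JOINER.
family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}"
codepoint_length(family)   # => 5
```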
Fix `Cannot write log file 'ffmpeg2pass-0.log' for pass-1 encoding: Permission denied` error
when uploading ugoira files. Caused by the fact that 2-pass encoding tries to write a log file in
the current directory by default, which fails in production because the default working directory in
the Docker image is /danbooru, which is read-only.
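One way to avoid this class of error is to point ffmpeg's `-passlogfile` option at a writable temporary directory. The sketch below only builds the command lines. The `two_pass_commands` helper, the file names, the codec, and the bitrate are illustrative assumptions, not necessarily what this fix does.

```ruby
require "tmpdir"

# Hypothetical helper: builds the argument lists for a 2-pass VP9 encode whose
# pass log is written under `log_dir` instead of the current directory.
def two_pass_commands(input, output, log_dir)
  log_prefix = File.join(log_dir, "ffmpeg2pass")
  common = ["ffmpeg", "-y", "-i", input, "-c:v", "libvpx-vp9", "-b:v", "5M", "-an",
            "-passlogfile", log_prefix]

  [
    common + ["-pass", "1", "-f", "null", File::NULL], # first pass discards its output
    common + ["-pass", "2", output],
  ]
end

Dir.mktmpdir("ugoira-") do |log_dir|
  first_pass, second_pass = two_pass_commands("frames.mkv", "sample.webm", log_dir)
  # `system(*first_pass) && system(*second_pass)` would run the encode.
  puts first_pass.join(" ")
  puts second_pass.join(" ")
end
```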
Also fixes the uploader uploading all images when trying to upload only a
single image in a multi-image work. Caused by `image_urls` incorrectly
returning all images when the source strategy was given a url for a
single image.
Add `#basename`, `#filename`, and `#file_ext` utility methods to
Danbooru::URL and change a few places to use them. Simplifies parsing
filenames in source URLs in various places.
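A minimal sketch of what such helpers might return, built on the standard library's `URI` so that it runs on its own. The `SketchURL` class and the exact meaning given to each method here are assumptions for illustration.

```ruby
require "uri"

# Stand-in for illustration only; not the real Danbooru::URL.
class SketchURL
  def initialize(url)
    @uri = URI.parse(url)
  end

  # Last path segment, e.g. "96755248_p0.jpg".
  def basename
    File.basename(@uri.path)
  end

  # Basename without the extension, e.g. "96755248_p0".
  def filename
    File.basename(@uri.path, ".*")
  end

  # Extension without the leading dot, e.g. "jpg".
  def file_ext
    File.extname(@uri.path).delete_prefix(".")
  end
end

url = SketchURL.new("https://i.pximg.net/img-original/img/2022/03/08/00/00/56/96755248_p0.jpg")
url.basename # => "96755248_p0.jpg"
url.filename # => "96755248_p0"
url.file_ext # => "jpg"
```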
Introduce a Source::URL class for parsing URLs from source sites. Refactor the Twitter
source strategy to use it.
This is the first step towards factoring all the URL parsing logic out of source
strategies and moving it to subclasses of Source::URL. Each site will have a subclass
of Source::URL dedicated to parsing URLs from that site. Source strategies will use
these classes to extract information from URLs.
This is to simplify source strategies. Most sites have many different URL formats we have
to parse or rewrite, and handling all these different cases tends to make source
strategies very complex. Isolating the URL parsing logic from the site scraping logic
should make source strategies easier to maintain.
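The intended shape can be sketched as a base class that dispatches to per-site subclasses. All class and method names below (`SketchSourceURL`, `SketchTwitterURL`, `match?`, `status_id`) are hypothetical, and the Twitter patterns are simplified; this is not the actual Source::URL API.

```ruby
require "uri"

# Hypothetical base class: picks the first registered subclass that
# recognizes the URL, falling back to the base class itself.
class SketchSourceURL
  REGISTRY = []

  attr_reader :uri

  def self.inherited(subclass)
    super
    REGISTRY << subclass
  end

  def self.parse(url)
    uri = URI.parse(url)
    return nil unless uri.is_a?(URI::HTTP) && uri.host

    handler = REGISTRY.find { |klass| klass.match?(uri) } || self
    handler.new(uri)
  rescue URI::InvalidURIError
    nil
  end

  def self.match?(_uri)
    false
  end

  def initialize(uri)
    @uri = uri
  end
end

# Hypothetical per-site subclass: only the URL parsing lives here.
class SketchTwitterURL < SketchSourceURL
  def self.match?(uri)
    uri.host.match?(/\A(?:www\.|mobile\.)?twitter\.com\z/)
  end

  def username
    uri.path[%r{\A/(\w+)}, 1]
  end

  def status_id
    uri.path[%r{\A/\w+/status/(\d+)}, 1]
  end
end

parsed = SketchSourceURL.parse("https://twitter.com/danbooru/status/1234567890")
parsed.class     # => SketchTwitterURL
parsed.status_id # => "1234567890"
```

A source strategy written against classes like these would ask the parsed URL for the status ID instead of carrying its own regexes.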
Fix certain ugoiras having very low quality webm samples. This was
because we had a target bitrate of 5 Mbps, but this wasn't enough for
videos that were high resolution or that had choppy, hard-to-compress
motion, such as post 5081776 (nsfw).
NicoSeiga changed it so that on every login, you must enter a 2FA code
sent by email. This broke the NicoSeiga strategy. The fix is to just use
a static session cookie instead (and hope it doesn't expire, and isn't
tied to an IP).
The `nico_seiga_login` and `nico_seiga_password` config settings have
been removed from config/danbooru_default_config.rb and replaced by
`nico_seiga_user_session`. If you run your own Danbooru instance, you
will have to update your config file manually.
Introduce a Danbooru::URL class for dealing with URLs. This is a wrapper
around Addressable::URI that adds some additional helper methods. Most
significantly, the `parse` method only allows valid http/https URLs, and
it returns nil instead of raising an exception when the URL is invalid.
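The nil-instead-of-exception behavior can be approximated as below. To keep the example dependency-free it uses the standard library's `URI` rather than Addressable, and `SketchParser` is a hypothetical name, so treat it as a sketch of the idea only.

```ruby
require "uri"

# Hypothetical stand-in built on the standard library's URI.
module SketchParser
  # Returns a URI for http/https URLs that have a host, otherwise nil.
  def self.parse(url)
    uri = URI.parse(url.to_s)
    return nil unless uri.is_a?(URI::HTTP) && !uri.host.to_s.empty?

    uri
  rescue URI::InvalidURIError
    nil
  end
end

SketchParser.parse("https://danbooru.donmai.us/posts/1").host # => "danbooru.donmai.us"
SketchParser.parse("ftp://example.com/file")                   # => nil
SketchParser.parse("not a url")                                # => nil
```

Callers can then write `url = SketchParser.parse(input) or return` instead of wrapping every parse in a `rescue`.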
Fixes a bug where the Foundation source strategy failed because http.rb
automatically sent a `Content-Length: 0` header with all GET requests,
which caused Foundation to return a 400 Bad Request error. This behavior
was fixed in http.rb 5.x.
http.rb 5.x has a breaking change where it now includes the request object
inside the response object, which we have to handle in a few places.
Add data attributes to thumbnails on the /uploads, /upload_media_assets,
and /media_assets pages. Add a `data-is-posted` attribute for styling
thumbnails based on whether they've already been posted.
Do one less API call when fetching the image URLs for a Pixiv post. The
`is_ugoira?` check in `image_urls` caused us to do an extra API call
when fetching the image URLs for a non-ugoira post.
API calls to Pixiv take around 800ms, so this reduces the minimum upload
time for Pixiv posts from ~1.6 seconds (two calls) to ~0.8 seconds.
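How the extra call was removed is not shown here. One generic way to keep a type check from costing a second request is to memoize the API response and answer both questions from it. The classes, method names, and response shape in this sketch are all hypothetical.

```ruby
# Hypothetical sketch; not Danbooru's actual Pixiv strategy.
class SketchPixivPost
  def initialize(client, work_id)
    @client = client
    @work_id = work_id
  end

  # Fetch the work metadata once and reuse it for every later question.
  def api_response
    @api_response ||= @client.fetch_work(@work_id)
  end

  def is_ugoira?
    api_response[:type] == "ugoira"
  end

  def image_urls
    return [api_response[:ugoira_zip_url]] if is_ugoira?

    api_response[:page_urls]
  end
end

# Fake client that counts calls instead of hitting the network.
class CountingClient
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def fetch_work(_work_id)
    @calls += 1
    { type: "illust", page_urls: ["https://example.com/1_p0.png"] }
  end
end

client = CountingClient.new
post = SketchPixivPost.new(client, 1)
post.image_urls # => ["https://example.com/1_p0.png"]
client.calls    # => 1
```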