Remove the `preview_urls` method from strategies. The only place this was used was
when doing IQDB searches, to download the thumbnail image from the source instead of
the full image.
This wasn't worth it for a few reasons:
* Thumbnails on other sites are sometimes not the size we want, which could affect
IQDB results.
* Grabbing thumbnails is complex for some sites. You can't always just rewrite the
image URL. Sometimes it requires extra API calls, which can be slower than just
grabbing the full image.
* For videos and animations, thumbnails from other sites don't always match our
thumbnails. We do smart thumbnail generation to try to avoid blank thumbnails, so
we don't always pick the first frame. A mismatched thumbnail could affect IQDB results.
API changes:
* /iqdb_queries?search[file_url] now downloads the URL as is without any modification.
Before it tried to change thumbnail and sample size image URLs to the full version.
* /iqdb_queries?search[url] now returns an error if the URL is for an HTML page that
contains multiple images. Before it would grab only the first image and silently
ignore the rest.
Remove the `image_url` method from source strategies. This method would
return only the first image if a source had multiple images. The
`image_urls` method should be used instead. Tests were the main place
that still used `image_url` instead of `image_urls`.
Also make post replacements return an error if replacing with a source
that contains multiple images, instead of just blindly replacing the
post with the first image in the source.
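A minimal sketch of the single-image rule described above, assuming a
strategy object that exposes `image_urls`. `MultipleImagesError`,
`FakeStrategy`, and `sole_image_url` are hypothetical names used only for
illustration; they are not taken from the codebase.

```ruby
# Hypothetical error class, for illustration only.
class MultipleImagesError < StandardError; end

# Hypothetical stand-in for a source strategy; only `image_urls` is from the text.
FakeStrategy = Struct.new(:image_urls)

# Return the only image URL of a source, or raise if the source is ambiguous.
# Returns nil when the source has no images at all.
def sole_image_url(strategy)
  urls = strategy.image_urls
  if urls.size > 1
    raise MultipleImagesError, "source contains #{urls.size} images"
  end
  urls.first
end
```

Raising here, rather than taking `urls.first` unconditionally, is what makes
the caller decide which image it actually wants.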
* Factor out the Cloudflare Polish bypass code to a standalone feature.
* Add `http_downloader` method to the base source strategy. This is an
HTTP client that should be used for downloading images or making
requests to image URLs. This client ensures that referrer spoofing and
Cloudflare bypassing are performed.
This fixes a bug with the upload page reporting the polished filesize
instead of the original filesize when uploading ArtStation images.
* Move the source normalization logic out of the post model
and into individual sources' strategies.
* Move the normalization tests into each source's own test,
and expand them significantly. Previously we were only testing
a very small subset of domains and variants.
* Fix up normalization for several sites.
* Normalize fav.me urls into normal deviantart urls.
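A rough sketch of what fav.me normalization could look like. It assumes
that the part after `d` in a fav.me link is the deviation id written in
base 36, and that `https://www.deviantart.com/deviation/<id>` is an
acceptable normalized form. Both are assumptions made for this example, and
`normalize_fav_me` is a hypothetical helper, not the real strategy code.

```ruby
# Assumed shape of a fav.me short link: https://fav.me/d<base36 id>
FAV_ME_PATTERN = %r{\Ahttps?://(?:www\.)?fav\.me/d([0-9a-z]+)\z}i

# Hypothetical helper: rewrite a fav.me short link into a regular
# deviantart.com url. Any other url is returned unchanged.
def normalize_fav_me(url)
  match = FAV_ME_PATTERN.match(url)
  return url if match.nil?

  # Assumption: the short code is the deviation id in base 36.
  deviation_id = match[1].downcase.to_i(36)
  "https://www.deviantart.com/deviation/#{deviation_id}"
end
```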
Fix the moebooru strategy to fall back to returning the image url if we
can't find the preview url. This fixes iqdb lookups failing in some cases
because the strategy didn't return a valid url for preview_url.
* Rename `unique_id` to `tag_name`.
* Add `other_names` and `profile_urls` methods that sources can override
to provide extra names or urls when creating new artist entries.
* Normalize spaces to underscores when saving other names. Preserve case
since case can be significant.
* Fix WikiPage#other_names_include to search case-insensitively (note:
this prevents using the index).
* Fix sources to return the raw tags in `#tags` and the normalized tags
in `#normalized_tags`. The normalized tags are the tags that will be
matched against other names.
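The other-name rules above can be sketched in plain Ruby as follows. This
is an in-memory illustration only: the real `WikiPage#other_names_include`
is a database search, and `normalize_other_name` and `other_names_include?`
are hypothetical helper names.

```ruby
# Hypothetical helper: spaces become underscores, case is left untouched.
def normalize_other_name(name)
  name.tr(" ", "_")
end

# Hypothetical helper: case-insensitive membership test against saved names.
def other_names_include?(other_names, name)
  target = normalize_other_name(name).downcase
  other_names.any? { |other| other.downcase == target }
end
```

Case is only folded at comparison time, so the saved names keep whatever
capitalization the source used.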
Fix sources choosing the wrong strategy when the referer belongs to a
different site (for example, when uploading a twitter post with a pixiv
referer).
* Fix `match?` to only consider the main url, not the referer.
* Change `match?` to match against a list of domains given by the `domains` method.
* Change `match?` to an instance method.
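A minimal sketch of the matching rule described above, assuming each
strategy holds the main url and an optional referer. Only `match?` and
`domains` come from the text; the class names and the domain lists are
made up for the example.

```ruby
require "uri"

# Hypothetical base class, for illustration only.
class BaseStrategy
  attr_reader :url, :referer

  def initialize(url, referer = nil)
    @url = url
    @referer = referer
  end

  # Subclasses override this with the hosts they handle.
  def domains
    []
  end

  # Instance method that looks only at the main url, never at the referer.
  def match?
    host = URI.parse(url).host
    return false if host.nil?

    domains.any? { |domain| host == domain || host.end_with?(".#{domain}") }
  rescue URI::InvalidURIError
    false
  end
end

class TwitterStrategy < BaseStrategy
  def domains
    ["twitter.com"] # assumed list
  end
end

class PixivStrategy < BaseStrategy
  def domains
    ["pixiv.net", "pximg.net"] # assumed list
  end
end
```

With this shape, a twitter url uploaded with a pixiv referer matches
`TwitterStrategy` and not `PixivStrategy`, because the referer never
reaches the comparison.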
If the yande.re or konachan.com post has a source from a supported site,
for example Pixiv or Twitter, then delegate the artist and commentary
lookup to that substrategy.
Only do this for sources from recognized sites, not the null strategy.
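The delegation rule above might look roughly like this. All class and
method names here (`BooruPostStrategy`, `KnownSource`, `NullSource`,
`recognized?`, `artist_name`, `commentary`) are hypothetical stand-ins
chosen for the sketch.

```ruby
# Hypothetical strategy for a source from a supported site.
KnownSource = Struct.new(:artist_name, :commentary) do
  def recognized?
    true
  end
end

# Hypothetical null strategy for sources from unrecognized sites.
class NullSource
  def recognized?
    false
  end

  def artist_name
    nil
  end

  def commentary
    nil
  end
end

# Hypothetical strategy for a yande.re or konachan.com post.
class BooruPostStrategy
  def initialize(own_artist_name:, own_commentary:, sub_strategy:)
    @own_artist_name = own_artist_name
    @own_commentary = own_commentary
    @sub_strategy = sub_strategy
  end

  # Delegate only when the post's source belongs to a recognized site.
  def artist_name
    @sub_strategy.recognized? ? @sub_strategy.artist_name : @own_artist_name
  end

  def commentary
    @sub_strategy.recognized? ? @sub_strategy.commentary : @own_commentary
  end
end
```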