downloads: refactor Downloads::File into Danbooru::Http.

Remove the Downloads::File class. Move download methods to
Danbooru::Http instead. This means that:

* HTTParty has been replaced with http.rb for downloading files.

* Downloading is no longer tightly coupled to source strategies.
  Previously, Downloads::File tried to automatically look up the source
  and download the full-size image whenever it was given a sample url.
  Now we can do plain downloads without source strategies altering the
  url.

* The Cloudflare Polish check has been changed from checking for a
  Cloudflare IP to checking for the CF-Polished header. Looking up the
  list of Cloudflare IPs was slow and flaky during testing.

* The SSRF protection code has been factored out so it can be used for
  normal http requests, not just for downloads.

* The WebMock gem can be removed, since it was only used for stubbing
  out certain HTTParty requests in the download tests. WebMock was also
  buggy and caused certain tests to fail during CI.

* The retriable gem can be removed, since we no longer automatically
  retry failed downloads. We assume that if a download fails once, then
  retrying probably won't help.
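
To illustrate the SSRF point above, here is a minimal sketch of the kind of address check such protection usually involves. It is not the code from this commit: `public_address?` is a hypothetical helper name, and the sketch assumes the target hostname has already been resolved to an IP string. It relies only on Ruby's standard `ipaddr` library.

```ruby
require "ipaddr"

# Hypothetical helper (not part of Danbooru::Http): returns true only when
# the given IP string is not a loopback, private, or link-local address.
# Strings that cannot be parsed as an IP address are treated as unsafe.
def public_address?(ip_string)
  ip = IPAddr.new(ip_string)
  !(ip.loopback? || ip.private? || ip.link_local?)
rescue IPAddr::InvalidAddressError
  false
end

# Example: only the first address would be allowed.
%w[8.8.8.8 127.0.0.1 10.0.0.1 169.254.169.254].each do |addr|
  puts "#{addr}: #{public_address?(addr) ? 'allowed' : 'blocked'}"
end
```

This check is not exhaustive. A real implementation would likely also need to cover other special-purpose ranges and make sure the address that was checked is the same one the connection is actually made to.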

Author: evazion
Date: 2020-06-15 04:41:42 -05:00
Committed by: evazion
Parent: 10b7a53449
Commit: 26ad844bbe
15 changed files with 173 additions and 285 deletions

@@ -14,6 +14,8 @@
 module Sources
   module Strategies
     class Base
+      class DownloadError < StandardError; end
       attr_reader :url, :referer_url, :urls, :parsed_url, :parsed_referer, :parsed_urls
       extend Memoist
@@ -35,8 +37,8 @@ module Sources
       # <tt>referrer_url</tt> so the strategy can discover the HTML
       # page and other information.
       def initialize(url, referer_url = nil)
-        @url = url
-        @referer_url = referer_url
+        @url = url.to_s
+        @referer_url = referer_url&.to_s
         @urls = [url, referer_url].select(&:present?)
         @parsed_url = Addressable::URI.heuristic_parse(url) rescue nil
@@ -144,10 +146,22 @@ module Sources
       # Returns the size of the image resource without actually downloading the file.
       def size
-        Downloads::File.new(image_url).size
+        http.head(image_url).content_length.to_i
       end
       memoize :size
+      # Download the file at the given url, or at the main image url by default.
+      def download_file!(download_url = image_url)
+        response, file = http.download_media(download_url)
+        raise DownloadError, "Download failed: #{download_url} returned error #{response.status}" if response.status != 200
+        file
+      end
+      def http
+        Danbooru::Http.public_only.timeout(30).max_size(Danbooru.config.max_file_size)
+      end
+      memoize :http
       # The url to use for artist finding purposes. This will be stored in the
       # artist entry. Normally this will be the profile url.
       def normalize_for_artist_finder
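
The error handling added in `download_file!` can be exercised in isolation with stand-in objects. The sketch below is illustrative only: `FakeHttp` and `FakeResponse` are hypothetical test doubles, not the real `Danbooru::Http` client, and they assume only what the hunk above shows, namely that `download_media` returns a `[response, file]` pair and that the response exposes `status`. Here the status is a plain integer and the http client is passed in as an argument rather than memoized.

```ruby
# Stand-ins for illustration only; the real code uses Danbooru::Http.
DownloadError = Class.new(StandardError)
FakeResponse = Struct.new(:status)

class FakeHttp
  # statuses maps a url to the HTTP status code the fake should return.
  def initialize(statuses)
    @statuses = statuses
  end

  # Returns a [response, file] pair, like download_media in the diff.
  # Unknown urls get a 404 and no file.
  def download_media(url)
    status = @statuses.fetch(url, 404)
    file = status == 200 ? "contents of #{url}" : nil
    [FakeResponse.new(status), file]
  end
end

# Same control flow as the new download_file!: any status other than 200
# raises DownloadError, otherwise the downloaded file is returned.
def download_file!(http, download_url)
  response, file = http.download_media(download_url)
  raise DownloadError, "Download failed: #{download_url} returned error #{response.status}" if response.status != 200
  file
end

http = FakeHttp.new("https://example.com/ok.jpg" => 200)
puts download_file!(http, "https://example.com/ok.jpg")

begin
  download_file!(http, "https://example.com/missing.jpg")
rescue DownloadError => e
  puts e.message
end
```

Because there is no retry step, a single non-200 response surfaces immediately as a `DownloadError`, which matches the commit message's note that failed downloads are no longer retried.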