sources: refactor normalize_for_source.

`normalize_for_source` was used to convert image URLs to page URLs when displaying sources
on the post show page. Move all the code for converting image URLs to page URLs from
`Sources::Strategies#normalize_for_source` to `Source::URL#page_url`.

Previously, source strategies had to be very careful not to make any network calls in
`normalize_for_source`, since it was called from the view for the post show page. Now all
the code for generating page URLs is isolated in `Source::URL`, which makes source
strategies simpler. It also makes it easier to check whether a source is an image URL or a
page URL, and whether an image URL can be converted to a page URL, which will make
autotagging bad_link or bad_source feasible.
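
As a rough illustration of the network-free conversion described above, here is a minimal
sketch in plain Ruby. The helper name `fc2_diary_page_url` and its regex are hypothetical
and are not the code in this commit; only the URL shapes are taken from the FC2 diary case
in the diff. It returns nil when the URL is not convertible, which is the kind of check a
bad_source autotagger could build on.

```ruby
require "uri"

# Hypothetical pattern for an FC2 diary image path, e.g.
# /user/yuuri/img/2005_12/26.jpg (username, year_month, day.ext).
FC2_DIARY_IMAGE_PATH = %r{\A/user/(?<username>[^/]+)/img/(?<year>\d+)_(?<month>\d+)/(?<day>[^/.]+)\.\w+\z}

# Derive a page URL from an FC2 diary image URL using string parsing only.
# No network calls are made. Returns nil if the URL does not match.
def fc2_diary_page_url(url)
  uri = URI.parse(url)
  return nil unless uri.host.to_s.match?(/\Adiary\d*\.fc2\.com\z/)

  match = FC2_DIARY_IMAGE_PATH.match(uri.path.to_s)
  return nil if match.nil?

  "http://#{uri.host}/cgi-sys/ed.cgi/#{match[:username]}" \
    "?Y=#{match[:year]}&M=#{match[:month]}&D=#{match[:day]}"
end

fc2_diary_page_url("http://diary.fc2.com/user/yuuri/img/2005_12/26.jpg")
# => "http://diary.fc2.com/cgi-sys/ed.cgi/yuuri?Y=2005&M=12&D=26"

fc2_diary_page_url("http://example.com/26.jpg")
# => nil (not convertible)
```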

Finally, this generates better page URLs in a handful of cases:

* https://www.artstation.com/artwork/qPVGP instead of https://anubis1982918.artstation.com/projects/qPVGP
* https://yande.re/post/show?md5=b4b1d11facd1700544554e4805d47bb6 instead of https://yande.re/post?tags=md5:b4b1d11facd1700544554e4805d47bb6
* http://gallery.minitokyo.net/view/365677 instead of http://gallery.minitokyo.net/download/365677
* https://valkyriecrusade.fandom.com/wiki/File:Crimson_Hatsune_H.png instead of https://valkyriecrusade.wikia.com/wiki/File:Crimson_Hatsune_H.png
* https://rule34.paheal.net/post/view/852405 instead of https://rule34.paheal.net/post/list/md5:854806addcd3b1246424e7cea49afe31/1
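
The first case in the list can be sketched as follows. This is an illustration only: the
helper name `artstation_page_url` is hypothetical, and the sketch assumes that the
`/projects/<id>` and `/artwork/<id>` forms on any artstation.com host refer to the same
work. The actual rules in `Source::URL` may differ.

```ruby
require "uri"

# Hypothetical helper: rewrite an ArtStation work URL to the canonical
# https://www.artstation.com/artwork/<id> form using string parsing only.
# Returns nil for URLs that do not look like a work page.
def artstation_page_url(url)
  uri = URI.parse(url)
  return nil unless uri.host.to_s.end_with?(".artstation.com")

  match = %r{\A/(?:projects|artwork)/(?<id>\w+)\z}.match(uri.path.to_s)
  return nil if match.nil?

  "https://www.artstation.com/artwork/#{match[:id]}"
end

artstation_page_url("https://anubis1982918.artstation.com/projects/qPVGP")
# => "https://www.artstation.com/artwork/qPVGP"
```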
evazion
2022-03-23 00:41:56 -05:00
parent 770f850c66
commit 3aa5cab2aa
59 changed files with 471 additions and 484 deletions

@@ -1,7 +1,7 @@
# frozen_string_literal: true
class Source::URL::Fc2 < Source::URL
-attr_reader :username, :profile_url
+attr_reader :username, :profile_url, :page_url
def self.match?(url)
url.domain.in?(%w[fc2.com fc2blog.net fc2blog.us])
@@ -48,6 +48,7 @@ class Source::URL::Fc2 < Source::URL
# http://blog.fc2.com/g/b/o/gbot/20071023195141.jpg
in (/^blog-imgs-\d+(-origin)?$/ | "blog"), "fc2", "com", /^\w$/, /^\w$/, /^\w$/, username, file
@username = username
+@page_url = "http://#{username}.blog.fc2.com/img/#{file}"
@profile_url = "http://#{username}.blog.fc2.com"
# http://diary.fc2.com/user/yuuri/img/2005_12/26.jpg
@@ -55,6 +56,9 @@ class Source::URL::Fc2 < Source::URL
# http://diary.fc2.com/user/kazuharoom/img/2015_5/22.jpg
in /diary\d*$/, "fc2", "com", "user", username, "img", date, file
@username = username
+@year, @month = date.split("_")
+@day = filename
+@page_url = "http://#{host}/cgi-sys/ed.cgi/#{username}?Y=#{@year}&M=#{@month}&D=#{@day}"
@profile_url = "http://diary.fc2.com/cgi-sys/ed.cgi/#{username}"
# http://diary.fc2.com/cgi-sys/ed.cgi/kazuharoom/?Y=2012&M=10&D=22