Commit Graph

13 Commits

Author SHA1 Message Date
evazion
23b8350320 sources: add image_url?, page_url?, and profile_url? methods.
Add methods to Source::URL for determining whether a URL is an image
URL, a page URL, or a profile URL.

Also add more source URL tests and fix various URL parsing bugs.
2022-05-01 21:01:36 -05:00
evazion
d9d3c1dfe4 sources: rename Sources::Strategies to Source::Extractor.
Rename Sources::Strategies to Source::Extractor. A Source::Extractor
represents a thing that extracts information from a given URL.
2022-03-24 03:49:44 -05:00
evazion
4ef8178bd1 sources: remove canonical_url method.
Refactor source strategies to remove the `canonical_url` method.

`canonical_url` returned the URL that should be used as the source of
the post after upload. Now we simply use `Source::URL#page_url` to
determine the source after upload. If the source is an image URL that is
convertible to a page URL, then the image URL is used as the source. If
the source is an image URL that is not convertible to a page URL, then
the page URL is used as the source.

This simplifies source strategies so that all they have to care about is
implementing the `Source::URL#page_url` and `Sources::Strategies#page_url`
methods, and the preferred source will be chosen for posts automatically.
2022-03-23 23:38:06 -05:00
evazion
3aa5cab2aa sources: refactor normalize_for_source.
`normalize_for_source` was used to convert image URLs to page URLs when displaying sources
on the post show page. Move all the code for converting image URLs to page URLs from
`Sources::Strategies#normalize_for_source` to `Source::URL#page_url`.

Before we had to be very careful in source strategies not to make any network calls in
`normalize_for_source`, since it was used in the view for the post show page. Now all the
code for generating page URLs is isolated in Source::URL, which makes source strategies
simpler. It also makes it easier to check if a source is an image URL or page URL, and if
the image URL is convertible to a page URL, which will make autotagging bad_link or
bad_source feasible.

Finally, this fixes it to generate better page URLs in a handful of cases:

* https://www.artstation.com/artwork/qPVGP instead of https://anubis1982918.artstation.com/projects/qPVGP
* https://yande.re/post/show?md5=b4b1d11facd1700544554e4805d47bb6s instead of https://yande.re/post?tags=md5:b4b1d11facd1700544554e4805d47bb6
* http://gallery.minitokyo.net/view/365677 instead of http://gallery.minitokyo.net/download/365677
* https://valkyriecrusade.fandom.com/wiki/File:Crimson_Hatsune_H.png instead of https://valkyriecrusade.wikia.com/wiki/File:Crimson_Hatsune_H.png
* https://rule34.paheal.net/post/view/852405 instead of https://rule34.paheal.net/post/list/md5:854806addcd3b1246424e7cea49afe31/1
2022-03-23 01:34:04 -05:00
evazion
a78b6528dc weibo: fix https://m.weibo.cn/detail/1234 urls not finding the artist.
Fix https://m.weibo.cn/detail/4506950043618873 type URLs not finding the
artist because the profile_url method returned nil instead of the actual
profile URL.
2022-03-18 06:01:51 -05:00
evazion
2f61486ac6 sources: remove image_url method from base strategy.
Remove the `image_url` method from source strategies. This method would
return only the first image if a source had multiple images. The
`image_urls` method should be used instead. Tests were the main place
that still used `image_url` instead of `image_urls`.

Also make post replacements return an error if replacing with a source
that contains multiple images, instead of just blindly replacing the
post with the first image in the source.
2022-03-11 01:59:21 -06:00
evazion
8bba1b0a54 weibo: add test for m.weibo.cn/detail urls. 2022-03-10 15:04:02 -06:00
nonamethanks
c9be77d1f8 Weibo: fix exception for deleted url 2022-03-09 05:31:38 +01:00
nonamethanks
d8e2f2ee33 sources: factor out Source::URL::Weibo
Additionally, fixed some broken tests and changed normalization for urls
of album type to point to the mobile version instead, because they're
only visible to logged-in users.
2022-03-07 16:52:43 +01:00
evazion
b7b6797b18 tests: skip weibo tests in CI.
Weibo tests keep failing randomly in CI. Give up and skip them to
prevent them from always marking commits as failed.
2021-09-20 02:17:31 -05:00
evazion
8dbeec4788 tests: fix unit tests. 2021-09-11 20:37:18 -05:00
nonamethanks
d339947647 Weibo: add source normalization 2020-05-28 01:05:11 +02:00
nonamethanks
5c7307a1c9 Add Weibo support 2020-05-27 11:30:05 +02:00