Commit Graph

10 Commits

Author SHA1 Message Date
evazion
4d009568fd Fix #5165: add support for weibo share urls 2022-08-26 01:12:23 -05:00
evazion
f46134e87f Fix #5234: Weibo URLs get normalized incorrectly in some cases. 2022-08-24 14:47:00 -05:00
evazion
23b8350320 sources: add image_url?, page_url?, and profile_url? methods.
Add methods to Source::URL for determining whether a URL is an image
URL, a page URL, or a profile URL.

Also add more source URL tests and fix various URL parsing bugs.
2022-05-01 21:01:36 -05:00
nonamethanks
e1b9166a56 Sources: do not use an empty else in case blocks 2022-04-22 03:53:18 +02:00
evazion
3aa5cab2aa sources: refactor normalize_for_source.
`normalize_for_source` was used to convert image URLs to page URLs when displaying sources
on the post show page. Move all the code for converting image URLs to page URLs from
`Sources::Strategies#normalize_for_source` to `Source::URL#page_url`.

Before we had to be very careful in source strategies not to make any network calls in
`normalize_for_source`, since it was used in the view for the post show page. Now all the
code for generating page URLs is isolated in Source::URL, which makes source strategies
simpler. It also makes it easier to check if a source is an image URL or page URL, and if
the image URL is convertible to a page URL, which will make autotagging bad_link or
bad_source feasible.

Finally, this fixes it to generate better page URLs in a handful of cases:

* https://www.artstation.com/artwork/qPVGP instead of https://anubis1982918.artstation.com/projects/qPVGP
* https://yande.re/post/show?md5=b4b1d11facd1700544554e4805d47bb6s instead of https://yande.re/post?tags=md5:b4b1d11facd1700544554e4805d47bb6
* http://gallery.minitokyo.net/view/365677 instead of http://gallery.minitokyo.net/download/365677
* https://valkyriecrusade.fandom.com/wiki/File:Crimson_Hatsune_H.png instead of https://valkyriecrusade.wikia.com/wiki/File:Crimson_Hatsune_H.png
* https://rule34.paheal.net/post/view/852405 instead of https://rule34.paheal.net/post/list/md5:854806addcd3b1246424e7cea49afe31/1
2022-03-23 01:34:04 -05:00
evazion
133c45ee29 sources: parse more profile url formats.
Add support for parsing these URL formats:

* https://www.artstation.com/felipecartin/profile
* https://www.deviantart.com/nlpsllp/gallery
* https://fantia.jp/asanagi
* https://www.lofter.com/front/blog/home-page/noshiqian
* https://www.lofter.com/app/xiaokonggedmx
* https://www.lofter.com/blog/semblance
* https://q.nicovideo.jp/users/18700356
* https://dic.nicovideo.jp/u/11141663
* https://3d.nicovideo.jp/users/109584
* https://3d.nicovideo.jp/u/siobi
* https://game.nicovideo.jp/atsumaru/users/7757217
* https://www.pixiv.net/user/13569921/series/81967
* https://pixiv.cc/zerousagi/
* https://www.plurk.com/u/ddks2923
* https://www.plurk.com/m/u/leiy1225
* https://www.plurk.com/s/u/salmonroe13
* https://www.plurk.com/RSSSww/invite/4
* https://skeb.jp/@okku_oxn/works
* https://www.tumblr.com/blog/view/artofelaineho/187614935612
* https://www.tumblr.com/blog/view/artofelaineho
* https://www.tumblr.com/blog/artofelaineho
* https://www.tumblr.com/dashboard/blog/dankwartart
* https://rosarrie.tumblr.com/archive
* https://whereisnovember.tumblr.com/tagged/art
* https://twitpic.com/photos/Type10TK
* https://www.weibo.com/detail/4676597657371957
* https://www.weibo.com/u/5957640693/home?wvr=5
* https://www.weibo.com/lvxiuzi0/home
2022-03-15 00:49:54 -05:00
evazion
1d9a15a119 weibo: handle a couple more profile url types.
Parse these profile URL types:

* https://www.weibo.cn/endlessnsmt
* https://www.weibo.com/p/1005055399876326

Also add anchors around the regexes so they have to match the full string.
2022-03-13 20:32:57 -05:00
evazion
9343f7c912 Source::URL: add profile_url method.
Add a method for converting a source URL into a profile URL. This will
be used for normalizing profile URLs in artist entries.

Also add the ability to parse a few more profile URL formats.
2022-03-13 03:54:17 -05:00
nonamethanks
93adba06e5 weibo: fix typo in strategy 2022-03-10 08:31:23 +01:00
nonamethanks
d8e2f2ee33 sources: factor out Source::URL::Weibo
Additionally, fixed some broken tests and changed normalization for urls
of album type to point to the mobile version instead, because they're
only visible to logged-in users.
2022-03-07 16:52:43 +01:00