evazion
3aa5cab2aa
sources: refactor normalize_for_source.
...
`normalize_for_source` was used to convert image URLs to page URLs when displaying sources
on the post show page. Move all the code for converting image URLs to page URLs from
`Sources::Strategies#normalize_for_source` to `Source::URL#page_url`.
Previously, source strategies had to be very careful not to make any network calls in
`normalize_for_source`, since it was called from the view for the post show page. Now all the
code for generating page URLs is isolated in `Source::URL`, which makes source strategies
simpler. It also makes it easier to check whether a source is an image URL or a page URL, and
whether an image URL can be converted to a page URL, which will make autotagging bad_link or
bad_source feasible.
Finally, this generates better page URLs in a handful of cases:
* https://www.artstation.com/artwork/qPVGP instead of https://anubis1982918.artstation.com/projects/qPVGP
* https://yande.re/post/show?md5=b4b1d11facd1700544554e4805d47bb6 instead of https://yande.re/post?tags=md5:b4b1d11facd1700544554e4805d47bb6
* http://gallery.minitokyo.net/view/365677 instead of http://gallery.minitokyo.net/download/365677
* https://valkyriecrusade.fandom.com/wiki/File:Crimson_Hatsune_H.png instead of https://valkyriecrusade.wikia.com/wiki/File:Crimson_Hatsune_H.png
* https://rule34.paheal.net/post/view/852405 instead of https://rule34.paheal.net/post/list/md5:854806addcd3b1246424e7cea49afe31/1
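
A minimal sketch of the idea, assuming page URLs can be derived from the URL string alone. `PageUrlSketch` and its patterns are hypothetical and cover only two of the cases listed above; this is not the actual `Source::URL#page_url` code. The point is that nothing here touches the network, so it is safe to call from a view.

```ruby
require "uri"

# Hypothetical illustration, not Danbooru's real implementation.
module PageUrlSketch
  # Returns a page URL string, or nil if the URL can't be converted.
  # Pure string parsing: no network calls are made.
  def self.page_url(url)
    uri = URI.parse(url)
    path = uri.path.to_s

    case uri.host
    when "gallery.minitokyo.net"
      id = path[%r{\A/(?:download|view)/(\d+)\z}, 1]
      "http://gallery.minitokyo.net/view/#{id}" if id
    when /\A(?:[a-z0-9-]+\.)?artstation\.com\z/
      id = path[%r{\A/(?:projects|artwork)/(\w+)\z}, 1]
      "https://www.artstation.com/artwork/#{id}" if id
    end
  end
end

puts PageUrlSketch.page_url("https://anubis1982918.artstation.com/projects/qPVGP")
puts PageUrlSketch.page_url("http://gallery.minitokyo.net/download/365677")
```

A `nil` result doubles as a cheap "not convertible" signal, which is the kind of check the bad_link and bad_source autotagging mentioned above would need.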
2022-03-23 01:34:04 -05:00
evazion
01b683798e
sources: add Tinami support.
2022-03-19 00:50:36 -05:00
evazion
40cbc0423c
sources: add Instagram profile url normalization.
2022-03-18 18:20:29 -05:00
evazion
03d2a86ef1
artists: normalize fc2.com profile urls.
2022-03-17 19:42:57 -05:00
evazion
9343f7c912
Source::URL: add profile_url method.
...
Add a method for converting a source URL into a profile URL. This will
be used for normalizing profile URLs in artist entries.
Also add the ability to parse a few more profile URL formats.
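
As a rough illustration of what such a method might look like, here is a sketch for one site. The class name `TwitterUrlSketch`, the reserved-name list, and the `example_user` URL are all assumptions made for the example; the real `profile_url` may be structured differently.

```ruby
require "uri"

# Hypothetical sketch of deriving a profile URL from any URL on a site.
class TwitterUrlSketch
  HOSTS = %w[twitter.com www.twitter.com mobile.twitter.com].freeze
  # First path segments that are not usernames (illustrative, not exhaustive).
  RESERVED = %w[home i intent search hashtag].freeze

  attr_reader :username

  def initialize(url)
    uri = URI.parse(url)
    first = uri.path.to_s.split("/").reject(&:empty?).first
    usable = HOSTS.include?(uri.host) && !RESERVED.include?(first)
    @username = usable ? first : nil
  end

  # Returns the normalized profile URL, or nil if no username was found.
  def profile_url
    "https://twitter.com/#{username}" if username
  end
end

puts TwitterUrlSketch.new("https://twitter.com/example_user/status/1234567890").profile_url
```

Normalizing every variant to one canonical form is what lets artist entries be compared by profile URL.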
2022-03-13 03:54:17 -05:00
evazion
787b5c8e27
sources: merge Sta.sh strategy into DeviantArt strategy.
...
This turns out to be a little simpler than keeping them separate. The
only special handling Sta.sh needs is to use the Sta.sh page when
we have a DeviantArt image with a Sta.sh referer.
2022-03-12 00:57:43 -06:00
evazion
28971fe103
sources: factor out site_name method.
2022-03-11 23:20:53 -06:00
nonamethanks
a6549bc6fe
Add Fantia support
...
Also fixes a regression in 74fdeef10c
that stopped Mastodon URLs from being given the right priority.
2022-03-10 17:43:32 +01:00
evazion
43a665a66d
sources: factor out Source::URL::NicoSeiga.
2022-03-10 04:53:51 -06:00
evazion
34854185be
sources: factor out Source::URL::DeviantArt and Source::URL::Stash.
2022-03-10 00:29:49 -06:00
evazion
bb4b8619f5
pixiv: fix Source::URL::Pixiv not being included in Source::URL list.
2022-03-09 01:14:09 -06:00
evazion
6afb2f8e3c
Merge pull request #5037 from nonamethanks/tumblr-refactor
...
sources: factor out Source::URL::Tumblr
2022-03-08 23:26:30 -06:00
evazion
df0bb70486
sources: factor out Source::URL::PixivSketch.
...
Add upload support for Pixiv Sketch. Fetch tags, commentary, and artist,
and rewrite sample images to full images.
Authentication isn't required. R18 images are hidden in the browser but
visible in the API.
2022-03-08 18:24:12 -06:00
nonamethanks
b9c7e467e5
sources: factor out Source::URL::Tumblr
...
Also adds support for fetching source data from direct image URLs when
possible.
2022-03-08 15:06:06 +01:00
nonamethanks
d8e2f2ee33
sources: factor out Source::URL::Weibo
...
Additionally, fixed some broken tests and changed normalization of
album-type URLs to point to the mobile version instead, because the
desktop versions are only visible to logged-in users.
2022-03-07 16:52:43 +01:00
evazion
1609059bf4
sources: factor out Source::URL::Fanbox.
...
Also fix it so that we grab the full image for cover URLs like this:
* Sample: https://pixiv.pximg.net/c/1620x580_90_a2_g5/fanbox/public/images/creator/1566167/cover/QqxYtuWdy4XWQx1ZLIqr4wvA.jpeg
* Full: https://pixiv.pximg.net/fanbox/public/images/creator/1566167/cover/QqxYtuWdy4XWQx1ZLIqr4wvA.jpeg
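
Judging from the two URLs above, the rewrite amounts to dropping the `/c/<size>` segment. A minimal sketch, assuming that segment is the only difference between sample and full URLs; `fanbox_full_image_url` is a hypothetical helper, not the strategy's real method:

```ruby
# Hypothetical helper: strip the "/c/<size spec>" segment from a Fanbox
# sample URL. URLs without that segment are returned unchanged.
def fanbox_full_image_url(url)
  url.sub(%r{\A(https://pixiv\.pximg\.net)/c/\w+(/fanbox/)}, '\1\2')
end

sample = "https://pixiv.pximg.net/c/1620x580_90_a2_g5/fanbox/public/images/creator/1566167/cover/QqxYtuWdy4XWQx1ZLIqr4wvA.jpeg"
puts fanbox_full_image_url(sample)
```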
2022-02-28 06:25:06 -06:00
evazion
317ec886bc
sources: factor out Source::URL::Nijie.
...
Also fixes the uploader uploading every image when the user tried to upload
only a single image from a multi-image work. This was caused by `image_urls`
incorrectly returning all images when the source strategy was given the URL
of a single image.
2022-02-27 02:27:35 -06:00
evazion
fcf517834d
sources: factor out Source::URL::ArtStation.
2022-02-26 21:03:49 -06:00
evazion
9169f00e80
sources: factor out Source::URL::Moebooru.
2022-02-26 17:46:44 -06:00
evazion
74fdeef10c
sources: factor out Source::URL::Mastodon.
2022-02-26 15:08:27 -06:00
evazion
86d8e2d13d
sources: factor out Source::URL::Lofter.
2022-02-25 23:43:10 -06:00
evazion
f062f2d145
sources: factor out Source::URL::Newgrounds.
...
Also fix it so that the image URL is set as the source for Newgrounds
posts, not the page URL. It's possible to generate the page URL from the
image URL (except for images after the first in multi-image posts).
* Page: https://www.newgrounds.com/art/view/natthelich/weaver
* Image: https://art.ngfiles.com/images/1520000/1520217_natthelich_weaver.jpg?f1606365031
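
The example pair above suggests the image filename has the form `<id>_<username>_<title>`. A sketch of the conversion under that assumption, and under the further assumption that neither the username nor the title contains an underscore; `newgrounds_page_url` is a hypothetical name:

```ruby
require "uri"

# Hypothetical helper: derive the page URL from a first-image URL.
# Returns nil for anything that doesn't match the assumed filename layout,
# such as later images in a multi-image post.
def newgrounds_page_url(image_url)
  uri = URI.parse(image_url)
  return nil unless uri.host == "art.ngfiles.com"

  m = uri.path.match(%r{\A/images/\d+/\d+_([a-z0-9-]+)_([a-z0-9-]+)\.\w+\z}i)
  m && "https://www.newgrounds.com/art/view/#{m[1]}/#{m[2]}"
end

puts newgrounds_page_url("https://art.ngfiles.com/images/1520000/1520217_natthelich_weaver.jpg?f1606365031")
```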
2022-02-25 23:04:03 -06:00
evazion
64472a7b7e
sources: factor out Source::URL::HentaiFoundry.
...
Add support for these URL types:
* http://pictures.hentai-foundry.com//s/soranamae/363663.jpg
* http://www.hentai-foundry.com/piccies/d/dmitrys/1183.jpg
* http://www.hentai-foundry.com/pic-149160.php
* http://www.hentai-foundry.com/user-RockCandy.php
* http://www.hentai-foundry.com/profile-sawao.php
These URL types are obsolete, but still present in some old posts.
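
One way these legacy formats could be recognized is sketched below. The patterns are inferred only from the five example URLs, and `parse_hentai_foundry_legacy` is a hypothetical helper rather than the actual `Source::URL::HentaiFoundry` code.

```ruby
require "uri"

# Hypothetical parser for the obsolete URL formats listed above.
# Returns a hash with whatever could be extracted, or nil if unrecognized.
def parse_hentai_foundry_legacy(url)
  uri = URI.parse(url)
  return nil unless uri.host.to_s.end_with?("hentai-foundry.com")

  case uri.path
  when %r{\A/+(?:piccies/)?\w/([^/]+)/(\d+)\.\w+\z}   # old image URLs
    { username: $1, work_id: $2 }
  when %r{\A/pic-(\d+)\.php\z}                        # old page URLs
    { work_id: $1 }
  when %r{\A/(?:user|profile)-([^/]+)\.php\z}         # old profile URLs
    { username: $1 }
  end
end

p parse_hentai_foundry_legacy("http://www.hentai-foundry.com/piccies/d/dmitrys/1183.jpg")
```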
2022-02-25 22:01:17 -06:00
evazion
e6ded89f85
sources: factor out Source::URL::Plurk.
...
Also fix it so that for adult works, we get the images posted by the
artist in the replies. Example: https://www.plurk.com/p/omc64y (nsfw).
2022-02-25 02:06:57 -06:00
evazion
26f4cf1ebd
sources: factor out Source::URL::Skeb.
2022-02-25 02:06:57 -06:00
evazion
ffe52f5ead
sources: factor out Source::URL::Foundation.
...
Add support for a couple more URL types:
* https://foundation.app/@asuka111art/dinner-with-cats-82426
* https://f8n-production-collection-assets.imgix.net/0x3B3ee1931Dc30C1957379FAc9aba94D1C48a5405/128711/QmcBfbeCMSxqYB3L1owPAxFencFx3jLzCPFx6xUBxgSCkH/nft.png
Also include these URLs in the list of profile URLs:
* https://foundation.app/0x7E2ef75C0C09b2fc6BCd1C68B6D409720CcD58d2 (for https://foundation.app/@mochiiimo )
These URLs should be stable even if the user changes their name.
2022-02-23 23:49:31 -06:00
evazion
043c08eb05
sources: factor out Source::URL::TwitPic.
2022-02-23 23:49:31 -06:00
evazion
7ed8f95a8e
sources: add Source::URL class; factor out Source::URL::Twitter.
...
Introduce a Source::URL class for parsing URLs from source sites. Refactor the Twitter
source strategy to use it.
This is the first step towards factoring all the URL parsing logic out of source
strategies and moving it to subclasses of Source::URL. Each site will have a subclass
of Source::URL dedicated to parsing URLs from that site. Source strategies will use
these classes to extract information from URLs.
This is to simplify source strategies. Most sites have many different URL formats we have
to parse or rewrite, and handling all these different cases tends to make source
strategies very complex. Isolating the URL parsing logic from the site scraping logic
should make source strategies easier to maintain.
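
The structure described above can be sketched roughly as follows. All names here (`SourceUrlSketch`, `SUBCLASSES`, `match?`, `status_id`) are assumptions made for illustration; the real `Source::URL` API may differ.

```ruby
require "uri"

# Hypothetical sketch: one parser class per site, chosen from an explicit list.
module SourceUrlSketch
  # Fallback for URLs that no site-specific subclass recognizes.
  class Null
    attr_reader :uri

    def initialize(uri)
      @uri = uri
    end

    def self.match?(_uri)
      true
    end

    def status_id
      nil
    end
  end

  # Knows how to parse Twitter URLs and nothing else; no scraping happens here.
  class Twitter < Null
    HOSTS = %w[twitter.com www.twitter.com mobile.twitter.com].freeze

    def self.match?(uri)
      HOSTS.include?(uri.host)
    end

    def status_id
      uri.path[%r{\A/\w+/status/(\d+)}, 1]
    end
  end

  # A site class missing from this list would silently fall back to Null.
  SUBCLASSES = [Twitter].freeze

  def self.parse(url)
    uri = URI.parse(url)
    klass = SUBCLASSES.find { |k| k.match?(uri) } || Null
    klass.new(uri)
  end
end

parsed = SourceUrlSketch.parse("https://twitter.com/example_user/status/1234567890")
puts parsed.class
puts parsed.status_id
```

A source strategy would then ask the parsed object for IDs and usernames instead of carrying its own regexes, leaving it responsible only for scraping.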
2022-02-23 23:46:04 -06:00