Commit Graph

16 Commits

Author SHA1 Message Date
evazion
e3af738371 tests: fix broken tests. 2022-08-24 02:03:37 -05:00
evazion
09dfab1f0d hentai foundry: update url for Hentai Foundry tags.
Change the URL used for Hentai Foundry tags from:

    https://www.hentai-foundry.com/search/index?query=elf&search_in=keywords

to:

    https://www.hentai-foundry.com/pictures/tagged/elf
2022-08-24 00:25:37 -05:00
evazion
2c36e02810 foundation.app: fix scraping of image urls.
Foundation changed their HTML page format and we can no longer scrape
the image URL directly from the page. Instead we have to build it based
on API data.
2022-08-24 00:25:37 -05:00
nonamethanks
2fd8e9bc14 Deviantart: fix regression in 3a0a32b98a 2022-06-04 20:26:14 +02:00
nonamethanks
3a0a32b98a Fix deviantart strategy to get biggest available size 2022-05-27 17:07:22 +02:00
nonamethanks
5b8402751c Furaffinity: fix uploads for non-ascii image urls
Use Addressable::URI, which supports non-ascii urls.
2022-05-09 18:38:38 +02:00
evazion
23b8350320 sources: add image_url?, page_url?, and profile_url? methods.
Add methods to Source::URL for determining whether a URL is an image
URL, a page URL, or a profile URL.

Also add more source URL tests and fix various URL parsing bugs.
2022-05-01 21:01:36 -05:00
nonamethanks
8edd5dd810 Add furaffinity support 2022-04-27 03:47:59 +02:00
evazion
76d9e86724 Fix #5140: Unexpected error: PublicSuffix::DomainInvalid for searching some newgrounds urls in /artists
When the artist name couldn't found for a Newgrounds URL, for example
for `https://www.newgrounds.com/dump/item`, then the `profile_url`
method erroneously returned `https://.newgrounds.com`. This led to an
error later on when the artist finder tried to parse the invalid URL.

Also fix `strategy_should_work` to test that the profile URL is a valid
URL, and not to try to download the file when image_urls is empty.
2022-04-22 23:16:41 -05:00
nonamethanks
e6cb255a7a Foundation: fix some video posts not being extracted
Also adjusts SourceTestHelper to not autogenerate contexts, so that
tests can be launched individually.
2022-04-21 17:54:22 +02:00
nonamethanks
c9227645d9 Add anifty.jp support 2022-04-18 16:50:26 +02:00
nonamethanks
9612578fcb Add Booth support 2022-04-16 17:52:18 +02:00
evazion
ca8083465b newgrounds: exclude links to other works in commentary.
Sometimes when a Newgrounds post is part of a set, there is a list of
links to other posts in the set in the artist's commentary. Exclude
these links because they're not really part of the commentary.

Example: https://www.newgrounds.com/art/view/boxofwant/annie-hughes-1 (NSFW)
2022-04-02 23:13:26 -05:00
evazion
6807ed7786 Fix #5077: Images rated "Adult" on Newgrounds no longer upload. 2022-04-02 17:55:29 -05:00
evazion
bfbc932025 Fix #5082: NoMethodError when searching an old-style dead fanbox url in artist urls.
This API call:

    # profile: https://www.pixiv.net/fanbox/creator/40684196
    curl -H "Origin: https://fanbox.cc" "https://api.fanbox.cc/creator.get?userId=40684196"

returns `{ "body": nil }` when the artist is deleted. We didn't expect `body` to be nil.

Also fix it so that `profile_url` returns the `https://www.pixiv.net/fanbox/creator/40684196`
URL if we can't get the `https://<username>.fanbox.cc` URL, usually because the API call failed
because the artist is deleted.
2022-03-30 18:19:08 -05:00
evazion
d9d3c1dfe4 sources: rename Sources::Strategies to Source::Extractor.
Rename Sources::Strategies to Source::Extractor. A Source::Extractor
represents a thing that extracts information from a given URL.
2022-03-24 03:49:44 -05:00