Commit Graph

130 Commits

Author SHA1 Message Date
evazion
c76e0bd4c1 gelbooru: fix normalization of old image URLs. 2022-10-30 17:26:43 -05:00
N. Oname
61112bc9a1 Merge pull request #5285 from nonamethanks/tests
Rewrite the tests for various source strategies
2022-10-23 18:05:54 +02:00
evazion
412b7f2727 http: split requests into internal and external requests.
Split requests made by Danbooru::Http into either internal or external
requests. Internal requests are API calls to internal services run by
Danbooru. External requests are requests to external websites, for
example fetching sources or downloading files. External requests may use
an HTTP proxy if one is configured. Internal requests don't.

Fixes a few source extractors not using the HTTP proxy for certain API calls.
2022-10-19 01:49:28 -05:00
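
The internal/external split described in the commit above can be sketched roughly as follows. This is only an illustration of the idea, not the actual `Danbooru::Http` interface: the class name `HttpClientSketch`, its `internal` and `external` methods, and the proxy address are all hypothetical.

```ruby
# Hypothetical sketch of splitting requests into internal and external ones.
# None of these names come from Danbooru::Http itself.
class HttpClientSketch
  attr_reader :proxy

  def initialize(proxy: nil)
    @proxy = proxy
  end

  # External requests go to third-party websites (fetching sources,
  # downloading files), so they keep the configured proxy, if any.
  def external
    self
  end

  # Internal requests go to services run by the site itself, so they
  # always bypass the proxy.
  def internal
    self.class.new(proxy: nil)
  end
end

client = HttpClientSketch.new(proxy: "http://proxy.example:3128")
client.external.proxy # the configured proxy
client.internal.proxy # nil
```

Under this sketch, a source extractor would make every API call through `external`, which is the kind of mistake the commit says it fixes for a few extractors.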
nonamethanks
72528bdcb1 Twitter: rewrite tests
Also add partial support for profile banner images.
2022-10-17 18:53:11 +02:00
evazion
115521906c tumblr: fix failure to upload new Tumblr URLs containing the post title.
Fix failure to upload Tumblr URLs of this form:

* https://www.tumblr.com/munespice/683613396085719040/saur-family
2022-10-13 21:11:07 -05:00
evazion
268ec9118a tumblr: fix failure to upload certain video posts.
Fix failure to upload Tumblr video posts that contained a video URL of this form:

* https://va.media.tumblr.com/tumblr_rjoh0hR8Xe1teimlz_720.mp4
2022-10-13 21:10:29 -05:00
evazion
a07234121d tumblr: fixup for parsing www.tumblr.com/name URLs. 2022-10-13 00:26:16 -05:00
evazion
2e7b3cd80b tumblr: normalize https://www.tumblr.com/name artist URLs. 2022-10-12 23:55:17 -05:00
evazion
eb8f98e4a6 artists: normalize foriio.com artist URLs.
Normalize `https://fori.io/comori22` to `https://www.foriio.com/comori22` in artist entries.
2022-10-12 23:46:50 -05:00
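
As a rough illustration of the normalization in the commit above, the following sketch rewrites a short `fori.io` link to the canonical `www.foriio.com` form. The helper name `normalize_foriio` and the list of accepted hosts are assumptions; the real logic lives in Danbooru's URL handling and may differ.

```ruby
require "uri"

# Hosts treated as Foriio profile links (an assumed list, for illustration).
FORIIO_HOSTS = %w[fori.io foriio.com www.foriio.com].freeze

# Hypothetical helper: returns the canonical profile URL, or the input
# unchanged when it is not a recognizable Foriio profile link.
def normalize_foriio(url)
  uri = URI.parse(url)
  return url unless FORIIO_HOSTS.include?(uri.host)

  # The first non-empty path segment is taken to be the username.
  username = uri.path.split("/").reject(&:empty?).first
  return url if username.nil?

  "https://www.foriio.com/#{username}"
end

normalize_foriio("https://fori.io/comori22")
# => "https://www.foriio.com/comori22"
```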
evazion
8fbc6d1d3a gelbooru: fix exception in md5-based post urls.
Fix exception when trying to get the image URL for sources like
https://gelbooru.com/index.php?page=post&s=list&md5=04f2767c64593c3030ce74ecc2528704.
2022-10-11 01:31:49 -05:00
evazion
f05268df7f sources: add Gelbooru support.
Add support for uploading posts from Gelbooru. Note that the translated
tags will include both the Gelbooru tags and the tags from the Gelbooru
post's source. The commentary and artist information will also be taken
from the Gelbooru post's source. The source of the Danbooru post however
will be left as the Gelbooru post itself, not as the Gelbooru post's source.
2022-10-11 00:06:45 -05:00
evazion
c2adf279ee ugoira: remove the PixivUgoiraFrameData model.
Remove the last remaining uses of the PixivUgoiraFrameData model. As of
32bfb8407, Ugoira frame data is now stored in the MediaMetadata model,
under the `Ugoira:FrameDelays` EXIF field.

The pixiv_ugoira_frame_data table still exists, but it can be removed
after this commit is deployed.

Fixes #5264: Error when replacing with ugoira.
2022-10-10 18:21:30 -05:00
nonamethanks
f4b14ba23e Mastodon: rewrite tests 2022-10-08 15:55:06 +02:00
nonamethanks
3c8e8ad8d9 Artstation: rewrite tests 2022-10-07 21:37:22 +02:00
nonamethanks
775326dc37 Tumblr: fix crash when uploading image links from custom domains 2022-10-01 00:26:29 +02:00
nonamethanks
1d7caf703c Lofter: support another theme and rewrite tests 2022-09-30 22:04:40 +02:00
nonamethanks
d51cc17eaf Nicoseiga: rewrite tests and fix several bugs
* Fixed a bug where manga posts with a single tag would raise an error
* Fixed a bug where dic.nicovideo.jp/oekaki posts weren't uploadable due
  to SSL issues
* Added support for more manga corner cases
2022-09-29 14:37:46 +02:00
nonamethanks
5051c6649d Tumblr: parse new dashboard links 2022-09-28 17:00:08 +02:00
evazion
361af6a4cb posts: rework post events page.
* Add a global /post_events page that shows the history of all approvals,
  disapprovals, flags, appeals, and replacements on a single page.

* Redesign the /posts/:id/events page to show all approval, disapproval,
  flag, appeal, and replacement events for a single post (before it only
  showed approvals, flags, and appeals).

* Remove the replacement history link from the post show page. Replacements
  are now included in the post events page (closes #4948: Highlighed replacements).

* Add /post_approvals/:id and /post_replacements/:id routes (these are
  used by the "Details" link on the post events page).
2022-09-24 20:12:41 -05:00
evazion
abf493794f twitter: fix misparsing of https://twitter.com/i/status/:id urls.
Fix URLs like `https://twitter.com/i/status/943446161586733056` parsing
the username as `i`. This led to the new artist page recommending the
tag name `i` when creating an artist for a source like this.

Also fix these URLs not being normalized to `https://twitter.com/:username/status/:id` after upload.
2022-09-15 19:57:12 -05:00
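
The parsing half of the fix above might look something like the sketch below, which treats `i` as a placeholder path segment and not as an account name. The regex and the `parse_twitter_status` helper are hypothetical and much simpler than a real URL parser; looking up the actual username for normalization is not shown.

```ruby
# Simplified, assumed pattern for Twitter status URLs.
TWITTER_STATUS = %r{\Ahttps?://(?:www\.|mobile\.)?twitter\.com/(?<username>\w+)/status/(?<id>\d+)}

# Hypothetical helper: returns the username and status id, or nil if the
# URL is not a status URL.
def parse_twitter_status(url)
  match = TWITTER_STATUS.match(url)
  return nil if match.nil?

  username = match[:username]
  # `i` is a generic path segment in /i/status/:id links, not a username,
  # so it must never be suggested as an artist tag.
  username = nil if username == "i"

  { username: username, status_id: match[:id] }
end

parse_twitter_status("https://twitter.com/i/status/943446161586733056")
# => { username: nil, status_id: "943446161586733056" }
```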
evazion
d2147eca80 tumblr: fix exception when fetching data for video urls.
Fix an exception when trying to fetch source data for URLs like
https://va.media.tumblr.com/tumblr_pgohk0TjhS1u7mrsl.mp4.

For these URLs it's not possible to use the trick where we try to open
the URL as an HTML page and scrape the post id from the HTML. Instead we
get the raw video if we try to do this.
2022-09-05 16:15:47 -05:00
evazion
f55951ab58 tumblr: fix exception when parsing mangled image urls.
Fix a nil exception when trying to parse invalid URLs like `https://25.media.tumblr.com/91719d337b218681abc48cdc24e`.
2022-09-05 16:15:46 -05:00
evazion
2b76a4c5ba tumblr: fix exception when parsing subdomainless Tumblr URLs.
Fix exception when a post has a Tumblr source without a subdomain, such
as `https://tumblr.com`.
2022-08-30 01:52:55 -05:00
evazion
f7794de0b7 weibo: fix bad artist name suggestions in new artist form.
Fix the new artist form suggesting invalid Chinese tag names for Weibo
artists. Suggest `weibo_123456` instead as a placeholder.
2022-08-26 01:25:05 -05:00
evazion
4d009568fd Fix #5165: add support for weibo share urls 2022-08-26 01:12:23 -05:00
evazion
600bdc9ae6 pixiv: drop support for https://tc-pximg01.techorus-cdn.com urls.
This was an obsolete URL format briefly used by Pixiv around 2019-2020.
There were only ~80 posts with sources using this format. They have been
manually fixed.
2022-08-24 15:54:10 -05:00
evazion
bf3ee9cfb8 Fix #5238: Trying to upload a pixiv direct image url that got trumped by a revision redirects to the new post if it's uploaded.
Bug: When uploading a direct Pixiv image URL, we ignored it in favor of the
image URL returned by the Pixiv API. This meant if you tried to upload the
original version of a revised image, we would get the revised version instead.

Fix: When given a direct Pixiv image URL, use it as-is if it's a full
image URL. If it's a sample image URL, ignore it in favor of the full image
URL as returned by the API, unless the post is deleted and the API data
is unavailable.
2022-08-24 15:40:04 -05:00
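
The decision described under "Fix" above can be sketched as a small function. The helper names are hypothetical, and the `/img-original/` check is an assumed way of telling a full image URL from a sample; the example URLs are made up for illustration.

```ruby
# Assumption: full-size Pixiv image URLs contain "/img-original/".
def full_image?(url)
  url.include?("/img-original/")
end

# given_url: the direct image URL the user submitted.
# api_url:   the full image URL reported by the API, or nil when the post
#            is deleted and API data is unavailable.
def choose_pixiv_image_url(given_url, api_url)
  # A full image URL is used as-is, so the original version of a revised
  # image is not silently replaced by the revision.
  return given_url if full_image?(given_url)

  # A sample URL is replaced by the API's full image URL when available.
  api_url || given_url
end

full   = "https://i.pximg.net/img-original/img/2022/01/01/00/00/00/12345_p0.png"
sample = "https://i.pximg.net/img-master/img/2022/01/01/00/00/00/12345_p0_master1200.jpg"

choose_pixiv_image_url(full, "https://api.example/revised.png")   # => full
choose_pixiv_image_url(sample, "https://api.example/revised.png") # => the API URL
choose_pixiv_image_url(sample, nil)                               # => sample
```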
evazion
f46134e87f Fix #5234: Weibo URLs get normalized incorrectly in some cases. 2022-08-24 14:47:00 -05:00
evazion
e3af738371 tests: fix broken tests. 2022-08-24 02:03:37 -05:00
evazion
09dfab1f0d hentai foundry: update url for Hentai Foundry tags.
Change the URL used for Hentai Foundry tags from:

    https://www.hentai-foundry.com/search/index?query=elf&search_in=keywords

to:

    https://www.hentai-foundry.com/pictures/tagged/elf
2022-08-24 00:25:37 -05:00
evazion
2c36e02810 foundation.app: fix scraping of image urls.
Foundation changed their HTML page format and we can no longer scrape
the image URL directly from the page. Instead we have to build it based
on API data.
2022-08-24 00:25:37 -05:00
evazion
228850b749 newgrounds: support parsing video urls.
Fixes URLs like `https://www.newgrounds.com/portal/view/830293` being treated as bad_source.
2022-08-23 13:39:32 -05:00
evazion
9c2d362e93 tumblr: fix misparsing of image urls.
Fix URLs like https://yogurtmedia.tumblr.com/post/45732863347 being
misparsed as image urls.
2022-08-20 21:20:46 -05:00
evazion
9cab67c0ac artstation: fix parsing of reserved usernames. 2022-07-06 16:00:54 -05:00
evazion
7149845677 Merge pull request #5202 from nonamethanks/fix-nicoseiga-oekaki-bad-tag
Nicoseiga: normalize oekaki links
2022-06-05 15:56:37 -05:00
nonamethanks
e7584c7e0a Nicoseiga: normalize oekaki links 2022-06-04 22:57:54 +02:00
nonamethanks
2fd8e9bc14 Deviantart: fix regression in 3a0a32b98a 2022-06-04 20:26:14 +02:00
nonamethanks
3a0a32b98a Fix deviantart strategy to get biggest available size 2022-05-27 17:07:22 +02:00
evazion
6b54415c47 Merge pull request #5170 from nonamethanks/fix-fc2-bad-source
Fc2: don't mark valid blog page sources as bad_source
2022-05-16 15:12:07 -05:00
nonamethanks
dcbb2216aa Fc2: don't mark valid blog page sources as bad_source 2022-05-15 18:46:50 +02:00
evazion
80f3778616 Merge pull request #5159 from nonamethanks/fix-furaffinity-ascii-urls
Furaffinity: fix uploads for non-ascii image urls
2022-05-09 14:16:32 -05:00
nonamethanks
5b8402751c Furaffinity: fix uploads for non-ascii image urls
Use Addressable::URI, which supports non-ascii urls.
2022-05-09 18:38:38 +02:00
evazion
c07b099bf8 Fix #5152: Nicovideo video urls getting bad_source. 2022-05-03 03:59:15 -05:00
evazion
2d9bba4abb posts: automatically add the bad_link and bad_source tags.
Automatically add the bad_link tag when the source is an image url from
a known site, but it can't be converted to a page url (for example, a
Twitter or Tumblr direct image link).

Automatically add the bad_source tag when the source is from a known
site, but it's not an image or page url (for example, a Twitter or Pixiv
profile url).
2022-05-01 21:01:36 -05:00
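
The two tagging rules in the commit above can be summarized in a short sketch. The function name, its keyword arguments, and the `:image`/`:page`/`:profile` symbols are hypothetical stand-ins for however the real code classifies a source URL.

```ruby
# Hypothetical helper: returns the tag to add automatically, or nil.
#
# known_site:           whether the source belongs to a supported site
# kind:                 :image, :page, or anything else (e.g. :profile)
# page_url_recoverable: whether an image URL can be converted to a page URL
def automatic_source_tag(known_site:, kind:, page_url_recoverable: false)
  return nil unless known_site

  case kind
  when :page
    nil # a proper page URL needs no tag
  when :image
    # A direct image link that can't be traced back to its page.
    page_url_recoverable ? nil : "bad_link"
  else
    # Known site, but neither an image nor a page URL (e.g. a profile).
    "bad_source"
  end
end

automatic_source_tag(known_site: true, kind: :image)   # => "bad_link"
automatic_source_tag(known_site: true, kind: :profile) # => "bad_source"
automatic_source_tag(known_site: true, kind: :page)    # => nil
```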
evazion
23b8350320 sources: add image_url?, page_url?, and profile_url? methods.
Add methods to Source::URL for determining whether a URL is an image
URL, a page URL, or a profile URL.

Also add more source URL tests and fix various URL parsing bugs.
2022-05-01 21:01:36 -05:00
nonamethanks
8edd5dd810 Add furaffinity support 2022-04-27 03:47:59 +02:00
evazion
76d9e86724 Fix #5140: Unexpected error: PublicSuffix::DomainInvalid for searching some newgrounds urls in /artists
When the artist name couldn't be found for a Newgrounds URL, for example
for `https://www.newgrounds.com/dump/item`, then the `profile_url`
method erroneously returned `https://.newgrounds.com`. This led to an
error later on when the artist finder tried to parse the invalid URL.

Also fix `strategy_should_work` to test that the profile URL is a valid
URL, and not to try to download the file when image_urls is empty.
2022-04-22 23:16:41 -05:00
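
The bug described above comes down to interpolating a missing artist name into the subdomain. A minimal sketch of the guarded version, using a hypothetical `newgrounds_profile_url` helper and not the project's actual method, might be:

```ruby
# Hypothetical helper: builds the profile URL from an artist name.
# Without the guard, a nil or empty name would interpolate to
# "https://.newgrounds.com", which is not a valid URL.
def newgrounds_profile_url(username)
  return nil if username.nil? || username.empty?

  "https://#{username}.newgrounds.com"
end

newgrounds_profile_url(nil)       # => nil
newgrounds_profile_url("example") # => "https://example.newgrounds.com"
```

Returning `nil` lets callers such as an artist finder skip the lookup entirely, so they never try to parse an invalid URL.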
evazion
90182148aa Merge pull request #5137 from nonamethanks/foundation-videos
Foundation: fix some video posts not being extracted
2022-04-22 01:50:26 -05:00
evazion
57a92ad336 Fix #5072: Fandom source normalization is wrong 2022-04-22 01:27:17 -05:00
evazion
40dda8a672 Merge pull request #5138 from nonamethanks/fix-fandom-links
Fix normalization for fandom sources
2022-04-22 00:36:11 -05:00