danbooru

Author	SHA1	Message	Date
evazion	8fbc6d1d3a	gelbooru: fix exception in md5-based post urls. Fix exception when trying to get the image URL for sources like https://gelbooru.com/index.php?page=post&s=list&md5=04f2767c64593c3030ce74ecc2528704.	2022-10-11 01:31:49 -05:00
evazion	f05268df7f	sources: add Gelbooru support. Add support for uploading posts from Gelbooru. Note that the translated tags will include both the Gelbooru tags and the tags from the Gelbooru post's source. The commentary and artist information will also be taken from the Gelbooru post's source. The source of the Danbooru post however will be left as the Gelbooru post itself, not as the Gelbooru post's source.	2022-10-11 00:06:45 -05:00
evazion	c2adf279ee	ugoira: remove the PixivUgoiraFrameData model. Remove the last remaining uses of the PixivUgoiraFrameData model. As of `32bfb8407`, Ugoira frame data is now stored in the MediaMetadata model, under the `Ugoira:FrameDelays` EXIF field. The pixiv_ugoira_frame_data table still exists, but it can be removed after this commit is deployed. Fixes #5264: Error when replacing with ugoira.	2022-10-10 18:21:30 -05:00
nonamethanks	775326dc37	Tumblr: fix crash when uploading image links from custom domains	2022-10-01 00:26:29 +02:00
nonamethanks	1d7caf703c	Lofter: support another theme and rewrite tests	2022-09-30 22:04:40 +02:00
nonamethanks	d51cc17eaf	Nicoseiga: rewrite tests and fix several bugs * Fixed a bug where manga posts with a single tag would raise an error * Fixed a bug where dic.nicovideo.jp/oekaki posts weren't uploadable due to SSL issues * Added support for more manga corner cases	2022-09-29 14:37:46 +02:00
nonamethanks	5051c6649d	Tumblr: parse new dashboard links	2022-09-28 17:00:08 +02:00
evazion	fc122cbc5a	tests: fix broken tests.	2022-09-24 03:49:10 -05:00
evazion	1d2bac7b95	Remove CurrentUser.ip_addr. Remove the `CurrentUser.ip_addr` global variable and replace it with `request.remote_ip`. Before, we had to track the current user's IP in a global variable so that when we edited a post, for example, we could pass down the user's IP to the model and save it in the post_versions table. Now that we no longer save IPs in version tables, we don't need a global variable to get access to the current user's IP outside of controllers.	2022-09-18 05:02:10 -05:00
evazion	abf493794f	twitter: fix misparsing of https://twitter.com/i/status/:id urls. Fix URLs like `https://twitter.com/i/status/943446161586733056` parsing the username as `i`. This led to the new artist page recommending the tag name `i` when creating an artist for a source like this. Also fix these URLs not being normalized to `https://twitter.com/:username/status/:id` after upload.	2022-09-15 19:57:12 -05:00
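
The `/i/status/:id` fix above can be illustrated with a minimal sketch. This is not Danbooru's actual `Source::URL` code; the helper name `parse_twitter_status` and the returned hash shape are assumptions made for illustration only.

```ruby
require "uri"

# Hypothetical helper: returns { username:, status_id: } for a Twitter
# status URL, or nil if the URL is not a status URL.
def parse_twitter_status(url)
  uri = URI.parse(url)
  return nil unless uri.host.to_s.match?(/\A(?:www\.|mobile\.)?twitter\.com\z/i)

  username, kind, id = uri.path.to_s.split("/").reject(&:empty?)
  return nil unless kind == "status" && id.to_s.match?(/\A\d+\z/)

  # `/i/status/:id` is a username-less form, so `i` is not an account name.
  username = nil if username == "i"
  { username: username, status_id: id }
end
```

With the username left as `nil` for `/i/status/:id` links, a caller has no reason to suggest `i` as an artist tag name.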
nonamethanks	425a905b83	tests: update tumblr tests	2022-09-15 09:48:28 +02:00
evazion	d2147eca80	tumblr: fix exception when fetching data for video urls. Fix an exception when trying to fetch source data for URLs like https://va.media.tumblr.com/tumblr_pgohk0TjhS1u7mrsl.mp4. For these URLs it's not possible to use the trick where we try to open the URL as an HTML page and scrape the post id from the HTML. Instead we get the raw video if we try to do this.	2022-09-05 16:15:47 -05:00
evazion	f55951ab58	tumblr: fix exception when parsing mangled image urls. Fix a nil exception when trying to parse invalid URLs like `https://25.media.tumblr.com/91719d337b218681abc48cdc24e`.	2022-09-05 16:15:46 -05:00
evazion	2b76a4c5ba	tumblr: fix exception when parsing subdomainless Tumblr URLs. Fix exception when a post has a Tumblr source without a subdomain, such as `https://tumblr.com`.	2022-08-30 01:52:55 -05:00
evazion	f7794de0b7	weibo: fix bad artist name suggestions in new artist form. Fix the new artist form suggesting invalid Chinese tag names for Weibo artists. Suggest `weibo_123456` instead as a placeholder.	2022-08-26 01:25:05 -05:00
evazion	4d009568fd	Fix #5165 : add support for weibo share urls	2022-08-26 01:12:23 -05:00
evazion	600bdc9ae6	pixiv: drop support for https://tc-pximg01.techorus-cdn.com urls. This was an obsolete URL format briefly used by Pixiv around 2019-2020. There were only ~80 posts with sources using this format. They have been manually fixed.	2022-08-24 15:54:10 -05:00
evazion	bf3ee9cfb8	Fix #5238 : Trying to upload a pixiv direct image url that got trumped by a revision redirects to the new post if it's uploaded. Bug: When uploading a direct Pixiv image URL, we ignored it in favor of the image URL returned by the Pixiv API. This meant if you tried to upload the original version of a revised image, we would get the revised version instead. Fix: When given a direct Pixiv image URL, use it as-is if it's a full image URL. If it's a sample image URL, ignore it in favor of the full image URL as returned by the API, unless the post is deleted and the API data is unavailable.	2022-08-24 15:40:04 -05:00
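
The selection rule described in the fix for #5238 can be sketched roughly as follows. The function names are hypothetical, and detecting a full image by the `/img-original/` path segment is an assumption about Pixiv's URL layout, not something stated in the commit.

```ruby
# Assumption: full-size Pixiv image URLs contain "/img-original/",
# while sample URLs do not.
def full_image_url?(url)
  url.include?("/img-original/")
end

# given_url:    the direct image URL the user tried to upload
# api_full_url: the full image URL reported by the API, or nil when the
#               post is deleted and API data is unavailable
def choose_pixiv_image_url(given_url, api_full_url)
  # A full image URL is used as-is, so an older revision stays reachable.
  return given_url if full_image_url?(given_url)

  # A sample URL is replaced by the API's full URL when one exists.
  api_full_url || given_url
end
```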
evazion	f46134e87f	Fix #5234 : Weibo URLs get normalized incorrectly in some cases.	2022-08-24 14:47:00 -05:00
evazion	e3af738371	tests: fix broken tests.	2022-08-24 02:03:37 -05:00
evazion	09dfab1f0d	hentai foundry: update url for Hentai Foundry tags. Change the URL used for Hentai Foundry tags from: https://www.hentai-foundry.com/search/index?query=elf&search_in=keywords to: https://www.hentai-foundry.com/pictures/tagged/elf	2022-08-24 00:25:37 -05:00
evazion	2c36e02810	foundation.app: fix scraping of image urls. Foundation changed their HTML page format and we can no longer scrape the image URL directly from the page. Instead we have to build it based on API data.	2022-08-24 00:25:37 -05:00
evazion	228850b749	newgrounds: support parsing video urls. Fixes URLs like `https://www.newgrounds.com/portal/view/830293` being treated as bad_source.	2022-08-23 13:39:32 -05:00
nonamethanks	2fd8e9bc14	Deviantart: fix regression in `3a0a32b98a`	2022-06-04 20:26:14 +02:00
nonamethanks	3a0a32b98a	Fix deviantart strategy to get biggest available size	2022-05-27 17:07:22 +02:00
evazion	80f3778616	Merge pull request #5159 from nonamethanks/fix-furaffinity-ascii-urls Furaffinity: fix uploads for non-ascii image urls	2022-05-09 14:16:32 -05:00
nonamethanks	5b8402751c	Furaffinity: fix uploads for non-ascii image urls Use Addressable::URI, which supports non-ascii urls.	2022-05-09 18:38:38 +02:00
evazion	c07b099bf8	Fix #5152 : Nicovideo video urls getting bad_source.	2022-05-03 03:59:15 -05:00
evazion	23b8350320	sources: add image_url?, page_url?, and profile_url? methods. Add methods to Source::URL for determining whether a URL is an image URL, a page URL, or a profile URL. Also add more source URL tests and fix various URL parsing bugs.	2022-05-01 21:01:36 -05:00
nonamethanks	8edd5dd810	Add furaffinity support	2022-04-27 03:47:59 +02:00
evazion	76d9e86724	Fix #5140 : Unexpected error: PublicSuffix::DomainInvalid for searching some newgrounds urls in /artists When the artist name couldn't be found for a Newgrounds URL, for example for `https://www.newgrounds.com/dump/item`, then the `profile_url` method erroneously returned `https://.newgrounds.com`. This led to an error later on when the artist finder tried to parse the invalid URL. Also fix `strategy_should_work` to test that the profile URL is a valid URL, and not to try to download the file when image_urls is empty.	2022-04-22 23:16:41 -05:00
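
The `https://.newgrounds.com` bug is the usual result of interpolating a missing name into a URL template. A minimal sketch of the guard, using a hypothetical standalone function rather than the real `profile_url` method:

```ruby
# Hypothetical helper: build a Newgrounds profile URL, or return nil when
# no artist name could be extracted from the source URL.
def newgrounds_profile_url(username)
  return nil if username.nil? || username.empty?

  "https://#{username}.newgrounds.com"
end
```

Returning `nil` instead of a malformed URL lets later stages skip the profile lookup rather than fail while parsing it.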
evazion	90182148aa	Merge pull request #5137 from nonamethanks/foundation-videos Foundation: fix some video posts not being extracted	2022-04-22 01:50:26 -05:00
evazion	57a92ad336	Fix #5072 : Fandom source normalization is wrong	2022-04-22 01:27:17 -05:00
nonamethanks	3b055138ff	Fix normalization for fandom sources	2022-04-22 03:27:05 +02:00
nonamethanks	e6cb255a7a	Foundation: fix some video posts not being extracted Also adjusts SourceTestHelper to not autogenerate contexts, so that tests can be launched individually.	2022-04-21 17:54:22 +02:00
nonamethanks	c9227645d9	Add anifty.jp support	2022-04-18 16:50:26 +02:00
nonamethanks	9612578fcb	Add Booth support	2022-04-16 17:52:18 +02:00
Lily	9dde90ef94	fix broken assertion in nijie test	2022-04-09 12:32:45 -03:00
evazion	ca8083465b	newgrounds: exclude links to other works in commentary. Sometimes when a Newgrounds post is part of a set, there is a list of links to other posts in the set in the artist's commentary. Exclude these links because they're not really part of the commentary. Example: https://www.newgrounds.com/art/view/boxofwant/annie-hughes-1 (NSFW)	2022-04-02 23:13:26 -05:00
evazion	bfbc932025	Fix #5082 : NoMethodError when searching an old-style dead fanbox url in artist urls. This API call: # profile: https://www.pixiv.net/fanbox/creator/40684196 curl -H "Origin: https://fanbox.cc" "https://api.fanbox.cc/creator.get?userId=40684196" returns `{ "body": nil }` when the artist is deleted. We didn't expect `body` to be nil. Also fix it so that `profile_url` returns the `https://www.pixiv.net/fanbox/creator/40684196` URL if we can't get the `https://<username>.fanbox.cc` URL, usually because the API call failed because the artist is deleted.	2022-03-30 18:19:08 -05:00
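
The nil-body handling and the fallback to the old-style URL could look roughly like this. The helper name is hypothetical, and the `creatorId` field name is an assumption about the API response shape; only the null `body` case is taken from the commit message.

```ruby
require "json"

# response_json: raw JSON text returned by the creator lookup
# fallback_url:  the old-style profile URL to use when the lookup fails
def fanbox_profile_url(response_json, fallback_url)
  body = JSON.parse(response_json)["body"]

  # `body` is null when the artist is deleted, so never index into it blindly.
  creator_id = body.is_a?(Hash) ? body["creatorId"] : nil

  creator_id ? "https://#{creator_id}.fanbox.cc" : fallback_url
end
```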
evazion	d9d3c1dfe4	sources: rename Sources::Strategies to Source::Extractor. Rename Sources::Strategies to Source::Extractor. A Source::Extractor represents a thing that extracts information from a given URL.	2022-03-24 03:49:44 -05:00
evazion	34aa22f90b	sources: fix fandom.com page urls. Fix it so that sources like this: * https://vignette.wikia.nocookie.net/valkyriecrusade/images/c/c5/Crimson_Hatsune_H.png/revision/latest?cb=20180702031954 link to this: * https://valkyriecrusade.fandom.com/?file=Crimson_Hatsune_H.png instead of this * https://valkyriecrusade.fandom.com/wiki/File:Crimson_Hatsune_H.png The `/wiki/File:$name` URL redirects to whatever wiki page contains the image instead of showing the file itself.	2022-03-23 23:38:06 -05:00
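
The conversion from a `vignette.wikia.nocookie.net` image URL to the `?file=` page URL can be sketched as below. The helper name is hypothetical and the path layout is inferred only from the single example in the commit message, so other URL shapes are not handled.

```ruby
require "uri"

# Hypothetical helper; not the actual Source::URL implementation.
def fandom_file_page_url(image_url)
  uri = URI.parse(image_url)
  return nil unless uri.host == "vignette.wikia.nocookie.net"

  # Path layout assumed from the example: /:wiki/images/:a/:ab/:filename/...
  wiki, images, _dir1, _dir2, filename = uri.path.split("/").reject(&:empty?)
  return nil unless images == "images" && filename

  "https://#{wiki}.fandom.com/?file=#{filename}"
end
```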
evazion	5941c47b79	nicoseiga: support a few more url types.	2022-03-23 23:38:06 -05:00
evazion	c07c5ea594	nicoseiga: fix page_url method not to return seiga.nicovideo.jp/image/source/:id urls. Fix the page_url method not to return URLs like this: https://seiga.nicovideo.jp/image/source/8017978 (page: https://seiga.nicovideo.jp/watch/mg310193) These are direct image URLs, not page URLs. It's not generally possible to get to the page URL from an image URL like this. This fixes it so that we don't incorrectly set the source of NicoSeiga uploads to the image URL.	2022-03-23 23:38:06 -05:00
evazion	4ef8178bd1	sources: remove `canonical_url` method. Refactor source strategies to remove the `canonical_url` method. `canonical_url` returned the URL that should be used as the source of the post after upload. Now we simply use `Source::URL#page_url` to determine the source after upload. If the source is an image URL that is convertible to a page URL, then the page URL is used as the source. If the source is an image URL that is not convertible to a page URL, then the image URL is used as the source. This simplifies source strategies so that all they have to care about is implementing the `Source::URL#page_url` and `Sources::Strategies#page_url` methods, and the preferred source will be chosen for posts automatically.	2022-03-23 23:38:06 -05:00
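
A minimal sketch of the source selection this refactor aims for: prefer the page URL when one can be derived, otherwise keep the image URL. `ParsedURL` and `post_source` are hypothetical stand-ins, not the real `Source::URL` API, and the example URLs are placeholders.

```ruby
# Hypothetical stand-in for a parsed source URL. `page_url` is nil when
# the image URL cannot be converted to a page URL.
ParsedURL = Struct.new(:original, :page_url)

# Pick the URL to store as the post's source after upload.
def post_source(parsed)
  parsed.page_url || parsed.original
end

convertible = ParsedURL.new("https://img.example.com/123.jpg", "https://example.com/post/123")
opaque      = ParsedURL.new("https://img.example.com/abcdef.jpg", nil)

post_source(convertible) # the page URL
post_source(opaque)      # the original image URL
```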
evazion	3aa5cab2aa	sources: refactor normalize_for_source. `normalize_for_source` was used to convert image URLs to page URLs when displaying sources on the post show page. Move all the code for converting image URLs to page URLs from `Sources::Strategies#normalize_for_source` to `Source::URL#page_url`. Before we had to be very careful in source strategies not to make any network calls in `normalize_for_source`, since it was used in the view for the post show page. Now all the code for generating page URLs is isolated in Source::URL, which makes source strategies simpler. It also makes it easier to check if a source is an image URL or page URL, and if the image URL is convertible to a page URL, which will make autotagging bad_link or bad_source feasible. Finally, this fixes it to generate better page URLs in a handful of cases: * https://www.artstation.com/artwork/qPVGP instead of https://anubis1982918.artstation.com/projects/qPVGP * https://yande.re/post/show?md5=b4b1d11facd1700544554e4805d47bb6s instead of https://yande.re/post?tags=md5:b4b1d11facd1700544554e4805d47bb6 * http://gallery.minitokyo.net/view/365677 instead of http://gallery.minitokyo.net/download/365677 * https://valkyriecrusade.fandom.com/wiki/File:Crimson_Hatsune_H.png instead of https://valkyriecrusade.wikia.com/wiki/File:Crimson_Hatsune_H.png * https://rule34.paheal.net/post/view/852405 instead of https://rule34.paheal.net/post/list/md5:854806addcd3b1246424e7cea49afe31/1	2022-03-23 01:34:04 -05:00
evazion	452ce8d165	artstation: add partial support for video clips (#5063 ). Add partial support for fetching videos from ArtStation posts that contain videos. Most of this code is disabled for now because actually downloading these videos requires bypassing a Cloudflare captcha.	2022-03-21 16:51:42 -05:00
evazion	7394660ba9	posts: fix exception when post has source like 'https://www.twitter.com/username '. `twitter.com` sources worked but `www.twitter.com` didn't. Also match the URL by class instead of by site name to ensure we match the expected class.	2022-03-20 21:08:05 -05:00
evazion	7f58cfbe5e	tinami: get the full image. Support grabbing the full image for Tinami uploads, rather than the sample. Getting the full image requires making a request like this: curl -X POST \ -H 'Referer: https://www.tinami.com/' \ -H 'Content-Type: application/x-www-form-urlencoded' \ -H 'Cookie: Tinami2SESSID=<redacted>;' \ --data-raw 'action_view_original=true&cont_id=1087268&ethna_csrf=<redacted>' \ https://www.tinami.com/view/1087268 Then scraping the <img> tag from the resulting HTML page. If the post has multiple images, then we need to scrape and pass the `sub_id` of the image too. Fixes #2818.	2022-03-19 23:22:09 -05:00
evazion	01b683798e	sources: add Tinami support.	2022-03-19 00:50:36 -05:00

402 Commits