danbooru

Author	SHA1	Message	Date
nonamethanks	79a9081efa	Moebooru: rewrite tests	2022-10-08 16:10:39 +02:00
evazion	23b8350320	sources: add image_url?, page_url?, and profile_url? methods. Add methods to Source::URL for determining whether a URL is an image URL, a page URL, or a profile URL. Also add more source URL tests and fix various URL parsing bugs.	2022-05-01 21:01:36 -05:00
evazion	d9d3c1dfe4	sources: rename Sources::Strategies to Source::Extractor. Rename Sources::Strategies to Source::Extractor. A Source::Extractor represents a thing that extracts information from a given URL.	2022-03-24 03:49:44 -05:00
evazion	4ef8178bd1	sources: remove `canonical_url` method. Refactor source strategies to remove the `canonical_url` method. `canonical_url` returned the URL that should be used as the source of the post after upload. Now we simply use `Source::URL#page_url` to determine the source after upload. If the source is an image URL that is convertible to a page URL, then the page URL is used as the source. If the source is an image URL that is not convertible to a page URL, then the image URL is used as the source. This simplifies source strategies so that all they have to care about is implementing the `Source::URL#page_url` and `Sources::Strategies#page_url` methods, and the preferred source will be chosen for posts automatically.	2022-03-23 23:38:06 -05:00
evazion	3aa5cab2aa	sources: refactor normalize_for_source. `normalize_for_source` was used to convert image URLs to page URLs when displaying sources on the post show page. Move all the code for converting image URLs to page URLs from `Sources::Strategies#normalize_for_source` to `Source::URL#page_url`. Before, we had to be very careful in source strategies not to make any network calls in `normalize_for_source`, since it was used in the view for the post show page. Now all the code for generating page URLs is isolated in Source::URL, which makes source strategies simpler. It also makes it easier to check if a source is an image URL or page URL, and if the image URL is convertible to a page URL, which will make autotagging bad_link or bad_source feasible. Finally, this generates better page URLs in a handful of cases: * https://www.artstation.com/artwork/qPVGP instead of https://anubis1982918.artstation.com/projects/qPVGP * https://yande.re/post/show?md5=b4b1d11facd1700544554e4805d47bb6 instead of https://yande.re/post?tags=md5:b4b1d11facd1700544554e4805d47bb6 * http://gallery.minitokyo.net/view/365677 instead of http://gallery.minitokyo.net/download/365677 * https://valkyriecrusade.fandom.com/wiki/File:Crimson_Hatsune_H.png instead of https://valkyriecrusade.wikia.com/wiki/File:Crimson_Hatsune_H.png * https://rule34.paheal.net/post/view/852405 instead of https://rule34.paheal.net/post/list/md5:854806addcd3b1246424e7cea49afe31/1	2022-03-23 01:34:04 -05:00
evazion	f2028c14fb	Fix #5045: Exception on uploads when SauceNAO is the referrer URL. Bug: We assumed the referer URL was from the same site as the target URL. We tried to call methods on the referer only supported by the target URL. Fix: Ignore the referer URL when it's from a different site than the target URL.	2022-03-12 00:04:39 -06:00
evazion	b4aea72d04	sources: remove `preview_urls` method from base strategy. Remove the `preview_urls` method from strategies. The only place this was used was when doing IQDB searches, to download the thumbnail image from the source instead of the full image. This wasn't worth it for a few reasons: * Thumbnails on other sites are sometimes not the size we want, which could affect IQDB results. * Grabbing thumbnails is complex for some sites. You can't always just rewrite the image URL. Sometimes it requires extra API calls, which can be slower than just grabbing the full image. * For videos and animations, thumbnails from other sites don't always match our thumbnails. We do smart thumbnail generation to try to avoid blank thumbnails, which means we don't always pick the first frame, which could affect IQDB results. API changes: * /iqdb_queries?search[file_url] now downloads the URL as is without any modification. Before, it tried to change thumbnail and sample size image URLs to the full version. * /iqdb_queries?search[url] now returns an error if the URL is for an HTML page that contains multiple images. Before, it would grab only the first image and silently ignore the rest.	2022-03-11 03:22:23 -06:00
evazion	2f61486ac6	sources: remove `image_url` method from base strategy. Remove the `image_url` method from source strategies. This method would return only the first image if a source had multiple images. The `image_urls` method should be used instead. Tests were the main place that still used `image_url` instead of `image_urls`. Also make post replacements return an error if replacing with a source that contains multiple images, instead of just blindly replacing the post with the first image in the source.	2022-03-11 01:59:21 -06:00
evazion	9169f00e80	sources: factor out Source::URL::Moebooru.	2022-02-26 17:46:44 -06:00
evazion	abdab7a0a8	uploads: rework upload process. Rework the upload process so that files are saved to Danbooru first before the user starts tagging the upload. The main user-visible change is that you have to select the file first before you can start tagging it. Saving the file first lets us fix a number of problems: * We can check for dupes before the user tags the upload. * We can perform dupe checks and show preview images for users not using the bookmarklet. * We can show preview images without having to proxy images through Danbooru. * We can show previews of videos and ugoira files. * We can reliably show the filesize and resolution of the image. * We can let the user save files to upload later. * We can get rid of a lot of spaghetti code related to preprocessing uploads. This was the cause of most weird "md5 confirmation doesn't match md5" errors. (Not all of these are implemented yet.) Internally, uploading is now a two-step process: first we create an upload object, then we create a post from the upload. This is how it works: * The user goes to /uploads/new and chooses a file or pastes a URL into the file upload component. * The file upload component calls `POST /uploads` to create an upload. * `POST /uploads` immediately returns a new upload object in the `pending` state. * Danbooru starts processing the upload in a background job (downloading, resizing, and transferring the image to the image servers). * The file upload component polls `/uploads/$id.json`, checking the upload `status` until it returns `completed` or `error`. * When the upload status is `completed`, the user is redirected to /uploads/$id. * On the /uploads/$id page, the user can tag the upload and submit it. * The upload form calls `POST /posts` to create a new post from the upload. * The user is redirected to the new post. This is the data model: * An upload represents a set of files uploaded to Danbooru by a user. Uploaded files don't have to belong to a post. An upload has an uploader, a status (pending, processing, completed, or error), a source (unless uploading from a file), and a list of media assets (image or video files). * There is a has-and-belongs-to-many relationship between uploads and media assets. An upload can have many media assets, and a media asset can belong to multiple uploads. Uploads are joined to media assets through an upload_media_assets table. An upload could potentially have multiple media assets if it's a Pixiv or Twitter gallery. This is not yet implemented (at the moment all uploads have one media asset). A media asset can belong to multiple uploads if multiple people try to upload the same file, or if the same user tries to upload the same file more than once. New features: * On the upload page, you can press Ctrl+V to paste a URL and immediately upload it. * You can save files for upload later. Your saved files are at /uploads. Fixes: * Improved error messages when uploading invalid files, bad URLs, and when forgetting the rating.	2022-01-28 04:13:22 -06:00
evazion	4074cc99f9	uploads: fix incorrect remote sizes on pixiv uploads. Bug: the uploads page showed a remote size of 146 bytes for Pixiv uploads. Cause: we didn't spoof the Referer header when making the HEAD request for the image, causing Pixiv to return a 403 error. Also fix the case where the Content-Length header is absent.	2020-06-24 03:02:45 -05:00
evazion	8eac82a971	pixiv: fix regression with new user profile urls. * Update tests to use new Pixiv profile urls. * Fix issue with artist finder not working when given direct image or html page urls.	2020-06-24 02:41:11 -05:00
evazion	9ca848d732	tests: fix more ruby 2.7 deprecation warnings.	2020-05-29 15:36:21 -05:00
nonamethanks	307df3b3e4	Refactor source normalization. * Move the source normalization logic out of the post model and into individual sources' strategies. * Rewrite normalization tests to be handled in each source's test, and expand them significantly. Previously we were only testing a very small subset of domains and variants. * Fix up normalization for several sites. * Normalize fav.me urls into normal deviantart urls.	2020-05-21 22:46:51 +02:00
evazion	a15bbe4264	moebooru: fix preview_urls to fall back to image_urls. Fix the moebooru strategy to fall back to returning the image url if we can't find the preview url. Fixes iqdb lookups failing in some cases because the strategy didn't return a valid url for preview_url.	2019-10-26 13:51:58 -05:00
evazion	c700ea4b5f	Fix #4016: Translated tags failing to find some tags. * Normalize spaces to underscores when saving other names. Preserve case since case can be significant. * Fix WikiPage#other_names_include to search case-insensitively (note: this prevents using the index). * Fix sources to return the raw tags in `#tags` and the normalized tags in `#normalized_tags`. The normalized tags are the tags that will be matched against other names.	2018-12-16 11:37:57 -06:00
evazion	c8d538f618	moebooru: delegate to substrategy based on post source (#3911). If the yande.re or konachan.com post has a source from a supported site, for example Pixiv or Twitter, then delegate the artist and commentary lookup to that substrategy. Only do this for sources from recognized sites, not the null strategy.	2018-10-06 14:27:49 -05:00
evazion	e5a4193dd4	moebooru: support batch bookmarklet previews (#3911).	2018-10-06 00:58:22 -05:00
evazion	fdb6e4ecee	moebooru: rewrite konachan urls for Post#normalized_source (#3911).	2018-10-06 00:58:22 -05:00
evazion	864349dc7b	moebooru: fetch tags (#3911).	2018-10-06 00:58:22 -05:00
evazion	bd3fb7d70e	Post#normalized_source: fix for yande.re urls. Fix regex for yande.re urls like this: https://files.yande.re/image/b66909b940e8d77accab7c9b25aa4dc3/yande.re%20377828.png	2018-10-01 20:03:21 -05:00
evazion	958a9f505b	moebooru: rewrite sample urls + support bookmarklet on html page. * Fixes #2942: Add Moebooru Rewrite for Sample Images. * Addresses #3911: Improve Moebooru support.	2018-09-19 23:32:21 -05:00
evazion	4a99cb098f	moebooru: use the image url as the canonical url.	2018-09-16 21:00:11 -05:00
evazion	d9ce953752	Fix #3906: Moebooru strategy raises NotImplementedError.	2018-09-16 21:00:11 -05:00
evazion	cae78fa8ee	moebooru: move tests from unit/downloads to unit/sources.	2018-09-16 21:00:11 -05:00
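
The polling step described in commit abdab7a0a8 (the client checks the upload `status` until it is `completed` or `error`) can be sketched roughly as follows. This is a minimal illustration, not Danbooru's actual client code: `wait_for_upload`, `fetch_status`, `max_attempts`, and `delay` are hypothetical names, and the server is simulated with a list of canned statuses so the sketch runs without network access.

```ruby
# Hypothetical helper: polls until the upload reaches a terminal status.
# `fetch_status` is any callable that returns the current status string; in a
# real client it would wrap a GET to /uploads/$id.json and read `status`.
def wait_for_upload(fetch_status, max_attempts: 10, delay: 0.5)
  # "pending" and "processing" mean the background job is still running.
  terminal_statuses = %w[completed error]

  max_attempts.times do
    status = fetch_status.call
    return status if terminal_statuses.include?(status)
    sleep(delay)
  end

  raise "upload did not finish after #{max_attempts} polls"
end

# Simulated server responses (an assumption made for this sketch).
responses = %w[pending processing completed]
final_status = wait_for_upload(-> { responses.shift }, delay: 0)
puts final_status
```

Capping the number of attempts is a design choice of this sketch; the commit message does not say how long the real upload component keeps polling.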

25 Commits