danbooru

Author	SHA1	Message	Date
evazion	28971fe103	sources: factor out `site_name` method.	2022-03-11 23:20:53 -06:00
evazion	b4aea72d04	sources: remove `preview_urls` method from base strategy. Remove the `preview_urls` method from strategies. The only place this was used was when doing IQDB searches, to download the thumbnail image from the source instead of the full image. This wasn't worth it for a few reasons: * Thumbnails on other sites are sometimes not the size we want, which could affect IQDB results. * Grabbing thumbnails is complex for some sites. You can't always just rewrite the image URL. Sometimes it requires extra API calls, which can be slower than just grabbing the full image. * For videos and animations, thumbnails from other sites don't always match our thumbnails. We do smart thumbnail generation to try to avoid blank thumbnails, which means we don't always pick the first frame, which could affect IQDB results. API changes: * /iqdb_queries?search[file_url] now downloads the URL as is without any modification. Before it tried to change thumbnail and sample size image URLs to the full version. * /iqdb_queries?search[url] now returns an error if the URL is for a HTML page that contains multiple images. Before it would grab only the first image and silently ignore the rest.	2022-03-11 03:22:23 -06:00
evazion	2f61486ac6	sources: remove `image_url` method from base strategy. Remove the `image_url` method from source strategies. This method would return only the first image if a source had multiple images. The `image_urls` method should be used instead. Tests were the main place that still used `image_url` instead of `image_urls`. Also make post replacements return an error if replacing with a source that contains multiple images, instead of just blindly replacing the post with the first image in the source.	2022-03-11 01:59:21 -06:00
evazion	4701027f45	sources: remove unused methods from base strategy. Remove unused `urls`, `parsed_urls`, and `domains` methods.	2022-03-10 23:11:00 -06:00
evazion	5016d9ad26	Merge pull request #5043 from nonamethanks/fantia-support Add Fantia support	2022-03-10 15:21:03 -06:00
evazion	29fc072cf1	Merge pull request #5042 from nonamethanks/weibo-fix-typo weibo: fix typo in strategy	2022-03-10 15:01:12 -06:00
nonamethanks	a6549bc6fe	Add Fantia support Also fixes a regression in `74fdeef10c` that stopped mastodon urls from being given the right priority.	2022-03-10 17:43:32 +01:00
evazion	43a665a66d	sources: factor out Source::URL::NicoSeiga.	2022-03-10 04:53:51 -06:00
nonamethanks	93adba06e5	weibo: fix typo in strategy	2022-03-10 08:31:23 +01:00
evazion	34854185be	sources: factor out Source::URL::DeviantArt and Source::URL::Stash.	2022-03-10 00:29:49 -06:00
evazion	bb4b8619f5	pixiv: fix Source::URL::Pixiv not being included in Source::URL list.	2022-03-09 01:14:09 -06:00
evazion	8a50148823	pixiv: fixup bug with fetching image_urls for bad_id posts. Fix `image_urls` returning `[nil]` when fetching data for an image URL that was bad_id. In that case `original_urls` is empty, so we fall back to using the deleted image URL as-is.	2022-03-09 01:14:09 -06:00
evazion	77c88fd867	Merge pull request #5038 from nonamethanks/remove-redundant-comments sources: remove redundant comments	2022-03-08 23:28:29 -06:00
evazion	6afb2f8e3c	Merge pull request #5037 from nonamethanks/tumblr-refactor sources: factor out Source::URL::Tumblr	2022-03-08 23:26:30 -06:00
evazion	cf4b9a6114	Merge pull request #5039 from nonamethanks/simplify-lofter-tag-parsing Lofter: simplify tag extraction logic	2022-03-08 23:21:57 -06:00
evazion	987f2985d3	Merge pull request #5040 from nonamethanks/fix-weibo-404 Weibo: fix exception for deleted url	2022-03-08 23:08:37 -06:00
evazion	52a2d3418c	pixiv: fixup bugs in `1c620f805`. * Fix error when uploading non-ugoira files. * Fix sample image URLs not being rewritten to full images correctly. We have to get the full image URL from the API because given an /img-master/ URL, we don't know what the original file extension is.	2022-03-08 23:07:24 -06:00
nonamethanks	c9be77d1f8	Weibo: fix exception for deleted url	2022-03-09 05:31:38 +01:00
evazion	1c620f8055	sources: factor out Source::URL::Pixiv. * Drop support for preview_urls. This means that IQDB lookups may be slower, especially for ugoiras, since we have to download the full ugoira now. However, ugoira lookups should produce better results, since the ugoira thumbnail chosen by Pixiv wasn't necessarily the same as the thumbnail chosen by Danbooru. * Drop support for uploading single manga pages: http://www.pixiv.net/member_illust.php?mode=manga_big&illust_id=18557054&page=2 Previously uploading a URL like this would only upload a single image out of a multi-image work. Now it will upload all images in the work. Pixiv no longer supports URLs like this, so we don't either. * Add support for parsing URLs like this: https://i.pximg.net/c/360x360_70/custom-thumb/img/2022/03/08/00/00/56/96755248_p0_custom1200.jpg Apparently artists can choose a custom thumbnail now (not like anyone will try to upload one though).	2022-03-08 22:17:38 -06:00
evazion	df0bb70486	sources: factor out Source::URL::PixivSketch. Add upload support for Pixiv Sketch. Fetch tags, commentary, and artist, and rewrite sample images to full images. Authentication isn't required. R18 images are hidden in the browser but visible in the API.	2022-03-08 18:24:12 -06:00
nonamethanks	ff6bfff311	Lofter: simplify tag extraction logic Now that we have a separate parsing class we can just use it to properly parse tag urls as well.	2022-03-08 17:01:50 +01:00
nonamethanks	ebd3670076	sources: remove redundant comments These comments are already present under the parse blocks, so the huge walls of text before the code are not needed anymore.	2022-03-08 16:56:00 +01:00
nonamethanks	b9c7e467e5	sources: factor out Source::URL::Tumblr Also adds support for fetching source data from direct image urls when possible.	2022-03-08 15:06:06 +01:00
NamelessContributor	5cdbc1d454	Replace hard tabs with spaces in .rb files	2022-03-08 07:11:54 +01:00
evazion	de61e56161	Merge pull request #5032 from nonamethanks/factor-out-weibo sources: factor out Source::URL::Weibo	2022-03-07 18:31:15 -06:00
evazion	8d28453f17	Merge pull request #5031 from nonamethanks/fix-foundation Foundation: fix normalization error	2022-03-07 18:28:06 -06:00
nonamethanks	d8e2f2ee33	sources: factor out Source::URL::Weibo Additionally, fixed some broken tests and changed normalization for urls of album type to point to the mobile version instead, because they're only visible to logged-in users.	2022-03-07 16:52:43 +01:00
evazion	74d6b4e81e	users: don't allow names ending with file extensions. This is so in the future we can have URLs like https://danbooru.donmai.us/users/evazion without problems caused by names like https://danbooru.donmai.us/users/evazion.json	2022-03-07 04:39:00 -06:00
nonamethanks	d195d30587	Foundation: fix normalization error Urls like https://foundation.app/@yohan1754/fso/3 would get normalized like https://foundation.app/@foundation/foundation/3, which was wrong because it would point to a completely different collection	2022-03-07 06:52:23 +01:00
evazion	a160a3acce	users: add stricter username rules. Add stricter username rules: * Only allow usernames to contain basic letters, numbers, CJK characters, underscores, dashes and periods. * Don't allow names to start or end with punctuation. * Don't allow names to have multiple underscores in a row. * Don't allow active users to have names that look like deleted users (e.g. "user_1234"). * Don't allow emoji or any other Unicode characters except for Chinese, Japanese, and Korean characters. CJK characters are currently grandfathered in but will be disallowed in the future. Users with an invalid name will be shown a permanent sitewide banner until they change their name.	2022-03-05 01:08:53 -06:00
evazion	b4620f561c	users: lower max username length to 25 characters. The median username length is 8 characters. The 99th percentile is 18 characters. The 99.9th percentile is 24 characters. About 750 users have a name more than 24 characters long. This doesn't do anything about existing users with long usernames. Note that this is the length in Unicode codepoints, not grapheme clusters. Some Unicode characters and emoji may be a single glyph but composed of multiple codepoints.	2022-03-01 21:23:21 -06:00
evazion	99221af855	ugoiras: fix regression in `7031fd13d`. Fix `Cannot write log file 'ffmpeg2pass-0.log' for pass-1 encoding: Permission denied` error when uploading ugoira files. Caused by the fact that 2-pass encoding tries to write a log file in the current directory by default, which fails in production because the default working directory in the Docker image is /danbooru, which is read-only.	2022-03-01 00:16:55 -06:00
evazion	7031fd13d7	ugoiras: encode .webm samples using VP9 instead of VP8. Switch the codec for .webm samples from VP8 to VP9. All modern browsers support VP9 (Safari was the last to add support in ~2020), so it should be safe to provide only VP9 .webms without a fallback. VP9 lets us use two-pass encoding, which should offer better compression. Fixes ugoira samples still having poor quality even after `4c652cf3e`. `4c652cf3e` tried to remove the max bitrate limit by setting `-b:v 0`, but this only worked in FFmpeg 4.2. In production Danbooru uses FFmpeg 4.4, and apparently in 4.4 `-b:v 0` means "use the default max bitrate of 256kb/s" instead of "no bitrate limit". https://trac.ffmpeg.org/wiki/Encode/VP9 https://developers.google.com/media/vp9/bitrate-modes https://developers.google.com/media/vp9/settings/vod http://wiki.webmproject.org/ffmpeg/vp9-encoding-guide https://www.reddit.com/r/AV1/comments/k7colv/encoder_tuning_part_1_tuning_libvpxvp9_be_more/	2022-02-28 22:02:56 -06:00
user	2600dcdbfa	nijie: extract post ID from new image URL.	2022-02-28 21:14:47 +01:00
user	550e0bef93	nijie: fix pattern for new image URL.	2022-02-28 21:14:31 +01:00
evazion	1609059bf4	sources: factor out Source::URL::Fanbox. Also fix it so that we grab the full image for cover URLs like this: * Sample: https://pixiv.pximg.net/c/1620x580_90_a2_g5/fanbox/public/images/creator/1566167/cover/QqxYtuWdy4XWQx1ZLIqr4wvA.jpeg * Full: https://pixiv.pximg.net/fanbox/public/images/creator/1566167/cover/QqxYtuWdy4XWQx1ZLIqr4wvA.jpeg	2022-02-28 06:25:06 -06:00
evazion	317ec886bc	sources: factor out Source::URL::Nijie. Also fixes the uploader uploading all images when trying to upload only a single image in a multi-image work. Caused by `image_urls` incorrectly returning all images when the source strategy was given a url for a single image.	2022-02-27 02:27:35 -06:00
evazion	926a8fa81f	Danbooru::URL: add `#basename`, `#filename`, and `#file_ext` utility methods. Add `#basename`, `#filename`, and `#file_ext` utility methods to Danbooru::URL and change a few places to use them. Simplifies parsing filenames in source URLs in various places.	2022-02-27 02:27:21 -06:00
evazion	fcf517834d	sources: factor out Source::URL::ArtStation.	2022-02-26 21:03:49 -06:00
evazion	9169f00e80	sources: factor out Source::URL::Moebooru.	2022-02-26 17:46:44 -06:00
evazion	74fdeef10c	sources: factor out Source::URL::Mastodon.	2022-02-26 15:08:27 -06:00
evazion	86d8e2d13d	sources: factor out Source::URL::Lofter.	2022-02-25 23:43:10 -06:00
evazion	f062f2d145	sources: factor out Source::URL::Newgrounds. Also fix it so that the image URL is set as the source for Newgrounds posts, not the page URL. It's possible to generate the page URL from the image URL (except for images after the first in multi-image posts). * Page: https://www.newgrounds.com/art/view/natthelich/weaver * Image: https://art.ngfiles.com/images/1520000/1520217_natthelich_weaver.jpg?f1606365031	2022-02-25 23:04:03 -06:00
evazion	64472a7b7e	sources: factor out Source::URL::HentaiFoundry. Add support for these URL types: * http://pictures.hentai-foundry.com//s/soranamae/363663.jpg * http://www.hentai-foundry.com/piccies/d/dmitrys/1183.jpg * http://www.hentai-foundry.com/pic-149160.php * http://www.hentai-foundry.com/user-RockCandy.php * http://www.hentai-foundry.com/profile-sawao.php These URL types are obsolete, but still present in some old posts.	2022-02-25 22:01:17 -06:00
evazion	e6ded89f85	sources: factor out Source::URL::Plurk. Also fix it so that for adult works, we get the images posted by the artist in the replies. Example: https://www.plurk.com/p/omc64y (nsfw).	2022-02-25 02:06:57 -06:00
evazion	26f4cf1ebd	sources: factor out Source::URL::Skeb.	2022-02-25 02:06:57 -06:00
evazion	ffe52f5ead	sources: factor out Source::URL::Foundation. Add support for a couple more URL types: * https://foundation.app/@asuka111art/dinner-with-cats-82426 * https://f8n-production-collection-assets.imgix.net/0x3B3ee1931Dc30C1957379FAc9aba94D1C48a5405/128711/QmcBfbeCMSxqYB3L1owPAxFencFx3jLzCPFx6xUBxgSCkH/nft.png Also include these URLs in the list of profile URLs: * https://foundation.app/0x7E2ef75C0C09b2fc6BCd1C68B6D409720CcD58d2 (for https://foundation.app/@mochiiimo) These URLs should be stable even if the user changes their name.	2022-02-23 23:49:31 -06:00
evazion	043c08eb05	sources: factor out Source::URL::TwitPic.	2022-02-23 23:49:31 -06:00
evazion	7ed8f95a8e	sources: add Source::URL class; factor out Source::URL::Twitter. Introduce a Source::URL class for parsing URLs from source sites. Refactor the Twitter source strategy to use it. This is the first step towards factoring all the URL parsing logic out of source strategies and moving it to subclasses of Source::URL. Each site will have a subclass of Source::URL dedicated to parsing URLs from that site. Source strategies will use these classes to extract information from URLs. This is to simplify source strategies. Most sites have many different URL formats we have to parse or rewrite, and handling all these different cases tends to make source strategies very complex. Isolating the URL parsing logic from the site scraping logic should make source strategies easier to maintain.	2022-02-23 23:46:04 -06:00
evazion	4c652cf3ec	ugoiras: increase quality of webm samples. Fix certain ugoiras having very low quality webm samples. This was because we had a target bitrate of 5 Mbps, but this wasn't enough for videos that were high resolution or that had choppy, hard-to-compress motion, such as post 5081776 (nsfw).	2022-02-23 02:50:14 -06:00
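
Commit 7ed8f95a8e describes the design behind the "factor out Source::URL::..." commits above: one base class for parsing source URLs, with a subclass per site that extracts information from that site's URL formats. The following is a minimal sketch of that idea, not Danbooru's actual code. The `SketchSource` module, the `match?` and `status_id` methods, the registry via `inherited`, and the example URL are all assumptions made for illustration.

```ruby
require "uri"

# Hypothetical sketch only; names and structure are illustrative assumptions.
module SketchSource
  class URL
    SUBCLASSES = []

    # Register each site-specific subclass as it is defined.
    def self.inherited(subclass)
      super
      SUBCLASSES << subclass
    end

    # Return an instance of the first subclass that recognizes the URL,
    # or nil if no subclass matches or the string is not a valid URI.
    def self.parse(string)
      uri = URI.parse(string)
      klass = SUBCLASSES.find { |k| k.match?(uri) }
      klass&.new(uri)
    rescue URI::InvalidURIError
      nil
    end

    attr_reader :uri

    def initialize(uri)
      @uri = uri
    end
  end

  # URL parsing for one site lives here, separate from any scraping logic.
  class Twitter < URL
    def self.match?(uri)
      uri.host.to_s.match?(/(\A|\.)twitter\.com\z/)
    end

    # Extract the numeric ID from URLs shaped like /<user>/status/<id>.
    def status_id
      uri.path[%r{\A/[^/]+/status/(\d+)}, 1]
    end
  end
end

url = SketchSource::URL.parse("https://twitter.com/example/status/1234567890")
puts url.status_id
```

A scraping class would then ask the parsed URL object for IDs and canonical forms instead of carrying its own regexes, which is the separation the commit message argues for.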

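The username rules listed in commits a160a3acce, b4620f561c, and 74d6b4e81e can be read as a small set of independent checks. The sketch below is one possible way to express them and is not taken from the repository. The module name, the error strings, the regular expressions, and the `FILE_EXTENSIONS` list are assumptions; the commit messages do not say which extensions are rejected.

```ruby
# Hypothetical sketch of the rules described in the commit messages.
module UsernameRulesSketch
  # Length is counted in Unicode codepoints, which is what String#length returns.
  MAX_LENGTH = 25

  # Basic letters, numbers, underscores, dashes, periods, and CJK scripts.
  ALLOWED_CHARS = /\A[a-zA-Z0-9_.\-\p{Han}\p{Hiragana}\p{Katakana}\p{Hangul}]+\z/

  # Assumed list; the real set of rejected extensions is not given in the log.
  FILE_EXTENSIONS = %w[json xml html atom].freeze

  # Return a list of rule violations for the given name (empty if valid).
  def self.errors(name)
    errors = []
    errors << "too long" if name.length > MAX_LENGTH
    errors << "invalid characters" unless name.match?(ALLOWED_CHARS)
    errors << "starts or ends with punctuation" if name.match?(/\A[_.-]|[_.-]\z/)
    errors << "consecutive underscores" if name.include?("__")
    errors << "looks like a deleted user" if name.match?(/\Auser_\d+\z/i)
    errors << "ends with a file extension" if name.match?(/\.(#{FILE_EXTENSIONS.join("|")})\z/i)
    errors
  end
end

p UsernameRulesSketch.errors("evazion")
p UsernameRulesSketch.errors("evazion.json")
```

Returning a list of violations rather than a boolean makes it possible to show a user every reason a name is rejected at once.
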
2652 Commits