danbooru

Author	SHA1	Message	Date
evazion	1aeb52186e	Add AI tag model and UI. Add a database model for storing AI-predicted tags, and add a UI for browsing and searching these tags. AI tags are generated by the Danbooru Autotagger (https://github.com/danbooru/autotagger). See that repo for details about the model. The database schema is `ai_tags (media_asset_id integer, tag_id integer, score smallint)`. This is designed to be as space-efficient as possible, since in production we have over 300 million AI-generated tags (6 million images and 50 tags per post). This amounts to over 10GB in size, plus indexes. You can search for AI tags using e.g. `ai:scenery`. You can do `ai:scenery -scenery` to find posts where the scenery tag is potentially missing, or `scenery -ai:scenery` to find posts that are potentially mistagged (or more likely where the AI missed the tag). You can browse AI tags at https://danbooru.donmai.us/ai_tags. On this page you can filter by confidence level. You can also search unposted media assets by AI tag. To generate tags, use the `autotag` script from the Autotagger repo, something like this: docker run --rm -v ~/danbooru/public/data/360x360:/images ghcr.io/danbooru/autotagger ./autotag -c -f /images | gzip > tags.csv.gz To import tags, use the fix script in script/fixes/. Expect a Danbooru-size dataset to take hours to days to generate tags, then 20-30 minutes to import. Currently this all has to be done by hand.	2022-06-24 04:54:26 -05:00
evazion	d346adabc9	Revert "posts: fix rounding errors in ratio: metatag." This reverts commit `80ced3e418`. This turned out to be intentional. Rounding the aspect ratio to 2 decimal places is so that searches for exact ratios like `ratio:16:9` or `ratio:1.78` work even when the ratio doesn't exactly match. Rounding to 2 decimal places means that the ratio: metatag has a 1% error tolerance.	2022-05-22 12:37:26 -05:00
evazion	80ced3e418	posts: fix rounding errors in ratio: metatag. Fix the ratio: metatag sometimes including wrong results due to rounding errors. For example, searching for `ratio:>=4.0` would include post 3220414, which has an aspect ratio of 3.99879. This would get rounded up to 2 decimal places to 4.00.	2022-05-21 14:08:50 -05:00
evazion	9867514a78	Fix #5177: ordfav with commentary search raises exception.	2022-05-20 22:59:02 -05:00
evazion	181639368c	posts: add is: and has: metatags. Add the following metatags: * is:parent * is:child * is:safe * is:questionable * is:explicit * is:sfw (same as -rating:q,e) * is:nsfw (same as rating:q,e) * is:active * is:deleted * is:pending * is:flagged * is:appealed * is:banned * is:modqueue * is:unmoderated * is:jpg * is:png * is:gif * is:mp4 * is:webm * is:swf * is:zip * has:parent * has:children * has:source * has:appeals * has:flags * has:replacements * has:comments * has:commentary * has:notes * has:pools All of these searches were already possible with other metatags, but these might be more convenient.	2022-05-18 13:04:15 -05:00
evazion	141044d352	posts: refactor hardcoded ratings. Refactor ratings to not be hardcoded in various places. Make it so all ratings are defined in Post::RATINGS. Also make it so that you can search multiple ratings at once with `rating:q,e`.	2022-05-18 13:04:15 -05:00
evazion	f117049750	users: remove 'hide deleted posts' account setting. This setting automatically added the `-status:deleted` metatag to all searches. This meant deleted posts were filtered out at the database level, rather than at the html level. This way searches wouldn't have less-than-full pages. The cost was that searches were slower, mainly because post counts weren't cached. Normally when you search for a tag, we can get the post count from the tags table. If the search is actually like `touhou -status:deleted`, then we don't know the count and we have to calculate it on demand. This option is being removed because it did the opposite of what people thought it did. People thought it made deleted posts visible, when actually it made them more hidden.	2022-05-01 00:47:46 -05:00
evazion	918f32c554	Fix #4461: Improve posts/index page titles.	2022-04-30 01:52:33 -05:00
evazion	bbe748bd2b	posts: factor out post edit logic. Factor out most of the tag edit logic from the Post class to a new PostEdit class. The PostEdit class contains the logic for parsing tags and metatags from the tag edit string, and for determining which tags were added or removed by the edit. Fixes various bugs caused by not calculating the set of added or removed tags correctly, for example when tag category prefixes were used (e.g. `copy:touhou`) or when the same tag was added and removed in the same edit (e.g. `touhou -touhou`). Fixes #5123: Tag categorization prefixes bypass deprecation check Fixes #5126: Negating a deprecated tag will still cause the warning to show Fixes #3477: Remove tag validator triggering on tag category changes Fixes #4848: newpool: metatag doesn't parse correctly	2022-04-29 17:13:33 -05:00
evazion	eca0ab04f7	post queries: raise error on invalid searches. Raise an error if the search is invalid for one of the following reasons: * It contains multiple conflicting order: metatags (e.g. `order:score order:favcount` or `ordfav:a ordfav:b`). * It contains a metatag that can't be used more than once (e.g. `limit:5 limit:10`, `random:5 random:10`). * It contains a metatag that can't be negated (e.g. `-order:score`, `-limit:20`, or `-random:20`). * It contains a metatag that can't be used in an OR clause (e.g. `touhou or order:score`, `touhou or limit:20`, `touhou or random:20`).	2022-04-17 23:20:22 -05:00
evazion	af183467b6	post queries: switch to new post search engine. Switch to the post search engine using the new PostQuery parser. The new engine fully supports AND, OR, and NOT operators and grouping expressions with parentheses. Highlights: New OR operator: * `skirt or dress` (same as `~skirt ~dress`) Tags can be grouped with parentheses: * `1girl (skirt or dress)` * `(blonde_hair blue_eyes) or (red_hair green_eyes)` * `~(blonde_hair blue_eyes) ~(red_hair green_eyes)` (same as above) * `(pantyhose or thighhighs) (black_legwear or brown_legwear)` * `(~pantyhose ~thighhighs) (~black_legwear ~brown_legwear)` (same as above) Metatags can be OR'd together: * `user:evazion or fav:evazion` * `~user:evazion ~fav:evazion` Wildcard tags can be combined with either AND or OR: * `black_* white_*` (find posts with at least one black_* tag AND one white_* tag) * `black_* or white_*` (find posts with at least one black_* tag OR one white_* tag) * `~black_* ~white_*` (same as above) See `4c7cfc73` for more syntax examples. Fixes #4949: And+or search? Fixes #5056: Wildcard searches return unexpected results when combined with OR searches	2022-04-17 23:20:22 -05:00
evazion	86de5cb5d2	posts: fixup `flagger:` metatag. Fix regression in `01a22930e`.	2022-04-06 23:57:50 -05:00
evazion	01a22930e7	posts: move attribute search methods from PostQueryBuilder to Post. Move `status_matches` etc methods from PostQueryBuilder to Post. This is to make refactoring to use the new query parser easier.	2022-04-06 20:25:09 -05:00
evazion	f15f365375	Merge pull request #4952 from thayol/fix-negated-ord Search: "Fix" negated ordered metatags	2022-04-06 04:43:24 -05:00
Thayol	e45e42d479	Use longer lines instead of conditional variables (CodeClimate)	2022-03-31 23:38:40 +02:00
Thayol	70c81f7d49	Change local variable instead of passed object	2022-03-31 23:35:22 +02:00
Thayol	89b40a65ba	Refactor to hash from multiple ifs	2022-03-30 17:54:52 +02:00
evazion	defea08084	posts: fix exception in random:1 searches. Fix regression in `1ad0e8688`. Caused by `relation.order_values` returning an array of Arel nodes instead of an array of strings when doing a `random:1` search.	2022-03-21 01:29:10 -05:00
evazion	1ad0e8688d	posts: fix timeouts for searches using sequential navigation. Fix certain searches timing out when using sequential navigation (page=b1234). The problem was that the so-called "small search optimization" (AKA: force Postgres to use the tag index for small searches instead of a sequential scan) wasn't triggering because the ORDER BY clause for sequential navigation was `posts.id desc`, and we were only checking for `posts.id DESC`.	2022-03-20 18:46:06 -05:00
Thayol	cbe7ee4897	Remove trailing whitespace	2022-01-21 18:02:08 +01:00
evazion	1518c3c4be	posts: fix search queries not being logged to NewRelic in some cases (#4900) Fix the /posts index controller not logging the normalized search query to NewRelic when the search failed, either because of a tag limit error, a search timeout, or an RSS feed rate limit error. Also don't log the number of search results when it's an API request or failed search. This is to avoid doing a potentially slow full post count when it's not otherwise needed.	2022-01-11 13:39:30 -06:00
evazion	72ea78e697	searchable: replace find_ordered with in_order_of. Rails 7 added an `in_order_of` method that does what our `find_ordered` method did before.	2022-01-07 14:24:57 -06:00
Thayol	5799b1bdbe	Remove double looping	2022-01-04 21:02:15 +01:00
Thayol	660ba43edb	Replace ordered metatags when negated	2022-01-04 20:38:27 +01:00
evazion	a7dc05ce63	Enable frozen string literals. Make all string literals immutable by default.	2021-12-14 21:33:27 -06:00
evazion	0baca68a37	search: make `order:random` truly random; add `random:N` metatag. Make the `order:random` metatag truly randomize the search. Add a `random:N` metatag that returns up to N random posts, like what `order:random` did before. `order:random` now returns the entire search in random order. Before it just returned a pageful of pseudorandom posts. This will be more accurate for small searches, but slower for large searches. If `order:random` times out, try `random:N` instead. The `random:N` metatag returns up to N pseudorandom posts. This is faster than `order:random` for large searches, but for small searches, it may return fewer than N posts, and the randomness may be biased. Some posts may be more likely than others to appear. N must be between 0 and 200. Also, `/posts?tags=touhou&random=1` now redirects to `/posts?tags=touhou+random:N`. Before the `random=1` param acted like a free `order:random` tag; now it redirects to a `random:N` search, so it counts against your tag limit.	2021-11-25 18:14:34 -06:00
evazion	5dc67613e6	search: optimize username metatags. Optimize metatag searches involving usernames, including user:, approver:, appealer:, commenter:, upvoter:, etc. Do `User.find_by_name` instead of `User.name_matches` because this fetches the user upfront instead of doing it inside a subquery. Using a subquery makes the SQL more complicated and leads to worse query plans. This especially helps searches involving multiple username metatags.	2021-11-25 00:40:53 -06:00
evazion	353e708538	votes: allow admins to remove post votes. Allow admins to remove votes on posts. This is for fixing vote abuse. Votes can be removed by going to the vote list on the /post_votes page, or by clicking on a post's score, then using the "Remove" option in the "..." dropdown menu next to the vote. Votes are soft-deleted - they're marked as deleted in the database, but not fully deleted. Removed votes are only visible to admins, not to regular users. When a vote is removed by an admin, it leaves a mod action. Technically it's possible to undelete votes, but there's no UI for it.	2021-11-23 23:18:54 -06:00
evazion	43c2870664	Fix #4917: Add down_score/up_score orders and metasearches. Add `upvotes:N`, `downvotes:N`, `order:upvotes`, `order:downvotes`, `order:upvotes_asc`, `order:downvotes_asc` metatags. In the API, the field is called up_score / down_score. Here it's called `upvotes` and `downvotes` because this should be easier to understand for end users. Note that internally, `down_score` is negative. A post that matches `downvotes:>5` will have down_score < -5 internally.	2021-11-16 03:52:38 -06:00
evazion	148752d3c4	PostQueryBuilder: remove useless code. The workaround for `unaliased:fav:1` is no longer needed since favorites are no longer included in the post's tag_index.	2021-11-02 04:07:21 -05:00
evazion	a5ed8c72c9	search: fix parsing of invalid metatag values. * Change `age:` metatag to require time units. This means e.g. `age:<600` no longer works; instead you have to say `age:<600sec`. * Allow time units in the `age:` metatag to be abbreviated as long as they're unambiguous. This means `age:<60sec`, `age:<5min`, and `age:<5mon` now work, in addition to `age:<60s` and `age:<60seconds`. * Allow the `ratio:` metatag to be written like `ratio:16/9` in addition to `ratio:16:9`. * Fix invalid date searches like `date:foo` or `date:05-15-2021` to return nothing instead of raising an "undefined method 'beginning_of_day' for nil" exception. (`date:05-15-2021` is invalid because it's parsed as DD-MM-YYYY). * Fix invalid searches like `score:foo`, `ratio:foo`, and `mpixels:foo` to return nothing instead of being treated like `score:0`, `ratio:0`, `mpixels:0`. * Fix `age:<60m` to return nothing instead of silently being treated like `age:<60seconds`. * Fix `age:foo` to return nothing instead of silently being treated like `age:0d` (return all uploads from today). Fixes #4389.	2021-11-02 01:54:05 -05:00
evazion	84212acfae	Merge pull request #4905 from nottalulah/remove-locks-from-autocomplete remove references to locks	2021-10-25 21:18:36 -05:00
evazion	f1b5c34b4d	posts: show length of videos and animations in thumbnails. Show the length of videos and animated posts in the thumbnail. The length is shown in the top left corner in MM:SS format. This replaces the play button icon. Show a speaker icon instead of a music note icon for posts with sound. Doing this requires doing `.includes(:media_asset)` in a bunch of places to avoid N+1 queries when we access the post's duration.	2021-10-25 02:56:55 -05:00
Lily	647848b499	remove references to locks	2021-10-24 15:16:48 -03:00
evazion	587a9d0c8f	tags: move tag category definitions out of the config file. Move all the code for defining tag categories from the config file to TagCategory. It didn't belong in the config because it's not possible to add new tag categories purely in the config without editing other things like the CSS. Also change it so that tag colors are hardcoded in the CSS instead of generated using ERB. Generating the CSS in ERB meant that the Docker build had to recompile the CSS on every commit, even when it didn't change, because it relied on Ruby code outside the CSS that we couldn't guarantee didn't change.	2021-10-12 21:17:17 -05:00
evazion	92e20713e3	search: fixup hardcoded small search threshold. Fixup for `f6abf39eb`.	2021-10-12 19:01:31 -05:00
evazion	f6abf39ebc	search: try to optimize slow searches. Try to optimize certain types of common slow searches: * Searches for mutually-exclusive tags (e.g. `1girl multiple_girls`, `touhou solo -1girl -1boy`) * Relatively large tags that are heavily skewed towards old posts (e.g. lucky_star, haruhi_suzumiya_no_yuuutsu, inazuma_eleven_(series), imageboard_desourced). * Mid-sized tags in the <30k post range that Postgres thinks are big enough for a post id index scan, but a tag index scan is faster. The general pattern is Postgres not using the tag index because it thinks scanning down the post id index would be faster, but it's actually much slower because it degrades to a full table scan. This usually happens when Postgres thinks a tag is larger or more common than it really is. Here we try to force Postgres into using the tag index when we know the search is small. One case that is still slow is `2girls -multiple_girls`. This returns no results, but we can't know that without searching all of `2girls`. The general case is searching for `A -B` where A is a subset of B and A and B are both large tags. Hopefully fixes #581, #654, #743, #1020, #1039, #1421, #2207, #4070, #4337, #4896, and various other issues raised over the years regarding slow searches.	2021-10-12 02:30:30 -05:00
evazion	0b22e873c9	search: cache timed out search counts. When a search is performed, we cache the post count so we don't have to calculate it again every time the user switches pages. However, if the count times out, we didn't cache it before, causing us to do a slow count on every page load. This usually happens on multi-tag searches that return a lot of results, `1girl solo` for example. This changes it so that the count is cached even when it times out. This will speed up large multi-tag searches. This also changes it so that the count is cached for a fixed 5 minutes. Before it was variable based on the size of the count, but this probably didn't make much difference.	2021-10-12 01:33:21 -05:00
evazion	f155023b77	posts: remove unused exception classes.	2021-10-11 18:58:15 -05:00
evazion	37a8dc5dbd	posts: use string_to_array index for tag searches. Use the `string_to_array(tag_string, ' ')` index instead of the `tag_index` for tag searches. The string_to_array index lets us treat the tag_string as an array for searching purposes. This lets us get rid of the tag_index column and the test_parser dependency in the future.	2021-10-10 22:00:10 -05:00
evazion	1653392361	posts: stop updating fav_string attribute. Stop updating the fav_string attribute on posts. The column still exists on the table, but is no longer used or updated. Like the pool_string in `7d503f08`, the fav_string was used in the past to facilitate `fav:X` searches. Posts had a hidden fav_string column that contained a list of every user who favorited the post. These were treated like fake hidden tags on the post so that a search for `fav:X` was treated like a tag search. The fav_string attribute has been unused for search purposes for a while now. It was only kept because of technicalities that required departitioning the favorites table first (`340e1008e`) before it could be removed. Basically, removing favorites with `@favorite.destroy` was slow because Rails always deletes objects by ID, but we didn't have an index on favorites.id, and we couldn't easily add one until the favorites table was departitioned. Fixes #4652. See https://github.com/danbooru/danbooru/issues/4652#issuecomment-754993802 for more discussion of issues caused by the fav_string (in short: write amplification, post table bloat, and favorite inconsistency problems).	2021-10-09 22:36:26 -05:00
evazion	c4eeeb8531	search: optimize counting posts for fav: and pool: searches. Optimize counting the number of posts returned by fav:<name> and pool:<name> searches. Use cached counts to avoid slow count(*) queries for users with lots of favorites.	2021-10-08 21:26:42 -05:00
evazion	340e1008e9	favorites: merge favorites subtables. Merge the 100 favorite subtables into a single table. Previously the favorites table was partitioned by user id into 100 subtables to try to make searching by user id faster. This wasn't really necessary and probably slower than just making an index on (favorites.user_id, favorites.id) to satisfy ordfav searches. BTree indexes are logarithmic so dividing an index by 100 doesn't make it 100 times faster to search; instead it just removes a layer or two from the tree. This also adds a uniqueness index on (user_id, post_id) to prevent duplicate favorites. Previously we had to check for duplicates at the application layer, which required careful locking to do it correctly. Finally, this adds an index on favorites.id, which was surprisingly missing before. This made ordering and deleting favorites by id really slow because it degraded to a sequential scan.	2021-10-08 21:26:42 -05:00
evazion	595e02ab45	posts: add duration:<x> and order:duration metatags. Add duration:<x> and order:duration metatags for searching animated posts by duration. https://danbooru.donmai.us/posts?tags=animated+duration:<5.0 https://danbooru.donmai.us/posts?tags=animated+duration:>60 https://danbooru.donmai.us/posts?tags=animated+order:duration	2021-10-07 03:21:08 -05:00
evazion	126046cb69	posts: remove rating, note, and status locks. Remove the ability for users to lock ratings, notes, and post statuses. Historically the majority of locked posts were from 10+ years ago when certain users habitually locked ratings and notes on every post they touched for no reason. Nowadays most posts have been unlocked. Only a handful of locked posts are left, none of which deserve to be locked. The is_rating_locked, is_note_locked, and is_status_locked columns still exist in the database, but aren't used.	2021-09-27 22:32:30 -05:00
evazion	313257b771	posts: add `exif:<value>` search metatags. Examples: * https://danbooru.donmai.us/posts?tags=exif:File:ColorComponents * https://danbooru.donmai.us/posts?tags=exif:GIF:GIFVersion * https://danbooru.donmai.us/posts?tags=exif:PNG:ColorType * https://danbooru.donmai.us/posts?tags=exif:PNG:ColorType=RGB * https://danbooru.donmai.us/posts?tags=exif:GIF:GIFVersion=89a * https://danbooru.donmai.us/posts?tags=exif:File:ColorComponents=3	2021-09-16 02:13:15 -05:00
evazion	d00aa847ae	search: allow mods to search `disapproved:<user>` for other users. Allow moderators to search `disapproved:<username>` with any user. Before mods could only search for their own disapprovals, even though they could see disapprovals by others.	2021-09-01 01:39:14 -05:00
Thayol	b9068b8a3e	Fix #4435: Search: wildcards with no matches should return no results	2021-06-24 04:04:13 -05:00
evazion	00ca7526bb	docs: add remaining docs for classes in app/logical.	2021-06-24 01:31:41 -05:00
evazion	28c0a48117	discord: fix tag search commands being limited to 2 tags.	2021-03-14 16:42:07 -05:00
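
The rounding behaviour discussed in commits `80ced3e418` and `d346adabc9` (the `ratio:` metatag) can be illustrated with a small sketch. This is not Danbooru's code (Danbooru is a Ruby on Rails application); the function names below are hypothetical, and rounding the search value as well as the stored ratio is an assumption made for the illustration. It only shows why rounding to 2 decimal places lets `ratio:16:9` and `ratio:1.78` match images whose ratio is not exactly 16:9, and why a ratio of 3.99879 ends up matching `ratio:>=4.0`.

```python
# Illustrative sketch only; names are hypothetical, not Danbooru's API.

def rounded_ratio(width: int, height: int) -> float:
    """Aspect ratio of an image, rounded to 2 decimal places."""
    return round(width / height, 2)


def parse_ratio(value: str) -> float:
    """Parse a search value such as `16:9`, `16/9`, or `1.78`.

    Assumption: the search value is rounded the same way as the
    stored ratio so that the two can be compared for equality.
    """
    for sep in (":", "/"):
        if sep in value:
            w, h = value.split(sep, 1)
            return round(float(w) / float(h), 2)
    return round(float(value), 2)


def matches_exact(width: int, height: int, query: str) -> bool:
    """True if the image's rounded ratio equals the rounded search value."""
    return rounded_ratio(width, height) == parse_ratio(query)


if __name__ == "__main__":
    # 1366x768 is not exactly 16:9, but both sides round to the same value.
    print(matches_exact(1366, 768, "16:9"))
    print(matches_exact(1920, 1080, "1.78"))
    # The case from commit 80ced3e418: 3.99879 rounds to 4.0,
    # so it satisfies a `>= 4.0` comparison after rounding.
    print(round(3.99879, 2) >= 4.0)
```

The trade-off is the one the revert message describes: exact-ratio searches tolerate small deviations, at the cost of range searches occasionally including posts that sit just outside the boundary.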

276 Commits