Add a database model for storing AI-predicted tags, and add a UI for browsing and searching these tags.
AI tags are generated by the Danbooru Autotagger (https://github.com/danbooru/autotagger). See that
repo for details about the model.
The database schema is `ai_tags (media_asset_id integer, tag_id integer, score smallint)`. This is
designed to be as space-efficient as possible, since in production we have over 300 million
AI-generated tags (6 million images and 50 tags per post). This amounts to over 10GB in size, plus
indexes.
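A minimal migration sketch matching that schema (the index choices here are assumptions, not taken from the real schema):

    class CreateAiTags < ActiveRecord::Migration[6.1]
      def change
        # no primary key, to keep each row as small as possible
        create_table :ai_tags, id: false do |t|
          t.integer :media_asset_id, null: false
          t.integer :tag_id, null: false
          t.integer :score, limit: 2, null: false # smallint in Postgres
        end

        add_index :ai_tags, [:tag_id, :score] # browse/search by tag
        add_index :ai_tags, :media_asset_id   # look up an asset's tags
      end
    end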
You can search for AI tags using e.g. `ai:scenery`. You can do `ai:scenery -scenery` to find posts
where the scenery tag is potentially missing, or `scenery -ai:scenery` to find posts that are
potentially mistagged (or more likely where the AI missed the tag).
You can browse AI tags at https://danbooru.donmai.us/ai_tags. On this page you can filter by
confidence level. You can also search unposted media assets by AI tag.
To generate tags, use the `autotag` script from the Autotagger repo, something like this:

    docker run --rm -v ~/danbooru/public/data/360x360:/images ghcr.io/danbooru/autotagger ./autotag -c -f /images | gzip > tags.csv.gz
To import tags, use the fix script in script/fixes/. Expect a Danbooru-size dataset to take
hours to days to generate tags, then 20-30 minutes to import. Currently this all has to be done by hand.
Switch the post search engine to the new PostQuery parser. The new
engine fully supports AND, OR, and NOT operators, as well as grouping
expressions with parentheses.
Highlights:
New OR operator:
* `skirt or dress` (same as `~skirt ~dress`)
Tags can be grouped with parentheses:
* `1girl (skirt or dress)`
* `(blonde_hair blue_eyes) or (red_hair green_eyes)`
* `~(blonde_hair blue_eyes) ~(red_hair green_eyes)` (same as above)
* `(pantyhose or thighhighs) (black_legwear or brown_legwear)`
* `(~pantyhose ~thighhighs) (~black_legwear ~brown_legwear)` (same as above)
Metatags can be OR'd together:
* `user:evazion or fav:evazion`
* `~user:evazion ~fav:evazion`
Wildcard tags can be combined with either AND or OR:
* `black_* white_*` (find posts with at least one black_* tag AND one white_* tag)
* `black_* or white_*` (find posts with at least one black_* tag OR one white_* tag)
* `~black_* ~white_*` (same as above)
See 4c7cfc73 for more syntax examples.
Fixes #4949: And+or search?
Fixes #5056: Wildcard searches return unexpected results when combined with OR searches
Add the ability to search jobs on the /jobs page by job type or by status.
Fixes #2577 (Search filters for delayed jobs). This wasn't possible
before with DelayedJob because it stored the job data as a YAML string,
which made it difficult to search jobs by type. GoodJob stores job data
in a JSON object, which is easier to search in Postgres.
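For example, GoodJob keeps the ActiveJob payload in a `serialized_params` jsonb column, so filtering by type can be a plain jsonb lookup (the job class name below is made up):

    # find all jobs of one type; 'job_class' is the ActiveJob class name
    # stored inside GoodJob's serialized_params column
    GoodJob::Job.where("serialized_params->>'job_class' = ?", "ProcessUploadJob")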
Optimize searches using the `search[user_name]=...` URL parameter. If
we're not doing a wildcard search, then do a regular user lookup, which
generates better SQL.
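A sketch of the idea (names are illustrative, not the actual code):

    # hypothetical: resolve search[user_name]=... into a filter
    def search_by_user_name(relation, name)
      if name.include?("*")
        # wildcard search: match against users.name with ILIKE
        relation.where(user: User.where("name ILIKE ?", name.tr("*", "%")))
      else
        # exact name: look up the user first, which yields a simple
        # `user_id = ?` condition instead of a subquery
        relation.where(user_id: User.find_by(name: name)&.id)
      end
    end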
Make private favorites and upvotes a Gold-only account option.
Existing Members who already have private favorites enabled may keep the
option, but if they ever disable it, they can't re-enable it without
upgrading to Gold first.
This is a Gold-only option to prevent uploaders from creating multiple
accounts to upvote their own posts. If Members could make their upvotes
private, it would be too easy to hide that kind of fake-account voting.
Refactor full-text search on several tables (comments, dmails,
forum_posts, forum_topics, notes, and wiki_pages) to use to_tsvector
expression indexes instead of dedicated tsvector columns. This way
full-text search works the same way across all tables.
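For comments, for instance, the dedicated column gets replaced by an expression index along these lines (the index name and text search config are assumed):

    # in a migration:
    execute <<~SQL
      CREATE INDEX index_comments_on_to_tsvector_english_body
      ON comments USING gin (to_tsvector('english', body))
    SQL

    # a search can then use the same expression and still hit the index:
    Comment.where("to_tsvector('english', body) @@ websearch_to_tsquery('english', ?)", query)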
API changes:
* Changed /wiki_pages.json?search[body_matches] to match against only the
body. Previously, `body_matches` matched against both the title and the body.
* Added /wiki_pages.json?search[title_or_body_matches] to match against
both the title and the body.
* Fixed /dmails.json?search[message_matches] to match against both the
title and body when doing a wildcard search. Previously, a wildcard search
only matched against the body.
* Added /dmails.json?search[body_matches] to match against only the dmail body.
Change the wiki_pages tsvector_update_trigger to use
`pg_catalog.english` instead of `public.danbooru`. This changes how wiki
page text is parsed for full-text search to use the standard English
parser instead of test_parser. This is to prepare for dropping
test_parser. Using test_parser here was wrong anyway because it meant
that punctuation wasn't removed from words when indexing wiki pages for
full-text search.
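Concretely, the trigger now looks something like this (trigger and column names assumed); `tsvector_update_trigger` is the built-in Postgres function that recomputes the tsvector column on INSERT/UPDATE using the given text search config:

    # in a migration:
    execute <<~SQL
      CREATE TRIGGER trigger_wiki_pages_on_update
      BEFORE INSERT OR UPDATE ON wiki_pages
      FOR EACH ROW EXECUTE PROCEDURE
        tsvector_update_trigger(body_index, 'pg_catalog.english', body)
    SQL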
Use the `string_to_array(tag_string, ' ')` index instead of the
`tag_index` for tag searches. The string_to_array index lets us treat
the tag_string as an array for searching purposes. This lets us get rid
of the tag_index column and the test_parser dependency in the future.
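Roughly, the index and the query shape it serves (index name assumed):

    # in a migration:
    execute <<~SQL
      CREATE INDEX index_posts_on_string_to_array_tag_string
      ON posts USING gin (string_to_array(tag_string, ' '))
    SQL

    # a single-tag search becomes an array containment test that the gin
    # index can answer:
    Post.where("string_to_array(tag_string, ' ') @> ARRAY[?]", "scenery")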
Fix exception during https://danbooru.donmai.us/posts/random?tags=ordfav:nonamethanks
Before we were doing a query like this:
SELECT
"posts".*
FROM
"posts"
INNER JOIN
"favorites" ON "favorites"."post_id" = "posts"."id"
WHERE
(favorites.user_id % 100 = 64 AND favorites.user_id = 52664)
AND "posts"."id" = 343894
ORDER BY
favorites.id DESC,
posts.id DESC,
ID=343894 DESC
but `ID=? DESC` is ambiguous during an ordfav: search because of the
join on the favorites table. The fix is to qualify the reference as
`posts.id`.
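With the fix, the last sort term is fully qualified:

    ORDER BY
      favorites.id DESC,
      posts.id DESC,
      posts.id = 343894 DESC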
Make it so that when a user removes their own vote, the vote is soft
deleted (the is_deleted flag is set) instead of hard deleted.
Changes:
* Add is_deleted flag to comment votes.
* Relax the uniqueness constraint so you can have multiple deleted votes
on the same comment; you can still only have one active vote per comment
(see the sketch after this list).
* Add `soft_delete` method to Deletable concern.
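The relaxed constraint can be expressed as a partial unique index (table and index names assumed):

    # unique only among active votes; deleted votes don't count against it
    add_index :comment_votes, [:comment_id, :user_id], unique: true,
      where: "is_deleted = false",
      name: "index_comment_votes_on_comment_id_and_user_id"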
Changes to bans:
* Change the `expires_at` field to `duration`.
* Make moderators choose from a fixed set of standard ban lengths,
instead of allowing arbitrary ban lengths.
* List `duration` in seconds in the /bans.json API.
* Dump bans to BigQuery.
Note that some old bans have a negative duration because their
expiration date was before their creation date: when bans were migrated
to Danbooru 2 in 2013, the original ban creation dates were lost.
Optimize searches for non-English phrases in autocomplete. These
searches were pretty slow, and could sometimes cause sitewide lag spikes
when users typed long strings of non-English text into the search box
and caused an unintentional DoS.
The trick is to use a gin index on `array_to_tsvector(other_names)`.
This supports fast string prefix matching against all
elements of the array. The downside is that it doesn't allow infix or
suffix matches, so we can't support wildcards in general. Wildcards
didn't quite work anyway, since artist and wiki other names can contain
literal '*' characters.
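Roughly (index name assumed, and assuming the stored names are already normalized to a consistent case):

    # in a migration:
    execute <<~SQL
      CREATE INDEX index_artists_on_array_to_tsvector_other_names
      ON artists USING gin (array_to_tsvector(other_names))
    SQL

    # prefix-match the typed text against every element of other_names;
    # 'foo:*' is tsquery syntax for "lexemes starting with foo"
    Artist.where("array_to_tsvector(other_names) @@ (? || ':*')::tsquery", "初音")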
Fix the `normalize` and `array_attribute` macros conflicting with each
other on the WikiPage model. This meant code like
`wiki_page.other_names = "foo bar"` didn't work: both macros defined an
`other_names=` method, and one overrode the other.
The fix is to use anonymous modules and prepend so we can chain method
calls with super.
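A simplified sketch of the pattern (not the actual Danbooru code):

    module ArrayAttributable
      # each macro contributes its writer via an anonymous prepended module,
      # so multiple writers for the same attribute stack and chain via super
      def array_attribute(name)
        prepend Module.new {
          define_method("#{name}=") do |value|
            value = value.split if value.is_a?(String)
            super(value)
          end
        }
      end
    end

    class WikiPage < ApplicationRecord
      extend ArrayAttributable
      array_attribute :other_names
      # a `normalize`-defined writer built the same way now composes with
      # this one instead of clobbering it
    end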
Add tracking of certain important user actions. These events include:
* Logins
* Logouts
* Failed login attempts
* Account creations
* Account deletions
* Password reset requests
* Password changes
* Email address changes
This is similar to the mod actions log, except for account activity
related to a single user.
The information tracked includes the user, the event type (login,
logout, etc), the timestamp, the user's IP address, IP geolocation
information, the user's browser user agent, and the user's session ID
from their session cookie. This information is visible to mods only.
This is done with three models. The UserEvent model tracks the event
type (login, logout, password change, etc) and the user. The UserEvent
is tied to a UserSession, which contains the user's IP address and
browser metadata. Finally, the IpGeolocation model contains the
geolocation information for IPs, including the city, country, ISP, and
whether the IP is a proxy.
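The rough shape of the three models, per the description above (column lists abbreviated, names partly assumed):

    class UserEvent < ApplicationRecord
      belongs_to :user
      belongs_to :user_session
      # plus a category column (login, logout, failed_login,
      # password_change, etc.) and a timestamp
    end

    class UserSession < ApplicationRecord
      has_many :user_events
      # plus the IP address, session ID, and browser user agent
    end

    class IpGeolocation < ApplicationRecord
      # per-IP geolocation data: city, country, ISP, proxy flag
    end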
This tracking will be used for a few purposes:
* Letting users view their account history, to detect things like logins
from unrecognized IPs, failed login attempts, password changes, etc.
* Rate limiting failed login attempts.
* Detecting sockpuppet accounts using their login history.
* Detecting unauthorized account sharing.
* Introduce an abstraction for normalizing attributes, very loosely
modeled after https://github.com/fnando/normalize_attributes (see the
sketch after this list).
* Normalize wiki bodies to Unicode NFC form.
* Normalize Unicode space characters in wiki bodies (strip zero width
spaces, normalize line endings to CRLF, normalize Unicode spaces to
ASCII spaces).
* Trim spaces from the start and end of wiki page bodies. This may cause
wiki page diffs to show spaces being removed even when the user didn't
explicitly remove the spaces themselves.
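A purely illustrative sketch of those normalization steps (the real macro and method names differ):

    # hypothetical normalizer for wiki page bodies
    def normalize_body(body)
      body.unicode_normalize(:nfc)                  # Unicode NFC form
          .gsub(/[\u200B\u200C\uFEFF]/, "")         # strip zero-width characters
          .gsub(/\r\n?|\n/, "\r\n")                 # normalize line endings to CRLF
          .gsub(/[\u00A0\u2000-\u200A\u3000]/, " ") # Unicode spaces -> ASCII spaces
          .strip                                    # trim leading/trailing whitespace
    end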
Expand the tag abbreviation system introduced in b0be8ae45 so that it
works in searches and when tagging posts, not just in autocomplete.
For example, you can tag a post with /evth and it will add the tag
eyebrows_visible_through_hair. You can search for /evth and it will
search for the tag eyebrows_visible_through_hair.
Some more examples:
* /ops is short for one-piece_swimsuit
* /hooe is short for hair_over_one_eye
* /saol is short for standing_on_one_leg
* /tlozbotw is short for the_legend_of_zelda:_breath_of_the_wild
If two tags have the same abbreviation, then the tag with more posts
takes precedence. For example, /be is short for blue_eyes, not
brown_eyes, because blue_eyes has the higher post count.
If there is an existing shortcut alias that conflicts with the
abbreviation, then the alias takes precedence. For example, /sh is short
for suzumiya_haruhi, not short_hair, because there's an old alias for
/sh -> suzumiya_haruhi.
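One plausible way to implement the lookup, just to make the precedence rules concrete (this is a guess at the mechanics, not the actual code):

    # hypothetical: expand "/evth" by matching each letter against the
    # start of a word in the tag name
    def expand_abbreviation(abbrev)
      # an explicit alias like "/sh" wins over any abbreviation match
      aliased = TagAlias.active.find_by(antecedent_name: abbrev)
      return aliased.consequent_name if aliased

      letters = abbrev.delete_prefix("/").chars
      pattern = "^" + letters.join("[^_]*_") + "[^_]*$"

      # among the candidates, the tag with the most posts wins
      Tag.where("name ~ ?", pattern).order(post_count: :desc).first&.name
    end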
Add the following search operators:
* /tags?search[post_count_eq]=42
* /tags?search[post_count_not_eq]=42
* /tags?search[post_count_gt]=42
* /tags?search[post_count_gteq]=42
* /tags?search[post_count_lt]=42
* /tags?search[post_count_lteq]=42
Works for all numeric attributes on all index actions.
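Illustrative mapping from parameter suffix to SQL comparison (not the actual code; `attribute` is assumed to be whitelisted before interpolation):

    OPERATORS = {
      "eq" => "=", "not_eq" => "!=",
      "gt" => ">", "gteq" => ">=",
      "lt" => "<", "lteq" => "<=",
    }

    # search[post_count_gt]=42 -> WHERE post_count > 42
    def apply_numeric_param(relation, attribute, suffix, value)
      relation.where("#{attribute} #{OPERATORS.fetch(suffix)} ?", value)
    end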
* Rename the `#negate` and `#and` methods that we monkey patch into
ActiveRecord::Relation. These methods are now defined in Rails 6.1, but
they shadow our methods and have slightly different behavior.
* Fix a call to `invert`. It no longer accepts an argument.
* Factor out the code for moving tags from tag aliases to a separate
TagMover class.
* When aliasing two tags that have conflicting wikis, merge the old wiki
into the new one instead of failing with an error. Merge the other names
fields, replace the old wiki body with a message linking to the new
wiki, and mark the old wiki as deleted.
* When aliasing two tags that have conflicting artist entries, merge the
old artist into the new one instead of silently ignoring the conflict.
Merge the group name, other names, and urls fields, and mark the old
artist as deleted.
* When two tags have conflicting wikis or artist entries, but the old
wiki or artist entry is deleted, then just ignore the old wiki or
artist and don't try to merge it.
* Fix it so that when saved searches are rewritten, we rewrite negated
searches too.
Allow searching enum fields by string, by id, or by an array of
comma-separated values. The `category` field on mod actions is an
example of an enum field that can be searched this way.
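For example, all of these are equivalent ways to filter mod actions (the category name and id below are hypothetical):

    /mod_actions.json?search[category]=user_ban
    /mod_actions.json?search[category]=30
    /mod_actions.json?search[category]=user_ban,user_unban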
* A generalized search-includes function was added; the post and user
includes functions were changed to use it.
* A search function for polymorphic includes was added.
* All models are given three class functions to control which includes
are searchable, plus extra restrictions for the `has_` params.