Commit Graph

284 Commits

Author SHA1 Message Date
evazion
174c8e0067 Fix #5335: Queries with "ordfav:<username>" and geometry attributes (e.g. "ratio:", "height:") crashes the api/site.
Fix `Relation passed to #and must be structurally compatible. Incompatible values: [:joins] (ArgumentError)`
exception in `ordfav:evazion ratio:4:3` search. Broken by e849d8f1c.

We were effectively doing this:

    q1 = Post.joins(:favorites, :media_asset).where("favorites.user_id = ?", 52664).order("favorites.id DESC")
    q2 = Post.joins(:media_asset, :favorites).where("ROUND(media_assets.image_width::numeric / media_assets.image_height::numeric, 2) = 1.33")
    q3 = q1.and(q2)

This failed because Rails didn't like the fact that the joins were in a different order when the
queries were `and`-ed together.
2022-11-06 21:13:48 -06:00
evazion
e849d8f1c2 posts: optimize filetype: searches.
When searching posts by width, height, file size, or file extension, use the
values from the media_assets table rather than the posts table.

This makes filetype: searches faster because the file_ext is indexed on
the media assets table, but not on the posts table.

This paves the way for getting rid of the width, height, file_size, and
file_ext indexes on the posts table in the future. It's wasteful to
index these columns on both the posts table and the media assets table.
2022-11-02 02:03:14 -05:00
nonamethanks
ca31e7a47c Users: add Contributor and Approver user levels 2022-10-21 20:52:31 +02:00
evazion
4c03ea5be3 Fix #5132: Modqueue displays active posts when excluding any search term
Fix a bug where searching for a negated tag inside the modqueue would show
active posts.

The bug was that in a search like this:

    Post.in_modqueue.user_tag_match("-solo")

The `in_modqueue` condition would get sucked inside the tag search and negated
when we tried to apply the negation operator to the "solo" tag. So effectively
the `in_modqueue` condition would get negated and we would end up searching for
everything not in the modqueue.
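
The scoping problem can be illustrated without Rails. This is a minimal sketch, assuming posts are plain hashes and conditions are lambdas. The names `MODQUEUE_POSTS`, `IN_MODQUEUE`, `HAS_SOLO`, `BUGGY`, and `FIXED` are hypothetical, and modelling the bug as `NOT (modqueue AND solo)` is one plausible reading of the description above, not the actual query that was generated:

```ruby
# Hypothetical in-memory posts; the real code builds SQL relations instead.
MODQUEUE_POSTS = [
  { id: 1, status: "pending", tags: %w[solo] },
  { id: 2, status: "pending", tags: %w[scenery] },
  { id: 3, status: "active",  tags: %w[scenery] },
  { id: 4, status: "active",  tags: %w[solo] }
].freeze

IN_MODQUEUE = ->(post) { post[:status] == "pending" }
HAS_SOLO    = ->(post) { post[:tags].include?("solo") }

# Buggy shape: the negation swallows the modqueue condition as well.
BUGGY = ->(post) { !(IN_MODQUEUE.(post) && HAS_SOLO.(post)) }

# Intended shape: the modqueue condition stays outside the negation.
FIXED = ->(post) { IN_MODQUEUE.(post) && !HAS_SOLO.(post) }

BUGGY_IDS = MODQUEUE_POSTS.select(&BUGGY).map { |post| post[:id] }
FIXED_IDS = MODQUEUE_POSTS.select(&FIXED).map { |post| post[:id] }
```

With this data the buggy predicate lets the active posts 3 and 4 through, while the intended predicate keeps only the pending post without the `solo` tag.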
2022-09-28 00:29:50 -05:00
evazion
59f166a637 Fix #5057: Modqueue: filtering by tag breaks ordering
Fix the order dropdown box on the modqueue page not working when filtering by tag.

This happened because when you do a tag search, the default order is set to `ORDER BY posts.id DESC`.
When you applied another order with the dropdown box, the new order would be tacked on to the old
ordering as a tiebreaker instead of replacing it, producing e.g. `ORDER BY posts.id DESC, queued_at DESC`
instead of `ORDER BY queued_at DESC`. The default order would always win because `posts.id` is
unique and doesn't have ties.

The fix is to have orders always override the previous order instead of adding to it.

Note that now if you use an `order:`, `ordfav:`, or `ordpool:` metatag in the search box on the
modqueue page, they will always be ignored in favor of the dropdown box.
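
Why the appended order never took effect can be shown with a plain Ruby sort. This is a sketch under stated assumptions: `QueuedPost`, `QUEUE`, `appended_order`, and `replaced_order` are hypothetical names, and small integers stand in for `queued_at` timestamps:

```ruby
QueuedPost = Struct.new(:id, :queued_at)

QUEUE = [
  QueuedPost.new(1, 30),
  QueuedPost.new(2, 10),
  QueuedPost.new(3, 20)
].freeze

# Like ORDER BY posts.id DESC, queued_at DESC.
# id is unique, so the second key is never consulted.
def appended_order(posts)
  posts.sort_by { |post| [-post.id, -post.queued_at] }
end

# Like ORDER BY queued_at DESC: the new order replaces the default one.
def replaced_order(posts)
  posts.sort_by { |post| -post.queued_at }
end
```

Here `appended_order(QUEUE).map(&:id)` gives `[3, 2, 1]` regardless of `queued_at`, while `replaced_order(QUEUE).map(&:id)` gives `[1, 3, 2]`. In ActiveRecord terms this resembles the difference between chaining `order` (which appends) and `reorder` (which replaces), although the message above doesn't say which mechanism the fix uses.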
2022-09-28 00:29:50 -05:00
evazion
116c1f1af8 searchable: factor out metatag value parser.
Factor out the code that parses metatag values (e.g. `score:>5`) and
search URL params (e.g. `search[score]=>5`) into a RangeParser class.

Also fix a bug where, if the `search[order]=custom` param was used
without a `search[id]` param, an exception would be raised. Fix another
bug where if an invalid `search[id]` was provided, then the custom order
would be ignored and the search would be returned with the default order
instead. Now if you use `search[order]=custom` without a valid
`search[id]` param, the search will return no results.
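
As a rough idea of what parsing a value like `>5` involves, here is a much-simplified sketch. It is not the RangeParser class itself: `parse_range` and its return shape are hypothetical, and it only handles integer comparisons:

```ruby
# Hypothetical, simplified stand-in for a metatag value parser.
# Returns [operator, number], or [:invalid, nil] for unparseable input.
def parse_range(value)
  case value
  when /\A>=(-?\d+)\z/ then [:gteq, Regexp.last_match(1).to_i]
  when /\A<=(-?\d+)\z/ then [:lteq, Regexp.last_match(1).to_i]
  when /\A>(-?\d+)\z/  then [:gt, Regexp.last_match(1).to_i]
  when /\A<(-?\d+)\z/  then [:lt, Regexp.last_match(1).to_i]
  when /\A(-?\d+)\z/   then [:eq, Regexp.last_match(1).to_i]
  else [:invalid, nil]
  end
end
```

The same function can serve both `score:>5` in the search box and `search[score]=>5` in a URL, since both carry the same value string once the key is stripped.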
2022-09-26 22:50:45 -05:00
evazion
e5edd79180 flags: fix flagger:<name> not returning self-flagged uploads
Fix the search `flagger:evazion user:evazion` not returning the user's own self-flagged uploads.

Followup to a6e0872ce.

Fixes #4690: user profile 'flags' count links to /post_flags with different search criteria
2022-09-22 23:07:53 -05:00
evazion
a442658f8a Fix #5237: Deleted comments can be viewed by other users
* Fix it so non-moderators can't search deleted comments using the
  `updater`, `body`, `score`, `do_not_bump_post`, or `is_sticky` fields.
  Searching for these fields will exclude deleted comments.

* Fix it so non-moderators can search for their own deleted comments using the
  `creator` field, but not for deleted comments belonging to other users.

* Fix it so that if a regular user searches `commenter:<username>`, they
  can only see posts with undeleted comments by that user. If a moderator or
  the commenter themselves searches `commenter:<username>`, they can see all
  posts the user has commented on, including posts with deleted comments.

* Fix it so the comment count on user profiles only counts visible
  comments. Regular users can only see the number of undeleted comments
  a user has, while moderators and the commenter themselves can see the
  total number of comments.

Known issue:

* It's still possible to order deleted comments by score, which can let
  you infer the score of deleted comments.
2022-09-22 19:17:33 -05:00
evazion
1aeb52186e Add AI tag model and UI.
Add a database model for storing AI-predicted tags, and add a UI for browsing and searching these tags.

AI tags are generated by the Danbooru Autotagger (https://github.com/danbooru/autotagger). See that
repo for details about the model.

The database schema is `ai_tags (media_asset_id integer, tag_id integer, score smallint)`. This is
designed to be as space-efficient as possible, since in production we have over 300 million
AI-generated tags (6 million images and 50 tags per post). This amounts to over 10GB in size, plus
indexes.

You can search for AI tags using e.g. `ai:scenery`. You can do `ai:scenery -scenery` to find posts
where the scenery tag is potentially missing, or `scenery -ai:scenery` to find posts that are
potentially mistagged (or more likely where the AI missed the tag).

You can browse AI tags at https://danbooru.donmai.us/ai_tags. On this page you can filter by
confidence level. You can also search unposted media assets by AI tag.

To generate tags, use the `autotag` script from the Autotagger repo, something like this:

  docker run --rm -v ~/danbooru/public/data/360x360:/images ghcr.io/danbooru/autotagger ./autotag -c -f /images | gzip > tags.csv.gz

To import tags, use the fix script in script/fixes/. Expect a Danbooru-size dataset to take
hours to days to generate tags, then 20-30 minutes to import. Currently this all has to be done by hand.
2022-06-24 04:54:26 -05:00
evazion
d346adabc9 Revert "posts: fix rounding errors in ratio: metatag."
This reverts commit 80ced3e418.

This turned out to be intentional. Rounding the aspect ratio to 2
decimal places is so that searches for exact ratios like `ratio:16:9` or
`ratio:1.78` work even when the ratio doesn't exactly match. Rounding to
2 decimal places means that the ratio: metatag has a 1% error tolerance.
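
The effect of the rounding can be sketched in plain Ruby. The helpers `rounded_ratio` and `ratio_matches?` are hypothetical, and Ruby floats are used here in place of the Postgres `numeric` arithmetic that the real query uses:

```ruby
# Aspect ratio rounded to 2 decimal places, loosely mirroring
# ROUND(image_width / image_height, 2) in SQL.
def rounded_ratio(width, height)
  (width.to_f / height).round(2)
end

# A post matches when its rounded ratio equals the rounded search target.
def ratio_matches?(width, height, target)
  rounded_ratio(width, height) == target.round(2)
end

ratio_matches?(1920, 1080, 16.0 / 9) # exactly 16:9
ratio_matches?(1366, 768, 16.0 / 9)  # not exactly 16:9, but also rounds to 1.78
ratio_matches?(1600, 1200, 16.0 / 9) # 4:3, rounds to 1.33
```

The first two calls match and the third does not. The same rounding is what made the reverted commit's example possible: `3.99879.round(2)` is `4.0`, so such a post satisfies `ratio:>=4.0`.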
2022-05-22 12:37:26 -05:00
evazion
80ced3e418 posts: fix rounding errors in ratio: metatag.
Fix the ratio: metatag sometimes including wrong results due to rounding
errors. For example, searching for `ratio:>=4.0` would include post
3220414, which has an aspect ratio of 3.99879. Rounded to 2 decimal
places, this becomes 4.00.
2022-05-21 14:08:50 -05:00
evazion
9867514a78 Fix #5177: ordfav with commentary search raises exception. 2022-05-20 22:59:02 -05:00
evazion
181639368c posts: add is: and has: metatags.
Add the following metatags:

* is:parent
* is:child
* is:safe
* is:questionable
* is:explicit
* is:sfw (same as -rating:q,e)
* is:nsfw (same as rating:q,e)
* is:active
* is:deleted
* is:pending
* is:flagged
* is:appealed
* is:banned
* is:modqueue
* is:unmoderated
* is:jpg
* is:png
* is:gif
* is:mp4
* is:webm
* is:swf
* is:zip
* has:parent
* has:children
* has:source
* has:appeals
* has:flags
* has:replacements
* has:comments
* has:commentary
* has:notes
* has:pools

All of these searches were already possible with other metatags, but these might be more convenient.
2022-05-18 13:04:15 -05:00
evazion
141044d352 posts: refactor hardcoded ratings.
Refactor ratings to not be hardcoded in various places. Make it so
all ratings are defined in Post::RATINGS.

Also make it so that you can search multiple ratings at once with `rating:q,e`.
2022-05-18 13:04:15 -05:00
evazion
f117049750 users: remove 'hide deleted posts' account setting.
This setting automatically added the `-status:deleted` metatag to all searches. This meant deleted
posts were filtered out at the database level, rather than at the html level. This way searches
wouldn't have less-than-full pages.

The cost was that searches were slower, mainly because post counts weren't cached. Normally when you
search for a tag, we can get the post count from the tags table. If the search is actually like
`touhou -status:deleted`, then we don't know the count and we have to calculate it on demand.

This option is being removed because it did the opposite of what people thought it did. People
thought it made deleted posts visible, when actually it made them more hidden.
2022-05-01 00:47:46 -05:00
evazion
918f32c554 Fix #4461: Improve posts/index page titles. 2022-04-30 01:52:33 -05:00
evazion
bbe748bd2b posts: factor out post edit logic.
Factor out most of the tag edit logic from the Post class to a new
PostEdit class. The PostEdit class contains the logic for parsing tags
and metatags from the tag edit string, and for determining which tags
were added or removed by the edit.

Fixes various bugs caused by not calculating the set of added or removed
tags correctly, for example when tag category prefixes were used (e.g.
`copy:touhou`) or when the same tag was added and removed in the same
edit (e.g. `touhou -touhou`).

Fixes #5123: Tag categorization prefixes bypass deprecation check
Fixes #5126: Negating a deprecated tag will still cause the warning to show
Fixes #3477: Remove tag validator triggering on tag category changes
Fixes #4848: newpool: metatag doesn't parse correctly
2022-04-29 17:13:33 -05:00
evazion
eca0ab04f7 post queries: raise error on invalid searches.
Raise an error if the search is invalid for one of the following reasons:

* It contains multiple conflicting order: metatags (e.g. `order:score order:favcount` or `ordfav:a ordfav:b`).
* It contains a metatag that can't be used more than once (e.g. `limit:5 limit:10`, `random:5 random:10`).
* It contains a metatag that can't be negated (e.g. `-order:score`, `-limit:20`, or `-random:20`).
* It contains a metatag that can't be used in an OR clause (e.g. `touhou or order:score`, `touhou or limit:20`, `touhou or random:20`).
2022-04-17 23:20:22 -05:00
evazion
af183467b6 post queries: switch to new post search engine.
Switch to the post search engine using the new PostQuery parser. The new
engine fully supports AND, OR, and NOT operators and grouping expressions
with parentheses.

Highlights:

New OR operator:

* `skirt or dress` (same as `~skirt ~dress`)

Tags can be grouped with parentheses:

* `1girl (skirt or dress)`
* `(blonde_hair blue_eyes) or (red_hair green_eyes)`
* `~(blonde_hair blue_eyes) ~(red_hair green_eyes)` (same as above)
* `(pantyhose or thighhighs) (black_legwear or brown_legwear)`
* `(~pantyhose ~thighhighs) (~black_legwear ~brown_legwear)` (same as above)

Metatags can be OR'd together:

* `user:evazion or fav:evazion`
* `~user:evazion ~fav:evazion`

Wildcard tags can be combined with either AND or OR:

* `black_* white_*` (find posts with at least one black_* tag AND one white_* tag)
* `black_* or white_*` (find posts with at least one black_* tag OR one white_* tag)
* `~black_* ~white_*` (same as above)

See 4c7cfc73 for more syntax examples.

Fixes #4949: And+or search?
Fixes #5056: Wildcard searches return unexpected results when combined with OR searches
2022-04-17 23:20:22 -05:00
evazion
86de5cb5d2 posts: fixup flagger: metatag.
Fix regression in 01a22930e.
2022-04-06 23:57:50 -05:00
evazion
01a22930e7 posts: move attribute search methods from PostQueryBuilder to Post.
Move `status_matches` etc methods from PostQueryBuilder to Post. This is
to make refactoring to use the new query parser easier.
2022-04-06 20:25:09 -05:00
evazion
f15f365375 Merge pull request #4952 from thayol/fix-negated-ord
Search: "Fix" negated ordered metatags
2022-04-06 04:43:24 -05:00
Thayol
e45e42d479 Use longer lines instead of conditional variables (CodeClimate) 2022-03-31 23:38:40 +02:00
Thayol
70c81f7d49 Change local variable instead of passed object 2022-03-31 23:35:22 +02:00
Thayol
89b40a65ba Refactor to hash from multiple ifs 2022-03-30 17:54:52 +02:00
evazion
defea08084 posts: fix exception in random:1 searches.
Fix regression in 1ad0e8688. Caused by `relation.order_values` returning
an array of Arel nodes instead of an array of strings when doing a
`random:1` search.
2022-03-21 01:29:10 -05:00
evazion
1ad0e8688d posts: fix timeouts for searches using sequential navigation.
Fix certain searches timing out when using sequential navigation (page=b1234).

The problem was that the so-called "small search optimization" (AKA: force Postgres
to use the tag index for small searches instead of a sequential scan) wasn't triggering
because the ORDER BY clause for sequential navigation was `posts.id desc`, and we
were only checking for `posts.id DESC`.
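
The kind of mismatch described here is easy to reproduce. This is a hypothetical sketch, not the actual check or the actual fix: `default_order_strict?` and `default_order_normalized?` are invented names, and the real code may inspect the relation differently:

```ruby
# A strict string comparison misses the lowercase spelling.
def default_order_strict?(order_clause)
  order_clause == "posts.id DESC"
end

# Comparing case-insensitively treats both spellings as the default order.
def default_order_normalized?(order_clause)
  order_clause.strip.casecmp?("posts.id DESC")
end
```

With the strict version, `"posts.id desc"` is not recognized as the default order, so any optimization gated on that check is skipped. The normalized version accepts both `"posts.id desc"` and `"posts.id DESC"`.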
2022-03-20 18:46:06 -05:00
Thayol
cbe7ee4897 Remove trailing whitespace 2022-01-21 18:02:08 +01:00
evazion
1518c3c4be posts: fix search queries not being logged to NewRelic in some cases (#4900)
Fix the /posts index controller not logging the normalized search query
to NewRelic when the search failed, either because of a tag limit error,
a search timeout, or a RSS feed rate limit error.

Also don't log the number of search results when it's an API request or
failed search. This is to avoid doing a potentially slow full post count
when it's not otherwise needed.
2022-01-11 13:39:30 -06:00
evazion
72ea78e697 searchable: replace find_ordered with in_order_of.
Rails 7 added an `in_order_of` method that does what our `find_ordered`
method did before.
2022-01-07 14:24:57 -06:00
Thayol
5799b1bdbe Remove double looping 2022-01-04 21:02:15 +01:00
Thayol
660ba43edb Replace ordered metatags when negated 2022-01-04 20:38:27 +01:00
evazion
a7dc05ce63 Enable frozen string literals.
Make all string literals immutable by default.
2021-12-14 21:33:27 -06:00
evazion
0baca68a37 search: make order:random truly random; add random:N metatag.
Make the `order:random` metatag truly randomize the search. Add a
`random:N` metatag that returns up to N random posts, like what
`order:random` did before.

`order:random` now returns the entire search in random order. Before it
just returned a pageful of pseudorandom posts. This will be more
accurate for small searches, but slower for large searches. If
`order:random` times out, try `random:N` instead.

The `random:N` metatag returns up to N pseudorandom posts. This is
faster than `order:random` for large searches, but for small searches,
it may return fewer than N posts, and the randomness may be biased. Some
posts may be more likely than others to appear. N must be between 0 and
200.

Also, `/posts?tags=touhou&random=1` now redirects to `/posts?tags=touhou+random:N`.
Before the `random=1` param acted like a free `order:random` tag; now it
redirects to a `random:N` search, so it counts against your tag limit.
2021-11-25 18:14:34 -06:00
evazion
5dc67613e6 search: optimize username metatags.
Optimize metatag searches involving usernames, including user:,
approver:, appealer:, commenter:, upvoter:, etc.

Do `User.find_by_name` instead of `User.name_matches` because this
fetches the user upfront instead of doing it inside a subquery. Using a
subquery makes the SQL more complicated and leads to worse query plans.
This especially helps searches involving multiple username metatags.
2021-11-25 00:40:53 -06:00
evazion
353e708538 votes: allow admins to remove post votes.
Allow admins to remove votes on posts. This is for fixing vote abuse.

Votes can be removed by going to the vote list on the /post_votes page,
or by clicking on a post's score, then using the "Remove" option in the
"..." dropdown menu next to the vote.

Votes are soft-deleted - they're marked as deleted in the database, but
not fully deleted. Removed votes are only visible to admins, not to
regular users. When a vote is removed by an admin, it leaves a mod
action.

Technically it's possible to undelete votes, but there's no UI for it.
2021-11-23 23:18:54 -06:00
evazion
43c2870664 Fix #4917: Add down_score/up_score orders and metasearches.
Add `upvotes:N`, `downvotes:N`, `order:upvotes`, `order:downvotes`,
`order:upvotes_asc`, `order:downvotes_asc` metatags.

In the API, the field is called up_score / down_score. Here it's called
`upvotes` and `downvotes` because this should be easier to understand
for end users.

Note that internally, `down_score` is negative. A post that matches
`downvotes:>5` will have down_score < -5 internally.
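
The sign flip can be sketched as follows. `FLIPPED_OPERATORS`, `downvotes_to_down_score`, and `matches_downvotes?` are hypothetical helpers for illustration, not the code added by this commit:

```ruby
# down_score is stored as a negative number, so a comparison on the
# user-facing downvote count has to be mirrored.
FLIPPED_OPERATORS = {
  ">" => "<", ">=" => "<=", "<" => ">", "<=" => ">=", "=" => "="
}.freeze

# Translate e.g. downvotes:>5 into the condition down_score < -5.
def downvotes_to_down_score(operator, count)
  [FLIPPED_OPERATORS.fetch(operator), -count]
end

# Check a stored down_score against a user-facing downvotes condition.
def matches_downvotes?(down_score, operator, count)
  flipped, bound = downvotes_to_down_score(operator, count)
  flipped = "==" if flipped == "="
  down_score.public_send(flipped, bound)
end
```

For example, `downvotes_to_down_score(">", 5)` returns `["<", -5]`, so a post with `down_score = -6` matches `downvotes:>5` while one with `down_score = -5` does not.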
2021-11-16 03:52:38 -06:00
evazion
148752d3c4 PostQueryBuilder: remove useless code.
The workaround for `unaliased:fav:1` is no longer needed since favorites
are no longer included in the post's tag_index.
2021-11-02 04:07:21 -05:00
evazion
a5ed8c72c9 search: fix parsing of invalid metatag values.
* Change `age:` metatag to require time units. This means e.g.
  `age:<600` no longer works; instead you have to say `age:<600sec`.

* Allow time units in the `age:` metatag to be abbreviated as long as
  they're unambiguous. This means `age:<60sec`, `age:<5min`, and
  `age:<5mon` now work, in addition to `age:<60s` and `age:<60seconds`.

* Allow the `ratio:` metatag to be written like `ratio:16/9` in addition
  to `ratio:16:9`.

* Fix invalid date searches like `date:foo` or `date:05-15-2021`
  to return nothing instead of raising an "undefined method
  'beginning_of_day' for nil" exception. (`date:05-15-2021` is invalid
  because it's parsed as DD-MM-YYYY).

* Fix invalid searches like `score:foo`, `ratio:foo`, and `mpixels:foo`
  to return nothing instead of being treated like `score:0`, `ratio:0`,
  `mpixels:0`.

* Fix `age:<60m` to return nothing instead of silently being treated
  like `age:<60seconds`.

* Fix `age:foo` to return nothing instead of silently being treated like
  `age:0d` (return all uploads from today).

Fixes #4389.
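
The unit abbreviation rule can be sketched as unambiguous prefix matching. This is an illustration only: `AGE_UNITS` and `parse_age` are hypothetical, the exact list of supported units is an assumption, and the comparison operator (`<`, `>`) is assumed to have been stripped already:

```ruby
# Assumed unit list; the real parser may support a different set.
AGE_UNITS = %w[seconds minutes hours days weeks months years].freeze

# Returns [number, unit] when the value has a number followed by a unit
# prefix that matches exactly one unit, and nil otherwise.
def parse_age(value)
  match = /\A(\d+)([a-z]+)\z/.match(value)
  return nil unless match

  candidates = AGE_UNITS.select { |unit| unit.start_with?(match[2]) }
  return nil unless candidates.size == 1

  [match[1].to_i, candidates.first]
end
```

Under these assumptions `parse_age("60s")` and `parse_age("60sec")` both give `[60, "seconds"]`, `parse_age("5mon")` gives `[5, "months"]`, and `parse_age("60m")` gives `nil` because `m` could mean minutes or months. A bare `parse_age("600")` or `parse_age("foo")` is also `nil`, which a caller could turn into an empty result set.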
2021-11-02 01:54:05 -05:00
evazion
84212acfae Merge pull request #4905 from nottalulah/remove-locks-from-autocomplete
remove references to locks
2021-10-25 21:18:36 -05:00
evazion
f1b5c34b4d posts: show length of videos and animations in thumbnails.
Show the length of videos and animated posts in the thumbnail. The
length is shown in the top left corner in MM:SS format. This replaces the
play button icon.

Show a speaker icon instead of a music note icon for posts with sound.

Doing this requires doing `.includes(:media_asset)` in a bunch of
places to avoid N+1 queries when we access the post's duration.
2021-10-25 02:56:55 -05:00
Lily
647848b499 remove references to locks 2021-10-24 15:16:48 -03:00
evazion
587a9d0c8f tags: move tag category definitions out of the config file.
Move all the code for defining tag categories from the config file to
TagCategory. It didn't belong in the config because it's not possible to
add new tag categories purely in the config without editing other things
like the CSS.

Also change it so that tag colors are hardcoded in the CSS instead of
generated using ERB. Generating the CSS in ERB meant that the Docker
build had to recompile the CSS on every commit, even when it didn't
change, because it relied on Ruby code outside the CSS that we couldn't
guarantee didn't change.
2021-10-12 21:17:17 -05:00
evazion
92e20713e3 search: fixup hardcoded small search threshold.
Fixup for f6abf39eb.
2021-10-12 19:01:31 -05:00
evazion
f6abf39ebc search: try to optimize slow searches.
Try to optimize certain types of common slow searches:

* Searches for mutually-exclusive tags (e.g. `1girl multiple_girls`,
  `touhou solo -1girl -1boy`)
* Relatively large tags that are heavily skewed towards old posts
  (e.g. lucky_star, haruhi_suzumiya_no_yuuutsu, inazuma_eleven_(series),
  imageboard_desourced).
* Mid-sized tags in the <30k post range that Postgres thinks are
  big enough for a post id index scan, but a tag index scan is faster.

The general pattern is Postgres not using the tag index because it
thinks scanning down the post id index would be faster, but it's
actually much slower because it degrades to a full table scan. This
usually happens when Postgres thinks a tag is larger or more common than
it really is. Here we try to force Postgres into using the tag index
when we know the search is small.

One case that is still slow is `2girls -multiple_girls`. This returns no
results, but we can't know that without searching all of `2girls`. The
general case is searching for `A -B` where A is a subset of B and A and B
are both large tags.

Hopefully fixes #581, #654, #743, #1020, #1039, #1421, #2207, #4070,
 #4337, #4896, and various other issues raised over the years regarding
slow searches.
2021-10-12 02:30:30 -05:00
evazion
0b22e873c9 search: cache timed out search counts.
When a search is performed, we cache the post count so we don't have to
calculate it again every time the user switches pages. However, if the
count times out, we didn't cache it before, causing us to do a slow
count on every page load. This usually happens on multi-tag searches
that return a lot of results, `1girl solo` for example.

This changes it so that the count is cached even when it times out. This
will speed up large multi-tag searches.

This also changes it so that the count is cached for a fixed 5 minutes.
Before it was variable based on the size of the count, but this probably
didn't make much difference.
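
The caching behaviour described above can be sketched with a small in-memory cache. This is not the actual implementation: `CountCache`, the `:timed_out` marker, and the injectable clock are all assumptions made for illustration, and the real code presumably uses the application's shared cache rather than a Ruby hash:

```ruby
require "timeout"

# Hypothetical cache that remembers timed-out counts as well as real ones.
class CountCache
  TTL = 5 * 60 # seconds; the fixed 5 minutes mentioned above

  def initialize(clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @clock = clock
    @entries = {}
  end

  # Returns the cached value for the query if it is still fresh.
  # Otherwise runs the block, caches its result for TTL seconds, and
  # returns it. A Timeout::Error is cached as :timed_out so that the slow
  # count is not retried on every page load.
  def fetch(query)
    entry = @entries[query]
    return entry[:value] if entry && @clock.call < entry[:expires_at]

    value = begin
      yield
    rescue Timeout::Error
      :timed_out
    end

    @entries[query] = { value: value, expires_at: @clock.call + TTL }
    value
  end
end
```

A caller would wrap the expensive count in the block, for example `cache.fetch("1girl solo") { slow_count }` where `slow_count` is a placeholder, and treat `:timed_out` as "count unknown" when rendering the paginator.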
2021-10-12 01:33:21 -05:00
evazion
f155023b77 posts: remove unused exception classes. 2021-10-11 18:58:15 -05:00
evazion
37a8dc5dbd posts: use string_to_array index for tag searches.
Use the `string_to_array(tag_string, ' ')` index instead of the
`tag_index` for tag searches. The string_to_array index lets us treat
the tag_string as an array for searching purposes. This lets us get rid
of the tag_index column and the test_parser dependency in the future.
2021-10-10 22:00:10 -05:00
evazion
1653392361 posts: stop updating fav_string attribute.
Stop updating the fav_string attribute on posts. The column still exists
on the table, but is no longer used or updated.

Like the pool_string in 7d503f08, the fav_string was used in the past to
facilitate `fav:X` searches. Posts had a hidden fav_string column that
contained a list of every user who favorited the post. These were
treated like fake hidden tags on the post so that a search for `fav:X`
was treated like a tag search.

The fav_string attribute has been unused for search purposes for a while
now. It was only kept because of technicalities that required
departitioning the favorites table first (340e1008e) before it could be
removed. Basically, removing favorites with `@favorite.destroy` was
slow because Rails always deletes objects by ID, but we didn't have an
index on favorites.id, and we couldn't easily add one until the
favorites table was departitioned.

Fixes #4652. See https://github.com/danbooru/danbooru/issues/4652#issuecomment-754993802
for more discussion of issues caused by the fav_string (in short: write
amplification, post table bloat, and favorite inconsistency problems).
2021-10-09 22:36:26 -05:00
evazion
c4eeeb8531 search: optimize counting posts for fav: and pool: searches.
Optimize counting the number of posts returned by fav:<name> and
pool:<name> searches. Use cached counts to avoid slow count(*) queries
for users with lots of favorites.
2021-10-08 21:26:42 -05:00