Commit Graph

60 Commits

Author SHA1 Message Date
evazion
51ba56e8a3 Fix #5001: Media assets not searchable through upload records.
Fix this:

  https://danbooru.donmai.us/uploads.json?search[media_assets][md5]=b83daa7f1ae7e4127b1befd32f71ba10

failing with an ActiveRecord::StatementInvalid error.

The bug was that for a `has_many through: ...` association, like
`has_many :media_assets, through: :upload_media_assets`, we weren't
joining on the associated table properly so we ended up generating
invalid SQL.
2022-02-08 19:18:11 -06:00
evazion
33103f6dc4 pools: add ability to search for pools linking to given tag.
Add ability to search for pools linking to a given tag in the pool
description. Example:

    https://danbooru.donmai.us/pools?search[linked_to]=touhou

(This isn't actually exposed in the UI to avoid cluttering the pool
search form with rarely used options.)

Pools with broken links can be found here:

    https://danbooru.donmai.us/dtext_links?search[has_linked_tag]=No&search[has_linked_wiki]=No&search[model_type]=Pool

Lays the groundwork for fixing #4629.
2022-01-15 20:26:30 -06:00
evazion
72ea78e697 searchable: replace find_ordered with in_order_of.
Rails 7 added an `in_order_of` method that does what our `find_ordered`
method did before.
2022-01-07 14:24:57 -06:00
evazion
82211ba935 jobs: add ability to search jobs on /jobs page.
Add ability to search jobs on the /jobs page by job type or by status.

Fixes #2577 (Search filters for delayed jobs). This wasn't possible
before with DelayedJobs because it stored the job data in a YAML string,
which made it difficult to search jobs by type. GoodJobs stores job data
in a JSON object, which is easier to search in Postgres.
2022-01-04 17:18:36 -06:00
evazion
a7dc05ce63 Enable frozen string literals.
Make all string literals immutable by default.
2021-12-14 21:33:27 -06:00
evazion
6b9e1181e5 search: optimize ?search[user_name]=... searches.
Optimize searches using the `search[user_name]=...` URL parameter. If
we're not doing a wildcard search, then do a regular user lookup, which
generates better SQL.
2021-11-20 03:19:04 -06:00
evazion
bc96eb864b votes: make private favorites and upvotes a Gold-only option.
Make private favorites and upvotes a Gold-only account option.

Existing Members with private favorites enabled are allowed to keep it
enabled, as long as they don't disable it. If they disable it, then they
can't re-enable it again without upgrading to Gold first.

This is a Gold-only option to prevent uploaders from creating multiple
accounts to upvote their own posts. If private upvotes were allowed for
Members, then it would be too easy to use fake accounts and private
upvotes to upvote your own posts.
2021-11-18 04:11:51 -06:00
evazion
2845164872 search: support quoted phrases, OR, and NOT operators in full-text search.
Make all full-text search fields support quoted phrases and OR and NOT
operators.

This affects all text search fields (any search field that looks like `*_matches`).

Examples:

* hakurei reimu   - matches anything containing the words "hakurei" and "reimu", in any order.
* hakuri or reimu - matches either "hakurei" or "reimu".
* hakurei -reimu  - matches "hakurei" but not "reimu"
* "hakurei reimu" - matches the exact phrase "hakurei reimu"
* "reimu hakurei" - matches the exact phrase "reimu hakurei"

* https://danbooru.donmai.us/notes?search[body_matches]=reimu+hakurei
* https://danbooru.donmai.us/notes?search[body_matches]=reimu+or+hakurei
* https://danbooru.donmai.us/notes?search[body_matches]=reimu+-hakurei
* https://danbooru.donmai.us/notes?search[body_matches]="hakurei+reimu"
* https://danbooru.donmai.us/notes?search[body_matches]="reimu+hakurei"

The phrase search ability partially fixes #4536 (Inconsistent behavior
of search function for comments/forums).

See `websearch_to_tsquery` [1] for full details of the search syntax.

[1]: https://www.postgresql.org/docs/current/textsearch-controls.html#TEXTSEARCH-PARSING-QUERIES
2021-10-16 19:13:09 -05:00
evazion
e3b836b506 Refactor full-text search to get rid of tsvector columns.
Refactor full-text search on several tables (comments, dmails,
forum_posts, forum_topics, notes, and wiki_pages) to use to_tsvector
expression indexes instead of dedicated tsvector columns. This way
full-text search works the same way across all tables.

API changes:

* Changed /wiki_pages.json?search[body_matches] to match against only
  the body. Before `body_matches` matched against both the title and the body.

* Added /wiki_pages.json?search[title_or_body_matches] to match against
  both the title and the body.

* Fixed /dmails.json?search[message_matches] to match against both the
  title and body when doing a wildcard search. Before a wildcard search
  only matched against the body.

* Added /dmails.json?search[body_matches] to match against only the dmail body.
2021-10-16 07:44:27 -05:00
evazion
c0f744f84d Fix #4893: Add a FIELD_present parameter variation for text fields.
Usage:

* https://danbooru.donmai.us/wiki_pages.json?search[body_present]=true
* https://danbooru.donmai.us/wiki_pages.json?search[body_present]=false
2021-10-13 04:10:23 -05:00
evazion
7976323f7a wiki pages: change tsvector update trigger to not use test_parser.
Change the wiki_pages tsvector_update_trigger to use
`pg_catalog.english` instead of `public.danbooru`. This changes how wiki
page text is parsed for full-text search to use the standard English
parser instead of test_parser. This is to prepare for dropping
test_parser. Using test_parser here was wrong anyway because it meant
that punctuation wasn't removed from words when indexing wiki pages for
full-text search.
2021-10-11 03:34:47 -05:00
evazion
37a8dc5dbd posts: use string_to_array index for tag searches.
Use the `string_to_array(tag_string, ' ')` index instead of the
`tag_index` for tag searches. The string_to_array index lets us treat
the tag_string as an array for searching purposes. This lets us get rid
of the tag_index column and the test_parser dependency in the future.
2021-10-10 22:00:10 -05:00
evazion
ea6e47125e metadata: add ability to search exif metadata.
Usage:

* https://danbooru.donmai.us/media_metadata?search[has_metadata]=true
* https://danbooru.donmai.us/media_metadata?search[has_metadata]=false
* https://danbooru.donmai.us/media_metadata?search[metadata_has_key]=GIF:GIFVersion
* https://danbooru.donmai.us/media_metadata?search[metadata][GIF:GIFVersion]=89a
* https://danbooru.donmai.us/media_metadata?search[metadata][GIF:GIFVersion]&search[metadata][GIF:BackgroundColor]=0
2021-09-16 00:25:21 -05:00
evazion
49d18e64e8 Fix #4869: "Random" button raises exception when viewing ordfav.
Fix exception during https://danbooru.donmai.us/posts/random?tags=ordfav:nonamethanks

Before we were doing a query like this:

    SELECT
      "posts".*
    FROM
      "posts"
    INNER JOIN
      "favorites" ON "favorites"."post_id" = "posts"."id"
    WHERE
      (favorites.user_id % 100 = 64 AND favorites.user_id = 52664)
      AND "posts"."id" = 343894
    ORDER BY
      favorites.id DESC,
      posts.id DESC,
      ID=343894 DESC

but `ID=? DESC` is ambiguous during an ordfav: search because of the
join on the favorites table. The fix is to qualify the reference as
`posts.id`.
2021-08-30 16:46:03 -05:00
evazion
ad4c75eb1a docs add more docs to app/{jobs,logical}.
These were missed in the last commit.
2021-06-28 05:09:19 -05:00
evazion
00ca7526bb docs: add remaining docs for classes in app/logical. 2021-06-24 01:31:41 -05:00
evazion
6b91e55283 comments: allow votes to be soft deleted.
Make it so that when a user removes their own vote, the vote is soft
deleted (the is_deleted flag is set) instead of hard deleted.

Changes:

* Add is_deleted flag to comment votes.
* Relax uniqueness constraint so you can have multiple deleted votes on
  the same comment. You can still only have one active vote on the comment.
* Add `soft_delete` method to Deletable concern.
2021-03-30 00:10:22 -05:00
evazion
81fe68d392 bans: change expires_at field to duration.
Changes:

* Change the `expires_at` field to `duration`.
* Make moderators choose from a fixed set of standard ban lengths,
  instead of allowing arbitrary ban lengths.
* List `duration` in seconds in the /bans.json API.
* Dump bans to BigQuery.

Note that some old bans have a negative duration. This is because their
expiration date was before their creation date, which is because in 2013
bans were migrated to Danbooru 2 and the original ban creation dates
were lost.
2021-03-11 02:59:58 -06:00
evazion
92b8f24724 ip addresses: move more logic to Danbooru::IpAddress.
* Move `is_local?` from IpLookup to Danbooru::IpAddress.
* Refactor more things to use Danbooru::IpAddress instead of using
  IPAddress directly.
2021-03-01 20:13:14 -06:00
evazion
031032326e mentions: fix exception when mentioning nonexistent user. 2021-02-05 19:40:30 -06:00
evazion
3f16fe3d80 Fix #4680: @-ing yourself sends you a DMail.
Don't send a dmail when the user @-mentions themselves, whether in an
edit or in the original message.
2021-02-03 23:46:59 -06:00
evazion
ef177a09cf searchable: fixup bugs in e7b454686. 2021-01-11 19:47:20 -06:00
evazion
c1b865b160 searchable: add more enum attribute search options.
Add `<enum>_not` and `<enum>_id_<op>` search options:

* https://danbooru.donmai.us/mod_actions?search[category_not]=post_regenerate,post_regenerate_iqdb
* https://danbooru.donmai.us/mod_actions?search[category_not]=48,49
* https://danbooru.donmai.us/mod_actions?search[category_id]=40..50
* https://danbooru.donmai.us/mod_actions?search[category_id_not]=40..50
* https://danbooru.donmai.us/mod_actions?search[category_id_gt]=40&search[category_id_lt]=50
2021-01-11 19:13:35 -06:00
evazion
e7b454686e searchable: refactor where_operator method.
Refactor the `where_operator` method so we can use it to avoid raw SQL
in more places.
2021-01-11 19:13:29 -06:00
evazion
6d2eeb6f28 searchable: fix being unable to use multiple operators on same attribute.
Fix searches like this not working:

* https://danbooru.donmai.us/tags?search[id]=1..100&search[id_not]=50

Before one of these params would override the other.
2021-01-11 14:59:04 -06:00
evazion
fc5db679e4 autocomplete: optimize searching by artist/wiki page other names.
Optimize searches for non-English phrases in autocomplete. These
searches were pretty slow, and could sometimes cause sitewide lag spikes
when users typed long strings of non-English text into the search box
and caused an unintentional DoS.

The trick is to use an `array_to_tsvector(other_names) USING gin` index
on other_names. This supports fast string prefix matching against all
elements of the array. The downside is that it doesn't allow infix or
suffix matches, so we can't support wildcards in general. Wildcards
didn't quite work anyway, since artist and wiki other names can contain
literal '*' characters.
2021-01-10 03:35:12 -06:00
evazion
0899194f6b Fix conflict between normalize and array_attribute macros.
Fix the `normalize` and `array_attribute` macros conflicting with each
other on the WikiPage model. This meant code like
`wiki_page.other_names = "foo bar"` didn't work. Both macros defined a
`other_names=` method, but one method overrode the other.

The fix is to use anonymous modules and prepend so we can chain method
calls with super.
2021-01-10 02:03:12 -06:00
evazion
9759701071 search: add way to search array attributes by regex.
Add a `where_any_in_array_matches_regex` method and expose it to the API:

 * https://danbooru.donmai.us/artists?search[any_other_name_matches_regex]=^blah
 * https://danbooru.donmai.us/wiki_pages?search[any_other_name_matches_regex]=^blah
 * https://danbooru.donmai.us/saved_searches?search[any_label_matches_regex]=^blah

In SQL, this does `WHERE '^blah' ~<< ANY(other_names)`, where `~<<` is a
custom operator based on the `~` regex match operator, but with the
arguments reversed. This allows it to be used with the ANY(array) operator.

See also:

* https://stackoverflow.com/a/22101172
* https://www.postgresql.org/docs/current/sql-createfunction.html
* https://www.postgresql.org/docs/current/sql-createoperator.html
* https://www.postgresql.org/docs/current/functions-comparisons.html
2021-01-10 02:03:02 -06:00
evazion
65adcd09c2 users: track logins, signups, and other user events.
Add tracking of certain important user actions. These events include:

* Logins
* Logouts
* Failed login attempts
* Account creations
* Account deletions
* Password reset requests
* Password changes
* Email address changes

This is similar to the mod actions log, except for account activity
related to a single user.

The information tracked includes the user, the event type (login,
logout, etc), the timestamp, the user's IP address, IP geolocation
information, the user's browser user agent, and the user's session ID
from their session cookie. This information is visible to mods only.

This is done with three models. The UserEvent model tracks the event
type (login, logout, password change, etc) and the user. The UserEvent
is tied to a UserSession, which contains the user's IP address and
browser metadata. Finally, the IpGeolocation model contains the
geolocation information for IPs, including the city, country, ISP, and
whether the IP is a proxy.

This tracking will be used for a few purposes:

* Letting users view their account history, to detect things like logins
  from unrecognized IPs, failed logins attempts, password changes, etc.
* Rate limiting failed login attempts.
* Detecting sockpuppet accounts using their login history.
* Detecting unauthorized account sharing.
2021-01-08 22:34:37 -06:00
evazion
da3e8e4726 searchable: fix bug with searching multiple association attributes.
Fix a bug with searches like the following not working correctly:

* https://danbooru.donmai.us/comments.json?search[creator][level]=20&search[creator_id]=1234
* https://danbooru.donmai.us/comments.json?search[creator][level]=20&search[creator_name]=abcd
* https://danbooru.donmai.us/comments.json?search[post][rating]=s&search[post_tags_match]=touhou

It wasn't possible to search for both `creator` and `creator_id` at the
same time (or `post` and `post_tags_match`, etc). Only the `creator_id`
param would be recognized.

Also refactor some internals:

* `search_includes` was renamed to `search_associated_attribute`.
* `search_attribute` was split up into `search_basic_attribute` and
  `search_associated_attribute`.
2021-01-07 17:10:29 -06:00
BrokenEagle
db5f9ce243 Support multiple excludes for enum types
It's not possible to pass it off to search_numeric_attribute directly
since the column "category" does not match the prefix "category_id".
2021-01-06 20:21:56 +00:00
BrokenEagle
57de81686b Support using all numeric searches for includes 2021-01-06 20:21:56 +00:00
BrokenEagle
4a439d72d6 Support multiple exclusions
Since it does a not of numeric_attribute_matches which uses the
post query builder, it now also support reverse ranges and reverse
greater/less than.
2021-01-06 20:21:55 +00:00
evazion
7f1b798b05 searchable: refactor search_boolean_attribute. 2020-12-27 05:26:21 -06:00
evazion
efb836ac02 wikis: normalize Unicode characters in wiki bodies.
* Introduce an abstraction for normalizing attributes. Very loosely
  modeled after https://github.com/fnando/normalize_attributes.
* Normalize wiki bodies to Unicode NFC form.
* Normalize Unicode space characters in wiki bodies (strip zero width
  spaces, normalize line endings to CRLF, normalize Unicode spaces to
  ASCII spaces).
* Trim spaces from the start and end of wiki page bodies. This may cause
  wiki page diffs to show spaces being removed even when the user didn't
  explicitly remove the spaces themselves.
2020-12-21 20:47:50 -06:00
evazion
2c1da660fd tags: allow tag abbreviations in searches and during tagging.
Expand the tag abbreviation system introduced in b0be8ae45 so that it
works in searches and when tagging posts, not just in autocomplete.

For example, you can tag a post with /evth and it will add the tag
eyebrows_visible_through_hair. You can search for /evth and it will
search for the tag eyebrows_visible_through_hair.

Some more examples:

* /ops is short for one-piece_swimsuit
* /hooe is short for hair_over_one_eye
* /saol is short for standing_on_one_leg
* /tlozbotw is short for the_legend_of_zelda:_breath_of_the_wild

If two tags have the same abbreviation, then the larger tag takes
precedence. For example, /be is short for blue_eyes, not brown_eyes,
because blue_eyes is the bigger tag.

If there is an existing shortcut alias that conflicts with the
abbreviation, then the alias take precedence. For example, /sh is short
for suzumiya_haruhi, not short_hair, because there's an old alias for
/sh -> suzumiya_haruhi.
2020-12-17 23:57:13 -06:00
evazion
ee4516f5fe searchable: refactor searchable_includes.
Pass searchable associations directly to search_attributes instead of
defining them separately in searchable_includes.
2020-12-16 23:57:07 -06:00
evazion
e771c0fca8 searchable: don't automatically include id, created_at, updated_at.
Don't make search methods on models call super in order to search
certain default attributes (id, created_at, updated_at). Simplifies some
magic.
2020-12-16 23:57:07 -06:00
evazion
2297bf5da5 Fix #4638: Add exclusions to the numeric attributes.
Add the following search operators:

* /tags?search[post_count_eq]=42
* /tags?search[post_count_not_eq]=42
* /tags?search[post_count_gt]=42
* /tags?search[post_count_gteq]=42
* /tags?search[post_count_lt]=42
* /tags?search[post_count_lteq]=42

Works for all numeric attributes on all index actions.
2020-12-16 20:03:09 -06:00
evazion
35134abe8f post query builder: fix incompatibilities with Rails 6.1.
* Rename the `#negate` and `#and` methods that we monkey patch into
  ActiveRecord::Relation. These methods are now defined in Rails 6.1, but
  they shadow our methods and have slightly different behavior.
* Fix a call to `invert`. It no longer accepts an argument.
2020-12-13 04:10:48 -06:00
evazion
f0299a8945 aliases: refactor tag moving code.
* Factor out the code for moving tags from tag aliases to a separate
  TagMover class.

* When aliasing two tags that have conflicting wikis, merge the old wiki
  into the new one instead of failing with an error. Merge the other names
  fields, replace the old wiki body with a message linking to the new
  wiki, and mark the old wiki as deleted.

* When aliasing two tags that have conflicting artist entries, merge the
  old artist into the new one instead of silently ignore the conflict.
  Merge the group name, other names, and urls fields, and mark the old
  artist as deleted.

* When two tags have conflicting wikis or artist entries, but the old
  wiki or artist entry is deleted, then just ignore the old wiki or
  artist and don't try to merge it.

* Fix it so that when saved searches are rewritten, we rewrite negated
  searches too.
2020-08-26 17:05:41 -05:00
evazion
70b82010a7 search: fix info leak when searching nested associations.
Fix an exploit in #4553. It was possible to use nested searches to infer
the contents of private forum posts.

For example:

* https://danbooru.donmai.us/users?search[forum_posts][id]=121683&search[forum_posts][body_matches]=h*
* https://danbooru.donmai.us/users?search[forum_posts][id]=121683&search[forum_posts][body_matches]=he*
* https://danbooru.donmai.us/users?search[forum_posts][id]=121683&search[forum_posts][body_matches]=hel*
* https://danbooru.donmai.us/users?search[forum_posts][id]=121683&search[forum_posts][body_matches]=hell*
* https://danbooru.donmai.us/users?search[forum_posts][id]=121683&search[forum_posts][body_matches]=hello*

The above searches returned the user 'albert', indicating that the
private forum post with id 121683 starts with the word 'hello'.

By guessing the id of a private forum post (which can be done by
searching for gaps in the id sequence), and by guessing text within the
post (which can be done by sequentially guessing characters with
wildcard searches), one could eventually infer the full text of a
private forum post.

The fix is to make nested searches only return records that are visible
to the current user.
2020-08-18 15:21:39 -05:00
BrokenEagle
36fa8efcd5 Fix parameter hash detection
Hash-like objects will respond to each_value, whereas arrays do not.
2020-08-18 05:34:14 +00:00
evazion
5db11a0b5f Merge branch 'master' into attribute-searching 2020-08-17 14:23:00 -05:00
evazion
2b0cd3c90b searchable: add support for searching enum fields.
Allow searching enum fields by string, by id, or by array of
comma-separated values. The category field in modactions is an example
of an enum field that can be searched this way.
2020-08-07 19:24:57 -05:00
BrokenEagle
c141a358bd Add support for chaining more search includes
- A generalized search includes function was added
-- The post and user includes functions were changed to use that
- A search function for polymorphic includes was added
- All models are given 3 class functions to control which includes
  are searchable, and extra restrictions for the "has_" params
2020-07-27 19:29:17 +00:00
evazion
b551e3634f Fix misc rubocop warnings. 2020-06-16 21:36:15 -05:00
evazion
f38c38f26e search: split tag_match into user_tag_match / system_tag_match.
When doing a tag search, we have to be careful about which user we're
running the search as because the results depend on the current user.
Specifically, things like private favorites, private favorite groups,
post votes, saved searches, and flagger names depend on the user's
permissions, and whether non-safe or deleted posts are filtered out
depend on whether the user has safe mode on or the hide deleted posts
setting enabled.

* Refactor internal searches to explicitly state whether they're
  running as the system user (DanbooruBot) or as the current user.
* Explicitly pass in the current user to PostQueryBuilder instead of
  implicitly relying on the CurrentUser global.
* Get rid of CurrentUser.admin_mode? (used to ignore the hide deleted
  post setting) and CurrentUser.without_safe_mode (used to ignore safe
  mode).
* Change the /counts/posts.json endpoint to ignore safe mode and the
  hide deleted posts settings when counting posts.
* Fix searches not correctly overriding the hide deleted posts setting
  when multiple status: metatags were used (e.g. `status:banned status:active`)
* Fix fast_count not respecting the hide deleted posts setting when the
  status:banned metatag was used.
2020-05-07 03:29:44 -05:00
evazion
8cbcec285d search: fix multiple metatag searches not working in some cases.
Bug: in some cases searching for multiple metatags would cause one
metatag to be ignored. For example, a search for {{user:1 pool:2}} would
be treated as a search for {{pool:2}}.

Cause: we used `ActiveRecord::Relation#merge` to combine two relations,
which was wrong because `merge` doesn't combine `column IN (?)` clauses
correctly. If there are two `column IN (?)` clauses on the same column,
then `#merge` takes only the second clause and ignores the first.

Fix: write our own half-baked `#and` method to work around Rails'
broken-by-design `#merge` method.

ref: https://github.com/rails/rails/issues/33501.
2020-04-27 22:29:42 -05:00
evazion
18685ae5ae search: fixup broken class method references.
Fixup for 3dab648d0.
2020-04-23 13:38:19 -05:00