Commit Graph

1521 Commits

Author SHA1 Message Date
evazion
c1b865b160 searchable: add more enum attribute search options.
Add `<enum>_not` and `<enum>_id_<op>` search options:

* https://danbooru.donmai.us/mod_actions?search[category_not]=post_regenerate,post_regenerate_iqdb
* https://danbooru.donmai.us/mod_actions?search[category_not]=48,49
* https://danbooru.donmai.us/mod_actions?search[category_id]=40..50
* https://danbooru.donmai.us/mod_actions?search[category_id_not]=40..50
* https://danbooru.donmai.us/mod_actions?search[category_id_gt]=40&search[category_id_lt]=50
2021-01-11 19:13:35 -06:00
evazion
e7b454686e searchable: refactor where_operator method.
Refactor the `where_operator` method so we can use it to avoid raw SQL
in more places.
2021-01-11 19:13:29 -06:00
evazion
6d2eeb6f28 searchable: fix being unable to use multiple operators on same attribute.
Fix searches like this not working:

* https://danbooru.donmai.us/tags?search[id]=1..100&search[id_not]=50

Before one of these params would override the other.
2021-01-11 14:59:04 -06:00
evazion
95f39afcd9 tests: fixup broken tests. 2021-01-11 05:12:09 -06:00
evazion
be1251b6be autocomplete: optimize various types of bogus input.
Optimize autocomplete to ignore various types of bogus input that will
never match anything. It turns out it's not uncommon for people to do
things like paste random URLs into autocomplete, or hold down keys, or
enter long strings of gibberish text (sometimes in other languages).
Some things, like autocorrect and slash abbreviations, become
pathologically slow when fed certain types of bad input.

Autocomplete will abort and return nothing in the following situations:

* Searching for URLs (tags that start with http:// or https://).
* Overly long tags (strings longer than the 170 char tag name limit).
* Slash abbreviations longer than 10 chars (e.g. typing `/qwoijqoiqogirqewgoi`).
* Slash abbreviations that aren't alphanumeric (e.g. typing `/////////`).
* Autocorrect input that contains too much punctuation and not enough actual letters.
2021-01-11 05:12:09 -06:00
evazion
d18dc573fb artists: fix misnormalization of emoji in other names.
Fix `normalize_whitespace` to not strip zero-width joiner characters
(U+200D). These characters are used in emoji and stripping them breaks
some artist other names that use emoji.
2021-01-10 02:46:20 -06:00
evazion
0899194f6b Fix conflict between normalize and array_attribute macros.
Fix the `normalize` and `array_attribute` macros conflicting with each
other on the WikiPage model. This meant code like
`wiki_page.other_names = "foo bar"` didn't work. Both macros defined a
`other_names=` method, but one method overrode the other.

The fix is to use anonymous modules and prepend so we can chain method
calls with super.
2021-01-10 02:03:12 -06:00
evazion
5962152ee3 wiki pages, artists: fix normalization of other names.
Fix wiki pages and artists to normalize other names more consistently
and correctly.

For wiki pages, we strip leading / trailing / repeated underscores to
fix user typos, and we normalize to NFKC form to make search more consistent.

For artists, we allow leading / trailing / repeated underscores because
some artist names have these, and we normalize to NFC form because some
artists have weird names that would be lost by NFKC form. This does make
search less consistent.
2021-01-10 02:03:12 -06:00
evazion
e788d8d0b6 saved searches: fix normalization of labels.
Fix saved searches to remove additional invalid characters from labels:

* Remove repeated spaces or underscores.
* Remove leading and trailing spaces or underscores.
* Normalize Unicode characters to NFC form.

Also add a fix script to renormalize labels in old saved searches. A few
problems with existing searches:

* Some saved searches somehow had labels containing NULL elements.
* Some had leading or trailing underscores.
* Some had repeated underscores.
* Some had non-English characters in uppercase.
2021-01-10 02:03:12 -06:00
evazion
7fbac34962 wiki pages: fix normalization of wiki page title.
Fix the wiki page title "azur___lane" being normalized to "azur__lane"
instead of "azur_lane".
2021-01-10 02:03:12 -06:00
evazion
ff6e640fcb tests: add normalize_attribute helper method.
Add a custom Shoulda matcher for testing that a model correctly normalizes an attribute.

Usage:

    subject { build(:wiki_page) }
    should normalize_attribute(:title).from(" Azur  Lane ").to("azur_lane")
2021-01-10 02:03:12 -06:00
evazion
9759701071 search: add way to search array attributes by regex.
Add a `where_any_in_array_matches_regex` method and expose it to the API:

 * https://danbooru.donmai.us/artists?search[any_other_name_matches_regex]=^blah
 * https://danbooru.donmai.us/wiki_pages?search[any_other_name_matches_regex]=^blah
 * https://danbooru.donmai.us/saved_searches?search[any_label_matches_regex]=^blah

In SQL, this does `WHERE '^blah' ~<< ANY(other_names)`, where `~<<` is a
custom operator based on the `~` regex match operator, but with the
arguments reversed. This allows it to be used with the ANY(array) operator.

See also:

* https://stackoverflow.com/a/22101172
* https://www.postgresql.org/docs/current/sql-createfunction.html
* https://www.postgresql.org/docs/current/sql-createoperator.html
* https://www.postgresql.org/docs/current/functions-comparisons.html
2021-01-10 02:03:02 -06:00
evazion
65adcd09c2 users: track logins, signups, and other user events.
Add tracking of certain important user actions. These events include:

* Logins
* Logouts
* Failed login attempts
* Account creations
* Account deletions
* Password reset requests
* Password changes
* Email address changes

This is similar to the mod actions log, except for account activity
related to a single user.

The information tracked includes the user, the event type (login,
logout, etc), the timestamp, the user's IP address, IP geolocation
information, the user's browser user agent, and the user's session ID
from their session cookie. This information is visible to mods only.

This is done with three models. The UserEvent model tracks the event
type (login, logout, password change, etc) and the user. The UserEvent
is tied to a UserSession, which contains the user's IP address and
browser metadata. Finally, the IpGeolocation model contains the
geolocation information for IPs, including the city, country, ISP, and
whether the IP is a proxy.

This tracking will be used for a few purposes:

* Letting users view their account history, to detect things like logins
  from unrecognized IPs, failed logins attempts, password changes, etc.
* Rate limiting failed login attempts.
* Detecting sockpuppet accounts using their login history.
* Detecting unauthorized account sharing.
2021-01-08 22:34:37 -06:00
evazion
da3e8e4726 searchable: fix bug with searching multiple association attributes.
Fix a bug with searches like the following not working correctly:

* https://danbooru.donmai.us/comments.json?search[creator][level]=20&search[creator_id]=1234
* https://danbooru.donmai.us/comments.json?search[creator][level]=20&search[creator_name]=abcd
* https://danbooru.donmai.us/comments.json?search[post][rating]=s&search[post_tags_match]=touhou

It wasn't possible to search for both `creator` and `creator_id` at the
same time (or `post` and `post_tags_match`, etc). Only the `creator_id`
param would be recognized.

Also refactor some internals:

* `search_includes` was renamed to `search_associated_attribute`.
* `search_attribute` was split up into `search_basic_attribute` and
  `search_associated_attribute`.
2021-01-07 17:10:29 -06:00
evazion
65be2c99b0 Fix #4657: Hentai-Foundry: Document tree depth limit exceeded. 2021-01-06 03:05:36 -06:00
evazion
873d719481 tests: fix broken Newgrounds test.
Broken again by the source post being deleted.
2021-01-04 23:54:18 -06:00
evazion
3542c401d4 Merge pull request #4654 from nonamethanks/fix_gelbooru_normalization
Fix gelbooru source normalization
2021-01-04 01:31:17 -06:00
evazion
ed7e7b1d30 Merge pull request #4639 from nonamethanks/fix_pixiv_en_links
Update artist finder blacklist
2021-01-04 01:15:15 -06:00
evazion
98ee6c31c1 favorites: refactor fav:/ordfav: searches to not use fav_string.
Refactor fav:<name> and ordfav:<name> searches to use the favorites
table instead of the posts.fav_string.

This may be slower for fav:<name> searches. The fav_string effectively
treats favorites like secret tags on the post, so fav:<name> searches
were effectively the same as tag searches. Now they do a subquery on the
favorites table, which may not perform as well for things like multiple
fav:<name> metatags or negated fav:<name> metatags.

For ordfav:<name> searches, this may be faster. ordfav: searches had a
tag match clause (`tag_index @@ 'fav:123'`) in addition to a join on the
favs table. This was redundant, and in some cases it inhibited the query
planner from choosing a more optimal plan.

Partially addresses #4652 by eliminating another place where we depended
on the fav_string.
2021-01-03 19:18:31 -06:00
nonamethanks
b3f92dd2c7 Fix gelbooru source normalization 2021-01-03 20:12:40 +01:00
evazion
d0bb4ed398 user upgrades: add bank payment methods for European countries.
Add the following bank redirect payment methods:

* https://stripe.com/docs/payments/bancontact
* https://stripe.com/docs/payments/eps
* https://stripe.com/docs/payments/giropay
* https://stripe.com/docs/payments/ideal
* https://stripe.com/docs/payments/p24

These methods are used in Austria, Belgium, Germany, the Netherlands,
and Poland.

These methods require payments to be denominated in EUR, which means we
have to set prices in both USD and EUR, and we have to automatically
detect which currency to use based on the user's country. We also have
to automatically detect which payment methods to offer based on the
user's country. We do this by using Cloudflare's CF-IPCountry header to
geolocate the user's country.

This also switches to using prices and products defined in Stripe
instead of generated on-the-fly when creating the checkout.
2020-12-31 06:50:10 -06:00
evazion
ae5c0d1034 newrelic: log request path. 2020-12-31 06:50:10 -06:00
evazion
4b171bf97e user upgrades: add ability to refund upgrades. 2020-12-29 04:17:32 -06:00
evazion
87af02f689 user upgrades: add links to Stripe payment & receipt page.
Add links to the Stripe payment page and the Stripe receipt page on
completed user upgrades.

The Stripe payment link is a link to the payment details on the Stripe
dashboard and is only visible to the owner.
2020-12-29 00:19:52 -06:00
evazion
7e8f859b24 tags: eliminate Tag.category_for method.
Tag.category_for looked up a tag's category in the Redis cache. This was
only used in a few places (in related tags, and on the popular/missed
search pages). Get rid of this method so we can work towards getting rid
of caching tag categories in Redis.
2020-12-27 21:03:26 -06:00
evazion
d9db32640a user upgrades: fix checkout form leaking recipient's email.
The checkout form should be prefilled with the purchaser's email
address, not the recipient's.
2020-12-25 02:01:42 -06:00
evazion
74ed2a8b96 user upgrades: add UserUpgrade model.
Add a model to store the status of user upgrades.

* Store the upgrade purchaser and the upgrade receiver (these are
  different for a gifted upgrade, the same for a self upgrade).
* Store the upgrade type: gold, platinum, or gold-to-platinum upgrades.
* Store the upgrade status:
** pending: User is still on the Stripe checkout page, no payment
   received yet.
** processing: User has completed checkout, but the checkout status in
   Stripe is still 'unpaid'.
** complete: We've received notification from Stripe that the payment
   has gone through and the user has been upgraded.
* Store the Stripe checkout ID, to cross-reference the upgrade record on
  Danbooru with the checkout record on Stripe.

This is the upgrade flow:

* When the user clicks the upgrade button on the upgrade page, we call
  POST /user_upgrades and create a pending UserUpgrade.
* We redirect the user to the checkout page on Stripe.
* When the user completes checkout on Stripe, Stripe sends us a webhook
  notification at POST /webhooks/receive.
* When we receive the webhook, we check the payment status, and if it's
  paid we mark the UserUpgrade as complete and upgrade the user.
* After Stripe sees that we have successfully processed the webhook,
  they redirect the user to the /user_upgrades/:id page, where we show
  the user their upgrade receipt.
2020-12-24 21:15:04 -06:00
evazion
7762489d7d user upgrades: upgrade to new Stripe checkout system.
This upgrades from the legacy version of Stripe's checkout system to the
new version:

> The legacy version of Checkout presented customers with a modal dialog
> that collected card information, and returned a token or a source to
> your website. In contrast, the new version of Checkout is a smart
> payment page hosted by Stripe that creates payments or subscriptions. It
> supports Apple Pay, Dynamic 3D Secure, and many other features.

Basic overview of the new system:

* We send the user to a checkout page on Stripe.
* Stripe collects payment and sends us a webhook notification when the
  order is complete.
* We receive the webhook notification and upgrade the user.

Docs:

* https://stripe.com/docs/payments/checkout
* https://stripe.com/docs/payments/checkout/migration#client-products
* https://stripe.com/docs/payments/handling-payment-events
* https://stripe.com/docs/payments/checkout/fulfill-orders
2020-12-24 19:58:29 -06:00
evazion
dbb66ace90 routes: replace hardcoded routes in models with route helpers.
Add a Routes module that gives models access to route helpers outside of
views, and use it to replace various hardcoded routes.
2020-12-24 00:17:19 -06:00
evazion
6c99bbbf47 posts: limit sources to 1200 chars long.
The longest sources on Danbooru are DeviantArt wixmp.com sources, which
max out at ~900 chars.
2020-12-21 22:42:39 -06:00
evazion
6ac9882711 newrelic: log country of each request in newrelic.
Log the country of each HTTP request in NewRelic. Uses the CF-IPCountry
header set by Cloudflare.
2020-12-21 20:47:58 -06:00
evazion
efb836ac02 wikis: normalize Unicode characters in wiki bodies.
* Introduce an abstraction for normalizing attributes. Very loosely
  modeled after https://github.com/fnando/normalize_attributes.
* Normalize wiki bodies to Unicode NFC form.
* Normalize Unicode space characters in wiki bodies (strip zero width
  spaces, normalize line endings to CRLF, normalize Unicode spaces to
  ASCII spaces).
* Trim spaces from the start and end of wiki page bodies. This may cause
  wiki page diffs to show spaces being removed even when the user didn't
  explicitly remove the spaces themselves.
2020-12-21 20:47:50 -06:00
evazion
3ad4beac02 autocomplete: fix exception when completing unsupported metatags. 2020-12-20 01:27:48 -06:00
evazion
28926c2332 autocomplete: remove old autocomplete endpoints.
Remove /tag/autocomplete.json and /saved_searches/labels.json.
2020-12-20 00:51:29 -06:00
evazion
a129eb4251 wikis: force wiki names to follow same rules as tag names.
Don't allow wiki pages to have invalid names.

This incidentally means that you can't create wiki pages for pools. For
example, you can't create a wiki titled "pool:almost_heart-warming".
This is not a valid tag name, so it's not a valid wiki name either. This
was done in a handful of cases to translate Pixiv tags to Danbooru pools
(see: <https://danbooru.donmai.us/wiki_page_versions?search[title_like]=pool:*>)

Also fix it so that titles are normalized before validation, not before save.
2020-12-20 00:51:29 -06:00
evazion
7708e2e08f wikis: don't allow adding other names to artist wikis.
Prevent users from adding other names to artist wikis. These should be
added to the artist entry instead.
2020-12-20 00:51:29 -06:00
evazion
4cb39422b2 post replacements: rename <attr>_was to old_<attr>
Rename the following post replacement attributes:

* file_size_was -> old_file_size
* file_ext_was -> old_file_ext
* image_width_was -> old_image_width
* image_height_was -> old_image_height
* md5_was -> old_md5

In Rails 6.1, having attributes named `file_size` and `file_size_was` on
the same model breaks things because it conflicts with Rails' dirty
attribute tracking.
2020-12-19 14:26:07 -06:00
evazion
2c1da660fd tags: allow tag abbreviations in searches and during tagging.
Expand the tag abbreviation system introduced in b0be8ae45 so that it
works in searches and when tagging posts, not just in autocomplete.

For example, you can tag a post with /evth and it will add the tag
eyebrows_visible_through_hair. You can search for /evth and it will
search for the tag eyebrows_visible_through_hair.

Some more examples:

* /ops is short for one-piece_swimsuit
* /hooe is short for hair_over_one_eye
* /saol is short for standing_on_one_leg
* /tlozbotw is short for the_legend_of_zelda:_breath_of_the_wild

If two tags have the same abbreviation, then the larger tag takes
precedence. For example, /be is short for blue_eyes, not brown_eyes,
because blue_eyes is the bigger tag.

If there is an existing shortcut alias that conflicts with the
abbreviation, then the alias take precedence. For example, /sh is short
for suzumiya_haruhi, not short_hair, because there's an old alias for
/sh -> suzumiya_haruhi.
2020-12-17 23:57:13 -06:00
evazion
991896c4eb tags: don't allow tags more than 170 chars long.
Limit tag length to 170 chars. 170 chars was chosen because it's
longer than the longest active tag on Danbooru.

Tag length is limited because in some contexts we can't deal with
excessively long tags. Tag autocorrect for example uses the levenshtein
function in Postgres, which can't handle strings more than 255 chars long.
2020-12-17 21:38:24 -06:00
evazion
1809f67b2b tags: don't allow tags to begin with a '/'.
Disallow tags from starting with a '/' character. This is so that tag
abbreviations in autocomplete, which start with a '/', don't conflict
with regular tags.

Also disallow some other punctuation characters: `%{})]. Currently no
tags start with these characters. This is to reserve other special
characters in case we need them for other future syntax extensions.
2020-12-17 21:38:18 -06:00
evazion
b0659eb76c searchable: add tests for Searchable concern. 2020-12-16 23:57:04 -06:00
nonamethanks
3801e08ae6 Update pixiv url matching tests 2020-12-16 13:53:16 +01:00
evazion
26246b0ac9 autocomplete: fix exception when typing "/" in autocomplete.
Fix an exception that could occur when typing "/" by itself in
autocomplete and a regular tag starting with "/" was returned. This
caused an exception in `r[:antecedent].length` because the tag's
antecedent was nil.
2020-12-14 21:57:28 -06:00
evazion
c02c31b966 autocomplete: recognize Japanese tags in autocomplete.
Allowing typing Japanese tags in autocomplete. For example, typing 東方
in autocomplete will be completed to the touhou tag. Typing ぶくぶ will
complete to the bkub tag.

This works using wiki page and artist other names. Effectively, any name
listed as an other name in a wiki or artist page will be treated like an
alias for autocomplete purposes. This is limited to non-ASCII other names,
to prevent English other names from interfering with regular tag searches.
2020-12-14 18:58:11 -06:00
evazion
c82e05d828 users: add stricter checks for user promotions.
New rules for user promotions:

* Moderators can no longer promote other users to moderator level. Only
  Admins can promote users to Mod level. Mods can only promote up to Builder level.
* Admins can no longer promote other users to Admin level. Only Owners
  can promote users to Admin. Admins can only promote up to Mod level.
* Admins can no longer demote themselves or other admins.

These rules are being changed to account for the new Owner user level.

Also change it so that when a user upgrades their account, the promotion
is done by DanbooruBot. This means that the inviter and the mod action
will show DanbooruBot as the promoter instead of the user themselves.
2020-12-13 21:21:08 -06:00
evazion
b3ad13e6e3 users: add new owner level.
Add a new Owner user level for the site owner. Highly sensitive
operations like manually changing the passwords of other users will be
restricted to the site owner.
2020-12-13 21:18:24 -06:00
evazion
61e7d32f78 tests: fix FC2 artist normalization url test. 2020-12-13 04:10:48 -06:00
evazion
119268e118 autocomplete: fix exception when completing saved search labels.
Fix an exception that was thrown when trying to autocomplete saved
search labels (e.g. `search:all`) as an anonymous user. This was a
pre-existing bug.
2020-12-13 00:45:22 -06:00
evazion
b0be8ae456 autocomplete: rework tag autocomplete behavior.
Reworks tag autocomplete to work the same way for all users. Previously
autocomplete for Builders worked differently than autocomplete for
regular users.

This is how it works now:

* If the search starts with a slash (/), then do a tag abbreviation
  match. For example, `/evth` matches eyebrows_visible_through_hair.
* Otherwise if the search contains a wildcard (*), then just do a simple
  wildcard search.
* Otherwise do a tag prefix match against tags and aliases. For example,
  `black` matches all tags or aliases beginning with `black`.
* If the tag prefix match returns no results, then do a autocorrect match.

The differences for regular users:

* You can abbreviate tags with a slash (/).

The differences for Builders:

* Now tag abbreviations have to start with a slash (/).
* Autocorrect isn't performed unless a regular search returns no results.
* Results are always sorted by tag count. Before different types of
  results (regular tag matches, alias matches, abbreviation matches,
  and autocorrect matches) were all mixed together based on a tag
  weighting scheme.
2020-12-13 00:45:22 -06:00
evazion
adc1c2c2cc autocomplete: refactor javascript to use /autocomplete endpoint.
This refactors the autocomplete Javascript to use a single dedicated
/autocomplete.json endpoint instead of a bunch of separate endpoints.

This simplifies the autocomplete Javascript by making it so that instead
of calling a different endpoint for each type of query (for users, wiki
pages, pools, artists, etc), then having to parse the results of each
call to get the data we need, we can call a single endpoint that returns
exactly what we need.

This also means we don't have to parse searches clientside in order to
autocomplete metatags. Instead we can just pass the search term to the
server and let it parse the search, which is easy to do serverside.

Finally, this makes autocomplete easier to test, and it makes it easier
to add more sophisticated autocomplete behavior, since most of the logic
lives serverside.
2020-12-13 00:45:22 -06:00