* Deprecated tags can't be added to posts, but existing deprecated tags
in a post won't be removed
* Only empty tags can be marked as deprecated manually
* No tags can be manually undeprecated
** These limits don't apply to admins
* Deprecating or undeprecating a tag will create a new mod action to
prevent people from going rogue
* Added deprecate/undeprecate commands for BURs
* Deprecating a tag via BUR removes all implications to and from it as well
This is necessary for the `commentary:` metatag, which has different
behavior depending on whether the metatag value is quoted. For example,
`commentary:translated` finds translated commentaries, while
`commentary:"translated"` finds commentaries containing the literal word
"translated".
Refactor the `PostQuery::AST#simplify` method to split it into three
methods: `#trim` to eliminate redundant AND and OR clauses, `#simplify`
to expand deeply nested subexpressions, and `#sort` to sort the query
into alphabetical order.
This is so we can normalize queries written by users by parsing and
rewriting them, but without expanding out nested subexpressions, which
can substantially alter the way the query is written.
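As a rough illustration of normalizing a query without expanding it, the sketch below applies a trim step and a sort step to a hypothetical array-based AST (a tag is a String, a clause is `[:and, *children]` or `[:or, *children]`). The `trim` and `sort` functions here are simplified stand-ins, not the actual `PostQuery::AST` methods.

```ruby
# Hypothetical AST: a tag is a String; a clause is [:and, *children] or
# [:or, *children]. This is an assumption made for illustration only.

# Drop redundant clauses: merge nested clauses of the same type into their
# parent, remove duplicate children, and unwrap single-child clauses.
def trim(node)
  return node if node.is_a?(String)

  type, *children = node
  children = children.map { |child| trim(child) }
  children = children.flat_map do |child|
    child.is_a?(Array) && child.first == type ? child.drop(1) : [child]
  end
  children = children.uniq
  children.size == 1 ? children.first : [type, *children]
end

# Sort the children of every clause so equivalent queries are written the
# same way. Nested subexpressions are kept as they are, not expanded.
def sort(node)
  return node if node.is_a?(String)

  type, *children = node
  [type, *children.map { |child| sort(child) }.sort_by(&:to_s)]
end

ast = [:and, "b", [:and, "a", "b"], [:or, "c"]]
normalized = sort(trim(ast))
```

Here `normalized` is `[:and, "a", "b", "c"]`: the nested AND is merged into its parent, the duplicate `b` and the single-child OR are removed, and the tags are sorted.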
Remove the last remaining uses of the `artist_urls.normalized_url` column.
It's already no longer used by the artist finder. The only remaining
uses were by API users. Those users should use the `url` column instead.
Sometimes when a Newgrounds post is part of a set, there is a list of
links to other posts in the set in the artist's commentary. Exclude
these links because they're not really part of the commentary.
Example: https://www.newgrounds.com/art/view/boxofwant/annie-hughes-1 (NSFW)
Fix mass updates of the form `mass update A -> B` not being allowed.
This was originally disallowed because, after `rename` was introduced, we wanted to
prevent people from using mass updates to move tags. Now, `mass update A -> B`
adds B to all posts tagged A instead of moving A to B. So `mass update A -> B`
should no longer be disallowed.
This also makes it so that it's an error to create a mass update with a
syntax error in the search. Before, searches couldn't have syntax errors,
but now with the new query parser it's possible.
Fix the Tags field in the BUR search form not finding all BURs
mentioning that tag. Specifically, tags that were part of a mass update,
and that were prefixed with `~` or `-` (OR tags and NOT tags), weren't
indexed as tags affected by the BUR.
This requires re-running script/fixes/064_initialize_bulk_update_request_tags.rb
to fix old BURs.
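A minimal sketch of the indexing idea, assuming a hypothetical `tags_in_search` helper: strip the `~` or `-` prefix before recording a tag as affected. It ignores parentheses and quoted metatag values, which real searches can contain.

```ruby
# Hypothetical helper: list the tags mentioned in a mass update search,
# including tags prefixed with `~` (OR tags) or `-` (NOT tags).
# Terms containing `:` are treated as metatags and skipped.
def tags_in_search(search)
  search.split
        .map { |term| term.delete_prefix("~").delete_prefix("-") }
        .reject { |term| term.include?(":") }
        .uniq
end

affected = tags_in_search("~cat ~dog -bird solo user:evazion")
```

With this input, `affected` is `["cat", "dog", "bird", "solo"]`.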
Add a new tag search parser that supports full boolean expressions, including `and`,
`or`, and `not` operators and parenthesized subexpressions.
This is only the parser itself, not the code for converting the search into SQL. The new
parser isn't used yet for actual searches. Searches still use the old parser.
Some example syntax:
* `1girl 1boy`
* `1girl and 1boy` (same as `1girl 1boy`)
* `1girl or 1boy`
* `~1girl ~1boy` (same as `1girl or 1boy`)
* `1girl and ((blonde_hair blue_eyes) or (red_hair green_eyes))`
* `1girl ~(blonde_hair blue_eyes) ~(red_hair green_eyes)` (same as above)
* `1girl -(blonde_hair blue_eyes)`
* `*_hair *_eyes`
* `*_hair or *_eyes`
* `user:evazion or fav:evazion`
* `~user:evazion ~fav:evazion`
Rules:
AND is implicit between terms, but may be written explicitly:
* `a b c` is `a and b and c`
AND has higher precedence (binds tighter) than OR:
* `a or b and c or d` is `a or (b and c) or d`
* `a or b c or d e` is `a or (b and c) or (d and e)`
All `~` operators in the same subexpression are combined into a single OR:
* `a b ~c ~d` is `a b (c or d)`
* `~a ~b and ~c ~d` is `(a or b) (c or d)`
* `(~a ~b) (~c ~d)` is `(a or b) (c or d)`
A single `~` operator in a subexpression by itself is ignored:
* `a ~b` is `a b`
* `~a and ~b` is `a and b`, which is `a b`
* `(~a) ~b` is `a ~b`, which is `a b`
The parser is written as a backtracking recursive descent parser built on top of
StringScanner and a handful of parser combinators. The parser generates an AST, which is
then simplified using Boolean algebra to remove redundant nodes and to convert the
expression to conjunctive normal form (that is, a product of sums, or an AND of ORs).
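The precedence rules above can be sketched with a small recursive descent parser built on StringScanner. This is a simplified illustration, not the actual parser: the `TinyQueryParser` class and its array-based AST are hypothetical, it doesn't backtrack, and it leaves out the `~` operator, metatag handling, and the conversion to conjunctive normal form.

```ruby
require "strscan"

# Hypothetical sketch. Grammar:
#   or_expr  = and_expr ("or" and_expr)*
#   and_expr = term (["and"] term)*
#   term     = "-" term | "(" or_expr ")" | tag
class TinyQueryParser
  def initialize(input)
    @scanner = StringScanner.new(input)
  end

  def parse
    ast = or_expr
    skip_space
    raise ArgumentError, "unexpected input: #{@scanner.rest}" unless @scanner.eos?
    ast
  end

  private

  def skip_space
    @scanner.skip(/\s+/)
  end

  # Consume a keyword only when it is followed by whitespace or "(".
  def keyword(word)
    skip_space
    @scanner.skip(/#{word}(?=[\s(])/i)
  end

  # OR binds more loosely than AND, so it is parsed at the outer level.
  def or_expr
    clauses = [and_expr]
    clauses << and_expr while keyword("or")
    clauses.size == 1 ? clauses.first : [:or, *clauses]
  end

  # AND is implicit between adjacent terms; an explicit `and` is optional.
  def and_expr
    terms = [term]
    loop do
      keyword("and")
      skip_space
      break if @scanner.eos? || @scanner.check(/\)|or(?=[\s(])/i)
      terms << term
    end
    terms.size == 1 ? terms.first : [:and, *terms]
  end

  def term
    skip_space
    if @scanner.skip(/-/)
      [:not, term]
    elsif @scanner.skip(/\(/)
      inner = or_expr
      skip_space
      @scanner.skip(/\)/) or raise ArgumentError, "missing closing parenthesis"
      inner
    else
      @scanner.scan(/[^\s()]+/) or raise ArgumentError, "expected a tag"
    end
  end
end

ast = TinyQueryParser.new("a or b and c or d").parse
```

Here `ast` is `[:or, "a", [:and, "b", "c"], "d"]`, matching the rule that `a or b and c or d` is `a or (b and c) or d`.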
Change mass updates to not automatically remove the left-hand side tags
from the post. Removing them won't work with full boolean searches in the future
and already doesn't work with complex searches involving metatags or OR-tags.
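To illustrate the new behavior, here is a minimal sketch using hypothetical in-memory posts, where each post is a set of tags and the left-hand side is simplified to a single tag rather than a full search.

```ruby
require "set"

# Hypothetical model: each post is a Set of tag names. The real left-hand
# side is a search; a single tag is used here to keep the sketch short.
def mass_update(posts, antecedent, consequent)
  posts.each do |tags|
    # Add the right-hand side tag; the left-hand side tag is left in place.
    tags << consequent if tags.include?(antecedent)
  end
end

posts = [Set["a", "x"], Set["y"]]
mass_update(posts, "a", "b")
```

After the update, the first post is tagged `a`, `x`, and `b`, and the second post is unchanged.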
Fix the page_url method not to return URLs like this:
https://seiga.nicovideo.jp/image/source/8017978 (page: https://seiga.nicovideo.jp/watch/mg310193)
These are direct image URLs, not page URLs. It's not generally possible
to get to the page URL from an image URL like this.
This fixes it so that we don't incorrectly set the source of NicoSeiga
uploads to the image URL.
Refactor source strategies to remove the `canonical_url` method.
`canonical_url` returned the URL that should be used as the source of
the post after upload. Now we simply use `Source::URL#page_url` to
determine the source after upload. If the source is an image URL that is
convertible to a page URL, then the page URL is used as the source. If
the source is an image URL that is not convertible to a page URL, then
the image URL is used as the source.
This simplifies source strategies so that all they have to care about is
implementing the `Source::URL#page_url` and `Sources::Strategies#page_url`
methods, and the preferred source will be chosen for posts automatically.
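The selection rule can be sketched as follows, assuming a hypothetical `ParsedURL` struct in place of `Source::URL` and made-up example.com URLs: prefer the page URL when one can be derived, otherwise keep the image URL.

```ruby
# Hypothetical stand-in for a parsed source URL. `page_url` is nil when the
# image URL can't be converted to a page URL.
ParsedURL = Struct.new(:image_url, :page_url)

# Prefer the page URL; fall back to the image URL when there is none.
def preferred_source(parsed)
  parsed.page_url || parsed.image_url
end

convertible = ParsedURL.new("https://example.com/images/123.jpg", "https://example.com/posts/123")
unconvertible = ParsedURL.new("https://example.com/images/456.jpg", nil)

preferred_source(convertible)   # the page URL
preferred_source(unconvertible) # the image URL
```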
Add partial support for fetching videos from ArtStation posts that
contain videos. Most of this code is disabled for now because actually
downloading these videos requires bypassing a Cloudflare captcha.
Fix a PublicSuffix::DomainNotAllowed exception raised when viewing or editing a post
with a source like `Blog.`.
This happened when parsing the post's source. `Danbooru::URL.parse("Blog.")` would
heuristically parse the source into `http://blog`. Calling any methods related to the
URL's hostname or domain would lead to calling `PublicSuffix.parse("blog")`, which
would fail with PublicSuffix::DomainNotAllowed.
Fix regression in 1ad0e8688. Caused by `relation.order_values` returning
an array of Arel nodes instead of an array of strings when doing a
`random:1` search.
Support grabbing the full image for Tinami uploads, rather than the sample.
Getting the full image requires making a request like this:
    curl -X POST \
      -H 'Referer: https://www.tinami.com/' \
      -H 'Content-Type: application/x-www-form-urlencoded' \
      -H 'Cookie: Tinami2SESSID=<redacted>;' \
      --data-raw 'action_view_original=true&cont_id=1087268&ethna_csrf=<redacted>' \
      https://www.tinami.com/view/1087268
Then scraping the <img> tag from the resulting HTML page.
If the post has multiple images, then we need to scrape and pass the
`sub_id` of the image too.
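A rough Ruby equivalent of the curl command above, using only the standard library. The function name is hypothetical, `session_id` and `csrf_token` are placeholders for values from a logged-in session, and passing `sub_id` as an extra form field is an assumption. The sketch only builds the request; it doesn't send it or scrape the response.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: build (but don't send) the POST request that asks
# Tinami for the page containing the full image.
def tinami_full_image_request(cont_id, session_id:, csrf_token:, sub_id: nil)
  uri = URI("https://www.tinami.com/view/#{cont_id}")
  request = Net::HTTP::Post.new(uri)
  request["Referer"] = "https://www.tinami.com/"
  request["Cookie"] = "Tinami2SESSID=#{session_id};"

  params = { action_view_original: "true", cont_id: cont_id, ethna_csrf: csrf_token }
  params[:sub_id] = sub_id if sub_id # assumption: sent as a form field

  # Sets the body and the application/x-www-form-urlencoded content type.
  request.set_form_data(params)
  request
end

request = tinami_full_image_request(1087268, session_id: "SESSION", csrf_token: "TOKEN")
```

The request could then be sent with `Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }`, after which the `<img>` tag would still need to be scraped from the response body.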
Fixes #2818.
Show a warning when creating a duplicate artist; that is, when adding a
URL that already belongs to another artist.
This is a soft warning rather than a hard error because there are some
cases where multiple artists legitimately share the same site or account.
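A minimal sketch of the soft-warning idea, assuming a hypothetical in-memory index of artist names to URLs: the check returns warning messages instead of raising, so the artist can still be saved.

```ruby
# Hypothetical helper: `artists` maps an artist name to that artist's URLs.
# Returns a list of warnings; an empty list means no duplicates were found.
def duplicate_artist_warnings(artists, name, url)
  artists.select { |other, urls| other != name && urls.include?(url) }
         .keys
         .map { |other| "#{url} already belongs to artist #{other}" }
end

artists = {
  "artist_a" => ["https://example.com/users/1"],
  "artist_b" => ["https://example.com/users/2"],
}

warnings = duplicate_artist_warnings(artists, "artist_c", "https://example.com/users/1")
```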
Change how artist URLs are normalized in artist entries. Don't try to secretly
convert image URLs to profile URLs in artist entries. For example, if someone puts a
Pixiv image URL in an artist entry, don't secretly try to fetch the source and
convert it into a profile URL in the `normalized_url` field.
We did this because years ago, it was standard practice to put image URLs in artist
entries. Pixiv image URLs used to contain the artist's username, so we used to put
image URLs in artist entries for artist finding purposes. But Pixiv changed it so
that image URLs no longer contained the username, so we dealt with it by adding a
`normalized_url` column to artist_urls and secretly converting image URLs to profile
URLs in this field. But this is no longer necessary because now we don't normally put
image URLs in artist entries in the first place.
Now the `profile_url` method in `Source::URL` is used to normalize URLs in artist
entries. This lets us parse various profile URL formats and normalize them into a
single canonical form.
This also removes the `normalize_for_artist_finder` method from source strategies.
Instead the `profile_url` method is used for artist finding purposes. So the profile
URL returned by the source strategy needs to be the same as the URL in the artist
entry in order for artist finding to work.
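As an illustration of normalizing several profile URL formats into one canonical form, the sketch below handles a few Pixiv formats. The `pixiv_profile_url` function is hypothetical and far less complete than `Source::URL#profile_url`. Image URLs are not converted and return nil.

```ruby
require "uri"

# Hypothetical normalizer for a few Pixiv profile URL formats. Returns the
# canonical profile URL, or nil if the URL isn't a recognized profile URL.
def pixiv_profile_url(url)
  uri = URI.parse(url)
  return nil unless %w[pixiv.net www.pixiv.net].include?(uri.host)

  user_id =
    case uri.path
    when %r{\A/(?:en/)?users/(\d+)} then $1
    when "/member.php" then URI.decode_www_form(uri.query.to_s).to_h["id"]
    end

  "https://www.pixiv.net/users/#{user_id}" if user_id
end

pixiv_profile_url("https://www.pixiv.net/member.php?id=1234")
pixiv_profile_url("https://www.pixiv.net/en/users/1234")
```

Both calls return `https://www.pixiv.net/users/1234`, so an artist entry and a source strategy that use the same normalization produce matching URLs for artist finding.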