Add a dtext_links table for tracking links between wiki pages. This is
to allow for broken link detection and "what links here" searches, among
other uses.
DText is processed in three phases: a preprocessing phase, the regular
parsing phase, and a postprocessing phase.
In the preprocessing phase we extract all the wiki links from all the
dtext messages on the page (more precisely, we do this in forum threads
and on comment pages, because these are the main places with lots of
dtext). This is so we can look up all the tags and wiki pages in one
query, which is necessary because in the worst case (in certain forum
threads and in certain list_of_* wiki pages) there can be hundreds of
tags per page.
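As a rough illustration of the preprocessing step, the sketch below collects every wiki link title from a batch of dtext messages so they can all be fetched with a single query. The `[[title]]` and `[[title|label]]` link syntax, the `WIKI_LINK` pattern, the `wiki_link_titles` helper, and the lowercase-and-underscore normalization are assumptions made for this example, not the parser's actual code.

```ruby
# Hypothetical pattern for [[title]] and [[title|label]] wiki links.
WIKI_LINK = /\[\[([^\]|]+)(?:\|[^\]]*)?\]\]/

# Collect the unique, normalized link titles across all messages on a page.
# The normalization rules here are assumed for illustration only.
def wiki_link_titles(messages)
  messages
    .flat_map { |text| text.scan(WIKI_LINK).flatten }
    .map { |title| title.strip.downcase.tr(" ", "_") }
    .uniq
end

messages = [
  "See [[Touhou Project]] and [[help:dtext|the DText help]].",
  "Also tagged [[touhou project]]."
]
titles = wiki_link_titles(messages)
# `titles` could now be passed to one lookup instead of one query per link.
```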
In the postprocessing phase we fix up the html generated by the ragel
parser to add CSS classes to wiki links. We do this in a postprocessing
step because it's easier than doing it in the ragel parser itself.
There are a handful of places where we need to strip markup from a piece
of dtext, primarily in <meta> description tags in the wiki. Currently
the dtext parser handles this by having a special mode where it parses
the text but doesn't output html tags. Here we refactor to instead parse
the text normally then strip out the html tags after the fact.
This is more flexible and allows us to simplify a lot of things in the
dtext parser. This also produces more readable output than before in
certain cases.
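A minimal sketch of the "parse normally, then strip tags" idea follows. The `strip_html` helper is hypothetical and uses a simple regex for brevity; a real implementation would more likely rely on an HTML parser, and nothing here is taken from the actual refactoring.

```ruby
require "cgi"

# Hypothetical helper: drop the tags from already-rendered html, decode
# entities, and collapse whitespace so the result reads as plain text.
def strip_html(html)
  text = html.gsub(/<[^>]*>/, "")
  CGI.unescapeHTML(text).gsub(/\s+/, " ").strip
end

rendered = '<p>A <strong>bold</strong> &amp; <a href="/wiki">link</a>.</p>'
description = strip_html(rendered)
```

Because the stripping step only sees html, the parser itself no longer needs a separate "no markup" output mode.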
Previously the page-based (numbered) paginator would always count the
total_pages, even in API calls when it wasn't needed. This could be very
slow in some cases. Refactor so that total_pages isn't calculated unless
it's called.
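The lazy calculation can be sketched with plain memoization. The `NumberedPaginator` class and its block-based count query are hypothetical stand-ins; the real paginator is not shown in this message.

```ruby
# Hypothetical paginator: the count query is passed as a block and is not
# run until total_pages is first requested.
class NumberedPaginator
  def initialize(per_page:, &count_query)
    @per_page = per_page
    @count_query = count_query
  end

  # Runs the (potentially slow) count on first use, then reuses the result.
  def total_pages
    @total_pages ||= (@count_query.call.to_f / @per_page).ceil
  end
end

paginator = NumberedPaginator.new(per_page: 20) { 45 } # block stands in for COUNT(*)
# No count has run yet; an API response that never calls total_pages skips it.
```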
While we're at it, refactor to condense all the sequential vs. numbered
pagination logic into one module. This incidentally fixes a couple more
bugs:
* "page=b0" returned all pages rather than nothing.
* Bad parameters like "page=blaha123" and "page=a123blah" were accepted.
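One way to reject malformed page parameters is to match the whole string with anchored patterns. The sketch below is an assumption about how such parsing might look; `parse_page` and the `:numbered`, `:before_id`, and `:after_id` labels are hypothetical names.

```ruby
# Hypothetical parser for the page parameter: "3" is a numbered page,
# "b123" means "before id 123", and "a123" means "after id 123".
def parse_page(param)
  case param.to_s
  when ""           then [:numbered, 1]
  when /\A(\d+)\z/  then [:numbered, $1.to_i]
  when /\Ab(\d+)\z/ then [:before_id, $1.to_i]
  when /\Aa(\d+)\z/ then [:after_id, $1.to_i]
  else raise ArgumentError, "invalid page: #{param.inspect}"
  end
end

parse_page("b0") # a "before id 0" request, which should match no records
```

Anchoring with `\A` and `\z` is what makes inputs such as "blaha123" and "a123blah" fail instead of being partially matched.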
* Add ability to search /post_versions by added tags, removed tags, or
changed tags (added or removed).
* Add 'History' link to the sidebar of the /posts index. This is a
shortcut for a /post_versions search of the current tag.
No longer create a neutral feedback or a mod action, or dmail the user,
after changing a user's name. The name change is already recorded in
/user_name_change_requests, so creating feedbacks and mod actions is
redundant. They also expose private information (when a user deletes
their account, old name changes aren't supposed to be visible any more).
Remove all infrastructure around approving or rejecting user name
changes. Name changes haven't been moderated for several years.
* Remove status, approver_id, change_reason, and rejection_reason fields.
* Remove approve and reject controller actions.
* Automatically fix all tags with incorrect counts during daily
maintenance (previously only tags with negative counts were fixed).
* Log fixed tags to NewRelic.
* Remove the ability to manually fix tag counts with the "Fix" button on
the /tags listing. This is no longer necessary now that tags are
fixed automatically.
`User.find_by_name` used `where_ilike` to do a case-insensitive name
search, but it didn't escape `*` or `\` characters first, so it didn't
handle names containing these characters properly.
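The escaping step might look like the sketch below. `escape_wildcards` is a hypothetical helper name, and the example assumes that the wildcard search treats `*` as "match anything" and `\` as its escape character.

```ruby
# Hypothetical helper: backslash-escape the characters that a wildcard
# search would otherwise interpret, so the name is matched literally.
def escape_wildcards(string)
  string.gsub(/[\\*]/) { |char| "\\#{char}" }
end

escape_wildcards("foo*bar") # the `*` is now preceded by a backslash
```

The block form of `gsub` is used so that the backslashes in the replacement are not reinterpreted as back-references.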
* Always display 'Saved searches' link in subnav bar, even if the user
hasn't created any saved searches yet.
* Eliminate use of `has_saved_searches` bitpref on users.
* Only check for conflicts with existing aliases/implications when
requests are created or approved, not when requests are rejected.
* Use `update!(status: "deleted")` instead of `update(status: "deleted")`
so that if rejecting the request fails we fail immediately instead of
continuing on and updating the forum topic.
* Wrap `reject!` and `TagChangeRequestPruner.reject_expired` in
transactions so that if updating either the request or the forum
fails, they both get rolled back.
Stop maintaining pool category pseudo tags (pool:series, pool:collection)
in pool strings. They're no longer used and the changes to the
`Post#pools` method in dc4d2e54b caused issues with this.
Also allow Members to change the category of large pools again. This was
only restricted because maintaining these pseudotags forced us to update
every post in the pool whenever a pool's category was changed.
Stop using the pool_string field internally, but keep maintaining it
until we can drop it later.
* Stop using the pool_string for `pool:<name>` metatag searches.
* Stop using the pool_string in the `Post#pools` method. This is used to
get the list of pools on post show pages.
Revert optimization from a6163258b. Turns out that we have to resolve
aliases in fast_count, otherwise for aliased tags we'll return an empty
count.
Fixes #4156.
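The reason alias resolution matters can be shown with two lookup tables. This is only a sketch: the `fast_count` signature, the `aliases` hash, the `post_counts` hash, and the tag names and numbers are hypothetical illustrations, not the real method or real data.

```ruby
# Hypothetical version of the lookup: resolve the alias first, then read
# the count of the canonical tag. Without the first step, an aliased tag
# has no entry of its own and the count comes back empty.
def fast_count(tag, aliases:, post_counts:)
  canonical = aliases.fetch(tag, tag)
  post_counts.fetch(canonical, 0)
end

aliases = { "old_name" => "new_name" }   # made-up alias
post_counts = { "new_name" => 1200 }     # made-up count
fast_count("old_name", aliases: aliases, post_counts: post_counts)
```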
* Don't try to call `sadd` when a search returns no results (`sadd`
fails in this case).
* Add a timeout when populating the search.
* Don't offload the search to the read replica. The main db is fine.
* Disable synchronous population of searches. This was too slow.
* Change the source index on posts from `(lower(source) gin_trgm_ops) WHERE source != ''`
to just `(source gin_trgm_ops)`. The WHERE clause prevented the index
from being used in source:<url> searches because we didn't specify
the `source != ''` clause in the search itself. Excluding blank
sources only saved a marginal amount of space anyway. This fixes
timeouts in source:<url> searches and in the bookmarklet (since we do
a source dupe check on the upload page too).
* Also switch from indexing `lower(name)` to `name` on pools and users.
We don't need to lowercase the column because GIN indexes can be used
with both LIKE and ILIKE queries.
Make /favorites redirect to an ordfav:<user> search instead of having a
separate view just for favorites. This duplicated a lot of code for no
good reason.
Rewrite the implementation of related tags to be simpler, faster, and
more accurate:
* The related tags are now calculated by taking a random sample of 1000
posts, finding the top 250 most frequent tags among those posts, then
ordering those tags by cosine similarity.
* Related tags can generally be calculated in 50-300ms at these sample
sizes. Very high sample sizes (25000+ posts) are still relatively fast
(1-3 seconds), but generally they don't improve accuracy much.
* Related tags are now cached in redis rather than in the tags table.
The related_tags column in the tags table is no longer used.
* Only the related tags in the search taglist are cached. The related
tags returned by the 'Related tags' button are not cached.
* The cache lifetime is a fixed 4 hours.
* The 'Related tags' button now works with metatags.
* The /related_tag page now works with metatags and multitag searches.
Fixes #4134, #4146.
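The sample-then-rank approach described above can be sketched in plain Ruby over an in-memory list of posts. Everything here is an assumption for illustration: the `related_tags` signature, representing each post as an array of tags, and the exact similarity formula, which treats each tag as a set of posts, scales the overlap seen in the sample up to the full result set, and divides by the square root of the product of the two tag counts.

```ruby
# posts: array of tag arrays, one per post (each tag at most once per post).
# Returns [tag, similarity] pairs, most similar first.
def related_tags(posts, query_tag, sample_size: 1000, candidate_limit: 250, random: Random.new)
  matching = posts.select { |tags| tags.include?(query_tag) }
  sample = matching.sample(sample_size, random: random)
  return [] if sample.empty?

  # Most frequent tags within the sample become the candidates.
  sample_counts = sample.flatten.tally
  candidates = sample_counts.max_by(candidate_limit) { |_tag, count| count }

  total_counts = posts.flatten.tally
  query_total = total_counts.fetch(query_tag)

  scored = candidates.map do |tag, count|
    # Scale the overlap seen in the sample up to the whole result set.
    overlap = count.to_f / sample.size * query_total
    [tag, overlap / Math.sqrt(query_total * total_counts.fetch(tag))]
  end
  scored.sort_by { |_tag, similarity| -similarity }
end

posts = [%w[a b], %w[a b], %w[a c], %w[d c], %w[d]]
ranking = related_tags(posts, "a")
```

Ranking by similarity rather than raw frequency keeps very common tags from dominating the list just because they appear on most posts.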
* Drop support for `source:pixiv/artist-name` searches. This was a hack
that only worked on old pixiv urls that haven't been used for years.
* Replace the old SourcePattern(lower(source)) index with a trigram index.
Drop support for https://danbooru.donmai.us/cache/tags.json. This was a
nightly dump of the tags table that was originally added in #1012. It
was never documented and never really used except by the DanbooruUp
extension.
Bug: sending dmails failed for members.
Cause: using lambdas with `rakismet_attrs` failed because unexpected
arguments are passed to the lambdas. Using procs works because the
arguments are ignored.
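The difference comes down to how Ruby checks arity. The sketch below reproduces it in isolation; `call_with_extra_argument` is a hypothetical stand-in for however the library invokes the callable, and it is not rakismet code.

```ruby
# Call a callable with one argument it does not declare, roughly the way a
# library might pass a record to an attribute callback.
def call_with_extra_argument(callable)
  callable.call(:unexpected_argument)
rescue ArgumentError => e
  e.class
end

strict  = -> { "author name" }   # lambdas enforce their declared arity
lenient = proc { "author name" } # procs silently drop extra arguments

call_with_extra_argument(lenient) # returns the string
call_with_extra_argument(strict)  # the call raises; the helper returns ArgumentError
```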
Also fix the tests to actually test akismet. We didn't catch this
because the tests mocked out the `spam?` call.
Certain parts of comment rendering triggered sql queries that we didn't
really need to do. Rework things to avoid this.
* Preload comment creators in order to display commenter names with link_to_user.
* Preload comment votes in order to display "undo vote" links. Only preload
votes for members since anonymous users can't vote and don't have "undo
vote" links.
* Rework various conditionals to do the filtering in Ruby so that we
avoid issuing any extra queries in sql.
* Avoid issuing any queries at all when the post doesn't have any
comments (when last_commented_at is blank).
Also fixes a bug where mod actions weren't logged on mass updates.
Creating the mod action silently failed because it was called when
CurrentUser wasn't set.