Move the parsing for the [bur:<id>], [ta:<id>], [ti:<id>] pseudo tags to
the main parser in `DText.format_text`. This fixes a bug where wiki
links inside bulk update requests on the forum weren't properly
colorized because the text of the BUR was embedded after we scanned for
wiki links, not before.
This also ensures that tags inside bulk update requests will be recorded
in the dtext_links table, meaning that forum posts can be properly
searched by tags.
This incidentally means that these request pseudo tags can now be used
outside the forum.
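Conceptually, the fix reorders the pipeline like this (a rough sketch; `expand_pseudo_tags`, `scan_wiki_links`, and `render_html` are hypothetical names standing in for the real steps):

    def format_text(text)
      # Expand [bur:<id>], [ta:<id>], [ti:<id>] first, so that any wiki
      # links inside the expanded BUR text are visible to the wiki link scan.
      text = expand_pseudo_tags(text)
      wiki_links = scan_wiki_links(text)
      render_html(text, wiki_links)
    end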
* Let gold users use upvote:self, downvote:self metatags to search for
their own votes.
* Don't let mods use upvote:<user>, downvote:<user> metatags to see
votes by other users. Only let admins see other users' votes.
* Add vote count to profile page.
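A minimal sketch of the vote metatag rules above (the helper is hypothetical; `is_gold?`, `is_admin?`, and `User::PrivilegeError` are assumed from Danbooru's usual conventions):

    def parse_vote_metatag(value, current_user)
      if value == "self"
        # Gold users may search their own votes.
        raise User::PrivilegeError unless current_user.is_gold?
        current_user
      else
        # Only admins may see other users' votes; mods may not.
        raise User::PrivilegeError unless current_user.is_admin?
        User.find_by_name(value)
      end
    end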
* Remove the single alias and implication request forms. From now
on, bulk update requests are the only way to request aliases or
implications.
* Remove the forum topic ID field from the bulk update request form.
Instead, to attach a BUR to an existing topic, go to the topic and click
"Request alias/implication" at the top of the page.
* Update the bulk update request form to give better examples for the
script format and to explain the difference between aliases and
implications.
Instead of sending IQDB the image url and letting it download the file,
download the file on Danbooru's end and send IQDB the downloaded file.
This fixes several issues:
* Some sites need the referer header set to avoid hotlink protection
when downloading the file. Danbooru knows how to deal with this but
IQDB doesn't.
* We need to enforce certain restrictions when downloading files,
including setting max filesize limits, setting max timeouts, and not
allowing downloads from forbidden IPs (to avoid SSRF attacks).
Danbooru knows how to handle these things but IQDB doesn't.
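In outline, the new flow might look like this (a sketch; the downloader helper and IQDB endpoint are assumptions, not the actual code):

    def iqdb_query(source_url)
      # Danbooru downloads the file itself, applying its usual safeguards:
      # referer headers, max filesize, timeouts, and forbidden IP checks.
      file = download_with_safeguards(source_url)  # hypothetical helper
      # The downloaded file is posted to IQDB, so IQDB never fetches anything.
      HTTParty.post("#{iqdb_server_url}/query", body: { file: file })
    end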
Fix the moebooru strategy to fall back to returning the image url if we
can't find the preview url. This fixes IQDB lookups failing in some cases
because the strategy didn't return a valid url for preview_url.
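The fallback amounts to something like this (hedged sketch; the method names are not the actual ones):

    def preview_url
      # If no preview can be found, fall back to the full image so that
      # IQDB lookups always get a valid url.
      find_preview_url || image_url
    end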
* Pick the character or copyright tags with the largest post counts.
Previously we picked the tags with the longest names, which was nonsensical.
* Remove tag category logic from the config file. We can't avoid hardcoding
some knowledge about tag categories here, so there's no point in trying.
This affects tab titles on post show pages as well as filenames in
downloaded images.
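The change boils down to this (a sketch, assuming tags expose a `post_count` attribute):

    # Before: longest name wins (nonsensical).
    tags.max_by { |tag| tag.name.length }
    # After: highest post count wins.
    tags.max_by(&:post_count)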
Add a dtext_links table for tracking links between wiki pages. This is
to allow for broken link detection and "what links here" searches, among
other uses.
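A hedged sketch of what such a migration might look like (the column names and types here are assumptions, not the actual schema):

    create_table :dtext_links do |t|
      t.references :model, polymorphic: true, null: false # the page containing the link
      t.integer :link_type, null: false                   # e.g. wiki link vs. external link
      t.string  :link_target, null: false, index: true    # linked tag name or URL
      t.timestamps
    end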
* Fix the IQDB lookup to handle error messages returned by IQDB.
* Fix `undefined method `>=' for nil:NilClass` error when trying to upload
ugoiras.

Using the preview image for IQDB lookups should be faster and give the
same results as the full image, plus it should work with things like
ugoiras and videos that IQDB can't handle. Also add an image_url param so
that API users can continue using the full image if needed.
https://danbooru.donmai.us/forum_topics/9127?page=283#forum_post_160508
There was a recent outage caused by the read replica
(yukinoshita.donmai.us) being temporarily unavailable. The pg driver in
Rails got hard-stuck trying to connect to the replica, which brought down
the whole site. The app servers stopped responding and could only be
brought down with SIGKILL. Even trying to boot the Rails console didn't
work.
We only really used the replica to calculate tag counts inside
Post.fast_count, which wasn't much of a benefit anyway since the read
replica is slower than the main database.
Previously the search form on the /iqdb_queries page submitted directly
to the iqdb service (karasuma.donmai.us), which redirected back to
Danbooru with the search results.
This was different from API requests, which submitted to
/iqdb_queries.json and proxied the call to IQDB through Danbooru.
Because of this, searches on the /iqdb_queries page behaved differently
from API requests: things like filesize limits and referrer spoofing were
handled differently.
Now searches on the /iqdb_queries page submit directly to Danbooru. This
is simpler and it means that API requests and HTML requests have the
same behavior.
* Change cutoffs on the upload page to max 5 results, min 20% similarity.
* Change cutoffs on the standalone /iqdb_queries page to max 20 results, min 0% similarity.
* /iqdb_queries.json: add `limit` and `similarity` params to change the default cutoffs.
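For example, an API caller could tighten the cutoffs like this (illustrative request; the name of the parameter carrying the query image is an assumption):

    GET /iqdb_queries.json?url=https://example.com/image.jpg&limit=5&similarity=60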
DText is processed in three phases: a preprocessing phase, the regular
parsing phase, and a postprocessing phase.
In the preprocessing phase we extract all the wiki links from all the
dtext messages on the page (more precisely, we do this in forum threads
and on comment pages, because these are the main places with lots of
dtext). This is so we can look up all the tags and wiki pages in one
query, which is necessary because in the worst case (in certain forum
threads and in certain list_of_* wiki pages) there can be hundreds of
tags per page.
In the postprocessing phase we fixup the html generated by the ragel
parser to add CSS classes to wiki links. We do this in a postprocessing
step because it's easier than doing it in the ragel parser itself.
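Putting the three phases together, the flow is roughly this (a sketch with hypothetical helper names, not the actual code):

    def format_page(dtext_messages)
      # Preprocessing: gather every wiki link on the page so that all tags
      # and wiki pages can be fetched in a single query.
      names = dtext_messages.flat_map { |text| scan_wiki_links(text) }.uniq
      tags  = Tag.where(name: names).index_by(&:name)

      dtext_messages.map do |text|
        html = parse_with_ragel(text)      # the regular parsing phase
        add_wiki_link_classes(html, tags)  # postprocessing: add CSS classes
      end
    end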
* Parse the wiki page with the actual dtext parser instead of by hand.
This is so that wiki links inside things like [nodtext] or [code]
blocks are handled properly.
* Only include tags that exist and are nonempty. Don't include links to
dead pages or blank tags.
There are a handful of places where we need to strip markup from a piece
of dtext, primarily in <meta> description tags in the wiki. Currently
the dtext parser handles this by having a special mode where it parses
the text but doesn't output html tags. Here we refactor to instead parse
the text normally, then strip out the html tags after the fact.
This is more flexible and allows us to simplify a lot of things in the
dtext parser. This also produces more readable output than before in
certain cases.
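The refactored approach is roughly this (a sketch; using Nokogiri to strip the html is an assumption about the implementation):

    require "nokogiri"

    def strip_dtext(dtext)
      html = DText.format_text(dtext)
      # Parse the generated html and keep only its text content.
      Nokogiri::HTML.fragment(html).text
    end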
Previously the page-based (numbered) paginator always computed
total_pages, even in API calls where it wasn't needed. This could be very
slow in some cases. Refactor so that total_pages isn't calculated unless
it's actually called.
While we're at it, refactor to condense all the sequential vs. numbered
pagination logic into one module. This incidentally fixes a couple more
bugs:
* "page=b0" returned all pages rather than nothing.
* Bad parameters like "page=blaha123" and "page=a123blah" were accepted.
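A sketch of both fixes (hypothetical method names; the real paginator differs in detail):

    def total_pages
      # Lazy: the expensive COUNT(*) runs only if total_pages is called.
      @total_pages ||= (record_count.to_f / page_size).ceil
    end

    def parse_page(page)
      # Strict anchors reject junk like "blaha123" or "a123blah", and
      # "b0" now means "before id 0", i.e. nothing, rather than all pages.
      case page.to_s
      when /\A(\d+)\z/  then [:numbered, $1.to_i]
      when /\Aa(\d+)\z/ then [:sequential_after,  $1.to_i]
      when /\Ab(\d+)\z/ then [:sequential_before, $1.to_i]
      else raise ArgumentError, "invalid page: #{page}"
      end
    end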
Disable database timeouts during daily maintenance. This fixes
`regenerate_post_counts!` timing out. Also remove calls to
without_timeout, because otherwise they would re-enable the timeout when
trying to restore the old timeout (see 97cc873a3f).
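A sketch of the idea, assuming the timeout being disabled is Postgres's statement_timeout (`run_daily_maintenance` is a hypothetical entry point):

    # Disable the statement timeout once for the whole run, instead of
    # wrapping individual steps in without_timeout.
    ActiveRecord::Base.connection.execute("SET statement_timeout = 0")
    run_daily_maintenance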
This was a mod-only report that used Google BigQuery to search post
versions by tag. 2b4ee0ee8 allows all users to search post versions by
tag, so this report is no longer necessary.
* Automatically fix all tags with incorrect counts during daily
maintenance (previously only tags with negative counts were fixed).
* Log fixed tags to NewRelic.
* Remove the ability to manually fix tag counts with the "Fix" button on
the /tags listing. This is no longer necessary now that tags are
fixed automatically.
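A sketch of the maintenance step (whether `regenerate_post_counts!` returns the fixed tags, and the exact NewRelic call, are assumptions):

    fixed_tags = Tag.regenerate_post_counts!
    fixed_tags.each do |tag|
      # Record each fixed tag as a custom event in NewRelic.
      NewRelic::Agent.record_custom_event("FixedTagCount", tag: tag.name)
    end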
* Only check for conflicts with existing aliases/implications when
requests are created or approved, not when requests are rejected.
* Use `update!(status: "deleted")` instead of `update(status: "deleted")`
so that if rejecting the request fails we fail immediately instead of
continuing on and updating the forum topic.
* Wrap `reject!` and `TagChangeRequestPruner.reject_expired` in
transactions so that if updating either the request or the forum
fails, they both get rolled back.
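A simplified sketch of the transactional rejection (`forum_updater` stands in for the actual forum update step):

    def reject!
      transaction do
        # update! raises on failure, rolling back the whole transaction
        # instead of carrying on to update the forum topic.
        update!(status: "deleted")
        forum_updater.update("The request has been rejected.")
      end
    end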
Replace the `method_attributes` and `hidden_attributes` methods with
`api_attributes`. `api_attributes` can be used as a class macro:
    # include only the given attributes.
    api_attributes :id, :created_at, :creator_name, ...

    # include all default attributes plus the `creator_name` method.
    api_attributes including: [:creator_name]
or as an instance method:
    def api_attributes
      [:id, :created_at, :creator_name, ...]
    end
By default, all attributes are included except for IP addresses and
tsvector columns.
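The default set could be computed along these lines (a hedged sketch, not the actual code; `in?` is ActiveSupport):

    def self.api_attributes
      # All columns except IP addresses (inet) and tsvector columns.
      columns.reject { |c| c.sql_type.in?(%w[inet tsvector]) }.map { |c| c.name.to_sym }
    end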
In xml responses, if the result is an empty array we want the response
to look like this:
    <posts type="array"/>
not like this (the default):
    <nil-classes type="array"/>
This refactors controllers so that this is done automatically instead of
having to manually call `@things.to_xml(root: "things")` everywhere. We
do this by overriding the behavior of `respond_with` in `ApplicationResponder`
to set the `root` option by default in xml responses.
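A sketch of the override (simplified; the real responder logic may differ):

    class ApplicationResponder < ActionController::Responder
      def to_xml
        # Default the xml root for collections so an empty array renders as
        # e.g. <posts type="array"/> instead of <nil-classes type="array"/>.
        options[:root] ||= controller.controller_name if resource.respond_to?(:to_ary)
        super
      end
    end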
Stop using the pool_string field internally, but keep maintaining it
until we can drop it later.
* Stop using the pool_string for `pool:<name>` metatag searches.
* Stop using the pool_string in the `Post#pools` method. This is used to
get the list of pools on post show pages.
* Change the source index on posts from `(lower(source) gin_trgm_ops) WHERE source != ''`
to just `(source gin_trgm_ops)`. The WHERE clause prevented the index
from being used in source:<url> searches because we didn't specify
the `source != ''` clause in the search itself. Excluding blank
sources only saved a marginal amount of space anyway. This fixes
timeouts in source:<url> searches and in the bookmarklet (since we do
a source dupe check on the upload page too); see the migration sketch
after this list.
* Also switch from indexing `lower(name)` to `name` on pools and users.
We don't need to lowercase the column because GIN indexes can be used
with both LIKE and ILIKE queries.
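A hedged migration sketch for the source index change above (the migration and index names are made up):

    class ChangeTrigramIndexes < ActiveRecord::Migration[6.0]
      def change
        # Index the raw source column with no WHERE clause, so that plain
        # source LIKE/ILIKE queries can use the index.
        remove_index :posts, name: "index_posts_on_source_trgm"
        add_index :posts, :source, using: :gin, opclass: :gin_trgm_ops
      end
    end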