Fix regression in #4475. Fetch the commentary as html instead of
plaintext so that we don't lose links or other formatting.
Also fix it so that /jump.php redirect links are replaced with the
actual url.
* Move the source normalization logic out of the post model
and into individual sources' strategies.
* Rewrite normalization tests to be handled into each source's test,
and expand them significantly. Previously we were only testing
a very small subset of domains and variants.
* Fix up normalization for several sites.
* Normalize fav.me urls into normal deviantart urls.
* Move image thumbnail generation code to MediaFile::Image.
* Move video thumbnail generation code to MediaFile::Video.
* Move ugoira->webm conversion code to MediaFile::Ugoira.
This separates thumbnail generation from the upload process so that it's
possible to generate thumbnails outside of uploads.
Fixes bug described in d3e4ac7c17 (commitcomment-39049351)
When dealing with searches, there are several variables we have to keep
in mind:
* Whether tag aliases should be applied.
* Whether search terms should be sorted.
* Whether the rating:s and -status:deleted metatags should be added by
safe mode and the hide deleted posts setting.
Which of these things we need to do depends on the context:
* We want to apply aliases when actually doing the search, calculating
the count, looking up the wiki excerpt, recording missed/popular
searches in Reportbooru, and calculating related tags for the sidebar,
but not when displaying the raw search as typed by the user (for
example, in the page title or in the tag search box).
* We want to sort the search when calculating cache keys for fast_count
or related tags, and when recording missed/popular searches, but not
in the page title or when displaying the raw search.
* We want to add rating:s and -status:deleted when performing the
search, calculating the count, or recording missed/popular searches,
but not when calculating related tags for the sidebar, or when
displaying the page title or raw search.
Here we introduce normalized_query and try to use it in contexts where
query normalization is necessary. When to use the normalized query
versus the raw unnormalized query is still subtle and prone to error.
This was a hack to deal with the Cloudflare check sometimes being slow
or timing out during tests. The call to https://api.cloudflare.com/client/v4/ips
could hang if there were IPv6 connectivity problems. If this happens, make
sure that IPv6 is configured properly and that `curl -v --http1.1 -6 https://api.cloudflare.com/client/v4/ips`
works.
that's the latest commit made to deviantart files before switching from
the developer API to the Javascript backend from the new "Eclipse"
frontend.
This is necessary because it's basically impossible to download posts
now with the JS backend without being logged in, i.e. having the cookies
from a logged in user, which can't be used for very long even if
exporting them from a browser. You would have to save the cookies
deviantart sends you back via the "Set-Cookie" header in a database
somewhere in addition to the other added complexity.
also
* (temporarily) replace HttpartyCache with HTTParty as it's long been
removed
* fix one case of "last argument as keyword parameter"
* change repository url (5d1a1cc87e)
* remove self-explanatory comment
The name AliasAndImplicationImporter is a holdover from the time before
bulk update requests existed. This was a bad name because it doesn't do
any actual importing, instead it's used for parsing and executing bulk
update requests.
* Remove unnecessary rename_aliased_pages option. This option was always enabled.
* Don't try to rename the artist and wiki page inside AliasAndImplicationImporter
when an alias is approved. This is already handled by TagAlias#process!.
Refactor RelatedTagCalculator and RelatedTagQuery to take a PostQuery
object instead of a raw tag string.
* Fixes the related tag sidebar on the post index page having to reparse
the query and reevaluate aliases.
* Fixes related tags being affected by the current user's safe mode and
hide deleted posts settings.
Some searches, such as searches for private favorites or for the
status:unmoderated tag, return different results for different users.
These searches need to have their counts cached separately for each user
so that we don't return incorrect page counts when two different users
perform the same search.
This can also potentially leak private information, such as the number
of posts flagged, downvoted, or disapproved by a given user.
Partial fix for #4280.
* Refactor fast_count to return nil instead of 1,000,000 if the exact count times out.
* Remove the estimate_post_counts and blank_tag_search_fast_count global config options.
* Replace the hardcoded post count estimates inside fast_count with a
method that parses Postgres's estimated row count from EXPLAIN.
* /counts/posts.json:
** Remove the `raise_on_timeout` parameter.
** Add an `estimate_count=<true|false>` parameter.
** Return null instead of 1,000,000 if the exact count times out.
Change PostQueryBuilder to add rating:s and -status:deleted to the
search inside the constructor instead of inside `#build` and
`#fast_count`. This lets up clean up `#fast_count` so it doesn't have to
reparse the query after adding these tags. This caused aliases to be
evaluated more than once on the post index page.
Make PostQueryBuilder apply aliases earlier, immediately after parsing
the search.
On the post index page there are multiple places where we need to apply
aliases:
* When running the search with PostQueryBuilder#build.
* When calculating the search count with PostQueryBuilder#fast_count.
* When calculating the related tags for the sidebar.
* When tracking missed searches and popular searches for Reportbooru.
* When looking up wiki excerpts.
Applying aliases after parsing ensures we only have to apply aliases
once for all of these things.
We also normalize the order of tags in searches and strip repeated tags.
This is so that we have consistent cache keys for fast_count.
* Fixes searches for aliased tags being counted as missed searches (fixes#4433).
* Fixes wiki excerpts not showing up when searching for aliased tags.
When doing a tag search, we have to be careful about which user we're
running the search as because the results depend on the current user.
Specifically, things like private favorites, private favorite groups,
post votes, saved searches, and flagger names depend on the user's
permissions, and whether non-safe or deleted posts are filtered out
depend on whether the user has safe mode on or the hide deleted posts
setting enabled.
* Refactor internal searches to explicitly state whether they're
running as the system user (DanbooruBot) or as the current user.
* Explicitly pass in the current user to PostQueryBuilder instead of
implicitly relying on the CurrentUser global.
* Get rid of CurrentUser.admin_mode? (used to ignore the hide deleted
post setting) and CurrentUser.without_safe_mode (used to ignore safe
mode).
* Change the /counts/posts.json endpoint to ignore safe mode and the
hide deleted posts settings when counting posts.
* Fix searches not correctly overriding the hide deleted posts setting
when multiple status: metatags were used (e.g. `status:banned status:active`)
* Fix fast_count not respecting the hide deleted posts setting when the
status:banned metatag was used.
* Add MediaFile abstraction. A MediaFile represents an image or video file.
* Move filetype detection and dimension parsing code from uploads to MediaFile.
* Add unaliased:<tag> metatag. This allows you to search for a tag
without applying aliases. This is mainly useful for debugging purposes
and for searching for large tags that are in the process of being
aliased but haven't had all their posts moved yet.
* Remove the "raw" url param from the posts index page. The "raw" param
also caused the search to ignore aliases, but it was undocumented and
exploitable. It was possible to use the raw param to view private
favorites since favorites are treated like a hidden tag.
Forgot to account for negated metatags in normalize_query after e987f070.
Fixes a bug where wrong page counts were displayed for searches
involving negated metatags due to incorrect query normalization.
Fix not being able to negate the following metatags:
* id (didn't support ranges)
* md5
* width
* height
* mpixels
* ratio
* score
* favcount
* filesize
* date
* age
* tagcount
* pixiv
Bug: in some cases searching for multiple metatags would cause one
metatag to be ignored. For example, a search for {{user:1 pool:2}} would
be treated as a search for {{pool:2}}.
Cause: we used `ActiveRecord::Relation#merge` to combine two relations,
which was wrong because `merge` doesn't combine `column IN (?)` clauses
correctly. If there are two `column IN (?)` clauses on the same column,
then `#merge` takes only the second clause and ignores the first.
Fix: write our own half-baked `#and` method to work around Rails'
broken-by-design `#merge` method.
ref: https://github.com/rails/rails/issues/33501.
`normalize_query` is used in certain places on the post index page where
we don't want to pay the cost of looking up tag aliases (namely inside
fast_count, in post_search_count_js, and in tag change notices). Don't
normalize aliases by default unless we need to.
Several fixes for the "This tag is under discussion" notice on the post
index page:
* Fix the notice appearing for BURs that aren't pending.
* Fix the notice never going away because of the cache never expiring.
* List all topics when a tag is involved in multiple BURs.
* Link to the forum post instead of the forum topic (fix#4421).
* Optimization: don't check for BURs when the search isn't a simple
single tag search.
* Add a `tags` field to the bulk update requests table for tracking all
tags involved in the request (excluding tags in mass updates that are
negated/optional/wildcards). Known issue: doesn't handle tag type
prefixes in mass updates correctly (e.g. `mass update foo -> artist:bar`
doesn't detect the tag `bar`).
* Allow searching the /bulk_update_requests page by tags.
We don't really need to cache the notice here, but we do it anyway to
reduce queries on the post index page.