Refactor models so that we define attribute API permissions in policy
files instead of directly in models.
This is cleaner because a) permissions are better handled by policies
and b) which attributes are visible to the API is an API-level concern
that models shouldn't have to care about.
This fixes an issue with not being able to precompile CSS/JS assets
unless the database was up and running. This was a problem when building
Docker images because we don't have a database at build time. We needed
the database because `api_attributes` was a class-level macro in some
places, which meant it ran at boot time, but this triggered a database
call because api_attributes used database introspection to get the list
of allowed API attributes.
* Move the source normalization logic out of the post model
and into individual sources' strategies.
* Rewrite normalization tests to be handled into each source's test,
and expand them significantly. Previously we were only testing
a very small subset of domains and variants.
* Fix up normalization for several sites.
* Normalize fav.me urls into normal deviantart urls.
Fixes bug described in d3e4ac7c17 (commitcomment-39049351)
When dealing with searches, there are several variables we have to keep
in mind:
* Whether tag aliases should be applied.
* Whether search terms should be sorted.
* Whether the rating:s and -status:deleted metatags should be added by
safe mode and the hide deleted posts setting.
Which of these things we need to do depends on the context:
* We want to apply aliases when actually doing the search, calculating
the count, looking up the wiki excerpt, recording missed/popular
searches in Reportbooru, and calculating related tags for the sidebar,
but not when displaying the raw search as typed by the user (for
example, in the page title or in the tag search box).
* We want to sort the search when calculating cache keys for fast_count
or related tags, and when recording missed/popular searches, but not
in the page title or when displaying the raw search.
* We want to add rating:s and -status:deleted when performing the
search, calculating the count, or recording missed/popular searches,
but not when calculating related tags for the sidebar, or when
displaying the page title or raw search.
Here we introduce normalized_query and try to use it in contexts where
query normalization is necessary. When to use the normalized query
versus the raw unnormalized query is still subtle and prone to error.
Some searches, such as searches for private favorites or for the
status:unmoderated tag, return different results for different users.
These searches need to have their counts cached separately for each user
so that we don't return incorrect page counts when two different users
perform the same search.
This can also potentially leak private information, such as the number
of posts flagged, downvoted, or disapproved by a given user.
Partial fix for #4280.
Make PostQueryBuilder apply aliases earlier, immediately after parsing
the search.
On the post index page there are multiple places where we need to apply
aliases:
* When running the search with PostQueryBuilder#build.
* When calculating the search count with PostQueryBuilder#fast_count.
* When calculating the related tags for the sidebar.
* When tracking missed searches and popular searches for Reportbooru.
* When looking up wiki excerpts.
Applying aliases after parsing ensures we only have to apply aliases
once for all of these things.
We also normalize the order of tags in searches and strip repeated tags.
This is so that we have consistent cache keys for fast_count.
* Fixes searches for aliased tags being counted as missed searches (fixes#4433).
* Fixes wiki excerpts not showing up when searching for aliased tags.
When doing a tag search, we have to be careful about which user we're
running the search as because the results depend on the current user.
Specifically, things like private favorites, private favorite groups,
post votes, saved searches, and flagger names depend on the user's
permissions, and whether non-safe or deleted posts are filtered out
depend on whether the user has safe mode on or the hide deleted posts
setting enabled.
* Refactor internal searches to explicitly state whether they're
running as the system user (DanbooruBot) or as the current user.
* Explicitly pass in the current user to PostQueryBuilder instead of
implicitly relying on the CurrentUser global.
* Get rid of CurrentUser.admin_mode? (used to ignore the hide deleted
post setting) and CurrentUser.without_safe_mode (used to ignore safe
mode).
* Change the /counts/posts.json endpoint to ignore safe mode and the
hide deleted posts settings when counting posts.
* Fix searches not correctly overriding the hide deleted posts setting
when multiple status: metatags were used (e.g. `status:banned status:active`)
* Fix fast_count not respecting the hide deleted posts setting when the
status:banned metatag was used.
Fix not being able to negate the following metatags:
* id (didn't support ranges)
* md5
* width
* height
* mpixels
* ratio
* score
* favcount
* filesize
* date
* age
* tagcount
* pixiv
* Move various search parser helper methods (`has_metatag?`,
`is_single_tag?` et al) from PostSets and the Tag model to
PostQueryBuilder.
* Fix various minor bugs stemming from trying to check if a search query
contains certain metatags using regexes or other adhoc techniques.
* Make scan_query, parse_query, normalize_query into instance methods
instead of class methods. This is to a) clean up the API and b)
prepare for moving certain tag utility methods into PostQueryBuilder.
* Fix a few cases where a caller used scan_query when they should have
used split_query or parse_tag_edit.
Fix a severe performance regression on the posts/index page introduced
by 6ca42947.
Short answer: scan_query dynamically allocated a regex inside an
inner loop that was called thousands of times per pageload.
Long answer:
* The post index page checks each post to see if they're tagged loli/shota,
* This triggers a call to Post#tag_array for every post.
* Post#tag_array called scan_query to split the tag string.
* scan_query loops over the tag string, checking if each tag matches the
regex /#{METATAGS.join("|")}:/.
* This regex uses string interpolation, which makes Ruby treat as a
dynamic value rather than a static value. Ruby doesn't know the
interpolation is static here. This causes the regex to be reallocated
on every iteration of the loop, or in other words, for every tag in
the tag string.
* This caused us to do thousands of regex allocations per pageload. On
average, a posts/index pageload contains 20 posts with ~35 tags per
post, or 7000+ total tags. Doing this many allocations killed performance.
The fix:
* Don't use scan_query for Post#tag_array. We don't have to fully parse
the tag_string here, we can use a simple split.
* Use the /o regex flag to tell Ruby to treat the regex as static and
only evaluate the interpolation once.
* Fix not being able to use the status: metatag twice in the same search.
* Fix status:active excluding banned posts.
* Fix status:garbage returning all posts.
* Allow tagging a post with a `disapproved:<disinterest|breaks_rules|poor_quality>` to disapprove it.
* Disallow disapproving active posts.
Fixes#4384.
* Add a "View original" sidebar option.
* Rename the "View large" sidebar option to "View smaller".
* Remove the "Loading..." message when switching image sizes.
* Fix the V hotkey not working after using it once.
* Change #image-resize-link to .image-view-original link (note that
there are two of these links now, one in the notice bar and one in the
sidebar).
* Add a `data-post-current-image-size` attribute on the <body> element
and use it to control visibility of links and notices.
Side effects:
* The data-current-user-is-voter <body> attribute has been removed.
* {{upvote:self}} no longer works. {{upvote:<name>}} should be used instead.
* Add options for changing the order of the modqueue (newest first,
oldest first, highest scoring first, lowest scoring first).
* Change the default order from oldest posts first to most recently
flagged or uploaded posts first.
* Add an order:modqueue metatag to order by most recently flagged or
uploaded in standard searches.
* Include appeals and flags.
* Avoid an existence query for pools.
* Avoid a query checking if the user has previously approved the post.
This is a rare condition and it will be prevented anyway if the user
tries to reapprove the post.
Refactor to use a recursive CTE to calculate implied tags in SQL, rather
than storing them in a descendant_names field. This avoids the
complexity of keeping the stored field up to date. It's also more
flexible, since it allows us to find both descendant tags (tags that
imply a given tag) as well as ancestor tags (tags that are implied by a
given tag).
* Allow approvers to approve a post by tagging it with status:active.
* Allow approvers to ban a post by tagging it with status:banned.
* Allow approvers to unban a post by tagging it with -status:banned.
Remove various associated fields that were included by default on
certain endpoints. API users can use the only param to include the
full association if they need these fields.
* /artists.json: urls.
* /artist_urls.json: artist.
* /comments.json: creator_name and updater_name.
* /notes.json: creator_name.
* /pools.json: creator_name.
* /posts.json: uploader_name, children_ids, pixiv_ugoira_frame_data.
* /post_appeals.json: is_resolved.
* /post_versions.json: updater_name.
* /uploads.json: uploader_name.
- The only string works much the same as before with its comma separation
-- Nested includes are indicated with square brackets "[ ]"
-- The nested include is the value immediately preceding the square brackets
-- The only string is the comma separated string inside those brackets
- Default includes are split between format types when necessary
-- This prevents unnecessary includes from being added on page load
- Available includes are those items which are allowed to be accessible to the user
-- Some aren't because they are sensitive, such as the creator of a flag
-- Some aren't because the number of associated items is too large
- The amount of times the same model can be included to prevent recursions
-- One exception is the root model may include the same model once
--- e.g. the user model can include the inviter which is also the user model
-- Another exception is if the include is a has_many association
--- e.g. artist urls can include the artist, and then artist urls again
* Add ability to report dmails.
* Enable reports for comments, forum posts, and dmails.
* Allow Members to send reports.
* Don't allow users to report the same thing twice.
The belongs_to_creator macro was used to initialize the creator_id field
to the CurrentUser. This made tests complicated because it meant you had
to create and set the current user every time you wanted to create an
object, when lead to the current user being set over and over again. It
also meant you had to constantly be aware of what the CurrentUser was in
many different contexts, which was often confusing. Setting creators
explicitly simplifies everything greatly.
- Posts and topics have an added moderation_reports function
-- This is so all moderation reports can be loaded in a single query
- Those moderation reports are passed into the render functions separately
-- This is so the individual comments/posts don't have to be queried
Unify the `name_to_id`, `named`, and `find_by_name` methods into a
single `find_by_name_or_id` method that has consistent behavior in how
names are normalized.
* Convert notices from helpers to partials.
* Eliminate PostSets::PostRelationship class in favor of post_sets/posts template.
* Eliminate COUNT(*) queries when calculating the number of child posts.
* Eliminate redundant parent load and parent exists queries.