search: try to optimize slow searches.

Try to optimize certain types of common slow searches:

* Searches for mutually-exclusive tags (e.g. `1girl multiple_girls`,
  `touhou solo -1girl -1boy`)
* Relatively large tags that are heavily skewed towards old posts
  (e.g. lucky_star, haruhi_suzumiya_no_yuuutsu, inazuma_eleven_(series),
  imageboard_desourced).
* Mid-sized tags in the <30k post range that Postgres thinks are
  big enough for a post id index scan, but a tag index scan is faster.

The general pattern is Postgres not using the tag index because it
thinks scanning down the post id index would be faster, but it's
actually much slower because it degrades to a full table scan. This
usually happens when Postgres thinks a tag is larger or more common than
it really is. Here we try to force Postgres into using the tag index
when we know the search is small.

One case that is still slow is `2girls -multiple_girls`. This returns no
results, but we can't know that without searching all of `2girls`. The
general case is searching for `A -B` where A is a subset of B and A and B
are both large tags.

Hopefully fixes #581, #654, #743, #1020, #1039, #1421, #2207, #4070,
 #4337, #4896, and various other issues raised over the years regarding
slow searches.
This commit is contained in:
evazion
2021-10-11 23:39:03 -05:00
parent 0b22e873c9
commit f6abf39ebc
4 changed files with 62 additions and 15 deletions

View File

@@ -7,7 +7,8 @@ module PostSets
MAX_PER_PAGE = 200
MAX_SIDEBAR_TAGS = 25
attr_reader :page, :random, :post_count, :format, :tag_string, :query, :normalized_query
attr_reader :page, :random, :format, :tag_string, :query, :normalized_query
delegate :post_count, to: :normalized_query
def initialize(tags, page = 1, per_page = nil, user: CurrentUser.user, random: false, format: "html")
@query = PostQueryBuilder.new(tags, user, tag_limit: user.tag_query_limit, safe_mode: CurrentUser.safe_mode?, hide_deleted_posts: user.hide_deleted_posts?)
@@ -97,27 +98,16 @@ module PostSets
random || query.find_metatag(:order) == "random"
end
def get_post_count
if %w[json atom xml].include?(format.downcase)
# no need to get counts for formats that don't use a paginator
nil
else
normalized_query.fast_count
end
end
def get_random_posts
::Post.user_tag_match(tag_string).random(per_page)
end
def posts
@posts ||= begin
@post_count = get_post_count
if is_random?
get_random_posts.paginate(page, search_count: false, limit: per_page, max_limit: max_per_page).load
else
normalized_query.build.paginate(page, count: post_count, search_count: !post_count.nil?, limit: per_page, max_limit: max_per_page).load
normalized_query.paginated_posts(page, count: post_count, search_count: !post_count.nil?, limit: per_page, max_limit: max_per_page).load
end
end
end