Don't show deprecated tags in the related tags or translated tags lists
when editing a post. It doesn't make sense to recommended adding tags
that can't be added to the post.
Refactor full-text search on several tables (comments, dmails,
forum_posts, forum_topics, notes, and wiki_pages) to use to_tsvector
expression indexes instead of dedicated tsvector columns. This way
full-text search works the same way across all tables.
API changes:
* Changed /wiki_pages.json?search[body_matches] to match against only
the body. Before `body_matches` matched against both the title and the body.
* Added /wiki_pages.json?search[title_or_body_matches] to match against
both the title and the body.
* Fixed /dmails.json?search[message_matches] to match against both the
title and body when doing a wildcard search. Before a wildcard search
only matched against the body.
* Added /dmails.json?search[body_matches] to match against only the dmail body.
Change the wiki_pages tsvector_update_trigger to use
`pg_catalog.english` instead of `public.danbooru`. This changes how wiki
page text is parsed for full-text search to use the standard English
parser instead of test_parser. This is to prepare for dropping
test_parser. Using test_parser here was wrong anyway because it meant
that punctuation wasn't removed from words when indexing wiki pages for
full-text search.
Remove the `category_name` field from the `/wiki_page.json` API. This
field was originally added only because it was needed by our autocomplete
Javascript. It was also misnamed, it wasn't the tag's category name, it
was the category's ID.
Users should use `https://danbooru.donmai.us/wiki_pages.json?only=title,tag`
instead if they need this.
This triggered a N+1 query pattern when dumping wiki pages to BigQuery,
which made dumping wiki pages very slow. It also meant this field was
included in the database dump, even though it wasn't a real database
column.
Fix the `normalize` and `array_attribute` macros conflicting with each
other on the WikiPage model. This meant code like
`wiki_page.other_names = "foo bar"` didn't work. Both macros defined a
`other_names=` method, but one method overrode the other.
The fix is to use anonymous modules and prepend so we can chain method
calls with super.
Fix wiki pages and artists to normalize other names more consistently
and correctly.
For wiki pages, we strip leading / trailing / repeated underscores to
fix user typos, and we normalize to NFKC form to make search more consistent.
For artists, we allow leading / trailing / repeated underscores because
some artist names have these, and we normalize to NFC form because some
artists have weird names that would be lost by NFKC form. This does make
search less consistent.
* Introduce an abstraction for normalizing attributes. Very loosely
modeled after https://github.com/fnando/normalize_attributes.
* Normalize wiki bodies to Unicode NFC form.
* Normalize Unicode space characters in wiki bodies (strip zero width
spaces, normalize line endings to CRLF, normalize Unicode spaces to
ASCII spaces).
* Trim spaces from the start and end of wiki page bodies. This may cause
wiki page diffs to show spaces being removed even when the user didn't
explicitly remove the spaces themselves.
Don't allow wiki pages to have invalid names.
This incidentally means that you can't create wiki pages for pools. For
example, you can't create a wiki titled "pool:almost_heart-warming".
This is not a valid tag name, so it's not a valid wiki name either. This
was done in a handful of cases to translate Pixiv tags to Danbooru pools
(see: <https://danbooru.donmai.us/wiki_page_versions?search[title_like]=pool:*>)
Also fix it so that titles are normalized before validation, not before save.
Allowing typing Japanese tags in autocomplete. For example, typing 東方
in autocomplete will be completed to the touhou tag. Typing ぶくぶ will
complete to the bkub tag.
This works using wiki page and artist other names. Effectively, any name
listed as an other name in a wiki or artist page will be treated like an
alias for autocomplete purposes. This is limited to non-ASCII other names,
to prevent English other names from interfering with regular tag searches.
This refactors the autocomplete Javascript to use a single dedicated
/autocomplete.json endpoint instead of a bunch of separate endpoints.
This simplifies the autocomplete Javascript by making it so that instead
of calling a different endpoint for each type of query (for users, wiki
pages, pools, artists, etc), then having to parse the results of each
call to get the data we need, we can call a single endpoint that returns
exactly what we need.
This also means we don't have to parse searches clientside in order to
autocomplete metatags. Instead we can just pass the search term to the
server and let it parse the search, which is easy to do serverside.
Finally, this makes autocomplete easier to test, and it makes it easier
to add more sophisticated autocomplete behavior, since most of the logic
lives serverside.
When aliasing A to B, update any wikis linking to [[A]] to link to [[B]]
instead.
This is a best-effort process based on rough heuristics. There are a few
known problems:
* We don't always know how to capitalize the new tag. We try to mimic
the capitalization of the old tag, such that if the old tag was
capitalized (because it was at the beginning of a sentence), or if
every word in the old link was capitalized (because it's a proper
noun), then the new link will be capitalized in the same way. This can
handle simple general tags and character tags, but will fail for
copyright tags with mixed capitalization. For example, we don't know
that [[jojo_no_kimyou_na_bouken]] should be capitalized as [[JoJo no
Kimyou na Bouken]]. If we don't know how to capitalize the new tag, we
leave the old tag as-is so it can manually be fixed.
* Some aliases might require changing how a tag is pluralized. If we
changed [[rat]] to [[mouse]], then we should change `[[rat]]s` to
[[mice]]. We don't try to deal with this.
* In general, some changes might require entire sentences to be
rewritten to keep the grammar correct. Changing something like
[[skirt lift]] to [[lifting skirt]] could break the grammar of the
sentence. We don't try to deal with this.
Move html_data_attributes definitions from models to policies. Which
attributes are permitted as data-* attributes is a view level concern
and should be defined on the policy level, not the model level. Models
should be agnostic about how they're used in views.
Also add inputs on the search page for both the linked_to and the
not_linked_to search parameters. Additionally, normalize the title
first since autocomplete adds trailing spaces. The search query was
also simplified a bit by taking advantage of Rails associations.
Refactor models so that we define attribute API permissions in policy
files instead of directly in models.
This is cleaner because a) permissions are better handled by policies
and b) which attributes are visible to the API is an API-level concern
that models shouldn't have to care about.
This fixes an issue with not being able to precompile CSS/JS assets
unless the database was up and running. This was a problem when building
Docker images because we don't have a database at build time. We needed
the database because `api_attributes` was a class-level macro in some
places, which meant it ran at boot time, but this triggered a database
call because api_attributes used database introspection to get the list
of allowed API attributes.
Rename is_active to is_deleted. This is for better consistency with
other models, and to reduce confusion over what "active" means for
artists. Sometimes users think active is for whether the artist is
actively producing work.
* Replace the .category-N CSS classes on tags with .tag-type-N. Before
we were inconsistent about whether tag colors were indicated with
.category-N or .tag-type-N. Now it's always .tag-type-N.
* Fix various places to not use Tag.category_for. Tag.category_for does
one Redis call per tag lookup, which leads to N Redis calls on many
pages. This was inefficient because usually we either already had the
tags from the database, or we could fetch them easily.
- The only string works much the same as before with its comma separation
-- Nested includes are indicated with square brackets "[ ]"
-- The nested include is the value immediately preceding the square brackets
-- The only string is the comma separated string inside those brackets
- Default includes are split between format types when necessary
-- This prevents unnecessary includes from being added on page load
- Available includes are those items which are allowed to be accessible to the user
-- Some aren't because they are sensitive, such as the creator of a flag
-- Some aren't because the number of associated items is too large
- The amount of times the same model can be included to prevent recursions
-- One exception is the root model may include the same model once
--- e.g. the user model can include the inviter which is also the user model
-- Another exception is if the include is a has_many association
--- e.g. artist urls can include the artist, and then artist urls again
Remove support for the `search[sort]` param on certain index pages. This
hasn't been used for years, and it caused the `search[order]=` param to
be added to pagination links even when the order was blank.
Allow all users to view and edit artist entries and wiki pages belonging
to banned artists. There was little need to hide these pages from
Members, it was mainly to appease artists who didn't like us even
linking to their sites.
These restrictions also had multiple flaws:
* Banned artist information was still visible in the API.
* It was still possible to edit banned artists using the API.
* It was still possible for unprivileged users to revert banned
artist entries or wiki pages to previous versions.
* The restrictions were inconsistent: in various places they were
either Member-only, Gold-only, or Builder-only.
* Warn when renaming a wiki that still has links from other wikis.
* When renaming a wiki that still has posts, just show a warning instead
of returning an error and making the user confirm the rename.