* Fix the suggested tags list in the related tags box not showing rating tags.
* Fix the suggested tags list showing tags that have been aliased to another tag.
Add a Suggested tags list to the Related Tags box. The suggested tags
are just the AI tags for the post.
Suggested tags are currently hidden in CSS for beta testing. Use custom
CSS to unhide them.
Exclude the posts, post_votes, favorites, media_assets, and ai_tags
tables from the BigQuery dumps. These usually take too long to complete
and also consume huge amounts of memory in the background workers.
Allow searching for e.g. `ai:monochrome,0%` to find posts where the AI has 0% confidence
the post should be tagged monochrome.
This is useful for finding mistagged posts. You can search `monochrome ai:monochrome,0%`
to find posts that are potentially mistagged as monochrome, that is, posts that are
tagged monochrome but where the AI has 0% confidence that it should be tagged monochrome.
Not all cases will be mistags. The majority will be cases where the AI failed to identify
the tag. This is also useful for studying the AI's failures.
Index on (tag_id, score) instead of (tag_id) to allow tags to be
filtered and sorted by confidence more efficiently. This index takes up
about the same amount of space as an index on tag_id alone, so including
the score in the index is essentially free.
Change the /ai_tags page to show only posts by default, not both posts
and unposted media assets mixed together. Showing media assets tended to
confuse users about why they couldn't add tags to these images. It also
distracted from the page's primary use case, which is gardening posts.
Add a hidden option for 540px size thumbnails. This option currently
isn't listed in the thumbnail size menu, but it can be enabled with an
URL param: https://danbooru.donmai.us/posts?size=540.
Fix a bug where `Post#has_tag?` would return false for tags that
contained colons. This caused the /ai_tags page to incorrectly say
certain tags weren't present on the post.
Add "Add" and "Remove" buttons beneath thumbnails on the /ai_tags page.
These let you add the tag to the post if it's correct, or remove it if
it's wrong.
Add option for "Absurd" (720px) size thumbnails to the thumbnail size
menu. 720px size thumbnails require .webp support so they won't work in
older browsers.
Allow searching for e.g. `ai:solo,>=90%` to to find posts that have the
solo tag with >=90% confidence. The default confidence level is 50%. The
delimiter is a comma because it's one of the few characters not allowed
in tag names.
Enabled filtering by posted media assets on the /ai_tags page. This was
disabled in development because it was too slow, but seems to be fine in
production.
Add a database model for storing AI-predicted tags, and add a UI for browsing and searching these tags.
AI tags are generated by the Danbooru Autotagger (https://github.com/danbooru/autotagger). See that
repo for details about the model.
The database schema is `ai_tags (media_asset_id integer, tag_id integer, score smallint)`. This is
designed to be as space-efficient as possible, since in production we have over 300 million
AI-generated tags (6 million images and 50 tags per post). This amounts to over 10GB in size, plus
indexes.
You can search for AI tags using e.g. `ai:scenery`. You can do `ai:scenery -scenery` to find posts
where the scenery tag is potentially missing, or `scenery -ai:scenery` to find posts that are
potentially mistagged (or more likely where the AI missed the tag).
You can browse AI tags at https://danbooru.donmai.us/ai_tags. On this page you can filter by
confidence level. You can also search unposted media assets by AI tag.
To generate tags, use the `autotag` script from the Autotagger repo, something like this:
docker run --rm -v ~/danbooru/public/data/360x360:/images ghcr.io/danbooru/autotagger ./autotag -c -f /images | gzip > tags.csv.gz
To import tags, use the fix script in script/fixes/. Expect a Danbooru-size dataset to take
hours to days to generate tags, then 20-30 minutes to import. Currently this all has to be done by hand.
This is so admins can overrule flags and always have the final say in whether a
post is approved, even in the event of coordinated or sockpuppet flagging.
Fixes#4980: Way to mark flags as invalid for admins
The following metatags no longer count against the tag search limit:
* is
* id
* date
* age
* filesize
* filetype
* parent
* child
* md5
* width
* height
* duration
* mpixels
* ratio
* score
* upvotes
* downvotes
* favcount
* embedded
* tagcount
* pixiv_id
* pixiv
These are mostly metatags that have to do with properties of the post
itself. Other metatags still count because they involve things like
subqueries or joins, or they're more tag-like in function.