Add AI tag model and UI.
Add a database model for storing AI-predicted tags, and add a UI for browsing and searching these tags. AI tags are generated by the Danbooru Autotagger (https://github.com/danbooru/autotagger). See that repo for details about the model. The database schema is `ai_tags (media_asset_id integer, tag_id integer, score smallint)`. This is designed to be as space-efficient as possible, since in production we have over 300 million AI-generated tags (6 million images and 50 tags per post). This amounts to over 10GB in size, plus indexes. You can search for AI tags using e.g. `ai:scenery`. You can do `ai:scenery -scenery` to find posts where the scenery tag is potentially missing, or `scenery -ai:scenery` to find posts that are potentially mistagged (or more likely where the AI missed the tag). You can browse AI tags at https://danbooru.donmai.us/ai_tags. On this page you can filter by confidence level. You can also search unposted media assets by AI tag. To generate tags, use the `autotag` script from the Autotagger repo, something like this: docker run --rm -v ~/danbooru/public/data/360x360:/images ghcr.io/danbooru/autotagger ./autotag -c -f /images | gzip > tags.csv.gz To import tags, use the fix script in script/fixes/. Expect a Danbooru-size dataset to take hours to days to generate tags, then 20-30 minutes to import. Currently this all has to be done by hand.
This commit is contained in:
@@ -20,6 +20,7 @@ class MediaAsset < ApplicationRecord
|
||||
has_many :upload_media_assets, dependent: :destroy
|
||||
has_many :uploads, through: :upload_media_assets
|
||||
has_many :uploaders, through: :uploads, class_name: "User", foreign_key: :uploader_id
|
||||
has_many :ai_tags
|
||||
|
||||
delegate :metadata, to: :media_metadata
|
||||
delegate :is_non_repeating_animation?, :is_greyscale?, :is_rotated?, to: :metadata
|
||||
|
||||
Reference in New Issue
Block a user