danbooru

Author	SHA1	Message	Date
evazion	00db63e885	Fix #5336 : Nuke old danboorubot replacement comments. Add a fix script that imports the md5 for old post replacements from the corresponding DanbooruBot replacement comment, then deletes all replacement comments. There are about 250 replacements left that still have a null md5 because they don't have a matching comment. This is because if a post was replaced but the file didn't change, it didn't leave a comment.	2022-11-08 02:26:50 -06:00
evazion	f083f29c3b	users: add is_deleted flag. Add is_deleted flag to users table in preparation for fixing #4555.	2022-11-06 01:41:14 -05:00
evazion	4ae3ebf845	artists: add SQL script to find incorrect artist URLs.	2022-11-05 19:09:56 -05:00
evazion	4c0f62254e	script/fixes/123_refresh_media_metadata.rb: refresh metadata in parallel.	2022-11-03 22:09:24 -05:00
evazion	acc511ab7d	media assets: fix dimensions of flash files. Use ExifTool to get the dimensions of Flash files instead of calculating it ourselves. Avoids copying third-party code. Fixes a bug where Flash files with fractional dimensions (e.g. 607.6 x 756.6) had their dimensions rounded down instead of rounded up. Fixes another bug where Flash files could return negative dimensions. This happened for two files: * https://danbooru.donmai.us/media_assets/228662 (-179.2 x -339.2) * https://danbooru.donmai.us/media_assets/228664 (-179.2 x -339.2) Now we round these up to 1x1. This is still wrong, but it's less wrong than before.	2022-10-31 17:30:40 -05:00
evazion	27e4ae3d33	script/fixes/123_refresh_media_metadata.rb: don't wrap in transaction. Don't wrap the metadata refresh script in a transaction because it could be a very long running operation and it's not good to leave a transaction open that long.	2022-10-31 02:29:07 -05:00
evazion	214a877c3c	users: fix typo in contributor/approver migration script. Fixup for #5306.	2022-10-30 21:19:05 -05:00
evazion	d65a35d4ae	media assets: add fix script to refresh metadata. Add a script to go through every media asset and check the metadata (width, height, duration, filesize, md5, EXIF metadata) and update it if it's changed. This is necessary after upgrading ExifTool because the metadata it returns may have changed.	2022-10-30 14:49:12 -05:00
nonamethanks	ca31e7a47c	Users: add Contributor and Approver user levels	2022-10-21 20:52:31 +02:00
evazion	873c67db58	emails: disallow names ending with a period. Update email validation rules to disallow the percent character (e.g. `foo%bar@gmail.com`) and names ending with a period (e.g. `foo.@gmail.com`). Names ending with a period are invalid according to the RFCs and cause `Mail::Address.new` to raise an exception. The percent character is technically legal, but only one email used it and it was probably a typo.	2022-10-17 22:13:19 -05:00
evazion	e31977ac29	emails: move EmailValidator into Danbooru::EmailAddress.	2022-10-17 22:13:19 -05:00
evazion	f6516e0e37	emails: add script to fix typo'd emails. Add fix script to fix emails containing typos, such as `name@gamil.com`.	2022-10-14 23:28:23 -05:00
evazion	01d10a54f8	ugoira: store frame delays in MediaMetadata model. Store Ugoira frame delays in the MediaMetadata model as a fake EXIF field instead of in the PixivUgoiraFrameData model. This way we can get rid of the PixivUgoiraFrameData model completely. This is a step towards fixing #5264.	2022-10-09 22:25:20 -05:00
evazion	0cfd0ff436	emails: add fix script to renormalize email addresses. Whenever the email address normalization procedure changes, the `normalized_address` column of the email address table must be updated. This is normally when the list of canonical domain mappings changes. Renormalizing addresses may also require deleting duplicates.	2022-10-03 02:55:30 -05:00
evazion	86e69e3401	emails: add fix script to delete duplicate email addresses. In the past it was possible for users to create multiple accounts with the same email address. We had about 9000 such accounts. This removes the email address from these accounts. When multiple accounts have the same email address, the account that visited the site last gets to keep the address.	2022-10-02 23:59:54 -05:00
evazion	21747e1f8e	emails: add fix script to fix invalid email addresses. Add a fix script that fixes invalid email addresses if they can be fixed, otherwise they're deleted. For a long time we didn't have any email validation, so we ended up with a lot of invalid email addresses containing typos or other random garbage. This tries to fix the most common typos when possible, otherwise the email address is deleted. In many cases the user created two accounts, one with a typo in the email and one with the correct email. In these cases we can't fix the invalid email, so we just delete it.	2022-10-02 20:44:10 -05:00
evazion	85cb434b2c	users: fix bug in invalid username deletion script. Fix a bug in script/fixes/115_delete_invalid_users.rb where certain usernames containing punctuation weren't deleted.	2022-10-02 03:42:51 -05:00
evazion	3dc765ca9d	mod actions: add fix script to populate subject field. Add a fix script to populate the mod_actions subject field by parsing mod action descriptions. Most mod actions contain an ID, so finding the subject is easy, but some don't. And some mod actions refer to deleted objects, such as deleted posts or comments. In these cases the subject will be null. For IP bans, the mod action description only contains the IP, but it's possible to have multiple bans for the same IP. So we look for IP bans created by the same user, for the same IP, within the same time range. For user bans, the mod action only contains the banned user's name and the ban reason. This makes it difficult to find the banned user's ID in some cases, because it's possible for the user to have changed their name, and for the name change to have not been recorded, and for the banner to have edited the ban reason, or for the ban to have been deleted. So we try multiple things until we find the closest match.	2022-09-25 21:19:43 -05:00
evazion	aea3837f9a	users: delete accounts with invalid names. Add a fix script to delete all accounts with invalid usernames. Also change it so the owner-level user can delete accounts belonging to other users. Users who have logged in in the last year and who have a valid email address will be given a one week warning. After that all accounts with invalid names will be deleted. Anyone who has visited the site in the last 6 months will have already seen a warning page that their name must be changed to keep using the site.	2022-09-19 05:09:44 -05:00
evazion	2119a8efc5	mod actions: fix messages to use consistent format. Fix mod actions to use the same message format everywhere. Before mod actions were formatted in various inconsistent ways: * "deleted post #1234" * "comment #1234 updated by <user>" * "<user> updated forum #1234" * "<user> level changed Member -> Builder" Now all mod actions consistently use this format: * "deleted post #1234" * "updated comment #1234" * "updated forum #1234" * "promoted <user> from Member to Builder" This way mod actions are formatted consistently with other actions on the /user_actions page, where everything is written as "<user> did X". Also add a fix script to fix existing mod actions.	2022-09-18 21:56:57 -05:00
evazion	ec382357b8	tags: populate `words` column. Add code for parsing tags into words and for populating the `words` column in the tags table.	2022-09-01 23:54:07 -05:00
evazion	1aeb52186e	Add AI tag model and UI. Add a database model for storing AI-predicted tags, and add a UI for browsing and searching these tags. AI tags are generated by the Danbooru Autotagger (https://github.com/danbooru/autotagger). See that repo for details about the model. The database schema is `ai_tags (media_asset_id integer, tag_id integer, score smallint)`. This is designed to be as space-efficient as possible, since in production we have over 300 million AI-generated tags (6 million images and 50 tags per post). This amounts to over 10GB in size, plus indexes. You can search for AI tags using e.g. `ai:scenery`. You can do `ai:scenery -scenery` to find posts where the scenery tag is potentially missing, or `scenery -ai:scenery` to find posts that are potentially mistagged (or more likely where the AI missed the tag). You can browse AI tags at https://danbooru.donmai.us/ai_tags. On this page you can filter by confidence level. You can also search unposted media assets by AI tag. To generate tags, use the `autotag` script from the Autotagger repo, something like this: docker run --rm -v ~/danbooru/public/data/360x360:/images ghcr.io/danbooru/autotagger ./autotag -c -f /images | gzip > tags.csv.gz To import tags, use the fix script in script/fixes/. Expect a Danbooru-size dataset to take hours to days to generate tags, then 20-30 minutes to import. Currently this all has to be done by hand.	2022-06-24 04:54:26 -05:00
evazion	fec92d765a	users: change default blacklist to `furry -rating:g`.	2022-06-02 00:06:34 -05:00
evazion	173e43b192	user upgrades: add upgrade code system. Add a system for upgrading accounts using upgrade codes. Users purchase an upgrade code off-site then redeem it on-site to upgrade their account to Gold. Upgrade codes are randomly pre-generated and are one time use only. Codes have enough randomness that guessing a code is infeasible.	2022-06-01 18:31:46 -05:00
evazion	4ba993319a	media assets: add file_key, is_public columns. `file_key` is a random 9-character base-62 string that will be used as the image filename in the future. `is_public` is whether the image can be viewed without authentication or not. Users running downstream boorus must run `bin/rails db:migrate` and `script/fixes/109_generate_media_asset_file_keys.rb` after this commit.	2022-05-04 23:19:53 -05:00
evazion	703fd05025	favgroups: don't allow favgroups to be named 'any' or 'none'. 'any' and 'none' are now reserved keywords for the favgroup: metatag. Also add a fix script to rename existing favgroups.	2022-04-17 23:17:18 -05:00
evazion	226faae8ec	BURs: fix tags field not finding all BURs with that tag. Fix the Tags field in the BUR search form not finding all BURs mentioning that tag. Specifically, tags that were part of a mass update, and that were prefixed with `~` or `-` (OR tags and NOT tags), weren't indexed as tags affected by the BUR. This requires re-running script/fixes/064_initialize_bulk_update_request_tags.rb to fix old BURs.	2022-03-29 21:06:24 -05:00
evazion	4b1264991f	users: remove 'spoilers' tag from default blacklist. Rationale: * The spoilers tag is the most frequently removed tag from the default blacklist. * It's frustrating for regular users to have posts randomly hidden because of trivial spoilers from a series they don't care about. * The spoilers tag is used way too liberally for things that aren't considered spoilers on other sites. * If you're looking up fanart on the internet, you should expect to see a certain level of spoilers. * The tag is used very inconsistently, with some characters like Nia_(blade)_(xenoblade) getting the spoilers tag half the time and the rest of the time not.	2022-03-20 16:49:36 -05:00
evazion	04c03fa4e6	artist: normalize more artist url formats.	2022-03-16 17:17:50 -05:00
evazion	04226d3409	pixiv: normalize pixiv urls in artist entries. Normalize Pixiv URLs to `https://www.pixiv.net/users/1234` format.	2022-03-14 16:43:19 -05:00
evazion	223742c365	weibo: normalize weibo urls in artist entries. Normalize all Weibo URLs in artist entries to one of these forms: * https://www.weibo.com/u/5399876326 * https://www.weibo.com/p/1005055399876326 * https://www.weibo.com/chengziyou666	2022-03-13 21:16:56 -05:00
evazion	eb032d54c1	uploads: set upload_media_asset.status to active. Fix the status being set to pending instead of active for new upload media assets.	2022-02-14 00:40:40 -06:00
evazion	04d242c60c	uploads: save filename, image URL, page URL for uploads. * Save the filename for files uploaded from disk. This could be used in the future to extract source data if the filename is from a known site. * Save both the image URL and the page URL for files uploaded from source. This is needed for multi-file uploads. The image URL is the URL of the file actually downloaded from the source. This can be different from the URL given by the user, if the user tried to upload a sample URL and we automatically changed it to the original URL. The page URL is the URL of the page containing the image. We don't always know this, for example if someone uploads a Twitter image without the bookmarklet, then we can't find the page URL. * Add a fix script to backfill URLs for existing uploads. For file uploads, the filename will be set to "unknown.jpg". For source uploads, we fetch the source data again to get the image and page URLs. This may fail for uploads that have been deleted from the source since uploading.	2022-02-12 15:22:41 -06:00
evazion	9a23970ab1	uploads: fix media_asset_count.	2022-02-12 15:22:24 -06:00
evazion	1a61e329ba	uploads: add column for error messages. Change it so uploads store errors in an `error` column instead of in the `status` field.	2022-02-07 15:44:39 -06:00
evazion	19a9cf3d2f	uploads: delete old upload records from before the rework. Delete all old upload records from before the upload rework in `abdab7a0a` / `f11c46b4f`. Uploads from before the rework don't have any attached media assets, so they're not valid under the new system because we can't find which files they were for. Before the rework, completed uploads were only saved for 1 hour, and failed uploads were only saved for 3 days, so deleting this data doesn't really lose anything that wouldn't have been deleted before.	2022-02-07 15:11:09 -06:00
evazion	6d2a2eee59	Fix #4017 : Artist tag in upload page should account for aliases. Disallow creating artist entries for aliased tags. Add a fix script to move existing artist entries for tags that have been aliased.	2022-02-01 12:33:45 -06:00
evazion	61c043c6b1	posts: normalize Unicode to NFC form in post sources. Fix strings like "pokémon" (NFD form) and "pokémon" (NFC form) being considered different strings in sources. Also add a fix script to fix existing sources. There were only 15 posts with unnormalized sources.	2022-01-31 14:16:49 -06:00
evazion	d2a24e6b10	Fix #4971 : NoMethodError when trying to display some modreports. Delete modreports for hard-deleted comments. There were a total of six invalid modreports for deleted comments.	2022-01-22 18:12:07 -06:00
evazion	56722df753	forum: delete posts when topic is deleted. Fix it so that when a forum topic is deleted, all posts in the topic are deleted too. Also make it so that when a forum topic is undeleted, all posts in it are undeleted too. Before when a topic was deleted, only the topic itself was marked as deleted, not the posts inside the topic. This meant that when a spam topic was deleted, the OP wouldn't be marked as deleted, so any modreports against it wouldn't be marked as handled. Also change it so that it's not possible to undelete a post in a deleted topic, or to delete the OP of a topic without deleting the topic itself. Finally, add a fix script to delete all active posts in deleted topics, and to undelete all deleted OPs in active topics.	2022-01-21 22:35:20 -06:00
evazion	c8d27c2719	Fix #4669 : Track moderation report status. * Add ability to mark moderation reports as 'handled' or 'rejected'. * Automatically mark reports as handled when the comment or forum post is deleted. * Send a dmail to the reporter when their report is handled. * Don't show the report notice on comments or forum posts when all reports against it have been handled or rejected. * Add a fix script to mark all existing reports for deleted comments, forum posts, or dmails as handled.	2022-01-20 20:50:23 -06:00
evazion	98aee048f2	artists: fix old artists with invalid names. There are a lot of old artist entries with Japanese names. These names are now invalid and these artist entries can't be edited because they fail validation checks. Add a fix script to delete all artist entries with non-ASCII names, and rename them to `artist_1234`.	2022-01-20 16:01:31 -06:00
evazion	02c9498860	artists: normalize group names. Normalize artist group names following the same rules as artist other names. This means artist group names now use underscores instead of spaces. It also means extra space characters at the beginning and end of names are stripped, and Unicode characters are normalized. Fixes #4647, which was caused by users accidentally replacing group names with a single space character when trying to remove a group.	2022-01-20 00:17:06 -06:00
evazion	acf565be7b	Fix #4678 : Validate custom CSS. * Make it an error to add invalid custom CSS to your account. * Add a fix script to remove custom CSS from all accounts with invalid CSS.	2022-01-15 23:20:49 -06:00
evazion	33103f6dc4	pools: add ability to search for pools linking to given tag. Add ability to search for pools linking to a given tag in the pool description. Example: https://danbooru.donmai.us/pools?search[linked_to]=touhou (This isn't actually exposed in the UI to avoid cluttering the pool search form with rarely used options.) Pools with broken links can be found here: https://danbooru.donmai.us/dtext_links?search[has_linked_tag]=No&search[has_linked_wiki]=No&search[model_type]=Pool Lays the groundwork for fixing #4629.	2022-01-15 20:26:30 -06:00
evazion	c3c4f5a2a7	Fix #4957 : Autotag non-web_source. Autotag non-web_source on posts that have a non-http:// or https:// URL. Add a fix script to backfill old posts. Syntactically invalid URLs are still considered web sources. For example, `https://google,com` technically isn't a valid URL, but it's not considered a non-web source.	2022-01-14 22:58:27 -06:00
evazion	2e1c7ce6d3	Fix #4951 : chartags:0 returning posts with chartags. * Add fix script to fix posts with incorrect tag_count_* fields. * Simplify the code for updating tag_count_* fields (no functional change).	2022-01-10 13:33:56 -06:00
evazion	85e1ae3c9b	favorites: fix posts with incorrect fav_count fields. There were about 4000 posts with an incorrect fav_count.	2022-01-09 19:31:45 -06:00
evazion	ab4214dc00	emails: mark all invalid emails as undeliverable.	2022-01-09 13:24:53 -06:00
evazion	c09cd9e9fd	users: fix incorrect count columns on users table. Fix incorrect post_upload_count, note_update_count, and unread_dmail_count columns on the users table.	2022-01-09 12:51:10 -06:00
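
Commit 61c043c6b1 above normalizes post sources to Unicode NFC form so that visually identical strings compare equal. The following is a minimal sketch of that idea using Ruby's built-in `String#unicode_normalize`. The helper name `normalize_source` is hypothetical and is not taken from the Danbooru codebase; the actual commit may do this differently.

```ruby
# Hypothetical helper for illustration only; not the code from the commit.
# Converts a source string to Unicode NFC (composed) form.
def normalize_source(source)
  source.unicode_normalize(:nfc)
end

nfd = "poke\u0301mon" # "e" followed by U+0301 COMBINING ACUTE ACCENT (NFD form)
nfc = "pok\u00E9mon"  # precomposed U+00E9 (NFC form)

# The raw strings differ at the codepoint level even though they render the same.
puts nfd == nfc

# After normalization both are the same string.
puts normalize_source(nfd) == normalize_source(nfc)
```

Normalizing at write time, as the commit describes, means later comparisons and searches do not need to normalize again; a one-off fix script then only has to cover rows saved before the change.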


199 Commits
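
Commit 86e69e3401 in the list above states a rule for duplicate email addresses: when several accounts share an address, the account that visited the site last keeps it and the others lose it. The sketch below shows that rule on plain in-memory objects. The `Account` struct, its field names, and `strip_duplicate_emails` are assumptions made for illustration; the real fix script works against the database and is not reproduced here.

```ruby
# Hypothetical in-memory stand-in for a user account; field names are assumed.
Account = Struct.new(:id, :email, :last_visited_at)

# For every email shared by more than one account, keep it only on the
# account with the most recent visit and clear it on the rest.
def strip_duplicate_emails(accounts)
  accounts.group_by(&:email).each do |email, group|
    next if email.nil? || group.size < 2

    keeper = group.max_by(&:last_visited_at)
    group.each { |account| account.email = nil unless account.equal?(keeper) }
  end
  accounts
end

accounts = [
  Account.new(1, "x@example.com", Time.utc(2022, 1, 1)),
  Account.new(2, "x@example.com", Time.utc(2022, 6, 1)),
  Account.new(3, "y@example.com", Time.utc(2021, 1, 1)),
]

strip_duplicate_emails(accounts).each do |account|
  puts "#{account.id}: #{account.email.inspect}"
end
```

In this example account 2 keeps the shared address because it has the later visit time, account 1 has its address cleared, and account 3 is untouched because its address is unique.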