Stop updating the fav_string attribute on posts. The column still exists
on the table, but is no longer used or updated.
Like the pool_string in 7d503f08, the fav_string was used in the past to
facilitate `fav:X` searches. Posts had a hidden fav_string column that
contained a list of every user who favorited the post. These were
treated like fake hidden tags on the post so that a search for `fav:X`
was treated like a tag search.
The fav_string attribute has been unused for search purposes for a while
now. It was only kept because of technicalities that required
departitioning the favorites table first (340e1008e) before it could be
removed. Basically, removing favorites with `@favorite.destroy` was
slow because Rails always deletes object by ID, but we didn't have an
index on favorites.id, and we couldn't easily add one until the
favorites table was departitioned.
Fixes#4652. See https://github.com/danbooru/danbooru/issues/4652#issuecomment-754993802
for more discussion of issues caused by the fav_string (in short: write
amplification, post table bloat, and favorite inconsistency problems).
Optimize counting the number of posts returned by fav:<name> and
pool:<name> searches. Use cached counts to avoid slow count(*) queries
for users with lots of favorites.
Include the favorites table in the nightly database dumps in BigQuery.
Previously we couldn't do this because we didn't have an index on
the favorite ID, which we needed to iterate across the table efficiently.
Note that this doesn't include private favorites. Note also that if a
user switches their favorites from private to public, then their
favorites will begin to appear in these dumps.
Order https://danbooru.donmai.us/favorites.json by favorite ID instead
of by post ID. This way you can get a feed of recently added favorites.
Previously this wasn't possible because we didn't have an index on
favorite ID.
Merge the 100 favorite subtables into a single table.
Previously the favorites table was partitioned by user id into 100
subtables to try to make searching by user id faster. This wasn't really
necessary and probably slower than just making an index on
(favorites.user_id, favorites.id) to satisfy ordfav searches. BTree
indexes are logarithmic so dividing an index by 100 doesn't make it 100
times faster to search; instead it just removes a layer or two from the
tree.
This also adds a uniqueness index on (user_id, post_id) to prevent
duplicate favorites. Previously we had to check for duplicates at the
application layer, which required careful locking to do it correctly.
Finally, this adds an index on favorites.id, which was surprisingly
missing before. This made ordering and deleting favorites by id really
slow because it degraded to a sequential scan.
Add fix script to remove duplicate favorites. When a user has duplicate
favorites on the same post, the earliest favorite will be kept and the
rest will be removed.
Let all users have unlimited favorites. Formerly the limit was 10k
favorites for regular members, 20k for Gold, and unlimited for Platinum.
Limiting favorites doesn't make sense since upvotes are unlimited.
Stop using the pool_string attribute on posts:
* Stop updating it when adding or removing posts from pools.
* Stop returning pool_string in the /posts.json API.
* Stop including the `data-pools` attribute on thumbnails.
The pool_string attribute was used in the past to facilitate pool:X
searches. Posts had a hidden pool_string attribute that contained a list
of every pool the post belonged to. These pools were treated like fake
hidden tags on the post and a search for `pool:X` was treated like a tag
search.
The pool_string has no longer been used for this purpose for a long time
now, and was only maintained for API compatibility purposes. Getting rid
of it eliminates a bunch of legacy cruft relating to adding and removing
posts from pools.
If you need to see which pools a post belongs to, do this:
* https://danbooru.donmai.us/pools.json?search[post_ids_include_any]=318550
The `data-pools` attribute on thumbnails was used by some people to add
custom borders to pooled posts with custom CSS. This will no longer
work. This was already broken because it included things like collection
pools and deleted pools, which you probably didn't want. Use a
userscript to add this attribute back to thumbnails if you need it.
Start storing the duration of animations and videos in the `duration`
field on the media_assets table. This had to wait until 3d30bfd69 was
deployed, which had to wait until Postgres was upgraded in order to add
the duration column to the media_assets table without downtime.
Also add a fix script to backfill the duration on existing posts. Usage:
TAGS=animated ./script/fixes/079_fix_duration.rb
Fix certain animated PNGs returning NaN as the duration because the
frame rate was being reported as "0/0" by FFMpeg. This happens when the
animation has zero delay between frames. This is supposed to mean a PNG
with an infinitely fast frame rate, but in practice browsers limit it to
around 10FPS. The exact frame rate browsers will use is unknown and
implementation defined.
Update the Postgres client binaries (psql et al) to version 14.0. This
is so they match the server version, and so that pg_amcheck is
available, which was introduced in 14.0.
This requires updating the base image to Ubuntu 21.04 at the same time
because the Postgres repo doesn't support version 14.0 on Ubuntu 20.10.
* Skip Nijie tests because they fail a lot due to Nijie rate limiting us.
* Skip ArtStation downloads tests because they sometimes return different file sizes.
* Fix random duplicate favgroup errors because favgroup names weren't random enough.
This reverts commit 9d62f71cd9.
This caused a problem where if you pushed multiple branches or tags at
once, for example to the betabooru and production branches, then the
Docker image would only get built for one branch. This led to deploys
not fetching the latest image.
Fix regression in 2eb89a835 that broke the modqueue page because the
arguments to `paginated_search` changed and weren't updated here.
Also fix incorrect YARD documentation syntax.
* On /pools, hide deleted pools by default in HTML responses. Don't
filter out deleted pools in API responses.
* API change: on /forum_topics, only hide deleted forum topics by
default for HTML responses, not for API responses. Explicitly do
https://danbooru.donmai.us/forum_topics.json?search[is_deleted]=false
to filter out deleted topics.
* API change: on /tags, only hide empty tags by default for HTML
responses, not for API responses. Explicitly do
https://danbooru.donmai.us/tags.json?search[is_empty]=false to filter
out empty tags.
* API change: on /pools, default to 20 posts per page for API responses,
not 40.
* API change: add `search[is_empty]` param to /tags.json endpoint.
`search[hide_empty]=true` is deprecated in favor of `search[is_empty]=false`.
* On /pools, add option to show/hide deleted pools in search form.
* Fix the /forum_topics page putting `search[order]=sticky&limit=40` in
the URL when browsing past page 1.
Fix maintenance tasks failing to run in production. In production they
were losing the database connection and not re-establishing it, so they
couldn't queue jobs. `ApplicationRecord.verify!` will check if the
connection is lost and re-establish it if it is.
The database connection was being lost because in production we use a
Kubernetes service IP for the database IP, which is essentially a
virtual IP that maps to the real IP. This mapping is implemented with
IPVS[1][2], which has a default idle connection timeout of 5 minutes. If
the connection isn't used for more than 5 minutes, then it's closed.
Since maintenance only runs once an hour, the database connection would
be lost because it was idle for too long.
1: https://kubernetes.io/docs/concepts/services-networking/service/#proxy-mode-ipvs
2: https://kubernetes.io/blog/2018/07/09/ipvs-based-in-cluster-load-balancing-deep-dive/
Remove the ability for users to lock ratings, note, and post statuses.
Historically the majority of locked posts were from 10+ years ago when
certain users habitually locked ratings and notes on every post they
touched for no reason. Nowadays most posts have been unlocked. Only a
handful of locked posts are left, none of which deserve to be locked.
The is_rating_locked, is_note_locked, and is_status_locked columns still
exist in the database, but aren't used.
* Move maintenance.html.bak to maintenance.html.
* Add Github / Twitter / Discord icons to footer.
* Include some CSS to make it look more like the regular site.
* Auto-refresh every 10 seconds.
Add methods to MediaFile to calculate the duration, frame count, and
frame rate of animated GIFs, PNGs, Ugoiras, and videos.
Some considerations:
* It's possible to have a GIF or PNG that's technically animated but
just has one frame. These are treated as non-animated images.
* It's possible to have an animated GIF that has an unspecified
frame rate. In this case we assume the frame rate is 10 FPS; this is
browser dependent and may not be correct.
* Animated GIFs, PNGs, and Ugoiras all support variable frame rates.
Technically, each frame has a separate delay, and the delays can be
different frame-to-frame. We report only the average frame rate.
* Getting the duration of an APNG is surprisingly hard. Most tools don't
have good support for APNGs since it's a rare and non-standardized
format. The best we can do is get the frame count using ExifTool and the
frame rate using ffprobe, then calculate the duration from that.
Add the following:
* Container name, machine name, worker id.
* Container uptime, puma uptime, worker uptime.
* Number of requests processed by current worker.
* ExifTool version.
Also change /status page to show information in tables instead of lists.