posts: normalize Unicode to NFC form in post sources.

Fix strings like "pokémon" (NFD form) and "pokémon" (NFC form) being
considered different strings in sources.
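
For illustration, a minimal standalone sketch of the mismatch using Ruby's built-in `String#unicode_normalize`. The constants and the `same_source?` helper are hypothetical names chosen for this example and are not part of the commit:

```ruby
# Illustrative strings only; the escapes spell out the two encodings of "é".
NFD_SOURCE = "poke\u0301mon" # "e" followed by U+0301 COMBINING ACUTE ACCENT
NFC_SOURCE = "pok\u00E9mon"  # single precomposed U+00E9

# Hypothetical helper: compares two sources after normalizing both to NFC.
def same_source?(a, b)
  a.unicode_normalize(:nfc) == b.unicode_normalize(:nfc)
end

puts NFD_SOURCE == NFC_SOURCE              # false: the code points differ
puts same_source?(NFD_SOURCE, NFC_SOURCE)  # true: equal once both are NFC
```

Both strings render identically, which is why the difference is easy to miss when comparing sources by eye.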

Also add a fix script to fix existing sources. There were only 15 posts
with unnormalized sources.
evazion
2022-01-31 10:56:27 -06:00
parent 0132c5f0a5
commit 61c043c6b1
4 changed files with 26 additions and 33 deletions

@@ -0,0 +1,14 @@
#!/usr/bin/env ruby
require_relative "base"
with_confirmation do
  CurrentUser.scoped(User.system, "127.0.0.1") do
    Post.where("source ~ '[^[:ascii:]]'").find_each do |post|
      next if post.source.unicode_normalize(:nfc) == post.source
      post.update!(source: post.source)
      puts({ id: post.id, old_source: post.source_before_last_save, new_source: post.source })
    end
  end
end
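
The selection logic of the fix script can be tried outside Rails with a rough sketch like the one below. It assumes plain strings in place of `Post` records; `SAMPLE_SOURCES`, `needs_normalization?`, and `normalize_sources` are hypothetical names, and `ascii_only?` only approximates the script's SQL prefilter:

```ruby
# Hypothetical stand-ins for post sources; not data from the real database.
SAMPLE_SOURCES = [
  "https://example.com/plain-ascii",
  "https://example.com/pok\u00E9mon",  # already in NFC form
  "https://example.com/poke\u0301mon", # NFD form, needs fixing
].freeze

# Mirrors the script's two checks: a cheap non-ASCII prefilter,
# then a comparison against the NFC form.
def needs_normalization?(source)
  return false if source.ascii_only?
  !source.unicode_normalized?(:nfc)
end

# Returns a hash mapping each unnormalized source to its NFC form.
def normalize_sources(sources)
  sources.select { |source| needs_normalization?(source) }
         .to_h { |source| [source, source.unicode_normalize(:nfc)] }
end

normalize_sources(SAMPLE_SOURCES).each do |old_source, new_source|
  puts({ old_source: old_source, new_source: new_source })
end
```

Prefiltering on non-ASCII characters keeps the expensive normalization check off the vast majority of sources, since a pure-ASCII string is already in NFC form.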