artists: fix misnormalization of emoji in other names.
Fix `normalize_whitespace` so it no longer strips zero-width joiner characters (U+200D). These characters join multiple emoji into a single emoji sequence, so stripping them broke artist other names that contain such emoji.
@@ -46,8 +46,9 @@ module Danbooru
 # Normalize various horizontal space characters to ASCII space.
 text = gsub(/\p{Zs}|\t/, " ")

-# Strip various zero width space characters.
-text = text.gsub(/[\u180E\u200B\u200C\u200D\u2060\uFEFF]/, "")
+# Strip various zero width space characters. Zero width joiner (200D)
+# is allowed because it's used in emoji.
+text = text.gsub(/[\u180E\u200B\u200C\u2060\uFEFF]/, "")

 # Normalize various line ending characters to CRLF.
 text = text.gsub(/\r?\n|\r|\v|\f|\u0085|\u2028|\u2029/, "\r\n")
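To illustrate the effect of the change, here is a minimal standalone sketch of the three normalization steps shown in the diff. The method name `normalize_whitespace_sketch` and the sample input are assumptions made for illustration; the real `normalize_whitespace` lives in a Danbooru module and may do more than this.

```ruby
# Hypothetical standalone version of the steps in the diff above.
# Not the actual Danbooru method; written only to show the behavior.
def normalize_whitespace_sketch(text)
  # Normalize various horizontal space characters to ASCII space.
  text = text.gsub(/\p{Zs}|\t/, " ")

  # Strip zero-width characters, but keep the zero-width joiner (U+200D).
  text = text.gsub(/[\u180E\u200B\u200C\u2060\uFEFF]/, "")

  # Normalize various line ending characters to CRLF.
  text.gsub(/\r?\n|\r|\v|\f|\u0085|\u2028|\u2029/, "\r\n")
end

# WOMAN + ZWJ + ARTIST PALETTE forms a single "woman artist" emoji.
emoji = "\u{1F469}\u200D\u{1F3A8}"

# Sample input: the emoji, a zero-width space, an ideographic space, then text.
normalized = normalize_whitespace_sketch("#{emoji}\u200B\u3000foo")
```

With the old character class, which included `\u200D`, the joiner would be removed and the sequence would fall apart into two separate emoji. With the new one the emoji survives intact, while the zero-width space is still stripped and the ideographic space still becomes an ASCII space.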