twitter: normalize artist commentaries to nfkc (#3719)

Fixes hashtags not being interpreted when the author uses a fullwidth
number sign (＃, U+FF03).

ref: https://github.com/r888888888/danbooru/issues/3719#issuecomment-419535610
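
The effect of the fix can be sketched in isolation. The snippet below is only an illustration, assuming a standalone helper named `link_hashtags` (a hypothetical name, not a method in the Danbooru source) that reuses the hashtag regex from the diff. It shows that a fullwidth number sign is not matched until the text has been NFKC-normalized.

```ruby
# Hypothetical helper for illustration only; the real code lives in
# Sources::Strategies and does more (HTML unescaping, t.co expansion).
HASHTAG = %r!#([^[:space:]]+)!

def link_hashtags(text)
  # NFKC folds compatibility characters such as U+FF03 (fullwidth
  # number sign) into their ASCII equivalents, here "#".
  normalized = text.unicode_normalize(:nfkc)
  normalized.gsub(HASHTAG, '"#\\1":[https://twitter.com/hashtag/\\1]')
end

raw = "new art \uFF03cosplay"
puts raw.match?(HASHTAG) # the fullwidth sign is not an ASCII "#"
puts link_hashtags(raw)
```

Note that NFKC also folds other compatibility characters (fullwidth letters and digits, for example), so normalization can change more of the commentary text than just the hashtag marker.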
evazion
2018-09-10 21:45:50 -05:00
parent a0ebd90409
commit 9a980367f6
2 changed files with 21 additions and 1 deletions

@@ -106,7 +106,7 @@ module Sources::Strategies
end
url_replacements = url_replacements.to_h
-desc = artist_commentary_desc
+desc = artist_commentary_desc.unicode_normalize(:nfkc)
desc = CGI::unescapeHTML(desc)
desc = desc.gsub(%r!https?://t\.co/[a-zA-Z0-9]+!i, url_replacements)
desc = desc.gsub(%r!#([^[:space:]]+)!, '"#\\1":[https://twitter.com/hashtag/\\1]')