artists: normalize urls added to artist entries.

When a URL is added to an artist entry, normalize it to a standard form.

Artist URLs have both a `url` column and a `normalized_url` column. The
`normalized_url` is used for artist finding and the `url` is the raw URL
entered by the user. Previously only the `normalized_url` field was
normalized; now the URL entered by the user is also converted to a
normalized form.

This means that if any of these URLs is added to an artist entry:

* http://www.pixiv.net/member.php?id=1234
* http://www.pixiv.net/en/users/1234
* http://www.twitter.com/DanbooruBot/
* http://mobile.twitter.com/DanbooruBot/

It will be normalized to one of these:

* https://www.pixiv.net/users/1234
* https://twitter.com/DanbooruBot

This fixes problems with duplicate URLs being added to artist entries
because URLs weren't normalized to a single form.
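
The mapping described above can be illustrated with a minimal, self-contained sketch. It is not Danbooru's `Source::URL` implementation: `normalize_artist_url` is a hypothetical helper, and the host and path patterns cover only the four example URLs listed in this message.

```ruby
require "uri"

# Hypothetical helper (not Danbooru's Source::URL): maps a few known artist
# URL shapes to one canonical profile URL, and returns the input unchanged
# when the URL is not recognized.
def normalize_artist_url(url)
  uri = URI.parse(url)
  host = uri.host.to_s.downcase
  path = uri.path.to_s.chomp("/")
  query = URI.decode_www_form(uri.query.to_s).to_h

  case host
  when "www.pixiv.net", "pixiv.net"
    # Old-style /member.php?id=N and localized /en/users/N both map to /users/N.
    id = query["id"] if path == "/member.php"
    id ||= path[%r{\A(?:/en)?/users/(\d+)\z}, 1]
    return "https://www.pixiv.net/users/#{id}" if id
  when "twitter.com", "www.twitter.com", "mobile.twitter.com"
    # Drop the www./mobile. subdomain and any trailing slash.
    name = path[%r{\A/(\w+)\z}, 1]
    return "https://twitter.com/#{name}" if name
  end

  url
rescue URI::InvalidURIError
  # Unparseable input is kept as entered.
  url
end

puts normalize_artist_url("http://www.pixiv.net/member.php?id=1234")
puts normalize_artist_url("http://mobile.twitter.com/DanbooruBot/")
```

Because several input forms collapse to a single string, a uniqueness check on the stored URL can catch duplicates that would otherwise differ only by scheme, subdomain, or trailing slash.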
evazion
2022-03-18 01:47:25 -05:00
parent 455ee9a52a
commit 10dac3ee51
5 changed files with 42 additions and 16 deletions

@@ -103,7 +103,7 @@ class ArtistURL < ApplicationRecord
   end
   def self.normalize_url(url)
-    Danbooru::URL.parse(url)&.to_normalized_s.presence || url
+    Source::URL.parse(url)&.profile_url || Danbooru::URL.parse(url)&.to_normalized_s || url
   end
   def url=(url)