Commit Graph

8 Commits

Author SHA1 Message Date
nonamethanks
e1b9166a56 Sources: do not use an empty else in case blocks 2022-04-22 03:53:18 +02:00
evazion
d96db350f3 pixiv: fix non-www Pixiv urls not being recognized.
Fix non-www Pixiv URLs (e.g. `https://pixiv.net/users/3584828`) not
being recognized by the URL parser.
2022-04-03 03:07:42 -05:00
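
The fix in d96db350f3 amounts to treating `pixiv.net` and `www.pixiv.net` as the same site. A minimal sketch of that idea in Python follows; the function name `pixiv_user_id` and the path pattern are assumptions made for illustration, not Danbooru's actual parser.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Hosts treated as the main Pixiv site (assumed list, for illustration only).
PIXIV_HOSTS = ("pixiv.net", "www.pixiv.net")


def pixiv_user_id(url: str) -> Optional[str]:
    """Return the user ID from a Pixiv profile URL, or None if it is not one.

    Hypothetical helper: accepts both the www and the non-www host.
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host not in PIXIV_HOSTS:
        return None
    match = re.fullmatch(r"/(?:en/)?users/(\d+)/?", parsed.path)
    return match.group(1) if match else None
```

With this sketch, `pixiv_user_id("https://pixiv.net/users/3584828")` and the same URL with a `www.` prefix both yield the same ID.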
evazion
a272c19b98 Fix #5078: Pixiv booth upload broken.
Allow image URLs from https://booth.pximg.net to be uploaded. Fix bug
where Booth.pm URLs were incorrectly caught by the Pixiv extractor.
2022-03-30 03:25:42 -05:00
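
One way to keep Booth URLs away from the Pixiv extractor is to check the more specific Booth hosts before the broader Pixiv ones. The sketch below illustrates that ordering only. The function `pick_extractor`, the string labels it returns, and the exact host lists are assumptions, and the commit does not say this is how the bug was actually fixed.

```python
from urllib.parse import urlparse


def pick_extractor(url: str) -> str:
    """Pick an extractor label for a URL by host (hypothetical dispatch)."""
    host = (urlparse(url).hostname or "").lower()

    # Check Booth first: booth.pximg.net would otherwise match the
    # generic "*.pximg.net" rule used for Pixiv images below.
    if host in ("booth.pm", "booth.pximg.net") or host.endswith(".booth.pm"):
        return "booth"

    if host == "pixiv.net" or host.endswith((".pixiv.net", ".pximg.net")):
        return "pixiv"

    return "unknown"
```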
evazion
10dac3ee51 artists: normalize urls added to artist entries.
When a URL is added to an artist entry, normalize it to a standard form.

Artist URLs have both a `url` column and a `normalized_url` column. The
`normalized_url` is used for artist finding and the `url` is the raw URL
entered by the user. Previously only the `normalized_url` field was
normalized; now the URL entered by the user is also converted to a
normalized form.

This means that if a URL like one of these is added to an artist entry:

* http://www.pixiv.net/member.php?id=1234
* http://www.pixiv.net/en/users/1234
* http://www.twitter.com/DanbooruBot/
* http://mobile.twitter.com/DanbooruBot/

It will get normalized to this:

* https://www.pixiv.net/users/1234
* https://twitter.com/DanbooruBot

This fixes problems with duplicate URLs being added to artist entries
because URLs weren't normalized to a single form.
2022-03-18 02:06:50 -05:00
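
The mapping described in 10dac3ee51 can be sketched roughly as below. This is an illustration in Python covering only the four example URLs from the commit message. The function name `normalize_artist_url` and the choice to return unrecognized URLs unchanged are assumptions, not the project's actual behavior.

```python
import re
from urllib.parse import parse_qs, urlparse


def normalize_artist_url(url: str) -> str:
    """Convert a profile URL to one canonical form (hypothetical sketch)."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    path = parsed.path

    if host in ("pixiv.net", "www.pixiv.net"):
        # Old style: /member.php?id=1234
        if path == "/member.php":
            ids = parse_qs(parsed.query).get("id", [])
            if ids and ids[0].isdigit():
                return f"https://www.pixiv.net/users/{ids[0]}"
        # New style: /users/1234 or /en/users/1234
        match = re.fullmatch(r"/(?:en/)?users/(\d+)/?", path)
        if match:
            return f"https://www.pixiv.net/users/{match.group(1)}"

    if host in ("twitter.com", "www.twitter.com", "mobile.twitter.com"):
        match = re.fullmatch(r"/(\w+)/?", path)
        if match:
            return f"https://twitter.com/{match.group(1)}"

    # Assumption: anything not recognized is left as entered.
    return url
```

Because every variant collapses to one string, a duplicate check can be done with a plain equality comparison on the normalized value.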
evazion
133c45ee29 sources: parse more profile url formats.
Add support for parsing these URL formats:

* https://www.artstation.com/felipecartin/profile
* https://www.deviantart.com/nlpsllp/gallery
* https://fantia.jp/asanagi
* https://www.lofter.com/front/blog/home-page/noshiqian
* https://www.lofter.com/app/xiaokonggedmx
* https://www.lofter.com/blog/semblance
* https://q.nicovideo.jp/users/18700356
* https://dic.nicovideo.jp/u/11141663
* https://3d.nicovideo.jp/users/109584
* https://3d.nicovideo.jp/u/siobi
* https://game.nicovideo.jp/atsumaru/users/7757217
* https://www.pixiv.net/user/13569921/series/81967
* https://pixiv.cc/zerousagi/
* https://www.plurk.com/u/ddks2923
* https://www.plurk.com/m/u/leiy1225
* https://www.plurk.com/s/u/salmonroe13
* https://www.plurk.com/RSSSww/invite/4
* https://skeb.jp/@okku_oxn/works
* https://www.tumblr.com/blog/view/artofelaineho/187614935612
* https://www.tumblr.com/blog/view/artofelaineho
* https://www.tumblr.com/blog/artofelaineho
* https://www.tumblr.com/dashboard/blog/dankwartart
* https://rosarrie.tumblr.com/archive
* https://whereisnovember.tumblr.com/tagged/art
* https://twitpic.com/photos/Type10TK
* https://www.weibo.com/detail/4676597657371957
* https://www.weibo.com/u/5957640693/home?wvr=5
* https://www.weibo.com/lvxiuzi0/home
2022-03-15 00:49:54 -05:00
evazion
9343f7c912 Source::URL: add profile_url method.
Add a method for converting a source URL into a profile URL. This will
be used for normalizing profile URLs in artist entries.

Also add the ability to parse a few more profile URL formats.
2022-03-13 03:54:17 -05:00
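
As a rough illustration of what converting a source URL into a profile URL can look like, the Python sketch below handles a single case, a Twitter status URL. The Python function here is hypothetical and only shares its name with the `profile_url` method mentioned in 9343f7c912. The real method lives in `Source::URL` and its rules are not shown in this log.

```python
import re
from typing import Optional
from urllib.parse import urlparse

TWITTER_HOSTS = ("twitter.com", "www.twitter.com", "mobile.twitter.com")


def profile_url(source_url: str) -> Optional[str]:
    """Derive the artist's profile URL from a source URL, if possible.

    Hypothetical sketch: only Twitter profile and status URLs are handled.
    """
    parsed = urlparse(source_url)
    host = (parsed.hostname or "").lower()

    if host in TWITTER_HOSTS:
        # Matches /<user> and /<user>/status/<id>, with an optional trailing slash.
        match = re.fullmatch(r"/(\w+)(?:/status/\d+)?/?", parsed.path)
        if match:
            return f"https://twitter.com/{match.group(1)}"

    return None
```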
evazion
52a2d3418c pixiv: fixup bugs in 1c620f805.
* Fix error when uploading non-ugoira files.
* Fix sample image URLs not being rewritten to full images correctly. We
  have to get the full image URL from the API because given an
  /img-master/ URL, we don't know what the original file extension is.
2022-03-08 23:07:24 -06:00
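
The second point in 52a2d3418c can be illustrated with a small Python sketch. An `/img-master/` URL carries the work ID and page number, but the sample always ends in `.jpg`, so the original extension cannot be read off the URL. The best a pure URL rewrite can do is produce a list of guesses, which is why asking the API is the reliable route. The function name, the regex, and the list of candidate extensions are assumptions for illustration.

```python
import re
from typing import List

# Assumed shape of a Pixiv sample URL; the date part is six numeric segments.
MASTER_RE = re.compile(
    r"^https://i\.pximg\.net/img-master/img/"
    r"(?P<date>\d{4}/\d{2}/\d{2}/\d{2}/\d{2}/\d{2})/"
    r"(?P<illust_id>\d+)_p(?P<page>\d+)_master1200\.jpg$"
)


def candidate_full_image_urls(sample_url: str) -> List[str]:
    """Return guesses for the full image URL of a sample URL.

    Hypothetical helper: the real extension is unknown, so one candidate
    is produced per assumed extension.
    """
    match = MASTER_RE.match(sample_url)
    if match is None:
        return []
    base = (
        "https://i.pximg.net/img-original/img/"
        f"{match['date']}/{match['illust_id']}_p{match['page']}"
    )
    return [f"{base}.{ext}" for ext in ("jpg", "png", "gif")]
```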
evazion
1c620f8055 sources: factor out Source::URL::Pixiv.
* Drop support for preview_urls. This means that IQDB lookups may be
  slower, especially for ugoiras, since we have to download the full
  ugoira now. However, ugoira lookups should produce better results,
  since the ugoira thumbnail chosen by Pixiv wasn't necessarily the same
  as the thumbnail chosen by Danbooru.

* Drop support for uploading single manga pages:

    http://www.pixiv.net/member_illust.php?mode=manga_big&illust_id=18557054&page=2

  Previously uploading a URL like this would only upload a single image
  out of a multi-image work. Now it will upload all images in the work.
  Pixiv no longer supports URLs like this, so we don't either.

* Add support for parsing URLs like this:

    https://i.pximg.net/c/360x360_70/custom-thumb/img/2022/03/08/00/00/56/96755248_p0_custom1200.jpg

  Apparently artists can choose a custom thumbnail now (not like anyone
  will try to upload one though).
2022-03-08 22:17:38 -06:00
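
For the custom thumbnail format mentioned in 1c620f8055, a parser mainly needs to pull the work ID and page number out of the filename. A minimal Python sketch, assuming the URL shape shown in the commit message; the name `parse_custom_thumb` and the regex are illustrative and not taken from `Source::URL::Pixiv`.

```python
import re
from typing import Optional, Tuple

# Six numeric date segments, then "<illust_id>_p<page>_custom1200.jpg".
CUSTOM_THUMB_RE = re.compile(
    r"/custom-thumb/img/(?:\d+/){6}"
    r"(?P<illust_id>\d+)_p(?P<page>\d+)_custom1200\.jpg$"
)


def parse_custom_thumb(url: str) -> Optional[Tuple[int, int]]:
    """Return (illust_id, page) for a custom-thumb URL, else None."""
    match = CUSTOM_THUMB_RE.search(url)
    if match is None:
        return None
    return int(match["illust_id"]), int(match["page"])
```

Applied to the example URL in the commit message, this sketch returns `(96755248, 0)`.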