Commit Graph

11293 Commits

Author SHA1 Message Date
evazion
71f42d67a7 tinami: return nothing if getting the full image fails.
Fix to make sure `image_urls` returns an empty array instead of `[nil]`
if grabbing the full image URL fails for whatever reason.
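
A minimal sketch of the `[nil]` pitfall described above, not the actual strategy code. `full_image_url` is a hypothetical stand-in for a lookup that returns nil on failure.

```ruby
# Hypothetical stand-in for a lookup that returns nil when it fails.
def full_image_url(page_html)
  page_html[/<img src="([^"]+)"/, 1]
end

def image_urls(page_html)
  # Array#compact drops nil, so a failed lookup yields [] rather than [nil].
  [full_image_url(page_html)].compact
end
```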
2022-03-19 23:42:34 -05:00
evazion
7f58cfbe5e tinami: get the full image.
Support grabbing the full image for Tinami uploads, rather than the sample.

Getting the full image requires making a request like this:

    curl -X POST \
    -H 'Referer: https://www.tinami.com/' \
    -H 'Content-Type: application/x-www-form-urlencoded' \
    -H 'Cookie: Tinami2SESSID=<redacted>;' \
    --data-raw 'action_view_original=true&cont_id=1087268&ethna_csrf=<redacted>' \
    https://www.tinami.com/view/1087268

Then scraping the <img> tag from the resulting HTML page.

If the post has multiple images, then we need to scrape and pass the
`sub_id` of the image too.

Fixes #2818.
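
The curl request above could be built in Ruby roughly as follows. This is a sketch under assumptions rather than the code from this commit: the helper name and its keyword arguments are hypothetical, and sending `sub_id` as a form field named `sub_id` is a guess based on the description above. The request is only built here, not sent.

```ruby
require "net/http"
require "uri"

# Builds (but does not send) the POST request for the full image page.
# session_id and csrf_token are placeholders the caller must supply.
def build_full_image_request(cont_id, session_id:, csrf_token:, sub_id: nil)
  uri = URI("https://www.tinami.com/view/#{cont_id}")
  request = Net::HTTP::Post.new(uri)
  request["Referer"] = "https://www.tinami.com/"
  request["Cookie"] = "Tinami2SESSID=#{session_id};"

  form = {
    "action_view_original" => "true",
    "cont_id" => cont_id,
    "ethna_csrf" => csrf_token
  }
  # Assumed field name; only relevant for posts with multiple images.
  form["sub_id"] = sub_id if sub_id

  # set_form_data urlencodes the body and sets the Content-Type header.
  request.set_form_data(form)
  request
end
```

The `<img>` tag would then be scraped from the HTML body of the response.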
2022-03-19 23:22:09 -05:00
evazion
0ddc09f011 forms: remove "Use * for wildcard" hints. 2022-03-19 21:14:37 -05:00
evazion
7dd6cf56ce Merge pull request #5051 from NamelessContributor/update-site-icons
url icons: replace some icons with better versions
2022-03-19 19:07:13 -05:00
evazion
f7cc941882 Merge pull request #5049 from NamelessContributor/show-hints-on-mobile
simple_form: Don't hide hints on mobile. Fix #4997
2022-03-19 19:02:46 -05:00
evazion
4215b77aaf Merge pull request #5048 from NamelessContributor/patch-1
editorconfig: add scss
2022-03-19 19:01:51 -05:00
evazion
01b683798e sources: add Tinami support. 2022-03-19 00:50:36 -05:00
evazion
40cbc0423c sources: add Instagram profile url normalization. 2022-03-18 18:20:29 -05:00
evazion
f0cd6227c3 tests: fix referer spoofing test. 2022-03-18 17:26:40 -05:00
evazion
c6e528a073 artstation: normalize https://artstation.com/artist/username/albums/all/ urls. 2022-03-18 17:10:26 -05:00
evazion
cc54b5f730 fanbox: normalize http://www.pixiv.net/fanbox/creator/3113804/post urls. 2022-03-18 17:10:26 -05:00
evazion
26d23c49d0 pawoo: normalize https://pawoo.net/users/evazion urls. 2022-03-18 17:10:26 -05:00
evazion
512c72bbc9 Show warnings in edit forms.
Make all edit forms show warnings in addition to errors. Also render
warnings and errors using DText.

Currently this only affects artists and wiki pages because they're the
only pages that have warnings.
2022-03-18 17:10:26 -05:00
evazion
912e996027 Fix #4470: Check URLs for duplicates when creating artists
Show a warning when creating a duplicate artist; that is, when adding a
URL that already belongs to another artist.

This is a soft warning rather than a hard error because there are some
cases where multiple artists legitimately share the same site or account.
2022-03-18 17:10:23 -05:00
evazion
a78b6528dc weibo: fix https://m.weibo.cn/detail/1234 urls not finding the artist.
Fix https://m.weibo.cn/detail/4506950043618873 type URLs not finding the
artist because the profile_url method returned nil instead of the actual
profile URL.
2022-03-18 06:01:51 -05:00
evazion
03d2098d6d artists: fix artist finder returning wrong results when given nil url.
Fix the artist finder returning incorrect results when given a nil URL.
This only happened when an artist with a URL like this existed:

    http:///blog.naver.com/dan_rak

Note the triple `///`; the extra `/` messed up the artist finder.

The artist finder may be given a nil URL when a source strategy returns
a nil profile URL, usually because the source is bad_id.
2022-03-18 06:01:36 -05:00
evazion
42144eaa4b Fix #5012: Fc2 image link paste not uploading.
Fix referer spoofing not working for certain fc2.com image URLs.

Spoofing the referer like this redirects to an HTML error page:

* curl -H "Referer: http://wwwew.web.fc2.com" http://wwwew.web.fc2.com/e/405.jpg

Spoofing it like this works:

* curl -H "Referer: http://wwwew.web.fc2.com/e/405.jpg" http://wwwew.web.fc2.com/e/405.jpg
2022-03-18 04:39:13 -05:00
evazion
c64df46de4 artists: make artist finder use url instead of normalized_url.
Make the artist finder search for artists using the `url` field instead
of the `normalized_url` field. This lets us get rid of `normalized_url`
in the future.

As described in 10dac3ee5, artist URLs have both a `url` column and a
`normalized_url` column. The `normalized_url` column was the one used
for artist finding. The `url` was secretly normalized behind the scenes
so that artist finding would work no matter how the URL was written in
the artist entry. This is no longer necessary now that URLs are directly
normalized in artist entries.

This fixes various cases where artist finding didn't work for non-obvious
reasons, usually because the URL wasn't written in the right format so
it wasn't properly normalized behind the scenes.

This also makes it so that artist finding is case-insensitive, which
fixes #4821. Hopefully no sites are perverse enough to allow two
different usernames that differ only in case.

Users running their own Danbooru instance may have to fix the URLs in
their artist entries for artist finding to work again. There are a few
fix scripts to help with this:

* script/fixes/104_normalize_weibo_artist_urls.rb
* script/fixes/105_normalize_pixiv_artist_urls.rb
* script/fixes/106_normalize_artist_urls.rb
2022-03-18 04:00:16 -05:00
evazion
10dac3ee51 artists: normalize urls added to artist entries.
When a URL is added to an artist entry, normalize it to a standard form.

Artist URLs have both a `url` column and a `normalized_url` column. The
`normalized_url` is used for artist finding and the `url` is the raw URL
entered by the user. Previously only the `normalized_url` field was
normalized; now the URL entered by the user is also converted to a
normalized form.

This means that if a URL like this is added to an artist entry:

* http://www.pixiv.net/member.php?id=1234
* http://www.pixiv.net/en/users/1234
* http://www.twitter.com/DanbooruBot/
* http://mobile.twitter.com/DanbooruBot/

It will get normalized to this:

* https://www.pixiv.net/users/1234
* https://twitter.com/DanbooruBot

This fixes problems with duplicate URLs being added to artist entries
because URLs weren't normalized to a single form.
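
As an illustration only, the four example URLs above could be mapped to their normalized forms like this. `normalize_artist_url` is a hypothetical helper covering just these formats; it is not the normalization code added in this commit.

```ruby
require "uri"

# Illustrative only: handles just the Pixiv and Twitter formats listed above
# and returns any other URL unchanged.
def normalize_artist_url(url)
  uri = URI.parse(url)
  host = uri.host.to_s.downcase
  path = uri.path.chomp("/")
  params = URI.decode_www_form(uri.query.to_s).to_h

  case host
  when "www.pixiv.net", "pixiv.net"
    id = params["id"] if path == "/member.php"
    id ||= path[%r{\A/(?:en/)?users/(\d+)\z}, 1]
    id ? "https://www.pixiv.net/users/#{id}" : url
  when "twitter.com", "www.twitter.com", "mobile.twitter.com"
    name = path[%r{\A/(\w+)\z}, 1]
    name ? "https://twitter.com/#{name}" : url
  else
    url
  end
end
```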
2022-03-18 02:06:50 -05:00
evazion
455ee9a52a fc2: parse more url types. 2022-03-18 02:06:30 -05:00
evazion
03d2a86ef1 artists: normalize fc2.com profile urls. 2022-03-17 19:42:57 -05:00
evazion
04c03fa4e6 artist: normalize more artist url formats. 2022-03-16 17:17:50 -05:00
NamelessContributor
c669b16c32 url icons: replace some icons with better versions
mangaupdates, nijie, sakura-ne-jp: replaced with higher resolution,
  the old icons look blurry on hidpi screens.
skeb, tumblr: replaced with transparent versions.
2022-03-15 16:12:08 +01:00
NamelessContributor
c3a5ce9019 simple_form: Don't hide hints on mobile. Fix #4997
Hints are displayed below their fields on small screens.
2022-03-15 14:39:37 +01:00
NamelessContributor
1a4d9dc7da editorconfig: add scss 2022-03-15 14:06:16 +01:00
evazion
ded03df1ff tests: fix more broken tests. 2022-03-15 05:14:56 -05:00
evazion
644dfaf74c tests: fix broken tests. 2022-03-15 04:45:30 -05:00
evazion
133c45ee29 sources: parse more profile url formats.
Add support for parsing these URL formats:

* https://www.artstation.com/felipecartin/profile
* https://www.deviantart.com/nlpsllp/gallery
* https://fantia.jp/asanagi
* https://www.lofter.com/front/blog/home-page/noshiqian
* https://www.lofter.com/app/xiaokonggedmx
* https://www.lofter.com/blog/semblance
* https://q.nicovideo.jp/users/18700356
* https://dic.nicovideo.jp/u/11141663
* https://3d.nicovideo.jp/users/109584
* https://3d.nicovideo.jp/u/siobi
* https://game.nicovideo.jp/atsumaru/users/7757217
* https://www.pixiv.net/user/13569921/series/81967
* https://pixiv.cc/zerousagi/
* https://www.plurk.com/u/ddks2923
* https://www.plurk.com/m/u/leiy1225
* https://www.plurk.com/s/u/salmonroe13
* https://www.plurk.com/RSSSww/invite/4
* https://skeb.jp/@okku_oxn/works
* https://www.tumblr.com/blog/view/artofelaineho/187614935612
* https://www.tumblr.com/blog/view/artofelaineho
* https://www.tumblr.com/blog/artofelaineho
* https://www.tumblr.com/dashboard/blog/dankwartart
* https://rosarrie.tumblr.com/archive
* https://whereisnovember.tumblr.com/tagged/art
* https://twitpic.com/photos/Type10TK
* https://www.weibo.com/detail/4676597657371957
* https://www.weibo.com/u/5957640693/home?wvr=5
* https://www.weibo.com/lvxiuzi0/home
2022-03-15 00:49:54 -05:00
evazion
04226d3409 pixiv: normalize pixiv urls in artist entries.
Normalize Pixiv URLs to `https://www.pixiv.net/users/1234` format.
2022-03-14 16:43:19 -05:00
evazion
223742c365 weibo: normalize weibo urls in artist entries.
Normalize all Weibo URLs in artist entries to one of these forms:

* https://www.weibo.com/u/5399876326
* https://www.weibo.com/p/1005055399876326
* https://www.weibo.com/chengziyou666
2022-03-13 21:16:56 -05:00
evazion
1d9a15a119 weibo: handle a couple more profile url types.
Parse these profile URL types:

* https://www.weibo.cn/endlessnsmt
* https://www.weibo.com/p/1005055399876326

Also add anchors around the regexes so they have to match the full string.
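
To show why the anchors matter, here is a small sketch with hypothetical patterns (not the regexes from the Weibo strategy). In Ruby, `\A` and `\z` match the start and end of the whole string, whereas `^` and `$` only match line boundaries.

```ruby
# Hypothetical patterns for illustration.
UNANCHORED = %r{https?://(?:www\.)?weibo\.com/p/(\d+)}
ANCHORED   = %r{\Ahttps?://(?:www\.)?weibo\.com/p/(\d+)\z}

# A Weibo profile URL embedded inside some other URL.
EMBEDDED = "https://example.com/?next=https://www.weibo.com/p/1005055399876326"
PROFILE  = "https://www.weibo.com/p/1005055399876326"

UNANCHORED.match?(EMBEDDED) # true: matches a substring
ANCHORED.match?(EMBEDDED)   # false: the full string must match
ANCHORED.match?(PROFILE)    # true
```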
2022-03-13 20:32:57 -05:00
evazion
be9ef0c49f artists: add m.weibo.cn urls to artist finder blacklist. 2022-03-13 03:54:17 -05:00
evazion
cf8b8207e2 artists: change how artist urls are normalized.
Change how artist URLs are normalized in artist entries. Don't try to secretly
convert image URLs to profile URLs in artist entries. For example, if someone puts a
Pixiv image URL in an artist entry, don't secretly try to fetch the source and
convert it into a profile URL in the `normalized_url` field.

We did this because years ago, it was standard practice to put image URLs in artist
entries. Pixiv image URLs used to contain the artist's username, so we used to put
image URLs in artist entries for artist finding purposes. But Pixiv changed it so
that image URLs no longer contained the username, so we dealt with it by adding a
`normalized_url` column to artist_urls and secretly converting image URLs to profile
URLs in this field. But this is no longer necessary because now we don't normally put
image URLs in artist entries in the first place.

Now the `profile_url` method in `Source::URL` is used to normalize URLs in artist
entries. This lets us parse various profile URL formats and normalize them into a
single canonical form.

This also removes the `normalize_for_artist_finder` method from source strategies.
Instead the `profile_url` method is used for artist finding purposes. So the profile
URL returned by the source strategy needs to be the same as the URL in the artist
entry in order for artist finding to work.
2022-03-13 03:54:17 -05:00
evazion
9343f7c912 Source::URL: add profile_url method.
Add a method for converting a source URL into a profile URL. This will
be used for normalizing profile URLs in artist entries.

Also add the ability to parse a few more profile URL formats.
2022-03-13 03:54:17 -05:00
evazion
787b5c8e27 sources: merge Sta.sh strategy into DeviantArt strategy.
This turns out to be a little simpler than keeping them separate. The
only thing special we have to do for Sta.sh is use the Sta.sh page when
we have a DeviantArt image with a Sta.sh referer.
2022-03-12 00:57:43 -06:00
evazion
f2028c14fb Fix #5045: Exception on uploads when SauceNAO is the referrer URL.
Bug: We assumed the referer URL was from the same site as the target
URL. We tried to call methods on the referer only supported by the
target URL.

Fix: Ignore the referer URL when it's from a different site than the
target URL.
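
A rough sketch of the fix, under stated assumptions: `site_key` and `effective_referer` are hypothetical helpers, and comparing the last two host labels is a crude stand-in for whatever "same site" check the real code performs.

```ruby
require "uri"

# Crude site key: the last two labels of the host, e.g. "deviantart.com".
def site_key(url)
  host = URI.parse(url).host
  host && host.downcase.split(".").last(2).join(".")
rescue URI::InvalidURIError
  nil
end

# Returns the referer only when it belongs to the same site as the target.
def effective_referer(target_url, referer_url)
  return nil if referer_url.nil?

  key = site_key(target_url)
  key && key == site_key(referer_url) ? referer_url : nil
end
```

With this check, a SauceNAO referer on a DeviantArt upload is dropped instead of being handed to DeviantArt-specific methods.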
2022-03-12 00:04:39 -06:00
evazion
28971fe103 sources: factor out site_name method. 2022-03-11 23:20:53 -06:00
evazion
b4aea72d04 sources: remove preview_urls method from base strategy.
Remove the `preview_urls` method from strategies. The only place this was used was
when doing IQDB searches, to download the thumbnail image from the source instead of
the full image.

This wasn't worth it for a few reasons:

* Thumbnails on other sites are sometimes not the size we want, which could affect
  IQDB results.
* Grabbing thumbnails is complex for some sites. You can't always just rewrite the
  image URL. Sometimes it requires extra API calls, which can be slower than just
  grabbing the full image.
* For videos and animations, thumbnails from other sites don't always match our
  thumbnails. We do smart thumbnail generation to try to avoid blank thumbnails, which
  means we don't always pick the first frame, which could affect IQDB results.

API changes:

* /iqdb_queries?search[file_url] now downloads the URL as is without any modification.
  Before it tried to change thumbnail and sample size image URLs to the full version.

* /iqdb_queries?search[url] now returns an error if the URL is for an HTML page that
  contains multiple images. Before it would grab only the first image and silently
  ignore the rest.
2022-03-11 03:22:23 -06:00
evazion
2f61486ac6 sources: remove image_url method from base strategy.
Remove the `image_url` method from source strategies. This method would
return only the first image if a source had multiple images. The
`image_urls` method should be used instead. Tests were the main place
that still used `image_url` instead of `image_urls`.

Also make post replacements return an error if replacing with a source
that contains multiple images, instead of just blindly replacing the
post with the first image in the source.
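
The replacement behavior described above might look like the following. This is a sketch, assuming a hypothetical `MultipleImagesError` and helper name; the actual error handling in the codebase may differ.

```ruby
# Hypothetical error class for illustration.
class MultipleImagesError < StandardError; end

# Returns the single image URL, or raises if the source has more than one
# image instead of silently picking the first.
def replacement_image_url(image_urls)
  if image_urls.size > 1
    raise MultipleImagesError, "source contains #{image_urls.size} images"
  end

  image_urls.first
end
```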
2022-03-11 01:59:21 -06:00
evazion
4701027f45 sources: remove unused methods from base strategy.
Remove unused `urls`, `parsed_urls`, and `domains` methods.
2022-03-10 23:11:00 -06:00
evazion
2460ac0927 Merge pull request #5044 from nonamethanks/modqueue-thumb-size
Modqueue: support variable size thumbnails
2022-03-10 15:32:20 -06:00
evazion
5016d9ad26 Merge pull request #5043 from nonamethanks/fantia-support
Add Fantia support
2022-03-10 15:21:03 -06:00
evazion
8bba1b0a54 weibo: add test for m.weibo.cn/detail urls. 2022-03-10 15:04:02 -06:00
evazion
29fc072cf1 Merge pull request #5042 from nonamethanks/weibo-fix-typo
weibo: fix typo in strategy
2022-03-10 15:01:12 -06:00
evazion
0252b3608c Merge pull request #5041 from NamelessContributor/fix-mode-menu
post_mode_menu: update preview link selector
2022-03-10 14:58:21 -06:00
nonamethanks
5b5f61c2ea Modqueue: support variable size thumbnails 2022-03-10 20:39:45 +01:00
nonamethanks
a6549bc6fe Add Fantia support
Also fixes a regression in 74fdeef10c that stopped Mastodon URLs from
being given the right priority.
2022-03-10 17:43:32 +01:00
evazion
43a665a66d sources: factor out Source::URL::NicoSeiga. 2022-03-10 04:53:51 -06:00
nonamethanks
93adba06e5 weibo: fix typo in strategy 2022-03-10 08:31:23 +01:00
evazion
34854185be sources: factor out Source::URL::DeviantArt and Source::URL::Stash. 2022-03-10 00:29:49 -06:00