62 Commits

Author SHA1 Message Date
evazion
644dfaf74c tests: fix broken tests. 2022-03-15 04:45:30 -05:00
evazion
cf8b8207e2 artists: change how artist urls are normalized.
Change how artist URLs are normalized in artist entries. Don't try to secretly
convert image URLs to profile URLs in artist entries. For example, if someone puts a
Pixiv image URL in an artist entry, don't secretly try to fetch the source and
convert it into a profile URL in the `normalized_url` field.

We used to do this because years ago, it was standard practice to put image URLs in
artist entries. Pixiv image URLs used to contain the artist's username, so we put image
URLs in artist entries for artist finding purposes. When Pixiv changed image URLs so
that they no longer contained the username, we dealt with it by adding a
`normalized_url` column to artist_urls and secretly converting image URLs to profile
URLs in this field. This is no longer necessary because we don't normally put image
URLs in artist entries in the first place.

Now the `profile_url` method in `Source::URL` is used to normalize URLs in artist
entries. This lets us parse various profile URL formats and normalize them into a
single canonical form.

This also removes the `normalize_for_artist_finder` method from source strategies.
Instead the `profile_url` method is used for artist finding purposes. So the profile
URL returned by the source strategy needs to be the same as the URL in the artist
entry in order for artist finding to work.
2022-03-13 03:54:17 -05:00
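
The normalization described in the commit above can be illustrated with a minimal sketch. The `HypotheticalProfileUrl` class, its path patterns, and the `example.com` URLs are invented for illustration; this is not the actual `Source::URL` code, only one way several profile URL formats might be reduced to a single canonical form.

```ruby
require "uri"

# Hypothetical sketch: normalize several profile URL variants to one canonical form.
class HypotheticalProfileUrl
  # Path patterns that all identify a user by numeric ID (assumed for illustration).
  PATTERNS = [
    %r{\A/user/illust/(?<id>\d+)/?\z},
    %r{\A/user/(?<id>\d+)/?\z},
    %r{\A/u/(?<id>\d+)/?\z}
  ].freeze

  def initialize(url)
    @uri = URI.parse(url)
  end

  # Returns the canonical profile URL, or nil if this is not a recognized profile URL.
  def profile_url
    return nil unless @uri.host.to_s.match?(/\A(www\.)?example\.com\z/i)

    PATTERNS.each do |pattern|
      match = pattern.match(@uri.path)
      return "https://example.com/user/#{match[:id]}" if match
    end

    nil
  end
end
```

Because artist finding compares the profile URL from the source strategy with the URL stored in the artist entry, both sides have to pass through the same normalization for the strings to be equal.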
evazion
f2028c14fb Fix #5045: Exception on uploads when SauceNAO is the referrer URL.
Bug: We assumed the referer URL was from the same site as the target
URL. We tried to call methods on the referer only supported by the
target URL.

Fix: Ignore the referer URL when it's from a different site than the
target URL.
2022-03-12 00:04:39 -06:00
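
A rough sketch of the fix described above, assuming a hypothetical `same_site_referer` helper that is not part of the Danbooru codebase. It treats two URLs as the same site when the last two labels of their hostnames agree, which is a simplification (it mishandles suffixes such as `co.jp`).

```ruby
require "uri"

# Hypothetical helper: return the referer only if it belongs to the same
# site as the target URL; otherwise return nil so callers ignore it.
def same_site_referer(url, referer)
  return nil if referer.nil? || referer.empty?

  url_site = site_of(url)
  referer_site = site_of(referer)
  return nil if url_site.nil? || url_site != referer_site

  referer
end

# Hypothetical helper: the last two labels of the hostname, e.g. "example.com".
def site_of(url)
  host = URI.parse(url).host
  return nil if host.nil?

  host.downcase.split(".").last(2).join(".")
rescue URI::InvalidURIError
  nil
end
```

With this shape, a SauceNAO referer paired with an image URL from another site yields `nil`, so no site-specific methods are ever called on it.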
evazion
28971fe103 sources: factor out site_name method. 2022-03-11 23:20:53 -06:00
evazion
b4aea72d04 sources: remove preview_urls method from base strategy.
Remove the `preview_urls` method from strategies. The only place this was used was
when doing IQDB searches, to download the thumbnail image from the source instead of
the full image.

This wasn't worth it for a few reasons:

* Thumbnails on other sites are sometimes not the size we want, which could affect
  IQDB results.
* Grabbing thumbnails is complex for some sites. You can't always just rewrite the
  image URL. Sometimes it requires extra API calls, which can be slower than just
  grabbing the full image.
* For videos and animations, thumbnails from other sites don't always match our
  thumbnails. We do smart thumbnail generation to try to avoid blank thumbnails, which
  means we don't always pick the first frame, which could affect IQDB results.

API changes:

* /iqdb_queries?search[file_url] now downloads the URL as is without any modification.
  Before it tried to change thumbnail and sample size image URLs to the full version.

* /iqdb_queries?search[url] now returns an error if the URL is for an HTML page that
  contains multiple images. Before it would grab only the first image and silently
  ignore the rest.
2022-03-11 03:22:23 -06:00
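
The second API change above can be sketched as a small guard. `HypotheticalMultipleImagesError` and `single_image_url` are illustrative names only and are not taken from the Danbooru source; the sketch just shows the difference between failing loudly and silently taking the first image.

```ruby
# Hypothetical error raised when a page resolves to more than one image.
class HypotheticalMultipleImagesError < StandardError; end

# Return the only image URL, or raise if the page contains several.
# Returns nil when the list is empty.
def single_image_url(image_urls)
  if image_urls.size > 1
    raise HypotheticalMultipleImagesError, "page contains #{image_urls.size} images"
  end

  image_urls.first
end
```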
evazion
43a665a66d sources: factor out Source::URL::NicoSeiga. 2022-03-10 04:53:51 -06:00
evazion
7b009cc893 nicoseiga: fix inability to login to nicoseiga.
NicoSeiga changed it so that on every login, you must enter a 2FA code
sent by email. This broke the NicoSeiga strategy. The fix is to just use
a static session cookie instead (and hope it doesn't expire, and isn't
tied to an IP).

The `nico_seiga_login` and `nico_seiga_password` config settings have
been removed from config/danbooru_default_config.rb and replaced by
`nico_seiga_user_session`. If you run your own Danbooru instance, you
will have to update your config file manually.
2022-02-22 12:23:01 -06:00
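
For illustration, a static session cookie could be attached to a request roughly as follows. This sketch uses Ruby's standard `Net::HTTP` rather than `Danbooru::Http`, and both the `user_session` cookie name and the `build_request_with_session` helper are assumptions made for the example, not confirmed by the commit.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: build a GET request that carries a static session
# cookie instead of logging in with a username and password.
# The cookie name "user_session" is an assumption.
def build_request_with_session(url, user_session)
  request = Net::HTTP::Get.new(URI.parse(url))
  request["Cookie"] = "user_session=#{user_session}"
  request
end
```

The session value would come from the `nico_seiga_user_session` config setting. As the commit notes, this only works as long as the session doesn't expire.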
evazion
b6538fde38 uploads: fix NicoSeiga sources not working.
Fix uploads for NicoSeiga sources not working because the strategy
returned URLs like the one below in the list of image_urls, which
require a login to download:

    https://seiga.nicovideo.jp/image/source/10315315

Also fix certain URLs like https://dic.nicovideo.jp/oekaki/52833.png not
working, because they didn't contain an image ID and the image_urls
method returned an empty list in this case.
2022-02-15 17:12:02 -06:00
evazion
a7dc05ce63 Enable frozen string literals.
Make all string literals immutable by default.
2021-12-14 21:33:27 -06:00
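
Frozen string literals are typically enabled per file with the `# frozen_string_literal: true` magic comment; whether this commit used that mechanism is an assumption. The sketch below uses an explicit `.freeze` to show the same effect without depending on where the comment is placed. `append_domain` is a hypothetical helper.

```ruby
# Hypothetical helper that mutates its argument in place.
def append_domain(name)
  name << ".nicovideo.jp"
end

# A frozen string rejects in-place mutation with FrozenError:
#   append_domain("seiga".freeze)
#
# Unary plus returns a mutable string, which is the usual idiom when a
# buffer has to be built up in code where literals are frozen:
#   append_domain(+"seiga")
```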
nonamethanks
cb6196c259 Nicoseiga: auto-add spoiler tags to commentary 2021-04-06 14:08:49 +02:00
nonamethanks
3179509791 Uploads: Check if strategy is enabled before use
Avoid returning bare API tracebacks from pixiv et al when login details
are not configured, and instead raise a generic error.
2020-07-11 04:56:46 +02:00
evazion
44f826d8fa nicoseiga: optimize image_url method.
The image_url method makes a request to `https://seiga.nicovideo.jp/image/source/:image_id`
to see where this URL redirects to. Previously we did a GET request, which downloaded
the full image. This could fail with a timeout error if the download took too long. We also
cached the request, which caused the full image to be cached even though we only need the
headers. Change it to a HEAD request so we don't have to download the entire image just to
check the URL.
2020-06-24 22:54:04 -05:00
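
The idea in the commit above can be sketched with Ruby's standard `Net::HTTP` library. The actual code uses `Danbooru::Http`, so this is a swapped-in illustration; `build_head_request` and `redirect_target` are hypothetical names.

```ruby
require "net/http"
require "uri"

# Build a HEAD request for the given URL. A HEAD response carries headers
# only, so the image body is never downloaded.
def build_head_request(url)
  Net::HTTP::Head.new(URI.parse(url))
end

# Hypothetical helper: send the HEAD request and return the redirect target,
# or nil if the response is not a redirect. This performs a real network call.
def redirect_target(url)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(build_head_request(url))
  end
  response["location"] if response.is_a?(Net::HTTPRedirection)
end
```

Since only the `Location` header is needed, the response is small whatever the size of the image, which avoids both the timeout and the oversized cache entry.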
evazion
8c6759bbd7 nicoseiga: fix login endpoint.
* Update the login endpoint. The old endpoint returns 404 now.

  POST https://account.nicovideo.jp/api/v1/login ->
  POST https://account.nicovideo.jp/login/redirector?site=seiga

* Let Danbooru::Http cache the login request instead of caching it manually.

* Let Danbooru::Http automatically follow redirects instead of dealing
  with the Location header manually.
2020-06-22 18:46:47 -05:00
evazion
95fee75d9a nicoseiga: fix uploads not working for certain direct image urls.
Fix Nicoseiga strategy to work with certain direct image urls that we
can't otherwise extract any information from.

Examples:

* https://dic.nicovideo.jp/oekaki/52833.png
2020-06-22 16:53:50 -05:00
evazion
1aa0f65187 sources: fix rubocop warnings. 2020-06-16 00:10:37 -05:00
nonamethanks
5b186f3072 Support for new nicoseiga cdn domain 2020-06-15 04:01:34 +02:00
nonamethanks
6fc4d3ec44 Nicoseiga: Add support for drm-served manga 2020-06-15 03:37:51 +02:00
nonamethanks
260bc997f6 NicoSeiga: Add preview urls 2020-06-15 03:37:51 +02:00
nonamethanks
9f0e85e1b5 Refactor nicoseiga strategy
* Get rid of mechanize, fully switch to Danbooru::Http
* Switch to mobile api, improving speed
* Merge main and manga clients
* Add full support for manga pages
* Add support for anonymous and r-15 images
* Don't fail when attempting to upload oekaki direct links
* Various misc fixes
2020-06-15 03:37:51 +02:00
evazion
88d9fc4e5e sources: simplify artist finder url normalization.
Get rid of `normalized_for_artist_finder?` and `normalizable_for_artist_finder?`.
This was legacy bullshit that was originally designed to avoid API calls
when saving artist entries containing old Pixiv direct image urls that
had already been normalized, or that couldn't be normalized because they
were bad id.

Nowadays we store profile urls in artist entries instead of direct image
urls, so we don't normally need to do any API calls to normalize the
profile url. Strategies should take care to avoid triggering API calls
inside `profile_url` when possible.
2020-05-29 15:35:15 -05:00
nonamethanks
307df3b3e4 Refactor source normalization
* Move the source normalization logic out of the post model
  and into individual sources' strategies.
* Rewrite normalization tests to be handled in each source's test,
  and expand them significantly. Previously we were only testing
  a very small subset of domains and variants.
* Fix up normalization for several sites.
* Normalize fav.me urls into normal deviantart urls.
2020-05-21 22:46:51 +02:00
evazion
309821bf73 rubocop: fix various style issues. 2019-12-22 21:23:37 -06:00
evazion
8209a75e95 nicoseiga: remove referer spoofing.
NicoSeiga doesn't appear to have any hotlink protection, so we don't
need to spoof the referer.
2019-10-07 13:15:48 -05:00
Albert Yi
d8d4a5ae6f refactor nico seiga manga support 2019-02-25 15:53:07 -08:00
Albert Yi
90ce42a537 add support for nico seiga manga (fixes #4060) 2019-02-25 14:44:45 -08:00
evazion
1f73e60514 sources: add methods for customizing new artist entries.
* Rename `unique_id` to `tag_name`.

* Add `other_names` and `profile_urls` methods that sources can override
  to provide extra names or urls when creating new artist entries.
2018-12-27 15:03:11 -06:00
evazion
5cf6a43918 sources: fix sources sometimes choosing wrong strategy (fix #3968)
Fix sources choosing the wrong strategy when the referer belongs to a
different site (for example, when uploading a twitter post with a pixiv
referer).

* Fix `match?` to only consider the main url, not the referer.

* Change `match?` to match against a list of domains given by the `domains` method.

* Change `match?` to an instance method.
2018-11-04 13:00:17 -06:00
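
The three changes listed above can be sketched together. The classes below are hypothetical stand-ins, not the real strategy classes, and the domain lists are assumed for the example.

```ruby
require "uri"

# Hypothetical base class: a strategy matches when the host of the main URL
# is one of its domains or a subdomain of one.
class HypotheticalStrategy
  attr_reader :url, :referer

  def initialize(url, referer = nil)
    @url = url
    @referer = referer
  end

  # Subclasses list the domains they handle.
  def domains
    []
  end

  # Instance method; only the main URL is considered, never the referer.
  def match?
    host = URI.parse(url).host.to_s.downcase
    domains.any? { |domain| host == domain || host.end_with?(".#{domain}") }
  rescue URI::InvalidURIError
    false
  end
end

class HypotheticalTwitter < HypotheticalStrategy
  def domains
    ["twitter.com", "twimg.com"]
  end
end

class HypotheticalPixiv < HypotheticalStrategy
  def domains
    ["pixiv.net", "pximg.net"]
  end
end
```

For a Twitter post uploaded with a Pixiv referer, only the Twitter strategy matches, because the referer plays no part in `match?`.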
evazion
39f9e01b13 nicoseiga: fix canonical_url to use the image url. 2018-09-22 11:07:18 -05:00
Albert Yi
266c7c0d5b cache api clients 2018-09-11 14:19:17 -07:00
Albert Yi
48f2a79d13 fix artist url spec and bug with nicoseiga strategy not recognizing urls 2018-08-29 17:14:36 -07:00
Albert Yi
762dc3da24 Refactor sources 2018-08-24 12:10:51 -07:00
evazion
48b67967b5 Fix #3457: Timeout when fetching source data from NicoSeiga. 2017-12-27 13:08:14 -06:00
evazion
4c39783d28 Fix #3424: /iqdb_queries.json fails for certain urls.
Fix the HTML page -> image URL download rewrite strategy failing for
https://lohas.nicoseiga.jp/thumb/${id}i URLs.
2017-12-15 10:16:06 -06:00
r888888888
449385f08f fixes #3313 2017-10-09 12:17:15 -07:00
r888888888
09ed1ea720 another bug fix for nico seiga artist url normalization 2017-05-31 15:50:40 -07:00
r888888888
216ca06fee fixes #3100 2017-05-30 15:38:01 -07:00
evazion
98b0b2c5d8 tests: fix Net::HTTP::Persistent::Error: too many connection resets.
Works around connection reset errors in the test suite by disabling
persistent connections.

  20) Error:
Sources::PixivTest#test_: in all cases fetching source data for a new manga image should get the tags. :
Net::HTTP::Persistent::Error: too many connection resets (due to closed stream - IOError) after 0 requests on 47071328584700, last used 1.842702476 seconds ago
  app/logical/pixiv_web_agent.rb:46:in `build'
  app/logical/sources/strategies/pixiv.rb:104:in `agent'
  app/logical/sources/strategies/pixiv.rb:72:in `get'
  app/logical/sources/site.rb:6:in `get'
  test/unit/sources/pixiv_test.rb:7:in `get_source'
  test/unit/sources/pixiv_test.rb:64:in `block (3 levels) in <class:PixivTest>'

ref: github.com/sparklemotion/mechanize/issues/123
ref: http://www.rubydoc.info/gems/mechanize/Mechanize#retry_change_requests%3D-instance_method
2017-02-04 17:07:00 -06:00
Albert Yi
3ad639521f fixes #2805: Improve nico seiga support 2016-12-27 16:11:22 -08:00
r888888888
4a24fe5074 potential fix for #2431 2015-07-07 15:59:40 -07:00
Toks
6ea556944b Add support for uploading from seiga /o/ pages 2015-07-07 18:00:47 -04:00
r888888888
fd74f860ee potential fix for #2404 2015-06-10 17:28:51 -07:00
Toks
854d587373 Fix upload source fetcher fetching from wrong work page for all sites
e.g. If you were on an html work page on pixiv, clicked a link to a
different html work page on pixiv, and then clicked the bookmarklet,
then it used to fetch the source from the FIRST work you were on instead
of the second.
2015-06-03 20:59:24 -04:00
r888888888
66dd4f072e update tests 2015-06-02 19:20:48 -07:00
Toks
e14667e757 Autodelete invalid nicoseiga session 2015-05-10 13:08:34 -04:00
r888888888
39ce77bbb1 fix nico seiga tests 2014-12-04 22:58:27 -08:00
Toks
f4529e73e3 Cache seiga and nijie sessions 2014-10-05 12:11:08 -04:00
Toks
be28a8e624 Fix Seiga sample/thumbnail rewriting 2014-06-13 16:59:08 -04:00
Toks
0a75402cc7 Support referrer matching for seiga and da 2014-05-08 20:25:11 -04:00
Toks
bb07dc429b Seiga: fix source uploads still not working in some cases 2014-04-30 15:18:53 -04:00
Toks
884be2b711 Seiga: fix source uploads not working 2014-04-30 14:40:21 -04:00