Commit Graph

294 Commits

Author SHA1 Message Date
evazion
4a99cb098f moebooru: use the image url as the canonical url. 2018-09-16 21:00:11 -05:00
evazion
d9ce953752 Fix #3906: Moebooru strategy raises NotImplementedError. 2018-09-16 21:00:11 -05:00
evazion
f135a7c064 twitter: normalize canonical urls.
Normalize http://mobile.twitter.com to http://twitter.com in canonical urls.
2018-09-16 15:03:47 -05:00
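The host rewrite described in this commit can be sketched as follows. This is a minimal illustration, not the code from the commit: the helper name `normalize_twitter_host` and the example URL are assumptions made here.

```ruby
require "uri"

# Hypothetical helper (not Danbooru's actual method): rewrite the mobile
# host to the main twitter.com host and leave the rest of the url alone.
def normalize_twitter_host(url)
  uri = URI.parse(url)
  uri.host = "twitter.com" if uri.host == "mobile.twitter.com"
  uri.to_s
end

puts normalize_twitter_host("http://mobile.twitter.com/example/status/123")
# prints "http://twitter.com/example/status/123"
```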
evazion
bd47641601 twitter: don't fail when api key isn't configured. 2018-09-16 15:03:47 -05:00
evazion
325120ee51 twitter: fix parsing of the artist name from the url.
Fixes URLs like https://twitter.com/intent/user?user_id=123 being
incorrectly normalized to http://twitter.com/intent/ in artist entries.

Also fixes the artist name to be taken from the url when it can't be
obtained from the api (when the tweet is deleted).
2018-09-16 15:03:23 -05:00
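One way to avoid treating `intent` as an artist name is to skip path segments that can never be usernames. The sketch below assumes a small reserved-word list and a hypothetical helper name; neither is taken from the commit itself.

```ruby
require "uri"

# Path segments assumed, for this sketch only, never to be usernames.
TWITTER_RESERVED_SEGMENTS = %w[intent i search hashtag home].freeze

# Hypothetical helper: return the artist name from a twitter url, or nil
# when the first path segment is not a username.
def twitter_artist_name(url)
  uri = URI.parse(url)
  return nil unless uri.host.to_s.match?(/(\A|\.)twitter\.com\z/)

  first = uri.path.to_s.split("/").reject(&:empty?).first
  return nil if first.nil? || TWITTER_RESERVED_SEGMENTS.include?(first.downcase)

  first
end

p twitter_artist_name("https://twitter.com/intent/user?user_id=123") # prints nil
p twitter_artist_name("https://twitter.com/example_artist/status/1") # prints "example_artist"
```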
Albert Yi
f487b2a2c6 Merge pull request #3889 from evazion/fix-replace-artist-finder
Cleanup artist finder
2018-09-12 11:44:42 -07:00
evazion
fbd5f6b7f2 pixiv: fix preview_urls for ugoiras (#3891). 2018-09-12 00:43:10 -05:00
evazion
37fc215d75 pixiv: fix preview_urls to use correct url (#3891). 2018-09-11 23:55:46 -05:00
evazion
583f8457f0 artists: clean up artist finding logic.
Rename Artist#find_all_by_url to url_matches and drop previous
url_matches method, along with find_artists and search_for_profile.

Previously find_artists tried to look up the url, referer url, and profile
url in turn until an artist match was found. This was wasteful, because
the source strategy already knows which url to look up (usually the profile
url). If that url doesn't find a match, then the artist doesn't exist.
2018-09-11 20:14:46 -05:00
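The difference between the old and new lookup can be sketched with an in-memory table. Everything here is illustrative: `ARTIST_URLS`, `find_artist_old`, and `find_artist` are hypothetical names, and the real code queries the database rather than a hash.

```ruby
# Hypothetical stand-in for the artist url table.
ARTIST_URLS = {
  "https://www.pixiv.net/member.php?id=1" => "example_artist"
}.freeze

# Old behaviour as described in the commit message: try each url in turn
# until one of them matches an artist.
def find_artist_old(url, referer_url, profile_url)
  [url, referer_url, profile_url].compact.each do |candidate|
    name = ARTIST_URLS[candidate]
    return name if name
  end
  nil
end

# New behaviour: the source strategy supplies the profile url, so a single
# lookup is enough. No match means the artist doesn't exist.
def find_artist(profile_url)
  ARTIST_URLS[profile_url]
end

p find_artist("https://www.pixiv.net/member.php?id=1") # prints "example_artist"
```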
Albert Yi
a5df178bcc Merge pull request #3886 from r888888888/source-api-caching
cache api clients
2018-09-11 17:34:25 -07:00
Albert Yi
4972c998f8 rely on preview urls if available for gallery 2018-09-11 15:06:12 -07:00
Albert Yi
266c7c0d5b cache api clients 2018-09-11 14:19:17 -07:00
Albert Yi
f16c3a3f40 fix nijie specs 2018-09-11 13:27:00 -07:00
evazion
9a980367f6 twitter: normalize artist commentaries to nfkc (#3719)
Fixes hashtags not being interpreted when the author uses a fullwidth
number sign (＃, U+FF03).

ref: https://github.com/r888888888/danbooru/issues/3719#issuecomment-419535610
2018-09-10 21:45:50 -05:00
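NFKC normalization maps the fullwidth number sign (U+FF03) to the ASCII number sign (U+0023), which is why it makes such hashtags parseable. A minimal sketch using Ruby's built-in `String#unicode_normalize`; the helper name `normalize_commentary` is an assumption made for illustration.

```ruby
# Hypothetical helper: normalize commentary text to NFKC before parsing.
def normalize_commentary(text)
  text.unicode_normalize(:nfkc)
end

# The fullwidth number sign is written as an escape so it survives copy/paste.
fullwidth_tag = "\uFF03tag"
puts normalize_commentary(fullwidth_tag) # prints "#tag"
```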
evazion
bfed323988 deviantart: fix page_url for when api data is unavailable.
The api data is unavailable when we can't scrape the uuid, either
because the work is deleted or because the work is actually a sta.sh
upload and we weren't given the sta.sh page in the referer url.
2018-09-10 19:26:53 -05:00
evazion
c9300cc54e sta.sh: add tests + docs. 2018-09-10 19:26:53 -05:00
evazion
7d5d098636 Fix #3877: Add sta.sh strategy.
Co-authored-by: lllusion3469 <31420484+lllusion3469@users.noreply.github.com>
2018-09-10 19:26:47 -05:00
evazion
cb2d85d925 twitter: fix profile_url for twitter.com/i/web/status/:id urls.
Fix profile_url returning nil for https://twitter.com/i/web/status/943446161586733056.
2018-09-09 19:48:34 -05:00
evazion
b924c2bb9c nijie: fix artist url normalization. 2018-09-09 13:17:52 -05:00
Albert Yi
b1a9337897 Merge pull request #3875 from evazion/fix-3873
Fix #3873: Batch bookmarklet for tumblr reports wrong posts as already uploaded
2018-09-07 14:15:24 -07:00
evazion
a67edb8783 deviantart: fix artist finder for artist names with underscores.
Fix the artist finder for urls like this:

  https://orig00.deviantart.net/4274/f/2010/230/8/a/pkmn_king_and_queen_by_mikoto_chan.jpg

that don't contain a deviantart id but do contain the artist name.
2018-09-07 12:23:48 -05:00
evazion
610391205f deviantart: fix artist finder for profile urls missing the 'www'.
Fix the artist finder to work when the profile url in the artist entry
is missing the 'www'. Example:

  https://deviantart.com/noizave
  https://www.deviantart.com/noizave
2018-09-07 11:36:48 -05:00
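Making the two profile url forms above compare equal can be sketched by normalizing both to a single host before matching. This is an assumption about one possible approach, with a hypothetical helper name, not the commit's actual implementation.

```ruby
require "uri"

# Hypothetical helper: map deviantart.com and www.deviantart.com profile
# urls to one form so they match each other. Other urls pass through as-is.
def normalize_deviantart_profile(url)
  uri = URI.parse(url)
  host = uri.host.to_s.downcase.sub(/\Awww\./, "")
  return url unless host == "deviantart.com"

  "https://www.deviantart.com#{uri.path.chomp('/')}"
end

puts normalize_deviantart_profile("https://deviantart.com/noizave")
# prints "https://www.deviantart.com/noizave"
```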
evazion
950fcdb7b2 uploads: add new source:<url> dupe check (fix #3873)
* On the /uploads/new page, instead of just showing a "This post has
probably already been uploaded" message, show the actual thumbnails of
posts having the same source as what the user is trying to upload.

* Move the iqdb results section up top, beside the related posts section.
2018-09-06 20:43:20 -05:00
evazion
5c457fbe51 pixiv: remove obsolete edgesuite.net rewrite rule.
This CDN hasn't been seen for several years.

ref: https://danbooru.donmai.us/forum_topics/10766
2018-09-04 18:15:21 -05:00
evazion
4bbe09762d pixiv: remove dead methods (#is_manga?, #page_count, #page). 2018-09-04 18:15:21 -05:00
Albert Yi
a5943de418 Merge pull request #3868 from evazion/fix-3864
Fix #3864: DeviantArt fetch source data failure
2018-09-04 13:42:01 -07:00
Albert Yi
8ec96f42f7 fix specs 2018-09-04 13:38:09 -07:00
Albert Yi
4a56f8d160 fixes #3856 for pixiv fanbox urls 2018-09-04 12:53:58 -07:00
Albert Yi
e695cdde75 add a default for image_url on Sources::Strategies#canonical_url 2018-09-04 11:35:33 -07:00
evazion
e37844303d deviantart: take artist name from url when unavailable from API.
In some cases we can't get the artist name from the API, either because
we can't do the API call because the url doesn't contain a deviation id,
or because the work is deleted:

* http://fc08.deviantart.net/files/f/2007/120/c/9/cool_like_me_by_47ness.jpg (work: http://fav.me/dwcohb)
* https://pre00.deviantart.net/423b/th/pre/i/2017/281/e/0/mindflayer_girl01_by_nickbeja-dbpxdt8.png (work: http://fav.me/dbpxd58)

Switch to taking the artist name from the url (when present) to deal
with these cases. Fixes the artist finder and the artist url normalizer
to work in this situation.
2018-09-03 18:27:01 -05:00
evazion
8f87fb90d9 deviantart: handle urls without deviation ids (fix #3864)
Some older URL formats don't contain the deviation id:

* http://fc08.deviantart.net/files/f/2007/120/c/9/Cool_Like_Me_by_47ness.jpg
* http://pre06.deviantart.net/8497/th/pre/f/2009/173/c/c/cc9686111dcffffffb5fcfaf0cf069fb.jpg

In these cases we can't make the API call. Fix failures due to not being
able to do API calls in this situation.

Also fix canonical_url to use the image_url when it contains the
deviation id, or the page_url when it doesn't.

Finally, fix page_url to use the url from the API instead of the raw url
given by the user, so that it's in a consistent form for canonical_url.
2018-09-03 18:26:45 -05:00
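The canonical_url rule described above (use the image url when it carries the deviation id, otherwise the page url) can be sketched as below. The pattern treats a trailing `-d<alphanumerics>` before the file extension as the deviation id. That is an assumption made for this sketch and a loose heuristic, since a title that happens to end in a hyphen followed by a word starting with "d" would also match. The method signature and the page url in the example are hypothetical.

```ruby
# Assumed shape of a deviation id in an image filename, e.g. "-dbpxdt8.png".
DEVIATION_ID_SUFFIX = /-d[a-z0-9]+\.\w+\z/i

# Hypothetical signature: prefer the image url when it contains a
# deviation id, and fall back to the page url when it doesn't.
def canonical_url(image_url, page_url)
  image_url.match?(DEVIATION_ID_SUFFIX) ? image_url : page_url
end

old_style = "http://fc08.deviantart.net/files/f/2007/120/c/9/Cool_Like_Me_by_47ness.jpg"
page      = "https://www.deviantart.com/example/art/example-page" # hypothetical page url
puts canonical_url(old_style, page) # prints the page url, since there is no id
```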
evazion
316acead16 deviantart: fix error when uploading image belonging to deleted work. 2018-09-02 23:09:40 -05:00
evazion
2d1b1311d6 deviantart: fix sample urls not being rewritten to full size urls. 2018-09-02 23:09:29 -05:00
evazion
807c3dd5f4 deviantart: remove obsolete image sample rewrite rules.
Remove rewrite rules for these types of sample urls:

* http://th00.deviantart.net/fs71/PRE/f/2014/065/3/b/goruto_by_xyelkiltrox-d797tit.png
* http://th04.deviantart.net/fs70/300W/f/2009/364/4/d/Alphes_Mimic___Rika_by_Juriesute.png

These URLs aren't served to users any more, and just stripping out "PRE"
or "200H" isn't sufficient to get the full size image. In general, an
api call is required to find the full size image url.
2018-09-02 14:49:58 -05:00
evazion
b9ed676bfb deviantart: handle origin-orig.deviantart.net urls. 2018-09-02 13:57:15 -05:00
evazion
d693f01dde Fix #3859: Related tag and find artist don't run when fetch data fails.
Fixes an exception in the artist finder caused by searching for a nil profile_url.
2018-09-01 11:48:42 -05:00
evazion
c689a161f6 pixiv: fix failure when normalizing pixiv stacc artist urls (#3856). 2018-08-30 19:24:44 -05:00
Albert Yi
9206a60760 Merge pull request #3852 from evazion/fix-twitter-direct-url
Twitter: fix handling of direct image urls without a referer url.
2018-08-29 17:32:54 -07:00
evazion
6c94047556 Sources::Strategies::Twitter#profile_url: fix case when url is a profile url. 2018-08-29 19:29:16 -05:00
Albert Yi
48f2a79d13 fix artist url spec and bug with nicoseiga strategy not recognizing urls 2018-08-29 17:14:36 -07:00
evazion
a1044dbc19 twitter: fix handling of direct image urls without a referer url. 2018-08-29 17:14:57 -05:00
Albert Yi
eac5a57c0b implement Sources::Strategies::Null#artist_name 2018-08-29 14:05:44 -07:00
evazion
bf19ea3bd1 twitter: fix typo in ASSET regex (#3850). 2018-08-29 15:09:36 -05:00
Albert Yi
762dc3da24 Refactor sources 2018-08-24 12:10:51 -07:00
evazion
3af82de596 Partial fix for #3719: Certain commentaries not parsed correctly 2018-08-20 23:18:26 -05:00
Albert Yi
135b97d511 additional fixes for deviantart artist search (#3771) 2018-07-27 12:31:26 -07:00
Albert Yi
4762de65e1 more robust handling of deviant art urls 2018-07-10 14:57:38 -07:00
Albert Yi
a0205be8b5 fixes #3771 2018-07-06 11:44:07 -07:00
Albert Yi
5ae37597cd fixes #3728 2018-05-25 13:24:49 -07:00
Albert Yi
c97b0245d6 reduce expiry for cached pixiv tokens to 1 week, revert to old method for extracting image url from page (fixes #3722) 2018-05-25 10:04:28 -07:00