Commit Graph

34 Commits

Author SHA1 Message Date
evazion
e3af738371 tests: fix broken tests. 2022-08-24 02:03:37 -05:00
evazion
d9d3c1dfe4 sources: rename Sources::Strategies to Source::Extractor.
Rename Sources::Strategies to Source::Extractor. A Source::Extractor
represents a thing that extracts information from a given URL.
2022-03-24 03:49:44 -05:00
evazion
4ef8178bd1 sources: remove canonical_url method.
Refactor source strategies to remove the `canonical_url` method.

`canonical_url` returned the URL that should be used as the source of
the post after upload. Now we simply use `Source::URL#page_url` to
determine the source after upload. If the source is an image URL that is
convertible to a page URL, then the image URL is used as the source. If
the source is an image URL that is not convertible to a page URL, then
the page URL is used as the source.

This simplifies source strategies so that all they have to care about is
implementing the `Source::URL#page_url` and `Sources::Strategies#page_url`
methods, and the preferred source will be chosen for posts automatically.
2022-03-23 23:38:06 -05:00
evazion
3aa5cab2aa sources: refactor normalize_for_source.
`normalize_for_source` was used to convert image URLs to page URLs when displaying sources
on the post show page. Move all the code for converting image URLs to page URLs from
`Sources::Strategies#normalize_for_source` to `Source::URL#page_url`.

Before we had to be very careful in source strategies not to make any network calls in
`normalize_for_source`, since it was used in the view for the post show page. Now all the
code for generating page URLs is isolated in Source::URL, which makes source strategies
simpler. It also makes it easier to check if a source is an image URL or page URL, and if
the image URL is convertible to a page URL, which will make autotagging bad_link or
bad_source feasible.

Finally, this fixes it to generate better page URLs in a handful of cases:

* https://www.artstation.com/artwork/qPVGP instead of https://anubis1982918.artstation.com/projects/qPVGP
* https://yande.re/post/show?md5=b4b1d11facd1700544554e4805d47bb6s instead of https://yande.re/post?tags=md5:b4b1d11facd1700544554e4805d47bb6
* http://gallery.minitokyo.net/view/365677 instead of http://gallery.minitokyo.net/download/365677
* https://valkyriecrusade.fandom.com/wiki/File:Crimson_Hatsune_H.png instead of https://valkyriecrusade.wikia.com/wiki/File:Crimson_Hatsune_H.png
* https://rule34.paheal.net/post/view/852405 instead of https://rule34.paheal.net/post/list/md5:854806addcd3b1246424e7cea49afe31/1
2022-03-23 01:34:04 -05:00
evazion
b4aea72d04 sources: remove preview_urls method from base strategy.
Remove the `preview_urls` method from strategies. The only place this was used was
when doing IQDB searches, to download the thumbnail image from the source instead of
the full image.

This wasn't worth it for a few reasons:

* Thumbnails on other sites are sometimes not the size we want, which could affect
  IQDB results.
* Grabbing thumbnails is complex for some sites. You can't always just rewrite the
  image URL. Sometimes it requires extra API calls, which can be slower than just
  grabbing the full image.
* For videos and animations, thumbnails from other sites don't always match our
  thumbnails. We do smart thumbnail generation to try to avoid blank thumbnails, which
  means we don't always pick the first frame, which could affect IQDB results.

API changes:

* /iqdb_queries?search[file_url] now downloads the URL as is without any modification.
  Before it tried to change thumbnail and sample size image URLs to the full version.

* /iqdb_queries?search[url] now returns an error if the URL is for a HTML page that
  contains multiple images. Before it would grab only the first image and silently
  ignore the rest.
2022-03-11 03:22:23 -06:00
evazion
2f61486ac6 sources: remove image_url method from base strategy.
Remove the `image_url` method from source strategies. This method would
return only the first image if a source had multiple images. The
`image_urls` method should be used instead. Tests were the main place
that still used `image_url` instead of `image_urls`.

Also make post replacements return an error if replacing with a source
that contains multiple images, instead of just blindly replacing the
post with the first image in the source.
2022-03-11 01:59:21 -06:00
nonamethanks
b9c7e467e5 sources: factor out Source::URL::Tumblr
Also adds support for fetching source data from direct image urls when
possible.
2022-03-08 15:06:06 +01:00
evazion
450594b803 tests: fix broken tests. 2022-01-07 14:44:24 -06:00
evazion
594b46a85d tests: fix broken tests. 2021-11-23 23:18:54 -06:00
nonamethanks
5615719218 Tumblr: support highest res for new image urls 2020-08-14 01:23:27 +02:00
evazion
e31afd0827 tests: fix broken tests. 2020-08-05 12:41:48 -05:00
evazion
67a52dbc2d tumblr: support new va.media.tumblr.com urls. 2020-06-19 13:53:35 -05:00
nonamethanks
307df3b3e4 Refactor source normalization
* Move the source normalization logic out of the post model
  and into individual sources' strategies.
* Rewrite normalization tests to be handled into each source's test,
  and expand them significantly. Previously we were only testing
  a very small subset of domains and variants.
* Fix up normalization for several sites.
* Normalize fav.me urls into normal deviantart urls.
2020-05-21 22:46:51 +02:00
evazion
84ba1d417f Fix #4220: Uploading from Tumblr is broken. 2019-12-15 19:04:52 -06:00
evazion
6a77b68b74 tumblr: fix tests. 2018-12-27 15:03:11 -06:00
evazion
c700ea4b5f Fix #4016: Translated tags failing to find some tags.
* Normalize spaces to underscores when saving other names. Preserve case
  since case can be significant.

* Fix WikiPage#other_names_include to search case-insensitively (note:
  this prevents using the index).

* Fix sources to return the raw tags in `#tags` and the normalized tags
  in `#normalized_tags`. The normalized tags are the tags that will be
  matched against other names.
2018-12-16 11:37:57 -06:00
evazion
53a51310a3 tumblr: add canonical url tests (#3385). 2018-10-09 12:55:48 -05:00
evazion
16b1b72da5 tumblr: fix video urls not being recognized. 2018-10-09 12:44:59 -05:00
evazion
184a5ebf3e tumblr: fix _640 images not being recognized (#3944).
Fixes _640 images not being matched by the IMAGE regex and therefore not
being rewritten to the largest size.
2018-10-09 12:44:59 -05:00
evazion
d874c68419 tumblr: fix image_urls when api data is unavailable. 2018-10-09 12:44:59 -05:00
evazion
b0d7d90103 tumblr: extract info from url when api data is unavailable.
Derive the artist name / profile url / page url from the source URLs when
the API response is unavailable because the Tumblr post was deleted.

This fixes the artist finder to work on bad_tumblr_id posts.
2018-10-09 12:44:59 -05:00
evazion
0c31a5d6a9 tumblr: don't fail when api data is unavailable (#3948).
The api data is unavailable when the work is deleted (bad_tumblr_id), or
when the source is a direct image url with no page referer.
2018-10-09 12:44:59 -05:00
evazion
4c55c809b0 tumblr: don't fail when api key isn't configured. 2018-10-09 12:44:59 -05:00
Albert Yi
4972c998f8 rely on preview urls if available for gallery 2018-09-11 15:06:12 -07:00
Albert Yi
762dc3da24 Refactor sources 2018-08-24 12:10:51 -07:00
r888888888
abce4d2551 Raise error on unpermitted params.
Fail loudly if we forget to whitelist a param instead of silently
ignoring it.

misc models: convert to strong params.

artist commentaries: convert to strong params.

* Disallow changing or setting post_id to a nonexistent post.

artists: convert to strong params.

* Disallow setting `is_banned` in create/update actions. Changing it
  this way instead of with the ban/unban actions would leave the artist in
  a partially banned state.

bans: convert to strong params.

* Disallow changing the user_id after the ban has been created.

comments: convert to strong params.

favorite groups: convert to strong params.

news updates: convert to strong params.

post appeals: convert to strong params.

post flags: convert to strong params.

* Disallow users from setting the `is_deleted` / `is_resolved` flags.

ip bans: convert to strong params.

user feedbacks: convert to strong params.

* Disallow users from setting `disable_dmail_notification` when creating feedbacks.
* Disallow changing the user_id after the feedback has been created.

notes: convert to strong params.

wiki pages: convert to strong params.

* Also fix non-Builders being able to delete wiki pages.

saved searches: convert to strong params.

pools: convert to strong params.

* Disallow setting `post_count` or `is_deleted` in create/update actions.

janitor trials: convert to strong params.

post disapprovals: convert to strong params.

* Factor out quick-mod bar to shared partial.
* Fix quick-mod bar to use `Post#is_approvable?` to determine visibility
  of Approve button.

dmail filters: convert to strong params.

password resets: convert to strong params.

user name change requests: convert to strong params.

posts: convert to strong params.

users: convert to strong params.

* Disallow setting password_hash, last_logged_in_at, last_forum_read_at,
  has_mail, and dmail_filter_attributes[user_id].

* Remove initialize_default_image_size (dead code).

uploads: convert to strong params.

* Remove `initialize_status` because status already defaults to pending
  in the database.

tag aliases/implications: convert to strong params.

tags: convert to strong params.

forum posts: convert to strong params.

* Disallow changing the topic_id after creating the post.
* Disallow setting is_deleted (destroy/undelete actions should be used instead).
* Remove is_sticky / is_locked (nonexistent attributes).

forum topics: convert to strong params.

* merges https://github.com/evazion/danbooru/tree/wip-rails-5.1
* lock pg gem to 0.21 (1.0.0 is incompatible with rails 5.1.4)
* switch to factorybot and change all references

Co-authored-by: r888888888 <r888888888@gmail.com>
Co-authored-by: evazion <noizave@gmail.com>

add diffs
2018-04-06 18:09:57 -07:00
evazion
b185efbb5f tumblr commentaries: include asker's name in ask posts (#3586). 2018-03-30 21:42:51 -05:00
evazion
3fefb73e90 Fix #3561: Tumblr: support answer posts. 2018-02-24 10:31:59 -06:00
evazion
255082d3b5 tumblr: fix test failure. 2017-11-26 15:37:51 -06:00
r888888888
9d5e4f969f fix source tests 2017-11-20 12:30:29 -08:00
evazion
40d0751e83 Fix NoStrategyError during artist url normalization (#3382).
Fixes a bug from 9a3824a. When an artist entry is saved, `ArtistUrl.normalize`
is called on every URL, which calls `Sources::Site.new(url)`. This
raised NoStrategyError when an artist entry contained URLs that weren't
recognized by any strategy.

This also caused `Fetch source data` to fail in certain cases when it
attempted to find the artist.
2017-11-19 10:49:30 -06:00
evazion
2422ce036c tumblr_test.rb: fix test failures. 2017-11-18 13:52:30 -06:00
evazion
71f84b10af tumblr: convert commentary to dtext.
* Convert Tumblr commentary to DText.
* Strip extraneous whitespace in links and blockquotes.
* Add newlines after block elements to ensure they're separated from
  subsequent blocks.
2017-07-01 11:15:48 -05:00
evazion
fbb25666b0 tumblr: add source tests. 2017-06-25 15:34:15 -05:00