* Fix bug where bogus domains weren't found because we checked for `nil?` instead of `blank?`.
* Fix bug where bogus names weren't found because we used a nonexistent variable in the
RCPT TO check (`to_address` instead of `address`).
Add ability to group reports by various columns. For example, you can see
the posts by the top 10 uploaders over time, or posts grouped by rating
over time.
Split requests made by Danbooru::Http into either internal or external
requests. Internal requests are API calls to internal services run by
Danbooru. External requests are requests to external websites, for
example fetching sources or downloading files. External requests may use
a HTTP proxy if one is configured. Internal requests don't.
Fixes a few source extractors not using the HTTP proxy for certain API calls.
Update email validation rules to disallow the percent character (e.g.
`foo%bar@gmail.com`) and names ending with a period (e.g. `foo.@gmail.com`).
Names ending with a period are invalid according to the RFCs and cause
`Mail::Address.new` to raise an exception.
The percent character is technically legal, but only one email used it
and it was probably a typo.
Try to automatically fix various kind of typos and common mistakes in
email addresses when a user creates a new account. It's common for users
to signup with addresses like `name@gmai.com`, which leads to bounces
when we try to send the welcome email.
* Fixed a bug where manga posts with a single tag would raise an error
* Fixed a bug where dic.nicovideo.jp/oekaki posts weren't uploadable due
to SSL issues
* Added support for more manga corner cases
Make the following fields visible in API responses:
* ip_bans.ip_addr
* ip_geolocations.ip_addr
* ip_geolocations.network
* users.last_ip_addr (mod only)
* user_sessions.ip_addr
* api_keys.last_ip_address
* api_keys.permitted_ip_addresses
Before IP addresses were globally hidden in API responses because IPs were
present in a lot of tables and we didn't want to accidentally leak them.
Now that we've gotten rid of IPs from most tables, it's safe to unhide them.
Fix a PublicSuffix::DomainNotAllowed exception raised with viewing or editing a post
with a source like `Blog.`.
This happened when parsing the post's source. `Danbooru::URL.parse("Blog.")` would
heuristically parse the source into `http://blog`. Calling any methods related to the
URL's hostname or domain would lead to calling `PublicSuffix.parse("blog")`, which
would fail with PublicSuffix::DomainNotAllowed.
Add `#basename`, `#filename`, and `#file_ext` utility methods to
Danbooru::URL and change a few places to use them. Simplifies parsing
filenames in source URLs in various places.
Introduce a Source::URL class for parsing URLs from source sites. Refactor the Twitter
source strategy to use it.
This is the first step towards factoring all the URL parsing logic out of source
strategies and moving it to subclasses of Source::URL. Each site will have a subclass
of Source::URL dedicated to parsing URLs from that site. Source strategies will use
these classes to extract information from URLs.
This is to simplify source strategies. Most sites have many different URL formats we have
to parse or rewrite, and handling all these different cases tends to make source
strategies very complex. Isolating the URL parsing logic from the site scraping logic
should make source strategies easier to maintain.
Introduce a Danbooru::URL class for dealing with URLs. This is a wrapper
around Addressable::URI that adds some additional helper methods. Most
significantly, the `parse` method only allows valid http/https URLs, and
it returns nil instead of raising an exception when the URL is invalid.
Fixes a bug where the Foundation source strategy failed because http.rb
automatically sent a `Content-Length: 0` header with all GET requests,
which caused Foundation to return a 400 Bad Request error. This behavior
was fixed in http.rb 5.x.
http.rb 5.x has a breaking change where it now includes the request object
inside the response object, which we have to handle in a few places.
Fix an error when trying to upload a file larger than the file size
limit. In this case we tried to dump the whole HTTP response into the
error message, which included the binary file itself, which caused this
exception because it contained null bytes.
This exception was thrown by app/logical/pixiv_ajax_client.rb:406 when a
Pixiv API call failed with a network error. In this case we tried to log
the response body, but this failed because we returned a faked HTTP
response with an empty string for the body, which the http.rb library
didn't like because it was expecting an IO-like object for the body.
Fix certain artist commentaries for foundation.app containing scrambled
characters. Apparently caused by the Nokogiri HTML5 parser not handling
UTF-8 input correctly when the encoding isn't explicitly set to UTF-8.
Send all logs to stderr by default instead of stdout. Fixes a problem
where parsing the output of sandboxed commands could fail, because they
could contain Rails log messages in their stdout.
When we run a command in a sandbox, we call fork+exec to run the command
in the background so we can capture its output. If Rails prints
anything to stdout between the fork and exec calls, then it will be
inadvertently captured along with the command's output. This will break
parsing of the command's output. This can happen if warning messages are
printed by Rails while setting up the sandbox between the fork and exec
calls.
Writing to stderr is also more correct, since stdout is buffered by
default, which means logs could potentially be lost if the process dies
unexpectedly before the buffers are flushed. Stderr is unbuffered by
default, which means logs will always be output immediately.
Add the following:
* Container name, machine name, worker id.
* Container uptime, puma uptime, worker uptime.
* Number of requests processed by current worker.
* ExifTool version.
Also change /status page to show information in tables instead of lists.
Add support for using a proxy for HTTP requests. Only used for external
requests, such as downloading files or talking to source sites such as
Pixiv or Twitter, not for internal requests, such as talking to IQDB or
Reportbooru.
When a POST request returns a 302 redirect, follow the redirect with a
GET request instead of with a POST request.
HTTP standards leave it unspecified whether a POST request that returns
a 302 redirect should be followed with a GET or with a POST. A GET is
what most browsers use, which means it's what most servers expect.
Fixes the /tagme Discord command not working because when we uploaded
the image to DeepDanbooru, the POST request returned a 302 redirect,
which the server expected us to follow with a GET, not with a POST.
Ref:
* https://stackoverflow.com/questions/17605915/what-is-the-correct-behavior-expected-of-an-http-post-302-redirect-to-get
Require new accounts to verify their email address if any of the
following conditions are true:
* Their IP is a proxy.
* Their IP is under a partial IP ban.
* They're creating a new account while logged in to another account.
* Somebody recently created an account from the same IP in the last week.
Changes from before:
* Allow logged in users to view the signup page and create new accounts.
Creating a new account while logged in to your old account is now
allowed, but it requires email verification. This is a honeypot.
* Creating multiple accounts from the same IP is now allowed, but they
require email verification. Previously the same IP check was only for
the last day (now it's the last week), and only for an exact IP match
(now it's a subnet match, /24 for IPv4 or /64 for IPv6).
* New account verification is disabled for private IPs (e.g. 127.0.0.1,
192.168.0.1), to make development or running personal boorus easier
(fixes#4618).
Bug: The frontpage failed due to a SSL error. We couldn't fetch the
popular tag list from Reportbooru because Reportbooru's SSL certificate
had expired and HTTP.rb raised an SSLError exception that we didn't
catch.
Fix: Convert the SSLError to a 5xx HTTP error to prevent SSL exceptions
from leaking through HTTP.rb.
* Factor out the Cloudflare Polish bypass code to a standalone feature.
* Add `http_downloader` method to the base source strategy. This is a
HTTP client that should be used for downloading images or making
requests to images. This client ensures that referrer spoofing and
Cloudflare bypassing are performed.
This fixes a bug with the upload page reporting the polished filesize
instead of the original filesize when uploading ArtStation images.
Factor out referrer spoofing so that it can be used outside of downloading
files. We also need to spoof the referrer when determining the remote
filesize of images on the uploads page.
Bug: if a Nijie login failed with a 429 Too Many Requests error, the
error would get cached, so when we retried the request, we would just
get our own cached response back every time. The 429 error would
eventually be passed up to the Nijie strategy, which caused random
methods to fail because they couldn't get the html page.
Fix: add the `retriable` feature *after* the `cache` feature so that
retries don't go through the cache. This is a hack. We want retries to
go at the bottom of the stack, below caching, but we can't enforce this
ordering.
Replace http.rb's builtin redirect following option with our own
redirect follower. This fixes an issue with http.rb losing cookies after
following a redirect.
Allow cookies to be saved and sent back when making several requests in
a row. Usage:
http = Danbooru::Http.use(:session)
# saves the foo=42 cookie sent by the response.
http.get("https://httpbin.org/cookies/set/foo/42")
# sends back the foo=42 cookie from the previous request.
http.get("https://httpbin.org/cookies")