Blob Blame History Raw
From ae0065be15b3253042b65baa54f2953f3a6e6926 Mon Sep 17 00:00:00 2001
From: usa <usa@b2dd03c8-39d4-4d8f-98ff-823fe69b080e>
Date: Wed, 28 Mar 2018 14:47:30 +0000
Subject: [PATCH] merge revision(s) 62960-62965:
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

	webrick: use IO.copy_stream for multipart response

	Use the new Proc response body feature to generate a multipart
	range response dynamically.  We use a flat array to minimize
	object overhead as much as possible; as many ranges may fit
	into an HTTP request header.

	* lib/webrick/httpservlet/filehandler.rb (multipart_body): new method
	  (make_partial_content): use multipart_body
	------------------------------------------------------------------------
	r62960 | normal | 2018-03-28 17:06:23 +0900 (水, 28 3 2018) | 13 lines

	webrick/httprequest: limit request headers size

	We use the same 112 KB limit started (AFAIK) by Mongrel, Thin,
	and Puma to prevent malicious users from using up all the memory
	with a single request.  This also limits the damage done by
	excessive ranges in multipart Range: requests.

	Due to the way we rely on IO#gets and the desire to keep
	the code simple, the actual maximum header may be 4093 bytes
	larger than 112 KB, but we're splitting hairs at that point.

	* lib/webrick/httprequest.rb: define MAX_HEADER_LENGTH
	  (read_header): raise when headers exceed max length
	------------------------------------------------------------------------
	r62961 | normal | 2018-03-28 17:06:28 +0900 (水, 28 3 2018) | 9 lines

	webrick/httpservlet/cgihandler: reduce memory use

	WEBrick::HTTPRequest#body can be passed a block to process the
	body in chunks.  Use this feature to avoid building a giant
	string in memory.

	* lib/webrick/httpservlet/cgihandler.rb (do_GET):
	  avoid reading entire request body into memory
	  (do_POST is aliased to do_GET, so it handles bodies)
	------------------------------------------------------------------------
	r62962 | normal | 2018-03-28 17:06:34 +0900 (水, 28 3 2018) | 7 lines

	webrick/httprequest: raise correct exception

	"BadRequest" alone does not resolve correctly, it is in the
	HTTPStatus namespace.

	* lib/webrick/httprequest.rb (read_chunked): use correct exception
	* test/webrick/test_httpserver.rb (test_eof_in_chunk): new test
	------------------------------------------------------------------------
	r62963 | normal | 2018-03-28 17:06:39 +0900 (水, 28 3 2018) | 9 lines

	webrick/httprequest: use InputBufferSize for chunked requests

	While WEBrick::HTTPRequest#body provides a Proc interface
	for streaming large request bodies, clients must not force
	the server to use an excessively large chunk size.

	* lib/webrick/httprequest.rb (read_chunk_size): limit each
	  read and block.call to :InputBufferSize in config.
	* test/webrick/test_httpserver.rb (test_big_chunks): new test
	------------------------------------------------------------------------
	r62964 | normal | 2018-03-28 17:06:44 +0900 (水, 28 3 2018) | 9 lines

	webrick: add test for Digest auth-int

	No changes to the actual code, this is a new test for
	a feature for which no tests existed.  I don't understand
	the Digest authentication code well at all, but this is
	necessary for the subsequent change.

	* test/webrick/test_httpauth.rb (test_digest_auth_int): new test
	  (credentials_for_request): support bodies with POST
	------------------------------------------------------------------------
	r62965 | normal | 2018-03-28 17:06:49 +0900 (水, 28 3 2018) | 18 lines

	webrick/httpauth/digestauth: stream req.body

	WARNING! WARNING! WARNING!  LIKELY BROKEN CHANGE

	Pass a proc to WEBrick::HTTPRequest#body to avoid reading a
	potentially large request body into memory during
	authentication.

	WARNING! this will break apps completely which want to do
	something with the body besides calculating the MD5 digest
	of it.

	Also, keep in mind that probably nobody uses "auth-int".
	Servers such as Apache, lighttpd, nginx don't seem to
	support it; nor does curl when using POST/PUT bodies;
	and we didn't have tests for it until now...

	* lib/webrick/httpauth/digestauth.rb (_authenticate): stream req.body

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_2@63021 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
---
 lib/webrick/httpauth/digestauth.rb    |  8 ++-
 lib/webrick/httprequest.rb            | 23 +++++--
 lib/webrick/httpservlet/cgihandler.rb |  4 +-
 test/webrick/test_httpauth.rb         | 90 ++++++++++++++++++++++++++-
 test/webrick/test_httpserver.rb       | 67 ++++++++++++++++++++
 5 files changed, 178 insertions(+), 14 deletions(-)

diff --git a/lib/webrick/httpauth/digestauth.rb b/lib/webrick/httpauth/digestauth.rb
index 78ad45b233..2a2319e9b1 100644
--- a/lib/webrick/httpauth/digestauth.rb
+++ b/lib/webrick/httpauth/digestauth.rb
@@ -235,9 +235,11 @@ module WEBrick
           ha2 = hexdigest(req.request_method, auth_req['uri'])
           ha2_res = hexdigest("", auth_req['uri'])
         elsif auth_req['qop'] == "auth-int"
-          ha2 = hexdigest(req.request_method, auth_req['uri'],
-                          hexdigest(req.body))
-          ha2_res = hexdigest("", auth_req['uri'], hexdigest(res.body))
+          body_digest = @h.new
+          req.body { |chunk| body_digest.update(chunk) }
+          body_digest = body_digest.hexdigest
+          ha2 = hexdigest(req.request_method, auth_req['uri'], body_digest)
+          ha2_res = hexdigest("", auth_req['uri'], body_digest)
         end
 
         if auth_req['qop'] == "auth" || auth_req['qop'] == "auth-int"
diff --git a/lib/webrick/httprequest.rb b/lib/webrick/httprequest.rb
index 76420730b1..b3bcea7b3d 100644
--- a/lib/webrick/httprequest.rb
+++ b/lib/webrick/httprequest.rb
@@ -412,9 +412,13 @@ module WEBrick
 
     MAX_URI_LENGTH = 2083 # :nodoc:
 
+    # same as Mongrel, Thin and Puma
+    MAX_HEADER_LENGTH = (112 * 1024) # :nodoc:
+
     def read_request_line(socket)
       @request_line = read_line(socket, MAX_URI_LENGTH) if socket
-      if @request_line.bytesize >= MAX_URI_LENGTH and @request_line[-1, 1] != LF
+      @request_bytes = @request_line.bytesize
+      if @request_bytes >= MAX_URI_LENGTH and @request_line[-1, 1] != LF
         raise HTTPStatus::RequestURITooLarge
       end
       @request_time = Time.now
@@ -433,6 +437,9 @@ module WEBrick
       if socket
         while line = read_line(socket)
           break if /\A(#{CRLF}|#{LF})\z/om =~ line
+          if (@request_bytes += line.bytesize) > MAX_HEADER_LENGTH
+            raise HTTPStatus::RequestEntityTooLarge, 'headers too large'
+          end
           @raw_header << line
         end
       end
@@ -500,12 +507,16 @@ module WEBrick
     def read_chunked(socket, block)
       chunk_size, = read_chunk_size(socket)
       while chunk_size > 0
-        data = read_data(socket, chunk_size) # read chunk-data
-        if data.nil? || data.bytesize != chunk_size
-          raise BadRequest, "bad chunk data size."
-        end
+        begin
+          sz = [ chunk_size, @buffer_size ].min
+          data = read_data(socket, sz) # read chunk-data
+          if data.nil? || data.bytesize != sz
+            raise HTTPStatus::BadRequest, "bad chunk data size."
+          end
+          block.call(data)
+        end while (chunk_size -= sz) > 0
+
         read_line(socket)                    # skip CRLF
-        block.call(data)
         chunk_size, = read_chunk_size(socket)
       end
       read_header(socket)                    # trailer + CRLF
diff --git a/lib/webrick/httpservlet/cgihandler.rb b/lib/webrick/httpservlet/cgihandler.rb
index 7c012ca64b..d5ba756437 100644
--- a/lib/webrick/httpservlet/cgihandler.rb
+++ b/lib/webrick/httpservlet/cgihandler.rb
@@ -66,9 +66,7 @@ module WEBrick
           cgi_in.write("%8d" % dump.bytesize)
           cgi_in.write(dump)
 
-          if req.body and req.body.bytesize > 0
-            cgi_in.write(req.body)
-          end
+          req.body { |chunk| cgi_in.write(chunk) }
         ensure
           cgi_in.close
           status = $?.exitstatus
diff --git a/test/webrick/test_httpauth.rb b/test/webrick/test_httpauth.rb
index 2414be9096..842668f54e 100644
--- a/test/webrick/test_httpauth.rb
+++ b/test/webrick/test_httpauth.rb
@@ -3,6 +3,7 @@ require "net/http"
 require "tempfile"
 require "webrick"
 require "webrick/httpauth/basicauth"
+require "stringio"
 require_relative "utils"
 
 class TestWEBrickHTTPAuth < Test::Unit::TestCase
@@ -182,12 +183,97 @@ class TestWEBrickHTTPAuth < Test::Unit::TestCase
     }
   end
 
+  def test_digest_auth_int
+    log_tester = lambda {|log, access_log|
+      log.reject! {|line| /\A\s*\z/ =~ line }
+      pats = [
+        /ERROR Digest wb auth-int realm: no credentials in the request\./,
+        /ERROR WEBrick::HTTPStatus::Unauthorized/,
+        /ERROR Digest wb auth-int realm: foo: digest unmatch\./
+      ]
+      pats.each {|pat|
+        assert(!log.grep(pat).empty?, "webrick log doesn't have expected error: #{pat.inspect}")
+        log.reject! {|line| pat =~ line }
+      }
+      assert_equal([], log)
+   }
+    TestWEBrick.start_httpserver({}, log_tester) {|server, addr, port, log|
+      realm = "wb auth-int realm"
+      path = "/digest_auth_int"
+
+      Tempfile.create("test_webrick_auth_int") {|tmpfile|
+        tmpfile.close
+        tmp_pass = WEBrick::HTTPAuth::Htdigest.new(tmpfile.path)
+        tmp_pass.set_passwd(realm, "foo", "Hunter2")
+        tmp_pass.flush
+
+        htdigest = WEBrick::HTTPAuth::Htdigest.new(tmpfile.path)
+        users = []
+        htdigest.each{|user, pass| users << user }
+        assert_equal %w(foo), users
+
+        auth = WEBrick::HTTPAuth::DigestAuth.new(
+          :Realm => realm, :UserDB => htdigest,
+          :Algorithm => 'MD5',
+          :Logger => server.logger,
+          :Qop => %w(auth-int),
+        )
+        server.mount_proc(path){|req, res|
+          auth.authenticate(req, res)
+          res.body = "bbb"
+        }
+        Net::HTTP.start(addr, port) do |http|
+          post = Net::HTTP::Post.new(path)
+          params = {}
+          data = 'hello=world'
+          body = StringIO.new(data)
+          post.content_length = data.bytesize
+          post['Content-Type'] = 'application/x-www-form-urlencoded'
+          post.body_stream = body
+
+          http.request(post) do |res|
+            assert_equal('401', res.code, log.call)
+            res["www-authenticate"].scan(DIGESTRES_) do |key, quoted, token|
+              params[key.downcase] = token || quoted.delete('\\')
+            end
+             params['uri'] = "http://#{addr}:#{port}#{path}"
+          end
+
+          body.rewind
+          cred = credentials_for_request('foo', 'Hunter3', params, body)
+          post['Authorization'] = cred
+          post.body_stream = body
+          http.request(post){|res|
+            assert_equal('401', res.code, log.call)
+            assert_not_equal("bbb", res.body, log.call)
+          }
+
+          body.rewind
+          cred = credentials_for_request('foo', 'Hunter2', params, body)
+          post['Authorization'] = cred
+          post.body_stream = body
+          http.request(post){|res| assert_equal("bbb", res.body, log.call)}
+        end
+      }
+    }
+  end
+
   private
-  def credentials_for_request(user, password, params)
+  def credentials_for_request(user, password, params, body = nil)
     cnonce = "hoge"
     nonce_count = 1
     ha1 = "#{user}:#{params['realm']}:#{password}"
-    ha2 = "GET:#{params['uri']}"
+    if body
+      dig = Digest::MD5.new
+      while buf = body.read(16384)
+        dig.update(buf)
+      end
+      body.rewind
+      ha2 = "POST:#{params['uri']}:#{dig.hexdigest}"
+    else
+      ha2 = "GET:#{params['uri']}"
+    end
+
     request_digest =
       "#{Digest::MD5.hexdigest(ha1)}:" \
       "#{params['nonce']}:#{'%08x' % nonce_count}:#{cnonce}:#{params['qop']}:" \
diff --git a/test/webrick/test_httpserver.rb b/test/webrick/test_httpserver.rb
index ffebf7e843..f1d58b40f5 100644
--- a/test/webrick/test_httpserver.rb
+++ b/test/webrick/test_httpserver.rb
@@ -366,4 +366,71 @@ class TestWEBrickHTTPServer < Test::Unit::TestCase
     }
     assert_equal(requested, 1)
   end
+
+  def test_gigantic_request_header
+    log_tester = lambda {|log, access_log|
+      assert_equal 1, log.size
+      assert log[0].include?('ERROR headers too large')
+    }
+    TestWEBrick.start_httpserver({}, log_tester){|server, addr, port, log|
+      server.mount('/', WEBrick::HTTPServlet::FileHandler, __FILE__)
+      TCPSocket.open(addr, port) do |c|
+        c.write("GET / HTTP/1.0\r\n")
+        junk = "X-Junk: #{' ' * 1024}\r\n"
+        assert_raise(Errno::ECONNRESET, Errno::EPIPE) do
+          loop { c.write(junk) }
+        end
+      end
+    }
+  end
+
+  def test_eof_in_chunk
+    log_tester = lambda do |log, access_log|
+      assert_equal 1, log.size
+      assert log[0].include?('ERROR bad chunk data size')
+    end
+    TestWEBrick.start_httpserver({}, log_tester){|server, addr, port, log|
+      server.mount_proc('/', ->(req, res) { res.body = req.body })
+      TCPSocket.open(addr, port) do |c|
+        c.write("POST / HTTP/1.1\r\nHost: example.com\r\n" \
+                "Transfer-Encoding: chunked\r\n\r\n5\r\na")
+        c.shutdown(Socket::SHUT_WR) # trigger EOF in server
+        res = c.read
+        assert_match %r{\AHTTP/1\.1 400 }, res
+      end
+    }
+  end
+
+  def test_big_chunks
+    nr_out = 3
+    buf = 'big' # 3 bytes is bigger than 2!
+    config = { :InputBufferSize => 2 }.freeze
+    total = 0
+    all = ''
+    TestWEBrick.start_httpserver(config){|server, addr, port, log|
+      server.mount_proc('/', ->(req, res) {
+        err = []
+        ret = req.body do |chunk|
+          n = chunk.bytesize
+          n > config[:InputBufferSize] and err << "#{n} > :InputBufferSize"
+          total += n
+          all << chunk
+        end
+        ret.nil? or err << 'req.body should return nil'
+        (buf * nr_out) == all or err << 'input body does not match expected'
+        res.header['connection'] = 'close'
+        res.body = err.join("\n")
+      })
+      TCPSocket.open(addr, port) do |c|
+        c.write("POST / HTTP/1.1\r\nHost: example.com\r\n" \
+                "Transfer-Encoding: chunked\r\n\r\n")
+        chunk = "#{buf.bytesize.to_s(16)}\r\n#{buf}\r\n"
+        nr_out.times { c.write(chunk) }
+        c.write("0\r\n\r\n")
+        head, body = c.read.split("\r\n\r\n")
+        assert_match %r{\AHTTP/1\.1 200 OK}, head
+        assert_nil body
+      end
+    }
+  end
 end
-- 
2.17.1