Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

org.takes.Request.body().read() never returns -1 to indicate end of request body #1116

Open
golszewski86 opened this issue Jan 14, 2022 · 7 comments

Comments

@golszewski86
Copy link
Contributor

Consider the following example of Takes server and HTTP made to it:

package takes.test;

import com.jcabi.http.request.JdkRequest;
import java.io.IOException;
import java.io.InputStream;
import java.net.SocketTimeoutException;
import java.time.Duration;
import java.util.concurrent.atomic.AtomicBoolean;
import lombok.AllArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.takes.Request;
import org.takes.Response;
import org.takes.Take;
import org.takes.http.Exit;
import org.takes.http.FtBasic;
import org.takes.rs.RsEmpty;

/**
 *
 * @since 1.0
 */
@Slf4j
public class TakesTimeoutTest {

    public static void main(final String[] args) throws Exception {
        Thread.currentThread().setName("Client");
        final AtomicBoolean done = new AtomicBoolean(false);
        final Thread server = new Thread(new Server(done::get), "Server");
        server.start();
        final int timeout = (int) Duration.ofSeconds(5).toMillis();
        try {
            new JdkRequest("http://localhost:8080/")
                .method("POST")
                .body().set("bar").back()
                .timeout(timeout, timeout)
                .fetch();
            log.info("Received response");
        } catch (final IOException exc) {
            if (exc.getCause() instanceof SocketTimeoutException) {
                log.error("Request timed out!");
            } else {
                throw exc;
            }
        } finally {
            done.set(true);
        }
        server.join();
    }

    @AllArgsConstructor
    private static final class Server implements Runnable {

        private final Exit exit;

        @Override
        public void run() {
            try {
                log.info("Starting server on port 8080");
                new FtBasic(
                    new Take() {
                        @Override
                        public Response act(final Request req) throws Exception {
                            log.info("Received request");
                            final InputStream body = req.body();
                            int character;
                            while ((character = body.read()) >= 0) {
                                log.info("Read: {}", (char) character);
                            }
                            log.info("Returning response");
                            return new RsEmpty();
                        }
                    }, 8080
                ).start(this.exit);
            } catch (final Exception exc) {
                log.error("Server exception", exc);
            } finally {
                log.info("Server stopped");
            }
        }
    }
}

What it does

  • It starts a Takes HTTP server on port 8080
  • The server accepts any HTTP requests, logs information about received request, logs every character from the request body (assuming it is a plain text body), and logs information about returned response
  • The server runs until AtomicBoolean done value is set to true
  • Then it executes a HTTP request to localhost:8080 with body bar and timeout 5 seconds
  • When a response from server is received or request times out, it sets value of AtomicBoolean done to true, effectively stopping the server

What is the expected result

The expected result would be application logging the following information:

2022-01-14 08:24:45.491 [Server] INFO takes.test.TakesTimeoutTest - Starting server on port 8080
2022-01-14 08:24:45.526 [Server] INFO takes.test.TakesTimeoutTest - Received request
2022-01-14 08:24:45.527 [Server] INFO takes.test.TakesTimeoutTest - Read: b
2022-01-14 08:24:45.527 [Server] INFO takes.test.TakesTimeoutTest - Read: a
2022-01-14 08:24:45.527 [Server] INFO takes.test.TakesTimeoutTest - Read: r
2022-01-14 08:24:45.527 [Server] INFO takes.test.TakesTimeoutTest - Returning response
2022-01-14 08:24:45.538 [Client] INFO com.jcabi.http.request.BaseRequest - #fetch(POST localhost:8080 /): [204 No Content] in 34ms
2022-01-14 08:24:45.539 [Client] INFO takes.test.TakesTimeoutTest - Received response
2022-01-14 08:24:46.532 [Server] INFO takes.test.TakesTimeoutTest - Server stopped
  • The server starts
  • The client makes a HTTP request
  • The server logs information about received request, logs request body and then information about returned response
  • The client logs information about received response
  • Server stops

What is the actual result

2022-01-14 08:28:19.812 [Server] INFO takes.test.TakesTimeoutTest - Starting server on port 8080
2022-01-14 08:28:19.849 [Server] INFO takes.test.TakesTimeoutTest - Received request
2022-01-14 08:28:19.850 [Server] INFO takes.test.TakesTimeoutTest - Read: b
2022-01-14 08:28:19.850 [Server] INFO takes.test.TakesTimeoutTest - Read: a
2022-01-14 08:28:19.851 [Server] INFO takes.test.TakesTimeoutTest - Read: r
2022-01-14 08:28:24.834 [Server] INFO takes.test.TakesTimeoutTest - Returning response
2022-01-14 08:28:24.834 [Client] ERROR takes.test.TakesTimeoutTest - Request timed out!
2022-01-14 08:28:24.847 [Server] INFO takes.test.TakesTimeoutTest - Server stopped
  • The server starts
  • The client makes a HTTP request
  • The server logs information about received request, logs request body (at 08:28:19)
  • Nothing is being logged for 5 seconds (time span equal to the defined request timeout)
  • Client logs information about request timeout and server logs information about returned response (at 08:28:24 i.e. 5 seconds after the last character from request body has been read)
  • Server stops

What happens

The method java.io.InputStream.read(), according to the Java documentation, returns next byte from the stream, or -1 if the end of the stream has been reached However, it blocks until more input is available. In case of the Takes framework, the read() method in eventually delegated to the native java.net.SocketInputStream.socketRead0(FileDescriptor, byte[], int, int, int) method which blocks after the request body ("bar" in this case) has been read.

Basically this is the correct behavior for a TCP connection, because while the connection remains open the server socket does not know if the client is going to send more data, or when to expect more data. This is why the socket input stream blocks indefinitely (or until defined timeout). However, the client has already sent all the data and is awaiting a response.

After 5 seconds the client decides to close the connection because of the timeout. Then the java.io.InputStream.read() method finally returns -1 because now the socket "knows" the end of the stream has been reached. The server then prints "Returning response" and returns HTTP 204 No Content response. However, on the client side the request has already timed out and the client will not process response from the server.

What should happen

The TCP socket may not know whether to expect more data coming from the client, but the HTTP protocol features a Content-Length header containing information about number of bytes in the request body. The java.net.SocketInputStream is not aware about HTTP protocol, but the Takes framework should be. The org.takes.Request.body() method should return an instance of InputStream that is aware of the Content-Length header value, and its read() method returns -1 if the number of bytes equal to the Content-Length header value has been read.

Note this has been implemented in the org.takes.rq.RqPrint class, which uses org.takes.rq.RqChunk and org.takes.rq.RqLengthAware classes to stop reading from the input stream once the expected number of bytes has been received; however, org.takes.http.FtBasic (or rather the underlying org.takes.http.BkBasic) return a org.takes.Request instance with InputStream simply delegating to the native TCP stream.

@yegor256
Copy link
Owner

@golszewski86 many thanks for the report! do you know how to fix this? would be great if you can submit a pull request

@laeubi
Copy link

laeubi commented Jun 6, 2024

By the way it is even worse, as the content (e.g. POST) is never consumed the next request in a keep alive session will simply fill up the buffers... so additionally to providing a stream that only reads either chunked or content length data, takes should take care that additional data is disposed (unless connection-close is send).

@laeubi
Copy link

laeubi commented Jun 10, 2024

I now created the following wrapper, maybe worth to add to the library directly?

/**
 * Request decorator, to consume the body in either content-length or chucked-encoding
 *
 */
public class RqBody extends RqWrap {

	public RqBody(Request req) {
		super(new RqChunk(new RqLengthAware(req)));
	}

}

@laeubi
Copy link

laeubi commented Jun 12, 2024

I found another issue, if you ever close the stream (e.g. try with resources) this closes the underlying socket as well :-\

Improved decorator:

/**
 * Request decorator, to consume the body in either content-length or chucked-encoding
 *
 */
public class RqBody extends RqWrap {

	public RqBody(Request req) {

		super(new RequestOf(req::head, () -> safeClose(new RqChunk(new RqLengthAware(req)).body())));
	}

	private static final InputStream safeClose(InputStream original) {

		return new FilterInputStream(original) {


			@Override
			public void close() throws IOException {

				// do not close but instead discard any data that is left...
				while(read() > -1) {
					// discard
				}
			}
		};
	}

}

@yegor256
Copy link
Owner

yegor256 commented Aug 2, 2024

@laeubi maybe you can help and fix this issue?

@laeubi
Copy link

laeubi commented Aug 2, 2024

Yes I'll try to came up with a possible fix

@laeubi
Copy link

laeubi commented Aug 2, 2024

Just as a reference from the Spec:

Http 1.0:

Closing the connection cannot be used to indicate the end of a request body, since it leaves no possibility for the server to send back a response. Therefore, HTTP/1.0 requests containing an entity body must include a valid Content-Length header field. If a request contains an entity body and Content-Length is not specified, and the server does not recognize or cannot calculate the length from other fields, then the server should send a 400 (bad request) response.

Http 1.1:

A user agent SHOULD send Content-Length in a request when the method defines a meaning for enclosed content and it is not sending Transfer-Encoding. For example, a user agent normally sends Content-Length in a POST request even when the value is 0 (indicating empty content). A user agent SHOULD NOT send a Content-Length header field when the request message does not contain content and the method semantics do not anticipate such data.

A user agent that sends a request that contains a message body MUST send either a valid Content-Length header field or use the chunked transfer coding.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants