
Conversation

@sephiroth-j

Motivation:

Add a reusable handler that compresses (encodes) HTTP requests and closes the long-pending feature request #2132.

Modifications:

Adds a new `ChannelOutboundHandler` named `HttpRequestCompressor` to the `io.netty.handler.codec.http` package. The handler supports the same encodings as `HttpContentEncoder` for HTTP responses.

Result:

After the change, you will be able to compress the body of HTTP requests.

```java
HttpClient
  .create()
  // add the handler
  .observe((con, newState) -> {
    // use CONFIGURED for HTTP/1.1 connections
    // use STREAM_CONFIGURED for HTTP/2 connections
    if (newState == HttpClientState.CONFIGURED || newState == HttpClientState.STREAM_CONFIGURED) {
      con.addHandlerLast(new HttpRequestCompressor("gzip"));
    }
  })
  .build();
```

Fixes #2132

@franz1981 force-pushed the 4.2 branch 2 times, most recently from c0965ca to 5ca90a7 on June 28, 2025
@sephiroth-j force-pushed the feature/add-http-request-compressor branch 2 times, most recently from 3f5d3ac to 44695d7 on July 4, 2025
@sephiroth-j force-pushed the feature/add-http-request-compressor branch from 44695d7 to ac636d2 on July 9, 2025
@sephiroth-j requested review from hyperxpro and yawkat on July 9, 2025
```java
 * @see StandardCompressionOptions#deflate() default deflate options
 */
public HttpRequestCompressor(String preferredEncoding, int contentSizeThreshold,
                             CompressionOptions... compressionOptions) {
```
Member

Maybe we should use a builder pattern preemptively. Netty has a history of constructor parameters getting out of control over time.
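
For illustration, such a builder could wrap the constructor shown above. The class name, defaults, and methods below are purely hypothetical, not part of this PR:

```java
import io.netty.handler.codec.compression.CompressionOptions;
import io.netty.handler.codec.http.HttpRequestCompressor;

// Hypothetical builder sketch; HttpRequestCompressorBuilder and its defaults
// are illustrative only and do not exist in this PR.
public final class HttpRequestCompressorBuilder {
    private String preferredEncoding = "gzip";
    private int contentSizeThreshold;
    private CompressionOptions[] compressionOptions = new CompressionOptions[0];

    public HttpRequestCompressorBuilder preferredEncoding(String encoding) {
        this.preferredEncoding = encoding;
        return this;
    }

    public HttpRequestCompressorBuilder contentSizeThreshold(int threshold) {
        this.contentSizeThreshold = threshold;
        return this;
    }

    public HttpRequestCompressorBuilder compressionOptions(CompressionOptions... options) {
        this.compressionOptions = options;
        return this;
    }

    public HttpRequestCompressor build() {
        // delegates to the constructor shown in the diff above
        return new HttpRequestCompressor(preferredEncoding, contentSizeThreshold, compressionOptions);
    }
}
```

A builder would let new options be added later without breaking the existing constructor signatures.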

```java
    // received HttpContent without previous HttpRequest ... likely because we skipped it
    ctx.write(content, promise);
} else {
    buf.writeBytes(content);
```
Member

We're leaking the content here, be it a ByteBuf or a HttpContent object.

This also buffers the contents uncontrollably. We should start streaming out compressed chunks as soon as we hit the minimum threshold. Imagine a big file upload going through here, for instance.
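
For the leak part, a minimal sketch of balancing the reference count could look like this; the helper itself is illustrative, and `buf` mirrors the field name in the diff above:

```java
import io.netty.buffer.ByteBuf;
import io.netty.handler.codec.http.HttpContent;
import io.netty.util.ReferenceCountUtil;

// Illustrative only: copy the readable bytes of the outbound message into the
// internal buffer, then release the source so its reference count is balanced.
static void bufferAndRelease(ByteBuf buf, Object msg) {
    if (msg instanceof HttpContent) {
        buf.writeBytes(((HttpContent) msg).content());
    } else if (msg instanceof ByteBuf) {
        buf.writeBytes((ByteBuf) msg);
    }
    ReferenceCountUtil.release(msg); // avoid leaking the ByteBuf/HttpContent
}
```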

Author

This is not how chunked HTTP requests work. Content encoding (compression) does not apply to each individual chunk but to the body as a whole. The receiver merges all chunks in the order in which they were received and applies the reverse operation of the content encoding (decompression) to the merged content. If you were to send individually compressed chunks, the recipient would end up with garbage. This handler behaves in a similar way to a receiver: it waits until it has received all the chunks and then compresses the content that was previously split into chunks.
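
As a plain-JDK illustration of that receiver-side view (not code from this PR), decoding happens once over the merged body:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.List;
import java.util.zip.GZIPInputStream;

// Illustration of the argument above: the receiver joins all chunks in arrival
// order and applies the inverse of the Content-Encoding to the body as a whole.
static byte[] decodeGzipBody(List<byte[]> receivedChunks) throws Exception {
    ByteArrayOutputStream joined = new ByteArrayOutputStream();
    for (byte[] chunk : receivedChunks) {
        joined.write(chunk); // merge the chunks first
    }
    // then decompress the merged body in one pass
    try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(joined.toByteArray()))) {
        return in.readAllBytes();
    }
}
```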

Contributor

You don't need to do full compression on chunks. Just stream the chunks through the compressor and on the LastHttpContent, finish the compression.

@chrisvest's concern is valid. If someone is sending a 10 GB payload, we will end up with the whole compressed payload in memory before this handler receives LastHttpContent.
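
For illustration, that streaming shape could look roughly like this, using an EmbeddedChannel around a gzip encoder. The wrapper class below is a sketch only; the real handler would pick the encoder based on the negotiated encoding:

```java
import io.netty.buffer.ByteBuf;
import io.netty.buffer.Unpooled;
import io.netty.channel.embedded.EmbeddedChannel;
import io.netty.handler.codec.compression.ZlibCodecFactory;
import io.netty.handler.codec.compression.ZlibWrapper;

// Illustrative only: each chunk's payload is pushed through the encoder and
// whatever compressed output is available is emitted immediately; on the
// LastHttpContent the channel is finished so the gzip trailer is written too.
final class StreamingGzipSketch {
    private final EmbeddedChannel encoder =
            new EmbeddedChannel(ZlibCodecFactory.newZlibEncoder(ZlibWrapper.GZIP));

    ByteBuf compressChunk(ByteBuf chunk) {
        encoder.writeOutbound(chunk.retain());
        return drain();
    }

    ByteBuf finish() {
        encoder.finish(); // emits the trailing compressed data and gzip footer
        return drain();
    }

    private ByteBuf drain() {
        ByteBuf out = Unpooled.buffer();
        for (ByteBuf part; (part = encoder.readOutbound()) != null; ) {
            out.writeBytes(part);
            part.release();
        }
        return out;
    }
}
```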

Author

The handler still has to wait and buffer the payload until the size reaches the configured threshold, and it can happen that the size is still below the threshold at the end. It also depends on the compressor: the ZstdEncoder, for example, waits until flush is called, and the BrotliEncoder waits for close (its encode method does nothing).

So in the end it just gets more complicated and the result can still be the same: the content is buffered, either by the handler or by the encoder. I would therefore like to leave it as it is.

Large files are best compressed in advance anyway and not by using on-the-fly compression.

Contributor

We should support streaming compression. If brotli doesn't support that, that's a bug in the brotli impl.

Contributor

Brotli4j supports streaming compression. It is already implemented in Netty's response encoder.

Author

Compressed chunks are now written to the context as quickly as possible, i.e. as soon as the threshold value is reached.

The following situations are taken into account:

  • the threshold value is reached with the first chunk
  • the threshold is reached with a later chunk
  • the threshold is reached with the LastHttpContent chunk. In this case, the request is sent as a compressed FullHttpRequest.
  • the threshold is never reached. In this case, the request is sent as an uncompressed FullHttpRequest.
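
For illustration, the last-chunk side of those situations could be sketched like this (`buf`, `request`, `encoderChannel` and `contentSizeThreshold` are assumed handler fields; the `write*` helpers are placeholders, not the PR's actual method names):

```java
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelPromise;
import io.netty.handler.codec.http.LastHttpContent;

// Sketch of the last-chunk decision described above; helper methods are
// placeholders for the actual logic in the PR.
private void onLastContent(ChannelHandlerContext ctx, LastHttpContent last, ChannelPromise promise) {
    if (encoderChannel != null) {
        // streaming already started with an earlier chunk: compress the final
        // bytes, finish the encoder and write the trailing data downstream
        writeCompressedLast(ctx, last, promise);
    } else {
        buf.writeBytes(last.content());
        last.release();
        if (buf.readableBytes() >= contentSizeThreshold) {
            // threshold reached only with the last chunk: send one compressed FullHttpRequest
            writeCompressedFullRequest(ctx, request, buf, promise);
        } else {
            // threshold never reached: send the original, uncompressed FullHttpRequest
            writeUncompressedFullRequest(ctx, request, buf, promise);
        }
    }
}
```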

The hex values of the compressed content used in the test were checked with other tools to ensure that they can be properly decoded. The command was

```sh
echo -n "${hexdump}" | xxd -r -p | $command
```

where `$command` is one of the following:

  • `gzip -d`
  • `zstd -d`
  • `zlib-flate -uncompress`
  • `snappy -d`
  • `brotli -d`

I have inserted line breaks for better readability. For zstd, br and snappy, you can clearly see the hex values of the individual characters used as chunks if you look at the last two characters of the encoded chunks.

Contributor

@hyperxpro left a comment

Some gotchas


```java
private void handleChunkedHttpRequest(ChannelHandlerContext ctx, ChannelPromise promise, HttpRequest req) {
    // only process requests that are allowed to have content
    if (req.method().equals(HttpMethod.POST)
```
Contributor

@hyperxpro Jul 19, 2025

The RFC does not prevent GET from having a payload.

See: https://datatracker.ietf.org/doc/html/rfc7231#section-4.3.1

We should not have these checks.

Author

However, it is common practice that only the POST, PUT and PATCH methods carry content. With GET in particular, the content is ignored or even rejected by most servers and clients. I therefore think this is a fair assumption and optimization. But I have to admit that if there are such checks for chunked requests, they should also be done for FullHttpRequest, and they should be documented.

Contributor

Let's not change semantics. You can write an HTTP server in Netty that accepts a 1 GB payload in a GET request. There is nothing stopping that, and the RFC does not forbid it either.

```java
if (encoderChannel == null) {
    buf.writeBytes(content);
    // buffered content exceeds threshold, start streaming and send stored HttpRequest
    if (buf.isReadable() && buf.readableBytes() >= contentSizeThreshold) {
```
Contributor

The content threshold applies to the Content-Length of the request, not to individual chunks. If the Content-Length header is present, this logic will apply to the whole request (including upcoming chunks). If the request has Transfer-Encoding: chunked, it should always compress since there is no visibility of the content length.
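
For illustration, the rule being described could look roughly like this (a sketch only, not code from this PR):

```java
import io.netty.handler.codec.http.HttpRequest;
import io.netty.handler.codec.http.HttpUtil;

// Illustration of the rule described above: apply the size threshold only when
// a Content-Length is known; with Transfer-Encoding: chunked the length is not
// known up front, so compression is applied unconditionally.
static boolean shouldCompress(HttpRequest req, int contentSizeThreshold) {
    if (HttpUtil.isTransferEncodingChunked(req)) {
        return true;
    }
    return HttpUtil.getContentLength(req, -1L) >= contentSizeThreshold;
}
```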

Author

> The content threshold applies to the Content-Length of the request, not to individual chunks.

Yes and no. For this handler, it is defined as the size of the content, independent of the Content-Length header. The Javadoc for this parameter is very clear in this regard:

> the size in byte the http body must have before compressing the request

> this logic will apply to the whole request (including upcoming chunks).

That is exactly what it is intended for.

If this were ignored, the concept of a threshold for chunked requests would be pure nonsense. The idea of the threshold is to skip compression for small content, where compressing would just be a waste of resources and could even produce a result that is bigger than the original (as you can see from the examples in the test).

Contributor

This is expected behaviour. The threshold only applies if there is a Content-Length. Nginx and others follow the same approach.

https://nginx.org/en/docs/http/ngx_http_gzip_module.html

You can have chunks that are smaller than the threshold, and in that case you cannot skip compressing them, or you would send corrupted content into an already compressed request.

Say the threshold is 1 MB. The request will generate 5 chunks, the first 4 will be 2 MB, and the last one will be 100 KB. In this case, your logic will compress the first four chunks properly and skip the last one, resulting in a corrupted stream.

I'd suggest reading the Nginx source to get a more in-depth understanding.

Author

> In this case, your logic will compress the first four chunks properly and skip the last one, resulting in a corrupted stream.

No, as soon as the threshold value for all previous chunks together is reached, the subsequent chunks are also compressed. The channel with the encoder is only created when the threshold value is reached (handleHttpContent, line 291). If a new chunk is received, it is compressed immediately without being written to the internal buffer and without further checking whether the threshold value has been reached (handleHttpContent, line 304).
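
For illustration, the flow being described is roughly the following sketch (`buf`, `encoderChannel` and `contentSizeThreshold` are assumed handler fields; `newEncoderChannel` and `writeCompressed` are hypothetical helpers, not the PR's actual method names):

```java
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelPromise;
import io.netty.handler.codec.http.HttpContent;

// Sketch of the gating described above: chunks are buffered only while the
// encoder channel does not exist yet; once the accumulated size reaches the
// threshold the encoder is created and every further chunk is compressed
// immediately, without buffering or re-checking the threshold.
private void onContent(ChannelHandlerContext ctx, HttpContent content, ChannelPromise promise) {
    if (encoderChannel == null) {
        buf.writeBytes(content.content());
        content.release();
        if (buf.readableBytes() >= contentSizeThreshold) {
            encoderChannel = newEncoderChannel();   // start streaming from here on
            writeCompressed(ctx, buf.readRetainedSlice(buf.readableBytes()), promise);
        } else {
            promise.setSuccess();                   // below threshold, keep buffering
        }
    } else {
        // threshold already reached earlier: compress this chunk right away
        writeCompressed(ctx, content.content().retain(), promise);
        content.release();
    }
}
```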

Contributor

I really don't find this approach scalable because we will keep holding onto buffers until the threshold is reached to determine whether to compress or not. We can have a threshold of 100 MB, send 10 chunks of 10 MB each, and it will keep on accumulating in memory until 100 MB is reached.

@chrisvest WDYT?

Author

That is the trade-off. I just give the user the option and let them decide.

Other implementations also have the option of compressing only certain mime types in order to skip compression for poorly compressible or already compressed content.

```java
 * @param in the remaining content to compress
 * @param out the destination of the compressed content
 */
private void finishCompress(EmbeddedChannel encoderChannel, ByteBuf in, ByteBuf out) {
```
Contributor

Do we really need EmbeddedChannel?

