refactor(storage): Pipeline gRPC writes. #12422
Conversation
Modify the gRPC writer to send additional data while waiting for the current chunk to flush. This is a substantial refactor.

Per the Go io.Writer interface contract, we must never modify or retain the slice that the caller provides to Write. However, that doesn't mean we have to copy every byte into a writer-controlled buffer: we can refer to the byte slice in place. Therefore, if callers call Write() with more bytes than the chunk size, we can send them to the service immediately as long as we don't return from Write() until we no longer need the caller's slice.

This is an in-place refactor which retains the existing flush behavior, so its benefits are most obvious in somewhat unlikely scenarios. For example: a 100MiB upload to a bucket in a remote region with chunk size 256KiB can see a latency reduction of ~35%.

There are two followup investigations made possible by this refactor. The first is to flush less frequently when the caller provides write slices much larger than the chunk size. This may provide an even larger throughput improvement for cases like the one above, and is straightforward to implement.

The second is to flush more frequently when the caller provides write slices much smaller than the chunk size. (E.g. split a 16MiB chunk into 2x8MiB sub-chunks, and flush each when they're full.) This can avoid pipeline stalls in more scenarios, by increasing the likelihood that part of the chunk is available to buffer data without waiting for a flush acknowledgement inline.
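To make the mechanism concrete, here is a minimal sketch of the pipelined Write() shape described above. The type and field names (pipelinedWriter, writeCommand, commands, donec, streamResult) are hypothetical stand-ins, not the PR's actual code; the point is only that Write() hands the caller's slice to a sender goroutine and blocks until the sender no longer needs it, which satisfies the io.Writer contract without copying every byte.

```go
// Minimal sketch, hypothetical names: Write enqueues the caller's slice for
// the sender goroutine and returns only after the sender signals that it no
// longer needs p (flushed or copied), so the slice is never retained.
type writeCommand struct {
	p    []byte
	done chan struct{}
}

type pipelinedWriter struct {
	commands     chan writeCommand // consumed by a sender goroutine
	donec        chan struct{}     // closed after streamResult is set on permanent failure or close
	streamResult error
}

func (w *pipelinedWriter) Write(p []byte) (int, error) {
	done := make(chan struct{})
	select {
	case w.commands <- writeCommand{p: p, done: done}:
	case <-w.donec: // the writer already failed or closed asynchronously
		return 0, w.streamResult
	}
	select {
	case <-done: // the sender has flushed (or copied) everything it needs from p
		return len(p), nil
	case <-w.donec:
		return 0, w.streamResult
	}
}
```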
tritone
left a comment
Overall this is looking pretty good to me; the only significant comment is the one about channel overhead for many small writes. Profiling will help us understand whether that contributes significant overhead.
I think a bigger question is how to test this to make sure it doesn't cause regressions. @vadlakondaswetha has already done some GCSFuse performance testing using this PR which has helped iron out some issues. We'll also need to do a close comparison of throughput, CPU and memory between the release candidate with this PR and the previous release.
Overall though this seems like a really good refactor and adds clarity to the flows around different types of writes.
	w.initializeSender()
} else {
	select {
	case <-w.donec:
Is this case intended for when you send on a writer that's already closed? If so, what is supposed to happen?
Either the writer is already closed, or the writer gets closed due to a permanent error before it can process the provided command. Basically this is how we handle asynchronous failures which the calling code may not have detected yet.
The underlying writer promises to set w.streamResult before closing w.donec
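To illustrate the ordering guarantee being relied on here, a sketch using the same hypothetical names as the earlier Write() sketch (not the PR's exact code): the sender stores the terminal error before closing donec, so any goroutine woken by the closed channel reads a settled result.

```go
// Sketch (assumed names): record the terminal result first, close donec second,
// so a goroutine unblocked by <-w.donec always observes the error that caused
// the shutdown.
func (w *pipelinedWriter) finish(err error) {
	w.streamResult = err // set the result...
	close(w.donec)       // ...then wake every pending select on donec
}
```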
	return true

	done := make(chan struct{})
	cmd := &gRPCWriterCommandWrite{p: p, done: done}
Did you profile this with many small-buffer writes? Just wanted to make sure that the extra overhead of the channel for each cmd is not too much of an issue.
Discussed this offline; sounds like it could be worth it to just preserve this outer channel between Write() calls rather than closing/recreating each time. But profiling will tell us what the actual overhead is.
See http://b/422440765#comment4 and comment 5 for more discussion. I'm inclined to merge like this and improve afterwards if necessary, if that's alright.
To summarize for any reader without Google internal bug access: there is a minor (few % on a very low base) CPU increase and therefore latency increase and IOPS decrease for 1-byte operations. Once op size goes above a de minimis value, this effect disappears and the throughput increase from pipelining results in slightly lower mean and tail latencies and higher IOPS. There is still a small % CPU increase, but since the write is now large enough to block on IO for more of the time, the CPU base is even lower.
We could trade the simplicity of a write completion channel per op for a somewhat less forgiving interface with a write completion channel per storage.Writer. This is consistent with the existing API contract based on io.PipeWriter, and gets us ~1/3 of the new CPU cost back. I think it's reasonable to do that in a focused followup PR.
Sounds good, thanks for the summary here.
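For readers following the thread, the followup described above (one completion channel per storage.Writer instead of one per Write call) would look roughly like this sketch. Names are hypothetical and it reuses the writeCommand type from the earlier sketch; the trade-off is that only one Write may be in flight at a time, mirroring io.PipeWriter semantics, in exchange for dropping the per-call channel allocation.

```go
// Sketch of the per-Writer acknowledgement channel alternative (hypothetical
// names). Write no longer allocates and closes a channel on every call, but
// callers must not issue concurrent Writes.
type ackedWriter struct {
	commands chan writeCommand
	writeAck chan error // allocated once and reused by every Write call
}

func (w *ackedWriter) Write(p []byte) (int, error) {
	w.commands <- writeCommand{p: p}
	if err := <-w.writeAck; err != nil { // the sender replies once it is done with p
		return 0, err
	}
	return len(p), nil
}
```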
storage/grpc_writer.go (outdated)
}
w.streamSender = w.pickBufferSender()

runWriteLoop := func(ctx context.Context) error {
nit: This additional wrapper seems unneeded?
w.lastErr is just to persist the prior failure across multiple run() retries, and it's convenient to get that from the writeLoop return instead of remembering it each place we break out of that loop.
But you're right it doesn't have to be named: moved these two lines into the run() call directly as an anonymous function.
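The resulting shape is roughly the following. This is a sketch assuming a retry helper with the shape run(ctx, fn); the real helper's signature may differ.

```go
// Sketch: the anonymous function records why the write loop stopped so that a
// later retry attempt (or Close) can still report the prior failure.
err := run(ctx, func(ctx context.Context) error {
	w.lastErr = w.writeLoop(ctx)
	return w.lastErr
})
```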
type gRPCWriteRequestParams struct {
	appendable   bool
	bucket       string
	routingToken *string
Is having an empty string routing token valid?
No it is not. Do you suggest switching this from *string to string and copying from e.RoutingToken in maybeHandleRedirectionError?
Review comments
Don't retain user write buffers after returning from Write
}
requests := make(chan gRPCBidiWriteRequest, w.sendableUnits)
// only one request will be outstanding at a time.
requestAcks := make(chan struct{}, 1)
nice fix!
Correct issues with reduced ack tracking.
Update attr size during completions
@tritone IIUC this is ready to merge now
Oneshot writes did not report progress prior to #12422. This fixes them so that they also don't report progress after that. Also add an emulator test, since it turns out our first test of that behavior was in the integration tests!
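The shape of such an emulator test is roughly the following. This is a sketch, not the PR's actual test: the client and bucket setup are placeholders, and it assumes STORAGE_EMULATOR_HOST points at a running testbench with the bucket already created.

```go
package storage_test

import (
	"context"
	"testing"

	"cloud.google.com/go/storage"
)

// Sketch of the behavior under test: a oneshot write (ChunkSize 0) should
// complete without ever invoking ProgressFunc.
func TestOneshotWriteReportsNoProgress(t *testing.T) {
	ctx := context.Background()
	client, err := storage.NewClient(ctx) // assumes STORAGE_EMULATOR_HOST is set
	if err != nil {
		t.Fatal(err)
	}
	defer client.Close()

	w := client.Bucket("test-bucket").Object("oneshot-obj").NewWriter(ctx)
	w.ChunkSize = 0 // oneshot: the whole payload is sent in a single request
	progressCalls := 0
	w.ProgressFunc = func(int64) { progressCalls++ }

	if _, err := w.Write([]byte("hello world")); err != nil {
		t.Fatal(err)
	}
	if err := w.Close(); err != nil {
		t.Fatal(err)
	}
	if progressCalls != 0 {
		t.Errorf("oneshot write reported progress %d times, want 0", progressCalls)
	}
}
```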
storage.Writer took an assumption that CloseWithError() could be called more than once, and was thread-safe with respect to concurrent Write(), Flush(), and Close() calls. This was not honored in the refactor in #12422. Modify Writer so that it is thread-safe to provide these behaviors, and support repeated Close() and CloseWithError() calls. To address this, we start the sender goroutine earlier, and gather the first buffer in that goroutine. It's possible that some workloads which gather less than one buffer worth of data with a sequence of small writes will observe a performance hit here, since those writes used to be direct copies but will now be a channel ping-pong. If that's an issue, it could be improved by wrapping the buffer in a mutex and doing more explicit concurrency control.
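For reference, the mutex-based alternative mentioned at the end would look roughly like this. It is a simplified sketch with hypothetical names (and would need a sync import); a production version would also coordinate Flush/Close and error propagation.

```go
// Simplified sketch: small Writes append into a shared buffer under a mutex
// instead of ping-ponging through the sender goroutine's channel.
type lockedBufferWriter struct {
	mu        sync.Mutex
	buf       []byte
	chunkSize int
	flushes   chan []byte // full chunks handed to the sender goroutine
}

func (w *lockedBufferWriter) Write(p []byte) (int, error) {
	w.mu.Lock()
	w.buf = append(w.buf, p...) // copy, so the caller's slice is never retained
	var full []byte
	if len(w.buf) >= w.chunkSize {
		full, w.buf = w.buf, nil
	}
	w.mu.Unlock()
	if full != nil {
		w.flushes <- full // hand off outside the lock to avoid blocking other writers
	}
	return len(p), nil
}
```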
🤖 I have created a release *beep* *boop*

## [1.57.0](https://togithub.com/googleapis/google-cloud-go/compare/storage/v1.56.1...storage/v1.57.0) (2025-09-23)

### Features

* **storage/control:** Add new GetIamPolicy, SetIamPolicy, and TestIamPermissions RPCs ([d73f912](https://togithub.com/googleapis/google-cloud-go/commit/d73f9123be77bb3278f48d510cd0fb22feb605bc))
* **storage:** Post support dynamic key name ([#12677](https://togithub.com/googleapis/google-cloud-go/issues/12677)) ([9e761f9](https://togithub.com/googleapis/google-cloud-go/commit/9e761f961a2c4351b3e0793ed655314ac5853903))
* **storage:** WithMeterProvider allows custom meter provider configuration ([#12668](https://togithub.com/googleapis/google-cloud-go/issues/12668)) ([7f574b0](https://togithub.com/googleapis/google-cloud-go/commit/7f574b01e0b454c1ef5c13e6a58075e394ee990d))

### Bug Fixes

* **storage:** Free buffers in Bidi Reader ([#12839](https://togithub.com/googleapis/google-cloud-go/issues/12839)) ([bc247fd](https://togithub.com/googleapis/google-cloud-go/commit/bc247fdc3f5234a8bd6934e58d5b0b578f1335cb))
* **storage:** Make Writer thread-safe. ([#12753](https://togithub.com/googleapis/google-cloud-go/issues/12753)) ([9ea380b](https://togithub.com/googleapis/google-cloud-go/commit/9ea380bea5b980a9054d201be4f315a195da2182))
* **storage:** No progress report for oneshot write ([#12746](https://togithub.com/googleapis/google-cloud-go/issues/12746)) ([b97c286](https://togithub.com/googleapis/google-cloud-go/commit/b97c286ec369a10a81b1a8a3a1aae18b46d2dfbc))

### Performance Improvements

* **storage:** Pipeline gRPC writes ([#12422](https://togithub.com/googleapis/google-cloud-go/issues/12422)) ([1f2c5fe](https://togithub.com/googleapis/google-cloud-go/commit/1f2c5fe2843724302086fe04cb8dab8b515969c5))

This PR was generated with [Release Please](https://togithub.com/googleapis/release-please). See [documentation](https://togithub.com/googleapis/release-please#release-please).
I'd like to pick up googleapis/google-cloud-go#12839 (a bug fix) and googleapis/google-cloud-go#12422 (a write performance improvement). This PR is the result of running `bazel run go get cloud.google.com/go/storage`