fix(storage): Update offset on resumable upload retry #12086

cjc25 · 2025-04-30T15:19:23Z

When resumable uploads retry, they may observe a flush offset that is past the start of the current send (in fact, the whole send might have already completed). In that case, we correctly skipped re-sending the data that was already flushed, but we didn't advance the offset to account for the skipped data.
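
For illustration, the offset adjustment described above might look like the following minimal Go sketch. It is not the code from this PR: `resumeSend`, `sendOffset`, and `flushOffset` are hypothetical names, and the sketch assumes the flush offset reported on retry is never behind the start of the current send.

```go
package main

import "fmt"

// resumeSend is a hypothetical helper, not part of the storage library.
// buf is the data for the current send, sendOffset is the object offset of
// buf[0], and flushOffset is the offset the server reports as already
// persisted when the upload is retried.
// It returns the bytes that still need to be sent and the offset to send
// them at.
func resumeSend(buf []byte, sendOffset, flushOffset int64) ([]byte, int64) {
	if flushOffset <= sendOffset {
		// Nothing from this send was persisted; resend all of it.
		return buf, sendOffset
	}
	skip := flushOffset - sendOffset
	if skip >= int64(len(buf)) {
		// The whole send already completed; nothing left to resend,
		// but the offset must still move past the skipped data.
		return nil, sendOffset + int64(len(buf))
	}
	// Skip the persisted prefix AND advance the offset by the same amount.
	return buf[skip:], flushOffset
}

func main() {
	rest, off := resumeSend([]byte("abcdefgh"), 100, 103)
	fmt.Printf("%q at %d\n", rest, off) // prints "defgh" at 103
}
```

The point of the sketch is that the returned slice and the returned offset move together: trimming the buffer without advancing the offset would write the remaining bytes at the wrong position.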

tritone

Talked with @cjc25 ; test coverage for this will come via a fix in storage-testbench.

cjc25 · 2025-04-30T17:56:47Z

The test failure here is only on "earliest version" (go 1.23.6) not "latest version" (go 1.24.0):

=== RUN   TestRetryTimeoutEmulated
=== RUN   TestRetryTimeoutEmulated/grpc
    client_test.go:2228: GetBucket: got unexpected error: rpc error: code = DeadlineExceeded desc = context deadline exceeded; want 503
    client_test.go:2233: GetBucket: got unexpected error rpc error: code = DeadlineExceeded desc = context deadline exceeded, want to match DeadlineExceeded.
=== RUN   TestRetryTimeoutEmulated/http
--- FAIL: TestRetryTimeoutEmulated (0.42s)
    --- FAIL: TestRetryTimeoutEmulated/grpc (0.21s)
    --- PASS: TestRetryTimeoutEmulated/http (0.21s)

Which is... surprising :). Maybe it's just a flaky comparison, but I don't think it's related to this PR. I can look later.

tritone · 2025-04-30T18:04:07Z

This is somewhat surprising for the emulator but might just be a flake; I triggered a rerun.

cjc25 · 2025-05-01T02:19:30Z

The issue is that we check specifically that the error is context.DeadlineExceeded, but this is a gRPC-layer DeadlineExceeded. I don't think that can be reliable: the context deadline is propagated to the server, so now there are two clocks, and the server might return a DeadlineExceeded error before the local context is actually cancelled.

I'll send a PR to check for gRPC errors too.

cjc25 · 2025-05-01T02:49:04Z

#12092 to fix the flake

cjc25 requested review from a team as code owners April 30, 2025 15:19

cjc25 changed the title ~~fix: Update offset on resumable upload retry~~ fix(storage): Update offset on resumable upload retry Apr 30, 2025

tritone approved these changes Apr 30, 2025

Merge branch 'main' into resumable-upload-grpc-fix

c9698dc

tritone added the automerge label Apr 30, 2025

Merge branch 'main' into resumable-upload-grpc-fix

e370b9c

tritone added the kokoro:force-run label Apr 30, 2025

kokoro-team removed the kokoro:force-run label Apr 30, 2025

kick go vet check

7dfd299

gcf-merge-on-green bot merged commit 6ce8fe5 into googleapis:main May 1, 2025
8 checks passed

gcf-merge-on-green bot removed the automerge label May 1, 2025

release-please bot mentioned this pull request Apr 30, 2025

chore(main): release storage 1.53.0 #12052

Merged

cjc25 deleted the resumable-upload-grpc-fix branch May 1, 2025 02:22
