Fix the FileBasedDeadLetterQueueReconsumer dup issue #2476

liferoad · 2025-06-22T15:08:34Z

This PR changes the process from a "match -> read -> delete" sequence to a "match -> atomic rename -> read -> delete" sequence.

MoveFiles:

A new MoveFiles step is introduced, which renames each matched dead-letter file by adding a tmp- prefix to its name. This atomic "move" operation effectively claims the file, ensuring that subsequent scans won't find the original file and attempt to process it again.

After the files are moved, a Reshuffle transform is applied to guarantee that the rename operation is completed.
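To illustrate the claim-by-rename idea, here is a minimal sketch. It is not the code in this PR: it uses java.nio.file on a local filesystem as a stand-in for the pipeline's file access, and the class and method names (ClaimByRename, claim, demo) are hypothetical. It only shows that once a file has been renamed, a second attempt to claim the original name finds nothing.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Optional;

final class ClaimByRename {

  /** Hypothetical helper: claims a file by renaming it to tmp-<name>. */
  static Optional<Path> claim(Path file) {
    Path claimed = file.resolveSibling("tmp-" + file.getFileName());
    try {
      // Assumption: the filesystem supports an atomic rename.
      Files.move(file, claimed, StandardCopyOption.ATOMIC_MOVE);
      return Optional.of(claimed);
    } catch (NoSuchFileException e) {
      // The original name is gone, so an earlier scan already claimed it.
      return Optional.empty();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  private static String describe(Optional<Path> claimed) {
    return claimed.map(p -> p.getFileName().toString()).orElse("already-claimed");
  }

  /** Claims the same file twice and reports what each attempt saw. */
  static String demo() {
    try {
      Path dir = Files.createTempDirectory("dlq");
      Path file = Files.createFile(dir.resolve("file-1"));
      String first = describe(claim(file));
      String second = describe(claim(file));
      return first + "," + second;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

In this sketch the second attempt is treated as "someone else has it" rather than as an error, which is one possible policy and not necessarily what the pipeline should do.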

codecov · 2025-06-22T15:11:45Z

Codecov Report

Attention: Patch coverage is 88.46154% with 3 lines in your changes missing coverage. Please review.

Project coverage is 49.62%. Comparing base (43be033) to head (d893e98).
Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
...v2/cdc/dlq/FileBasedDeadLetterQueueReconsumer.java	88.46%	2 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##               main    #2476    +/-   ##
==========================================
  Coverage     49.62%   49.62%            
- Complexity     4808     5157   +349     
==========================================
  Files           941      941            
  Lines         57661    57675    +14     
  Branches       6233     6233            
==========================================
+ Hits          28614    28623     +9     
- Misses        27007    27010     +3     
- Partials       2040     2042     +2

Components	Coverage Δ
spanner-templates	`69.94% <ø> (-0.01%)`	⬇️
spanner-import-export	`68.61% <ø> (-0.03%)`	⬇️
spanner-live-forward-migration	`78.77% <ø> (ø)`
spanner-live-reverse-replication	`77.36% <ø> (ø)`
spanner-bulk-migration	`87.89% <ø> (ø)`

Files with missing lines	Coverage Δ
...v2/cdc/dlq/FileBasedDeadLetterQueueReconsumer.java	`72.54% <88.46%> (+4.36%)`	⬆️

... and 2 files with indirect coverage changes


damccorm · 2025-07-14T13:49:50Z

...n/src/main/java/com/google/cloud/teleport/v2/cdc/dlq/FileBasedDeadLetterQueueReconsumer.java

+
+        PCollection<ResourceId> movedFiles =
+            input
+                .apply("MoveFiles", ParDo.of(new MoveFiles()))


I think this change introduces the possibility of data loss which is worse. Specifically, we could run into the following scenario:

Bundle x contains file-1, file-2, file-3

file-1 is successfully renamed to tmp-file-1, but the bundle fails on renaming file-2

file-1 is now orphaned as tmp-file-1. I guess the way it is written, we would actually get recurrent errors which is a little better, but still not ideal.

Instead, I'd propose the following tweak:

We get a timestamp FOO associated with each generate sequence firing. We already get this for free, we'd just need to propagate it through

Instead of renaming the file tmp-<file name>, we rename it tmp-FOO-<file name>

Instead of only looking for <file name> when doing the rename operation, we look for <file name> and rename it OR if <file name> is not present, we look for tmp-FOO-<file name> and use that instead. If tmp-FOO-<file name> and <file name> are both not present, we log and move on, assuming that something else has claimed the file.

So the algorithm would be:

TriggerConsumeDLQ, AsFilePattern, MatchFiles, <new extract_ts function> -> (ts, file) tuples

In MoveFiles:

    renamed_file_name = "tmp-${ts}-${original_file_name}"
    if exists(renamed_file_name): return
    if !exists(original_file_name): log.warning('skipping, handled by different pass'); return
    mv(original_file_name, renamed_file_name)

Then the rest of the logic would stay the same.
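A rough Java sketch of the rename rule proposed above, for illustration only. It assumes a local filesystem accessed through java.nio.file instead of the pipeline's own file access, and TimestampedClaim, claim, and demo are hypothetical names. The ts argument stands for the timestamp FOO of the firing that matched the file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Optional;

final class TimestampedClaim {

  /** Hypothetical helper: returns the file to consume, or empty if another pass owns it. */
  static Optional<Path> claim(Path original, long ts) {
    Path renamed = original.resolveSibling("tmp-" + ts + "-" + original.getFileName());
    if (Files.exists(renamed)) {
      // A previous attempt of this same firing already renamed the file: reuse it.
      return Optional.of(renamed);
    }
    try {
      Files.move(original, renamed, StandardCopyOption.ATOMIC_MOVE);
      return Optional.of(renamed);
    } catch (NoSuchFileException e) {
      // Neither name exists: assume a different pass claimed the file and move on.
      return Optional.empty();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  private static String describe(Optional<Path> claimed) {
    return claimed.map(p -> p.getFileName().toString()).orElse("skipped");
  }

  /** First attempt, a retry with the same timestamp, then a pass with a different one. */
  static String demo() {
    try {
      Path dir = Files.createTempDirectory("dlq");
      Path file = Files.createFile(dir.resolve("file-1"));
      String first = describe(claim(file, 100L));
      String retry = describe(claim(file, 100L));
      String otherPass = describe(claim(file, 200L));
      return first + "," + retry + "," + otherPass;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

The retry with the same timestamp resolves to the already renamed file, which is what lets a failed bundle pick up file-1 again instead of leaving tmp-file-1 orphaned. A pass with a different timestamp finds neither name and skips.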

damccorm · 2025-07-14T13:53:57Z

...n/src/main/java/com/google/cloud/teleport/v2/cdc/dlq/FileBasedDeadLetterQueueReconsumer.java

                ParDo.of(new MoveAndConsumeFn(fileContents, fileMetadata))
                    .withOutputTags(fileContents, TupleTagList.of(fileMetadata)));

        results
            .get(fileMetadata)
-            .setCoder(MetadataCoder.of())
+            .setCoder(ResourceIdCoder.of())


I think this is inevitable, but we are breaking update compatibility. Maybe worth calling out in the PR title so that it gets into the release notes.

Fix the FileBasedDeadLetterQueueReconsumer dup issue

d893e98

pull-request-size bot added the size/M label Jun 22, 2025

liferoad added the ignore-for-release label Jun 22, 2025

Merge branch 'main' into fix-dlq-dup

3b27c6b

liferoad marked this pull request as ready for review July 12, 2025 19:54

liferoad requested a review from damccorm July 12, 2025 19:54

damccorm added bug-fix and removed ignore-for-release labels Jul 14, 2025

damccorm reviewed Jul 14, 2025
