+
Skip to content

Conversation

mrForza
Copy link
Contributor

@mrForza mrForza commented Aug 12, 2025

Before this patch "Finish bucket recovery step ..." logs were printed at
the end of recovery even if no buckets were successfully recovered, it led
to unnecessary log entries. This patch fixes the issue by adding an
additional check for the number of recovered buckets.

Closes #212

NO_DOC=bugfix

@mrForza mrForza force-pushed the mrforza/gh-212-improvement-of-rebalancer-logging branch from 6b1057d to 64cc837 Compare August 15, 2025 09:27
@mrForza mrForza requested a review from Serpentian August 15, 2025 09:42
Copy link
Collaborator

@Serpentian Serpentian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the comments for the first two commits, more comments are coming later) Thank you for working on this, good logging is crucial and allows us to investigate, what happened during incidents

@Serpentian Serpentian assigned mrForza and unassigned Serpentian Aug 20, 2025
@mrForza mrForza force-pushed the mrforza/gh-212-improvement-of-rebalancer-logging branch 3 times, most recently from 5a8b3f8 to f5c25f7 Compare August 22, 2025 15:52
@mrForza mrForza assigned Serpentian and unassigned mrForza Aug 23, 2025
@mrForza mrForza requested a review from Serpentian August 23, 2025 13:17
Copy link
Collaborator

@Serpentian Serpentian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, shit. I forgot to send the last message of review, I'm very sorry

@Serpentian Serpentian assigned mrForza and unassigned Serpentian Aug 25, 2025
@mrForza mrForza force-pushed the mrforza/gh-212-improvement-of-rebalancer-logging branch 2 times, most recently from 04c506f to ccff54f Compare September 10, 2025 07:47
@mrForza mrForza assigned Serpentian and unassigned mrForza Sep 10, 2025
@mrForza mrForza requested a review from Serpentian September 10, 2025 08:07
@Serpentian Serpentian assigned mrForza and unassigned Serpentian Sep 15, 2025
@mrForza mrForza force-pushed the mrforza/gh-212-improvement-of-rebalancer-logging branch 3 times, most recently from 46add65 to a1c095b Compare September 17, 2025 13:22
@mrForza mrForza assigned Serpentian and unassigned mrForza Sep 17, 2025
@mrForza mrForza requested a review from Serpentian September 19, 2025 18:23
Copy link
Collaborator

@Serpentian Serpentian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the final comments I have, the patch is pretty clean now)

@Serpentian Serpentian assigned mrForza and unassigned Serpentian Sep 30, 2025
Before this patch "Finish bucket recovery step ..." logs were printed at
the end of recovery even if no buckets were successfully recovered. It led
to unnecessary log records. This patch fixes the issue by adding an
additional check for the number of recovered buckets.

Part of tarantool#212

NO_DOC=bugfix
@mrForza mrForza force-pushed the mrforza/gh-212-improvement-of-rebalancer-logging branch 2 times, most recently from 871197d to 489b425 Compare October 3, 2025 08:18
@mrForza mrForza requested a review from Serpentian October 3, 2025 09:46
@mrForza mrForza assigned Serpentian and unassigned mrForza Oct 3, 2025
Copy link
Collaborator

@Serpentian Serpentian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Final nits

@Serpentian Serpentian assigned mrForza and unassigned Serpentian Oct 8, 2025
@mrForza mrForza force-pushed the mrforza/gh-212-improvement-of-rebalancer-logging branch from 489b425 to fce8f28 Compare October 9, 2025 09:21
@mrForza mrForza removed their assignment Oct 10, 2025
@mrForza mrForza requested a review from Serpentian October 10, 2025 12:40
Copy link
Collaborator

@Serpentian Serpentian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, but this one is critical, overlooked it somehow on the prev iterations(

@Serpentian Serpentian assigned mrForza and unassigned Serpentian Oct 13, 2025
This patch introduces logging of buckets' ids which were recovered
during recovery stage of storage.

Part of tarantool#212

NO_DOC=bugfix
This patch adds rebalancer routes' logging. The log file now
includes information about the source storage, the number of
buckets, and the destination storage where the buckets will
be moved.

Since the rebalancer service has changed logging of routes that
were sent, we change the `rebalancer/rebalancer.test.lua` and
`rebalancer/stress_add_remove_several_rs.test.lua` tests.

Part of tarantool#212

NO_DOC=bugfix
Before this patch the function `rebalancer_download_states` didn't
return information about replicaset from which the states could not
be downloaded. As a result, the log "Some buckets are not active
..." lacks of valuable information about unhealthy replicaset.

Now, we return `(replicaset.id, nil)` instead of `nil` in case when
rebalancer can't download state from this replicaset. Also we add
replicaset.id in "Some buckets are not active ..." log.

Also we change `rebalancer/rebalancer.test.lua` test which expected
the old "Some buckets are not active" log without replicaset.id.

Closes tarantool#212

NO_DOC=bugfix
@mrForza mrForza force-pushed the mrforza/gh-212-improvement-of-rebalancer-logging branch from fce8f28 to dcfa98a Compare October 14, 2025 14:18
@mrForza mrForza requested a review from Serpentian October 14, 2025 14:39
@mrForza mrForza assigned Serpentian and unassigned mrForza Oct 14, 2025
local hanged_bucket_id_2 = vtest.storage_first_bucket(g.replica_1_a)
start_bucket_move(g.replica_1_a, g.replica_2_a, hanged_bucket_id_2)
t.helpers.retrying({}, function()
t.assert(g.replica_1_a:grep_log('Error during recovery of bucket 1'))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: it's better to wakeup the recovery there in the loop. And in other places, please. E.g. in the test_rebalancer_routes_logging rebalancer should be waken up, when waiting for the Balanced ok

{timeout = consts.REBALANCER_GET_STATE_TIMEOUT})
if state == nil then
return
return nil, replicaset.id
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be carefully rebased, now error is returned from the rebalancer_download_states. I'd propose to introduce the new error code, which will say, that the storage doesn't have all bucket in proper state, and set err there if it's nil (it'll happen, in case the function really returned nil).

In that err (or any other there) we'll just set the err.replicaset, smth like that:

err.replicaset = rs_id

And we'll print the whole err, not just the replicaset's id

@Serpentian Serpentian assigned mrForza and unassigned Serpentian Oct 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve logging of rebalancer and recovery

3 participants

点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载