这是indexloc提供的服务,不要输入任何密码
Skip to content

Conversation

@Rakshith-R
Copy link
Contributor

Describe what this PR does

rbd: add support for changed block tracking RPCs

This commit adds support for GetMetadataAllocated
and GetMetadataDelta RPCs.
CSI Snapshot Metadata Service RPCs

Updates: #5346

Checklist:

  • Commit Message Formatting: Commit titles and messages follow
    guidelines in the developer
    guide
    .
  • Reviewed the developer guide on Submitting a Pull
    Request
  • Pending release
    notes

    updated with breaking and/or notable changes for the next major release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

Show available bot commands

These commands are normally not required, but in case of issues, leave any of
the following bot commands in an otherwise empty comment in this PR:

  • /retest ci/centos/<job-name>: retest the <job-name> after unrelated
    failure (please report the failure too!)

@mergify mergify bot added the component/rbd Issues related to RBD label May 30, 2025
@Rakshith-R Rakshith-R requested review from a team May 30, 2025 10:04
@Rakshith-R Rakshith-R force-pushed the add-snapshot-diff branch from b60a6f5 to 3892027 Compare June 2, 2025 07:06

baseRBDSnap, err := genSnapFromSnapID(ctx, baseSnapshotID, cr, req.GetSecrets())
if err != nil {
return status.Error(codes.Internal, err.Error())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to handle NotFound?

@Rakshith-R Rakshith-R force-pushed the add-snapshot-diff branch 2 times, most recently from 85a53a1 to 288a354 Compare June 2, 2025 10:43
@Rakshith-R Rakshith-R force-pushed the add-snapshot-diff branch 2 times, most recently from 56db368 to 6a3d211 Compare June 2, 2025 12:18
@Rakshith-R
Copy link
Contributor Author

Rakshith-R commented Jun 2, 2025

Thanks for the quick detailed review

The new main modifications are:

  • moved the new calls to snap_diff.go file
  • debug logs for feature support check errors
  • refactored to have a single flow with sendStreamResponse handling which response to send allocated or delta
  • modified return statement and ensured they are formatted and have correct error codes
  • added validation for req parameters
  • added check for snapshot's existence + returns correct error code now
  • added additional checks for startingOffset (add out of range error code as specified in the spec)
  • additional checks for maxResults and internal value to limit memory usage ([spec](spec allows us to send lower number of blocks as well))

failing golang lintci is being handled in #5351

Please take a look again

@mergify
Copy link
Contributor

mergify bot commented Jun 2, 2025

This pull request now has conflicts with the target branch. Could you please resolve conflicts and force push the corrected changes? 🙏

Copy link
Member

@nixpanic nixpanic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1st pass going through this. Needs cleanup and clear splitting of internal RBD things and ControllerServer Snapshot usage. Extend the types.Snapshot interface with required functions, so that ControllerServer does not need to use rbdSnap types directly.


// streamDoneErrCode is a custom error code used to indicate that the
// stream was closed by the client.
streamDoneErrCode = -1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to not use google.golang.org/grpc/codes constants? Or, maybe base it on return values of read(), like EOF and so on?

@Rakshith-R Rakshith-R force-pushed the add-snapshot-diff branch 3 times, most recently from 72390e5 to 4ff0a22 Compare June 5, 2025 12:26
)

type ErrorCode interface {
// Code returns the error code.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment Code -> ErrorCode.

Can you also add a comment for the interface itself, with a small explanation how (which kind of errors) it can be used?

// StreamMetadata streams the block metadata for a snapshot.
// If baseRBDSnap is provided, it calculates the delta between the
// current snapshot and the base snapshot.
func (rbdSnap *rbdSnapshot) StreamMetadata(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the feeling this function does too much. The rbdSnapshot struct should probably not send anything, but instead return []*csi.BlockMetadata to the caller which can then send it out? Would that not make it a cleaner implementation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the feeling this function does too much.

Broke down this function to include multiple helpers

The rbdSnapshot struct should probably not send anything, but instead return []*csi.BlockMetadata to the caller which can then send it out? Would that not make it a cleaner implementation?

It's a streaming API, so we would be sending more than once.
Used this as inspiration to instead send down a func that only takes []*csi.BlockMetadata as input and send stream response internally.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks easier to understand already, thanks.

I understand that passing sendResponse makes things easier. It ties CSI+rbd functionality very closely together, which isn't very nice. But librbd.DiffIterateByIDConfig{} needs the callback, so I'm also not sure how to split the CSI/rbd things more. This is fine for now, maybe we can come up with an alternative approach later.

@Rakshith-R Rakshith-R force-pushed the add-snapshot-diff branch 7 times, most recently from 9c9afb4 to 755348e Compare June 20, 2025 11:25
@Madhu-1
Copy link
Collaborator

Madhu-1 commented Jul 8, 2025

@Mergifyio queue

@mergify
Copy link
Contributor

mergify bot commented Jul 8, 2025

queue

🛑 The pull request has been removed from the queue default

The pull request can't be updated.

You can take a look at Queue: Embarked in merge queue check runs for more details about the failure.

@mergify
Copy link
Contributor

mergify bot commented Jul 8, 2025

This pull request has been removed from the queue for the following reason: pull request branch update failed.

The pull request can't be updated.

You should update or rebase your pull request manually. If you do, this pull request will automatically be requeued once the queue conditions match again.
If you think this was a flaky issue, you can requeue the pull request, without updating it, by posting a @mergifyio requeue comment.

@Madhu-1
Copy link
Collaborator

Madhu-1 commented Jul 8, 2025

@Mergifyio rebase

Signed-off-by: Rakshith R <rar@redhat.com>
Signed-off-by: Rakshith R <rar@redhat.com>
This commit adds support for GetMetadataAllocated
and GetMetadataDelta RPCs.

Signed-off-by: Rakshith R <rar@redhat.com>
@mergify
Copy link
Contributor

mergify bot commented Jul 8, 2025

rebase

✅ Branch has been successfully rebased

@Madhu-1 Madhu-1 force-pushed the add-snapshot-diff branch from 9982594 to 9d66e6c Compare July 8, 2025 11:05
@Madhu-1
Copy link
Collaborator

Madhu-1 commented Jul 8, 2025

@Mergifyio queue

@mergify
Copy link
Contributor

mergify bot commented Jul 8, 2025

queue

🛑 The pull request has been removed from the queue default

The merge conditions cannot be satisfied due to failing checks.

You can take a look at Queue: Embarked in merge queue check runs for more details about the failure.

@mergify mergify bot added the ok-to-test Label to trigger E2E tests label Jul 8, 2025
@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/upgrade-tests-cephfs

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.31

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/upgrade-tests-rbd

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.31

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e/k8s-1.31

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.32

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.32

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e/k8s-1.32

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.33

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.33

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e/k8s-1.33

@ceph-csi-bot ceph-csi-bot removed the ok-to-test Label to trigger E2E tests label Jul 8, 2025
@mergify
Copy link
Contributor

mergify bot commented Jul 8, 2025

This pull request has been removed from the queue for the following reason: checks failed.

The merge conditions cannot be satisfied due to failing checks:

You may have to fix your CI before adding the pull request to the queue again.
If you update this pull request, to fix the CI, it will automatically be requeued once the queue conditions match again.
If you think this was a flaky issue instead, you can requeue the pull request, without updating it, by posting a @mergifyio requeue comment.

@Rakshith-R
Copy link
Contributor Author

@Mergifyio requeue

@mergify
Copy link
Contributor

mergify bot commented Jul 8, 2025

requeue

✅ The queue state of this pull request has been cleaned. It can be re-embarked automatically

@mergify mergify bot merged commit 66b3e23 into ceph:devel Jul 8, 2025
37 of 38 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

component/rbd Issues related to RBD

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants