Support per-backend role and destination in replication engine #2743
maeldonn wants to merge 3 commits into
Codecov Report
❌ Your patch check has failed because the patch coverage (68.37%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.
... and 2 files with indirect coverage changes
@@               Coverage Diff                @@
##    development/9.5    #2743      +/-      ##
===================================================
- Coverage     72.62%   72.27%   -0.36%
===================================================
  Files           202      202
  Lines         13744    13764      +20
===================================================
- Hits           9982     9948      -34
- Misses         3752     3806      +54
  Partials         10       10
36b459a to 98c878f
d5a834d to 83c75ef
3426739 to 4791b8b
4791b8b to 14ef8f1
14ef8f1 to cfdd8fb
00094e7 to 0b2138b
LGTM on the surface, but this definitely needs someone with more hands-on CRR experience to review it in depth.
Also, I found it hard to review (+2600, -600 across ~20 files), and the commit split did not help much. Perhaps future MRs could have smaller, atomic commits. Here there is one big commit, one test commit, and then two review-fixup commits, which I personally find makes it even harder to review: we have to notice "oh, there's an issue here", remember it, and then find "oh, it's fixed in that upcoming commit". But we agreed in the retro that we want no squashes, so OK.
        assert.strictEqual(entry.getReplicationTargetBucket({ site: 'siteB' }), 'bucket-b');
    });

    it('falls back to top-level role string when backend has none', () => {
It feels a bit weird to say "falls back to top-level" when _makeEntryWithBackends is called with no topLevel parameter. I think part of it is me not understanding the test. I guess the test is fine and it's just the naming or structure, maybe.
"Top-level" refers to replicationInfo.destination / .role (what per-backend entries override). The helper was hiding those behind defaults. I've hoisted them into TOP_LEVEL_DEST / TOP_LEVEL_ROLE constants so the fallback assertions reference them directly. Kept the test names since the wording matches the actual replicationInfo shape.
SylvainSenechal left a comment
Not much to comment; it's really nice that you were able to remove callbacks and make some functions asynchronous.
This is hard to review. I think it's OK. I left one or two comments in Arsenal as I was reviewing here, but they are not very important either.
As you already know, we will have to test this properly. What worries me the most is not people setting up brand-new replication from this new branch, but breaking existing replication flows.
ec8baf8 to 5df0134
LGTM: well-structured multi-destination support with clean backward-compatible design.
No issues found.
Review by Claude Code
Read destination bucket and role from each backends[] entry instead of the shared top-level fields, so a single source object can be replicated to multiple CRR destinations with their own role. Legacy entries without per-backend fields keep working via top-level fallback in ObjectQueueEntry's site-aware getters; MongoQueueProcessor's oplog path now matches every applicable rule and dedups backends per the design's (site, destination, role) rule.
Issue: BB-762
5df0134 to c7ab73a
    new ObjectQueueEntry(bucket, objectKey, mdObj);
queueEntry.setSite(entry.getSite());
const queueEntry = new ObjectQueueEntry(bucket, objectKey, mdObj)
    .setReplicationBackend({ site: entry.getSite() });
With multi-destination support, two backends can share the same site but differ in destination and role. Here only site is stamped on the refreshed entry, so getReplicationBackend() returns { site, destination: undefined, role: undefined }.
Downstream, _pushToCRRRetryKafkaTopics calls toRetryEntry(backend) which uses _findBackend(backend) — with destination and role both undefined, it will match the first backend for this site, potentially retrying the wrong one in a multi-destination setup.
Consider propagating destination and role from the API request (or from the entry's metadata backends) so the retry path can disambiguate.
— Claude Code
if (roles.length !== 2) {
    log.error('expecting two roles separated by a ' +
        'comma in bucket replication configuration',
if (roles.length > 2) {
The check was relaxed from roles.length !== 2 to roles.length > 2, which now allows single-role bucket configs (roles.length === 1). When no matching rule with Account is found (line 314), the fallback sets expectedDestRole = roles[1] which would be undefined for a single-role config. Since the entry always has exactly two roles (validated at line 236), entryRoles[1] would be defined, so undefined !== entryRoles[1] would always evaluate to true and reject with BadRole.
If single-role bucket configurations are intended to be supported (with per-rule Account-based role derivation), this fallback path should handle the roles.length === 1 case explicitly — e.g. skip the literal comparison when roles[1] is undefined and matchingRule wasn't found, or error clearly instead of falling through to a misleading BadRole.
— Claude Code
status: 'PENDING',
backends,
content,
role: ReplicationConfiguration.resolveSourceRole(bucketRepInfo.role),
setReplicationInfo no longer sets top-level destination, storageClass, or storageType — those now live per-backend. Since replicationInfo is part of the persisted object metadata (written to MongoDB), any external consumers of this metadata (cloudserver S3 HEAD responses, other services) that read these top-level fields will see defaults/undefined after this change.
Worth confirming that the arsenal ObjectMD upgrade (8.4.7) and cloudserver handle the missing top-level fields gracefully, and that rolling upgrades between the old and new metadata format won't cause issues.
— Claude Code
PR Review Summary
Well-structured PR that adds per-backend role and destination support for multi-destination CRR. The core refactor — moving from site-string routing to
Findings:
Positive notes:
Review by Claude Code