AlexanderLanin
Member

@AlexanderLanin AlexanderLanin commented Sep 1, 2025

DR-002 (Integration)

Large systems often span multiple repositories. Each repository can look “green” on its own, yet problems only show up when everything is combined. These late surprises slow down development and make debugging painful.

DR-002 turns a collection of separate repositories into a system that behaves like a single, continuously tested whole — ensuring the main line is always integrable across all components.

Proposed Approach

  • Every change in any repository is tested in combination with the rest of the system, not just in isolation.
  • There are two testing layers:
    • a fast feedback loop (lightweight tests that run on every pull request),
    • and a deeper validation (heavier tests run after merges or on a schedule).
  • With this setup, developers can trust that the system as a whole works consistently.
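The two testing layers can be pictured as a mapping from CI trigger to test suites. This is a minimal sketch only: the trigger names and suite names (`build`, `smoke`, `full_integration`, `long_running`) are hypothetical and not taken from DR-002.

```python
from enum import Enum


class Trigger(Enum):
    PULL_REQUEST = "pull_request"
    POST_MERGE = "post_merge"
    SCHEDULE = "schedule"


# Hypothetical suite names, used only for illustration.
FAST_SUITES = ["build", "smoke"]
DEEP_SUITES = FAST_SUITES + ["full_integration", "long_running"]


def suites_for(trigger: Trigger) -> list[str]:
    """Pick the fast layer for pull requests, the deep layer otherwise."""
    if trigger is Trigger.PULL_REQUEST:
        return list(FAST_SUITES)
    return list(DEEP_SUITES)


if __name__ == "__main__":
    for trigger in Trigger:
        print(trigger.value, suites_for(trigger))
```

The deep layer here is a superset of the fast layer, so anything that passes after a merge would also have passed on the pull request; whether the real setup keeps that property is an open design choice.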

Benefits

  • Problems across repositories are caught early.
  • Developers spend less time coordinating merges (“merge after me” scenarios disappear).
  • The project always has a “known good” baseline to fall back on, enabling stability while still moving fast.

Note: this concept can easily be extended to support multiple versions of S-CORE, but that is currently not required.


Rendered: https://eclipse-score.github.io/score/pr-1689/design_decisions/DR-002-infra.html

@Copilot Copilot AI review requested due to automatic review settings September 1, 2025 12:51

@Copilot Copilot AI left a comment

Pull Request Overview

This PR introduces a design document for implementing integration testing in a distributed monolith architecture. The document outlines a strategy for coordinating testing across multiple repositories that ship together as a single system.

  • Establishes workflows for single pull request testing, coordinated multi-repository changes, and post-merge full test suites
  • Introduces manifest-based system composition using (component, commit) pairs to enable reproducible builds
  • Provides GitHub Actions examples for implementing the integration testing workflows
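To illustrate manifest-based composition with (component, commit) pairs, here is a minimal sketch. It assumes a known-good manifest is a list of pins and that a pull request overrides the commit of the component it touches; the `Pin` type, the `compose` helper, and the short commit IDs are hypothetical, not the design document's actual format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pin:
    """One (component, commit) pair of a manifest."""
    component: str
    commit: str


def compose(known_good: list[Pin], overrides: dict[str, str]) -> list[Pin]:
    """Return the known-good manifest with the overridden commits swapped in."""
    unknown = set(overrides) - {pin.component for pin in known_good}
    if unknown:
        raise KeyError(f"unknown components: {sorted(unknown)}")
    return [
        Pin(pin.component, overrides.get(pin.component, pin.commit))
        for pin in known_good
    ]


if __name__ == "__main__":
    # Placeholder commit IDs; real manifests would pin full commit hashes.
    known_good = [Pin("baselibs", "a1"), Pin("communication", "b2")]
    candidate = compose(known_good, {"communication": "c3"})
    print(candidate)
```

Because every component is pinned to an exact commit, the same manifest always describes the same system, which is what makes a failing integration run reproducible.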

github-actions bot commented Sep 1, 2025

The created documentation from the pull request is available at: docu-html

@AlexanderLanin
Member Author

This was basically agreed in the infrastructure planning session. Feel free to document your opinion here formally with a PR review!

@thilo-schmitt

Overall, this reads very well. I'm in support of this approach.

However, what I still don’t understand is how “coordinated multi-repo” works. Yes, I understand that the individual PRs shall be tagged with a common label. But how does the integration CI know that all PRs with that label are now present? More could still be added (or removed again, e.g. if I accidentally applied the wrong label and then change it). When is the changeset considered complete so that the integration pipeline can run? Does it need a manual trigger? Do you use a time window of x minutes? Does the pipeline simply run for each individual PR in the coordinated set and simply fail n-1 times until the whole set (of n PRs) is there?

And then it would be nice if one could merge the entire coordinated set automatically with "one click" (if everything is green in each component as well as in the integration). In any case, we should strive for a way that ensures everything belonging to the coordinated set gets merged and nothing is forgotten, so that inconsistencies and non-"integrability" don’t creep back in. Maybe if you merge in one repo, then all PRs in all other repos that belong to the coordinated set are merged too?!

@opajonk
Contributor

opajonk commented Sep 4, 2025

> Overall, this reads very well. I'm in support of this approach.
>
> However, what I still don’t understand is how “coordinated multi-repo” works. Yes, I understand that the individual PRs shall be tagged with a common label. But how does the integration CI know that all PRs with that label are now present? More could still be added (or removed again, e.g. if I accidentally applied the wrong label and then change it). When is the changeset considered complete so that the integration pipeline can run? Does it need a manual trigger? Do you use a time window of x minutes? Does the pipeline simply run for each individual PR in the coordinated set and simply fail n-1 times until the whole set (of n PRs) is there?

Good points. There are some ways to do this technically in GitHub. I know it works on GitHub Enterprise, and I suspect it also works on public GitHub. I would use "PRs with the same head branch and base branch" as the unique identifier. The branch names can be the same in all participating repositories, and the PRs can even be created automatically. Example:

  • I open a PR on communication with HEAD foo and BASE main (draft as a start) and do some work.
  • Automation creates a draft PR on reference_integration. Due to that PR, CI runs whatever is required there. Automation creates feedback to my PR on communication. That feedback is made a required check to merge to main.
  • It can now be that e.g. I need to open another PR with HEAD foo and BASE main on "baselibs" to fix a compilation / test failure on reference_integration. OK, exactly what I wanted.

(very rough sketch, just to convey the idea)
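The "same head branch and base branch" identifier could be sketched like this. It is only an illustration of the grouping idea: the `PullRequest` record and the PR numbers are hypothetical, and a real implementation would read open PRs from the GitHub API instead of a hard-coded list.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PullRequest:
    repo: str
    head: str    # head branch name, e.g. "foo"
    base: str    # base branch name, e.g. "main"
    number: int  # hypothetical PR number


def coordinated_sets(prs):
    """Group open PRs from all repositories by their (head, base) branch pair."""
    groups = defaultdict(list)
    for pr in prs:
        groups[(pr.head, pr.base)].append(pr)
    return dict(groups)


if __name__ == "__main__":
    open_prs = [
        PullRequest("communication", "foo", "main", 1),
        PullRequest("baselibs", "foo", "main", 2),
        PullRequest("baselibs", "bar", "main", 3),
    ]
    for key, members in coordinated_sets(open_prs).items():
        print(key, [pr.repo for pr in members])
```

One caveat of this scheme is that unrelated PRs which happen to share a branch name would be grouped together, so some naming convention for head branches would probably be needed.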

> And then it would be nice if one could merge the entire coordinated set automatically with "one click" (if everything is green in each component as well as in the integration). In any case, we should strive for a way that ensures everything belonging to the coordinated set gets merged and nothing is forgotten, so that inconsistencies and non-"integrability" don’t creep back in. Maybe if you merge in one repo, then all PRs in all other repos that belong to the coordinated set are merged too?!

This is very tricky. In fact, we are right now trying to implement something like that internally on GitHub Enterprise, using only GitHub methods. I cannot yet say that this works; it's like an atomic merge which involves a set of "the same series of PRs". Ideally it would use a merge queue, so that long-running CI is not a problem. Maybe it requires a third-party tool (some GitHub bot) or some additional requirement (e.g. you must use a single integration repository). I can report here when I have results.

(IMHO, what I wrote here only expands on the proposal and should in no way contradict it. If it does, that is due to my lacking ability to explain my thoughts ;-))

Contributor

@opajonk opajonk left a comment

I am in favor of this. I have added minor remarks, but these are not critical.

Technicality: for reviews and other reasons, the one-sentence-per-line rule is very helpful. But this is clearly only a suggestion.

---
## Executive Summary

Large systems often span multiple repositories. Each repository can look “green” on its own, yet problems only show up when everything is combined. These late surprises slow down development and make debugging painful.
Contributor

Suggested change
- Large systems often span multiple repositories. Each repository can look “green” on its own, yet problems only show up when everything is combined. These late surprises slow down development and make debugging painful.
+ Large systems often span multiple repositories. Each repository can look “green” on its own, yet problems only show up when everything is combined. These late surprises slow down development and make debugging painful. They can even block releases.

reproducibility) to a multi-repository boundary. The central integration repository is a
neutral place to define participating components, build manifests, hold
integration-specific helpers (overrides, fixtures, seam tests), and persist known-good
records. It should not contain business logic; keeping it lean reduces accidental
Contributor

I would argue that the integration repository can contain e.g. integration tests or other checks which make sense only in that specific integration context. Developing "integration tests" in another repository would be very confusing.

Member Author

I'll rename "seam tests" to "integration tests"

@thilo-schmitt

> * I open a PR on communication with `HEAD` `foo` and `BASE` `main` (draft as a start) and do some work.
> * Automation creates a draft PR on `reference_integration`. Due to that PR, CI runs whatever is required there. Automation creates feedback to my PR on communication. That feedback is made a required check to merge to main.
> * It can now be that e.g. I need to open another PR with `HEAD` `foo` and `BASE` `main` on "baselibs" to fix a compilation / test failure on `reference_integration`. OK, exactly what I wanted.
>
> (very rough sketch, just to convey the idea)

So, if I understand you correctly, what you are saying is basically: it will fail n-1 times until the n-th required PR is in place. I can live with it, but it appears to me to be a waste of resources.

Maybe we can think of something that signals "this set of PRs is now complete and stable, now the integration pipeline will yield a sensible outcome". Ideally not (entirely) manual, because I still want automation as much as possible.
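One possible shape for such a signal, sketched under stated assumptions: the set counts as "complete and stable" once no PR in it is still a draft and every component's own CI is green. This rule and the `PrState` record are hypothetical and were not agreed in this discussion; they only show how a mostly automatic gate could avoid the n-1 wasted integration runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrState:
    repo: str
    is_draft: bool
    component_ci_green: bool


def ready_for_integration(pr_states) -> bool:
    """True only if the set is non-empty, has no drafts, and all component CI is green."""
    return bool(pr_states) and all(
        not state.is_draft and state.component_ci_green for state in pr_states
    )


if __name__ == "__main__":
    coordinated_set = [
        PrState("communication", is_draft=False, component_ci_green=True),
        PrState("baselibs", is_draft=True, component_ci_green=True),
    ]
    # baselibs is still a draft, so the integration pipeline would not start yet.
    print(ready_for_integration(coordinated_set))
```

With this gate, the only manual step is marking each PR as ready for review, which authors do anyway.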

> it's like an atomic merge which involves a set of "the same series of PRs".

Exactly. I had this term in mind but hesitated to write it down, since I don't think it's truly achievable (being truly atomic). But yes, that's probably the best term to use to convey the idea.

@opajonk
Contributor

opajonk commented Sep 4, 2025

> Maybe we can think of something that signals "this set of PRs is now complete and stable, now the integration pipeline will yield a sensible outcome". Ideally not (entirely) manual, because I still want automation as much as possible.

I think that as long as the merge requires a passing CI run (and you cannot "forget" to request this integration run), it should be fine!

> > it's like an atomic merge which involves a set of "the same series of PRs".
>
> Exactly. I had this term in mind but hesitated to write it down, since I don't think it's truly achievable (being truly atomic). But yes, that's probably the best term to use to convey the idea.

Yes, I also doubt that truly atomic is possible. The best that is possible is something like "serializing the integration of groups of PRs via some queue", with "queue halts if there is a problem integrating one of the groups", and hoping that will not happen often. Maybe something is possible around "backing out failed integrations automatically" and "continuing with the next group", but I can already see the complexity of that code exploding...
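The "serialize via a queue, halt on failure" variant can be sketched in a few lines. This is not a merge queue implementation: `integrate` is a hypothetical callback standing in for "run integration CI for this group and merge all its PRs", and the group names are placeholders.

```python
from collections import deque


def drain_queue(groups, integrate):
    """Integrate PR groups one at a time and halt at the first failure.

    Returns (merged, remaining): the groups that were integrated, and the
    failing group together with everything still queued behind it.
    """
    pending = deque(groups)
    merged = []
    while pending:
        group = pending[0]
        if not integrate(group):
            break  # halt: the failing group stays at the front of the queue
        merged.append(pending.popleft())
    return merged, list(pending)


if __name__ == "__main__":
    # Pretend the second group fails its integration run.
    merged, remaining = drain_queue(["g1", "g2", "g3"], lambda g: g != "g2")
    print(merged, remaining)
```

Backing out a failed group and continuing with the next one would replace the `break` with rollback logic, which is exactly where the complexity mentioned above would start to grow.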

Contributor

@FScholPer FScholPer left a comment

As discussed in the last TL workshop, we are fine with the concept. Let's see how it will be implemented ;)

@AlexanderLanin AlexanderLanin merged commit 5649cbf into eclipse-score:main Sep 8, 2025
6 checks passed
@AlexanderLanin AlexanderLanin deleted the dr-002 branch September 8, 2025 14:06