Reduce likelihood of multiple coordinators on concurrent startup #41291

ahus1 · 2025-07-21T13:00:05Z

ryanemerson · 2025-07-24T13:54:19Z

model/infinispan/src/main/java/org/keycloak/jgroups/protocol/KEYCLOAK_JDBC_PING2.java

+                while (pingData.stream().noneMatch(PingData::isCoord)) {
+                    // Do a quick check if more nodes have arrived, to have a more complete list of nodes to start with.
+                    List<PingData> newPingData = readFromDB(cluster_name);
+                    if (newPingData.size() == pingData.size() || pingData.stream().anyMatch(PingData::isCoord)) {


Is it safe to just check the size of the List or do we need to consider the content of the PingData as well. I'm just wondering if the view changed from [a,b] to [a,c]

Resolved in next commit, will check if they both contain the same addresses

ryanemerson · 2025-07-24T14:01:16Z

model/infinispan/src/test/java/org/keycloak/jgroups/protocol/JDBC_PING2_Test.java

+ * @author Bela Ban
+ * @since  5.4, 5.3.7
+ */
+public class JDBC_PING2_Test {


It would be good to format this file per Keycloak conventions.

resolved in next commit

ryanemerson · 2025-07-24T14:01:49Z

model/infinispan/src/test/java/org/keycloak/jgroups/protocol/JDBC_PING2_Test.java

+        return modify(new JChannel(cfg).name(name));
+    }
+
+    protected static class Connector implements Runnable {


Can be a record:

protected record Connector(CountDownLatch latch, JChannel ch) implements Runnable { @Override public void run() { try { latch.await(); ch.connect(CLUSTER); } catch (Exception e) { throw new RuntimeException(e); } } }

resolved in next commit

ryanemerson · 2025-07-24T14:08:08Z

model/infinispan/src/test/java/org/keycloak/jgroups/protocol/JDBC_PING2_Test.java

+                        .collect(Collectors.joining("\n")));
+        Arrays.stream(channels).filter(ch -> ch.view().getCoord() != ch.getAddress()).forEach(JChannel::close);
+        Arrays.stream(channels).filter(ch -> !ch.isClosed()).forEach(JChannel::close);
+        System.out.println("Closed");


Do we need to remove the various System.out calls or replace them with debug logging before merging this?

resolved in next commit

model/infinispan/src/main/java/org/keycloak/jgroups/protocol/KEYCLOAK_JDBC_PING2.java

Closes keycloak#41290 Signed-off-by: Alexander Schwartz <aschwart@redhat.com>

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>

ahus1 · 2025-07-24T15:09:14Z

Thank you for the review. The PR now contains also a change to pre-seed the addresses so they are ordered, and the first node gets a one second grace period from the other nodes starting later to complete it becoming a coordinator.

pruivo · 2025-07-24T15:48:06Z

.../infinispan/src/main/java/org/keycloak/spi/infinispan/impl/embedded/JGroupsConfigurator.java

+        var tableName = JpaUtils.getTableNameForNativeQuery("JGROUPS_PING", cp.getEntityManager());
+        String statement = String.format("INSERT INTO %s values (?, ?, ?, ?, ?)", tableName);
+
+        Address address = new UUID(0, value);


Some Infinispan configurations require an ExtendedUUID (and the upcoming ISPN 16 too)

TODO for Alex: Also test with site name before finalizing this PR (use the SPI option) as that forces an Extended UUID on ISPN side of things.

I found that JGroups can only serialize a UUID or a SiteUUID due to the code in Util.addressToString. For a plain ExtendedUUID it will return null, which leads to followup-errors, starting with a the value being rejected by the database.

I updated the code to use SiteUUID, and for that I needed to also use a site. Please have a look, maybe we could pair for any additional changes that might be necessary.

I'm not sure of the repurcussions of using SiteUUID vs ExtendedUUID here, but it seems to me that ExtendedUUID should be handled by Util.addressToString as I believe the current implementation will cause issues with Infinispan 16 users relying on JDBC_PING2.

-1 to use SiteUUID.
@belaban can this belaban/JGroups#901 be backported to 5.3.x branch?

I found that when setting spi-cache-embedded--default--site-name this already tries to use a ExtendedUUID today, but it then breaks with the NULL problem when using JDBC_PING2.

I'm reverting my previous commit. It seems we'll need to state that spi-cache-embedded--default--site-name is incompatible with JDBC_PING2?!

I'm now mapping an ExtendedUUID to a UUID when passing it down to JDBC_PING as I don't know when the backport will land to and to see if it now breaks in a different place.

pruivo · 2025-07-24T15:48:29Z

.../infinispan/src/main/java/org/keycloak/spi/infinispan/impl/embedded/JGroupsConfigurator.java

+            try (PreparedStatement s = con.prepareStatement(statement)) {
+                s.setString(1, org.jgroups.util.Util.addressToString(address)); // address
+                s.setString(2, "(starting)"); // name
+                s.setString(3, "ISPN"); // cluster name


The cluster name can be changed; you need to read it from the configuration.

This should be fixed with this commit as well.

pruivo · 2025-07-24T15:50:30Z

KEYCLOAK_JDBC_PING2 looks like a mix of discovery, group membership, and failure detection. Let's see what kind of issues pop up in production. 😰

ryanemerson · 2025-07-24T16:16:24Z

.../infinispan/src/main/java/org/keycloak/spi/infinispan/impl/embedded/JGroupsConfigurator.java

+        String seq = storage.loadOrCreate(JGROUPS_ADDRESS_SEQUENCE, () -> "0");
+        long value = Long.parseLong(seq) + 1;
+        String newSeq = Long.toString(value);
+        storage.replace(JGROUPS_ADDRESS_SEQUENCE, seq, newSeq);


Don't we need to handle the case where the replace fails, otherwise we will potentially generate two addresses on different nodes with the same seq number?

This is all handled in one transaction, and the DB transaction will be rolled back if it fails. So those addresses that pass through this one are guaranteed to be consecutive, ordered, and visible to all other nodes.

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>

Signed-off-by: Alexander Schwartz <aschwart@redhat.com> This reverts commit ccf38b2.

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>

…okie Signed-off-by: Alexander Schwartz <aschwart@redhat.com>

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>

ahus1 · 2025-07-29T15:58:49Z

@pruivo - the build is green and the code blocks moved around one last time. Please review this one when you have the time. Thanks!

pruivo

I got JdbcSQLIntegrityConstraintViolationException locally, running testConcurrentStartupMultipleTimes test.

19:16:52,381 ERROR jgroups-6,jdbc-test,4 [org.keycloak.jgroups.protocol.KEYCLOAK_JDBC_PING2_FOR_TESTING] 4: failed writing to DB: org.h2.jdbc.JdbcSQLIntegrityConstraintViolationException: Unique index or primary key violation: "PUBLIC.PRIMARY_KEY_C ON PUBLIC.JGROUPS(ADDRESS) VALUES ( /* 751 */ 'uuid://11e06b89-e3c2-4363-9794-ffd49d283280' )"; SQL statement:

model/infinispan/src/test/java/org/keycloak/jgroups/protocol/JdbcPing2Test.java

pruivo · 2025-07-29T18:13:08Z

model/infinispan/src/test/java/org/keycloak/jgroups/protocol/JdbcPing2Test.java

+        }
+        CountDownLatch latch=new CountDownLatch(1);
+        int index=1;
+        List<Thread> threads=new ArrayList<>();


Wait for the threads to finish?

done in next commit

pruivo · 2025-07-29T18:14:50Z

model/infinispan/src/test/java/org/keycloak/jgroups/protocol/JdbcPing2Test.java

+        Arrays.stream(channels).filter(ch -> ch.view().getCoord() != ch.getAddress()).forEach(JChannel::close);
+        Arrays.stream(channels).filter(ch -> !ch.isClosed()).forEach(JChannel::close);


It's probably safer to put this in a finally block. If the test fails, the channels won't be closed.

part of next commit

pruivo · 2025-07-29T18:15:08Z

model/infinispan/src/test/java/org/keycloak/jgroups/protocol/JdbcPing2Test.java

+
+    protected static JChannel modify(JChannel ch) {
+        GMS gms=ch.stack().findProtocol(GMS.class);
+        gms.setJoinTimeout(3000).setMaxJoinAttempts(5);


Why not configure this in the XML?

I've inherited it :D ... moved to XML in next commit

pruivo · 2025-07-29T18:29:40Z

model/infinispan/src/test/resources/jdbc-h2.xml

+            write_data_on_find="true"
+    />
+    <!-- very aggressive merging to speed up the test -->
+    <MERGE3 min_interval="150"


This is making the tests pass because it runs the partition handling check every 200ms. In production, it runs every 30s.

Using the production values, the test fails randomly

[ERROR] JdbcPing2Test.testConcurrentStartupMultipleTimes:60->runSingleTest:89 » Timeout Timeout 20000 kicked in, views are: 1: [8|4] (7) [8, 1, 4, 7, 2, 5, 3] 2: [8|4] (7) [8, 1, 4, 7, 2, 5, 3] 3: [8|4] (7) [8, 1, 4, 7, 2, 5, 3] 4: [8|4] (7) [8, 1, 4, 7, 2, 5, 3] 5: [8|4] (7) [8, 1, 4, 7, 2, 5, 3] 6: [6|0] (1) [6] 7: [8|4] (7) [8, 1, 4, 7, 2, 5, 3] 8: [8|4] (7) [8, 1, 4, 7, 2, 5, 3]

The faster merge interval is there to speed up the test. In production with a longer interval, it will take longer, but it will eventually converge. IMHO any sub-second interval will be too low for production. Still we can go for 5 seconds or so?!

In production with a longer interval, it will take longer, but it will eventually converge

So you agree we don't need this PR 😄

IMHO any sub-second interval will be too low for production.

The LinkedIn blog recommended increasing it to minutes...

On a serious note, I will change my opinion regarding last night and state that the test should be removed. With MERGE3, either you use JDBC_PING2 or KEYCLOAK_JDBC_PING2, the test passes. Without it, it fails with either protocol. I can conclude the test is not testing anything useful. WDYT?

So you agree we don't need this PR ... the test should be removed

The test is more a stress testing harness that helped me to test KEYCLOAK_JDBC_PING2 against H2, and discover possible problems / areas to improve the converging of the cluster. With the other change of having a pre-seeded UUID which is not part of the added test, this should be necessary less often, but still when clusters split.

So I would then remove the test, and keep the (ignored) stress test that's running 100 times?

pruivo · 2025-07-29T18:31:07Z

.../infinispan/src/main/java/org/keycloak/spi/infinispan/impl/embedded/JGroupsConfigurator.java

+                s.setString(1, org.jgroups.util.Util.addressToString(new UUID(address.getMostSignificantBits(), address.getLeastSignificantBits()))); // address
+                s.setString(2, "(starting)"); // name
+                s.setString(3, clusterName); // cluster name
+                s.setString(4, new IpAddress("localhost", 0).toString()); // ip


new IpAddress("localhost", 0).toString() should have always the same output. It can be cached.

…dbcPing2Test.java Co-authored-by: Pedro Ruivo <pruivo@users.noreply.github.com> Signed-off-by: Alexander Schwartz <alexander.schwartz@gmx.net>

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>

ahus1 · 2025-07-30T06:46:16Z

@pruivo - updated the PR.
The exception JdbcSQLIntegrityConstraintViolationException is something that can happen when there are attempts of concurrent writes (for example the coordinator writes a node, and the node itself writes their information). That should then heal itself on the next attempt.

Ready for another round of reviews.

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>

pruivo

LGTM.

ryanemerson · 2025-07-30T11:10:49Z

model/infinispan/src/main/java/org/keycloak/jgroups/protocol/KEYCLOAK_JDBC_PING2.java

+    }
+
+    @Override
+    protected void delete(String clustername, Address addressToDelete) throws SQLException {


Do we also need to overwrite callInsertStoredProcedure as this calls Util#addressToString and is called from writeToDB?

we don't use it, so no

Ok, I take your point as the toUUID should not be necessary in JGroups once belaban/JGroups#901 has been backported. There is a lot of flags we don't use in the Keycloak use-case that we're still including here. I assume the reason for inclusion is so that our solution here is as close to the PR that will eventually be issued to JGroups for JDBC_PING2.

ahus1 self-assigned this Jul 21, 2025

ahus1 force-pushed the is-41290-allow-concurrent-starts-postgres branch from 8784e45 to 0661802 Compare July 21, 2025 17:01

ahus1 marked this pull request as ready for review July 22, 2025 09:16

ahus1 requested review from a team as code owners July 22, 2025 09:17

keycloak-github-bot bot added the team/sre label Jul 22, 2025

ahus1 changed the title ~~Ensure to find all members even on concurrent starts~~ Reduce likelihood of multiple coordinators on concurrent startup Jul 24, 2025

ahus1 mentioned this pull request Jul 24, 2025

JDBC_PING2 doesn't merge split clusters after a while #41390

Closed

2 tasks

ryanemerson reviewed Jul 24, 2025

View reviewed changes

model/infinispan/src/main/java/org/keycloak/jgroups/protocol/KEYCLOAK_JDBC_PING2.java Show resolved Hide resolved

ahus1 added 5 commits July 24, 2025 16:58

Ensure to find all members even on concurrent starts

0e7c96f

Closes keycloak#41290 Signed-off-by: Alexander Schwartz <aschwart@redhat.com>

Rework, avoiding removeAll in a separate transaction

f405c19

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>

Rework: Be more specific when to insert or remove data

e094ef9

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>

Pre-seeing the JGroups address in an ordered way

8c916f0

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>

Addressing review comments

65fa68c

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>

ahus1 force-pushed the is-41290-allow-concurrent-starts-postgres branch from 7a222cf to 65fa68c Compare July 24, 2025 15:05

ahus1 requested a review from ryanemerson July 24, 2025 15:07

pruivo reviewed Jul 24, 2025

View reviewed changes

keycloak-github-bot bot added the flaky-test label Jul 24, 2025

keycloak-github-bot bot mentioned this pull request Jul 24, 2025

Flaky test: org.keycloak.testsuite.federation.ldap.LDAPReadOnlyTest#testReadOnlyUserGetsPermanentlyLocked #41182

Closed

ryanemerson reviewed Jul 24, 2025

View reviewed changes

ahus1 added 3 commits July 29, 2025 10:11

Rework - use a SiteUUID

ccf38b2

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>

Revert "Rework - use a SiteUUID"

653a022

Signed-off-by: Alexander Schwartz <aschwart@redhat.com> This reverts commit ccf38b2.

Workaround to use ExtendedUUID

c69f8e7

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>

ahus1 force-pushed the is-41290-allow-concurrent-starts-postgres branch from 0cfcaf5 to c69f8e7 Compare July 29, 2025 12:04

ahus1 added 2 commits July 29, 2025 15:03

Avoid setting the node name early as it would add the route to the co…

5792de8

…okie Signed-off-by: Alexander Schwartz <aschwart@redhat.com>

Cleanup

f8e8ff0

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>

Cleanup

df90164

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>

ahus1 requested a review from pruivo July 29, 2025 15:58

pruivo reviewed Jul 29, 2025

View reviewed changes

ahus1 and others added 4 commits July 30, 2025 08:17

Update model/infinispan/src/test/java/org/keycloak/jgroups/protocol/J…

6a70cab

…dbcPing2Test.java Co-authored-by: Pedro Ruivo <pruivo@users.noreply.github.com> Signed-off-by: Alexander Schwartz <alexander.schwartz@gmx.net>

Update model/infinispan/src/test/java/org/keycloak/jgroups/protocol/J…

b449172

…dbcPing2Test.java Co-authored-by: Pedro Ruivo <pruivo@users.noreply.github.com> Signed-off-by: Alexander Schwartz <alexander.schwartz@gmx.net>

Update model/infinispan/src/test/java/org/keycloak/jgroups/protocol/J…

3a33edb

…dbcPing2Test.java Co-authored-by: Pedro Ruivo <pruivo@users.noreply.github.com> Signed-off-by: Alexander Schwartz <alexander.schwartz@gmx.net>

Rework and cleanup

b4c0f54

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>

ahus1 requested a review from pruivo July 30, 2025 06:44

Rework and cleanup

9eac7c8

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>

pruivo approved these changes Jul 30, 2025

View reviewed changes

ryanemerson reviewed Jul 30, 2025

View reviewed changes

ahus1 merged commit c9943af into keycloak:main Jul 30, 2025
127 of 130 checks passed

		Arrays.stream(channels).filter(ch -> ch.view().getCoord() != ch.getAddress()).forEach(JChannel::close);
		Arrays.stream(channels).filter(ch -> !ch.isClosed()).forEach(JChannel::close);

Reduce likelihood of multiple coordinators on concurrent startup #41291

Reduce likelihood of multiple coordinators on concurrent startup #41291

Uh oh!

Conversation

ahus1 commented Jul 21, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ahus1 commented Jul 24, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ahus1 Jul 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ryanemerson Jul 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pruivo commented Jul 24, 2025

Uh oh!

ryanemerson Jul 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ahus1 commented Jul 29, 2025

Uh oh!

pruivo left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

ahus1 Jul 28, 2025 •

edited

Loading

ryanemerson Jul 29, 2025 •

edited

Loading

ryanemerson Jul 24, 2025 •

edited

Loading