-
Notifications
You must be signed in to change notification settings - Fork 7.4k
Provide CLI Parameters for jgroups.* options #40690
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
9ddc2c0
to
e91d056
Compare
Unreported flaky test detectedIf the flaky tests below are affected by the changes, please review and update the changes accordingly. Otherwise, a maintainer should report the flaky tests prior to merging the PR. org.keycloak.testsuite.cluster.UserFederationInvalidationClusterTest#crudWithFailover
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unreported flaky test detected, please review
e91d056
to
0d591da
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Just a couple of things.
@@ -98,6 +103,24 @@ public static PropertyMapper<?>[] getClusteringPropertyMappers() { | |||
.isEnabled(() -> Configuration.isTrue(CachingOptions.CACHE_EMBEDDED_MTLS_ENABLED), "property '%s' is enabled".formatted(CachingOptions.CACHE_EMBEDDED_MTLS_ENABLED.getKey())) | |||
.validator(CachingPropertyMappers::validateCertificateRotationIsPositive) | |||
.build(), | |||
fromOption(CachingOptions.CACHE_EMBEDDED_NETWORK_BIND_ADDRESS) | |||
.paramLabel("address") | |||
.to("kc.spi-cache--embedded--default--network-bind-address") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.to("kc.spi-cache--embedded--default--network-bind-address") | |
.to("kc.spi-cache-embedded--default--network-bind-port") |
The spi is cache-embedded right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the SPI is cacheEmbedded
- but note that this should still be ...network-bind-address
not ...-port
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed to use cache-embedded
for the various options.
boolean bindAddressRequired = Optional.ofNullable(containerBuilder.getArgs()) | ||
.orElse(List.of()) | ||
.stream() | ||
.noneMatch(a -> a.contains("--cache-embedded-network-bind-address")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should only match if someone used the unsupported podTemplate to update the main container's args. Do you mean be checking specifically for this, or do you want to check for additionalOptions usage of cache-embedded-network-bind-address?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I had forgot about additionalOptions
. I have updated the logic so that we only set the bind-address explicitly if it's not already specified via additionalOptions
or a custom command via unsupported podTemplate.
I have added test cases to operator/src/test/java/org/keycloak/operator/testsuite/unit/PodTemplateTest.java
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A user specified bind address doesn't make sense in the context of the Operator, as it will result in the same address being used by all pods, therefore I have reverted ^ in favour of Pedro's suggestion: #40690 (comment)
docs/documentation/upgrading/topics/changes/changes-26_3_0.adoc
Outdated
Show resolved
Hide resolved
String[] split = key.split("(?=[A-Z])", 3); | ||
String property = "jgroups.%s.%s".formatted(split[1].toLowerCase(), split[2].toLowerCase()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this will work, but is hard to understand for me, and might fail with future changes. Could we have a switch statement instead which those values that we support? If you want to make it fancy, try using enums to make it safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, looking at this again I realise there's actually a bug as the external address properties are in the format jgroups.%s_%s
not jgroups.%s.%s
.
I have reworked to utilise an Enum.
--cache-embedded-network-bind-address <address> | ||
IP address used by clustering transport. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please extends the docs:
--cache-embedded-network-bind-address <address> | |
IP address used by clustering transport. | |
--cache-embedded-network-bind-address <address> | |
IP address to bind to for the clustering transport. By default, ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have pushed a commit updating the various descriptions.
--cache-embedded-network-bind-port <port> | ||
Port used by clustering transport. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please extend the docs. People should understand that this is usually optional (same for --cache-embedded-network-bind-address
)
--cache-embedded-network-bind-port <port> | |
Port used by clustering transport. | |
--cache-embedded-network-bind-port <port> | |
Port to bind to for the clustering transport. By default, ... |
--cache-embedded-network-external-address <address> | ||
IP address that other instances in the cluster should use to contact this node. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--cache-embedded-network-external-address <address> | |
IP address that other instances in the cluster should use to contact this node. | |
--cache-embedded-network-external-address <address> | |
IP address that other instances in the cluster should use to contact this node. Set only it is different from `cache-embedded-network-bind-address`, for example when this instance is behind a firewall. |
--cache-embedded-network-external-port <port> | ||
Port that other instances in the cluster should use to contact this node. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--cache-embedded-network-external-port <port> | |
Port that other instances in the cluster should use to contact this node. | |
--cache-embedded-network-external-port <port> | |
Port that other instances in the cluster should use to contact this node. Set only if ... |
fromOption(CachingOptions.CACHE_EMBEDDED_NETWORK_BIND_ADDRESS) | ||
.paramLabel("address") | ||
.to("kc.spi-cache-embedded--default--network-bind-address") | ||
.validator(CachingPropertyMappers::validateBindAddress) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add isEnabled
to all the mapping
.isEnabled(CachingPropertyMappers::cacheSetToInfinispan, "Infinispan clustered embedded is enabled")
private static void validateBindAddress(String address) { | ||
if (InetAddressUtils.isIPv4Address(address) || InetAddressUtils.isIPv6Address(address)) | ||
return; | ||
|
||
for (Util.AddressScope addressScope : Util.AddressScope.values()) | ||
if (addressScope.name().equals(address)) | ||
return; | ||
|
||
String matchType = address.split(":")[0]; | ||
if (Global.MATCH_ADDR.equals(matchType) || Global.MATCH_HOST.equals(matchType) || Global.MATCH_INTF.equals(matchType)) | ||
return; | ||
|
||
throw new PropertyException("Option '%s'. Invalid address: '%s'".formatted(CachingOptions.CACHE_EMBEDDED_NETWORK_BIND_ADDRESS.getKey(), address)); | ||
} | ||
|
||
private static void validateExternalAddress(String address) { | ||
if (!InetAddressUtils.isIPv4Address(address) && !InetAddressUtils.isIPv6Address(address)) | ||
throw new PropertyException("Option '%s'. Invalid address: '%s'".formatted(CachingOptions.CACHE_EMBEDDED_NETWORK_EXTERNAL_ADDRESS.getKey(), address)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't do any validation in Keycloak as it restricts what can be set. For example, you cannot put the network interface, but -Djgroups.bind.address=eth0
or -Djgroups.bind.address=localhost
are both valid.
.../infinispan/src/main/java/org/keycloak/spi/infinispan/impl/embedded/JGroupsConfigurator.java
Outdated
Show resolved
Hide resolved
EXTERNAL_PORT(CachingOptions.CACHE_EMBEDDED_NETWORK_EXTERNAL_PORT, "jgroups.external_port"); | ||
|
||
final Option<?> option; | ||
final String property; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be a list. jgroups.bind.address
is an Infinispan property, but public static final String BIND_ADDR="jgroups.bind_addr";
needs to be set too. Same for the bind port.
Using a custom JGroups configuration file won't be able to read the Infinispan properties.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Me and Pedro discussed this and the jgroups.bind_addr
system property always takes precedence over the Infinispan property jgroups.bind.address
and any values configured in the JGroup's stack, so it's acceptable to just set the jgroups.bind_addr
property.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have also made it so that we warn if either jgroups.bind_addr
or jgroups.bind.address
is set with the CLI option.
if (isBindAddressRequired(containerBuilder.getArgs(), keycloakCR.getSpec().getAdditionalOptions())) { | ||
// Set bind address as this is required for JGroups to form a cluster in IPv6 environments | ||
containerBuilder.addToArgs("--cache-embedded-network-bind-address=$(%s)".formatted(POD_IP)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is confusing to me... why would anyone set the bind address in the Keycloak CR!?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove this line and change the environment variable name in line:
Line 537 in 7736ca2
envVars.add(new EnvVarBuilder().withName(POD_IP).withNewValueFrom().withNewFieldRef() |
i.e. replace
POD_ID
with KC_CACHE_EMBEDDED_NETWORK_BIND_ADDRESS
551febd
to
4996c08
Compare
@@ -534,7 +532,7 @@ private List<EnvVar> getDefaultAndAdditionalEnvVars(Keycloak keycloakCR) { | |||
} | |||
} | |||
|
|||
envVars.add(new EnvVarBuilder().withName(POD_IP).withNewValueFrom().withNewFieldRef() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a small chance this could be seen as a breaking change - if users assume that the POD_ID environment variable is always available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would argue this is a notable change, as it changes an internal behavior. We have never stated that the Operator will put that to the environment variables. So as long as we document it as a notable change, we should be fine.
@@ -83,7 +83,7 @@ | |||
) | |||
public class KeycloakDeploymentDependentResource extends CRUDKubernetesDependentResource<StatefulSet, Keycloak> { | |||
|
|||
public static final String POD_IP = "POD_IP"; | |||
public static final String BIND_ADDRESS = "KC_CACHE_EMBEDDED_NETWORK_BIND_ADDRESS"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the expectation for users who are specifying an image - won't there be an incompatibility with existing images with the new operator as this option won't yet exist?
I know we don't go into detail about this scenario in the general docs, but for the most part changes in the operator tend to be backwards compatible at least to last supported major version, or dependent upon specific enablement.
If we don't want version dependent logic in the operator, we'll at least need instructions to update the images first, before upgrading the operator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the expectation for users who are specifying an image - won't there be an incompatibility with existing images with the new operator as this option won't yet exist?
I feel like we're in very murky water here, given that the Operator doesn't officially support multiple Keycloak versions but it seems we don't want to break things when using older server versions. It would be great if this was documented somewhere 🙂
If we don't want version dependent logic in the operator, we'll at least need instructions to update the images first, before upgrading the operator.
By update the images first, I assume you mean to a version with --cache-embedded-network-bind-address
?
I don't think the user should ever be baking things related to the bind address into the image. Unless the user is providing a special JGroups String, such as "SITE_LOCAL", this value will typically be different for every container, hence the use of POD_IP.
We were previously relying on -Djgroups.bind.address
being set as the first container arg, even when an unsupported podTemplateSpec was provided. AFAIU this means the only way a user could override this property would be to:
- Explicitly provide
-Djgroups.bind.address
as part ofJAVA_OPTS_APPEND
- Configure
-Djgroups.bind_addr
as this takes precedence over-Djgroups.bind.address
at the JGroups level
If we don't want version dependent logic in the operator, we'll at least need instructions to update the images first, before upgrading the operator.
An alternative is that we fallback to using -Djgroups.bind.address
in the Operator and make the transition to the CLI args as part of KC 27.0.
The -Djgroups.bind.address
property will still work with my changes. The user will get a warning if they also configure --cache-embedded-network-bind-address
, but this should never be necessary with the Operator anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By update the images first, I assume you mean to a version with --cache-embedded-network-bind-address?
Yes, rebuilding against Keycloak 26.4 prior to upgrading the operator to 26.4.
Otherwise KC_CACHE_EMBEDDED_NETWORK_BIND_ADDRESS won't be recognized / used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it is not officially supported to run different minor releases in KC and the Operator, I then recommend to add it to notable changes. As this was added for people running in IPv6 only environments, this restricts the audience even further.
Is there a planned docs on how to upgrade Keycloak with custom images and the Operator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it is not officially supported to run different minor releases in KC and the Operator
The only mention we have of this is a note on https://www.keycloak.org/operator/customizing-keycloak "To ensure full compatibility of Operator and Operand, make sure that the version of Keycloak release used in the custom image is aligned with the version of the operator." - which could even imply that you need to update your custom images before doing a micro update of the operator.
Is there a planned docs on how to upgrade Keycloak with custom images and the Operator?
The OLM instructions were updated to warn the user to prevent automatic upgrades. As far as I know we haven't elaborated, and don't have an issue captured, on all the steps necessary to actually upgrade the operator. Database backup and restore is especially hand-wavy. Some of this should be covered under #37356
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose what you suggest here will work with the change @ryanemerson is proposing here:
- The old config to pass the POD_IP via the Java System Property is understood by the new Keycloak version - I hope @ryanemerson can confirm that.
AFAIK the KC Operator is installed per namespace, and I would imagine people would usually not run multiple Keycloaks in a namespace. Even if they do, given the coupling between the operator image and the Keycloak image, they should be all the same version.
I give my +1 to have the procedure @shawkins described above documented in a follow-up issue in the CND team.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JGroups will always check the system properties and use its value if it is set; it has priority over every other configuration path. The changes to the operator code can be reverted without a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to @shawkins suggestion. The only issue I can see is if the server dropped some functionality that the Operator relied upon, e.g. when upgrading from 26.x to 27.x, however I think that's very much an edge case that we can take measures to prevent.
- The old config to pass the POD_IP via the Java System Property is understood by the new Keycloak version - I hope @ryanemerson can confirm that.
Yes, that is the case. Just with the caveats I describe in #40690 (comment)
AFAIK the KC Operator is installed per namespace, and I would imagine people would usually not run multiple Keycloaks in a namespace. Even if they do, given the coupling between the operator image and the Keycloak image, they should be all the same version.
The problem with the Operator framework is that it relies on CRDs which are installed in the global namespace of K8s, so mixing and matching Operator versions doesn't work well in practice and is not recommended. These limitations were a big motivator for some of the changes in OLM V1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have pushed a commit reverting the Operator changes and I have created #40884 to track modifying the Operator in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wanted to share my 2 cents here.
Officially, we do not support using older KC image with newer Operator (or vice versa). That said, we do not have a good documented process how to upgrade when using custom images. So we're trying to avoid doing breaking changes in the Operator when not needed. And the current breaking changes policy in the server helps us with that: we're not doing any in minor releases. That enables us to use newer server version with older operator. Hence the approach @shawkins makes sense to me to document.
That said, this only works for minor releases. I think we need to make it clear that for major upgrades (where we will not even consider supporting zero-downtime upgrade), the user must scale down their deployment before upgrading the Operator.
This might change in the future if we make the operator aware of the server version to support using different server versions with different operator versions. But for now, clear docs is IMHO the way to go.
@vmuzikar - with the Operator changes removed, I think the open questions are now resolved. Can you please confirm that adding the CLI options is ok for the CND team?
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR, looks good to me.
=== JGroups system properties replaced with CLI options | ||
|
||
Until now it was necessary to configure JGroups network addresses and ports using the `+jgroups.bind.*+` and `+jgroups.external_*+` | ||
system properties. In this release we have introduced the following CLI options to allow these addresses and ports to be | ||
configured directly via {project_name}: `cache-embedded-network-bind-address`, `cache-embedded-network-bind-port`, | ||
`cache-embedded-network-external-address`, `cache-embedded-network-external-port`. Configuring ports using the old | ||
properties will still function as before, but we recommend to change to the CLI options as this may change in the future. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would need to go to changes-26_4_0.adoc
as 26.3 has been released by now. Please update the PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have rebased and squashed on the latest main
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ahus1 Yes, the options work for me. I haven't deeply reviewed other changes, though.
Closes keycloak#40481 Signed-off-by: Ryan Emerson <remerson@redhat.com>
1fddd97
to
66cfc23
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving based on previous reviews.
Closes #40481