Commit 53f513c

DNM: Reproduce placement http endpoint race
With the delay introduced to the placement TLS reconciliation we can reproduce the following sequence of events:

* PlacementAPI CR is created with the non-tlse endpoints and is actively reconciled by the placement-operator
* Nova CR is created and is actively reconciled by the nova-operator
* placement-operator deploys the service and exposes it in keystone via a KeystoneEndpoint CR with the http URL
* nova-operator deploys nova-cell0-conductor and that service creates a placement client that discovers the http URL for placement
* openstack-operator finally updates the PlacementAPI with the tlse config and therefore placement-operator updates the KeystoneEndpoint CR with the https endpoints. This does not trigger a restart of the nova-cell0-conductor deployment as that only depends on the KeystoneEndpoint/keystone
* two EDPM compute nodes are deployed, a nova instance is created on one of them, and then it is requested to be migrated to the other node
* nova-cell0-conductor-0 uses its placement client to get the instance allocations from placement. The client uses the http URL and fails, as placement now only speaks https.

while true ; do date ; oc get pod | grep nova ; sleep 5 ; done
...
Tue May 27 10:54:25 AM CEST 2025
nova-api-0                             0/2   ContainerCreating   0   1s
nova-api-8967-account-create-nqf8c     0/1   Completed           0   34s
nova-api-db-create-cj4ls               0/1   Completed           0   44s
nova-cell0-cell-mapping-r4mcr          1/1   Running             0   1s
nova-cell0-conductor-0                 1/1   Running             0   12s
nova-cell0-conductor-db-sync-htqxh     0/1   Completed           0   29s
nova-cell0-db-create-7pqqx             0/1   Completed           0   44s
nova-cell0-e647-account-create-f8nz4   0/1   Completed           0   34s
nova-cell1-9d60-account-create-r47q8   0/1   Completed           0   34s
nova-cell1-conductor-db-sync-sr2qp     1/1   Running             0   1s
nova-cell1-db-create-7kl84             0/1   Completed           0   44s
nova-cell1-novncproxy-0                0/1   ContainerCreating   0   1s
nova-metadata-0                        0/2   ContainerCreating   0   1s
nova-scheduler-0                       0/1   ContainerCreating   0   1s

while true ; do date ; oc get KeystoneEndpoint/placement -o yaml | yq ".spec.endpoints" ; sleep 5 ; done
...
Tue May 27 10:54:21 AM CEST 2025
internal: http://placement-internal.openstack.svc:8778
public: http://placement-public.openstack.svc:8778
Tue May 27 10:54:26 AM CEST 2025
internal: http://placement-internal.openstack.svc:8778
public: http://placement-public.openstack.svc:8778
Tue May 27 10:54:31 AM CEST 2025
internal: http://placement-internal.openstack.svc:8778
public: http://placement-public.openstack.svc:8778
Tue May 27 10:54:36 AM CEST 2025
internal: http://placement-internal.openstack.svc:8778
public: http://placement-public.openstack.svc:8778
Tue May 27 10:54:42 AM CEST 2025
internal: http://placement-internal.openstack.svc:8778
public: http://placement-public.openstack.svc:8778
Tue May 27 10:54:47 AM CEST 2025
internal: https://placement-internal.openstack.svc:8778
public: https://placement-public-openstack.apps-crc.testing

❯ openstack server migrate test_0 --wait

oc logs nova-cell0-conductor-0
...
2025-05-27 09:15:26.911 1 WARNING nova.scheduler.utils [None req-180d4caa-8cdb-4e64-99ce-b1fae03e7ccf 1225f304b197465c9b7861edb324e225 bb30f64d028c4b93816803c9f9476c36 - - default default] Failed to compute_task_migrate_server: Failed to retrieve allocations for consumer af38a891-59d9-4aa1-9568-0d57a31a8b37: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>400 Bad Request</title>
</head><body>
<h1>Bad Request</h1>
<p>Your browser sent a request that this server could not understand.<br />
Reason: You're speaking plain HTTP to an SSL-enabled server port.<br />
Instead use the HTTPS scheme to access this URL, please.<br />
</p>
</body></html>
: nova.exception.ConsumerAllocationRetrievalFailed: Failed to retrieve allocations for consumer af38a891-59d9-4aa1-9568-0d57a31a8b37: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
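
The failure mode described in the commit message can be modelled in isolation. The sketch below is not nova or keystone code: `Catalog`, `CachingClient` and `RediscoveringClient` are hypothetical names, and the assumption is only that the conductor's placement client resolves its endpoint once and keeps reusing it, as the message reports. It shows why a client that cached the http URL keeps using it after the catalog entry is switched to https.

```go
package main

import "fmt"

// Catalog is a hypothetical stand-in for the keystone service catalog.
type Catalog struct {
	endpoints map[string]string
}

// Lookup returns the URL currently published for a service.
func (c *Catalog) Lookup(service string) string {
	return c.endpoints[service]
}

// CachingClient resolves the endpoint once, at construction time,
// and reuses that URL for every later request.
type CachingClient struct {
	url string
}

// NewCachingClient performs the one-time endpoint discovery.
func NewCachingClient(c *Catalog, service string) *CachingClient {
	return &CachingClient{url: c.Lookup(service)}
}

// Endpoint returns the URL discovered at construction time.
func (c *CachingClient) Endpoint() string { return c.url }

// RediscoveringClient looks the endpoint up on every call,
// so it follows later catalog updates.
type RediscoveringClient struct {
	catalog *Catalog
	service string
}

// Endpoint returns whatever the catalog publishes right now.
func (c *RediscoveringClient) Endpoint() string {
	return c.catalog.Lookup(c.service)
}

func main() {
	catalog := &Catalog{endpoints: map[string]string{
		"placement": "http://placement-internal.openstack.svc:8778",
	}}
	cached := NewCachingClient(catalog, "placement")
	fresh := &RediscoveringClient{catalog: catalog, service: "placement"}

	// The catalog entry is switched to https after both clients started.
	catalog.endpoints["placement"] = "https://placement-internal.openstack.svc:8778"

	fmt.Println("cached:", cached.Endpoint()) // still the http URL
	fmt.Println("fresh: ", fresh.Endpoint())  // the https URL
}
```

In this model the stale URL only goes away when the caching client is rebuilt, which corresponds to restarting the nova-cell0-conductor pod.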
1 parent 19929e1 commit 53f513c

1 file changed

Lines changed: 10 additions & 0 deletions

pkg/openstack/placement.go

@@ -3,6 +3,7 @@ package openstack
 import (
 	"context"
 	"fmt"
+	"time"
 
 	"github.com/openstack-k8s-operators/lib-common/modules/common/condition"
 	"github.com/openstack-k8s-operators/lib-common/modules/common/helper"
@@ -18,6 +19,8 @@ import (
 	ctrl "sigs.k8s.io/controller-runtime"
 )
 
+var count int
+
 // ReconcilePlacementAPI -
 func ReconcilePlacementAPI(ctx context.Context, instance *corev1beta1.OpenStackControlPlane, version *corev1beta1.OpenStackVersion, helper *helper.Helper) (ctrl.Result, error) {
 	placementAPI := &placementv1.PlacementAPI{
@@ -105,6 +108,13 @@ func ReconcilePlacementAPI(ctx context.Context, instance *corev1beta1.OpenStackC
 		} else if (ctrlResult != ctrl.Result{}) {
 			return ctrlResult, nil
 		}
+		if count != 10 {
+			Log.Info("XXX delaying placement endpoint update")
+			time.Sleep(10 * time.Second)
+			count++
+			return ctrl.Result{Requeue: true}, nil
+		}
+		Log.Info("XXX continues with placement endpoint update")
 		// set service overrides
 		instance.Spec.Placement.Template.Override.Service = endpointDetails.GetEndpointServiceOverrides()
 		// update TLS settings with cert secret
