App state is not updating in cf cli and appsman ui when the app is down #4309
Merged
Samze merged 5 commits into cloudfoundry:main on Apr 16, 2025
In case of duplicate actual_lrp events(same index), CAPI should look at the since and pick the latest
Remove empty lines
Samze
reviewed
Apr 15, 2025
Samze
reviewed
Apr 15, 2025
Code review changes
Samze
approved these changes
Apr 15, 2025
ari-wg-gitbot
added a commit
to cloudfoundry/capi-release
that referenced
this pull request
Apr 16, 2025
Changes in cloud_controller_ng:
- App state is not updating in cf cli and appsman ui when the app is down
PR: cloudfoundry/cloud_controller_ng#4309
Author: Sriram Nookala <snookala@vmware.com>
In some circumstances CAPI reports that an app instance is running when it is actually down.
CAPI iterates over all actual_lrps returned from Diego and uses the instance index as the key. When Diego returns more than one actual_lrp for the same index, each entry overwrites the previous one for that index, so the state shown is determined by the order in which the actual_lrps are returned rather than by which one is current.
Example of a duplicate entry from cfdot actual-lrps. Note the process_guid and index are the same.
{
"process_guid": "57a8e43b-81f9-46e9-9f78-81e15bbfd231-de7f7844-156e-4fc7-9f21-db5d072fb0b7",
"index": 3,
"domain": "cf-apps",
"instance_guid": "",
"cell_id": "",
"address": "",
"ports": null,
"preferred_address": "UNKNOWN",
"crash_count": 0,
"state": "UNCLAIMED",
"placement_error": "unable to communicate to compatible cells",
"since": 1739568280529021112,
"modification_tag": {
"epoch": "780635af-9208-4d5e-5a08-ea49ebcb3f95",
"index": 5758
},
"presence": "ORDINARY",
"OptionalRoutable": {
"routable": false
},
"availability_zone": ""
}
{
"process_guid": "57a8e43b-81f9-46e9-9f78-81e15bbfd231-de7f7844-156e-4fc7-9f21-db5d072fb0b7",
"index": 3,
"domain": "cf-apps",
"instance_guid": "1f3ffac3-be77-45e0-5075-7357",
"cell_id": "23b06662-20e7-42dd-9377-6d8f10190ec4",
"address": "10.0.4.17",
"ports": [
{
"container_port": 8080,
"host_port": 61012,
"container_tls_proxy_port": 61001,
"host_tls_proxy_port": 61014
},
{
"container_port": 8080,
"host_port": 61012,
"container_tls_proxy_port": 61443,
"host_tls_proxy_port": 0
},
{
"container_port": 2222,
"host_port": 61013,
"container_tls_proxy_port": 61002,
"host_tls_proxy_port": 61015
}
],
"instance_address": "10.255.233.24",
"preferred_address": "HOST",
"crash_count": 0,
"state": "RUNNING",
"since": 1739222044495241579,
"modification_tag": {
"epoch": "4a424a13-b5ba-47b7-771a-1a61d99c2524",
"index": 2
},
"presence": "SUSPECT",
"metric_tags": {
"app_id": "57a8e43b-81f9-46e9-9f78-81e15bbfd231",
"app_name": "static",
"instance_id": "3",
"organization_id": "c877a084-d65b-4758-9908-90201c6df339",
"organization_name": "org-1",
"process_id": "57a8e43b-81f9-46e9-9f78-81e15bbfd231",
"process_instance_id": "1f3ffac3-be77-45e0-5075-7357",
"process_type": "web",
"source_id": "57a8e43b-81f9-46e9-9f78-81e15bbfd231",
"space_id": "b248d5ab-2948-468b-ad0f-7b1b90e923d1",
"space_name": "space-1"
},
"OptionalRoutable": {
"routable": true
},
"availability_zone": "us-central1-f"
}
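The order dependence can be reproduced with the two entries above reduced to their index, state, and since fields. This is a minimal sketch only: the hash layout and the states_by_index helper are hypothetical and are not taken from cloud_controller_ng.

```ruby
# Hypothetical illustration; not the actual cloud_controller_ng code.
# The two actual LRPs from the cfdot output, reduced to the relevant fields.
unclaimed = { index: 3, state: 'UNCLAIMED', since: 1739568280529021112 }
running   = { index: 3, state: 'RUNNING',   since: 1739222044495241579 }

# Naive pass: key by index, so a later entry silently replaces an earlier one.
def states_by_index(actual_lrps)
  actual_lrps.each_with_object({}) do |lrp, result|
    result[lrp[:index]] = lrp[:state]
  end
end

# The reported state for index 3 depends only on the input order.
puts states_by_index([running, unclaimed])[3]
puts states_by_index([unclaimed, running])[3]
```

With the RUNNING entry last, the stale state wins even though the UNCLAIMED entry has the more recent since value.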
Fix
In the case of duplicates, CAPI should look at the since value of the actual_lrp information and take the latest definition.
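A minimal sketch of that selection rule follows, assuming each actual_lrp is available as a hash with index and since fields. The latest_lrp_by_index name and the hash layout are hypothetical and may differ from the code merged in this PR.

```ruby
# Hypothetical sketch; names are illustrative, not taken from the PR diff.
# For each index, keep only the actual LRP with the greatest `since` value.
def latest_lrp_by_index(actual_lrps)
  actual_lrps.each_with_object({}) do |lrp, latest|
    current = latest[lrp[:index]]
    latest[lrp[:index]] = lrp if current.nil? || lrp[:since] > current[:since]
  end
end

lrps = [
  { index: 3, state: 'UNCLAIMED', since: 1739568280529021112 },
  { index: 3, state: 'RUNNING',   since: 1739222044495241579 }
]

# The result no longer depends on the order in which entries arrive.
puts latest_lrp_by_index(lrps)[3][:state]
puts latest_lrp_by_index(lrps.reverse)[3][:state]
```

Because since is compared directly, the entry with the larger timestamp is kept regardless of where it appears in the list.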
Tested by killing the Diego cell VM with bosh delete-vm. cf app now returns the correct status:
instances:      0/2
memory usage:   1024M
     state   since                  cpu    memory     disk       logging        cpu entitlement   details
#0   down    2025-04-15T00:34:03Z   0.0%   0B of 0B   0B of 0B   0B/s of 0B/s                     unable to communicate to compatible cells
#1   down    2025-04-15T00:34:03Z   0.0%   0B of 0B   0B of 0B   0B/s of 0B/s                     unable to communicate to compatible cells
- I have reviewed the contributing guide
- I have viewed, signed, and submitted the Contributor License Agreement
- I have made this pull request to the main branch
- I have run all the unit tests using bundle exec rake
- I have run CF Acceptance Tests