In some circumstances CAPI reports an app instance as running when it is actually down.
{
  "process_guid": "57a8e43b-81f9-46e9-9f78-81e15bbfd231-de7f7844-156e-4fc7-9f21-db5d072fb0b7",
  "index": 3,
  "domain": "cf-apps",
  "instance_guid": "",
  "cell_id": "",
  "address": "",
  "ports": null,
  "preferred_address": "UNKNOWN",
  "crash_count": 0,
  "state": "UNCLAIMED",
  "placement_error": "unable to communicate to compatible cells",
  "since": 1739568280529021112,
  "modification_tag": {
    "epoch": "780635af-9208-4d5e-5a08-ea49ebcb3f95",
    "index": 5758
  },
  "presence": "ORDINARY",
  "OptionalRoutable": {
    "routable": false
  },
  "availability_zone": ""
}
{
  "process_guid": "57a8e43b-81f9-46e9-9f78-81e15bbfd231-de7f7844-156e-4fc7-9f21-db5d072fb0b7",
  "index": 3,
  "domain": "cf-apps",
  "instance_guid": "1f3ffac3-be77-45e0-5075-7357",
  "cell_id": "23b06662-20e7-42dd-9377-6d8f10190ec4",
  "address": "10.0.4.17",
  "ports": [
    {
      "container_port": 8080,
      "host_port": 61012,
      "container_tls_proxy_port": 61001,
      "host_tls_proxy_port": 61014
    },
    {
      "container_port": 8080,
      "host_port": 61012,
      "container_tls_proxy_port": 61443,
      "host_tls_proxy_port": 0
    },
    {
      "container_port": 2222,
      "host_port": 61013,
      "container_tls_proxy_port": 61002,
      "host_tls_proxy_port": 61015
    }
  ],
  "instance_address": "10.255.233.24",
  "preferred_address": "HOST",
  "crash_count": 0,
  "state": "RUNNING",
  "since": 1739222044495241579,
  "modification_tag": {
    "epoch": "4a424a13-b5ba-47b7-771a-1a61d99c2524",
    "index": 2
  },
  "presence": "SUSPECT",
  "metric_tags": {
    "app_id": "57a8e43b-81f9-46e9-9f78-81e15bbfd231",
    "app_name": "static",
    "instance_id": "3",
    "organization_id": "c877a084-d65b-4758-9908-90201c6df339",
    "organization_name": "org-1",
    "process_id": "57a8e43b-81f9-46e9-9f78-81e15bbfd231",
    "process_instance_id": "1f3ffac3-be77-45e0-5075-7357",
    "process_type": "web",
    "source_id": "57a8e43b-81f9-46e9-9f78-81e15bbfd231",
    "space_id": "b248d5ab-2948-468b-ad0f-7b1b90e923d1",
    "space_name": "space-1"
  },
  "OptionalRoutable": {
    "routable": true
  },
  "availability_zone": "us-central1-f"
}
Reproduction
Steps to reproduce:
bosh delete-vm
CAPI iterates over all actual_lrps returned from Diego (8 in this case) and uses the app index as the key. In this case CAPI therefore overwrites each app instance's information once, and the state shown is determined by the order of the actual_lrp entries. See https://github.com/cloudfoundry/cloud_controller_ng/blob/main/lib/cloud_controller/diego/reporters/instances_stats_reporter.rb#L48-L56
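The overwrite behavior can be illustrated with a simplified sketch. This is not the code from instances_stats_reporter.rb: the method name state_by_index and the plain hashes standing in for actual_lrps are assumptions made for illustration only.

```ruby
# Simplified illustration only (hypothetical helper, not CAPI's reporter).
# Each actual LRP is stored under its index, so a later entry with the
# same index silently replaces the earlier one.
def state_by_index(actual_lrps)
  states = {}
  actual_lrps.each do |lrp|
    states[lrp[:index]] = lrp[:state]
  end
  states
end

unclaimed = { index: 3, state: 'UNCLAIMED' }
running   = { index: 3, state: 'RUNNING' }

# The reported state for index 3 depends only on which entry comes last.
last_is_running   = state_by_index([unclaimed, running])
last_is_unclaimed = state_by_index([running, unclaimed])
```

With the same two entries, the first call reports index 3 as RUNNING and the second as UNCLAIMED, which matches the order dependence described above.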
Example of a duplicate entry from cfdot actual-lrps (the two entries shown at the top of this report). Note that the process_guid and index are the same.
Fix
In the case of duplicates, CAPI should look at the since value of the actual_lrp information and take the latest definition.
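A minimal sketch of that selection rule, assuming each actual_lrp is available as a hash with index and since keys. The method name latest_lrp_per_index is hypothetical, and the real change would have to fit the data structures the reporter already uses.

```ruby
# Hypothetical helper, not CAPI's actual code: keep one actual LRP per
# index, preferring the entry with the greatest `since` timestamp.
def latest_lrp_per_index(actual_lrps)
  actual_lrps.each_with_object({}) do |lrp, by_index|
    current = by_index[lrp[:index]]
    by_index[lrp[:index]] = lrp if current.nil? || lrp[:since] > current[:since]
  end
end

# The two duplicate entries from this report, reduced to the relevant fields.
lrps = [
  { index: 3, state: 'RUNNING',   since: 1739222044495241579 },
  { index: 3, state: 'UNCLAIMED', since: 1739568280529021112 }
]

deduplicated = latest_lrp_per_index(lrps)
reversed     = latest_lrp_per_index(lrps.reverse)
```

For the duplicate shown in this report, the UNCLAIMED entry has the later since value, so it is selected regardless of the order in which Diego returns the entries.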