Commit 8b9fcd9
fix(cluster) use 'listen_address' for contact point in refresh()
Previously, `coordinator.host` was used to add the contact point to the
LB policy. If the user specified a hostname, the node was therefore
indexed by that hostname instead of by its IP address. On its own this
is harmless, apart from inconsistent log messages (sometimes an IP
address shows up, other times a hostname).
Problem
-------
An issue arises, however, when both of the following are true:
1. Several Cluster instances call `:refresh()` on the same C* cluster
2. DNS round-robin is in effect for the contact point hostnames
Let's consider clusterA and clusterB, both instances of the Cluster
module. Let's also consider the following C* cluster:
10.16.0.1 node1
10.16.0.2 node2
And the following DNS record:
cassandra.default.svc.cluster.local. 30 IN A 10.16.0.1
cassandra.default.svc.cluster.local. 30 IN A 10.16.0.2
First, clusterA calls `refresh()`, with `contact_points = { "cassandra"
}`, and as a result inserts the following topology in the cluster's shm:
cassandra:[peer info]
10.16.0.2:[peer info]
Its LB policy now has 2 entries: `cassandra` (which resolved to
`10.16.0.1` this time) and `10.16.0.2`.
Then, clusterB calls `refresh()` as well, with the same `contact_points`
option, and as a result first purges the cluster's shm content, before
inserting the following:
10.16.0.1:[peer info]
cassandra:[peer info]
Note that because of the round-robin DNS resolution, `cassandra` pointed
to `10.16.0.2` this time.
Now, when clusterA invokes its LB policy to elect a peer for a given
query, it will eventually look up `10.16.0.2`. However, that entry no
longer exists in the cluster's shm, so the following error is returned:
no host details for 10.16.0.2
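The sequence above can be reproduced with a small simulation. This is only an illustrative sketch in Python, not the Cluster module's code: a plain dict stands in for the shared shm, a cycling iterator stands in for round-robin DNS, and the `refresh` and `host_details` helpers are hypothetical names chosen for this example.

```python
# Simulation only: plain Python objects stand in for the shm, DNS and LB policy.
from itertools import cycle

NODES = ["10.16.0.1", "10.16.0.2"]      # the C* cluster from the example
dns = cycle(NODES)                      # round-robin answers for "cassandra"
shm = {}                                # shared by every Cluster instance


def refresh(contact_point):
    """Rebuild the shared topology, indexing the coordinator by contact point."""
    coordinator_ip = next(dns)          # node the hostname resolved to this time
    shm.clear()                         # refresh() first purges the old content
    shm[contact_point] = "[peer info]"  # coordinator stored under the hostname
    for ip in NODES:
        if ip != coordinator_ip:
            shm[ip] = "[peer info]"     # remaining peers stored under their IP
    return list(shm)                    # snapshot used as this instance's LB policy


def host_details(host):
    """Look up a host in the shared shm, failing as in the reported error."""
    if host not in shm:
        raise KeyError("no host details for " + host)
    return shm[host]


policy_a = refresh("cassandra")         # clusterA: cassandra, 10.16.0.2
policy_b = refresh("cassandra")         # clusterB: cassandra, 10.16.0.1

errors = []
for host in policy_a:                   # clusterA walks its now stale policy
    try:
        host_details(host)
    except KeyError as err:
        errors.append(err.args[0])

print(errors)
```

In this sketch clusterA's policy still lists `10.16.0.2`, but clusterB's refresh stored that node under the `cassandra` key, so the lookup by IP fails.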
Proposed solution
-----------------
Change the cache key of the peer's info in the shm from the specified
`contact_point` value (which is the user's input) to the
`listen_address` column of the `system.local` table. Host details are
then no longer stored by hostname.
This has the added benefit of ensuring that all logs and other
operations done by the Cluster module always use the IP address of the
node.
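The effect of the new key can be sketched with the same kind of simulation. Again this is an illustrative Python model rather than the module's implementation: the dict, the cycling DNS iterator and the `refresh` helper are hypothetical, and the node's `listen_address` is assumed to equal its IP address from the example table.

```python
# Simulation only: the coordinator is now indexed by its listen_address.
from itertools import cycle

NODES = ["10.16.0.1", "10.16.0.2"]      # the C* cluster from the example
dns = cycle(NODES)                      # round-robin answers for "cassandra"
shm = {}                                # shared by every Cluster instance


def refresh(contact_point):
    """Rebuild the shared topology, indexing every node by IP address."""
    # contact_point is only used to reach a coordinator, never as a key.
    coordinator_ip = next(dns)
    # Stand-in for the listen_address value read from system.local
    # (assumed here to be the node's IP address).
    listen_address = coordinator_ip
    shm.clear()                         # refresh() first purges the old content
    shm[listen_address] = "[peer info]"
    for ip in NODES:
        if ip != coordinator_ip:
            shm[ip] = "[peer info]"
    return sorted(shm)                  # snapshot used as this instance's LB policy


policy_a = refresh("cassandra")         # clusterA connected through 10.16.0.1
policy_b = refresh("cassandra")         # clusterB connected through 10.16.0.2

# Every host in clusterA's policy is still present after clusterB's refresh.
missing = [host for host in policy_a if host not in shm]
print(policy_a, policy_b, missing)
```

Because both instances now derive the same keys regardless of which node the hostname resolved to, a refresh by one instance no longer invalidates the other's policy in this model.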
From #1181 parent f43c638 commit 8b9fcd9
1 file changed
Lines changed: 2 additions & 2 deletions
Changed lines: 482 and 496 (one line replaced in each hunk).