AMKO uses federation to replicate AMKO configuration to a set of member clusters. This ensures a seamless recovery of AMKO configs for disaster recovery.
The set of clusters registered on AMKO is considered as a federation set. One of these clusters in the federation set must be designated as the leader. And, all the other clusters are designated as the followers.
Post a disaster, the user should manually pick any of the erstwhile follower clusters and designate that as the new leader. The user must ensure that only a single AMKO is designated as the leader at any given point in time.
Following objects are federated from the leader cluster to the follower:
GSLBConfig
GlobalDeploymentPolicy
or GDP
Add/Update/Delete events of the above objects are federated to the follower clusters.
Assume the following topology (a cluster is a kubernetes kubernetes/openshift cluster):
Each site has an Avi Controller deployed. Site 1’s Avi Controller is chosen as the GSLB Leader and all the other sites are GSLB followers. AMKO is deployed on all three sites. For cluster 1 (in site 1), it is marked as the leader. And, for clusters 2 and 3, it is marked as follower. User adds/updates the GSLBConfig
and GDP
objects in cluster 1, AMKO’s federator in cluster 1 federates the changes to these objects to all the follower clusters.
A CRD called AMKOCluster
governs the federation. A typical AMKOCluster
object for a leader AMKO looks like this:
apiVersion: amko.vmware.com/v1alpha1
kind: AMKOCluster
metadata:
name: amkocluster-sample
namespace: avi-system
spec:
isLeader: true
clusterContext: cluster1
version: 1.4.2
clusters:
- cluster1
- cluster2
status:
conditions:
- status: valid AMKOCluster object
type: current AMKOCluster Validation
- status: all cluster clients fetched
type: member cluster initialisation
- status: validated all member clusters
type: member cluster validation
- status: federated to all valid clusters successfully
type: GSLBConfig Federation
- status: federated to all valid clusters successfully
type: GDP Federation
namespace
: namespace of this object must be avi-system
.isLeader
: Users must specify whether the AMKO in the current cluster is leader. Default value is false
. If set to false
, AMKO won’t sync any objects to the Avi Controller, and the AMKO federator won’t federate the objects to the member clusters.clusterContext
: Users must specify the current cluster’s context. Providing the wrong cluster context can cause undefined behavior.version
: Current cluster’s AMKO version. If installed via helm, this field gets automatically filled up.clusters
: Member cluster list on which federation will be performed. Current cluster (if present) in this list will be ignored.status
: Indicates the current state of federation. Following types are reflected in the status:
current AMKOCluster Validation
: indicates the validity of the current AMKOCluster
object.member cluster initialisation
: indicates whether the cluster contexts given in spec.clusters
were fetched and initialised from the gslb-config-secret
secret. If a member cluster given in spec.clusters
is not found in the gslb-config-secret
, this step would fail.member cluster validation
: The federator validates all the member clusters in the spec.clusters
list and indicates a success/error. Validation includes some sanity checks, version mismatch checks, leader checks etc.GSLBConfig federation
: The federator indicates whether it was able to federate the GSLBConfig
object to all the clusters in spec.clusters
successfully.GDP Federation
: The federator indicates whether it was able to federate the GDP
/GlobalDeploymentPolicy
object to all the clusters in spec.clusters
successfully.Note that if helm
is used to deploy AMKO, this Custom Resource will be installed, and the users have to provide these values via values.yaml
.
Additional Notes:
spec.clusterContext
must contain the current cluster’s context.spec.version
is compared against the versions of all member clusters. All AMKO clusters must have the same version as the leader
cluster. The federation logic will not work if there’s a version mismatch.spec.clusters
contains the federation cluster set.AMKOCluster
’s status.During a cluster down event on the leader
AMKO, the federation of config objects will stop. However, at this point, all other clusters participating in federation would be synced with up to date configuration of the erstwhile GSLB leader
. Hence, switching to a new AMKO leader
does not require any manual steps of recovering the AMKO config objects.
Let’s assume that there are 3 sites with one cluster in each of them:
A disaster can occur either in the entire site or just for that cluster. Site failure would also mean that the kubernetes/openshift cluster along with the Avi Controller in that site are down. Whereas, a cluster failure would mean that only the kubernetes/openshift cluster is down.
The site where the leader AMKO was deployed and which hosted the Avi GSLB leader, fails. At this point, the user has to:
Choose a new follower AMKO to be the new leader, and perform the following steps on the cluster where this follower AMKO is deployed:
a. Edit the GSLBConfig
object and change the leader IP address:
$ kubectl edit gslbconfig -n avi-system gc-1
// Set the field spec.gslbLeader.controllerIP to the new leader's IP address
b. Set the isLeader
field in the AMKOCluster
object to true on this cluster:
$ kubectl edit amkocluster amkocluster-federation -n avi-system
This would reboot the new leader AMKO. Post reboot, the new leader
will take over the responsibilities of the earlier leader.
The cluster where the leader AMKO was deployed, fails. Since the Avi GSLB leader is still active, the user only has to choose a new AMKO leader out of the followers. The steps to recover AMKO and designation of the new leader
remains the same as the above section:
$ kubectl edit amkocluster amkocluster-federation -n avi-system
// set the spec.leader field to true
At any given point in time, the architecture only allows a single AMKO leader
. Conflicts leading from more than 1 leader must be resolved by the admin manually. This does not have any traffic impact on the existing GslbServices objects.
To resolve this situation, the admin must convert one of the leader AMKOs to follower by setting spec.isLeader
field to false
in the AMKOCluster
object:
$ kubectl edit amkocluster amkocluster-federation -n avi-system
This is especially important for situations, when a cluster, which hosted a leader instance of AMKO, previously failed. The user has switched a follower AMKO to be the new leader. And, the failed cluster recovers and brings back the old leader AMKO. The user must set the old leader to follower.