The Controller cluster in the NSX platform is the control plane component responsible for managing the switching and routing modules in the hypervisors.
Using the Controller cluster to manage VXLAN-based logical switches eliminates the need for multicast.
Each Controller Node is assigned a set of roles that define the type of tasks the node can implement. By default, each Controller Node is assigned all roles.
NSX controller roles:
API provider: Handles HTTP web service requests from external clients (NSX Manager) and initiates processing by other Controller Node tasks.
Persistence Server: Stores data from the NVP API and vDS devices that must be persisted across all Controller Nodes in case of node failures or shutdowns.
Logical manager: Monitors when end hosts arrive at or leave vDS devices and configures the vDS forwarding states to implement logical connectivity and policies.
Switch manager: Maintains management connections for one or more vDS devices.
Directory server: Manages the VXLAN and distributed logical routing directory information.
Any multi-node HA mechanism has the potential for a “split brain” scenario in which a cluster is partitioned into two or more groups, and those groups are not able to communicate. In this scenario, each group might assume control of all tasks under the assumption that the other nodes have failed. NSX uses leader election to solve this split-brain problem. One of the Controller Nodes is elected as a leader for each role, which requires a majority vote of all active and inactive nodes in the cluster.
The leader for each role is responsible for allocating tasks to individual Controller Nodes and determining when a node has failed. Since election requires a majority of all nodes, it is not possible for two leaders to exist simultaneously within a cluster, preventing a split-brain scenario. The leader election mechanism requires a majority of all cluster nodes to be functional at all times.
Note: NSX-V 6.1 currently supports a maximum of 3 controllers.
Here is an example of 3 NSX Controllers and the roles for which each node was elected master:
Node 1 master for roles: API Provider and Logical Manager
Node 2 master for roles: Persistence Server and Directory Server
Node 3 master for roles: Switch Manager.
The majority number depends on the number of Controller Cluster nodes. Deploying 2 nodes (traditionally considered an example of a redundant system) would increase the scalability of the Controller Cluster (since at steady state two nodes would work in parallel) without providing any additional resiliency. This is because with 2 nodes the majority number is 2, which means that if one of the two nodes failed, or if they lost communication with each other (dual-active scenario), neither of them would be able to keep functioning (accepting API calls, etc.). The same considerations apply to a deployment with 4 nodes, which cannot provide more resiliency than a cluster with 3 nodes (even if it provides better performance).
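The majority arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the "strictly more than half" rule; the function names are made up for this example and are not part of any NSX tool.

```python
def majority(cluster_size: int) -> int:
    """Smallest node count that is strictly more than half of the cluster."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """Number of nodes that can fail while a majority is still available."""
    return cluster_size - majority(cluster_size)


# Compare cluster sizes: 2 nodes tolerate no failure, 3 and 4 nodes tolerate one.
for n in range(1, 6):
    print(f"{n} node(s): majority={majority(n)}, tolerated failures={tolerated_failures(n)}")
```

Going from 3 to 4 nodes raises the majority number from 2 to 3 but leaves the number of tolerated failures at 1, which is why an even-sized cluster adds capacity but no resiliency.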
TSHOT NSX controllers
The following NSX Controller troubleshooting section is based on the VMware NSX MH 4.1 User Guide:
The NSX Controller node IP addresses used in the following screenshots are:
Node1 192.168.110.201, Node2 192.168.110.202, Node3 192.168.110.203
Verify NSX Controller installation
Ensure that the Controllers are installed on systems that meet the minimum requirements.
On each Controller:
The CLI command “request system compatibility-report” provides informational details that determine whether a Controller system is compatible with the Controller requirements.
# request system compatibility-report
Check controller status in NSX Manager
The NSX Manager continually checks whether all Controller Clusters are accessible. If a Controller Cluster is currently in disconnected status, your diagnostic efforts and log review should be focused on the time immediately after the Controller Cluster was last seen as connected.
Here is an example of a “Disconnected” controller as shown in NSX Manager:
The NSX “Controller nodes status” screenshot shows the status of the connection between NSX Manager and each Controller, not the overall controller cluster status.
So even if all controllers are in the “Normal” state, as in the figure below, that doesn’t mean the overall controller cluster status is OK.
Checking the Controller Cluster Status from CLI
The current status of the Controller Cluster can be determined by running show control-cluster status:
# show control-cluster status
Join status: verify that this node has completed the cluster join process.
Majority status: check whether this node is connected to the cluster majority.
Cluster ID: all member nodes must report the same cluster ID.
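The three checks above can be applied across all nodes at once. The Python sketch below assumes you have already copied the relevant values from each node's show control-cluster status output by hand into a small record; the NodeStatus type, its field names, and the example cluster IDs are hypothetical and do not correspond to any NSX API or output format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeStatus:
    # Hypothetical record, filled in manually from each node's CLI output.
    name: str
    join_complete: bool
    in_majority: bool
    cluster_id: str


def find_problems(nodes: List[NodeStatus]) -> List[str]:
    """Return a human-readable list of cluster status problems (empty if none)."""
    problems = []
    for node in nodes:
        if not node.join_complete:
            problems.append(f"{node.name}: join not complete")
        if not node.in_majority:
            problems.append(f"{node.name}: not connected to the cluster majority")
    # Every member must report the same cluster ID.
    cluster_ids = {node.cluster_id for node in nodes}
    if len(cluster_ids) > 1:
        problems.append(f"nodes report different cluster IDs: {sorted(cluster_ids)}")
    return problems


# Example with made-up values: Node3 has lost the majority and reports another ID.
nodes = [
    NodeStatus("Node1", True, True, "id-a"),
    NodeStatus("Node2", True, True, "id-a"),
    NodeStatus("Node3", True, False, "id-b"),
]
for problem in find_problems(nodes):
    print(problem)
```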
The current status of the Controller Node’s intra-cluster communication connections can be determined by running show control-cluster connections:
# show control-cluster connections
If a Controller node is a Controller Cluster majority leader, it will be listening on port 2878 (as indicated by the Y in the “listening” column).
The other Controller nodes will have a dash (-) in the “listening” column.
The next step is to check whether the Controller Cluster majority leader has any open connections, as indicated by the number in the “open conns” column. On a properly functioning Controller, the number of open connections should equal the number of other Controller nodes in the Controller Cluster (e.g., in a three-node Controller Cluster, the Controller Cluster majority leader should show two open connections).
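The leader check described above boils down to two rules: exactly one node is listening as majority leader, and that leader has one open connection per other node. A minimal Python sketch of those rules follows. It assumes the “listening” and “open conns” values have been collected by hand from each node; the row layout and function name are hypothetical.

```python
from typing import List, Tuple

# (node name, listening as majority leader?, number of open connections)
ConnectionRow = Tuple[str, bool, int]


def check_leader_connections(rows: List[ConnectionRow]) -> List[str]:
    """Return a list of problems found in the leader's connection state."""
    leaders = [row for row in rows if row[1]]
    if len(leaders) != 1:
        return [f"expected exactly one listening leader, found {len(leaders)}"]
    name, _, open_conns = leaders[0]
    expected = len(rows) - 1  # one connection per other cluster node
    if open_conns != expected:
        return [f"{name}: leader has {open_conns} open connection(s), expected {expected}"]
    return []


# Three-node cluster; the open-connection values of non-leaders are ignored here.
rows = [("Node1", True, 2), ("Node2", False, 0), ("Node3", False, 0)]
print(check_leader_connections(rows) or "leader connections look healthy")
```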
The command show control-cluster history will allow you to see a history of Controller Cluster-related events on this node including restarts, upgrades, Controller Cluster errors and loss of majority.
controller # show control-cluster history
Joining a Controller Node to Controller Cluster
This section covers issues that may be encountered when attempting to join a new Controller Node to an existing Controller Cluster. An explanation of why the issue occurs and instructions on how to resolve the issue are also provided.
Symptom: Joining a new Controller node to a Controller Cluster may fail when all of the existing Controllers are disconnected.
Example of this situation:
As we can see, controller-1 and controller-2 are disconnected from the NSX Manager.
When we try to add a new controller to the cluster, we get this error message:
If n nodes have joined the NSX Controller Cluster, then a majority (strictly greater than 50%) of those n nodes must be alive and connected to each other before any new data can be written to the system. This means that if you have a Controller Cluster of 3 nodes, 2 of them must be alive and connected in order for new data to be written in NSX.
In our case, to add a new controller node to the cluster we need at least one member of the cluster to be in the “Normal” state.
Resolution: Start the Disconnected Controller. If the Controller is disconnected due to a permanent failure, remove the Controller from the Controller Cluster.
Symptom: the join control-cluster CLI command hangs without ever completing the join operation.
Explanation: The IP address passed to the join control-cluster command was incorrect and/or does not refer to a currently live Controller node.
For example, the user types the command:
join control-cluster 192.168.110.201
Make sure that 192.168.110.201 is part of an existing controller cluster.
Resolution: Use the IP address of a properly configured Controller that is reachable across the network.
Symptom: The join control-cluster CLI command fails.
Explanation: Suppose a Controller that was configured as part of a Controller Cluster has been disconnected from that cluster for a long period of time (perhaps it was taken offline or shut down). If, during that time, the other Controllers were removed from that Controller Cluster and formed into a new Controller Cluster, the long-disconnected Controller will not be allowed to rejoin the Controller Cluster that it left, because that original Controller Cluster is gone.
The following event log message in the new Controller Cluster indicates that something like this has happened:
Node b567a47f-9a61-43b3-8d53-36b3d1fd0675 tried to join with incorrect cluster ID
Resolution: Issue the join control-cluster command with the force option on the old Controller to force it to clear its state and join the new Controller Cluster with a fresh start.
Note: The forced join command deletes a previously joined node with the same IP.
nvp-controller # join control-cluster 192.168.110.201 force
Recovering a node disconnected from the cluster
When a controller cluster majority issue arises, it can be very difficult to spot from the NSX Manager GUI.
For example, from the NSX Manager point of view, all the members are currently in the “Normal” state.
But in fact the current status in my cluster is:
Node 1 and Node 2 have formed a cluster and share the roles between them; for some reason, Node 3 is disconnected from the cluster majority:
Output example from controller Node 3:
Node 3 thinks it is alone and owns all of the roles.
From Node 1’s perspective, it is the leader (it has the Y) and has one open connection from Node 2, as shown:
To recover from this scenario, Node 3 needs to join the cluster majority. The IP address to join must be that of Node 1, because it is the leader of the majority.
join control-cluster 192.168.110.201 force
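The reasoning behind this recovery can be sketched in Python: among the groups of nodes that can still talk to each other, only a group holding a strict majority of the whole cluster is authoritative, and every node outside it has to rejoin. The partition data below is entered by hand to mirror the scenario in this section, and the function name is hypothetical.

```python
from typing import List, Optional, Set


def majority_partition(partitions: List[Set[str]], cluster_size: int) -> Optional[Set[str]]:
    """Return the partition that holds a strict majority of the cluster, if any."""
    for group in partitions:
        if len(group) > cluster_size / 2:
            return group
    return None


# Scenario from this section: Node1 and Node2 see each other, Node3 is isolated.
partitions = [{"Node1", "Node2"}, {"Node3"}]
healthy = majority_partition(partitions, cluster_size=3)
to_rejoin = sorted(set().union(*partitions) - (healthy or set()))

print("majority partition:", sorted(healthy) if healthy else "none")
print("nodes that must rejoin:", to_rejoin)
```

In a two-node cluster split into two single-node groups, the function returns None, which matches the earlier point that neither node can keep functioning.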
Recovering from the loss of all Controller Nodes
In this scenario all NSX Controller nodes have failed or been deleted. Do we need to start from scratch? 😦
The assumption is that our environment already has NSX Edge and DLR deployed, with logical switches connected to VMs, and we would like to preserve them.
The recovery process:
Migrate the existing logical switches to Multicast mode.
Deploy 3 new NSX controllers.
Sync the newly deployed NSX controllers, in unicast mode, with the current state of our NSX environment.
Other useful commands:
Checking Controller Processes
Even if the “join-cluster” command on a node appears to have been successful, the node might not have come up completely for a variety of reasons. The way this error tends to manifest itself most visibly is that the controller process isn’t listening on all the ports it’s supposed to be, and no API requests or switch connections are happening.
# show network connections of-type tcp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program
tcp 0 0 172.29.1.20:6633 0.0.0.0:* LISTEN 14038/domain
tcp 0 0 172.29.1.20:7000 0.0.0.0:* LISTEN 14072/java
tcp 0 0 0.0.0.0:443 0.0.0.0:* LISTEN 14067/domain
tcp 0 0 172.29.1.20:7777 0.0.0.0:* LISTEN 14038/domain
tcp 0 0 172.29.1.20:6632 0.0.0.0:* LISTEN 14038/domain
tcp 0 0 172.29.1.20:9160 0.0.0.0:* LISTEN 14072/java
tcp 0 0 172.29.1.20:2888 0.0.0.0:* LISTEN 14072/java
tcp 0 0 172.29.1.20:2888 172.29.1.20:55622 ESTABLISHED 14072/java
tcp 0 0 172.29.1.20:9160 172.29.1.20:52567 ESTABLISHED 14072/java
tcp 0 0 172.29.1.20:52566 172.29.1.20:9160 ESTABLISHED 14038/domain
tcp 0 0 172.29.1.20:443 172.17.21.9:46438 ESTABLISHED 14067/domain
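To compare a node against a listing like the one above, the listening ports can be extracted and checked against a reference set. The Python sketch below is only an illustration: the EXPECTED_PORTS set is an assumption derived from the LISTEN lines in this sample output, not an official list of Controller ports, and the function name is hypothetical.

```python
from typing import Set

# A few lines copied from the sample output above.
SAMPLE = """\
tcp 0 0 172.29.1.20:6633 0.0.0.0:* LISTEN 14038/domain
tcp 0 0 172.29.1.20:7000 0.0.0.0:* LISTEN 14072/java
tcp 0 0 0.0.0.0:443 0.0.0.0:* LISTEN 14067/domain
tcp 0 0 172.29.1.20:7777 0.0.0.0:* LISTEN 14038/domain
tcp 0 0 172.29.1.20:6632 0.0.0.0:* LISTEN 14038/domain
tcp 0 0 172.29.1.20:9160 0.0.0.0:* LISTEN 14072/java
tcp 0 0 172.29.1.20:2888 0.0.0.0:* LISTEN 14072/java
tcp 0 0 172.29.1.20:443 172.17.21.9:46438 ESTABLISHED 14067/domain
"""

# Assumption: the ports seen in LISTEN state in the sample output above.
EXPECTED_PORTS = {443, 2888, 6632, 6633, 7000, 7777, 9160}


def listening_ports(output: str) -> Set[int]:
    """Collect the local TCP ports that appear in LISTEN state."""
    ports = set()
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[0] == "tcp" and fields[5] == "LISTEN":
            # The local address is the fourth column, formatted as ip:port.
            ports.add(int(fields[3].rsplit(":", 1)[1]))
    return ports


missing = EXPECTED_PORTS - listening_ports(SAMPLE)
print("missing listening ports:", sorted(missing) if missing else "none")
```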
The show network connections output shown above is an example from a healthy Controller. If you find some of these missing, it’s likely that NSX didn’t get past its install phase. Here are some misconfigurations that can cause this:
Bad management address or listen IP
You’ve set an incorrect IP as the management-address, or as the listen-ip for one of the roles (like switch_manager or api_provider).
NSX attempts to bind to the specified address, and fails early if it cannot do so. You’ll see log messages in cloudnet_cpp.log.ERROR like:
E0506 01:20:17.099596 7188 dso-deployer.cc:516] Controller component installation of rpc-broker failed: Unable to bind a RPC port $tags:tracing:3ef7d1f519ffb7fb^
E0506 01:20:17.100162 7188 main.cc:271] RPC deployment subsystem not installed; exiting. $tags:tracing:3ef7d1f519ffb7fb^
Or in cloudnet_cpp.log.WARNING:
W0506 01:22:27.721777 7694 ssl-socket.cc:530] SSLSocket failed to bind to 126.96.36.199:6632: Cannot assign requested address
Note that if you are using DHCP for the IP addresses of your controller nodes (not recommended or supported), the IP address could have changed since the last time you configured it.
Verify that the IP addresses for switch_manager and api_provider are what they are supposed to be by performing the CLI command:
Bad first node address
You’ve provided the wrong IP address for the first node in the Controller Cluster. Run show
to determine whether the IPs listed correspond to the IPs of the Controllers in the Controller Cluster.
Out of disk space
The Controller may be out of disk space. Use the
see if any of the partitions have 0 bytes available.
The NSX CLI command show system statistics can be used to display resource utilization for disk space, disk I/O, memory, CPU and various other processes on the Controller Nodes. The command offers statistics with one-minute intervals for a window of one hour for various combinations. The show system statistics CLI command does auto-completion and can be used to view the list of metric data available.
show system statistics <datasource> : for the tabular output
show system statistics graph <datasource> : for the graphical format output
As an example, the following output shows the RRD statistics for the datasource disk_ops:write associated with the disk sda1 on the Controller in a tabular form:
# show system statistics disk-sda1/disk_ops:write
12:26 0.665714 <snip>
# show network interface
# show network default-gateway
# show network dns-servers
# show network ntp-servers
# show network ntp-status
# traceroute <ip_address or dns_name>
# ping <ip address>
# ping interface addr <alternate_src_ip> <ip_address>
# watch network interface breth0 traffic