NSX-V Troubleshooting registration to vCenter

NSX Manager is tightly coupled to vCenter, with a 1:1 relationship between the two.

Connecting NSX Manager to vCenter involves two separate configuration steps:

“Lookup Service” and “vCenter Server”.

1

Lookup Service:

The Lookup Service allows group-based authentication to NSX Manager. It is an optional configuration; without it, NSX functionality is not affected at all.

vCenter Server:

This configuration is mandatory. Registering with vCenter injects a plugin into the vSphere Web Client for consumption within the Web management platform.

While trying to register with vCenter or configure the Lookup Service you might see this error:

“nested exception is java.net.UnknownHostException: vc-l-01a.corp.local( vc-l-01a.corp.local )”

2

Or when trying to set up the Lookup Service:

“nested exception is java.net.UnknownHostException: vc-l-01a.corp.local( vc-l-01a.corp.local )”

3

Or an error similar to this:

“NSX Management Service operation failed.( Initialization of Admin Registration Service Provider failed. Root Cause: Error occurred while registration of lookup service, com.vmware.vim.sso.admin.exception.InternalError: General failure. )”

 

Most of the problems registering NSX Manager with vCenter or configuring the SSO Lookup Service fall into one of these categories:

  1. Connectivity problems between NSX Manager and vCenter.
  2. A firewall blocking the connection.
  3. DNS not configured properly on NSX Manager or vCenter.
  4. Time not synchronized between NSX Manager and vCenter.
  5. The user used for SSO lacking administrative rights.

 

TSHOT steps

Connectivity issue:

Verify connectivity from NSX Manager to vCenter. Ping vCenter from NSX Manager by both IP address and FQDN.

Check for a static or default route on NSX Manager:

nsxmgr-l-01a# show ip route

Codes: K – kernel route, C – connected, S – static,

> – selected route, * – FIB route

S>* 0.0.0.0/0 [1/0] via 192.168.110.2, mgmt

C>* 192.168.110.0/24 is directly connected, mgmt

 

DNS Issue:

Verify DNS resolution from NSX Manager. Ping vCenter from NSX Manager by FQDN.

nsxmgr-l-01a# ping vc-l-01a.corp.local

PING vc-l-01a.corp.local (192.168.110.22): 56 data bytes

64 bytes from 192.168.110.22: icmp_seq=0 ttl=64 time=0.576 ms

If this does not work, verify that DNS is configured on NSX Manager.

Go to Manage -> Network -> DNS Servers:

4

Firewall Issue:

If there is a firewall between NSX Manager and vCenter, verify that it allows SSL on TCP/443; also allow ping for connectivity checks.
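If you want to run these two checks together from a management host, here is a minimal Python sketch; the FQDN is the lab’s vc-l-01a.corp.local, and the host you run it from must sit on the same network path you are troubleshooting:

# Minimal pre-check sketch: resolve the vCenter FQDN and test TCP/443 reachability.
# The hostname is the example used in this lab; adjust for your environment.
import socket

VCENTER_FQDN = "vc-l-01a.corp.local"
PORT = 443  # SSL port NSX Manager uses to reach vCenter

try:
    ip = socket.gethostbyname(VCENTER_FQDN)
    print("DNS OK: %s resolves to %s" % (VCENTER_FQDN, ip))
except socket.gaierror as err:
    raise SystemExit("DNS failure: %s (check DNS servers on NSX Manager)" % err)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(5)
try:
    sock.connect((ip, PORT))
    print("TCP/%d reachable on %s" % (PORT, ip))
except socket.error as err:
    raise SystemExit("TCP/%d blocked or filtered: %s (check firewall rules)" % (PORT, err))
finally:
    sock.close()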

5

NTP issue:

Verify that the time is synchronized between vCenter and NSX Manager.

6

From NSX Manager CLI:

nsxmgr-l-01a# show clock
Tue Nov 18 06:51:34 UTC 2014

 

From vCenter CLI:

vc-l-01a:~ # date
Tue Nov 18 06:51:31 UTC 2014

Note: After changing the time settings, the appliance needs to be restarted.
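As a quick worked example, the two outputs above can also be compared programmatically. This is only a sketch using the timestamps captured in this lab, with an arbitrary 60-second threshold:

# Compare the "show clock" output from NSX Manager with the "date" output from vCenter
# and flag drift above a small tolerance. Replace the strings with your own outputs.
from datetime import datetime

FMT = "%a %b %d %H:%M:%S %Z %Y"   # e.g. "Tue Nov 18 06:51:34 UTC 2014"

nsx_clock = datetime.strptime("Tue Nov 18 06:51:34 UTC 2014", FMT)
vc_clock  = datetime.strptime("Tue Nov 18 06:51:31 UTC 2014", FMT)

drift = abs((nsx_clock - vc_clock).total_seconds())
print("Clock drift: %.0f seconds" % drift)
if drift > 60:   # arbitrary example threshold; SSO registration is sensitive to skew
    print("WARNING: clocks are out of sync, fix NTP before registering")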

 

User permission issue:

The user account used to register with vCenter or the Lookup Service needs administrative rights.

Try working with the default administrator user: administrator@vsphere.local


Upgrade NSX-V, The right Way

During November I had the opportunity to take an NSX advanced bootcamp with one of the brilliant PSO Architects in the NSX field, Kevin Barrass.
This blog is based on Kevin's lecture; I added the screenshots and my own experience.

Upgrading NSX can be very easy if planned right, or very frustrating if we try to take shortcuts in the process. In this blog I will try to document all the steps needed for a complete NSX-V upgrade.

High level upgrade flow:

Upgrade NSX Process


Before starting the upgrade procedure, the following pre-upgrade steps must be taken into consideration:

  1. Read the NSX release notes.
  2. Check the MD5 of the upgrade file.
  3. Verify the state of the NSX and vSphere infrastructure.
  4. Preserve the NSX infrastructure.

 

Read the NSX release notes:

How many times have you faced an issue during an upgrade, wasted hours on troubleshooting while being sure you followed the guide exactly, opened a support ticket, and got the answer: you are hitting a known upgrade issue and the workaround is written in the release notes. RTFM. Feeling dumb…? :)

This line is written in blood, do not skip this step!!!! Read the release notes.

 

Compare the MD5

Download any of your favorite MD5 tools; I'm using the free winMd5Sum.

2

Compare the MD5 sum you calculate against the official MD5 published on the VMware download site.

The link to software:

http://www.nullriver.com/
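If you prefer not to install a GUI tool, a short Python sketch does the same job; the expected hash below is a placeholder you would copy from the VMware download page:

# Compute the upgrade bundle's MD5 and compare it against the published checksum.
import hashlib

BUNDLE = "VMware-NSX-Manager-upgrade-bundle-6.1.0-2107742.tar.gz"
EXPECTED_MD5 = "<md5 string copied from the VMware download page>"

md5 = hashlib.md5()
with open(BUNDLE, "rb") as f:
    for chunk in iter(lambda: f.read(1024 * 1024), b""):
        md5.update(chunk)

print("Calculated:", md5.hexdigest())
if md5.hexdigest().lower() == EXPECTED_MD5.lower():
    print("MD5 matches, bundle is intact")
else:
    print("MD5 mismatch, download the bundle again")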

Verify NSX working state

Again, this one comes from the field. The scenario: you complete the upgrade process and now face an issue. How do you know the issue wasn't there before the upgrade started?

Do not assume everything is working before you start to touch the infrastructure. Check it!

  1. Note the current versions of NSX Manager, vCenter, ESXi and the Edges, and verify you can log in to:
  • NSX Manager Web UI
  • vCenter, and see the NSX Manager plugin
  • ESG and DLR Control VMs
  2. Validate VXLAN is functional:
  • Ping between two VMs on the same logical switch (on different hosts): ping -l 1472 -f <dest VM>
  • Ping between two VTEPs (on different hosts): ping ++netstack=vxlan -d -s 1572 <dest VTEP IP>
  3. Validate north-south traffic by pinging out from a VM.
  4. Visually inspect Host Prep, Logical Network Prep and Edges (check that everything is green).

Verify vSphere working state

Check DRS is enabled on clusters

Validate vMotion functions correctly

Check host connection state with vCenter

 

Check that you have a minimum of 3 ESXi hosts in each NSX cluster.

During an NSX upgrade, in some situations an NSX cluster with 2 hosts or fewer can hit issues with DRS, Admission Control or anti-affinity rules. My recommendation for a successful upgrade is to work with 3 hosts in each NSX cluster you plan to upgrade.

Preserve the NSX infrastructure

Do the upgrade during a maintenance window

Back up the NSX Manager:

Create a current backup of the NSX Manager, and check that you know the backup password 🙂

3

Take a VM-level snapshot of the NSX Manager where possible:

4

Take controller snapshot

Starting from 6.0.4 there is a special API call to take a snapshot of a controller:

https://NSXManagerIPAddress/api/2.0/vdn/controller/controllerID/snapshot

Example of backing up controller-1:

5

 

To actually retrieve the backup file we can use a browser:

6
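Instead of a browser, the same API call can be scripted. The sketch below uses the requests library against the URL format shown above; the NSX Manager address, credentials and controller ID are lab examples, and you should confirm the exact behaviour against the API guide for your NSX version:

# Pull a controller snapshot through the documented API call and save it to disk.
import requests

NSX_MGR = "https://192.168.110.42"          # example NSX Manager address from this lab
CONTROLLER_ID = "controller-1"
URL = "%s/api/2.0/vdn/controller/%s/snapshot" % (NSX_MGR, CONTROLLER_ID)

resp = requests.get(URL, auth=("admin", "VMware1!"), verify=False, stream=True)
resp.raise_for_status()

with open("%s-snapshot.bin" % CONTROLLER_ID, "wb") as backup:
    for chunk in resp.iter_content(chunk_size=64 * 1024):
        backup.write(chunk)
print("Snapshot saved for", CONTROLLER_ID)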

Starting from version 6.1.x we can take the snapshot via the UI:

7

Export the Distributed Firewall Rules if DFW is in use:

8

Upgrade NSX manager

Verify that the NSX Manager upgrade bundle file name ends with tar.gz.

Some browsers may remove the .gz extension. If the file looks like:

VMware-NSX-Manager-upgrade-bundle-6.1.0-X.X.gz

Change it to:

VMware-NSX-Manager-upgrade-bundle-6.1.0-2107742.tar.gz

Otherwise you will get an error after the upload of the bundle to NSX Manager completes:

“Invalid upgrade bundle file VMware-NSX-Manager-upgrade-bundle-6.0.x-xxxxx,gz, upgrade file name has extension tar.gz”
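A tiny, optional sketch of that rename, in case you want to script the fix before uploading (the file name is the example used above):

# Restore the ".tar.gz" extension if the browser stripped it during download.
import os

name = "VMware-NSX-Manager-upgrade-bundle-6.1.0-2107742.gz"
if name.endswith(".gz") and not name.endswith(".tar.gz"):
    fixed = name[:-len(".gz")] + ".tar.gz"
    os.rename(name, fixed)
    print("Renamed to", fixed)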

9

NSX Manager upgrade: open the NSX Manager web interface and click on Upgrade:

10

Click on the Upgrade button:

11

Click “Browse”, select the upgrade file, and click Continue:

12

 

Note: NSX Manager will reboot during the upgrade process. The forwarding path of VM workloads will not be affected during this step unless:

We are using identity-based distributed firewall rules and a new user logs in while NSX Manager is down.

 

The upgrade process consists of two steps: validating the tar.gz image and starting the actual upgrade:

13

When NSX Manager finishes the validation, the upgrade process starts:

14

After the Manager upgrade completes, confirm the version from the Summary tab of the NSX Manager Web UI:

15

Upgrade the NSX controllers

During the controller node upgrade, the upgrade file is downloaded to each node; the process upgrades node 1 first, then node 2, and finally node 3.

To start the upgrade process click on “Upgrade Available”:

16

During the NSX controller upgrade we will pass through this state:

Node 1: upgrade to 6.1 complete

Node 2: rebooting

Node 3: in Normal state but still on version 6.0

Result: we have only one active node on 6.1, and as a consequence the controller cluster loses majority due to the version mismatch.

17

What does this mean? There is an impact on the control plane.

In a live virtual environment with DRS enabled, vMotion can occur and a VM may change its current ESXi host location; as a result it may face forwarding issues, because the other VTEPs will not learn about this update.

Another issue may occur if dynamic routing receives a topology update, for example a new route being added or removed. To avoid this issue we need to keep routing unchanged.

To limit the exposure window for forwarding issues with workload VMs, my recommendation is to change the DRS setting to manual; this prevents VM vMotion in the NSX clusters during the controller update!

18

Note: After the controller upgrade completes, change DRS back to its previous configuration.
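If you want to script the DRS change instead of using the UI, here is a hedged pyVmomi sketch; the vCenter address, credentials and cluster name are lab assumptions, and you should verify it against your own environment before relying on it:

# Switch a cluster's DRS automation level to "manual" before the controller upgrade,
# and run it again with "fullyAutomated" afterwards to restore the previous behaviour.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

def set_drs_behavior(cluster, behavior):
    # behavior is one of: "manual", "partiallyAutomated", "fullyAutomated"
    spec = vim.cluster.ConfigSpecEx()
    spec.drsConfig = vim.cluster.DrsConfigInfo(defaultVmBehavior=behavior)
    return cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)

si = SmartConnect(host="vc-l-01a.corp.local", user="administrator@vsphere.local",
                  pwd="VMware1!", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.ClusterComputeResource], True)
for cluster in view.view:
    if cluster.name == "Compute Cluster A":        # example cluster name
        set_drs_behavior(cluster, "manual")
Disconnect(si)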

If we can ensure that VMs do not move and dynamic routes do not change, then there is no impact on the data plane.

When controller node 2 completes its reboot, we have two controllers upgraded and on the same version. At that point we regain cluster majority; the remaining controller node still needs to finish its upgrade and reboot.

19

When all three controller nodes have completed their reboot, the cluster upgrade is done.

20

Upgrade Clusters

During the NSX cluster upgrade, ESXi hosts require a reboot. There will be no data plane impact for VMs because they will be moved automatically by DRS.

If DRS is disabled, the vSphere admin will need to move the VMs manually and reboot each ESXi host.

This is the reason admission control with 2 hosts may prevent the automatic host upgrade. My recommendation is to avoid 2-host clusters, or manually evacuate a host and put it into maintenance mode.

If you have created anti-affinity rules for the Controllers, a 3-host cluster will prevent the automatic upgrade.

21

Disable the anti-affinity rules by unchecking “Enable rule” to allow the automatic host upgrade, and re-enable them after the upgrade completes.

22

With the default anti-affinity rules for Edges/DLR, a 2-host cluster will prevent the upgrade. Uncheck “Enable rule” on the Edge anti-affinity rules to allow the automatic host upgrade, and re-enable it after the upgrade completes.

23

Click Cluster Host “Update”

If an upgrade is available for the cluster, an “Update” link is shown in the NSX UI. When the upgrade is initiated, NSX Manager updates the NSX VIBs on each host.

Click on “Update” to upgrade the cluster:

24

VIBs are updated on hosts

25

Hosts reboot during the upgrade:

26

The task view reveals what happens while the upgrade process runs:

27

Once all hosts are rebooted, the host update is completed.

28

Upgrade DLR and ESG’s

During the upgrade process a new ESG VM is deployed alongside the existing one; when the new ESG is ready, the old ESG vNICs are disconnected and the new ESG vNICs are connected. The new ESG then sends GARP.

This process can affect the forwarding plane; we can minimize the impact by running the Edges in ECMP mode.

Go to NSX Edges and Upgrade each one

29

Each ESG/DLR will then be upgraded

Check that the status is Deployed and the version is correct:

31

Upgrade Guest Introspection / Data Security if required

If an upgrade is available for Guest Introspection / Data Security, an upgrade link is shown in the NSX UI.

32

Click on Upgrade if it is available.

Follow the NSX installation guide for specific details on upgrading Guest Introspection / Data Security.

Once the upgrade is successful, create a new NSX Manager backup; the previous NSX Manager backup is only valid for the previous release.

33

Don't forget to verify the NSX working state.


Troubleshooting NSX-V Controller

Overview

The Controller cluster in the NSX platform is the control plane component responsible for managing the switching and routing modules in the hypervisors.

The use of a controller cluster to manage VXLAN-based logical switches eliminates the need for multicast.

1

Each Controller Node is assigned a set of roles that define the type of tasks the node can implement. By default, each Controller Node is assigned all roles.

NSX controller roles:

API provider: Handles HTTP web service requests from external clients (NSX Manager) and initiates processing by other Controller Node tasks.

Persistence Server: Stores data from the NVP API and vDS devices that must be persisted across all Controller Nodes in case of node failures or shutdowns.

Logical manager: Monitors when end hosts arrive at or leave vDS devices and configures the vDS forwarding states to implement logical connectivity and policies.

Switch manager: Maintains management connections for one or more vDS devices.

Directory server: Manages the VXLAN and distributed logical routing directory information.

Any multi-node HA mechanism has the potential for a “split brain” scenario in which a cluster is partitioned into two or more groups, and those groups are not able to communicate. In this scenario, each group might assume control of all tasks under the assumption that the other nodes have failed. NSX uses leader election to solve this split-brain problem. One of the Controller Nodes is elected as a leader for each role, which requires a majority vote of all active and inactive nodes in the cluster.

2

The leader for each role is responsible for allocating tasks to individual Controller Nodes and determining when a node has failed. Since election requires a majority of all nodes,

it is not possible for two leaders to exist simultaneously within a cluster, preventing a split brain scenario. The leader election mechanism requires a majority of all cluster nodes to be functional at all times.

Note: Currently NSX-V 6.1 supports a maximum of 3 controllers.

Here is an example of 3 NSX Controllers and the role election across the node members.

3

Node 1 master for roles:  API Provider and Logical Manager

Node 2 master for roles: Persistence Server and Directory Server

Node 3 master for roles: Switch Manager.

The majority number varies depending on the number of Controller Cluster nodes. It is evident that deploying 2 nodes (traditionally considered an example of a redundant system) would only increase the scalability of the Controller Cluster (since at steady state two nodes work in parallel)

without providing any additional resiliency. This is because with 2 nodes, the majority number is 2 and that means that if one of the two nodes were to fail, or they lost communication with each other (dual-active scenario), neither of them would be able to keep functioning (accepting API calls, etc.). The same considerations apply to a deployment with 4 nodes that cannot provide more resiliency than a cluster with 3 elements (even if providing better performance).

 

TSHOT NSX controllers

The next part, on troubleshooting the NSX Controller, is based on the VMware NSX MH 4.1 User Guide:

https://my.vmware.com/web/vmware/details?productId=418&downloadGroup=NSX-MH-412-DOC

The NSX Controller node IP addresses in the next screenshots are:

Node 1 192.168.110.201, Node 2 192.168.110.202, Node 3 192.168.110.203

Verify NSX Controller installation

Ensure that the Controllers are installed on systems that meet the minimum requirements.
On each Controller:

The CLI command “request system compatibility-report” provides informational details that determine whether a Controller system is compatible with the Controller requirements.

# request system compatibility-report

4

 

Check controller status in NSX Manager

The NSX Manager continually checks whether all Controller Clusters are accessible. If a Controller Cluster is currently in disconnected status, your diagnostic efforts and log review should be focused on the time immediately after the Controller Cluster was last seen as connected.

Here is an example of a “Disconnected” controller in NSX Manager:

5

This NSX “Controller node status” screenshot shows the status between NSX Manager and each Controller, not the overall controller cluster status.

So even if all controllers are in “Normal” state as in the figure below, that doesn't mean the overall cluster status is OK.

11

Checking the Controller Cluster Status from CLI

The current status of the Controller Cluster can be determined by running show control-cluster status:

 

# show control-cluster status

6

Join status: verifies that this node has completed the cluster join process.

Majority status: shows whether this node is part of the cluster majority.

Cluster ID: all node members need to have the same cluster ID.

The current status of the Controller Node’s intra-cluster communication connections can be determined by running

show control-cluster connections

7

If a Controller node is a Controller Cluster majority leader, it will be listening on port 2878 (as indicated by the Y in the “listening” column).

The other Controller nodes will have a dash (-) in the “listening” column.

The next step is to check whether the Controller Cluster majority leader has any open connections as indicated by the number in the “open conns” column. On a properly functioning Controller, the open connections should be the same as the number of other Controller nodes in the Controller Cluster (e.g. In a three-node Controller Cluster, the Controller Cluster majority leader should show two open connections).
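A small, optional sketch to spot the majority leader from a management host: it simply tests which node accepts connections on TCP/2878. The node IPs follow this lab's numbering (the .203 address is assumed), and intermediate firewalls may block the port, so treat a negative result with care:

# Probe TCP/2878 on each controller; only the cluster majority leader listens on it.
import socket

CONTROLLERS = ["192.168.110.201", "192.168.110.202", "192.168.110.203"]  # .203 assumed

for node in CONTROLLERS:
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(3)
    try:
        sock.connect((node, 2878))
        print("%s is listening on 2878 (likely the cluster majority leader)" % node)
    except socket.error:
        print("%s is not accepting connections on 2878" % node)
    finally:
        sock.close()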

The command show control-cluster history will allow you to see a history of Controller Cluster-related events on this node including restarts, upgrades, Controller Cluster errors and loss of majority.

controller # show control-cluster history

8

Joining a Controller Node to Controller Cluster

This section covers issues that may be encountered when attempting to join a new Controller Node to an existing Controller Cluster. An explanation of why the issue occurs and instructions on how to resolve the issue are also provided.

Symptom: Joining a new Controller node to a Controller Cluster may fail when all of the existing Controllers are disconnected.

Example of this situation:

As we can see, controller-1 and controller-2 are disconnected from the NSX Manager:

5

When we try to add a new controller node we get this error message:

10

Explanation:

If n nodes have joined the NSX Controller Cluster, then a majority (strictly greater than 50%) of those n nodes must be alive and connected to each other before any new data can be written to the system. This means that if you have a Controller Cluster of 3 nodes, 2 of them must be alive and connected in order for new data to be written in NSX.

In our case, to add a new controller node to the cluster we need at least one member of the existing cluster to be in “Normal” state.

17

 

Resolution: Start the Disconnected Controller. If the Controller is disconnected due to a permanent failure, remove the Controller from the Controller Cluster.

Symptom: the join control-cluster CLI command hangs without ever completing the join operation.

Explanation:

The IP address passed into the join control-cluster command was incorrect, and/or does not refer to a currently live Controller node.

For example, the user typed the command:

join control-cluster 192.168.110.201

Make sure that 192.168.110.201 is part of the existing controller cluster.

Resolution:

Use the IP address of a properly configured Controller that is reachable across the network.

Symptom:

The join control-cluster CLI command fails.

Explanation: If you have a Controller configured as part of a Controller Cluster, that Controller has been disconnected from the Controller Cluster for a long period of time (perhaps it was taken offline or shut down), and during that time, the other Controllers in that Controller Cluster were removed from the Controller Cluster and formed into a new Controller Cluster, then the long-disconnected Controller will not be allowed to rejoin the Controller Cluster that it left, because that original Controller Cluster is gone.

The following event log message in the new Controller Cluster indicates that something like this has happened:

Node b567a47f-9a61-43b3-8d53-36b3d1fd0675 tried to join with incorrect cluster ID

Resolution:

You must issue the join control-cluster command with the force option on the old Controller to force it to clear its state and join the new Controller Cluster with a fresh start.

Note: The forced join command deletes the previously joined node with the same IP.

nvp-controller # join control-cluster 192.168.110.201 force

18

Recovering a node disconnected from the cluster

When a controller cluster majority issue arises, it can be very difficult to spot from the NSX Manager GUI.

For example, the current state of the controllers from the NSX Manager point of view is that all members are in “Normal” state.

11

But in fact the current status of my cluster is:

12

Node 1 and Node 2 form the cluster and share the roles between them; for some reason Node 3 is disconnected from the cluster majority:

Output example from controller Node 3:

13

 

Node 3 thinks it is alone and owns all of the roles.

From Node 1's perspective it is the leader (it has the Y in the listening column) and it has one open connection from Node 2, as shown:

14

 

To recover from this scenario, Node 3 needs to join the majority side of the cluster; the IP address to join must be Node 1's, because it is the leader of the majority.

join control-cluster 192.168.110.201 force

Recovering from the loss of all Controller Nodes

In this scenario all NSX Controller nodes have failed or been deleted. Do we need to start from scratch? 😦

The assumption is that our environment already has NSX Edge and DLR deployed, and we have logical switches connected to VMs that we would like to preserve.

The recovery process:

 Step 1:

Migrate the existing logical switches to Multicast mode.

15

Step 2:

Deploy 3 new NSX controllers.

Step 3:

Sync the newly deployed NSX controllers with the current state of our NSX environment by switching the logical switches back to Unicast mode.

16

other useful commands:

Checking Controller Processes

Even if the “join-cluster” command on a node appears to have been successful, the node might not have come up completely for a variety of reasons. The way this error tends to manifest itself most visibly is that the controller process isn’t listening on all the ports it’s supposed to be, and no API requests or switch connections are happening.

# show network connections of-type tcp

Active Internet connections (servers and established)

Proto Recv-Q Send-Q Local Address      Foreign Address     State       PID/Program

tcp        0      0 172.29.1.20:6633   0.0.0.0:*           LISTEN      14038/domain

tcp        0      0 172.29.1.20:7000   0.0.0.0:*           LISTEN      14072/java

tcp        0      0 0.0.0.0:443        0.0.0.0:*           LISTEN      14067/domain

tcp        0      0 172.29.1.20:7777   0.0.0.0:*           LISTEN      14038/domain

tcp        0      0 172.29.1.20:6632   0.0.0.0:*           LISTEN      14038/domain

tcp        0      0 172.29.1.20:9160   0.0.0.0:*           LISTEN      14072/java

tcp        0      0 172.29.1.20:2888   0.0.0.0:*           LISTEN      14072/java

tcp        0      0 172.29.1.20:2888   172.29.1.20:55622   ESTABLISHED 14072/java

tcp        0      0 172.29.1.20:9160   172.29.1.20:52567   ESTABLISHED 14072/java

tcp        0      0 172.29.1.20:52566  172.29.1.20:9160    ESTABLISHED 14038/domain

tcp        0      0 172.29.1.20:443    172.17.21.9:46438   ESTABLISHED 14067/domain

 

The show network connection output shown in the preceding block is an example from a healthy Controller. If you find some of these missing, it’s likely that NSX didn’t get past its install phase.  Here are some misconfigurations that can cause this:

Bad management address or listen IP

You’ve set an incorrect IP as the management-address, or as the listen-ip for one of the roles (like switch_manager or api_provider).

NSX attempts to bind to the specified address, and fails early if it cannot do so.  You’ll see log messages in cloudnet_cpp.log.ERROR like:

E0506 01:20:17.099596  7188 dso-deployer.cc:516] Controller component installation of rpc-broker failed: Unable to bind a RPC port $tags:tracing:3ef7d1f519ffb7fb^

E0506 01:20:17.100162  7188 main.cc:271] RPC deployment subsystem not installed; exiting. $tags:tracing:3ef7d1f519ffb7fb^

Or in cloudnet_cpp.log.WARNING:

W0506 01:22:27.721777  7694 ssl-socket.cc:530] SSLSocket failed to bind to 172.1.1.1:6632: Cannot assign requested address

Note that if you are using DHCP for the IP addresses of your controller nodes (not recommended or supported), the IP address could have changed since the last time you configured it.

Verify that the IP addresses for switch_manager and api_provider are what they are supposed to be by performing the CLI command:

<switch_manager|api_provider>  listen-ip

 

Bad first node address

You've provided the wrong IP address for the first node in the Controller Cluster. Run

show control-cluster startup-nodes

to determine whether the IPs listed correspond to the IPs of the Controllers in the Controller Cluster.

 

Out of disk space

The Controller may be out of disk space. Use the "show status" command to see if any of the partitions have 0 bytes available.

The NSX CLI command show system statistics can be used to display resource utilization for disk space, disk I/O, memory, CPU and various other processes on the Controller Nodes. The command offers statistics with one-minute intervals for a window of one hour for various combinations. The show system statistics CLI command does auto-completion and can be used to view the list of metric data available.

show system statistics <datasource>       : for the tabular output
show system statistics graph <datasource> : for the graphical format output

 

As an example, the following output shows the RRD statistics for the datasource disk_ops:write associated with the disk sda1 on the Controller in a tabular form:

# show system statistics disk-sda1/disk_ops:write

Time  Write

12:29             0.74

12:28         0.731429

12:27         0.617143

12:26         0.665714  <snip>

 

more commands:

# show network interface
# show network default-gateway
# show network dns-servers
# show network ntp-servers
# show network ntp-status
# traceroute <ip_address or dns_name>
# ping <ip address>
# ping interface addr <alternate_src_ip> <ip_address>
# watch network interface breth0 traffic
 

NSX Load Balancing

This overview of load balancing is taken from the great work of Max Ardica and Nimish Desai in the official NSX Design Guide:

Overview

Load Balancing is another network service available within NSX that can be natively enabled on the NSX Edge device. The two main drivers for deploying a load balancer are scaling out an application (through distribution of workload across multiple servers), as well as improving its high-availability characteristics

NSX Load Balancing


The NSX load balancing service is specifically designed for cloud environments with the following characteristics:

  • Fully programmable via API
  • Same single central point of management/monitoring as other NSX network services

The load balancing services natively offered by the NSX Edge satisfy the needs of the majority of application deployments. This is because the NSX Edge provides a large set of functionalities:

  • Supports any TCP application, including, but not limited to, LDAP, FTP, HTTP, HTTPS
  • Supports UDP applications starting from NSX software release 6.1.
  • Multiple load balancing distribution algorithms available: round-robin, least connections, source IP hash, URI
  • Multiple health checks: TCP, HTTP, HTTPS including content inspection
  • Persistence: Source IP, MSRDP, cookie, ssl session-id
  • Connection throttling: max connections and connections/sec
  • L7 manipulation, including, but not limited to, URL block, URL rewrite, content rewrite
  • Optimization through support of SSL offload

Note: the NSX platform can also integrate load-balancing services offered by 3rd party vendors. This integration is out of the scope for this paper.
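As a small illustration of the "fully programmable via API" point, the sketch below reads an Edge's load balancer configuration with the requests library. The URI and edge ID are assumptions based on the NSX-v API guide; confirm both against the documentation for your release:

# Read an Edge's load balancer configuration over the NSX Manager REST API.
import requests

NSX_MGR = "https://192.168.110.42"       # example NSX Manager from this lab
EDGE_ID = "edge-2"                       # hypothetical ID of the one-arm LB Edge

resp = requests.get("%s/api/4.0/edges/%s/loadbalancer/config" % (NSX_MGR, EDGE_ID),
                    auth=("admin", "VMware1!"), verify=False)
resp.raise_for_status()
print(resp.text)                         # XML describing VIPs, pools and monitors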

In terms of deployment, the NSX Edge offers support for two types of models:

  • One-arm mode (called proxy mode): this scenario is highlighted in the figure below and consists of deploying an NSX Edge directly connected to the logical network for which it provides load-balancing services.
One-Arm Mode Load Balancing Services


The one-arm load balancer functionality works as follows:

  1. The external client sends traffic to the Virtual IP address (VIP) exposed by the load balancer.
  2. The load balancer performs two address translations on the original packets received from the client: Destination NAT (D-NAT) to replace the VIP with the IP address of one of the servers deployed in the server farm and Source NAT (S-NAT) to replace the client IP address with the IP address identifying the load-balancer itself. S-NAT is required to force through the LB the return traffic from the server farm to the client.
  3. The server in the server farm replies by sending the traffic to the LB (because of the S-NAT function previously discussed).

The LB performs again a Source and Destination NAT service to send traffic to the external client leveraging its VIP as source IP address.

The advantage of this model is that it is simpler to deploy and flexible as it allows deploying LB services (NSX Edge appliances) directly on the logical segments where they are needed without requiring any modification on the centralized NSX Edge providing routing communication to the physical network. On the downside, this option requires provisioning more NSX Edge instances and mandates the deployment of Source NAT that does not allow the servers in the DC to have visibility into the original client IP address.

Note: the LB can insert the original IP address of the client into the HTTP header before performing S-NAT (a function named “Insert X-Forwarded-For HTTP header”). This provides the servers visibility into the client IP address but it is obviously limited to HTTP traffic.

Inline mode (called transparent mode) requires instead deploying the NSX Edge inline to the traffic destined to the server farm. The way this works is shown in Figure below.

Two-Arms Mode Load Balancing Services


    1. The external client sends traffic to the Virtual IP address (VIP) exposed by the load balancer.
    2. The load balancer (centralized NSX Edge) performs only Destination NAT (D-NAT) to replace the VIP with the IP address of one of the servers deployed in the server farm.
    3. The server in the server farm replies to the original client IP address and the traffic is received again by the LB since it is deployed inline (and usually as the default gateway for the server farm).
    4. The LB performs Source NAT to send traffic to the external client leveraging its VIP as source IP address.

    This deployment model is also quite simple and allows the servers to have full visibility into the original client IP address. At the same time, it is less flexible from a design perspective as it usually forces using the LB as default gateway for the logical segments where the server farms are deployed and this implies that only centralized (and not distributed) routing must be adopted for those segments. It is also important to notice that in this case LB is another logical service added to the NSX Edge already providing routing services between the logical and the physical networks. As a consequence, it is recommended to increase the form factor of the NSX Edge to X-Large before enabling load-balancing services.

     

    In terms of scalability and throughput figures, the NSX load balancing services offered by each single NSX Edge can scale up to (best case scenario):

    • Throughput: 9 Gbps
    • Concurrent connections: 1 million
    • New connections per sec: 131k

     

Below are some deployment examples of tenants with different applications and different load balancing needs. Notice how each of these applications is hosted on the same cloud with the network services offered by NSX.

Deployment Examples of NSX Load Balancing


Two final important points to highlight:

  • The load balancing service can be fully distributed, with a dedicated load balancer per tenant. This brings multiple benefits:
  • Each tenant has its own load balancer.
  • Each tenant configuration change does not impact other tenants.
  • A load increase on one tenant's load balancer does not impact the scale of other tenants' load balancers.
  • Each tenant load balancing service can scale up to the limits mentioned above.

  • Other network services are still fully available: the same tenant can mix its load balancing service with other network services such as routing, firewalling, and VPN.

 

One Arm Load Balance Lab Topology

We have a 3-tier application built from:

Web servers: web-sv-01a 172.16.10.11 , web-sv-02a 172.16.10.12

App: app-sv-01a 172.16.20.11

DB: db-sv-01a 172.16.30.11

We will add an NSX Edge Services Gateway (ESG) to this lab for the load balancer function.

The ESG (red line) sits on the Web VXLAN with the IP address 172.16.10.10.

One-Armed Lab topology

 

Configure One Arm Load Balance

Create NSX Edge gateway:

One-Arem-1

Select Edge Service Gateway (ESG):
One-Arem-2

Set the Admin password, enable SSH and Auto rule:

One-Arem-3

Install the ESG in Management Cluster:

One-Arem-4

In our lab the appliance size is Compact, but we should pick the right size according to the amount of traffic expected:

One-Arem-5

Configure the interface and IP address; since this is one-arm mode we have only one interface:

One-Arem-6

Create default gateway

One-Arem-8

Configure default accept fw rule:

One-Arem-9

Complete the installation:

One-Arem-10

Verify ESG is deployed:

One-Arem-11

Enable the load balancer on the ESG: go to Load Balancer and click Edit:

One-Arem-12

Check the “Enable Load Balancer” checkbox:

One-Arem-13

Create the application profile:

One-Arem-14

Give it a name, select HTTPS as the Type, and enable SSL Passthrough:

One-Arem-15

Create the pool:

One-Arem-16

For the Algorithm select ROUND-ROBIN, keep the default HTTPS monitor, and add the two server members:

One-Arem-16h

To add members click on the + icon; the port we monitor is 443:

One-Arem-17

We need to create the VIP:

One-Arem-18

In this step we glue all the configuration parts together, tying the application profile to the pool and giving it a virtual IP address:

One-Arem-19

Now we can check from a client web browser that the load balancer actually works.

In the web browser we open the VIP address 172.16.10.10.

The result is a hit on 172.16.10.11 (web-sv-01a):

One-Arem-verification-1

When we refresh the browser we see that we hit 172.16.10.12 (web-sv-02a):

One-Arem-verification-2
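The same verification can be scripted instead of refreshing a browser. This sketch assumes the lab web servers include their name in the returned page; adjust the matching for your own application:

# Hit the VIP a few times and print which pool member answered each request.
import requests
import urllib3

urllib3.disable_warnings()               # the lab uses a self-signed certificate
VIP = "https://172.16.10.10"

for i in range(4):
    body = requests.get(VIP, verify=False).text
    for server in ("web-sv-01a", "web-sv-02a"):
        if server in body:
            print("Request %d answered by %s" % (i + 1, server))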

Troubleshooting One Arm Load Balance

General Loadbalancer troubleshooting workflow

Review the configuration through UI

Check the pool member status through UI

Do online troubleshooting via CLI:

  • Check LB engine status (L4/L7)
  • Check LB objects statistics (vips, pools, members)
  • Check Service Monitor status (OK, WARNING, CRITICAL)
  • Check system log message (# show log)
  • Check LB L4/L7 session table
  • Check LB L7 sticky-table status

 

Check the configuration through UI

 

 

One-Arem-TSHOT-1

 

  1. Check the pool member status through UI:

 

One-Arem-TSHOT-2

Possible errors discovered:

  1. Ports 80/443 might already be used by other services (e.g. SSL VPN).
  2. The member port and monitor port are misconfigured, so the health check fails.
  3. A member in WARNING state should be treated as DOWN.
  4. L4 LB is used when:
    a) the protocol is TCP/HTTP;
    b) there are no persistence settings and no L7 settings;
    c) accelerateEnable is true.
  5. The pool is in transparent mode but the Edge doesn't sit in the return path.

Do online troubleshooting via CLI:

Check LB engine status (L4/L7)

# show service loadbalancer

Check LB objects statistics (vips, pools, members)

# show service loadbalancer virtual [vip-name]

# show service loadbalancer pool [pool-name]

Check Service Monitor status (OK, WARNING, CRITICAL)

# show service loadbalancer monitor

Check system log message

# show log

Check LB session table

# show service loadbalancer session

Check LB L7 sticky-table status

# show service loadbalancer table

 

 

One-Arm-LB-0> show service loadbalancer
<cr>
error Show loadbalancer Latest Errors information.
monitor Show loadbalancer HealthMonitor information.
pool Show loadbalancer pool information.
session Show loadbalancer Session information.
table Show loadbalancer Sticky-Table information.
virtual Show loadbalancer virtualserver information.

#########################################################

One-Arm-LB-0> show service loadbalancer
———————————————————————–
Loadbalancer Services Status:

L7 Loadbalancer : running
Health Monitor : running

#########################################################

One-Arm-LB-0> show service loadbalancer monitor
———————————————————————–
Loadbalancer HealthMonitor Statistics:

POOL                               MEMBER                                  HEALTH STATUS
Web-Servers-Pool-01  web-sv-02a_172.16.10.12   default_https_monitor:OK
Web-Servers-Pool-01  web-sv-01a_172.16.10.11   default_https_monitor:OK
One-Arm-LB-0>

##########################################################

One-Arm-LB-0> show service loadbalancer virtual
———————————————————————–
Loadbalancer VirtualServer Statistics:

VIRTUAL Web-Servers-VIP
| ADDRESS [172.16.10.10]:443
| SESSION (cur, max, total) = (0, 3, 35)
| RATE (cur, max, limit) = (0, 6, 0)
| BYTES in = (17483), out = (73029)
+->POOL Web-Servers-Pool-01
| LB METHOD round-robin
| LB PROTOCOL L7
| Transparent disabled
| SESSION (cur, max, total) = (0, 3, 35)
| BYTES in = (17483), out = (73029)
+->POOL MEMBER: Web-Servers-Pool-01/web-sv-01a_172.16.10.11, STATUS: UP
| | STATUS = UP, MONITOR STATUS = default_https_monitor:OK
| | SESSION (cur, max, total) = (0, 2, 8)
| | BYTES in = (8882), out = (43709)
+->POOL MEMBER: Web-Servers-Pool-01/web-sv-02a_172.16.10.12, STATUS: UP
| | STATUS = UP, MONITOR STATUS = default_https_monitor:OK
| | SESSION (cur, max, total) = (0, 1, 7)
| | BYTES in = (7233), out = (29320)

####################################################################
One-Arm-LB-0> show service loadbalancer pool
———————————————————————–
Loadbalancer Pool Statistics:

POOL Web-Servers-Pool-01
| LB METHOD round-robin
| LB PROTOCOL L7
| Transparent disabled
| SESSION (cur, max, total) = (0, 3, 35)
| BYTES in = (17483), out = (73029)
+->POOL MEMBER: Web-Servers-Pool-01/web-sv-01a_172.16.10.11, STATUS: UP
| | STATUS = UP, MONITOR STATUS = default_https_monitor:OK
| | SESSION (cur, max, total) = (0, 2, 8)
| | BYTES in = (8882), out = (43709)
+->POOL MEMBER: Web-Servers-Pool-01/web-sv-02a_172.16.10.12, STATUS: UP
| | STATUS = UP, MONITOR STATUS = default_https_monitor:OK
| | SESSION (cur, max, total) = (0, 1, 7)
| | BYTES in = (7233), out = (29320)

##########################################################################

One-Arm-LB-0> show service loadbalancer session
———————————————————————–
L7 Loadbalancer Current Sessions:

0x5fe50a2b230: proto=tcpv4 src=192.168.110.10:49392 fe=Web-Servers-VIP be=Web-Servers-Pool-01 srv=web-sv-01a_172.16.10.11 ts=08 age=8s calls=3 rq[f=808202h,i=0,an=00h,rx=4m53s,wx=,ax=] rp[f=008202h,i=0,an=00h,rx=4m53s,wx=,ax=] s0=[7,8h,fd=13,ex=] s1=[7,8h,fd=14,ex=] exp=4m52s
0x5fe50a22960: proto=unix_stream src=unix:1 fe=GLOBAL be=<NONE> srv=<none> ts=09 age=0s calls=2 rq[f=c08200h,i=0,an=00h,rx=20s,wx=,ax=] rp[f=008002h,i=0,an=00h,rx=,wx=,ax=] s0=[7,8h,fd=1,ex=] s1=[7,0h,fd=-1,ex=] exp=20s
———————————————————————–

 

Disconnect web-sv-01a_172.16.10.11 from the network

 

 

One-Arem-TSHOT-3

From the GUI we can see the effect on the pool member status:

One-Arem-TSHOT-4

 

One-Arm-LB-0> show service loadbalancer virtual
———————————————————————–
Loadbalancer VirtualServer Statistics:

VIRTUAL Web-Servers-VIP
| ADDRESS [172.16.10.10]:443
| SESSION (cur, max, total) = (0, 3, 35)
| RATE (cur, max, limit) = (0, 6, 0)
| BYTES in = (17483), out = (73029)
+->POOL Web-Servers-Pool-01
| LB METHOD round-robin
| LB PROTOCOL L7
| Transparent disabled
| SESSION (cur, max, total) = (0, 3, 35)
| BYTES in = (17483), out = (73029)
+->POOL MEMBER: Web-Servers-Pool-01/web-sv-01a_172.16.10.11, STATUS: DOWN
| | STATUS = DOWN, MONITOR STATUS = default_https_monitor:CRITICAL
| | SESSION (cur, max, total) = (0, 2, 8)
| | BYTES in = (8882), out = (43709)
+->POOL MEMBER: Web-Servers-Pool-01/web-sv-02a_172.16.10.12, STATUS: UP
| | STATUS = UP, MONITOR STATUS = default_https_monitor:OK
| | SESSION (cur, max, total) = (0, 1, 7)
| | BYTES in = (7233), out = (29320)


NSX L2 Bridging

Overview

This overview of L2 bridging is taken from the great work of Max Ardica and Nimish Desai in the official NSX Design Guide:

There are several circumstances where it may be required to establish L2 communication between virtual and physical workloads. Some typical scenarios are (not exhaustive list):

  • Deployment of multi-tier applications: in some cases, the Web, Application and Database tiers can be deployed as part of the same IP subnet. Web and Application tiers are typically leveraging virtual workloads, but that is not the case for the Database tier where bare-metal servers are commonly deployed. As a consequence, it may then be required to establish intra-subnet (intra-L2 domain) communication between the Application and the Database tiers.
  • Physical to virtual (P-to-V) migration: many customers are virtualizing applications running on bare metal servers and during this P-to-V migration it is required to support a mix of virtual and physical nodes on the same IP subnet.
  • Leveraging external physical devices as default gateway: in such scenarios, a physical network device may be deployed to function as default gateway for the virtual workloads connected to a logical switch and a L2 gateway function is required to establish connectivity to that gateway.
  • Deployment of physical appliances (firewalls, load balancers, etc.).

To fulfill the specific requirements listed above, it is possible to deploy devices performing a “bridging” functionality that enables communication between the “virtual world” (logical switches) and the “physical world” (non virtualized workloads and network devices connected to traditional VLANs).

NSX offers this functionality in software through the deployment of NSX L2 Bridging allowing VMs to be connected at layer 2 to a physical network (VXLAN to VLAN ID mapping), even if the hypervisor running the VM is not physically connected to that L2 physical network.

L2 Bridge topology

 

The figure above shows an example of L2 bridging, where a VM connected in logical space to the VXLAN segment 5001 needs to communicate with a physical device deployed in the same IP subnet but connected to a physical network infrastructure (in VLAN 100). In the current NSX-v implementation, the VXLAN-VLAN bridging configuration is part of the distributed router configuration; the specific ESXi host performing the L2 bridging functionality is hence the one where the Control VM for that distributed router is running. In case of failure of that ESXi host, the ESXi host running the standby Control VM (which gets activated once it detects the failure of the active one) takes over the L2 bridging function.

Independently from the specific implementation details, below are some important deployment considerations for the NSX L2 bridging functionality:

  • The VXLAN-VLAN mapping is always performed in 1:1 fashion. This means traffic for a given VXLAN can only be bridged to a specific VLAN, and vice versa.
  • A given bridge instance (for a specific VXLAN-VLAN pair) is always active only on a specific ESXi host.
  • However, through configuration it is possible to create multiple bridges instances (for different VXLAN-VLAN pairs) and ensure they are spread across separate ESXi hosts. This improves the overall scalability of the L2 bridging function.
  • The NSX Layer 2 bridging data path is entirely performed in the ESXi kernel, and not in user space. Once again, the Control VM is only used to determine the ESXi host where a given bridging instance is active, and not to perform the bridging function.

 

 

Configure L2 Bridge

In this scenario we would like to bridge between an App VM connected to VXLAN 5002 and a virtual machine connected to VLAN 100.

Create Bridge 1

My current Logical Switch configuration:

Logical Switch table

We have a pre-configured port group in VLAN 100:

Port group

Bridging is configured on the DLR; in my lab the DLR name is Distributed-Router.

Double-click on edge-1:

DLR1

 

Click on Bridging and then the green + button:

DLR2

Type the bridge name and select the logical switch and the port group:

DLR3

 

Click OK and Publish:

DLR4

 

Now a VM on logical switch App-Tier-01 can communicate with a physical or virtual machine on VLAN 100.

 

Design Consideration

Currently in NSX-V 6.1 we can't enable routing on a VXLAN logical switch that is bridged to a VLAN.

In other words, the default gateway for the VLAN can't be configured on the logical router:

Non-working L2 Bridge Topology

So how can a VM in VXLAN 5002 communicate with VXLAN 5001?

The big difference is that VXLAN 5002 is no longer connected to a DLR LIF; it is connected to an NSX Edge.

Working Bridge Topology

Redundancy

The DLR Control VM can work in high-availability mode. If the active DLR Control VM fails, the standby Control VM takes over, which means the bridge instance will move to a new ESXi host.

HA

 

Troubleshooting:

We need to know where the active DLR Control VM is located (if we have HA configured). Inside that ESXi host the bridging happens in kernel space.

The easy way to find it is to look at the "Configuration" tab under Manage:

DLR5

We can see that the Control VM is located on esx-02a.corp.local.

SSH to this ESXi host and find the VDR name of the DLR instance:

~ # net-vdr -I -l

VDR Instance Information :
—————————

Vdr Name: default+edge-1
Vdr Id: 1460487509
Number of Lifs: 4
Number of Routes: 5
State: Enabled
Controller IP: 192.168.110.201
Control Plane IP: 192.168.110.52
Control Plane Active: Yes
Num unique nexthops: 1
Generation Number: 0
Edge Active: Yes

Now we know that “default+edge-1” is the VDR name.

 

net-vdr -b --mac default+edge-1

###################################################################################################

~ # net-vdr -b --mac default+edge-1

VDR ‘default+edge-1’ bridge ‘Bridge_App_VLAN100’ mac address tables :
Network ‘vxlan-5002-type-bridging’ MAC address table:
total number of MAC addresses: 0
number of MAC addresses returned: 0
Destination Address Address Type VLAN ID VXLAN ID Destination Port Age
——————- ———— ——- ——– —————- —
Network ‘vlan-100-type-bridging’ MAC address table:
total number of MAC addresses: 0
number of MAC addresses returned: 0
Destination Address Address Type VLAN ID VXLAN ID Destination Port Age
——————- ———— ——- ——– —————- —

###################################################################################################

From this output we can see that there is no MAC address learning yet.

Now connect a VM to logical switch App-Tier-01 and ping a VM in VLAN 100.

We can now see MAC addresses from both VXLAN 5002 and VLAN 100:

net-vdr

 


NSX Role Based Access Control

One of the most challenging problems in managing large networks is the complexity of security administration.

“Role-based access control (RBAC) is a method of regulating access to computer or network resources based on the roles of individual users within an enterprise. In this context, access is the ability of an individual user to perform a specific task, such as view, create, or modify a file. Roles are defined according to job competency, authority, and responsibility within the enterprise”

Within NSX we have four built-in roles, and we can map a user or group to one of them. Instead of assigning roles to individual users, I think the preferred way is to assign roles to groups.

Organizations create user groups for proper user management. After integration with SSO, NSX Manager can get the details of the groups to which a user belongs.

NSX Roles

Within NSX Manager we have four pre-built RBAC roles covering different permissions and areas of the NSX environment.

The four NSX built-in roles are: Auditor, Security Administrator, NSX Administrator and Enterprise Administrator:

NSX RBAC Diagram


Configure the Lookup Service in NSX Manager

Whenever we want to assign a role in NSX, we can assign it to an SSO user or group. When the Lookup Service is not configured, group-based role assignment does not work, i.e. a user from that group will not be able to log in to NSX.

The reason is that we cannot fetch any group information from the SSO server. The group-based authentication provider is only available when the Lookup Service is configured. Logins by users that are explicitly assigned a role in NSX are not affected. This means the customer has to assign roles to users individually and cannot take advantage of SSO groups.

For NSX, the vCenter SSO server is one of the identity providers for authentication. The prerequisite for authentication in NSX is that the user or group has been assigned a role in NSX.

NSX Manager Lookup Service


Note: NTP and DNS must be configured on the NSX Manager for the Lookup Service to work.

Configure Active Directory Groups

In this blog i will use Microsoft Active directory  as user Identity source.  in “Active Directory Users and Computers” i created four different groups. The groups will have the same name is the NSX roles to make life easier:

Auditor, Security Administrator, NSX Administrator, Enterprise Administrator.

AD Groups


We create four AD users and add each user to a different AD group. For example, the nsxadmin user:

the user nsxadmin is associated with the group NSX Administrator; the association is done with the Add button:

AD user


In the same way I associate the other users with AD groups:

username:     groups:

auditor1      ->  Auditor

secadmin ->   Security Administrator

nsxadmin ->  NSX Administrator

entadmin ->  Enterprise Administrator

Connect Directory Domain to NSX Manager.

Go to the “Network & Security” tab and double-click on “NSX Manager”:

map ad to nsx manager role 1


Double click on “192.168.110.42” icon:

map ad to nsx manager role 2

Go to Manage -> Domains and click on the green plus button:

map ad to nsx manager role 8

Fill in the Name and NetBIOS name fields with your domain name and NetBIOS name.

In my example the domain name is corp.local:

map ad to nsx manager role 9

Enter LDAP (i.e AD) IP address or hostname and domain account (username and password):

map ad to nsx manager role 10

Click Next. NSX Manager will try to connect to the LDAP (i.e. AD) server using the above information. If the connection is successful, the screen below will appear.

This configuration allows NSX Manager to read the Active Directory Security Event Log; these logs contain the domain logon/logoff events of Active Directory users.

NSX uses this information for identity-based firewall rules.

map ad to nsx manager role 11

Click Next and Finish:

map ad to nsx manager role 12

Mapping Active Directory  Groups to NSX Managers Roles

Now we can map Active Directory groups to pre-built NSX Manager roles.

Go to Manage -> Users and click on the green plus button:

map ad to nsx manager role 3

Here we can select whether we want to map a specific AD user or an AD group to an NSX role.

map ad to nsx manager role 4

In this blog I will use an AD group; we created an AD group called Auditor. The format to enter here is:

“group_name”@domain.name. Let's start with the Auditor group, which maps to read-only permissions:

map ad to nsx manager role 5

Select one of the NSX roles; for the Auditor AD group we choose Auditor:

map ad to nsx manager role 6

We can limit the scope of objects this group can work on inside NSX Manager; in this example there is no limit:

map ad to nsx manager role 7

In the same way, map all the other AD groups to NSX roles:

Auditor@corp.local                           – >  Auditor

Security Administrator@corp.local        -> Security Administrator

NSX Administrator@corp.local               -> NSX Administrator

Enterprise Administrator@corp.local     -> Enterprise Administrator

Try our first login with user Auditor1:

Login1

The login is successful, but where has the “Network & Security” tab gone?

Login2

So far we have configured the NSX Manager side, but we haven't taken care of the vCenter permissions for that group. Confused?

vCenter has its own role for each group. We need to configure a vCenter role for each AD group we configured; these settings determine what the user can do in the vCenter environment.

Configure vCenter Roles:

Let's start by configuring the Auditor role for the Auditor AD group. We know this group is "Read Only" in NSX Manager, so it makes sense to give this group read-only access to the rest of the vCenter environment as well.

Go to vCenter -> Manage -> Permissions and click the green button:

vCenter Roles 1

We need to choose a role from the Assigned Role list; if we select No Access we will not be able to log in to vCenter at all, so we need to choose something between "Read-Only" and "Administrator".

For the Auditor role, "Read-Only" is the minimum.

Select "Read-Only" from the Assigned Role drop-down list and click the "Add" button under "Users and Groups":

vCenter Roles 2

From the Domain Select your Domain name, in our lab the domain is “CORP”, choose your Active Directory group from the list (Auditor for this example) and click the “Add” button:

vCenter Roles 3

Click Ok and Ok for Next Step:

vCenter Roles 4

In the same way we need to configure roles for all the other groups:

vCenter Roles 5

Now we can try to login with auditor1 user:

auditor1

As we can see auditor1 is in “Read Only” role:

auditor2

We can  verify that auditor1 can’t change any other vCenter configuration:

auditor3

Test the secadmin user, mapped to the Security Administrator role: this user cannot perform NSX infrastructure-related tasks, such as adding a new NSX Controller node:

secadmin1

But secadmin can create new firewall rule:

secadmin2

When logging in with the nsxadmin user, mapped to the NSX Administrator role, we can see that the user can add a new Controller node:

nsxadmin1

But the nsxadmin user cannot see or change any configured firewall rules:

nsxadmin2

What if a user is a member of two AD groups?

The user gains the combined permissions of both groups.

For example, if the user is a member of both the "Auditor" group and the "Security Administrator" group, the result is that the user has read-only access to all NSX infrastructure and also gains access to all security-related areas in NSX.

Summary

In this post we demonstrated the different NSX Manager roles and configured Microsoft Active Directory as an external identity source for users.


VMware NSX Edge Scale Out with Equal-Cost Multi-Path Routing

This post was written by Roie Ben Haim and Max Ardica, with a special thanks to Jerome Catrouillet, Michael Haines, Tiran Efrat and Ofir Nissim for their valuable input

The modern data center design is changing, following a shift in the habits of consumers using mobile devices, the number of new applications that appear every day and the rate of end-user browsing which has grown exponentially. Planning a new data center requires meeting certain fundamental design guidelines. The principal goals in data center design are: Scalability, Redundancy and High-bandwidth.

In this blog we will describe the Equal Cost Multi-Path functionality (ECMP) introduced in VMware NSX release 6.1 and discuss how it addresses the requirements of scalability, redundancy and high bandwidth. ECMP has the potential to offer substantial increases in bandwidth by load-balancing traffic over multiple paths as well as providing fault tolerance for failed paths. This is a feature which is available on physical networks but we are now introducing this capability for virtual networking as well. ECMP uses a dynamic routing protocol to learn the next-hop towards a final destination and to converge in case of failures. For a great demo of how this works, you can start by watching this video, which walks you through these capabilities in VMware NSX.

 

https://www.youtube.com/watch?v=Tz7SQL3VA6c

 

Scalability and Redundancy and ECMP

To keep pace with the growing demand for bandwidth, the data center must meet scale out requirements, which provide the capability for a business or technology to accept increased volume without redesign of the overall infrastructure. The ultimate goal is avoiding the “rip and replace” of the existing physical infrastructure in order to keep up with the growing demands of the applications. Data centers running business critical applications need to achieve near 100 percent uptime. In order to achieve this goal, we need the ability to quickly recover from failures affecting the main core components. Recovery from catastrophic events needs to be transparent to end user experiences.

ECMP with VMware NSX 6.1 allows you to use up to a maximum of 8 ECMP paths simultaneously. In a specific VMware NSX deployment, those scalability and resilience improvements are applied to the “on-ramp/off-ramp” routing function offered by the Edge Services Gateway (ESG) functional component, which allows communication between the logical networks and the external physical infrastructure.

ECMP Topology

 

External user traffic arriving from the physical core routers can use up to 8 different paths (E1-E8) to reach the virtual servers (Web, App, DB).

In the same way, traffic returning from the virtual servers hits the Distributed Logical Router (DLR), which can choose up to 8 different paths to get to the core network.

How the Path is Determined

NSX for vSphere Edge Services Gateway device:

When a new traffic flow needs to be routed, a round-robin algorithm picks one of the links as the path for all traffic belonging to that flow. Sending every packet of the flow through the same path keeps the packets in order. Once the next hop has been selected for a particular source IP and destination IP pair, it is stored in the route cache, and all subsequent packets of that flow follow the same path.

There is a default IPv4 route cache timeout of 300 seconds. If an entry is inactive for this period of time, it becomes eligible to be removed from the route cache. Note that these settings can be tuned for your environment.
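
As a conceptual illustration only (not the actual ESG code), the Python sketch below mimics per-flow round-robin path selection with a route cache and an inactivity timeout. The 300-second default comes from the text above; the path names and everything else are assumptions for the example:

# Conceptual sketch of per-flow round-robin ECMP with a route cache.
# Not the real Edge implementation; paths and timing are simplified.
import itertools
import time

PATHS = ["E1", "E2", "E3", "E4"]      # assumed next-hops
ROUTE_CACHE_TIMEOUT = 300             # seconds, default mentioned above

_next_path = itertools.cycle(PATHS)   # round-robin iterator
_route_cache = {}                     # (src_ip, dst_ip) -> (path, last_used)

def select_path(src_ip, dst_ip):
    """Return the same path for every packet of a (src, dst) flow."""
    now = time.time()
    key = (src_ip, dst_ip)
    entry = _route_cache.get(key)
    if entry and now - entry[1] < ROUTE_CACHE_TIMEOUT:
        path = entry[0]               # cache hit: keep the flow on its path
    else:
        path = next(_next_path)       # new or expired flow: round-robin pick
    _route_cache[key] = (path, now)   # refresh the inactivity timer
    return path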

Distributed Logical Router (DLR):

The DLR chooses a path based on a hash of the source IP and destination IP.
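
A hash-based choice can be sketched in the same spirit (again a conceptual Python example, not the DLR’s real hash function; the path names are assumptions):

# Conceptual sketch of hash-based ECMP: the (source IP, destination IP)
# pair deterministically selects one of the available paths.
import hashlib

PATHS = ["E1", "E2", "E3", "E4"]      # assumed next-hops

def dlr_select_path(src_ip, dst_ip):
    """The same (src, dst) pair always hashes to the same next-hop."""
    digest = hashlib.sha256(f"{src_ip}-{dst_ip}".encode()).digest()
    return PATHS[digest[0] % len(PATHS)]

# Example: all packets between these two hosts use a single path.
print(dlr_select_path("192.168.100.86", "172.16.10.10"))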

 

What happens in case of a failure on one of the Edge devices?

In order to work with ECMP, the requirement is to use a dynamic routing protocol: OSPF or BGP. If we take OSPF as an example, the main factor influencing the traffic outage experience is the tuning of the OSPF timers.

OSPF sends hello messages between neighbors; the Hello interval determines how often an OSPF hello is sent.

Another OSPF timer, the Dead interval, determines how long to wait before an OSPF neighbor is considered “down”. The Dead interval is the main factor that influences convergence time. It is usually 4 times the Hello interval, but the OSPF (and BGP) timers can be set as low as 1 second (Hello interval) and 3 seconds (Dead interval) to speed up traffic recovery.
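
As a rough back-of-the-envelope illustration, the worst-case detection of a silent neighbor is bounded by the Dead interval. The sketch below assumes the common OSPF defaults of a 10-second Hello and a Dead interval of 4 times the Hello; check your own configuration for the actual values:

# Rough illustration: worst-case neighbor-failure detection time is
# bounded by the OSPF Dead interval. Default values are assumptions.
def worst_case_detection(hello_interval, dead_interval=None):
    """Seconds before a silent neighbor is declared down (upper bound)."""
    if dead_interval is None:
        dead_interval = 4 * hello_interval   # "usually 4 times the Hello interval"
    return dead_interval

print(worst_case_detection(10))     # assumed defaults -> 40 seconds
print(worst_case_detection(1, 3))   # aggressive tuning -> 3 seconds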

 

ECMP failed Edge

 

In the example above, the E1 NSX Edge fails; the physical router and the DLR detect E1 as dead at the expiration of the Dead timer and remove their OSPF neighborship with it. As a consequence, the DLR and the physical router remove the routing table entries that originally pointed to the next-hop IP address of the failed ESG.

As a result, all corresponding flows on the affected path are re-hashed through the remaining active units. It’s important to emphasize that network traffic that was forwarded across the non-affected paths remains unaffected.
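
Continuing the earlier hash sketch, removing the failed next-hop from the candidate list is enough to re-hash the affected flows onto the surviving paths (illustrative only; a naive modulo like this one can also move some flows that used healthy paths, whereas a production implementation aims to keep those flows stable):

# Conceptual continuation of the hash sketch: when a path fails, it is
# removed from the active list and affected flows re-hash onto the rest.
# Illustrative only; real ECMP implementations minimize flow disruption.
import hashlib

def select_path(src_ip, dst_ip, active_paths):
    digest = hashlib.sha256(f"{src_ip}-{dst_ip}".encode()).digest()
    return active_paths[digest[0] % len(active_paths)]

paths = ["E1", "E2", "E3", "E4"]
print(select_path("192.168.100.86", "172.16.10.10", paths))
paths.remove("E1")                  # E1 fails and its routes are withdrawn
print(select_path("192.168.100.86", "172.16.10.10", paths))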

 

Troubleshooting and visibility

With ECMP it’s important to have introspection and visibility tools in order to troubleshoot potential points of failure. Let’s look at the following topology.

TSHOT

A user outside our Data Center would like to access the Web Server service inside the Data Center. The user IP address is 192.168.100.86 and the web server IP address is 172.16.10.10.

This user traffic hits the physical router (R1), which has established OSPF adjacencies with E1 and E2 (the Edge devices). As a result, R1 learns how to get to the Web server from both E1 and E2 and ends up with two different active paths towards 172.16.10.10. R1 picks one of the paths to forward the traffic to the Web server and advertises the user network subnet 192.168.100.0/24 to both E1 and E2 via OSPF.

E1 and E2 are NSX for vSphere Edge devices that also establish OSPF adjacencies with the DLR. E1 and E2 will learn how to get to the Web server via OSPF control plane communication with the DLR.

From the DLR perspective, it acts as the default gateway for the Web server. The DLR forms OSPF adjacencies with E1 and E2 and therefore has two different OSPF routes to reach the user network.
From the DLR we can verify the OSPF adjacencies with E1 and E2.

We can use the command: “show ip ospf neighbor”

show ip ospf neighbor

From this output we can see that the DLR has two Edge neighbors: 192.168.100.3 and 192.168.100.10. The next step is to verify that ECMP is actually working.

We can use the command: “show ip route”

show ip route

The output from this command shows that the DLR learned the user network 192.168.100.0/24 via two different paths, one via E1 = 192.168.10.1 and the other via E2 = 192.168.10.10.

Now we want to display all the packets which were captured by an NSX for vSphere Edge interface.

In the example below, in order to display the traffic passing through interface vNic_1 that is not OSPF protocol control traffic, we need to type this command:
“debug packet display interface vNic_1 not_ip_proto_ospf”

Here we can see an example with a ping running from host 192.168.100.86 to host 172.16.10.10:

Capture traffic

If we would like to display only the captured traffic destined to a specific IP address, 172.16.10.10, the capture command would look like this: “debug packet display interface vNic_1 dst_172.16.10.10”

debug packet display interface vNic_1 dst

* Note: When using the command “debug packet display interface”, we need to add an underscore between the expressions that follow the interface name.

Useful CLI for Debugging ECMP

To check which ECMP path is chosen for a flow

  • debug packet display interface IFNAME

To check the ECMP configuration

  • show configuration routing-global

To check the routing table

  • show ip route

To check the forwarding table

  • show ip forwarding

 

Useful CLI for Dynamic Routing

  • show ip ospf neighbor
  • show ip ospf database
  • show ip ospf interface
  • show ip bgp neighbors
  • show ip bgp

ECMP Deployment Considerations

ECMP currently implies stateless behavior. This means that there is no support for stateful services such as the Firewall, Load Balancing or NAT on the NSX Edge Services Gateway. The Edge Firewall is automatically disabled on the ESG when ECMP is enabled. In the current NSX 6.1 release, the Edge Firewall and ECMP cannot be turned on at the same time on an NSX Edge device. Note, however, that the Distributed Firewall (DFW) is unaffected by this.

 

About the authors:

Roie Ben Haim

Roie works as a professional services consultant at VMware, focusing on design and implementation of VMware’s software-defined data center products.  Roie has more than 12 years in data center architecture, with a focus on network and security solutions for global enterprises. An enthusiastic M.Sc. graduate, Roie holds a wide range of industry leading certifications including Cisco CCIE x2 # 22755 (Data Center, CCIE Security), Juniper Networks JNCIE – Service Provider #849, and VMware vExpert 2014, VCP-NV, VCP-DCV.

Max Ardica

Max Ardica is a senior technical product manager in VMware’s networking and security business unit (NSBU). Certified as VCDX #171, his primary task is helping to drive the evolution of the VMware NSX platform, building the VMware NSX architecture and providing validated design guidance for the software-defined data center, specifically focusing on network virtualization. Prior to joining VMware, Max worked for almost 15 years at Cisco, covering different roles from software development to product management. Max also holds a CCIE certification (#13808).

 

 

Tagged with: ,
Posted in Design, NSX 6.1, Troubleshooting