Monday, August 31, 2015

Cisco IP SLA - Investigating IP Service Levels & Change Routing

Cisco IP SLAs is a part of Cisco IOS that allows Cisco customers to analyze IP service levels for IP applications and services by using active traffic monitoring for measuring network performance. With Cisco IOS IP SLAs, service provider customers can measure and provide service level agreements, and enterprise customers can verify service levels, verify outsourced service level agreements, and understand network performance. Cisco IOS IP SLAs can perform network assessments, verify quality of service (QoS), ease the deployment of new services, and assist with network troubleshooting.

IP SLAs sends data across the network to measure performance between multiple network locations or across multiple network paths. It simulates network data and IP services and collects network performance information in real time. Cisco IOS IP SLAs generates and analyzes traffic either between Cisco IOS devices or from a Cisco IOS device to a remote IP device such as a network application server. Measurements provided by the various Cisco IOS IP SLAs operations can be used for troubleshooting, for problem analysis, and for designing network topologies. 

IP SLAs collects a unique subset of these performance metrics:

  • Delay (both round-trip and one-way)
  • Jitter (directional)
  • Packet loss (directional)
  • Packet sequencing (packet ordering)
  • Path (per hop)
  • Connectivity (directional)
  • Server or website download time
In this article, I use the ICMP Echo operation, which measures end-to-end response time between a Cisco router and a web server using IP. Response time is computed by measuring the time between sending an ICMP Echo request message to the destination and receiving an ICMP Echo reply.

Example

Suppose that the CiscoEXM router has two different links: the main connection (red link) and the backup connection (blue link). The question is: how can I enable the backup link if the main connection goes down? In general, the best solution for this scenario is a dynamic routing protocol, but what can I do if I can’t use one? The solution is IP SLA.



 1. Define the ip sla operation. The CiscoEXM router will send an ICMP Echo request to 172.16.255.2 (the CiscoEXM default gateway) every 10 seconds with a timeout of 5000 ms and a threshold value of 500 ms.

CiscoEXM(config)#ip sla 1
CiscoEXM(config-ip-sla)#icmp-echo 172.16.255.2 source-interface FastEthernet1/0
CiscoEXM(config-ip-sla-echo)#timeout 5000
CiscoEXM(config-ip-sla-echo)#frequency 10
CiscoEXM(config-ip-sla-echo)#threshold 500
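
The three timers are related: frequency is given in seconds while timeout and threshold are in milliseconds, and the threshold should not exceed the timeout, which in turn should not exceed the frequency. The following Python sketch is only an illustration of that relationship for the values configured above; the function name check_sla_timers is hypothetical and is not part of Cisco IOS.

```python
def check_sla_timers(frequency_s, timeout_ms, threshold_ms):
    """Return a list of problems; an empty list means the timers are consistent."""
    problems = []
    # The threshold only flags slow replies, so it must fit inside the timeout.
    if threshold_ms > timeout_ms:
        problems.append("threshold exceeds timeout")
    # A probe should finish (or time out) before the next one is due.
    if timeout_ms > frequency_s * 1000:
        problems.append("timeout exceeds frequency")
    return problems


# Values from the configuration above: frequency 10 s, timeout 5000 ms, threshold 500 ms.
print(check_sla_timers(10, 5000, 500))
# A threshold larger than the timeout would be reported as a problem.
print(check_sla_timers(10, 5000, 6000))
```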

 2. Start the ip sla. It is possible to schedule the SLA operation in different ways, but in this tutorial I want to start the IP SLA operation immediately and let it run forever. Notice that the “1” refers to the “ip sla 1” command.

CiscoEXM(config)#ip sla schedule 1 start-time now life forever

 3. Track the state of the IP SLA. Every IP SLAs operation maintains an operation return-code value, which the tracking process interprets. The return code may be OK, Over Threshold, or one of several other values. Two aspects of an IP SLAs operation can be tracked: state and reachability.

 In this case, I prefer to use “reachability”, so the “track state” will go down only when the probe fails, for example on an ICMP timeout.

CiscoEXM(config)#track 10 ip sla 1 reachability

Remember: if you use “state” instead, the “track state” will also go down when the threshold is exceeded.

Note: with Cisco IOS Release 12.4(20)T, 12.2(33)SXI1, 12.2(33)SRE and Cisco IOS XE Release 2.4, the track rtr command is replaced by the track ip sla command. See the track ip sla command for more information.
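
The difference between the two tracking modes can be summarized as a small lookup. This Python sketch is only a model of the behaviour described above (with “state”, only OK counts as up; with “reachability”, OK and Over Threshold both count as up); the function track_state is hypothetical and the return-code strings are simplified.

```python
def track_state(return_code, mode):
    """Map an IP SLA return code to the tracked object state for a given mode."""
    code = return_code.strip().lower()
    if mode == "state":
        up_codes = {"ok"}
    elif mode == "reachability":
        up_codes = {"ok", "over threshold"}
    else:
        raise ValueError("mode must be 'state' or 'reachability'")
    # Every other return code (Timeout, for example) brings the object down.
    return "Up" if code in up_codes else "Down"


for code in ("OK", "Over threshold", "Timeout"):
    print(code, track_state(code, "state"), track_state(code, "reachability"))
```

With the threshold of 500 ms used in this example, a reply that takes 800 ms would bring a “state” track object down but leave a “reachability” track object up.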

 4. Define the tracked route. First I delete the old default route, then I add the default route with the track feature (notice that the number 10 is the track object defined in the previous step), and finally I insert a second default route with a higher, less preferred administrative distance (notice that the number 5 defines the administrative distance). If the track status is down, the tracked route is removed and the last route is used to forward all the traffic.

CiscoEXM(config)#no ip route 0.0.0.0 0.0.0.0 172.16.255.2
CiscoEXM(config)#ip route 0.0.0.0 0.0.0.0 172.16.255.2 track 10
CiscoEXM(config)#ip route 0.0.0.0 0.0.0.0 172.16.255.6 5
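
How the router picks between the tracked route and the backup route can be modelled in a few lines. This Python sketch is a simplified illustration, not how IOS is implemented: a route whose track object is down is ignored, and among the remaining routes the lowest administrative distance wins. The StaticRoute class and active_default function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StaticRoute:
    next_hop: str
    distance: int = 1            # default administrative distance of a static route
    track: Optional[int] = None  # track object number, if the route is tracked


def active_default(routes, track_up):
    """Return the route that would be installed, or None if none is usable."""
    usable = [r for r in routes if r.track is None or track_up.get(r.track, False)]
    if not usable:
        return None
    return min(usable, key=lambda r: r.distance)


routes = [
    StaticRoute("172.16.255.2", distance=1, track=10),  # main (red) link
    StaticRoute("172.16.255.6", distance=5),            # backup (blue) link
]
print(active_default(routes, {10: True}).next_hop)   # track 10 up
print(active_default(routes, {10: False}).next_hop)  # track 10 down
```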

 5. Check the IP SLA.
Now that I have defined the IP SLA object, I can check some useful information when the main link (red link) is up and when it is down.

Red link UP
To display information about the IP route track table:

CiscoEXM#show ip route track-table
 ip route 0.0.0.0 0.0.0.0 172.16.255.2 track 10 state is [up]
CiscoEXM#

To display information about the IP routing table:

CiscoEXM#show ip route 
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area 
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       + - replicated route, % - next hop override

Gateway of last resort is 172.16.255.2 to network 0.0.0.0

S*    0.0.0.0/0 [1/0] via 172.16.255.2
      172.16.0.0/16 is variably subnetted, 4 subnets, 2 masks
C        172.16.255.0/30 is directly connected, FastEthernet1/0
L        172.16.255.1/32 is directly connected, FastEthernet1/0
C        172.16.255.4/30 is directly connected, FastEthernet1/1
L        172.16.255.5/32 is directly connected, FastEthernet1/1
  
To display information about the track object and the IP SLA it follows:

CiscoEXM#show track
Track 10
  IP SLA 1 reachability
  Reachability is Up
    12 changes, last change 00:33:48
  Latest operation return code: OK
  Latest RTT (millisecs) 24
  Tracked by:
    STATIC-IP-ROUTING 0
 
Red link DOWN
First of all, I start a ping to the web server (192.168.1.10), which is located at the headquarters, then I unplug the cable from the CiscoEXM FastEthernet1/0 interface.

CiscoEXM#ping 192.168.1.10 size 50 repeat 200
Type escape sequence to abort.
Sending 200, 50-byte ICMP Echos to 192.168.1.10, timeout is 2 seconds:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!....
May 8 03:23:30.082: %TRACKING-5-STATE: 10 ip sla 1 reachability Up->Down.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Success rate is 97 percent (195/200), round-trip min/avg/max = 20/66/248 ms
CiscoEXM#


As you can see, when the cable is unplugged there are three different events:

  1. some pings time out
  2. the tracking state goes down (May 8 03:23:30.082: %TRACKING-5-STATE: 10 ip sla 1 reachability Up->Down)
  3. the tracked route is removed and the backup default route (ip route 0.0.0.0 0.0.0.0 172.16.255.6 5) is installed.
So the output of the previous show commands will be:

The track object is down:

CiscoEXM#show ip route track-table
ip route 0.0.0.0 0.0.0.0 172.16.255.2 track 10 state is [down]

The default route is 172.16.255.6 (the backup connection):
 
CiscoEXM#show ip route 
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area 
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       + - replicated route, % - next hop override

Gateway of last resort is 172.16.255.6 to network 0.0.0.0

S*    0.0.0.0/0 [5/0] via 172.16.255.6
      172.16.0.0/16 is variably subnetted, 4 subnets, 2 masks
C        172.16.255.0/30 is directly connected, FastEthernet1/0
L        172.16.255.1/32 is directly connected, FastEthernet1/0
C        172.16.255.4/30 is directly connected, FastEthernet1/1
L        172.16.255.5/32 is directly connected, FastEthernet1/1

The return code is “Timeout”: 

CiscoEXM#show ip sla statistics 
IPSLAs Latest Operation Statistics

IPSLA operation id: 1
        Latest RTT: NoConnection/Busy/Timeout
Latest operation start time: 10:42:03 UTC Wed May 8 2013
Latest operation return code: Timeout
Number of successes: 8
Number of failures: 3
Operation time to live: Forever


The track object is down:

CiscoEXM#show track
Track 10
  IP SLA 1 reachability
  Reachability is Down
    13 changes, last change 00:00:22
  Latest operation return code: Timeout
  Tracked by:
    STATIC-IP-ROUTING 0
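
The number of lost pings is consistent with the configured timers. The following Python sketch is a rough estimate under a simplified model that I am assuming here (the failure is detected when the first probe after the cut times out, and the time to install the backup route is ignored); the function detection_window_s is hypothetical.

```python
def detection_window_s(frequency_s, timeout_ms):
    """Return (best, worst) detection delay in seconds under the simplified model."""
    timeout_s = timeout_ms / 1000
    best = timeout_s                 # link fails just before a probe is sent
    worst = frequency_s + timeout_s  # link fails just after a probe succeeded
    return best, worst


best, worst = detection_window_s(frequency_s=10, timeout_ms=5000)

# From the ping output above: 195 of 200 replies, each lost ping waited 2 seconds.
lost_pings = 200 - 195
observed_outage_s = lost_pings * 2

print(best, worst, observed_outage_s)
```

Under these assumptions the outage of about 10 seconds falls between the best case (5 seconds) and the worst case (15 seconds). A lower frequency and timeout would shorten the window at the cost of more probe traffic.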

Red link UP again
At this point I ping the web server (192.168.1.10) again, then I reconnect the CiscoEXM FastEthernet1/0 cable.

CiscoEXM#ping 192.168.1.10 size 50 repeat 200
Type escape sequence to abort.
Sending 200, 50-byte ICMP Echos to 192.168.1.10, timeout is 2 seconds:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Success rate is 100 percent (200/200), round-trip min/avg/max = 12/71/172 ms
CiscoEXM#



CiscoEXM#show ip route 
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area 
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       + - replicated route, % - next hop override

Gateway of last resort is 172.16.255.2 to network 0.0.0.0

S*    0.0.0.0/0 [1/0] via 172.16.255.2
      172.16.0.0/16 is variably subnetted, 4 subnets, 2 masks
C        172.16.255.0/30 is directly connected, FastEthernet1/0
L        172.16.255.1/32 is directly connected, FastEthernet1/0
C        172.16.255.4/30 is directly connected, FastEthernet1/1
L        172.16.255.5/32 is directly connected, FastEthernet1/1

As you can see, when the main line comes back up (the default gateway is 172.16.255.2 again), no packets are lost.

*** Note: Switches running the IP base image support only IP SLAs responder functionality and must be configured with another device that supports full IP SLAs functionality, for example, a switch running the IP services image.

**** Useful commands for Monitoring IP SLAs Operations

Command: show ip sla application
Purpose: Display global information about Cisco IOS IP SLAs.

Command: show ip sla authentication
Purpose: Display IP SLAs authentication information.

Command: show ip sla configuration [entry-number]
Purpose: Display configuration values including all defaults for all IP SLAs operations or a specific operation.

Command: show ip sla enhanced-history {collection-statistics | distribution-statistics} [entry-number]
Purpose: Display enhanced history statistics for collected history buckets or distribution statistics for all IP SLAs operations or a specific operation.

Command: show ip sla ethernet-monitor configuration [entry-number]
Purpose: Display IP SLAs automatic Ethernet configuration.

Command: show ip sla group schedule [schedule-entry-number]
Purpose: Display IP SLAs group scheduling configuration and details.

Command: show ip sla history [entry-number | full | tabular]
Purpose: Display history collected for all IP SLAs operations.

Command: show ip sla mpls-lsp-monitor {collection-statistics | configuration | ldp operational-state | scan-queue | summary [entry-number] | neighbors}
Purpose: Display MPLS label switched path (LSP) Health Monitor operations.

Command: show ip sla reaction-configuration [entry-number]
Purpose: Display the configured proactive threshold monitoring settings for all IP SLAs operations or a specific operation.

Command: show ip sla reaction-trigger [entry-number]
Purpose: Display the reaction trigger information for all IP SLAs operations or a specific operation.

Command: show ip sla responder
Purpose: Display information about the IP SLAs responder.

Command: show ip sla statistics [entry-number | aggregated | details]
Purpose: Display current or aggregated operational status and statistics.



Tuesday, August 11, 2015

VMware ESXi 5.5 / 6.0 GA "Driver Not Found" problems solution for HP P440ar RAID controllers on New Gen9 Servers.


Symptoms

When configuring and using ESXi 5.5 / 6.0 Scripted Install there are several factors that can cause an installation to fail. Reviewing the log files and any on-screen prompts can often indicate the root cause of the issue. This article is to help you troubleshoot these issues.
Basically, the ESXi 5.5/6.0 GA ISO provided by VMware does not include the latest HP Smart Array driver that supports HP Gen9 Smart Array controllers.

As a result, an ESXi installation or upgrade to ESXi 5.5/6.0 may fail, and you see errors similar to:


Error (see log for more info)
An error has occurred while parsing the installer script

error:file:///ks.cfg:line 3: Error (see log for more info):
Could not find first boot partition

 SOLUTION:


1. Install the host using HP's custom ESXi 6.0 GA installation media which includes the scsi-hpsa-5.5.0.74-1OEM.550.0.0.1331820 driver.
To install the host using the HP ESXi 6.0 Custom Image, download the ISO from VMware downloads and install ESXi as detailed in the Installing ESXi section in the vSphere 6.0 Installation and Setup guide.
 


2. Create a custom ISO and inject the updated driver separately.

To manually create a custom ESXi 6.0 image, inject the driver and install the host:

  • Create a custom ESXi build using the ESXi-Customizer-v2.7.2 software.

    Note: ESXi-Customizer is not a VMware product. You can download the software from the ESXi-Customizer Downloads page.
  • Replace the current version of the driver 6.0.0.44-4vmw.600.0.0.2494585 with version scsi-hpsa-5.5.0.74-1OEM.550.0.0.1331820.x86_64.vib.

 To replace the driver:


  1. Download the driver vib file from Drivers & software ESXI 6.0 or Drivers & software ESXI 5.5.0.

    Note: The preceding link was correct as of December 17, 2015. If you find the link is broken, provide feedback and a VMware employee will update the link.
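  2. Run this command to install the driver (the -v option takes the full path to a .vib file; the -d option is only for offline bundle .zip depots):
    esxcli software vib install -v path_to_vib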
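
Because esxcli uses different options for a single .vib file (-v) and for an offline bundle .zip (-d), it is easy to mix them up. This Python sketch only builds the command line for a given file; the helper name build_vib_install_command and the /tmp path are assumptions for illustration, and the command itself must still be run on the ESXi host.

```python
from pathlib import PurePosixPath


def build_vib_install_command(path):
    """Return the esxcli argument list for installing a .vib file or a .zip bundle."""
    p = PurePosixPath(path)
    if not p.is_absolute():
        raise ValueError("esxcli expects the full path to the file")
    suffix = p.suffix.lower()
    if suffix == ".vib":
        option = "-v"   # single VIB file
    elif suffix == ".zip":
        option = "-d"   # offline bundle (depot)
    else:
        raise ValueError("expected a .vib or .zip file")
    return ["esxcli", "software", "vib", "install", option, str(p)]


# Hypothetical location of the driver downloaded in step 1.
vib = "/tmp/scsi-hpsa-5.5.0.74-1OEM.550.0.0.1331820.x86_64.vib"
print(" ".join(build_vib_install_command(vib)))
```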

Other Common Errors & Solutions:

This table outlines common errors and how to troubleshoot them:

Error:
error:nfs://192.168.0.1/FileShare/ks.cfg:line12: install --firstdisk specified, but no suitable disk found
Solution:
This error can occur if there is insufficient disk space on the host to install ESXi. The minimum requirement for the size of the boot disk is 900 MB. Ensure that your disk meets this requirement.

Error:
NFS mount failure for URL nfs://192.168.0.101:/NFS/ks1.cfg
Solution:
Log onto the host and examine /var/log/weasel.log, which contains a more informative message. In the log you see the error:
Could not open file over NFS nfs://192.168.0.101:/NFS/ks1.cfg
Exception: Error (see log for more info):
Error copy / download file
This means the ks1.cfg file was not found. Either the file was not created or the mount point is incorrect. Ensure that the file exists and that the share has been created.

Error:
Did not get an IP Address from DHCP server
Solution:
This error indicates that the host cannot contact a DHCP server. The host requires an IP address to be able to mount the NFS share containing the ks.cfg file.
If you do not specify IP information in the boot command, the installer defaults to DHCP. If there is no DHCP server on the network, the installer has no way of retrieving files over the network. Ensure your DHCP server is working correctly.
In an environment where no DHCP server is available, static network configuration entries must be added to the boot command so that the installer can reach the ks.cfg file. This can be achieved by specifying these parameters at the boot prompt:
ip=, netmask=, gateway=

Error:
warning:nfs://192.168.0.101/NFS/ks.cfg:line 5: bootproto was set to static but "--hostname=" was not set. Hostname will be set automatically.
Solution:
This is not an error, but a warning that temporarily halts installation. When setting a static IP, specify a hostname to avoid this warning.
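
For the static case, the boot options can be assembled and sanity-checked before typing them at the boot prompt. This Python sketch is only an illustration: the helper static_boot_options is hypothetical, the host address 192.168.0.50 and gateway 192.168.0.1 are made-up example values, and I am assuming the kickstart file is passed with the ks= option next to the ip=, netmask= and gateway= parameters listed above.

```python
import ipaddress


def static_boot_options(ks_url, ip, netmask, gateway):
    """Build the boot options string for a scripted install without DHCP."""
    iface = ipaddress.ip_interface(f"{ip}/{netmask}")
    # The gateway must be on the same subnet as the host, otherwise the
    # installer cannot reach the share that holds ks.cfg.
    if ipaddress.ip_address(gateway) not in iface.network:
        raise ValueError("gateway is not in the host's subnet")
    return f"ks={ks_url} ip={ip} netmask={netmask} gateway={gateway}"


print(static_boot_options(
    "nfs://192.168.0.101/NFS/ks.cfg",  # kickstart location used in the examples above
    ip="192.168.0.50",                 # hypothetical host address
    netmask="255.255.255.0",
    gateway="192.168.0.1",             # hypothetical gateway
))
```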