Sunday, September 30, 2012


Supervisor 2/2E and Admin VDC

In June of 2012 Cisco announced a new addition to the Nexus 7000 family, the Supervisor 2 and Supervisor 2E. These new members of the family bring some really exciting capabilities to the platform. I like Virtual Device Contexts (VDCs) a bit….ok, quite a bit….ok, they are my favorite thing to talk about on this platform and I earned the VDCBadger moniker in 2011 at Cisco Live. If I could sum it up, I’d be like Turtle Boy – “I like VDCs”.  For reference - http://www.youtube.com/watch?v=CMNry4PE93Y

Enough rambling, the new supervisors bring new capabilities that we’ll discuss in more detail. First and foremost, let’s look at what we have. There are two models of the new supervisor: Supervisor 2 and Supervisor 2E. There are some key differences between the modules, starting with the CPUs. Both SUP2 and SUP2E use Intel Xeon Quad-Core CPUs, which alone bring a lot more control plane power, but SUP2E has two CPUs. Additionally, the SUP2 ships with 12GB of RAM and SUP2E has 32GB of RAM. You can see the differences in hardware in the images below.

[Images: Supervisor 2 and Supervisor 2E]

These combine to allow some significant increases in scalability across the chassis, primarily with the number of VDCs. Supervisor 2 supports 4 VDCs while Supervisor 2E supports 8! Additionally, a new capability called “Admin VDC” comes in NX-OS 6.1(1) on the Supervisor 2/2E, so you’ll frequently see the VDC count listed as 4+1 and 8+1 with the +1 being the Admin VDC. More details on Admin VDC in a bit. Know that Admin VDC is a management context that is a direct result of customer feedback. The additional CPU also brings higher scale for Nexus 2000s (FEX) and IEEE 1588 PTP clients, with more scale increments across the chassis to come.

The new Supervisors also take advantage of a new 64-bit kernel for NX-OS and USB flash, and operations that require CPU, like saving the configuration, ISSU, etc., are all faster. SUP2 also brings FCoE on the F2 series modules to customers. Finally, one last nerd knob is CPU Shares – pretty much QoS for the CPU in a multi-VDC environment.

One thing you’ll notice is missing is the Connectivity Management Processor (CMP). This was done intentionally and not without a lot of thought and feedback. Long story short, most customers were not using it. Everyone agreed CMP was a cool idea, but it was rarely plugged in. Removing it means the SUP2 and SUP2E use less power, which is a key concern for a lot of customers. What does SUP2 look like on the CLI? Funny you ask, I happen to have some CLI.

Hardware

  cisco Nexus7000 C7009 (9 Slot) Chassis ("Supervisor module-2")
  Intel(R) Xeon(R) CPU         with 32745276 kB of memory.
  Processor Board ID JAF1608ACEK

 

N7K-1# show mod
Mod  Ports  Module-Type                         Model           Status
---  -----  ----------------------------------- --------------- ------
2    0      Supervisor module-2                 N7K-SUP2E       active *
3    32     1/10 Gbps Ethernet Module           N7K-F132XP-15   ok
4    8      10 Gbps Ethernet XL Module          N7K-M108X2-12L  ok
6    48     1/10 Gbps Ethernet Module           N7K-F248XP-25   ok
7    48     10/100/1000 Mbps Ethernet XL Module N7K-M148GT-11L  ok

Looks a lot like you’d expect. :)

I mentioned Admin VDC, so let’s dig in to what Admin VDC does. First, let’s talk about VDCs in general – there are two kinds of VDCs – the default VDC and non-default VDCs. The default VDC is the VDC the switch operates in if you are not using VDCs. Read that last sentence again.  It really means that even if you are not using VDCs, you are using VDCs. 


Customers asked for an administrative context to perform system-wide operations and the Admin VDC came to be. See, we *do* read your surveys and hear your feedback!

Admin VDC is a new type of VDC, specialized in that it is 100% administrative only. Admin VDC is designed to allow “run the box” type functions to be performed in a context separate from data plane traffic. The following configuration or tasks can be performed in Admin VDC:

1 – VDC operations – creation, deletion, suspension, all resource allocation including CPU shares
2 – Install operations – ISSU/ISSD of the NX-OS, EPLD upgrades, feature set installation (FCoE/FabricPath/FEX/MPLS) and licensing
3 – Reload – individual VDC or entire chassis
4 – Control Plane Policing – class map, policy map definition and application
5 – Ethanalyzer of control plane traffic
6 – GOLD diagnostic - start/stop tests, configure tests
7 – Miscellaneous module operations - Out of service, purge config for removed modules
8 – Admin VDC specific debugging – bootvar, copp, diagnostics(GOLD), ethdstats, exceptionlog, giscm, license, oim, plog, and psshelper_gsvc.  Of the list, ethdstats, gicsm, oim, plog and psshelper_gsvc are system level processes and the rest are admin VDC only tasks like boot configuration, CoPP, GOLD and licensing.

Finally, Admin VDC cannot have Ethernet interfaces inside it other than the Management interface (mgmt0). This also means no routing protocols, L2 protocols or other L2/L3 features are available or configurable in Admin VDC.

In a switch with a default VDC configuration, all of these functions would be done from the default VDC, but by moving these capabilities to the Admin VDC the data plane VDCs are “cleaner.” An additional benefit is that it lends itself to multi-tenancy, where a network operator could give a tenant control over an entire VDC without the tenant seeing system-wide configuration parameters.

So how does one get Admin VDC? First, you need SUP2 or SUP2E – it’s coming for SUP1 – so remain calm and carry on. With your shiny new SUP2/SUP2E, you’ll see Admin VDC is a prompt during the boot cycle.
 

   Enter the password for "admin":
  Confirm the password for "admin":
  Do you want to enable admin vdc (yes/no) [n]:

 You can also convert the default VDC to Admin VDC if you answered no to the prompt. Let’s take a N7K with 4 VDCs and convert the default VDC.  That looks like this:

N7K-1# show vdc

vdc_id  vdc_name state    mac               type        lc
------  -------- -----    ----------        ---------   ------
1       N7K-1    active   00:26:98:0f:d9:c1 Ethernet    m1 f1 m1xl m2xl
2       Agg1     active   00:26:98:0f:d9:c2 Ethernet    m1 f1 m1xl m2xl
3       OTV1     active   00:26:98:0f:d9:c3 Ethernet    m1 f1 m1xl m2xl
4       Access1  active   00:26:98:0f:d9:c4 Ethernet    f2
 
Now, let’s convert it.

N7K-1# config
Enter configuration commands, one per line.  End with CNTL/Z.
N7K-1(config)# system admin-vdc
N7K-1(config)#
N7K-1(config)# show vdc
vdc_id  vdc_name  state    mac                 type        lc
------  --------  -----    ----------          ---------   ------
1       N7K-1      active  00:26:98:0f:d9:c1   Admin       None
2       Agg1       active  00:26:98:0f:d9:c2   Ethernet    m1 f1 m1xl m2xl
3       OTV1       active  00:26:98:0f:d9:c3   Ethernet    m1 f1 m1xl m2xl
4       Access1    active  00:26:98:0f:d9:c4   Ethernet    f2

Note that when you do this, some key changes are made to what used to be the default VDC. See how the linecard support (LC) is changed to none. Refer to my earlier comment about routing protocols, L2 technologies, etc. not working in Admin VDC.

In case you are wondering what happens if you try to create too many VDCs, see below, where we have a SUP2E with 9 VDCs (8+1) and we try to create a new VDC called “TooMany.”

N7K-1(config)# show vdc

vdc_id  vdc_name     state       mac                 type        lc
------  --------     -----      ----------          ---------   ------
1       N7K-1        active     00:26:98:0f:d9:c1   Admin       None
2       Agg1         active     00:26:98:0f:d9:c2   Ethernet    f2
3       Core1        active     00:26:98:0f:d9:c3   Ethernet    m1 f1 m1xl m2xl
4       Access1      active     00:26:98:0f:d9:c4   Ethernet    m1 f1 m1xl m2xl
5       FCoE         active     00:26:98:0f:d9:c5   Storage     f1 f2
6       DMZ          active     00:26:98:0f:d9:c6   Ethernet    m1 f1 m1xl m2xl
7       Lab          active     00:26:98:0f:d9:c7   Ethernet    m1xl m2xl
8       FP-Test      active     00:26:98:0f:d9:c8   Ethernet    f2
9       Nuke         active     00:26:98:0f:d9:c9   Ethernet    f1

N7K-1(config)# vdc TooMany
ERROR: You have reached the maximum number of allowed vdcs [8]
N7K-1(config)#
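
If you want to check your headroom before the switch tells you no, here is a minimal Python sketch that counts VDCs from "show vdc" text. The limits come from the 4+1 and 8+1 numbers above. The function names and the "N7K-SUP2" key are my own choices for illustration, and the sketch assumes the Admin VDC does not count against the limit.

```python
# Per-supervisor limits as described in this post (the +1 Admin VDC is not counted).
# The dictionary keys are illustrative labels, not output parsed from the switch.
MAX_VDCS = {"N7K-SUP2": 4, "N7K-SUP2E": 8}

SAMPLE_SHOW_VDC = """\
vdc_id  vdc_name     state       mac                 type        lc
------  --------     -----      ----------          ---------   ------
1       N7K-1        active     00:26:98:0f:d9:c1   Admin       None
2       Agg1         active     00:26:98:0f:d9:c2   Ethernet    f2
3       Core1        active     00:26:98:0f:d9:c3   Ethernet    m1 f1 m1xl m2xl
4       Access1      active     00:26:98:0f:d9:c4   Ethernet    m1 f1 m1xl m2xl
5       FCoE         active     00:26:98:0f:d9:c5   Storage     f1 f2
6       DMZ          active     00:26:98:0f:d9:c6   Ethernet    m1 f1 m1xl m2xl
7       Lab          active     00:26:98:0f:d9:c7   Ethernet    m1xl m2xl
8       FP-Test      active     00:26:98:0f:d9:c8   Ethernet    f2
9       Nuke         active     00:26:98:0f:d9:c9   Ethernet    f1
"""


def count_vdcs(show_vdc_output):
    """Return (non_admin, admin) VDC counts parsed from 'show vdc' text."""
    non_admin = admin = 0
    for line in show_vdc_output.splitlines():
        fields = line.split()
        # Data rows start with a numeric vdc_id; skip the header and the rule line.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        if fields[4] == "Admin":
            admin += 1
        else:
            non_admin += 1
    return non_admin, admin


def can_create_vdc(show_vdc_output, sup_model):
    """True if one more non-admin VDC fits under the assumed limit."""
    non_admin, _ = count_vdcs(show_vdc_output)
    return non_admin < MAX_VDCS[sup_model]


print(count_vdcs(SAMPLE_SHOW_VDC))
print(can_create_vdc(SAMPLE_SHOW_VDC, "N7K-SUP2E"))
```

With the nine-row output above this reports eight non-admin VDCs plus one Admin VDC, so a tenth is refused, which matches the error the switch returned.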

Pretty darn cool, IMHO. As always, your comments and feedback are appreciated!

Sunday, September 16, 2012

FabricPath Authentication in NX-OS

A few weeks ago I was asked by a colleague about some issues they were having with FabricPath authentication. I really hadn’t delved into the details of it yet so took the opportunity to do so. I thought my experience would be good to blog about and share with others. I can see this being a topic some of our more security conscious customers implement and would make a good topic for the CCIE Data Center.

First and foremost, I’m going to assume you have a basic working knowledge of FabricPath, Cisco’s scalable Layer 2 solution that eliminates Spanning Tree Protocol, adds some enhancements that are sorely needed in L2 networks like Time To Live (TTL) and Reverse Path Forwarding (RPF), and uses IS-IS as a control plane protocol.  It’s the fact that FabricPath uses IS-IS that makes it very easy and familiar for customers to enable authentication in their fabric. If you have ever configured authentication for a routing protocol in Cisco IOS or NX-OS, this will be similar with all of your favorites like key chains, key strings and hashing algorithms. Hopefully that nugget of information doesn’t send you into a tail spin of despair.  ;)

With FabricPath there are two levels of authentication that can be enabled. The first is at the domain level for the entire switch (or VDC!). Authentication here will prevent routes from being learned. It is important to note that IS-IS adjacencies can be formed at the interface level even when the domain authentication is mismatched. This domain level authentication is for LSP and SNP exchange, not the hello PDUs on the interfaces.  If you are not careful, you can blackhole traffic during the implementation of authentication, just like you would with any other routing protocol.

A quick order of operation to enable domain level authentication would be to define a key-chain with keys which contain key-strings defined underneath. The key strings are the actual password and NX-OS allows you to define multiple key-strings so you can rotate passwords as needed and even includes nerd knobs for setting start and end times. After the key chains are defined, they are applied to the FabricPath domain. Let’s quit typing and let the CLI do the talking.

We start with a VDC that has FabricPath, is in a fabric with other devices but doesn’t have authentication enabled. We can see we have not learned any routes.
N7K-2-Access2# show fabricpath route
FabricPath Unicast Route Table
'a/b/c' denotes ftag/switch-id/subswitch-id
'[x/y]' denotes [admin distance/metric]
ftag 0 is local ftag
subswitch-id 0 is default subswitch-id


FabricPath Unicast Route Table for Topology-Default
0/4/0, number of next-hops: 0
        via ---- , [60/0], 24 day/s 00:32:41, local
0/69/1, number of next-hops: 0
1/69/0, number of next-hops: 0
        via ---- , [60/0], 15 day/s 04:18:01, local
2/69/0, number of next-hops: 0
        via ---- , [60/0], 15 day/s 04:18:01, local

We can also see we are adjacent to some other devices, but also note that we *don’t* see their name under System ID, just the MAC address. This is a quick pointer that something is amiss with the control plane. Look at the System ID column below.
N7K-2-Access2# show fabricpath isis adj
Fabricpath IS-IS domain: default Fabricpath IS-IS adjacency database:
System ID       SNPA            Level  State  Hold Time  Interface
0026.980f.d9c4  N/A             1      UP     00:00:25   port-channel1
0024.98eb.ff42  N/A             1      UP     00:00:29   Ethernet3/9
0024.98eb.ff42  N/A             1      UP     00:00:27   Ethernet3/10
0026.980f.d9c2  N/A             1      UP     00:00:22   Ethernet3/20
0026.980f.d9c2  N/A             1      UP     00:00:29   Ethernet3/21
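
If you have a lot of adjacencies to eyeball, a small script can flag the ones that still show a raw MAC address. This is a minimal Python sketch that assumes the six-column layout shown above; the function name and sample text are mine, not something NX-OS provides.

```python
import re

# A System ID that still looks like xxxx.xxxx.xxxx has not resolved to a hostname.
MAC_RE = re.compile(r"^[0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4}$", re.IGNORECASE)

SAMPLE_ADJ = """\
System ID       SNPA            Level  State  Hold Time  Interface
0026.980f.d9c4  N/A             1      UP     00:00:25   port-channel1
N7K-2-Agg2      N/A             1      UP     00:00:29   Ethernet3/9
0026.980f.d9c2  N/A             1      UP     00:00:22   Ethernet3/20
"""


def unresolved_neighbors(adj_output):
    """Return (system_id, interface) pairs whose System ID is a raw MAC."""
    found = []
    for line in adj_output.splitlines():
        fields = line.split()
        # Adjacency rows split into six fields and carry a numeric Level.
        if len(fields) == 6 and fields[2].isdigit() and MAC_RE.match(fields[0]):
            found.append((fields[0], fields[5]))
    return found


for system_id, interface in unresolved_neighbors(SAMPLE_ADJ):
    print(f"{interface}: neighbor {system_id} has no hostname yet")
```

An empty result does not prove authentication is healthy, it only means every neighbor's name has been learned.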

Now we’ll add the authentication and start with the key-chain and call it “domain” then define key 0 and the key-string of “domain” (not very creative am I?) and then finally apply it to the fabricpath domain default.
 
N7K-2-Access2# config
Enter configuration commands, one per line.  End with CNTL/Z.
N7K-2-Access2(config)# key chain domain
N7K-2-Access2(config-keychain)# key 0
N7K-2-Access2(config-keychain-key)# key-string domain
N7K-2-Access2(config-keychain-key)# fabricpath domain default
N7K-2-Access2(config-fabricpath-isis)# authentication key domain

Now let’s see what that does for us.  Much happier now aren’t we?
N7K-2-Access2(config-fabricpath-isis)# show  fabricpath route
FabricPath Unicast Route Table
'a/b/c' denotes ftag/switch-id/subswitch-id
'[x/y]' denotes [admin distance/metric]
ftag 0 is local ftag
subswitch-id 0 is default subswitch-id

FabricPath Unicast Route Table for Topology-Default
0/4/0, number of next-hops: 0
        via ---- , [60/0], 24 day/s 00:33:32, local
0/69/1, number of next-hops: 0
1/1/0, number of next-hops: 2
        via Eth3/20, [115/40], 0 day/s 00:00:10, isis_fabricpath-default
        via Eth3/21, [115/40], 0 day/s 00:00:10, isis_fabricpath-default
1/2/0, number of next-hops: 2
        via Eth3/9, [115/40], 0 day/s 00:00:11, isis_fabricpath-default
        via Eth3/10, [115/40], 0 day/s 00:00:11, isis_fabricpath-default
1/69/0, number of next-hops: 0
        via ---- , [60/0], 15 day/s 04:18:52, local
1/100/0, number of next-hops: 4
        via Eth3/9, [115/40], 0 day/s 00:00:11, isis_fabricpath-default
        via Eth3/10, [115/40], 0 day/s 00:00:11, isis_fabricpath-default
        via Eth3/20, [115/40], 0 day/s 00:00:10, isis_fabricpath-default
        via Eth3/21, [115/40], 0 day/s 00:00:10, isis_fabricpath-default
2/69/0, number of next-hops: 0
        via ---- , [60/0], 15 day/s 04:18:52, local
N7K-2-Access2(config-fabricpath-isis)#

The exact same sequence applies to interface-level authentication and looks like the CLI below. We can see that we have two non-functioning states here – INIT and LOST. INIT is from me removing the key-chain and flapping the interface (shut/no shut) and LOST is from me removing the pre-defined key chain and the adjacency to N7K-1-Agg1 going down.
N7K-2-Access2# show fab isis adj
Fabricpath IS-IS domain: default Fabricpath IS-IS adjacency database:
System ID       SNPA            Level  State  Hold Time  Interface
N7K-1-Access1   N/A             1      UP     00:00:27   port-channel1
N7K-2-Agg2      N/A             1      INIT   00:00:22   Ethernet3/9
N7K-2-Agg2      N/A             1      UP     00:00:23   Ethernet3/10
N7K-1-Agg1      N/A             1      LOST   00:04:57   Ethernet3/20
N7K-1-Agg1      N/A             1      UP     00:00:30   Ethernet3/21
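
The same kind of parsing can call out any adjacency that is not UP. Again, this is a rough Python sketch under the assumption that each row has the six columns shown above; the names are hypothetical.

```python
SAMPLE_ADJ_STATES = """\
System ID       SNPA            Level  State  Hold Time  Interface
N7K-1-Access1   N/A             1      UP     00:00:27   port-channel1
N7K-2-Agg2      N/A             1      INIT   00:00:22   Ethernet3/9
N7K-2-Agg2      N/A             1      UP     00:00:23   Ethernet3/10
N7K-1-Agg1      N/A             1      LOST   00:04:57   Ethernet3/20
N7K-1-Agg1      N/A             1      UP     00:00:30   Ethernet3/21
"""


def down_adjacencies(adj_output):
    """Return (system_id, interface, state) for every adjacency that is not UP."""
    problems = []
    for line in adj_output.splitlines():
        fields = line.split()
        # Skip the header: data rows have six fields and a numeric Level.
        if len(fields) == 6 and fields[2].isdigit() and fields[3] != "UP":
            problems.append((fields[0], fields[5], fields[3]))
    return problems


for system_id, interface, state in down_adjacencies(SAMPLE_ADJ_STATES):
    print(f"{interface} to {system_id} is {state}")
```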

Now we’ll add our key chain and key string.
N7K-2-Access2# config
Enter configuration commands, one per line.  End with CNTL/Z.
N7K-2-Access2(config)# key chain interface
N7K-2-Access2(config-keychain)# key 0
N7K-2-Access2(config-keychain-key)# key-string 7 interface
N7K-2-Access2(config-keychain-key)#
N7K-2-Access2(config-keychain-key)# int e3/9
N7K-2-Access2(config-if)# fabricpath isis authentication-type cleartext
N7K-2-Access2(config-if)# fabricpath isis authentication key-chain interface
N7K-2-Access2(config-if)#

A quick check shows us we’re happily adjacent to our switches.

N7K-2-Access2(config-keychain)# show fab isis adj
Fabricpath IS-IS domain: default Fabricpath IS-IS adjacency database:
System ID       SNPA            Level  State  Hold Time  Interface
N7K-1-Access1   N/A             1      UP     00:00:30   port-channel1
N7K-2-Agg2      N/A             1      UP     00:00:29   Ethernet3/9
N7K-2-Agg2      N/A             1      UP     00:00:26   Ethernet3/10
N7K-1-Agg1      N/A             1      UP     00:00:24   Ethernet3/20
N7K-1-Agg1      N/A             1      UP     00:00:31   Ethernet3/21

Finally, a quick command to check the FabricPath authentication status on your device is below:

N7K-2-Access2# show fab isi

Fabricpath IS-IS domain : default
  System ID : 0024.98eb.ff43  IS-Type : L1
  SAP : 432  Queue Handle : 11
  Maximum LSP MTU: 1492
  Graceful Restart enabled. State: Inactive
  Last graceful restart status : none
  Metric-style : advertise(wide), accept(wide)
  Start-Mode: Complete [Start-type configuration]
  Area address(es) :
    00
  Process is up and running
  CIB ID: 3
  Interfaces supported by Fabricpath IS-IS :
    port-channel1
    Ethernet3/9
    Ethernet3/10
    Ethernet3/20
    Ethernet3/21
  Level 1
  Authentication type: MD5
  Authentication keychain: domain  Authentication check specified
  MT-0 Ref-Bw: 400000
  Address family Swid unicast :
    Number of interface : 5
    Distance : 115
  L1 Next SPF: Inactive

N7K-2-Access2#
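
If you are checking a handful of switches, pulling the two authentication lines out of that output is easy to script. This Python sketch assumes the "Authentication type:" and "Authentication keychain:" labels appear exactly as in the output above; the function name is made up for illustration.

```python
import re

SAMPLE_ISIS = """\
  Level 1
  Authentication type: MD5
  Authentication keychain: domain  Authentication check specified
  MT-0 Ref-Bw: 400000
"""


def fabricpath_auth_status(show_output):
    """Extract the authentication type and key chain name, or None if absent."""
    auth_type = re.search(r"Authentication type:\s*(\S+)", show_output)
    keychain = re.search(r"Authentication keychain:\s*(\S+)", show_output)
    return {
        "type": auth_type.group(1) if auth_type else None,
        "keychain": keychain.group(1) if keychain else None,
    }


print(fabricpath_auth_status(SAMPLE_ISIS))
```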


With this simple exercise you’ve configured FabricPath authentication. Not too bad and very effective. As always when configuring passwords on your device, cutting and pasting from a common text file is important to avoid trailing white space at the end of passwords and other nuances that can lead you down the wrong path. In general, I would expect a customer who implements FabricPath authentication will probably configure both domain and interface level authentication.

 

As always, your comments and feedback are appreciated!

 

Sunday, March 20, 2011

OTV Deep Dive - Part 3

After a long delay, let's pick up where we left off with our OTV deep dive. This post will focus on a key concept with OTV that is critical to understand. We'll examine how we localize our First Hop Redundancy Protocols (FHRPs). These protocols are Hot Standby Router Protocol (HSRP v1 and v2), Virtual Router Redundancy Protocol (VRRP), and Gateway Load Balancing Protocol (GLBP). These protocols allow two network devices to share a common IP address to be used as the default gateway on a subnet and provide redundancy and load balancing to clients in that subnet.

Before we can discuss FHRP localization, let's review why this might be significant to our design. Typically with FHRPs the members of the group are local to each other both logically and physically. Depending on the FHRP there is load balancing or redirection between the devices to the "active" member to handle traffic. This works well when considered locally and most of us use it without a second thought.

When we start to stretch or extend our VLANs across distances, latency is introduced. While a 1ms one-way latency may not sound significant, when accumulated over a complete flow or transaction, it can become quite detrimental to performance. This is exacerbated if the two devices are both in the same location, but have default gateways in another data center. Suboptimal switching and routing at its finest. This effect is referred to as tromboning traffic and is illustrated below where device A needs to talk with device B and the default gateway resides across a stretched VLAN.

[Figure: traffic tromboning between device A and device B across a stretched VLAN]

We address this with OTV by implementing filters to prevent the FHRP peers in opposite data centers from seeing each other, so each site's FHRP becomes localized. There are two approaches to do this: one uses a MAC access list, which we won't cover, and the other, recommended one uses an IP ACL that is applied as a VLAN ACL (VACL). To be fair, both work equally well in my experience, but the IP ACL is easier to operationalize and I am a staunch believer in making networks easier to maintain and avoiding what I refer to as Science Fair Projects. We've all worked on, inherited or (hopefully not!) created a Science Fair Project - let's avoid that. ;)

The configuration for the IP ACL looks like this:

ip access-list HSRP_IP
10 permit udp any 224.0.0.2/32 eq 1985
20 permit udp any 224.0.0.102/32 eq 1985

This access list matches the multicast addresses for HSRPv1 and HSRPv2, though it can be modified for VRRP and GLBP.
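
To make the matching logic concrete, here is a small Python sketch of how a hello could be classified by destination address and port. The HSRP entries mirror the ACL above. The GLBP and VRRP entries are not from this post's config; they are the well-known values for those protocols (GLBP uses UDP port 3222 to 224.0.0.102, VRRP uses IP protocol 112 to 224.0.0.18). The table and function names are my own.

```python
# Key: (protocol, destination multicast address, destination port or None).
FHRP_HELLOS = {
    ("udp", "224.0.0.2", 1985): "HSRPv1",    # line 10 of the ACL above
    ("udp", "224.0.0.102", 1985): "HSRPv2",  # line 20 of the ACL above
    ("udp", "224.0.0.102", 3222): "GLBP",    # not in the ACL above
    ("vrrp", "224.0.0.18", None): "VRRP",    # IP protocol 112, no ports
}


def classify_fhrp(protocol, dst_ip, dst_port=None):
    """Return the FHRP name for a hello packet, or None if it is not one."""
    return FHRP_HELLOS.get((protocol, dst_ip, dst_port))


print(classify_fhrp("udp", "224.0.0.2", 1985))
print(classify_fhrp("udp", "224.0.0.102", 3222))
print(classify_fhrp("udp", "10.1.1.1", 1985))
```

Note that HSRPv2 and GLBP share a multicast address, so the port is what tells them apart.
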
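
This access-list is then applied as a VACL to filter the FHRP hellos from entering the OTV through the internal interfaces. The VACL looks like below, where we’ll filter HSRP on VLANs 16 and 23. Note that the second access-map entry references an access list named ALL, which is not shown here and needs to be defined to match all remaining IP traffic.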

vlan access-map HSRP_Local 10
match ip address HSRP_IP
action drop
vlan access-map HSRP_Local 20
match ip address ALL
action forward
vlan filter HSRP_Local vlan-list 16,23

If you are like me and want to verify your VACL is applied and matching, the steps are not as easy as we’d like them to be, but the capability does exist. *NOTE* that I am not responsible for you monkeying around with any of the other commands available when you attach to the module. You’ve been warned. :)

The first thing to do is attach to the module where your internal interfaces physically are. In the example below, it’s module 1. If your OTV is configured in a non-default VDC, you’ll need to set the parser to use that VDC as below.

champs1# attach mod 1
Attaching to module 1 ...
To exit type 'exit', to abort type '$.'
module-1# vdc 3
module-1# show system internal access-list input statistics
VLAN 16 :
=========
Tcam 1 resource usage:
----------------------
Label_b = 0x806
Bank 0
------
IPv4 Class
Policies: VACL(HSRP_Local) [Merged]
Entries:
[Index] Entry [Stats]
---------------------
[0013] deny udp 0.0.0.0/0 224.0.0.102/32 eq 1985 [1863]
[0014] deny udp 0.0.0.0/0 224.0.0.2/32 eq 1985 [4121]

[0015] permit ip 0.0.0.0/0 0.0.0.0/0 [1766386]

VLAN 23 :
=========
Tcam 1 resource usage:
----------------------
Label_b = 0x806
Bank 0
------
IPv4 Class
Policies: VACL(HSRP_Local) [Merged]
Entries:
[Index] Entry [Stats]
---------------------
[0013] deny udp 0.0.0.0/0 224.0.0.102/32 eq 1985 [1863]
[0014] deny udp 0.0.0.0/0 224.0.0.2/32 eq 1985 [4121]
[0015] permit ip 0.0.0.0/0 0.0.0.0/0 [1766386]


With this configuration, the FHRP in each data center will be locally active and mitigate the tromboning we mentioned earlier. This has a significant impact in that now we only send traffic across the Data Center Interconnect (DCI) that needs to go across as the local routers in each site can service the traffic.

Note that this technique is useful for optimizing egress traffic but does nothing to help draw or “attract” traffic into the right data center. Other technologies that provide that functionality will be the topic of future blogs. ;)

One last step to undertake when performing FHRP isolation is to exclude the FHRP MAC addresses from being advertised by OTV. You might be thinking OTV won't know about the FHRP MACs because of the VACL, right? Wrong. :) Due to the nature of MAC address learning, OTV will learn about the MAC addresses before the VACL drops them, so we need to tell OTV to not advertise them. This is a three-part process where we'll define the MAC list, add it to a route-map and then apply it to the OTV IS-IS process as shown below.

mac-list OTV_HSRP seq 10 deny 0000.0c07.ac00 ffff.ffff.ff00
mac-list OTV_HSRP seq 11 deny 0000.0c9f.f000 ffff.ffff.ff00
mac-list OTV_HSRP seq 15 deny 0100.5e00.0000 ffff.ffff.ff00
mac-list OTV_HSRP seq 20 permit 0000.0000.0000 0000.0000.0000


route-map OTV_HSRP_filter permit 10
match mac-list OTV_HSRP

otv-isis default
vpn Overlay0
redistribute filter route-map OTV_HSRP_filter
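
If the address/mask pairs in the mac-list look cryptic, this Python sketch walks the same four entries in order. It assumes the usual semantics, where a 1 bit in the mask means that bit must match and the first matching entry wins. The helper names are mine, and the implicit deny at the end is an assumption.

```python
def mac_to_int(mac):
    """Convert a dotted MAC like 0000.0c07.ac01 to an integer."""
    return int(mac.replace(".", ""), 16)


def mac_matches(mac, pattern, mask):
    """True if mac equals pattern on every bit that is set in mask."""
    m = mac_to_int(mask)
    return (mac_to_int(mac) & m) == (mac_to_int(pattern) & m)


# The entries from the mac-list above, in sequence order.
MAC_LIST = [
    ("deny", "0000.0c07.ac00", "ffff.ffff.ff00"),
    ("deny", "0000.0c9f.f000", "ffff.ffff.ff00"),
    ("deny", "0100.5e00.0000", "ffff.ffff.ff00"),
    ("permit", "0000.0000.0000", "0000.0000.0000"),
]


def advertised(mac):
    """True if the first matching entry permits the MAC."""
    for action, pattern, mask in MAC_LIST:
        if mac_matches(mac, pattern, mask):
            return action == "permit"
    return False  # assumed implicit deny when nothing matches


print(advertised("0000.0c07.ac1f"))  # HSRPv1 virtual MAC, group 31
print(advertised("0026.980f.d9c2"))  # an ordinary MAC
```

One thing the sketch makes visible: with a mask of ffff.ffff.ff00, only the last byte is a wildcard, so the HSRPv2 entry covers virtual MACs ending in f000 through f0ff. If you run HSRPv2 group numbers above 255 you would likely want to double check that entry.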


We’ll cover AED election and some other fun topics in the next post (hopefully sooner rather than later).

As always, your comments and feedback are appreciated!

Sunday, February 20, 2011

OTV Deep Dive - Part Two

Now that we've covered OTV theory and nomenclature, let's dig in to the fun stuff and talk about the CLI and what OTV looks like when it's set up. We'll be using the topology below, comprised of four Nexus 7000s and eight VDCs.

[Figure: OTV lab topology with four Nexus 7000s and eight VDCs]

We'll focus first on the minimum configuration required to get basic OTV adjacency up and working and then add in multi-homing for redundancy. First, make sure the L3 network that OTV will be traversing is multicast enabled. Today with current shipping code, neighbor discovery is done via multicast which helps facilitate easy additions and removal of sites from the OTV network. With this requirement met, we can get rolling.

A simple initial config is below and we'll dissect it.

First, we enable the feature
feature otv

Then we configure the Overlay interface
interface Overlay1

Next we configure the join interface. This is the interface that will be used for the IGMP join and will be the source IP address of all packets after encapsulation.
otv join-interface Ethernet1/7.1

Now we'll configure the control group. As its name implies the control group is the multicast group used by all OTV speakers in an Overlay network. This should be a unique multicast group in the multicast network.
otv control-group 239.192.1.1

Then we configure the data group which is used to encapsulate any L2 multicast traffic that is being extended across the Overlay. Any L3 multicast will be routed off of the VLAN through whatever regular multicast mechanisms exist on the network.
otv data-group 239.192.2.0/24

Next to last bare minimum config to add is the list of VLANs to be extended.
otv extend-vlan 31-33,100,1010,1088-1089

Finally, no shut to enable the interface.
no shutdown


We can now look at the Overlay interface but honestly, won't see much. Force of habit after a no shut on an interface. :)

show int o1
Overlay1 is up
BW 1000000 Kbit
Last clearing of "show interface" counters never
RX
0 unicast packets 77420 multicast packets
77420 input packets 574 bits/sec 0 packets/sec
TX
0 unicast packets 0 multicast packets
0 output packets 0 bits/sec 0 packets/sec

If we configure the other hosts in our network and multicast is working, we'll see adjacencies form as below.

champs1-OTV# show otv adj

Overlay Adjacency database

Overlay-Interface Overlay1 :
Hostname System-ID Dest Addr Up Time State
champs2-OTV 001b.54c2.41c4 10.100.251.14 2d05h UP
fresca-OTV 0026.9822.ea44 10.100.251.78 2d05h UP
pepsi-OTV f866.f206.fd44 10.100.251.82 2d05h UP

champs1-OTV#


With this in place, we now have a basic config and will be able to extend VLANs between the four devices.

The last thing we'll cover in this post is how multi-homing can be enabled. First, to level set: by multi-homing in this context I'm referring to the ability to have redundancy in each site and not have a crippling loop.

The way this is accomplished in OTV is by the use of the concept of a site VLAN. The site VLAN is a VLAN that's dedicated to OTV and NOT extended across the Overlay but is trunked between the two OTV edge devices. This VLAN doesn't need any IP addresses or SVIs created, it just needs to exist and be added to the OTV config as shown below.

otv site-vlan 99

With the simple addition of this command the OTV edge devices will discover each other locally and then use an algorithm to determine a role each edge device will assume on a per VLAN basis. This role is called the Authoritative Edge Device (AED). The AED is responsible for forwarding all traffic for a given VLAN including broadcast and multicast traffic. Today the algorithm aligns with the VLAN ID with one edge device supporting the odd numbered VLANs and the other supporting the even numbered VLANs. This can be seen by reviewing the output below.

champs1-OTV# show otv vlan


OTV Extended VLANs and Edge Device State Information (* - AED)

VLAN Auth. Edge Device Vlan State Overlay
---- ----------------------------------- ---------- -------
31* champs1-OTV active Overlay1
32 champs2-OTV inactive(Non AED) Overlay1
33* champs1-OTV active Overlay1

1000 champs2-OTV inactive(Non AED) Overlay1
1010 champs2-OTV inactive(Non AED) Overlay1
1088 champs2-OTV inactive(Non AED) Overlay1
1089* champs1-OTV active Overlay1


If we look at the output above we can see that this edge device is the AED for VLANs 31, 33 and 1089 and is the non-AED for 32, 1000, 1010 and 1088. In the event of a failure of champs2, champs1 will take over and become the AED for all VLANs.
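
The odd/even split is easy to model. This Python sketch expands an extend-vlan style list and assigns each VLAN to an edge device by parity. Which device ends up with the odd VLANs is simply taken from the output above, and the function names are hypothetical, so treat this as an illustration of the behavior rather than the actual election logic.

```python
def expand_vlans(spec):
    """Expand a list like '31-33,100,1088-1089' into individual VLAN IDs."""
    vlans = []
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            vlans.extend(range(int(low), int(high) + 1))
        else:
            vlans.append(int(part))
    return vlans


def aed_for(vlan, odd_device, even_device):
    """Pick the AED for a VLAN by parity, as observed in the output above."""
    return odd_device if vlan % 2 else even_device


for vlan in expand_vlans("31-33,100,1010,1088-1089"):
    print(vlan, aed_for(vlan, "champs1-OTV", "champs2-OTV"))
```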

We'll explore FHRP localization and what happens across the OTV control group in the next post. As always, your thoughts, comments and feedback are welcome.

Wednesday, February 16, 2011

OTV Deep Dive - Part One

I've been meaning to do this for a long time and now that I have the blog and am awake in the hotel room at 3AM, what better thing to do than talk about a technology I've been fortunate enough to work with for almost a year. This will be a series of posts as I'd like to take a structured approach to the technology and dig into the details and mechanics as well as operational aspects of the technology.

Overlay Transport Virtualization (OTV) is a feature available on the Nexus 7000 series switches that enables extension of VLANs across Layer 3 networks. This enables new options of data center scale and design that have not been available in the past. The two common use cases I've worked with customers to implement include data center migration and workload mobility. Interestingly, many jump to a multiple physical data center scenario and start to consider stretched clusters and worry about data sync issues and while OTV can provide value in those scenarios it also is a valid solution inside the data center where L3 interconnects may segment the network but the need for mobility is present.

OTV is significant in its ability to provide this extension without the hassles and challenges associated with traditional Layer 2 extension such as merging STP domains, MAC learning and flooding. OTV is designed to drop STP BPDUs across the Overlay interface, which means STP domains on each side of the L3 network are not merged. This is significant in that it minimizes fate sharing where a STP event in one domain ripples to other domains. Additionally, OTV uses IS-IS as its control plane to advertise MAC addresses and provide capabilities such as loop avoidance and optimized traffic handling. Finally, OTV doesn't have state that needs to be maintained, as is required with pseudowire transports like EoMPLS and VPLS. OTV is an encapsulating technology and as such adds a 42 byte header to each frame transported across the Overlay. Below is the frame format in more detail.

[Figure: OTV frame format]

We'll start by defining the components and interfaces used when discussing OTV. Refer to the topology below.

[Figure: OTV reference topology]

We have a typical data center aggregation layer based on Nexus 7000 which is our boundary between Layer 2 and Layer 3. The two switches, Agg1 and Agg2 utilize a Nexus technology, virtual Port Channel (vPC) to provide multi-chassis Etherchannel (MCEC) to the OTV Edge devices. In this topology, the OTV edge devices happen to be Virtual Device Contexts (VDC) that share the same sheet metal as the Agg switches but are logically separate. We'll dig into VDCs more in future blog posts, but know that VDCs are a very, very powerful feature within NX-OS on the Nexus 7000.

Three primary interfaces are used in OTV. The internal interface as its name implies is internal to OTV and is where the VLANs that are to be extended are brought in to the OTV network. These are normal Ethernet interfaces running at Layer 2 and can be trunks or access ports depending on your network's needs. It is important to note that the internal interfaces *DO* participate in STP and as such, considerations such as rootguard and appropriate STP prioritization should be taken into account. In most topologies you wouldn't want, or need the OTV edge device to be the root though if that works in your topology, OTV will work as desired.

The next interface is the join interface which is where the encapsulated L2 frames are placed on the L3 network for transport to the appropriate OTV edge device. The join interface has an IP address and behaves much as a client in that it issues IGMP requests to join the OTV multicast control group. In some topologies it is desirable to have the join interface participate in a dynamic routing protocol and that is not a problem either. As mentioned earlier, OTV encapsulates traffic and adds a 42 byte header to each packet so it may be prudent to ensure your transit network can support packets larger than 1500 bytes. Though not a requirement, performance may suffer if jumbo frames are not supported.
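
The overhead math is simple but worth writing down. This Python sketch adds the 42 byte header mentioned above to a packet size and checks it against a transit MTU. The function names and the 9216 byte example MTU are assumptions for illustration only.

```python
OTV_OVERHEAD_BYTES = 42  # header size stated in this post


def encapsulated_size(packet_bytes):
    """Size of a packet after the OTV header is added."""
    return packet_bytes + OTV_OVERHEAD_BYTES


def fits_transit(packet_bytes, transit_mtu):
    """True if the encapsulated packet fits within the transit MTU."""
    return encapsulated_size(packet_bytes) <= transit_mtu


print(encapsulated_size(1500))   # 1542
print(fits_transit(1500, 1500))  # False
print(fits_transit(1500, 9216))  # True
```

So a full-size 1500 byte packet becomes 1542 bytes on the wire side of the join interface, which is why a transit network limited to 1500 bytes deserves a close look.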

Finally, the Overlay interface is where OTV specific configuration options are applied to define key attributes such as multicast control groups, VLANs to be extended and join interfaces. The Overlay interface is where the (in)famous 5 commands to enable OTV are entered, though anyone who's worked with the technology recognizes more than 5 commands are needed for a successful implementation. :) The Overlay interface is similar to a Loopback interface in that it's a virtual interface.

In the next post, we'll discuss the initial OTV configuration and multi-homing capabilities in more detail. As always, I welcome your comments and feedback.
