Tuesday, April 5, 2016

Route Summarization from Distribution to Core



This post is about a particular problem being faced once route summarization is introduced and how it can be avoided.

This concept is covered very well in these books from Cisco Press:



In topology diagram 1 above, C1 and C2 are the core routers, and D1 and D2 are the distribution routers.



How is route summarization configured in this topology?




As shown in diagram 2 above, route summarization is configured on the distribution routers D1 and D2 towards the core routers C1 and C2.
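As a minimal sketch, assuming EIGRP as the routing protocol (the post does not name one) and hypothetical interface names, AS number, and summary prefix, the summarization on D1 towards the core might look like this:

```
! On distribution router D1 (EIGRP AS 100, summary 10.1.0.0/16 - example values)
interface GigabitEthernet0/0
 description Uplink to core router C1
 ip summary-address eigrp 100 10.1.0.0 255.255.0.0
!
interface GigabitEthernet0/1
 description Uplink to core router C2
 ip summary-address eigrp 100 10.1.0.0 255.255.0.0
```

D2 would carry the equivalent configuration on its own uplinks, so both core routers receive only the summary, not the specific prefixes behind the distribution layer.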

For the sake of this discussion, we will only consider traffic flowing from core routers C1 and C2 towards destination router D.

Forwarding of traffic under normal operations:

C1 can forward traffic to both D1 and D2 in order to reach router D.
C2 can likewise forward traffic to both D1 and D2 in order to reach router D.
Depending on where the traffic arrives, D1 or D2 forwards it on towards router D.

Now let us see what happens when the link between D2 and router D goes down.


Failure Scenario:

Diagram 3 above shows the link between router D2 and router D going down.

The summary route covering the prefixes advertised by router D will still be present on core routers C1 and C2, because the distribution routers D1 and D2 continue to advertise the summarized route towards the core.

When traffic from the core routers destined to router D hits distribution router D2, it is black-holed: D2 cannot forward the traffic towards router D because its link towards router D is down.


How can this be avoided?

You will have to introduce a layer 3 link between the distribution routers D1 and D2, and this new link must not have any route summarization configured over it.

Refer to diagram 4 above.
With the same failure as before, when traffic from core routers C1 and C2 destined to router D reaches D2, router D2 can now forward the traffic towards D1.
This is because D2 learns the specific prefixes advertised by router D over the back-to-back link between D1 and D2.
D1 then forwards the traffic on towards router D.
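A sketch of that inter-distribution link on D1, again assuming EIGRP and example addressing. The key point is that no summary-address is applied on this interface, so D1 and D2 exchange the specific prefixes:

```
! On D1 - back-to-back layer 3 link towards D2 (addresses are examples)
interface GigabitEthernet0/2
 description Layer 3 link to D2 - NO summarization here
 ip address 10.255.0.1 255.255.255.252
! No "ip summary-address" on this interface, so the specific
! prefixes advertised by router D are exchanged between D1 and D2
```

When the D2-to-D link fails, D2 still holds the specific route to D's prefixes via D1 over this link and can reroute instead of black-holing.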

Sunday, March 6, 2016

Always Avoid Single Points of Failure (SPOFs)


I was told recently that organization XYZ suffered an outage because one of their core devices had no redundancy.
In other words, there was a single point of failure somewhere in their network, and their technical team kept firefighting until the issue was resolved.


This post is about avoiding this nuisance called the SPOF, or Single Point of Failure.
How are single points of failure avoided?
Let's quickly go through the technologies that provide high availability.

A. Introduce device clustering
==============================

Examples are
1. Stacking switches
These could be Cisco 3750 switches stacked at the access layer, or a Catalyst 6500 VSS pair, which works well at the distribution layer.

2. Active / Passive Firewalls
or Active / Active Firewalls

Active/passive is a straightforward concept in which one active firewall in the cluster handles all the data traffic while the other device sits idle, ready to take over once the active unit fails.
Active/active is where one device is made primary for half of the security contexts and the other device is primary for the rest of the security contexts.
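As a hedged sketch, active/standby failover on a Cisco ASA pair looks roughly like the fragment below on the primary unit; the failover-link interface name and IP addresses are made-up example values:

```
! Primary ASA - active/standby failover (names and addresses are examples)
failover lan unit primary
failover lan interface FOLINK GigabitEthernet0/3
failover interface ip FOLINK 192.168.100.1 255.255.255.252 standby 192.168.100.2
failover
```

The secondary unit is configured with `failover lan unit secondary` and the same failover-link settings, after which it pulls its configuration from the active unit.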


3. Virtual Port-channel
The Cisco Nexus family supports the vPC (virtual Port-channel) feature.
Two Nexus 5000 or two Nexus 7000 switches can be configured as vPC peers.
This feature simplifies the layer 2 topology because it removes blocking at layer 2.

Blocked ports are eliminated because the two vPC peers appear to the attached device as a single logical switch, so both uplinks are bundled into one port-channel, and that single port-channel is entirely in the forwarding state.
Bandwidth-wise this is better because you get throughput from both links, a considerable improvement over traditional layer 2 networks, where one link would be forwarding and the redundant link would sit in blocking state.

One more benefit of vPC is the way it behaves with FHRPs such as HSRP and VRRP.
vPC interaction with FHRPs ensures that both vPC peers can forward traffic northbound; traffic hitting the HSRP standby node need not cross the vPC peer link.

Last but not least, vPCs can also be double-sided, where the northbound and southbound devices each form a vPC towards the other end.
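A minimal NX-OS sketch of the vPC pieces on one peer; the domain ID, keepalive addresses, and port-channel numbers are example values:

```
! On each Nexus vPC peer (values are examples; mirror on the other peer)
feature vpc
vpc domain 10
  peer-keepalive destination 192.168.1.2 source 192.168.1.1
!
interface port-channel1
  switchport mode trunk
  vpc peer-link
!
! Member port-channel towards a downstream switch or server
interface port-channel20
  switchport mode trunk
  vpc 20
```

The downstream device simply sees port-channel 20 as one ordinary LACP bundle to a single upstream switch.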


B. Interface redundancy
========================
Protocols like LACP and PAgP allow interfaces to be combined into a port-channel.
This again avoids layer 2 blocked ports; the aggregated logical interface forwards traffic.
Even in the SAN world, you can combine two interfaces into a single logical interface and trunk VSANs over that logical link.

What this means is increased throughput/speed.

You could also use port-channel hashing methods like src-dst-ip or src-dst-mac to influence load sharing over bundled interfaces.
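A short IOS sketch of both ideas together, with example interface and port-channel numbers: two interfaces bundled via LACP, plus a global hashing method for load sharing:

```
! Bundle two uplinks into Port-channel1 using LACP (mode active)
interface range GigabitEthernet0/1 - 2
 channel-group 1 mode active
!
interface Port-channel1
 switchport mode trunk
!
! Hash flows across member links on source/destination IP pair
port-channel load-balance src-dst-ip
```

Note the hashing method shares load per flow, not per packet, so a single large flow still rides one member link.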


C. Redundancy at application level.
=====================================
Here the application is hosted on multiple servers, and those servers are presented through a virtual IP on a load balancer.
A client request hits the virtual IP of the load balancer, and the load balancer forwards the traffic to pool members based on the configured load-balancing algorithm.
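As an illustration, a minimal HAProxy configuration fragment implementing a VIP in front of a two-server pool; the names and addresses are made up for the example:

```
# Frontend listens on the virtual IP; backend holds the pool members
frontend app_vip
    bind 203.0.113.10:80
    default_backend app_pool

backend app_pool
    balance roundrobin
    server web1 10.0.0.11:80 check
    server web2 10.0.0.12:80 check
```

The `check` keyword enables health checks, so a failed pool member is taken out of rotation automatically.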


D. Redundancy for server network interfaces.
============================================
Server interfaces can be bundled in active/active or active/passive fashion, with the IP address assigned to the logical bundled interface.
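For example, on a Linux server two NICs can be bonded active/passive (mode `active-backup`) with the IP on the bond interface; a sketch in Debian-style `/etc/network/interfaces` syntax, with example interface names and addressing:

```
# /etc/network/interfaces sketch - eth0/eth1 bonded active/passive
auto bond0
iface bond0 inet static
    address 10.0.0.20
    netmask 255.255.255.0
    bond-slaves eth0 eth1
    bond-mode active-backup
    bond-miimon 100
```

With `active-backup`, only one NIC carries traffic at a time; `bond-miimon 100` polls link state every 100 ms so failover to the standby NIC is quick.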


E. Redundancy at router level
==============================
SVIs created at the distribution layer can be combined with FHRPs like HSRP, VRRP, and GLBP.
These FHRPs provide multiple gateways that are physically redundant.
To share load, you could make odd-numbered VLANs active on one gateway and even-numbered VLANs active on the other.
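A sketch of that odd/even HSRP split on the first distribution switch, with example VLANs and addressing; the higher `standby priority` plus `preempt` makes this switch the active gateway for the odd VLAN only:

```
! Distribution switch 1 - active gateway for VLAN 11, standby for VLAN 12
interface Vlan11
 ip address 10.11.0.2 255.255.255.0
 standby 11 ip 10.11.0.1
 standby 11 priority 110
 standby 11 preempt
!
interface Vlan12
 ip address 10.12.0.2 255.255.255.0
 standby 12 ip 10.12.0.1
! Switch 2 mirrors this, raising priority on VLAN 12 instead
```

Hosts in each VLAN point at the virtual IP (10.11.0.1 or 10.12.0.1), so a gateway failure is transparent to them.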