Friday 19 June 2015

MPLS CASE STUDY: WHY ROUTE SUMMARIZATION IS NOT RECOMMENDED ON LOOPBACKS IN AN MPLS ENVIRONMENT - NETWORK

MPLS generates an Implicit Null label (the label with a value of 3) for directly connected interfaces and for summary routes. A Label Switch Router (LSR) generates this label and advertises it to its directly connected peers, which install it with a Pop action. The advantage of this label is that it improves the performance of the destination router, because the topmost label has already been removed by the penultimate router (the router before the destination router).
The question that arises here is: what happens when all the IGP loopbacks are advertised as a single network in order to reduce the number of routes advertised? Will traffic be dropped as a result of the Penultimate Hop Popping (PHP) logic, or will traffic forwarding still work as it is supposed to?
This article focuses on the impact of route summarization on loopback addresses in an MPLS environment and examines the available workarounds for the problems that summarization causes.
Readers seeking more information on MPLS IP VPN Networks, and how they work, can refer to our article: Understanding MPLS IP VPNs, Security Attacks and VPN Encryption. The article covers basic MPLS concepts and explains how MPLS IP VPNs work.

UNDERSTANDING MPLS LABELS: PUSH / POP / SWAP / PHP

When talking about MPLS environments we often come across the terms PUSH, SWAP, POP and PHP. Below is a brief explanation of what each term means:
  1. PUSH: Adding a label to an incoming packet. Also known as label imposition.
  2. SWAP: Replacing the incoming label with a different outgoing label.
  3. POP: Removing the label from an outgoing packet. Also known as label disposition.
  4. PHP: Stands for Penultimate Hop Popping. It refers to the process whereby the outermost label of an MPLS-tagged packet is removed by a Label Switch Router (LSR) before the packet is passed to an adjacent Label Edge Router (LER).
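
The four operations above can be sketched as operations on a label stack. This is only an illustrative model, not router code: the Packet class, the forward() helper and the label values 100, 200 and 201 are hypothetical. The one real constant is the Implicit Null value of 3.

```python
from dataclasses import dataclass, field
from typing import List

IMPLICIT_NULL = 3  # reserved label value meaning "pop the label before sending to me"


@dataclass
class Packet:
    """An IP payload plus an MPLS label stack (last list element = top label)."""
    payload: str
    labels: List[int] = field(default_factory=list)


def push(packet: Packet, label: int) -> None:
    """PUSH (label imposition): add a label on top of the stack."""
    packet.labels.append(label)


def swap(packet: Packet, new_label: int) -> None:
    """SWAP: replace the top label with the outgoing label."""
    packet.labels[-1] = new_label


def pop(packet: Packet) -> int:
    """POP (label disposition): remove and return the top label."""
    return packet.labels.pop()


def forward(packet: Packet, outgoing_label: int) -> None:
    """Swap normally; pop if the downstream router advertised Implicit Null (PHP)."""
    if outgoing_label == IMPLICIT_NULL:
        pop(packet)
    else:
        swap(packet, outgoing_label)


pkt = Packet("customer IP packet")
push(pkt, 100)               # ingress LER imposes the VPN label (hypothetical value)
push(pkt, 200)               # ingress LER imposes the IGP label (hypothetical value)
forward(pkt, 201)            # a core LSR swaps the IGP label
forward(pkt, IMPLICIT_NULL)  # the penultimate LSR pops it (PHP)
print(pkt.labels)            # only the VPN label is left for the egress LER
```

After PHP the egress LER receives the packet with the VPN label on top, so it needs only one lookup instead of two.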

REQUIREMENTS FOR OUR TEST ENVIRONMENT

This article assumes that the reader is familiar with MPLS VPN environments and has experience with MPLS and routing protocols. It is recommended that you understand functions such as Penultimate Hop Popping (PHP) and Double Penultimate Hop Popping Lookup.


UNDERSTAND THE CURRENT TOPOLOGY - EXAMPLE

As shown in the figure below, the service provider network is based upon a three-tier architecture. It has three layers: Core, Distribution and Access.
Figure 1. Cisco MPLS route summarization
The Core Layer is usually referred to as Tier 1, the Distribution Layer as Tier 2 and the Access Layer as Tier 3. The Core Layer connects the different areas with each other and is largely responsible only for forwarding traffic. The Distribution Layer is where areas are segregated and where area summarization is performed. Customers are terminated at the Access Layer, i.e. Tier 3.
  • Tier one consists of the core routers, which participate only in Area 0
  • Tier two is directly connected to Area 0 and to the local area
  • Tier three is connected to Tier two and does not directly connect to Tier one. Tier three participates only in the local area, and customers connect at this layer.
The same model is used for all locations. Every Tier 2 site is allocated a /16 pool. If you're wondering why /16, it is because it allows the local area (Area 1) to be summarized toward Area 0. With the summary in place, only a single route enters Area 0, and flaps inside the local area no longer take part in SPF (shortest path first) calculations in Area 0.
In figure 1, PUNE (router name) is a Tier two PoP that aggregates all links arriving from Tier 3 or from local PUNE. An IP pool of 10.1.0.0/16 is allocated to the PUNE provisioning team, and this pool is further divided into 256 networks of /24, for example:
  • 10.1.1.0/24
  • 10.1.2.0/24
  • 10.1.3.0/24
Each PoP is allocated one /24. 10.1.255.0/24 is reserved for loopback addresses, and 10.1.253.0/24 and 10.1.252.0/24 are reserved for WAN addresses.
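
The addressing plan can be checked with a few lines of Python using the standard ipaddress module. This is a sketch for illustration only; the variable names are ours. It shows that the /16 splits into 256 /24 pools and that the loopback pool sits inside the /16, so any summary of the whole pool also covers the loopbacks.

```python
import ipaddress

# The pool allocated to PUNE and the loopback pool reserved inside it
pool = ipaddress.ip_network("10.1.0.0/16")
loopback_pool = ipaddress.ip_network("10.1.255.0/24")

# Split the /16 into /24 pools, one per PoP
pop_pools = list(pool.subnets(new_prefix=24))

print(len(pop_pools))                  # number of /24 pools in the /16
print([str(n) for n in pop_pools[1:4]])  # 10.1.1.0/24, 10.1.2.0/24, 10.1.3.0/24
print(loopback_pool.subnet_of(pool))   # True: a 10.1.0.0/16 summary covers the loopbacks
```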

REQUIREMENTS OF POP

Along with the /24 pool given to each PoP (Point of Presence, where all the devices are physically installed), each PoP router is given a /32 IP address from the 10.1.255.0/24 pool. When routes are advertised in MP-BGP, labels are required only for the loopbacks of the PoP routers and not for the entire subnet. The reason is that forwarding checks only the next hop, which is simply the loopback address of the PoP router. In other words, LDP only has to provide labels for the loopback addresses.
Note: That's why labels are always required for loopback addresses and not for other routes such as WAN and LAN addresses.

PERFORMING ROUTE SUMMARIZATION

The Pune router is configured as the ABR (Area Border Router, with one interface connected to Area 0 and another connected to Area 1). To reduce the number of routes in the backbone area (Area 0), the Pune router (ABR) has to summarize the Area 1 networks it advertises into Area 0.
Before this summarization took place everything worked fine. However, after the Pune router advertised its summary route towards Area 0, connectivity was lost.
Customers started complaining that their VPN network was not working. They were not able to access their VPNs across the country.
No changes had been made in the network except the summarization. The entire network was reachable except the customers' VPN networks. Does that mean summarization brought the customers' networks down? How could that be possible?

UNDERSTANDING WHY CONNECTIVITY WAS LOST

Let us examine why the customer VPN networks went down as soon as the ABR performed its route summarization and advertised the summary.
A router advertises Implicit Null, and so triggers Penultimate Hop Popping, only for its directly connected and summarized routes. This means every router gives an implicit null to the adjacent router for its own loopback address. In figure 1, router T-PE2 gives an implicit null to router T-PE1 for its loopback 10.1.255.2. As a result, when a packet destined for 10.1.255.2 arrives at router T-PE1, the upper label (the IGP label) is removed by T-PE1.
The packet, which now carries a single label (the VPN label), is forwarded toward T-PE2. Once T-PE2 receives the packet, it performs the lookup for that VPN label, removes the VPN label and forwards the packet out of one of its connected interfaces as a pure IP packet.
Now let's explain what happened when summarization was performed at the ABR.
On the ABR, summarization is performed for the 10.1.0.0/16 pool, which also includes the loopback addresses. As soon as the summary was announced by the ABR, an implicit null was announced for it to the directly connected peers in Area 0.

Traffic originating from Core-3 and Core-4 with a destination of T-PE2 must pass through the Pune router. But due to summarization, the Pune router is announcing 10.1.0.0/16 with implicit null to all its peers connected in Area 0. Because the core routers now reach 10.1.255.2 only through the summary, they use the label bound to 10.1.0.0/16 and pop the IGP label before handing the packet to Pune.

At this point the VPN packets destined for Area 1, originating from Core-3 and Core-4 and received by the Pune router, have only a single label, the VPN label (the Pune router receives VPN packets with a single label due to PHP). As soon as the VPN packets destined for T-PE2 arrive at the Pune ABR with only the VPN label, they are dropped, because that VPN label is not known to the Pune ABR. The Pune router behaves the same way for all such traffic, so the traffic ends up blackholed at the Pune router.
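
The failure can be reproduced with a small model. This is a simplified sketch, not a router implementation: the label values 300 and 9001, the function names and the tables are all hypothetical, and only the Implicit Null value of 3 and the addresses come from this article. The core picks the label bound to the longest prefix that matches the BGP next hop 10.1.255.2, and Pune can forward only labels present in its own forwarding table.

```python
import ipaddress

IMPLICIT_NULL = 3
VPN_LABEL = 9001                     # hypothetical label allocated by T-PE2
PUNE_LFIB = {300: "toward T-PE1"}    # hypothetical: labels Pune itself allocated


def label_toward(next_hop, bindings):
    """Return the label bound to the longest prefix that matches next_hop."""
    addr = ipaddress.ip_address(next_hop)
    matches = [ipaddress.ip_network(p) for p in bindings
               if addr in ipaddress.ip_network(p)]
    best = max(matches, key=lambda net: net.prefixlen)
    return bindings[str(best)]


def stack_arriving_at_pune(bindings_from_pune):
    """Label stack (top label last) as sent by a core router to Pune."""
    stack = [VPN_LABEL]
    igp_label = label_toward("10.1.255.2", bindings_from_pune)
    if igp_label != IMPLICIT_NULL:   # implicit null: the core pops the IGP label (PHP)
        stack.append(igp_label)
    return stack


def pune_can_forward(stack):
    """Pune looks up the top label; an unknown label means the packet is dropped."""
    return stack[-1] in PUNE_LFIB


before = {"10.1.255.2/32": 300}              # loopback advertised as a /32
after = {"10.1.0.0/16": IMPLICIT_NULL}       # only the summary is advertised

for name, bindings in (("before summary", before), ("after summary", after)):
    stack = stack_arriving_at_pune(bindings)
    result = "forwarded" if pune_can_forward(stack) else "dropped"
    print(name, stack, result)
```

Before the summary the packet reaches Pune with the IGP label on top and is switched on toward T-PE1. After the summary it arrives with only the VPN label, which Pune never allocated, so the lookup fails.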

WORKAROUNDS


Workaround No.1 - Perform the summary which excludes your loopback addresses
For example, 10.16.0.0/16 is the major pool, which has 256 subnets: 10.16.0.0/24, 10.16.1.0/24, 10.16.2.0/24, 10.16.3.0/24 and so on up to 10.16.255.0/24.
Out of the 256 subnets, we can reserve the first few for loopback addresses, namely 10.16.0.0/24, 10.16.1.0/24, 10.16.2.0/24, 10.16.3.0/24, 10.16.4.0/24, 10.16.5.0/24, 10.16.6.0/24 and 10.16.7.0/24. The rest of the /24 pools, from 10.16.8.0/24 onwards, can be used for the backbone addresses. In this case the loopback pools do not need to be summarized, while we are still able to summarize the rest of the pools.
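
As an illustration of Workaround No.1, the standard Python ipaddress module can compute which summaries cover everything from 10.16.8.0/24 onwards without touching the loopback subnets. This is a planning sketch under the addressing assumptions above, not router configuration, and the variable names are ours.

```python
import ipaddress

major_pool = ipaddress.ip_network("10.16.0.0/16")

# The eight /24 subnets reserved for loopbacks: 10.16.0.0/24 .. 10.16.7.0/24
loopback_subnets = [ipaddress.ip_network(f"10.16.{i}.0/24") for i in range(8)]

# They collapse into one contiguous block
loopback_block = list(ipaddress.collapse_addresses(loopback_subnets))[0]
print(loopback_block)

# Everything in the major pool except the loopback block,
# expressed as the smallest set of summary prefixes
summaries = sorted(major_pool.address_exclude(loopback_block))
for net in summaries:
    print(net, "overlaps loopbacks:", net.overlaps(loopback_block))
```

Under these assumptions the backbone space can be summarized with five prefixes, none of which covers a loopback address.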

Workaround No.2 - Make use of a different pool of IP addresses that will never participate in summarization.
Our second workaround involves reserving Network 10.18.0.0/16 for the loopback addresses and assigning Network IDs 10.17.0.0/16 and 10.16.0.0/16 to the backbone addresses. In this case we can perform the summarization on 10.16.0.0/16 and 10.17.0.0/16 pools, excluding Network ID 10.18.0.0/16.
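
A quick sanity check for Workaround No.2 is to confirm that no planned summary overlaps the dedicated loopback pool. The sketch below uses the standard Python ipaddress module. The helper name safe_to_summarize is hypothetical, and the 10.16.0.0/14 prefix is included only as an example of a summary that would be too wide.

```python
import ipaddress

loopback_pool = ipaddress.ip_network("10.18.0.0/16")
backbone_pools = [
    ipaddress.ip_network("10.16.0.0/16"),
    ipaddress.ip_network("10.17.0.0/16"),
]


def safe_to_summarize(summary, protected=loopback_pool):
    """A summary is safe only if it covers no address of the loopback pool."""
    return not summary.overlaps(protected)


# Each backbone pool can be summarized on its own
print([safe_to_summarize(net) for net in backbone_pools])

# The two backbone pools are adjacent, so they also collapse into one prefix
combined = list(ipaddress.collapse_addresses(backbone_pools))[0]
print(combined, safe_to_summarize(combined))

# A wider summary would swallow the loopback pool again
too_wide = ipaddress.ip_network("10.16.0.0/14")
print(too_wide, safe_to_summarize(too_wide))
```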
