Imagine that you have two or more sites which you want to connect together using MPLS technology. You cannot afford dark fiber and your Service Provider cannot offer you L2 connections of any kind. The only thing your SP can offer is L3 transport. Still, you want to build your own MPLS environment and there is no way to convince your SP to enable CsC.
I will use the following topology to demonstrate one way to build an overlay MPLS network over an SP backbone.
The provider already has its MPLS backbone configured (P1, PE2, PE3 and PE4) and is offering you, as a customer, IP transport over its backbone. Usually, from the customer perspective, you don’t get to see the SP backbone, but just for reference, it is using IS-IS as IGP, plus MP-BGP and MPLS VPN to transport our prefixes.
For IP address allocation I’m using “xy” in the third octet (x – lower router number, y – higher router number) and “z” in the last octet (router number) with a /24 mask.
We have three locations named CPE5, CPE6 and CPE7. Currently I have BGP enabled between my CPE and the provider PE, but you can use any protocol (even static routing) as long as your SP is able to route your IP prefixes over its backbone. Each CPE device has a Loopback interface, and its IP address will be the only prefix announced (through BGP in this demonstration) to the SP.
Let’s establish the BGP connection from our CPE to the SP PE. As I’m playing the role of the customer here, only the CPE configurations will be shown:
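The addressing scheme can be sketched in a few lines of Python. This is only an illustration: the function name `link_address` is hypothetical, and the `10.0` prefix is an assumption taken from the addresses that appear in the configurations of this post (for example 10.0.35.3 on the link between router 3 and router 5).

```python
def link_address(router_a: int, router_b: int, router: int) -> str:
    """Return the address of `router` on the link between router_a and router_b.

    Third octet is "xy" (x = lower router number, y = higher router number),
    last octet is the router number. The mask is /24.
    The 10.0 prefix is an assumption based on the configs in this post.
    """
    if router_a == router_b:
        raise ValueError("a link needs two different routers")
    if router not in (router_a, router_b):
        raise ValueError("router must be one end of the link")
    low, high = sorted((router_a, router_b))
    if not (1 <= low <= 9 and 1 <= high <= 9):
        raise ValueError("the scheme only works for single-digit router numbers")
    return f"10.0.{low}{high}.{router}"


print(link_address(3, 5, 3))  # PE side of the link between router 3 and router 5
print(link_address(3, 5, 5))  # CPE side of the same link
```

The order of the two router numbers does not matter, because the lower number always goes first in the third octet.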
CPE5
interface Loopback0
 ip address 5.5.5.5 255.255.255.255
!
router bgp 65001
 bgp router-id 5.5.5.5
 bgp log-neighbor-changes
 neighbor 10.0.35.3 remote-as 65000
 neighbor 10.0.35.3 description R3PE3
 neighbor 10.0.35.3 timers 5 20
 !
 address-family ipv4
  neighbor 10.0.35.3 activate
  no auto-summary
  no synchronization
  network 5.5.5.5 mask 255.255.255.255
 exit-address-family
CPE6
interface Loopback0
 ip address 6.6.6.6 255.255.255.255
!
router bgp 65001
 bgp router-id 6.6.6.6
 bgp log-neighbor-changes
 neighbor 10.0.26.2 remote-as 65000
 neighbor 10.0.26.2 description R2PE2
 neighbor 10.0.26.2 timers 5 20
 !
 address-family ipv4
  neighbor 10.0.26.2 activate
  no auto-summary
  no synchronization
  network 6.6.6.6 mask 255.255.255.255
 exit-address-family
CPE7
interface Loopback0
 ip address 7.7.7.7 255.255.255.255
!
router bgp 65001
 bgp router-id 7.7.7.7
 bgp log-neighbor-changes
 neighbor 10.0.47.4 remote-as 65000
 neighbor 10.0.47.4 description R4PE4
 neighbor 10.0.47.4 timers 5 20
 !
 address-family ipv4
  neighbor 10.0.47.4 activate
  no auto-summary
  no synchronization
  network 7.7.7.7 mask 255.255.255.255
 exit-address-family
The BGP sessions should be up now, and each CPE should receive the Loopback prefixes of the other two CPE devices.
R5CPE5#sh ip bgp sum | b Nei
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.35.3       4 65000     126     125        2    0    0 00:10:04        0
!
R6CPE6#sh ip bgp sum | b Nei
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.26.2       4 65000     136     135        2    0    0 00:10:50        0
!
R7CPE7#sh ip bgp sum | b Nei
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.47.4       4 65000     134     134        4    0    0 00:10:46        0
BGP neighborship is up alright, but where are my prefixes? State/PfxRcd is 0, when it should show 2.
I did that on purpose.
Notice that we are using the same AS number on all our sites. You probably already know the BGP loop-prevention rule: if a router sees its own ASN in the AS-Path of a received prefix, it rejects the update and does not install the prefix in the BGP table. This is fixable:
1. We ask our provider to add a little “as-override” command for our neighbor in its BGP configuration
2. We use a different ASN on each site (assuming that we are using private ASNs)
3. We configure “allowas-in” on the BGP neighborship with the SP
You may use any of the three methods (or another one, if you can think of any), but in my case I don’t want to ask the SP, nor do I want to change my ASN scheme. I’ll go with the third option and be careful not to run into loop issues (considering this is an Enterprise environment, I think it’s doable).
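To make the loop check and two of the fixes more concrete, here is a minimal Python model. It is not how any router implements BGP; the function names are hypothetical, and the value 3 used for `allowas_in` is my assumption of what IOS applies when the command is entered without a number.

```python
def accepts_update(as_path, local_as, allowas_in=0):
    """Return True if a received update passes the AS-path loop check.

    The update is accepted when the local ASN appears in the path
    at most `allowas_in` times (0 models the default behaviour).
    """
    return as_path.count(local_as) <= allowas_in


def as_override(as_path, customer_as, provider_as):
    """Model of the provider-side fix: replace the customer ASN in the path."""
    return [provider_as if asn == customer_as else asn for asn in as_path]


# Path of 6.6.6.6/32 as CPE5 would see it: SP ASN first, then our own ASN.
path_seen_on_cpe5 = [65000, 65001]

print(accepts_update(path_seen_on_cpe5, 65001))                # False
print(accepts_update(path_seen_on_cpe5, 65001, allowas_in=3))  # True
print(accepts_update(as_override(path_seen_on_cpe5, 65001, 65000), 65001))  # True
```

The second call corresponds to option 3 and the third call to option 1. Option 2 needs no model: with different ASNs per site, the local ASN never shows up in the received path.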
CPE5
router bgp 65001
 neighbor 10.0.35.3 allowas-in
CPE6
router bgp 65001
 neighbor 10.0.26.2 allowas-in
CPE7
router bgp 65001
 neighbor 10.0.47.4 allowas-in
Let’s check again and do some testing. I will use CPE5.
R5CPE5#sh ip route bgp
     6.0.0.0/32 is subnetted, 1 subnets
B       6.6.6.6 [20/0] via 10.0.35.3, 00:04:02
     7.0.0.0/32 is subnetted, 1 subnets
B       7.7.7.7 [20/0] via 10.0.35.3, 00:04:02
!
R5CPE5#ping 6.6.6.6 source 5.5.5.5
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 6.6.6.6, timeout is 2 seconds:
Packet sent with a source address of 5.5.5.5
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 32/44/64 ms
R5CPE5#ping 7.7.7.7 source 5.5.5.5
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 7.7.7.7, timeout is 2 seconds:
Packet sent with a source address of 5.5.5.5
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 28/52/96 ms
The next part involves creating the Tunnel interfaces (plain GRE) for a full mesh between the three sites, enabling the IGP and MPLS on them, and creating a second Loopback interface which we will use later for the iBGP configuration. I chose IS-IS as IGP and LDP for MPLS. The new Loopback interface will be advertised through IS-IS.
A note from my side: since I had a limited number of routers, each of my CPE devices will act as a kind of combined P / PE / CE router in this overlay MPLS demonstration.
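A full mesh needs one tunnel per pair of sites, so n sites need n(n-1)/2 tunnels. The short Python sketch below lists the tunnels for the three sites, following the naming and addressing used in the configurations that follow (Tun56 on 192.168.56.0/24 and so on). The function name `tunnel_plan` is hypothetical, and the sketch assumes single-digit site numbers.

```python
from itertools import combinations


def tunnel_plan(sites):
    """Return one entry per site pair: tunnel name, subnet and endpoint IPs."""
    plan = []
    for low, high in combinations(sorted(sites), 2):
        pair = f"{low}{high}"
        plan.append({
            "name": f"Tun{pair}",
            "subnet": f"192.168.{pair}.0/24",
            "endpoints": {
                low: f"192.168.{pair}.{low}",
                high: f"192.168.{pair}.{high}",
            },
        })
    return plan


for tunnel in tunnel_plan([5, 6, 7]):
    print(tunnel["name"], tunnel["subnet"], tunnel["endpoints"])
```

With three sites this gives three tunnels; with five sites it would already be ten, which is why a full mesh gets harder to operate as the number of sites grows.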
CPE5
int Tun56
 tunnel source lo0
 tunnel destination 6.6.6.6
 ip address 192.168.56.5 255.255.255.0
 mpls ip
 ip router isis
!
int Tun57
 tunnel source lo0
 tunnel destination 7.7.7.7
 ip address 192.168.57.5 255.255.255.0
 mpls ip
 ip router isis
!
int Lo1
 ip address 55.55.55.55 255.255.255.255
!
router isis
 net 47.0005.0005.0005.0005.00
 passive-interface lo1
 is-type level-2-only
CPE6
int Tun56
 tunnel source lo0
 tunnel destination 5.5.5.5
 ip address 192.168.56.6 255.255.255.0
 mpls ip
 ip router isis
!
int Tun67
 tunnel source lo0
 tunnel destination 7.7.7.7
 ip address 192.168.67.6 255.255.255.0
 mpls ip
 ip router isis
!
int Lo1
 ip address 66.66.66.66 255.255.255.255
!
router isis
 net 47.0006.0006.0006.0006.00
 passive-interface lo1
 is-type level-2-only
CPE7
int Tun57
 tunnel source lo0
 tunnel destination 5.5.5.5
 ip address 192.168.57.7 255.255.255.0
 mpls ip
 ip router isis
!
int Tun67
 tunnel source lo0
 tunnel destination 6.6.6.6
 ip address 192.168.67.7 255.255.255.0
 mpls ip
 ip router isis
!
int Lo1
 ip address 77.77.77.77 255.255.255.255
!
router isis
 net 47.0007.0007.0007.0007.00
 is-type level-2-only
 passive-interface lo1
I will use CPE5 for some show commands output and to check that everything is running fine:
R5CPE5#ping 192.168.56.6
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.56.6, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 44/48/52 ms
R5CPE5#ping 192.168.57.7
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.57.7, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 32/40/52 ms
!
R5CPE5#sh isis topology
IS-IS paths to level-2 routers
System Id            Metric     Next-Hop             Interface   SNPA
R5CPE5               --
R6CPE6               10         R6CPE6               Tu56        *Tunnel*
R7CPE7               10         R7CPE7               Tu57        *Tunnel*
!
R5CPE5#show mpls interfaces
Interface              IP            Tunnel   Operational
Tunnel56               Yes (ldp)     No       Yes
Tunnel57               Yes (ldp)     No       Yes
!
R5CPE5#show mpls ldp neighbor
    Peer LDP Ident: 6.6.6.6:0; Local LDP Ident 5.5.5.5:0
        TCP connection: 6.6.6.6.64820 - 5.5.5.5.646
        State: Oper; Msgs sent/rcvd: 16/16; Downstream
        Up time: 00:07:47
        LDP discovery sources:
          Tunnel56, Src IP addr: 192.168.56.6
        Addresses bound to peer LDP Ident:
          10.0.26.6       6.6.6.6         192.168.56.6    192.168.67.6
    Peer LDP Ident: 7.7.7.7:0; Local LDP Ident 5.5.5.5:0
        TCP connection: 7.7.7.7.11545 - 5.5.5.5.646
        State: Oper; Msgs sent/rcvd: 16/16; Downstream
        Up time: 00:07:26
        LDP discovery sources:
          Tunnel57, Src IP addr: 192.168.57.7
        Addresses bound to peer LDP Ident:
          10.0.47.7       7.7.7.7         192.168.57.7    192.168.67.7
I will now create two VRF instances, as I want to separate the Financial department traffic from the Technical one.
On all three CPE devices:
ip vrf FIN
 rd 65001:1
 route-target import 65001:1
 route-target export 65001:1
!
ip vrf TEK
 rd 65001:2
 route-target import 65001:2
 route-target export 65001:2
The iBGP configuration depends on the Loopback1 interfaces created earlier, so make sure those interfaces are reachable through IS-IS. Because these are iBGP sessions and I don’t want to type the same commands over and over, I will use peer-groups.
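The route-targets are what keeps the two departments apart: a VPN route is imported into a VRF only when one of the route-targets it was exported with matches one of that VRF’s import route-targets. A small Python sketch of this matching, using the values configured above (the `VRFS` dictionary and the function name are hypothetical, for illustration only):

```python
# Route-target configuration of the two VRFs, copied from the config above.
VRFS = {
    "FIN": {"import": {"65001:1"}, "export": {"65001:1"}},
    "TEK": {"import": {"65001:2"}, "export": {"65001:2"}},
}


def importing_vrfs(source_vrf, vrfs=VRFS):
    """Return the VRFs on a remote router that import routes from source_vrf."""
    exported = vrfs[source_vrf]["export"]
    return sorted(
        name for name, cfg in vrfs.items() if cfg["import"] & exported
    )


print(importing_vrfs("FIN"))  # ['FIN']
print(importing_vrfs("TEK"))  # ['TEK']
```

Routes exported from FIN on one site end up only in FIN on the other sites, and the same holds for TEK.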
CPE5
router bgp 65001
 !
 neighbor OM peer-group
 neighbor OM remote-as 65001
 neighbor OM timers 5 20
 neighbor OM description Overlay-MPLS
 neighbor OM update-source lo1
 !
 address-family vpnv4
  neighbor 66.66.66.66 peer-group OM
  neighbor 77.77.77.77 peer-group OM
CPE6
router bgp 65001
 neighbor OM peer-group
 neighbor OM remote-as 65001
 neighbor OM timers 5 20
 neighbor OM description Overlay-MPLS
 neighbor OM update-source lo1
 !
 address-family vpnv4
  neighbor 55.55.55.55 peer-group OM
  neighbor 77.77.77.77 peer-group OM
CPE7
router bgp 65001
 neighbor OM peer-group
 neighbor OM remote-as 65001
 neighbor OM timers 5 20
 neighbor OM description Overlay-MPLS
 neighbor OM update-source lo1
 !
 address-family vpnv4
  neighbor 55.55.55.55 peer-group OM
  neighbor 66.66.66.66 peer-group OM
We should check that everything is up. I will use CPE5 again:
R5CPE5#show ip bgp vpnv4 all sum | b Nei
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
66.66.66.66     4 65001      46      46        1    0    0 00:03:30        0
77.77.77.77     4 65001      31      31        1    0    0 00:02:18        0
Finally we are getting somewhere. In the real world you will have the CPE routers connected to downstream devices, using subinterfaces in a particular VRF and so on. I’m short on devices, so I will use some additional Loopback interfaces and add them to VRF FIN and TEK for testing.
CPE5
int Lo51
 ip vrf forwarding FIN
 ip address 10.51.51.51 255.255.255.255
!
int Lo52
 ip vrf forwarding TEK
 ip address 10.52.52.52 255.255.255.255
!
router bgp 65001
 !
 address-family ipv4 vrf FIN
  network 10.51.51.51 mask 255.255.255.255
 !
 address-family ipv4 vrf TEK
  network 10.52.52.52 mask 255.255.255.255
CPE6
int Lo61
 ip vrf forwarding FIN
 ip address 10.61.61.61 255.255.255.255
!
int Lo62
 ip vrf forwarding TEK
 ip address 10.62.62.62 255.255.255.255
!
router bgp 65001
 !
 address-family ipv4 vrf FIN
  network 10.61.61.61 mask 255.255.255.255
 !
 address-family ipv4 vrf TEK
  network 10.62.62.62 mask 255.255.255.255
CPE7
int Lo71
 ip vrf forwarding FIN
 ip address 10.71.71.71 255.255.255.255
!
int Lo72
 ip vrf forwarding TEK
 ip address 10.72.72.72 255.255.255.255
!
router bgp 65001
 !
 address-family ipv4 vrf FIN
  network 10.71.71.71 mask 255.255.255.255
 !
 address-family ipv4 vrf TEK
  network 10.72.72.72 mask 255.255.255.255
To check if everything is working fine, I will use CPE5 for some tests:
R5CPE5#sh ip route vrf FIN | b Ga
Gateway of last resort is not set
     10.0.0.0/32 is subnetted, 3 subnets
B       10.61.61.61 [200/0] via 66.66.66.66, 00:04:42
C       10.51.51.51 is directly connected, Loopback51
B       10.71.71.71 [200/0] via 77.77.77.77, 00:02:46
!
R5CPE5#sh ip route vrf TEK | b Ga
Gateway of last resort is not set
     10.0.0.0/32 is subnetted, 3 subnets
B       10.62.62.62 [200/0] via 66.66.66.66, 00:04:59
C       10.52.52.52 is directly connected, Loopback52
B       10.72.72.72 [200/0] via 77.77.77.77, 00:03:04
!
R5CPE5#ping vrf FIN 10.71.71.71 source 10.51.51.51
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.71.71.71, timeout is 2 seconds:
Packet sent with a source address of 10.51.51.51
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 40/44/48 ms
!
R5CPE5#ping vrf TEK 10.62.62.62 source 10.52.52.52
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.62.62.62, timeout is 2 seconds:
Packet sent with a source address of 10.52.52.52
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 40/47/52 ms
!
R5CPE5#show mpls forwarding-table
Local  Outgoing    Prefix             Bytes tag  Outgoing   Next Hop
tag    tag or VC   or Tunnel Id       switched   interface
16     Pop tag     192.168.67.0/24    0          Tu57       point2point
       Pop tag     192.168.67.0/24    0          Tu56       point2point
17     Pop tag     66.66.66.66/32     0          Tu56       point2point
18     Pop tag     77.77.77.77/32     0          Tu57       point2point
19     Aggregate   10.51.51.51/32[V]  1040
20     Aggregate   10.52.52.52/32[V]  520
You may wonder why somebody would put together such a complex configuration. There may be multiple reasons, beyond the scope of this example, but I would mention MPLS TE, encrypted site-to-site traffic with route manipulation, configuration independent of the SP, learning purposes and many more.
Can we encounter problems with this configuration? Well, yes.
If the provider supports only a very low MTU, you may get a lot of fragmentation, since the tunnel and MPLS headers add overhead to every packet. Also, maintenance and operation of the Tunnels may be tricky in a very large environment, but there are solutions to limit the number of tunnels. Still, the benefits exist.
To name one benefit from the real world: applying this configuration in an Enterprise environment lets you change your SP without too much hassle, as long as your new provider can transport the IP address of your primary Loopback interface. The rest stays the same.
Please let me know if you have questions or if something in my explanation is wrong.
very neat article. thank you for sharing this knowledge.
how do you fix the MTU issue when you don’t know what the SP is using? start with very low MTU and do incremental increases?
Thanks for your comment!
MTU size is always tricky when it involves an environment that you do not control. From my own experience: we had a leased line from a certain provider, we wanted dot1q over it, and we had problems because of the MTU size.
The best suggestion is a strong SLA with the SP that specifies the minimum MTU size they should deliver. Depending on multiple factors (financial, technical), the SP may or may not deliver. It helps to calculate the minimum MTU size you need before negotiating with the provider.
Also it’s important what “tunnel” technology will be used. Here I had a simple GRE tunnel, but if you need encryption you may want to use IPsec also or maybe something tunnel-less like GETVPN. All this may be a factor in your decision about MTU size.
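As a rough sketch of that calculation in Python, assuming plain GRE over IPv4 without optional GRE fields (20 bytes of outer IP header plus 4 bytes of GRE) and 4 bytes per MPLS label. The function names are hypothetical, and `extra_overhead` is a placeholder for whatever an encryption layer would add, which depends on the chosen technology and settings.

```python
IPV4_HEADER = 20  # outer IPv4 header added by the tunnel, no IP options
GRE_HEADER = 4    # basic GRE header, no key / sequence / checksum fields
MPLS_LABEL = 4    # one MPLS label stack entry


def max_inner_packet(path_mtu, labels=2, extra_overhead=0):
    """Largest customer IP packet that fits without fragmentation."""
    return (path_mtu - IPV4_HEADER - GRE_HEADER
            - labels * MPLS_LABEL - extra_overhead)


def required_path_mtu(inner_packet, labels=2, extra_overhead=0):
    """Minimum MTU the SP must deliver to carry inner_packet unfragmented."""
    return (inner_packet + IPV4_HEADER + GRE_HEADER
            + labels * MPLS_LABEL + extra_overhead)


print(max_inner_packet(1500))            # 1468
print(max_inner_packet(1500, labels=1))  # 1472
print(required_path_mtu(1500))           # 1532
```

Two labels (transport plus VPN) is the conservative assumption for an MPLS VPN packet. Between directly connected tunnel neighbors the transport label is popped, so one label is the more likely case in the topology from this post.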