Introduction
The Cisco IOS Software-based
GET VPN (Cisco IOS GET VPN) is a tunnel-less technology that provides end-to-end
security for voice, video, and data in a native mode for a fully meshed
network. It uses the ability of the core network to route and replicate the
packets between various sites within the enterprise. Cisco IOS GET VPN preserves
the original source and destination addresses in the encryption header for
optimal routing; hence, it is well suited to an enterprise running over a
private Multiprotocol Label Switching (MPLS)/IP-based core network. Cisco IOS
GET VPN uses Group Domain of Interpretation (GDOI) as the keying protocol for
encrypting and decrypting the data packets.
Although MPLS VPNs can provide a certain level of security, many critical applications also need end-to-end encryption, and sectors such as banking and insurance need it for compliance. Cisco IOS GET VPN is a group key-based solution that provides end-to-end security for both unicast and multicast applications. It is enabled in customer edge routers without using tunnels.
The XYZ Corp GET VPN solution design
relies on the following core building blocks to provide the required functionality:
· GDOI
· Key servers (KSs)
· Cooperative (COOP) KSs
· Group members (GMs)
· IP tunnel header preservation
· Group security association
· Rekey mechanism
· Time-based anti-replay (TBAR)
Each of these is explained
in this post.
GDOI
The GDOI group key
management protocol is used to provide a set of cryptographic keys and policies
to a group of devices. In a GET VPN network, GDOI is used to distribute common
IPsec keys to a group of enterprise VPN gateways that must communicate
securely. These keys are periodically refreshed and are updated on all the VPN
gateways using a process called “rekey.”
The GDOI protocol is protected by a Phase 1 Internet Key Exchange (IKE) SA. All participating VPN gateways must authenticate themselves to the device providing keys using IKE. All IKE authentication methods, for example, pre-shared keys (PSKs) and public key infrastructure (PKI), are supported for initial authentication. After the VPN gateways are authenticated and provided with the appropriate security keys via the IKE SA, the IKE SA expires and GDOI is used to update the GMs in a more scalable and efficient manner.
KSs
A key server (KS) is an IOS
device responsible for creating and maintaining the GET VPN control plane. All encryption
policies, such as interesting traffic, encryption protocols, security
association, rekey timers, and so on, are centrally defined on the KS and are
pushed down to all GMs at registration time.
GMs authenticate with the KS using IKE Phase 1 (pre-shared keys or PKI) and then download the encryption policies and keys required for GET VPN operation. The KS is also responsible for refreshing and distributing the keys.
Unlike traditional IPsec,
interesting traffic defined on the KS (using an access control list (ACL)) is downloaded
to every GM, whether or not the GM owns that network.
GMs
A group member (GM) is an IOS router
responsible for the actual encryption and decryption, that is, the device that
handles the GET VPN data plane. A GM is configured only with IKE Phase 1 parameters
and KS/group information. As mentioned before, encryption policies are defined
centrally on the KS and downloaded to the GM at the time of registration. Based
on these downloaded policies, the GM decides whether traffic needs to be encrypted
or decrypted and which keys to use.
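To make the GM's decision concrete, here is a minimal Python sketch of first-match policy evaluation. It is an illustration only, not IOS behaviour in full: the `PolicyEntry` type, the `needs_encryption` helper, and the example prefixes are all hypothetical, and the sketch assumes the usual crypto ACL convention that a permit entry means "encrypt" and that traffic matching no entry is sent in the clear.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class PolicyEntry:
    action: str       # "permit" means encrypt, "deny" means send in the clear
    source: str       # source prefix
    destination: str  # destination prefix


def needs_encryption(policy, src, dst):
    """Return True if the first entry matching (src, dst) is a permit."""
    for entry in policy:
        if (ip_address(src) in ip_network(entry.source)
                and ip_address(dst) in ip_network(entry.destination)):
            return entry.action == "permit"
    return False  # assumption: traffic matching no entry is left unencrypted


# Hypothetical policy as it might look after download from the KS.
downloaded_policy = [
    PolicyEntry("deny", "10.0.0.0/8", "10.255.0.0/16"),
    PolicyEntry("permit", "10.0.0.0/8", "10.0.0.0/8"),
]
```

Because every GM holds the same downloaded entries, any two GMs reach the same encrypt-or-bypass decision for a given flow without negotiating with each other.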
COOP KSs
The KS is the most important
entity in the GET VPN network because the KS maintains the control plane. Therefore,
a single KS is a single point of failure for an entire GET VPN network. Because
redundancy is an important consideration for KSs, GET VPN supports multiple
KSs, called cooperative (COOP) KSs, to ensure seamless fault recovery if a KS
fails or becomes unreachable.
A GM can be configured to register to any available KS from a list of all COOP KSs. GM configuration determines the registration order. The KS defined first is contacted first, followed by the second defined KS, and so on.
When COOP KSs boot, all KSs
assume a “secondary” role and begin an election process. One KS, typically the
one having the highest priority, is elected as the “primary” KS. The other KSs
remain in the secondary state. The primary KS is responsible for creating and
distributing group policies to all GMs, and for periodically synchronizing the
COOP KSs.
Cooperative KSs exchange
one-way announcement messages (primary to secondary). If a secondary KS does not
hear from the primary KS for a certain length of time, the secondary KS tries
to contact the primary KS and requests updated information. If the primary KS
does not respond, a COOP re-election is triggered and a new primary KS is
elected. Up to eight KSs can be defined as COOP KSs, but more than four COOP
servers are seldom required. Because rekey information is generated and
distributed from a single primary KS, the advantage of deploying more than two
KSs is the ability to handle the registration load when a network failure
causes many GMs to reregister at the same time. This is especially important
when using PKI, because IKE negotiation using PKI requires considerably more
CPU power than IKE negotiation using PSKs.
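The election described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the COOP protocol itself: the `KeyServer` type and `elect_primary` function are hypothetical names, reachability is reduced to a boolean flag, and the tie-break on name is an assumption added only so the example is deterministic.

```python
from dataclasses import dataclass


@dataclass
class KeyServer:
    name: str
    priority: int
    reachable: bool = True
    role: str = "secondary"  # every KS boots in the secondary role


def elect_primary(servers):
    """Elect the reachable KS with the highest priority as primary."""
    candidates = [ks for ks in servers if ks.reachable]
    if not candidates:
        return None
    # Highest priority wins; the name tie-break is an assumption of this sketch.
    winner = max(candidates, key=lambda ks: (ks.priority, ks.name))
    for ks in servers:
        ks.role = "primary" if ks is winner else "secondary"
    return winner


ks1 = KeyServer("KS1", priority=100)
ks2 = KeyServer("KS2", priority=75)

first = elect_primary([ks1, ks2])      # KS1 has the higher priority
ks1.reachable = False                  # announcements from KS1 stop arriving
second = elect_primary([ks1, ks2])     # re-election picks KS2
```

The second call models the re-election that follows when the secondary stops hearing from the primary.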
Time Based Anti-Replay
In traditional IPsec
solutions, anti-replay capabilities prevent a malicious third party from
capturing IPsec packets and replaying those packets at a later time to launch a
denial-of-service attack against the IPsec endpoints. These traditional IPsec
solutions use a counter-based sliding window protocol: the sender sends a packet
with a sequence number, and the receiver uses the sliding window to determine
whether a packet is acceptable, or has arrived out of sequence and is outside
the window of acceptable packets. Because GET VPN uses a group SA shared by
many senders, a single per-SA sequence counter cannot be kept consistent across
them, so counter-based anti-replay is ineffective. A new method to guard against
replay attacks is required. GET VPN uses time-based anti-replay (TBAR), which
is based on a pseudotime clock that is maintained on the KS. An advantage of
using pseudotime for TBAR is that there is no need to synchronize time on all
the GET VPN devices using NTP.
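The difference between the two approaches can be illustrated with a short Python sketch. Both pieces are simplifications written for this post: `SlidingWindow` is a generic counter-based check, and `tbar_accept` assumes the rule is simply "accept a packet whose pseudotime stamp is no older than the window"; the names, the window sizes, and that rule are assumptions, not the exact IOS algorithm.

```python
class SlidingWindow:
    """Counter-based anti-replay for a single sender and receiver pair."""

    def __init__(self, size=64):
        self.size = size
        self.highest = 0      # highest sequence number accepted so far
        self.seen = set()     # sequence numbers accepted inside the window

    def accept(self, seq):
        if seq + self.size <= self.highest:
            return False      # older than the window
        if seq in self.seen:
            return False      # duplicate, so treat it as a replay
        self.seen.add(seq)
        self.highest = max(self.highest, seq)
        # Forget sequence numbers that have fallen out of the window.
        self.seen = {s for s in self.seen if s + self.size > self.highest}
        return True


def tbar_accept(packet_pseudotime, receiver_pseudotime, window_seconds):
    """Time-based check: accept if the stamp is not older than the window."""
    return packet_pseudotime >= receiver_pseudotime - window_seconds


# Two GMs sharing one SA: GM-A has already sent 1000 packets, GM-B sends its first.
shared = SlidingWindow()
from_gm_a = shared.accept(1000)
from_gm_b = shared.accept(1)   # legitimate, but far behind GM-A's counter
```

With a shared counter window, the first packet from the second sender looks like an old replay, which is the problem the time-based check avoids: each sender only needs a roughly current pseudotime, not a coordinated counter.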
Group SA
Unlike traditional IPsec
encryption solutions, GET VPN uses the concept of a group SA. All members in the GET
VPN group can communicate with each other using a common encryption policy and
a shared SA. With a common encryption policy and a shared SA, there is no need
to negotiate IPsec between GMs; this reduces the resource load on the IPsec
routers. The scalability limits of traditional IPsec gateways (number of tunnels
and associated SAs) do not apply to GET VPN GMs.
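A quick back-of-the-envelope calculation shows why this matters. The Python sketch below only does the standard full-mesh arithmetic; the function names and the site count of 100 are hypothetical and chosen for illustration.

```python
def pairwise_tunnels(gateways):
    """Point-to-point tunnels needed to fully mesh the given gateways."""
    return gateways * (gateways - 1) // 2


def peers_per_gateway(gateways):
    """Peers each gateway must negotiate with in a full mesh."""
    return gateways - 1


sites = 100  # hypothetical number of sites
full_mesh = pairwise_tunnels(sites)
per_site = peers_per_gateway(sites)
```

For 100 sites, a pairwise design needs 4950 tunnels in total and 99 peers on every gateway, and both figures grow as sites are added. With a group SA, a GM's SA state does not grow with the number of members.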
Rekey Process
As mentioned above, the KS
is responsible not only for creating the encryption policies and keys, but also
for refreshing the keys and distributing them to GMs. The process of sending out
new keys when existing keys are about to expire is known as the rekey process. GET
VPN supports two types of rekey messages: unicast and multicast.
If a GM does not receive
rekey information from the KS (for example, the KS is down or network connectivity
is broken), the GM tries to reregister to an ordered set of KSs 60 seconds
before the existing IPsec SAs expire. If reregistration is successful, the GM
receives new SAs as part of the reregistration process and traffic in the data
plane flows without disruption. If reregistration is unsuccessful (the
preferred KS is unavailable), the GM tries three more times, at 10-second
intervals, to establish a connection with the KS. If all attempts to
contact the preferred KS fail, the GM tries the next KS in the ordered list 20
seconds before the existing IPsec SAs expire.
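The timing above is easier to follow as a timeline. The Python sketch below turns the figures from the text (60 seconds, three retries at 10-second intervals, fallback at 20 seconds) into a list of attempts. The function name and its parameters are hypothetical, and the sketch only models the preferred KS and the next one in the list, since that is all the description covers.

```python
def reregistration_schedule(key_servers, first_attempt=60, retries=3,
                            retry_interval=10, fallback=20):
    """Return (seconds_before_sa_expiry, key_server) for each attempt."""
    if not key_servers:
        return []
    preferred = key_servers[0]
    # The first attempt plus the retries all go to the preferred KS.
    schedule = [(first_attempt - i * retry_interval, preferred)
                for i in range(retries + 1)]
    # If every attempt fails, the next KS in the ordered list is tried.
    if len(key_servers) > 1:
        schedule.append((fallback, key_servers[1]))
    return schedule


timeline = reregistration_schedule(["KS1", "KS2"])
```

For the ordered list KS1, KS2 this yields attempts to KS1 at 60, 50, 40, and 30 seconds before expiry, followed by an attempt to KS2 at 20 seconds before expiry.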
Tunnel Header Preservation
In traditional IPsec, tunnel
endpoint addresses are used as the new packet source and destination. The packet is
then routed over the IP infrastructure, using the encrypting gateway source IP
address and the decrypting gateway destination IP address. In the case of GET
VPN, IPsec-protected data packets copy the original source and
destination addresses of the hosts into the outer IP header to “preserve”
the IP addresses.
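The contrast can be shown with a toy Python model of the outer header. Nothing here performs real encryption: the `Packet` type, both helper functions, the addresses, and the `ESP[...]` string standing in for the protected payload are all hypothetical and exist only to show which addresses end up in the outer header.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: str


def tunnel_mode_encapsulate(inner, local_gateway, remote_gateway):
    """Traditional IPsec: the outer header carries the gateway addresses."""
    protected = f"ESP[{inner.src}->{inner.dst}|{inner.payload}]"
    return Packet(local_gateway, remote_gateway, protected)


def getvpn_encapsulate(inner):
    """GET VPN: the outer header reuses the original host addresses."""
    protected = f"ESP[{inner.src}->{inner.dst}|{inner.payload}]"
    return Packet(inner.src, inner.dst, protected)


original = Packet("10.1.1.10", "10.2.2.20", "voice")
traditional = tunnel_mode_encapsulate(original, "192.0.2.1", "192.0.2.2")
preserved = getvpn_encapsulate(original)
```

Because the outer header in the GET VPN case still shows the host addresses, the core network routes (and, for multicast, replicates) the packet exactly as it would the unencrypted one.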
Cisco IOS GET VPN Benefits
· Offers a tunnel-less encryption solution
· Uses the underlying routing infrastructure
· Provides for centralized management of policies and keys in the key server
· Offers end-to-end security for voice, video, and data
· Provides any-to-any enterprise connectivity for critical applications
· Offers optimal routing by preserving source and destination addresses in the encryption header
· Offers flexibility to use unicast or multicast rekey mechanisms based on the core network support
· Provides multicast encryption in native mode
· Uses (requires) multicast replication in the MPLS/IP core, removing the need for a group member to replicate multiple copies for each receiver (such as a hub in a hub-and-spoke tunnelled network)
· Requires less overhead in provider edge (PE) routers because they do not need to decrypt and encrypt traffic
· Provides efficient distribution of rekeys using multicast transport
· Offers zero-touch provisioning in key server for addition of new group members if planned addressing schemes are in place
· Offers redundancy in key-server failure by using cooperative key-server feature
· Prevents replay attacks
· Selectively bypasses encryption using group-member access control list (ACL)
· Offers a scalable security solution for large-scale networks
XYZ Corp GETVPN Setup
This is a typical design with dual KSs in different data center (DC) locations for redundancy, plus link-level redundancy through separate providers. Each GM connects to the DCs over dual links and associates with a KS depending on last-mile availability.
The KSs perform only the keying
function and do not handle actual VPN traffic. GMs send encrypted traffic
directly to one another, on demand, based on the ACL policies. The primary KS
is responsible for creating and distributing
group policy. It also periodically sends out group information updates to all
other KSs to keep those servers in synchronization. If the secondary KSs
somehow miss the updates, they contact the primary KS to directly request
information updates. The secondary KSs mark the primary KS as unreachable (that
is, "dead") if the updates are not received for an extended period of
time.
With multiple COOP KSs, the
policies configured on each KS must be considered. It is recommended that the
same GET VPN policies should be configured on each of the KSs. If a different
COOP KS assumes the primary role, it should distribute the same rules in a
rekey message to the GMs. If the policies were different, the GMs would receive
different policies whenever a different KS is elected as primary. This can
cause disruption.
Location Connectivity
Each location is connected
via dual last-mile links to the ISP1 and ISP2 MPLS networks, which provide the
underlying connectivity for GET VPN. Each GM associates with a KS depending on
the primary KS defined. Irrespective of the KS association, as long as all KSs
are in sync, the encryption functionality works as expected. When a GM has to
reregister and reregistration is
successful, the GM receives new SAs as part of the reregistration process and
traffic in the data plane flows without disruption. If reregistration is
unsuccessful (the preferred KS is unavailable), the GM tries three more times,
at 10-second intervals, to establish a connection with the KS. If all attempts
to contact the preferred KS fail, the GM tries the next KS in the ordered list
20 seconds before the existing IPsec SAs expire.
XYZ Corp GETVPN Enhancements
· IOS upgrade as per Cisco’s recommendation.
· Leased link between the key servers for dedicated connectivity.
Backup Link between KSs
During a network split, COOP
KSs may lose connectivity to each other. This might lead to multiple KSs
operating in primary mode, which results in GMs in different portions of the
split network having different keys. The GMs continue to operate, but there
are cases in which the GMs have complete connectivity while the KSs experience
a network split; the GMs then lose communication with each other. Whenever KSs
lose connectivity with the primary KS, multiple rekeys might be exchanged in the
system as new primaries are elected. This can be quite disruptive.
To increase resiliency, it is highly recommended to provide multiple paths between the COOP KSs, such as an out-of-band backup network. This path should not be in line with the data plane and should preferably be a separate link.
This kind of a backup link provides a continuous channel between the COOP KSs, ensures that they remain synchronized, prevents fluctuation in primary roles, and prevents unnecessary rekeys being sent.
Network Split and Merge: KS and GM Split
Initially KS1 is the primary KS and provides the keys for GMs GM1 and GM2. After a network split, KS2 also becomes a primary KS. Now, GM1 receives its key from KS1, while GM2 receives its key from KS2.
The sequence of events follows:
1. Initially, KS1 and KS2
have connectivity and KS1 (the primary KS) sends the TEK in the rekey messages
to GM1 and GM2.
2. A network split occurs
that isolates KS1 and GM1 from KS2 and GM2.
3. KS2 detects the loss of
connectivity with KS1 and transitions to the primary state.
4. KS2 issues a rekey to the
network and provides a new KEK and TEK to the GMs. The rekey message from KS2
contains the old TEK and the new TEK. GM2 continues to use the old TEK for
encryption because its lifetime expires sooner.
5. In the case of a unicast
rekey, GM2 responds and KS2 eventually times out GM1. In the case of a
multicast rekey, KS2 is not aware of GM1’s state.
6. On TEK rollover (rekey interval),
KS1 issues a new TEK. In the case of a unicast rekey, KS1 eventually times out
GM2. In the case of a multicast rekey, KS1 is not aware of GM2’s state.
7. Now, GM1 has a different
TEK and KEK from GM2.
To summarize: During this kind of network split, GMs have connectivity to the KSs in their respective partition, but do not have connectivity across partitions.
After the network split is resolved, the primary KS sends out multiple rekeys, each encrypted using one of the different KEKs, so that all GMs understand this rekey information and synchronize to the same set of keys.
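The merge step can be sketched in Python as follows. This is a toy model under stated assumptions: keys are plain strings, "protected with a KEK" is modelled as a simple equality check rather than real cryptography, and the `GroupMember` type, `merge_rekeys` function, and key names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GroupMember:
    name: str
    kek: str
    tek: str

    def receive_rekey(self, rekey):
        """Apply a rekey only if it is protected with the KEK this GM holds."""
        if rekey["protected_with"] != self.kek:
            return False
        self.kek = rekey["new_kek"]
        self.tek = rekey["new_tek"]
        return True


def merge_rekeys(known_keks, new_kek, new_tek):
    """Build one rekey message per KEK that is still in use after the split."""
    return [{"protected_with": kek, "new_kek": new_kek, "new_tek": new_tek}
            for kek in known_keks]


# State at the end of the split: each GM holds keys from a different primary.
gm1 = GroupMember("GM1", kek="kek-ks1", tek="tek-ks1")
gm2 = GroupMember("GM2", kek="kek-ks2", tek="tek-ks2")

# After the merge, the primary sends the same new keys under both old KEKs.
for rekey in merge_rekeys(["kek-ks1", "kek-ks2"], "kek-merged", "tek-merged"):
    for gm in (gm1, gm2):
        gm.receive_rekey(rekey)
```

Each GM can read only the copy protected with the KEK it already holds, yet both end up with the same new KEK and TEK, which is the point of sending multiple rekeys.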