- Released At: 13-09-2023
AD-WAN Branch Solution Technology White Paper
Document version: 5W101-20230526
Copyright © 2023 New H3C Technologies Co., Ltd. All rights reserved.
No part of this manual may be reproduced or transmitted in any form or by any means without prior written consent of New H3C Technologies Co., Ltd.
Except for the trademarks of New H3C Technologies Co., Ltd., any trademarks that may be mentioned in this document are the property of their respective owners.
The information in this document is subject to change without notice.
Contents
Challenges arising from movement of enterprise applications to cloud
Compelling needs for WAN transformation
Building new-generation software-defined WAN architecture
Network and application visibility
Application deep visibility (DPI)
Network and application visibility
Multi-tiered service provider PoP network
AD-WAN architecture: Microservices
Controller service architecture
Microservice development by using Springboot and Dubbo
Northbound APIs of AD-WAN: RESTful APIs
RESTful APIs have become the mainstream application programming interfaces
Northbound APIs provided by AD-WAN
Southbound APIs of AD-WAN: NETCONF
NETCONF applications in AD-WAN
SLA-based service quality evaluation
Introduction to IP precedence and DSCP values
SLA/DSCP in the AD-WAN branch solution
NetStream-based traffic statistics measurement
NetStream in the AD-WAN branch solution
NAT session log parsing and processing
URL data parsing and processing
RIR traffic engineering log analysis
Traffic engineering log analysis
Traffic engineering statistics
Cloud connection establishment
Cloud connections in AD-WAN branch solution
Route advertisement and traffic forwarding in SDWAN EVPN
VPN instance-based tenant isolation
SDWAN tunnel establishment with NAT traversal
RIR application in the AD-WAN branch solution
DPI-based application recognition
Introduction to APR used in DPI
Solution deployment for a supermarket enterprise
Solution deployment for a large enterprise
Solution deployment for a joint-stock bank
Background
Traditional WAN challenges
The wide area network (WAN) has long served the sole purpose of connecting geographically dispersed locations. For example, a WAN extends connectivity between branches, between branches and their headquarters, or between data centers. Independent of application systems, traditional WANs were primarily managed from the perspective of network nodes instead of applications. Without visibility of applications, a traditional WAN could hardly adapt to accelerated application provisioning and business changes driven by new technologies such as cloud computing and mobile Internet.
A traditional WAN has the following issues:
· Slow service provisioning and long onboarding process.
Provisioning services on a traditional WAN is complex and requires manual configuration node by node, which is tedious, slow, inefficient, and prone to errors.
· Poor traffic engineering.
Nodes on a traditional WAN do not have a holistic view of the network. They make routing decisions independently to forward traffic on the shortest path, which is not necessarily optimal. It is commonplace for some links on the WAN to be heavily loaded while others carry a light traffic load.
Conventional policy-based routing and traffic engineering can alleviate this situation, but they are difficult to configure and cannot dynamically adapt to rapid network and application changes.
· Poor maintainability.
Extensive manual work and low visibility into traffic and applications make it difficult to identify and resolve network issues. To operate and maintain the network, IT staff members must have a high level of expertise and skill.
· Lack of programmability.
Traditional WAN nodes typically lack the programmability required to provision services quickly to accommodate rapidly growing applications and increasingly diversified customer demands.
Network settings for applications are typically static and cannot adjust automatically to changing conditions to maintain a good business experience.
Figure 1 Traditional WAN challenges
Challenges arising from movement of enterprise applications to cloud
As cloud computing grows rapidly and gains large-scale deployment, enterprises are increasingly moving their on-premises IT systems to the cloud. Wherever the applications are located, users expect the same experience. They need to access cloud-hosted applications as quickly and securely as they access on-premises applications.
Cloud hosting of applications fundamentally changes the application traffic model. To provide desirable, consistent application experience, a WAN architecture must address the following challenges arising from this change:
· Higher requirements for scalability and availability—Movement of on-premises applications to the cloud shifts a large amount of traffic from the LAN to the WAN. To accommodate the growing accesses to the cloud-hosted applications without decreasing user experience, the WAN architecture must be highly scalable and reliable to provide sufficient bandwidth and high availability.
· Interconnect expenditure control—The expenditure of enterprises on interconnects grows as WAN traffic increases. It is compelling for an enterprise to deliver assured network services at a minimal cost.
· Differentiated granular application traffic control—As on-premises applications move to the cloud, WAN traffic is increasingly diversified. To deliver assured, consistent experience to critical applications, a WAN must provide more granular differentiated quality of services to applications than ever before.
Traditional WAN architectures can hardly address these challenges because of their complexity, rigidity, and lack of programmability. To align with the business growth in this cloud computing era, enterprises must transform their WAN architectures.
Figure 2 Challenges arising from cloud hosting of enterprise applications
Compelling needs for WAN transformation
To address challenges and issues in this cloud computing era, enterprises must transform their WAN architectures. The new-generation WAN architecture must deliver the following capabilities:
· Open architecture—Builds an open, future-proofed network that can easily scale and smoothly evolve to accommodate new applications, business changes, and new technologies.
· Intelligent application-based routing—Delivers assured application experience to critical applications.
· Easy deployment and operations—Provides plug-and-play deployment and automated network service provisioning.
· Status visibility and visualization—Monitors various resources in real time and presents visualized data in different views for ease of operations management.
· Programmability and flexibility—Programs the network to quickly accommodate new applications and business changes.
Figure 3 A new-generation WAN architecture
Building new-generation software-defined WAN architecture
Software-defined networking (SDN) is an architecture designed for network automation, flexibility, and ease of management. It has evolved from the initial OpenFlow for data and control plane separation to encompass a broad range of technologies and architectures for the following goals:
· Software-defined architecture—Enables the network to change proactively as applications and traffic characteristics change.
· Application-driven network—Automates the network to change dynamically and quickly as new applications or user demands arise.
· Easy network operations and management—Deploys applications to automate network and service provisioning, present a visualized holistic view of the network, and distribute traffic dynamically.
H3C AD-WAN solution
H3C Application-Driven WAN (AD-WAN) solution helps customers build a new-generation software-defined WAN for openness, programmability, and ease of operation. This WAN will be able to change dynamically, quickly, and easily as business grows or new applications and technologies arise.
AD-WAN solution overview
Solution architecture
Figure 4 Solution architecture
The H3C AD-WAN branch solution adopts a standard SDN architecture that is converged, layered, and open. This solution offers a converged network control center, an intelligent brain that consolidates intelligent network management, intelligent control, and intelligent analysis. It provides end-to-end network and service automation, visualized data display, and refined network management. It also provides a user-oriented, unified portal that enables SSO, one-click configuration deployment, unified protection, and one-stop O&M. The H3C AD-WAN branch solution consists of the following components:
· Unified Platform—Digital network engine acting as the brain of the whole network. Based on a containerized platform and service-oriented software architecture, Unified Platform provides users with converged data center, campus, and WAN services. It provides standard northbound RESTful APIs for open, flexible integration into different OSS/BSS management systems. In the southbound direction, it uses standard protocols such as SNMP, NETCONF, and telemetry to interoperate with the device layer.
· Management component—Provides traditional management capabilities such as device version management, configuration management, alarms, performance monitoring, and topology, as well as value-added services such as QoS.
· Control component—Provides zero touch provisioning, WAN optimization, and network traffic tuning to prioritize high-priority service needs.
· Analysis component—Implements rapid network status awareness and second-level O&M based on telemetry. This component displays the most critical elements in the network to assist with O&M. The analysis service applies AI to O&M to collect network-wide information, including network devices, traffic, quality, correlated events, and alarms. With big data, machine learning, and deep analysis algorithms, this component observes the network from the perspective of applications and proactively detects network and application problems. In response to network and service problems, it provides automated troubleshooting capabilities to help users quickly locate and delimit faults, reduce O&M costs, and improve the competitiveness of your enterprise's products.
Technical benefits
The AD-WAN branch solution helps customers provide application-driven WAN services with ease. It unifies the management of network resources, provides a holistic view of the network from different angles, and drives intelligence in analysis of running data. It automatically engineers service traffic and optimizes network performance constantly in adaption to network changes, such as changes in topology and performance. It offers the following features and benefits:
· Scalability, compatibility, and programmability—With a fully open architecture and decoupled layers and components, the system is easy to scale as business grows. Open APIs in different degrees of abstraction between layers enable the orchestrator and the controller to define and program the network flexibly for applications.
· Scenario-based sizing—The solution provides applications and features aligned with the network scenario to address business requirements.
· Simplified operations and maintenance—The solution provides a holistic view for you to manage and tune the network to deliver end-to-end services and simplify network operations.
· Automation—The solution automates device onboarding, service provisioning, and traffic distribution and engineering.
· End-to-end network services—The solution enables you to reconfigure the network to accommodate changes or new applications as they arise. For example, you can dynamically enforce and adjust application policies, QoS, and security policy.
This solution addresses traditional WAN issues and helps enterprises accelerate service provisioning, reduce leased line costs, simplify network operations, provide visibility into application traffic, and improve application experience.
Value of the solution
The AD-WAN branch solution significantly simplifies network operations and reduces deployment and assurance costs to a level that can never be achieved with traditional WAN deployment and management. It drives positive business value for customers.
Service consistency
With AD-WAN, an enterprise has more options for connecting branches to the headquarters and does not have to depend on expensive MPLS VPN, MSTP, and SDH connections. For example, the enterprise can establish connections over the Internet or a 4G/5G network to save costs. The AD-WAN controller automatically monitors network performance and intelligently reroutes traffic to meet the service level agreement (SLA) profile for an application. In conjunction with the WAN optimization features of devices, the solution provides enterprise users with a consistent service experience across Internet connections and private WAN connections while decreasing IT costs.
Figure 5 Headquarters-branch network scenario
Network and application visibility
The H3C AD-WAN branch solution collects comprehensive data from the infrastructure layer by using many data collection techniques, including NetStream and Network Quality Analyzer (NQA). The solution aggregates, analyzes, and visualizes the collected data to present a holistic view of the network from different angles and at different levels. At a higher level, the solution can present the overall network topology. At a lower level, the solution can present node health states, link quality and bandwidth usage, and traffic statistics about applications.
· Topology visibility—Collects and visualizes topology data to display the underlay topology and the topology of each overlay.
Figure 6 AD-WAN topology visibility
· Node visibility—Uses various techniques to collect and visualize diverse node information. This information includes node running states, software versions, and statistics about CPUs, memory, disk usage, and temperature.
Figure 7 Node visibility
· Link visibility—Obtains link quality information through the iNQA on-path flow analysis technique, and visually presents link bandwidth information, real-time traffic on links, and history link data.
Figure 8 Link traffic visibility
¡ Provides link traffic visibility based on the timeline, and supports displaying the real-time bandwidth and bandwidth usage.
¡ Predefines the time spans of the last 1 hour, 3 hours, 6 hours, 12 hours, 1 day, 1 week, and 1 month on the timeline, and allows you to flexibly customize the time span.
Figure 9 Link quality visibility
¡ Provides link quality visibility based on the timeline, and supports displaying the link latency, jitter, and packet loss ratio.
¡ Predefines the time spans of the last 1 hour, 3 hours, 6 hours, 12 hours, 1 day, 1 week, and 1 month on the timeline, and allows you to flexibly customize the time span.
· Multidimensional application visibility—Presents application-specific traffic statistics, service quality, and data paths based on tunnel data collection of SDWAN and iNQA.
Figure 10 Application visibility
· Issue visibility—Presents node, link, and application health states. The solution presents node issues such as loss of node connectivity, interface down events, persistent high CPU usage, persistent high memory usage, and overtemperature. The solution presents link issues such as poor link quality and link failure.
Figure 11 Issue visibility
Flexible QoS deployment
More and more enterprises are migrating their applications to the cloud. Cloud migration causes an explosion of WAN traffic, such as video conferencing, voice calls, email, and file downloads. Efficient use of WAN bandwidth has become a primary concern of enterprise IT.
Traditional QoS lacks a holistic view of the network, and network resource scheduling is limited to individual sites. Complex CLI commands pose many maintenance and operation risks. AD-WAN visualizes network-wide bandwidth resources and allows you to deploy QoS configuration to all devices, eliminating low efficiency and high risk. The QoS configuration guarantees key services and blocks or rate-limits illegitimate or low-priority traffic to enhance user experience.
You can apply traffic classifiers and rate limits in any direction to implement application-based rate limiting on device interfaces. If the bandwidth purchased from a service provider is smaller than the interface bandwidth, you can configure a rate limit on the WAN interface to prevent excess traffic from being dropped by the service provider. If traffic flows with multiple priorities exist in the network and congestion might occur, you can configure an assurance profile on the outgoing interface to the WAN to provide low latency for high-priority traffic by assigning it a high-priority queue and more bandwidth.
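The interface rate limit described above is essentially a token bucket: traffic conforming to the committed rate and burst size passes, and the excess is dropped or remarked. The following is a minimal Python sketch of that mechanism; the rate and burst values are illustrative, not taken from the solution.

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter, the mechanism behind
    CAR-style interface limits (committed rate plus burst size)."""

    def __init__(self, rate_bps: float, burst_bytes: float):
        self.rate = rate_bps / 8.0    # refill rate in bytes per second
        self.capacity = burst_bytes   # maximum burst size in bytes
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def allow(self, packet_bytes: int) -> bool:
        """Return True if the packet conforms to the rate limit."""
        now = time.monotonic()
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False  # non-conforming: drop or remark

# Example: limit a WAN interface to 2 Mbps with a 10 KB burst.
limiter = TokenBucket(rate_bps=2_000_000, burst_bytes=10_000)
```

A packet larger than the burst size never conforms, which is why the burst must be sized above the interface MTU in practice.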
Figure 12 AD-WAN QoS deployment
Application deep visibility (DPI)
Traditional packet inspection technology cannot examine packet information beyond Layer 4. As shown in Figure 13, it reads header information in packets, including source and destination addresses, source and destination port numbers, and protocols.
Figure 13 Traditional inspection on header information of IP packets
Traditional packet inspection technology cannot identify the applications of the following services:
· Services with changeable ports, for example, BT/EDK service that allows users to set port numbers themselves.
· Tunneling services using legitimate ports. For example, VoIP traffic transmitted through tunnels over port 80, which is typically open on the firewall.
· Services with constantly changing IP addresses.
· Interactive services using negotiated ports, for example, FTP, stream media, and VoIP.
Deep packet inspection (DPI), an advanced method of examining and managing network traffic, goes beyond packet headers. It identifies application types or content by examining the application layer payloads that cannot be detected by traditional packet identification approaches. When IP packets and TCP or UDP flows travel through a DPI-capable device, the DPI engine extracts the payloads to reconstruct the application layer information, thus identifying the application layer protocols.
Figure 14 DPI analysis on application signatures
The AD-WAN controller provides content signature-based identification of application layer protocols. Upon receiving a packet, it compares the packet payload with signatures in the application signature database to identify the application protocol.
AD-WAN provides a system-defined APR signature database with signatures of over 3000 applications, covering most mainstream applications. It also allows enterprise users to define signatures as needed to identify specific applications. The APR signature database supports continuous online updates and can adapt to future IT upgrades.
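Signature-based identification can be sketched as follows. This is an illustrative Python fragment, not the controller's implementation: the signature entries are hypothetical examples, and a real APR database covers thousands of applications with far richer matching rules than a payload prefix.

```python
# Hypothetical signature entries: (application name, byte pattern
# expected at the start of the application-layer payload).
SIGNATURES = [
    ("HTTP", b"GET "),
    ("HTTP", b"POST "),
    ("TLS",  b"\x16\x03"),   # TLS handshake record header
    ("SSH",  b"SSH-"),
]

def identify_application(payload: bytes) -> str:
    """Compare the reassembled payload against known signatures,
    independently of the port numbers the flow happens to use."""
    for app, pattern in SIGNATURES:
        if payload.startswith(pattern):
            return app
    return "unknown"
```

Because matching is on payload content, the same flow is recognized whether it runs on its well-known port or a negotiated one, which is exactly what port-based inspection cannot do.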
Resilient Intelligent Routing
Traditional WANs are unaware of network services. They rely only on routing protocols to select the optimal link to forward data traffic. If the networks provide links differentiated in quality, traditional routing protocols cannot select the most suitable link for an application based on the SLA requirements of that application. As a result, the networks cannot guarantee user experience.
When exceptions occur in a traditional network, operation and maintenance personnel must resolve the issues manually. Manual operation and maintenance not only imposes a heavy workload, but also introduces risks such as configuration deployment errors and accidental deletion of critical configurations. In short, traditional passive operation and maintenance cannot assure applications efficiently.
The AD-WAN branch solution provides the Resilient Intelligent Routing (RIR) feature to achieve distributed and intelligent routing for application traffic in real time. The feature uses the iNQA technology to detect the status of WAN links between the headquarters and each branch in real time. RIR uses the data collected by iNQA to restore the real network state and helps the devices to intelligently schedule traffic in a distributed manner in real time.
In order to meet the network requirements of applications and ensure best network experience for users, the AD-WAN branch solution supports RIR based on the following criteria:
· WAN selection policy—Specifies links in specific types of WAN networks in the policy. Links in each WAN network are assigned a priority. This policy is included in an SLA profile.
· Assurance profile—Supports flexible link selection based on the bandwidth requirements of applications. If the bandwidth usage of one link has reached the specified threshold, a network device will select another link for the application traffic.
· SLA profile—Supports link selection based on the link quality requirements of applications, including the requirements to link latency, jitter, and packet loss ratio. iNQA is used to detect the quality of links. A network device selects a link for the traffic of an application based on the iNQA link quality probe results so that the selected link can meet the quality requirements of that application.
· Time range—Supports time range-based policy deployment and link selection. The time range can be Absolute, Recurring, or Anytime.
· A combination of the above listed criteria. For example, the administrator can use one of the following criteria combinations:
¡ Assurance profile and SLA profile.
¡ Assurance profile, SLA profile, and time range.
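The interplay of these criteria can be illustrated with a simplified link selection routine. This Python sketch is not RIR's actual algorithm; the field names, the usage threshold, and the SLA keys are assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass
class Link:
    name: str
    priority: int           # from the WAN selection policy (lower = preferred)
    bandwidth_usage: float  # current usage ratio, 0.0 to 1.0
    latency_ms: float
    jitter_ms: float
    loss_pct: float

def select_link(links, sla, usage_threshold=0.8):
    """Pick the highest-priority link that satisfies both the SLA
    profile (quality) and the assurance profile (bandwidth headroom)."""
    candidates = [
        l for l in links
        if l.bandwidth_usage < usage_threshold        # assurance profile
        and l.latency_ms <= sla["max_latency_ms"]     # SLA profile
        and l.jitter_ms <= sla["max_jitter_ms"]
        and l.loss_pct <= sla["max_loss_pct"]
    ]
    # WAN selection policy: among qualifying links, prefer by priority.
    return min(candidates, key=lambda l: l.priority, default=None)
```

If the preferred link is congested or its measured quality drops below the SLA, the selection falls through to the next qualifying link, which mirrors the behavior described above.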
Figure 15 RIR summary
Investment protection
The AD-WAN branch solution is designed to help enterprises preserve the value of their existing network investments. Enterprises can deploy the AD-WAN branch solution in a brown field environment without having to replace the existing network devices.
The AD-WAN branch solution is supported by a wide range of H3C routers. These routers include all H3C SR6600, SR66-X, and MSR series routers. In addition, the solution is supported by F5000/F1000 series firewalls used as CPE security gateways of sites.
Figure 16 Network devices compatible with the AD-WAN branch solution
Solution deployment flow
The H3C AD-WAN branch solution focuses on automation, assurance of critical applications, and service availability. Its major features include zero-touch device deployment, automated network service provisioning, network and application visibility, and automated traffic distribution and engineering.
Figure 17 shows the solution deployment flow.
Figure 17 AD-WAN branch deployment flow
1. Network service deployment:
Design the network, for example, configure site, device, and network settings on the AD-WAN controller. Then generate the ZTP onboarding file (through USB drive or Email) for the target site based on the network design. After the devices in the site come onboard through zero-touch deployment, the controller automatically deploys underlay and overlay settings to the devices. In addition, the controller periodically collects device and link states through the control channel to present the network topology, device states, and link states.
2. Data collection to provide infrastructure visibility:
The controller collects and visualizes data to provide infrastructure visibility.
¡ Node and link visibility—The controller collects data from the network devices through NETCONF to present a holistic view of the network, including bandwidth usage of links, health states of the devices, and software versions of devices.
¡ Link quality—The controller uses iNQA to dynamically test links and present link quality based on link metrics including packet loss, latency, and jitter.
3. Application group configuration:
You define and assign applications that must be treated in the same way to an application group, and then configure an application group policy for the group.
¡ When you define applications, you can use their application signatures or packet matching criteria such as the IP five-tuple, DSCP, and VPN.
¡ When you configure the policy for an application group, you specify a set of service level agreement (SLA) profiles and their effective time ranges. An SLA contains service quality settings, such as minimum and maximum bandwidths and link quality.
The controller will automatically generate ACLs and rules to match and filter application packets and automatically enforce the policy to engineer the traffic in compliance with the SLA.
4. Path selection for applications:
With the built-in Resilient Intelligent Routing (RIR) feature, the devices automatically and independently select the best paths for applications in groups based on the application group policy and link affinity profiles issued by the controller.
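Step 3 of the flow above, matching traffic into application groups, can be sketched as follows. The group definitions and packet fields here are hypothetical examples; a real deployment relies on the controller-generated ACLs described in step 3, not user code.

```python
# Hypothetical application-group rules keyed on protocol, destination
# port, and DSCP. A rule field set to None means "any value".
APP_GROUPS = {
    "voice": {"protocol": "udp", "dst_port": 5060, "dscp": 46},
    "erp":   {"protocol": "tcp", "dst_port": 8443, "dscp": None},
}

def classify(packet: dict) -> str:
    """Assign a packet to the first application group whose
    matching criteria it satisfies, else the default group."""
    for group, rule in APP_GROUPS.items():
        if packet["protocol"] != rule["protocol"]:
            continue
        if packet["dst_port"] != rule["dst_port"]:
            continue
        if rule["dscp"] is not None and packet["dscp"] != rule["dscp"]:
            continue
        return group
    return "default"
```

Every packet in the same group then receives the same SLA profile and time-range treatment, which is why the group, not the individual flow, is the unit of policy.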
Solution capabilities
ZTP
Zero touch provisioning (ZTP) allows you to provision new devices with Internet connectivity and controller connection settings. With these settings, the devices can establish bidirectional TCP sessions with the AD-WAN controller, obtain configurations from the controller, and come online automatically. This simplifies configuration file editing for provisioning, achieves configuration pooling, reduces automated deployment cost, and reduces risks of misoperations.
AD-WAN supports ZTP through a USB drive, a URL, a DHCP server, or a public cloud.
Figure 18 Zero touch provisioning
Using a USB drive
1. The network administrator imports information about devices to deploy, including device names and device serial numbers, to the AD-WAN controller.
2. The field deployer inserts the USB drive into the devices one by one for the devices to load the information required for registration. The information includes WAN connectivity information (for example, PPPoE username and password), controller information (for example, controller address and password), and others.
Then, the devices attempt to register on the controller. After registration, the controller deploys remaining underlay network settings (for example, IPsec, management address, VPN service, and LAN service settings) to the devices automatically.
Using a URL
1. The network administrator imports information about devices to deploy, including device names and device serial numbers, to the AD-WAN controller.
2. The field deployer obtains the ZTP URL from the network administrator and sends the URL to the devices. The URL redirects to a script that contains WAN interface, network access, and VPN settings, and information about the controller.
Then, the devices obtain information through the URL and attempt to register on the controller. After registration, the controller deploys underlay network settings (for example, IPsec, management address, VPN service, and LAN service settings) to the devices automatically.
Using a DHCP server
1. The network administrator imports information about devices to deploy, including device names and device serial numbers, to the AD-WAN controller.
2. The network administrator predefines Option 253 on the DHCP server to carry the IP address of the AD-WAN controller.
3. The devices dynamically obtain the IP address of the WAN interface from the DHCP server based on the factory default settings.
4. The DHCP server assigns IP addresses to the devices, and carries Option 253 (containing the IP address of the AD-WAN controller) in the response packet.
5. Upon obtaining the IP addresses of the WAN interface and AD-WAN controller, the devices initiate new connections to the AD-WAN controller.
Then, the devices attempt to register on the controller.
6. After registration, the controller deploys underlay network settings (for example, IPsec, management address, and LAN service settings) to the devices automatically.
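The Option 253 exchange above hinges on the device parsing a vendor-specific TLV option out of the DHCP response. Below is a minimal Python sketch of that parsing, assuming Option 253 carries a 4-byte IPv4 address; the option encoding is an assumption for illustration.

```python
def parse_dhcp_options(options: bytes) -> dict:
    """Walk the TLV-encoded DHCP options field and return
    a {option_code: value_bytes} mapping."""
    result, i = {}, 0
    while i < len(options):
        code = options[i]
        if code == 255:        # End option
            break
        if code == 0:          # Pad option (no length byte)
            i += 1
            continue
        length = options[i + 1]
        result[code] = options[i + 2 : i + 2 + length]
        i += 2 + length
    return result

def controller_address(options: bytes):
    """Extract the AD-WAN controller address from Option 253,
    assumed here to hold a single IPv4 address."""
    value = parse_dhcp_options(options).get(253)
    if value and len(value) == 4:
        return ".".join(str(b) for b in value)
    return None
```

Once the address is recovered, the device opens its registration connection to the controller, as in step 5 above.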
Using a public cloud
IMPORTANT: This section uses H3C Oasis cloud as an example.
1. The network administrator imports information about devices to deploy, including device names and device serial numbers, to the AD-WAN controller.
2. The network administrator imports device information, including device serial numbers, controller address, and device registration password, to the Oasis platform.
The devices start up with the factory default settings and connect to the Oasis platform automatically. The Oasis platform deploys AD-WAN information to the devices. Then, the devices attempt to register on the controller.
After registration, the controller deploys underlay network settings (for example, IPsec, management address, and LAN service settings) to the devices automatically.
Automated service deployment
The AD-WAN solution supports automated deployment of VPN services, LAN services, and QoS services across the whole network. You can define applications based on the IP 5-tuple, DSCP, VPN information, and application-layer packet characteristics. You can define policies based on route selection, bandwidth and service quality requirements, and the time range.
VPN service deployment
After a device comes online, the AD-WAN controller automatically deploys IPsec tunnel settings if the WAN interface of the device connects to the Internet. Meanwhile, the AD-WAN controller creates an SDWAN tunnel for each WAN link to provide consistent scheduling, irrespective of their link types.
LAN service deployment
The AD-WAN controller automatically deploys gateway settings for the LAN side, eliminating the need to configure branch routers one by one manually. If a Layer 3 gateway exists in a branch, you only need to deploy IP addresses for the LAN interfaces on the branch router.
QoS service deployment
The AD-WAN controller provides the capability to deploy end-to-end QoS rapidly based on applications. You can deploy QoS services rapidly based on the required QoS in the WAN.
The QoS service supports global or interface-specific flow policy templates or CAR rate limiting templates to enable QoS customization, reduce QoS configuration effort, and simplify O&M procedures.
Application policy deployment
AD-WAN supports defining applications based on IP 5-tuple and application characteristics. An application group can contain multiple applications. You can configure the expected route, packet loss, latency, and jitter for applications and apply an application policy to implement automatic route selection and traffic optimization.
This section includes the following aspects:
· SLA policy
You can configure service quality parameters (for example, latency, jitter, packet loss) for latency-sensitive services.
· Time range
You can use time ranges to implement different scheduling policies during different periods. There are three types of time ranges: Absolute, Recurring, and Anytime.
· RIR
The solution supports intelligent traffic engineering based on link bandwidth usage and application preference.
RIR implements traffic engineering as follows:
¡ When the bandwidth usage of a link exceeds the lower threshold and another optimal link is available, RIR does not assign new application traffic to this link.
¡ When the bandwidth usage of a link exceeds the upper threshold, RIR assigns application traffic with low priority (associated with an SLA level) on this link to another link whose bandwidth usage is below the lower threshold.
RIR performs traffic engineering after a specific engineering period.
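One engineering pass over the two thresholds described above might look like the following Python sketch. The data model and the use of the SLA level as a priority key are assumptions for illustration, not RIR's actual implementation.

```python
def rir_pass(links, lower=0.7, upper=0.9):
    """Sketch of one RIR engineering period (threshold values assumed):
    a link above the upper threshold sheds its lowest-priority flow to
    a link whose usage is below the lower threshold."""
    moves = []
    for link in links:
        if link["usage"] > upper and link["flows"]:
            targets = [t for t in links
                       if t is not link and t["usage"] < lower]
            if not targets:
                continue
            # Larger sla_level = lower priority (assumed convention).
            flow = max(link["flows"], key=lambda f: f["sla_level"])
            target = min(targets, key=lambda t: t["usage"])
            link["flows"].remove(flow)
            target["flows"].append(flow)
            moves.append((flow["name"], link["name"], target["name"]))
    return moves
```

Links between the lower and upper thresholds are left alone; they simply stop accepting new flows, which matches the first rule above.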
For the SeerEngine-SDWAN controller component, the application group is the smallest unit for scheduling. To analyze each application in an application group, use SeerAnalyzer. SeerAnalyzer also allows you to perform deep analysis on applications.
Network and application visibility
The H3C AD-WAN branch solution collects comprehensive data from the infrastructure layer by using many data collection techniques, including NetStream and Network Quality Analyzer (NQA). The solution aggregates, analyzes, and visualizes the collected data to present a holistic view of the network from different angles and at different levels. At a higher level, the solution can present the overall network topology. At a lower level, the solution can present node health states, link quality and bandwidth usage, and traffic statistics about applications.
· Topology visibility—Collects and visualizes topology data to display the underlay topology and the topology of each overlay.
· Node visibility—Uses various techniques to collect and visualize diverse node information. This information includes node running states, software versions, and statistics about CPUs, memory, disk usage, and temperature.
· Link visibility—Presents link bandwidth statistics, live traffic distribution on links, and history link data based on data collected through NetStream. The solution also presents the quality of links based on data collected dynamically from NQA about metrics such as packet loss, jitter, and latency.
· Multidimensional application visibility—Presents application-specific traffic statistics, service quality, and data paths based on data collected through NetStream and NQA.
· Issue visibility—Presents node, link, and application health states. The solution presents node issues such as loss of node connectivity, interface down events, persistent high CPU usage, persistent high memory usage, and overtemperature. The solution presents link issues such as poor link quality and link failure. The solution also presents application issues such as service quality lower than expectation.
Traffic engineering
AD-WAN supports link-based, quality-based, bandwidth-based, and time-range-based link selection and traffic engineering.
· Link-based—Selects the specified links for an application based on link priorities.
· Bandwidth-based—Selects links based on the bandwidth requirements of an application. When the bandwidth usage of an application exceeds a threshold, the device automatically selects links that meet the requirements of the application.
· Quality-based—Selects links based on the SLA of an application (such as latency, jitter, and packet loss).
· Time-range-based—Selects links based on the time range.
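As a sketch of the quality-based case, comparing measured link metrics against an application's SLA can be modeled as follows. The field names and the fallback behavior are illustrative assumptions, not the product's algorithm.

```python
def link_meets_sla(link_metrics, sla):
    """Return True if measured link quality satisfies the application SLA.
    Both arguments map metric name -> value; SLA values are upper limits."""
    return all(link_metrics.get(metric, float("inf")) <= limit
               for metric, limit in sla.items())

def select_link(links, sla):
    """Pick the highest-priority link that satisfies the SLA; if none
    qualifies, fall back to the highest-priority link overall."""
    qualified = [l for l in links if link_meets_sla(l["metrics"], sla)]
    pool = qualified or links
    return min(pool, key=lambda l: l["priority"])["name"]
```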
High availability
The AD-WAN branch solution provides high availability from multiple dimensions (as shown in Figure 19) and uses RIR to implement intelligent service scheduling without requiring controller intervention. The solution ensures service continuity and high reliability upon controller failures and improves robustness of the WAN.
Figure 19 Dimensions of high availability
High availability for controllers
The AD-WAN branch solution enables you to deploy AD-WAN controllers in the following modes for high availability:
· Local distributed cluster—A cluster consists of 3 + N servers. The northbound access of users to the controllers is distributed to the northbound service microservice through load balancing. The southbound connections between devices and controllers are load balanced across different cluster members through the southbound Websocket microservice. The critical microservices are deployed on multiple cluster members for redundancy. The cluster members share service data. When a cluster member fails, another cluster member can take over to ensure service continuity.
Figure 20 Local distributed cluster
· Remote disaster recovery—Each site runs a complete cluster of three controllers that back up one another and use the ODL Akka clustering mechanism for data synchronization. A remote disaster recovery system provides primary/backup disaster recovery between two sites in different locations. When the system is running correctly, the primary site provides services and periodically performs a full backup of the database files to the backup site or a file server. When the primary site fails (for example, because of a site power failure, a site network failure, or a link failure between the site and the external networks), you can manually switch over to the backup site from the Web interface. The backup site uses the database files to recover data and takes over the services of the failed site to maintain service continuity.
Figure 21 Remote disaster recovery
The 3 + 3 remote disaster recovery system enables two clusters to be deployed in different locations. The two clusters maintain data consistency by synchronizing the database files. When the primary site fails, you can quickly switch to the backup site to recover the environment.
High availability for sites
The AD-WAN branch solution supports dual-gateway deployment for site nodes to implement high availability.
Dual-gateway deployment uses VRRP or runs a routing protocol to provide redundant gateway and forwarding services for LAN users. The two gateway devices share link quality, bandwidth, bandwidth usage, and link selection priority information. Based on the shared information, each device selects the optimal link for applications.
Figure 22 Gateway redundancy for a site
High availability for transmission links
The AD-WAN branch solution deploys multiple WAN links to transmit data between the headquarters and each branch site. If one device or link fails, traffic can be switched to another device or link. By using traffic engineering, WAN link redundancy can assign specific service traffic to specific links and can implement load balancing to improve data transmission efficiency.
High availability for site access
The AD-WAN branch solution supports connecting a branch site to multiple headquarters sites.
For branch sites and multiple headquarters sites in the same WAN, the controller by default automatically deploys SDWAN tunnels between the branch and headquarters sites based on the physical WAN links. Users at the branch side can access services from multiple headquarters sites. If the access device or link of a headquarters site fails, users can still access services provided that LAN connectivity is available across the headquarters sites.
Network extension to the public cloud
The AD-WAN branch solution provides the ability to extend the enterprise network into the public cloud for cloud-hosted applications.
In this solution, a public cloud is a site that hosts resources for a tenant in a virtual private cloud (VPC). Each tenant VPC deploys a virtual services router (VSR) to establish overlay tunnels over the Internet with other sites for the tenant users to access their VPC resources.
Figure 24 Network extension to the public cloud
WAN optimization
The AD-WAN branch solution supports the following WAN optimization features:
· WAAS—Provides transport-layer optimization services: Transport Flow Optimization (TFO), Data Redundancy Elimination (DRE), and Lempel-Ziv (LZ) compression.
· Link bundling—Assigns packets of a flow to multiple links to improve bandwidth usage.
· Web cache—Enables a network node to cache a webpage the first time a user accesses it through HTTP or HTTPS. The node then responds to subsequent requests for this webpage on behalf of the Web server until the cached data times out.
· Forward error correction—FEC is an error correction mechanism in which the sender transmits redundant packets for error correction along with data packets so the receiver can restore data lost on the link. The higher the redundancy level, the lower the transmission efficiency. FEC types include determined FEC (D-FEC) and adaptive FEC (A-FEC).
The WAN optimization features can resolve high delay issues, reduce bandwidth pressure, and improve response speed for better user experience. In addition, the features can ensure low packet loss rate by using more bandwidth, greatly reducing impacts on real-time applications (such as video conferencing) over low performance links.
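The FEC idea of trading redundancy for recoverability can be illustrated with the simplest possible scheme: one XOR parity packet per group of data packets, which lets the receiver rebuild a single lost packet. This is a didactic sketch only; the actual FEC coding used by the solution is not specified here.

```python
def make_parity(packets):
    """Build one redundancy packet as the byte-wise XOR of a group of
    equal-length data packets (the simplest FEC-style parity)."""
    parity = bytearray(len(packets[0]))
    for pkt in packets:
        for i, byte in enumerate(pkt):
            parity[i] ^= byte
    return bytes(parity)

def recover_lost(received, parity):
    """Rebuild a single lost packet (marked None) by XOR-ing the parity
    with every packet that did arrive. Returns None if zero or more than
    one packet is missing, which this simple scheme cannot repair."""
    if sum(p is None for p in received) != 1:
        return None
    out = bytearray(parity)
    for pkt in received:
        if pkt is not None:
            for i, byte in enumerate(pkt):
                out[i] ^= byte
    return bytes(out)
```

Smaller groups (more parity packets per data packet) recover more loss at the cost of transmission efficiency, which is the redundancy trade-off described above.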
Security service deployment
The AD-WAN branch solution integrates the AD-WAN controller with the security controller, provides unified web pages based on U-Center, and offers a wide range of security capabilities.
The two controllers provide different sets of capabilities and work in tandem.
· The AD-WAN controller provides device onboarding and basic network deployment capabilities.
· The security controller provides automated deployment for security zones and security policies, as well as automated security function deployment and unified management capabilities for CPEs at branch sites.
The following security services can be deployed:
· Stateful firewall.
· IPS.
· URL filtering.
· Anti-virus.
Network access policies
The AD-WAN branch solution provides the following network access policies for sites based on different network access control requirements:
· Centralized network access—Forwards Internet access traffic from a specific site to the dedicated gateway (redundancy supported) at the headquarters over the overlay network.
· Local network access—Uses the local gateway to access the Internet.
· Centralized and local network access for redundancy—Configures centralized network access and local network access to back up each other.
· Hybrid network access—Forwards traffic of specific applications through the local gateway, and forwards the remaining traffic to the dedicated gateway at the headquarters.
Figure 25 Network access policies
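The policies above reduce to a per-flow egress decision. The sketch below models only the local, hybrid, and centralized cases with hypothetical field names; it is not the controller's data model.

```python
def egress_for(app, policy):
    """Pick the egress for an application's Internet traffic under a
    site's network access policy (illustrative model only)."""
    if policy["mode"] == "local":
        return "local-gateway"
    if policy["mode"] == "hybrid" and app in policy.get("local_apps", ()):
        # Hybrid access: listed applications break out locally.
        return "local-gateway"
    # Centralized access: tunnel to the dedicated headquarters gateway.
    return "hq-gateway"
```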
Analysis for sites
The AD-WAN branch solution implements site-based and inter-site link and application statistics to provide WAN network status for customers and facilitate fault location.
· Site-based analysis—Provides performance monitoring analysis for a site from the aspects of health status of the site, links to other sites, and applications to other sites.
· Inter-site analysis—Provides data analysis between two sites, and helps customers obtain site status in the WAN network from a more specific perspective.
Application scenarios
Figure 26 shows the topology models supported by the H3C AD-WAN branch solution.
The solution supports multiple traffic models for organizations that must connect a large number of geographically distributed branches to the headquarters. The solution also enables automated deployment, network visibility, and ease of operations.
Figure 26 H3C AD-WAN branch solution topology models
Flat HQ-branch network
As shown in Figure 27, the headquarters communicate with branch sites and branch sites communicate with one another through SDWAN tunnels. To prevent failure of a CPE from interrupting forwarding, the headquarters typically deploy dual CPEs using technologies such as VRRP. CPEs can be deployed from the controller to simplify network deployment. The SDWAN tunnels between sites can be set up over multiple types of links, such as MPLS L3VPN, MSTP, Internet, and 4G/5G.
Figure 27 Flat HQ-branch network
Flat multi-HQs-branch network
As shown in Figure 28, SDWAN tunnels are established to enable communication between the headquarters sites, between the headquarters sites and branch sites, and between branch sites. To prevent failure of a CPE from interrupting forwarding, the headquarters typically deploy dual CPEs using technologies such as VRRP. The SDWAN tunnels between sites can be set up over multiple types of links, such as MPLS L3VPN, MSTP, Internet, and 4G/5G.
Figure 28 Flat multi-HQs-branch network
Multi-tiered network
As shown in Figure 29, in a large enterprise network, distribution devices set up VPN or Internet links with both the headquarters sites and the branch sites. This network model reduces costs. The SDWAN tunnels between sites can be set up over multiple types of links, such as MPLS L3VPN, MSTP, Internet, and 4G/5G.
Figure 29 Multi-tiered network
Cloud management network
As shown in Figure 30, small- and medium-sized service providers provide SDWAN O&M services and user-side CPEs are purchased or rented. A unified controller supporting multi-tenant management provides SDWAN management services for different enterprises. An enterprise as a tenant leases SDWAN services provided by the service providers (self-purchased/leased devices + leased SDWAN software). In this scenario, the enterprise tenant can control the SDWAN services of all sites within the enterprise, but cannot see the services of other tenants.
Figure 30 Cloud management network
Service provider PoP network
Points of presence (PoPs) are deployed at the edge of the service provider/managed service provider (MSP) backbone, and SDWAN services are deployed on the PoPs for accessing enterprise sites. PoP gateways are interconnected through an overlay on top of the backbone network, and the PoPs in different locations form a PoP backbone network.
Through the SDN controller and Internet links or local low-cost VPNs, SDWAN tunnels are deployed between the CPEs of enterprise branch sites and PoP gateways to direct traffic from the enterprise branch sites to the PoP network. The cross-location PoP backbone network enables low-cost, high-quality interconnection between dispersed sites.
By combining the PoP network with SDWAN technologies, service providers can offer a new type of WAN and carrier-grade SDWAN services featuring high quality, low cost, rapid deployment, and reliability for cross-location access users.
Figure 31 Service provider PoP network
Multi-tiered service provider PoP network
As shown in Figure 32, service provider PoPs are grouped into different regions, and PoP gateways in different regions communicate via a regional PoP gateway. This network model applies to inter-country or inter-region communication.
Figure 32 Multi-tiered service provider PoP network
Key technologies
The AD-WAN solution adopts various technologies and methods to implement application-driven WAN. This chapter describes the key technologies of AD-WAN, including the core ideas, critical running platforms, and important southbound and northbound APIs. All these technologies enable AD-WAN to be an open, flexible, easy-to-use, and reliable platform that is compatible with legacy networks. AD-WAN enables legacy networks to be smoothly migrated to the SDN architecture, and thus quickly meets changing application requirements of users.
Core idea of AD-WAN: SDN
SDN has been widely applied and deployed in data center and campus networks, where it empowers networks with flexible programmability, dynamic awareness, and automatic orchestration so that the networks can continuously adapt to application requirements. Although WANs differ from data center and campus networks in many aspects, they all exist to meet user requirements, so introducing SDN into WANs is a natural choice for the AD-WAN solution. After being introduced to enterprise WANs, SDN enables the WANs to be deployed and changed on demand and to become new WANs with visibility, traffic engineering, manageability, and reliability.
AD-WAN architecture: Microservices
Microservice architecture
Microservices were introduced at a software architecture conference in Venice in May 2011 to describe common architectural design principles. Microservices have the following features:
· A number of independent services that together form an application system.
· Each service is deployed separately and runs independently in its own process.
· The services are managed in a distributed way.
Microservices focus on independent deployment and independent services. Microservices are divided based on the low coupling and high cohesion principles.
Controller service architecture
As shown in Figure 33, the controller service architecture consists of the following components:
· Infrastructures—Containerized solution using K8s and Docker.
· Northbound load balancing—Nginx-based soft load balancing component.
· Southbound load balancing—Nginx-based soft load balancing component.
· WebSocket service—Maintains the connections between devices and controllers.
· Yang-tools—Tools and libraries that provide Java runtime support for the YANG language and the data structures modeled in YANG. YANG tools support the serialization and deserialization formats defined in IETF drafts and standards.
· Caching—Redis, an open source in-memory data store written in ANSI C. Redis is accessible over the network, offers memory-based persistence with logging, stores data as key-value pairs, and provides APIs in multiple languages.
· Message queuing—Kafka, a distributed stream processing platform for publishing and subscribing to message streams, storing data streams in a fault-tolerant and persistent manner, and processing data streams as they arrive.
· Database—PostgreSQL, a feature-rich object-relational database management system (ORDBMS). It is based on POSTGRES version 4.2, developed by the computer science department of the University of California, Berkeley.
· Service orchestration service—Network deployment microservice.
· O&M service—O&M microservice.
· Device management service—Basic network element management microservice.
· Value-added services—Microservices providing application groups and policy management.
Figure 33 Controller service architecture
Microservice development by using Springboot and Dubbo
The controller microservices are developed by using Springboot and Dubbo, which speed up microservice development.
· Spring—Spring was created to address the complexity of enterprise application development by using basic JavaBeans to do things that were previously only possible with EJBs. Any Java application can benefit from Spring in terms of simplicity, testability, and loose coupling.
· Springboot—Springboot provides an out-of-the-box setup for the Spring platform and third-party libraries: sensible default settings are provided, and starter packages hold the default configuration. Most Springboot applications require very little Spring configuration.
· Dubbo—Apache Dubbo is a high-performance, lightweight open source service framework. It provides the following functions:
¡ Interface agent-oriented high-performance RPC calls.
¡ Intelligent fault tolerance and load balancing.
¡ Automatic registration and discovery of services.
¡ Scalability.
¡ Run-time traffic scheduling.
¡ Visual service governance and O&M.
Dubbo can provide microservice scheduling, visualization, horizontal expansion, and distributed management that are required by microservices.
Northbound APIs of AD-WAN: RESTful APIs
About REST & RESTful
Representational state transfer (REST) is a software architectural style that defines a set of constraints to be used for creating Web services. REST first appeared in the PhD thesis of Roy Fielding (the principal author of the HTTP specifications) in 2000.
REST is a software architectural style rather than a standard. REST defines a set of constraints and principles, and application programs and designs conforming to these constraints and principles are considered RESTful.
Figure 34 RESTful constraints and principles
RESTful APIs have become the mainstream application programming interfaces
When the REST constraints are applied to an application, the result is a simple, scalable, efficient, secure, and reliable architecture. A RESTful API is simple and lightweight and can directly transmit data over HTTP, which simplifies the client and server implementation.
Among the three mainstream Web interaction services, REST is simpler than Simple Object Access Protocol (SOAP) and XML-RPC. REST uses a simpler, more lightweight method to design and implement URL processing and payload encoding.
In REST definitions, the server uses a fixed, unique URI to identify a resource (for example, application program object, database record, algorithm, or document) and the resource is open to clients. Then, the resource-centric Web applications can use URIs to open resources and provide resource operation services.
Northbound APIs provided by AD-WAN
The H3C AD-WAN controller provides rich northbound RESTful APIs for users to invoke and customize. The northbound RESTful APIs include the following categories:
Function category | Function description
----------------- | --------------------
Device management | Allows you to add, delete, update, and query devices and device templates.
Topology management | Allows you to obtain topology information; add, delete, update, and query paths; and import and query the status of links.
Alarm management | Allows you to obtain alarm information, acknowledge and update alarms, export alarms, and configure alarm sending, threshold, and clearing settings.
O&M | Allows you to obtain the current and history information about link/tunnel quality and bandwidth.
Resource management | Allows you to add, delete, update, and query address pools.
QoS | Allows you to add, delete, update, and query QoS policies.
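A northbound call is an ordinary HTTP request against a resource URI. The sketch below builds (without sending) two such requests with Python's standard library; the controller address, token header, and resource paths are placeholders for illustration, not documented SeerEngine-SDWAN endpoints.

```python
import json
import urllib.request

# Placeholder controller address; real deployments use their own host/port.
CONTROLLER = "https://controller.example.com:8443"

def build_device_query(token, site_id):
    """Build (but do not send) a RESTful GET that queries devices at a site."""
    url = f"{CONTROLLER}/api/v1/devices?site={site_id}"
    return urllib.request.Request(
        url,
        headers={"Accept": "application/json", "X-Auth-Token": token},
        method="GET",
    )

def build_device_create(token, device):
    """Build a RESTful POST that adds a device from a JSON body."""
    return urllib.request.Request(
        f"{CONTROLLER}/api/v1/devices",
        data=json.dumps(device).encode(),
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
        method="POST",
    )
```

Sending either request is a single `urllib.request.urlopen(req)` call once the controller address and authentication token are real.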
Southbound APIs of AD-WAN: NETCONF
About NETCONF
Networks are increasing in complexity and capacity, as well as in the density of the services deployed on them. The traditional network management protocol SNMP cannot meet the network management requirements (especially the configuration management requirements) of complex networks. To solve this problem, IETF set up the NETCONF working group in May 2003 to develop the XML-based network configuration protocol NETCONF. The working group published the NETCONF protocol in RFC 4741 through RFC 4744 in December 2006.
Figure 35 NETCONF protocol architecture
Similar to most protocols, NETCONF adopts a layered structure, as shown in the figure above. The layered structure includes the content layer, operation layer, RPC layer, and application protocol layer. Each layer encodes the protocol data in a certain aspect and provides services to its upper layer. The layered architecture enables each layer to focus on only a specific aspect of the protocol and is easier to implement. Additionally, the layered structure decouples layers from each other and minimizes the influence of the internal implementation changes of a layer on the other layers.
· Content layer—Indicates the configuration data, which is different from SNMP MIBs. NETCONF uses XML to encode the configuration data and protocol messages. XML can represent a complex, modeled management object with internal logics. Additionally, XML is an international standard proposed by W3C, and is supported by many software vendors, which facilitates data exchange and development.
· Operation layer—Defines a set of base operations invoked as RPC methods with XML-encoded parameters. NETCONF base operations include data retrieval operations, configuration operations, lock operations, and session operations. The get and get-config operations retrieve device configuration and state information. The edit-config, copy-config, and delete-config operations configure device parameters. The lock and unlock operations lock and unlock device configuration to prevent multiple users from modifying the configuration concurrently. The close-session and kill-session operations are upper-layer operations used to terminate a NETCONF session.
· RPC layer—Provides a simple, transport-independent framing mechanism for encoding RPCs. The <rpc> and <rpc-reply> elements enclose NETCONF requests and responses (data at the operation layer and the content layer). Typically, the <rpc-reply> element encloses the data requested by the client or a confirmation that the requested configuration succeeded. When the request from a client contains errors or the server fails to process the request, the server encloses an <rpc-error> element containing the error details in the <rpc-reply> element and returns it to the client.
· Application protocol layer—NETCONF runs over reliable, secure, connection-oriented transport protocols. At this layer, RFC 4742, RFC 4743, and RFC 4744 define NETCONF over SSH, NETCONF over SOAP, and NETCONF over BEEP, respectively. These schemes secure network connections through encryption and authentication.
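The RPC-layer framing described above can be shown concretely: a <get-config> operation wrapped in an <rpc> element in the NETCONF base namespace. This minimal sketch builds the XML with Python's standard library; it constructs the message only and performs no transport.

```python
import xml.etree.ElementTree as ET

# NETCONF base namespace from RFC 4741.
NS = "urn:ietf:params:xml:ns:netconf:base:1.0"

def build_get_config(message_id="101", source="running"):
    """Frame a <get-config> request at the NETCONF RPC layer."""
    rpc = ET.Element(f"{{{NS}}}rpc", {"message-id": message_id})
    get_config = ET.SubElement(rpc, f"{{{NS}}}get-config")
    src = ET.SubElement(get_config, f"{{{NS}}}source")
    ET.SubElement(src, f"{{{NS}}}{source}")   # e.g. <running/>
    return ET.tostring(rpc, encoding="unicode")
```

The server's answer arrives in a matching <rpc-reply> element carrying the same message-id.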
NETCONF applications in AD-WAN
As the most widely used southbound protocol, NETCONF plays an important role in the AD-WAN solution. It enables the AD-WAN controllers to retrieve device configuration and to configure devices. Because of the good scalability of NETCONF, you can flexibly define the management content exchanged between controllers and devices as needed, without long and tedious discussions about standards. With NETCONF, you can implement richer control actions on devices and quickly deliver a solution that meets user requirements.
iNQA quality detection
About iNQA
Intelligent Network Quality Analyzer (iNQA) is an H3C-proprietary protocol that allows you to measure network performance quickly in large-scale IP networks. iNQA supports measuring packet loss and latency on forward, backward, and bidirectional flows. The packet loss data includes number of lost packets, packet loss rate, number of lost bytes, and byte loss rate. The measurement results help you to know when and where the packet loss occurs, the event severity level, and link latency.
Concepts
Figure 36 shows the important iNQA concepts including MP, collector, analyzer, and AMS.
Collector
The collector manages MPs, collects data from MPs, and reports the data to the analyzer.
Analyzer
The analyzer collects the data from collector instances and summarizes the data.
Target flow
A target flow is vital for iNQA measurement. You can specify a flow by using any combination of the following items: source IPv4 address/segment, destination IPv4 address/segment, protocol type, source port number, destination port number, and DSCP value. Using more items defines a more explicit flow and generates more accurate analysis data.
MP
An MP is a logical concept. An MP counts statistics and generates data for a flow. To measure packet loss and latency on an interface of a collector, bind an MP to the interface.
An MP contains the following attributes:
· Measurement location of the flow.
¡ An ingress point refers to the point that the flow enters the network.
¡ An egress point refers to the point that the flow leaves the network.
¡ A middle point refers to the point between an ingress point and egress point.
· Flow direction on the measurement point.
A flow entering the MP is an inbound flow, and a flow leaving the MP is an outbound flow.
AMS
Configured on the analyzer, an AMS defines a measurement span for point-to-point performance measurement. You can configure multiple AMSs for an instance, and each AMS can be bound to MPs on any collector of the same instance. Therefore, iNQA can measure and summarize the data of the forward flow, backward flow, or bidirectional flows in any AMS.
Each AMS has an ingress MP group and egress MP group. The ingress MP group is the set of the ingress MPs in the AMS and the egress MP group is the set of the egress MPs.
Instance
The instance allows measurement on a per-flow basis. In an instance, you can configure the target flow, flow direction, MPs, and measurement interval.
On the collector and analyzer, create an instance of the same ID for the same target flow. An instance can be bound to only one target flow. On the same device, you can configure multiple instances to measure and collect the packet loss rate and latency of different target flows.
Flag bit
Flag bits, also called color bits, are used to distinguish target flows from unintended traffic flows.
iNQA uses ToS field bits 5 to 7 in the IPv4 packet header as the packet loss flag bit and latency flag bit.
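Coloring is plain bit manipulation on the ToS byte. In the sketch below, bit numbering follows the document (bit 0 is the most significant bit), so ToS bits 5 to 7 map to the three least significant bits of the byte; which specific bit carries each flag is an illustrative assumption.

```python
# ToS bit i (document numbering, bit 0 = MSB) has value 1 << (7 - i),
# so bits 5-7 are the mask 0b00000111. The exact bit assigned to each
# flag below is an assumption for illustration.
LOSS_FLAG = 1 << 2      # ToS bit 5
LATENCY_FLAG = 1 << 1   # ToS bit 6

def color(tos, flag):
    """Mark a packet's ToS byte with a flag bit."""
    return tos | flag

def decolor(tos, flag):
    """Clear a flag bit, restoring the original ToS byte."""
    return tos & ~flag & 0xFF

def is_colored(tos, flag):
    """Test whether a packet belongs to the colored target flow."""
    return bool(tos & flag)
```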
Operating mechanism
iNQA uses the model of multi-point collection and single-point calculation. Multiple collectors collect and report the packet data periodically and one analyzer calculates the data periodically.
Before starting iNQA packet loss measurement, make sure all collectors are time synchronized through NTP or PTP so that they use the same measurement interval to color the flow and report the packet statistics to the analyzer. As a best practice, synchronize the time of the analyzer and all collectors to facilitate management and maintenance.
iNQA packet loss measurement mechanism
The number of incoming packets and that of outgoing packets in a network should be equal within a time period. If they are not equal, packet loss occurs in the network.
As shown in Figure 37, the network uses an external clock source and synchronizes time through NTP. The flow enters the network from MP 100, passes through MP 200, and leaves the network from MP 300. The devices where the flow passes are collectors and NTP clients, and the aggregation device is the analyzer and NTP server.
The iNQA packet loss measurement works as follows:
1. The analyzer synchronizes the time with all collectors through NTP.
2. The ingress MP on collector 1 identifies the target flow. It colors and decolors the packets in the flow alternately at intervals, and periodically reports the packet statistics to the analyzer.
3. The middle MP on collector 2 identifies the target flow and reports the packet statistics to the analyzer periodically.
4. The egress MP on collector 3 identifies the target flow. It decolors the colored packets and reports the packet statistics to the analyzer periodically.
5. The analyzer calculates packet loss for the flow of the same period and same instance as follows:
Number of lost packets = Number of incoming packets on the MP – Number of outgoing packets on the MP
Packet loss rate = (Number of incoming packets on the MP – Number of outgoing packets on the MP) / Number of incoming packets on the MP
The analyzer calculates byte loss in a similar way to packet loss.
For the end-to-end measurement, the data from the ingress MP and egress MP is used.
For the point-to-point measurement, the analyzer calculates the result on a per-AMS basis.
· In AMS 1: Packet loss = Number of packets at MP 100 – Number of packets at MP 200
· In AMS 2: Packet loss = Number of packets at MP 200 – Number of packets at MP 300
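The calculations above translate directly into code. The sketch assumes ordered per-MP packet counts for one measurement interval of one instance; the field names are illustrative.

```python
def packet_loss(in_count, out_count):
    """Lost packets and loss rate between two points, per the formulas above."""
    lost = in_count - out_count
    rate = lost / in_count if in_count else 0.0
    return lost, rate

def ams_losses(counts):
    """Point-to-point loss per AMS from ordered MP counts, e.g.
    {"MP100": n1, "MP200": n2, "MP300": n3} along the flow path."""
    mps = list(counts.items())
    return {f"{a}->{b}": packet_loss(count_a, count_b)[0]
            for (a, count_a), (b, count_b) in zip(mps, mps[1:])}
```

End-to-end loss uses only the ingress and egress MP counts, for example `packet_loss(counts["MP100"], counts["MP300"])`.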
iNQA latency measurement mechanism
As shown in Figure 37, iNQA latency measurement works as follows:
1. Collector 1 records the send timestamp SndTime 1 by coloring the first packet within an interval.
2. Collector 2 identifies the colored packet to record the receive timestamp RcvTime 2 and the send timestamp SndTime 2.
3. Collector 3 identifies the colored packet to record the receive timestamp RcvTime 3 and the send timestamp SndTime 3.
4. The network latency in the interval is calculated based on the following expression:
Latency = RcvTime 2 + RcvTime 3 - SndTime 1 - SndTime 2
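The expression sums the two hop latencies (RcvTime 2 - SndTime 1 and RcvTime 3 - SndTime 2), so the dwell time on the middle node is excluded. A one-line sketch:

```python
def path_latency(snd1, rcv2, snd2, rcv3):
    """Latency = RcvTime 2 + RcvTime 3 - SndTime 1 - SndTime 2, i.e. the
    sum of the two per-hop latencies, excluding middle-node dwell time."""
    return (rcv2 - snd1) + (rcv3 - snd2)
```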
iNQA applications in AD-WAN
In the AD-WAN branch solution, iNQA flow detection is used to detect packet loss, latency, and jitter of each SDWAN tunnel between sites based on connectivity detection and service flow detection.
SLA-based service quality evaluation
Introduction to SLA
A Service Level Agreement (SLA) is a contract between a service provider and a subscriber that defines the level of performance expected from the service provider. In the AD-WAN branch solution, pre-defined SLA profiles of different levels are available to apply to different applications.
Seven SLA levels are available with priorities from 1 through 7. Each SLA level defines default settings for service assurance parameters, and the settings are configurable.
Figure 38 SLA levels provided by the controller
Introduction to IP precedence and DSCP values
As shown in Figure 39, the ToS field in the IP header contains 8 bits. The first 3 bits (0 to 2) represent IP precedence from 0 to 7. According to RFC 2474, the ToS field is redefined as the differentiated services (DS) field. A DSCP value is represented by the first 6 bits (0 to 5) of the DS field and is in the range 0 to 63. The remaining 2 bits (6 and 7) are reserved.
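With bit 0 as the most significant bit of the ToS byte, both fields are simple shifts, sketched below.

```python
def ip_precedence(tos):
    """IP precedence: bits 0-2 of the ToS byte (values 0-7)."""
    return (tos >> 5) & 0x07

def dscp(tos):
    """DSCP: bits 0-5 of the DS field (values 0-63)."""
    return (tos >> 2) & 0x3F

def tos_from_dscp(value):
    """Rebuild a ToS/DS byte from a DSCP value; reserved bits 6-7 stay zero."""
    return (value & 0x3F) << 2
```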
SLA/DSCP in the AD-WAN branch solution
The AD-WAN controller provides seven system-defined SLA levels to assure different link quality requirements. Each SLA level defines default settings for service assurance parameters, and the settings are configurable.
After the AD-WAN initializes the network topology for a newly managed device, it deploys the iNQA configuration to the device, which sends probe packets over IP networks. Each application group has an SLA profile and DSCP value associated. Based on the SLA service requirements, DSCP value, and the iNQA probe result, the controller selects the best link for the application.
NetStream-based traffic statistics measurement
About NetStream
As the number of applications and services soars in the fast evolving world of network technologies, granular network management, accurate traffic accounting, and intelligent traffic analysis are required. NetStream is an accounting technology that provides statistics on a per-flow basis. It allows administrators to gain better understanding of the network access traffic details.
An IPv4 flow is defined by the following 7-tuple elements:
· Destination IP address.
· Source IP address.
· Destination port number.
· Source port number.
· Protocol number.
· ToS.
· Inbound or outbound interface.
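Per-flow accounting keyed by the 7-tuple above can be sketched as follows (the sample field values are hypothetical):

```python
from collections import defaultdict

# Packet and byte counters per 7-tuple flow key.
flow_stats = defaultdict(lambda: {"packets": 0, "bytes": 0})

def account(dst_ip, src_ip, dst_port, src_port, proto, tos, iface, size):
    key = (dst_ip, src_ip, dst_port, src_port, proto, tos, iface)
    flow_stats[key]["packets"] += 1
    flow_stats[key]["bytes"] += size

# Two packets of the same TCP session map to a single flow.
account("10.0.0.2", "10.0.0.1", 80, 51000, 6, 0, "GE1/0/1", 1500)
account("10.0.0.2", "10.0.0.1", 80, 51000, 6, 0, "GE1/0/1", 400)
print(len(flow_stats))  # 1
```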
A typical NetStream system includes the following elements:
· NetStream data exporter (NDE)—A device configured with NetStream. The NDE provides the following functions:
¡ Classifies traffic flows by using the 7-tuple elements.
¡ Collects data from the classified flows.
¡ Aggregates and exports the data to the NSC.
· NetStream collector (NSC)—A program running on an operating system. The NSC parses the packets received from the NDEs, and saves the data to its database.
· NetStream data analyzer (NDA)—A network traffic analyzing tool. Based on the data in the NSC, the NDA generates reports for traffic billing, network planning, and attack detection and monitoring. The NDA can collect data from multiple NSCs. Typically, the NDA features a Web-based system for easy operation.
NSC and NDA are typically integrated into a NetStream server.
Figure 40 NetStream system
In the AD-WAN branch solution, the analyzer uses Flink (a distributed computing engine) to parse and process NetStream flows reported by devices in real time. It identifies applications based on the 5-tuple information and the application definitions on the analyzer and controller. In addition, the analyzer can visualize network flow measurements from the aspects of device interface, application group, application, VPN, and SRv6-TE policy.
NetStream in the AD-WAN branch solution
In the AD-WAN branch solution, the SeerAnalyzer uses NetStream to collect traffic data on device interfaces, which provides more granular session-based service analysis.
SeerAnalyzer can provide the following analysis capabilities:
· Interface traffic analysis—Performs network traffic analysis for device interfaces. SeerAnalyzer can provide traffic ranking by network-wide device interfaces, present traffic trends on the interfaces, display network traffic details for each interface, and analyze the application traffic rank list, application traffic trends, and source host traffic rank list for the interfaces.
· Application group traffic analysis—You can add the applications of NetStream flows to application groups defined on the analyzer or controller. Application group traffic analysis can calculate the high-ranking application groups by network-wide traffic volume, and provide the traffic statistics and traffic trend graphs for the application groups. In addition, it can provide the application distribution, application traffic trend, and source host access rank list for each application group.
· Application traffic analysis—Calculates the traffic volume and rate metrics for network-wide applications, as well as the detailed information about each application, including 5-tuple (traffic path), communication quality, distribution status in the network, and rank list of the source hosts that have accessed the application.
· VPN traffic analysis—Performs traffic analysis and statistics based on VPN. It can provide the Top N VPNs by traffic volume, as well as the detailed information about each VPN, including application traffic rank list, application traffic trend graphs, and rank list of the source hosts that have accessed the VPN.
Figure 41 User log flow analysis
User log flow analysis collects and analyzes flow session log data for all devices in the network, and presents application and associated traffic information. You can implement user log flow quality monitoring and analysis based on the information.
NAT
Network Address Translation (NAT) translates an IP address in the IP packet header to another IP address. Typically, NAT is configured on gateways to enable private hosts to access external networks and external hosts to access private network resources such as a Web server.
Figure 42 Static NAT
NAT session logging
NAT session logging records NAT session information, including translation information and access information. You can configure NAT session logging and notifications to send NAT data to UDP port 9998 of the SA server, and obtain the NAT log data from that port.
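A minimal sketch of a collector receiving a NAT session log over UDP, as described above. The log payload and parsing are hypothetical; the demo binds an ephemeral port and sends itself one record, whereas a real deployment would bind UDP 9998:

```python
import socket

def recv_nat_log(sock):
    """Receive one NAT session log datagram from a device."""
    data, addr = sock.recvfrom(4096)
    return addr[0], data

# Collector socket (port 0 picks a free port for this local demo).
srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
srv.bind(("127.0.0.1", 0))
port = srv.getsockname()[1]

# Simulate a device exporting one NAT session log record.
cli = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
cli.sendto(b"nat-session: 10.0.0.5:1024 -> 203.0.113.9:30000", ("127.0.0.1", port))

src, record = recv_nat_log(srv)
print(src, record.decode())
```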
NAT session log parsing and processing
Figure 43 NAT session log parsing flow
URL auditing
URL auditing collects and analyzes logs from network devices to monitor the URLs accessed by users and restrict user online behaviors. The network devices are the managed routers configured with URL filtering policies. In the current software version, the device supports only HTTP URL filtering.
Figure 44 URL access list
URL filtering
The device implements URL auditing for packets through URL filtering. URL filtering controls access to the Web resources by filtering the URLs that the users visit.
A URL filtering rule matches URLs based on the content in the URI or hostname field.
A URL is a reference to a resource that specifies the location of the resource on a network and a mechanism for retrieving it. The syntax of a URL is protocol://host[:port]/path/[;parameters][?query]#fragment. Figure 45 shows an example URL.
Table 1 describes the fields in a URL.
Table 1 URL field descriptions
Field | Description
protocol | Transmission protocol, such as HTTP.
host | Domain name or IP address of the server where the indicated resource is located.
[:port] | Optional field that identifies the port number of the transmission protocol. If this field is omitted, the default port number of the protocol is used.
/path/ | String that identifies the directory or file where the indicated resource is stored. The path is a sequence of segments separated by zero or multiple forward slashes.
[parameters] | Optional field that contains special parameters.
[?query] | Optional field that contains parameters to be passed to the software for querying dynamic webpages. Each parameter is a <key>=<value> pair. Different parameters are separated by an ampersand (&).
URI | Uniform resource identifier that identifies a resource on a network.
When a user sends HTTP packets to access a specific network resource, the device performs URL filtering for the HTTP packets.
Figure 46 URL filtering mechanism
URL data collection
After you specify the logging action for a URL category (with the category action logging command) or as a default action for a URL filtering policy (with the default-action logging command), a large number of logs might be generated. Log messages generated by the device are output to the device information center. The information center then sends the messages to designated destinations based on log output rules. You can enable the information center to output syslogs to the SA server.
URL data parsing and processing
Figure 47 URL data parsing and processing flow
WAN optimization features
The AD-WAN branch solution offers many WAN optimization features to optimize WAN link bandwidth and quality for better user experience.
WAN optimization helps optimize WAN links by addressing issues such as high latency and low bandwidth usage. The AD-WAN branch solution supports the following WAN optimization features:
· Wide Area Application Services (WAAS).
· Web cache.
· HDLC link bundling.
· Multi-path packet replication.
· Forward error correction (FEC).
WAAS
The Wide Area Application Services (WAAS) feature provides a set of WAN optimization services to resolve WAN issues such as high delay and low bandwidth. WAAS provides the following optimization services:
· Transport Flow Optimization—TFO optimizes TCP traffic without modifying packet header information.
· Data Redundancy Elimination—DRE reduces the size of transmitted data by replacing repeated data blocks with shorter indexes. A WAAS device synchronizes its data dictionary to its peer devices. A data dictionary stores mappings between repeated data blocks and indexes. Replacing repeated data blocks with indexes is called DRE compression. Replacing indexes with repeated data blocks is called DRE decompression.
· Lempel-Ziv compression—LZ compression is a lossless compression algorithm that uses a compression dictionary to replace repeated data in the same message. The compression dictionary is carried in the compression result. The sending device uses the sliding window technology to detect repeated data. Compared with DRE, LZ compression has a lower compression ratio. LZ compression does not require synchronization of compression dictionaries between the local and peer devices. This reduces memory consumption.
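The DRE idea above can be sketched with a toy example: repeated data blocks are replaced with short indexes from a dictionary that both peers keep synchronized. This is an illustration of the principle, not the actual WAAS implementation:

```python
def dre_compress(blocks, dictionary):
    """Replace blocks already in the shared dictionary with indexes."""
    out = []
    for b in blocks:
        if b in dictionary:
            out.append(("idx", dictionary[b]))  # repeated block -> index
        else:
            dictionary[b] = len(dictionary)
            out.append(("raw", b))              # first occurrence sent raw
    return out

def dre_decompress(stream, reverse):
    """Rebuild the data, learning the dictionary from raw blocks."""
    data = []
    for kind, v in stream:
        if kind == "raw":
            reverse.append(v)
            data.append(v)
        else:
            data.append(reverse[v])
    return data

shared = {}
stream = dre_compress([b"AAAA", b"BBBB", b"AAAA"], shared)
print(dre_decompress(stream, []))  # [b'AAAA', b'BBBB', b'AAAA']
```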
Web cache
Web cache enables the device to store files of specific types (such as apk, doc, and Microsoft patch) on the webpages that have been accessed by users through HTTP or HTTPS. Then, the device responds to subsequent requests for these webpages on behalf of the Web server by using the request URLs as indexes.
Figure 48 Web cache
Link bundling
Link bundling bundles multiple WAN links to increase bandwidth for bandwidth-intensive services. It allows a session to be established over a bundle of links.
Figure 49 Link bundling
Multi-path packet replication
The multi-path packet replication process includes the following steps:
· Sender: ① Packet recognition ② Packet replication
· Recipient: ③ Packet deduplication ④ Packet reordering
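The recipient-side steps above (deduplication, then reordering) can be sketched as follows, assuming each replicated packet carries a sequence number:

```python
def dedup_and_reorder(packets):
    seen = {}
    for seq, payload in packets:            # step 3: deduplication
        if seq not in seen:
            seen[seq] = payload
    return [seen[s] for s in sorted(seen)]  # step 4: reordering

# The same packets arrive over two paths, duplicated and out of order.
rx = [(2, "b"), (1, "a"), (2, "b"), (3, "c"), (1, "a")]
print(dedup_and_reorder(rx))  # ['a', 'b', 'c']
```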
Figure 50 Multi-path packet replication mechanism
FEC
FEC is an error correction mechanism in which the sender transmits redundant packets for error correction along with data packets so the receiver can restore data lost on the link.
FEC improves transmission quality for interactive applications such as interactive video services on poor quality links.
A-FEC is a FEC type that can adjust the number of generated redundant packets based on link quality detection.
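The FEC principle can be illustrated with a toy XOR parity scheme: one redundant packet per group lets the receiver rebuild a single lost data packet. Real FEC (and A-FEC) implementations are more sophisticated; this is only a sketch:

```python
def xor_parity(packets):
    """XOR equal-length packets byte by byte."""
    parity = bytearray(len(packets[0]))
    for p in packets:
        for i, byte in enumerate(p):
            parity[i] ^= byte
    return bytes(parity)

group = [b"\x01\x02", b"\x03\x04", b"\x05\x06"]
parity = xor_parity(group)  # redundant packet sent with the group

# Packet 1 (b"\x03\x04") is lost; XOR the survivors with the parity.
recovered = xor_parity([group[0], group[2], parity])
print(recovered == group[1])  # True
```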
Figure 51 FEC
RIR traffic engineering log analysis
The analyzer can analyze link selection information reported by the device, and notify the user of the link selection switchover reason, helping the user locate unqualified links and make adjustments accordingly. It displays the user-concerned tunnel, site, service network name, and output interface information before and after link switchovers.
Traffic engineering logs
RIR dynamically selects the most suitable links for traffic forwarding based on service requirements (for example, link quality and link bandwidth). RIR not only can select the optimal link from a specific type of transport network, but also can perform automatic link switchover when the current link becomes unqualified.
After you enable RIR log reporting on the device, the device reports RIR link selection information when traffic enters the RIR tunnel, including the time, host (device IP), flow_id (5-tuple type), flow_detail (5-tuple information), and tunnel_num for the current link selection. The figure below shows two RIR links reported by the device.
To improve packet forwarding efficiency, the device does not repeatedly perform link selection for traffic of the same session. After the device performs link selection for traffic of a session, it forwards the subsequent traffic of that session according to the previous link selection result. When a device or link event occurs (configuration change, disconnected link, unqualified link, or unqualified bandwidth), the device reports an associated event, and performs link reselection. The figure below shows an event reported by the device (event_type 3 means unqualified link).
Traffic engineering log analysis
The analyzer analyzes link selection information reported by the device to determine whether a link scheduling event has occurred. If flows with the same 5-tuple traverse the same path with different tunnel numbers, a traffic engineering event has occurred. From the events reported by the device within the specified time range, the analyzer selects the one with the highest priority as the traffic engineering reason. If no event is found within the time range, the traffic engineering event belongs to type 5 (optimized scheduling).
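The reason-selection rule above can be sketched as follows. The source states that event_type 3 means unqualified link and type 5 means optimized scheduling; the remaining type numbers and the priority order used here are assumptions for illustration:

```python
EVENT_NAMES = {1: "disconnected link", 2: "configuration change",
               3: "unqualified link", 4: "unqualified bandwidth",
               5: "optimized scheduling"}  # types 1, 2, 4 are assumed

# Assumed priority order: hard link failures outrank config changes.
PRIORITY = [1, 3, 4, 2]

def switchover_reason(events_in_window):
    """Pick the highest-priority event in the window; default to type 5."""
    for etype in PRIORITY:
        if etype in events_in_window:
            return EVENT_NAMES[etype]
    return EVENT_NAMES[5]

print(switchover_reason({3, 2}))  # unqualified link
print(switchover_reason(set()))   # optimized scheduling
```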
Figure 52 Traffic engineering log analysis flow
Event type distribution
The event type distribution pie chart displays the events that have occurred and the proportion of each event. The event types include disconnected link, unqualified link, unqualified bandwidth, configuration change, and optimized scheduling.
Traffic engineering details
The traffic engineering details display the detailed information about service traffic before and after a link switchover, including the user-concerned service network name, site, and interface information. Users can make adjustments to the service traffic or links based on the information (for example, adjust the delay, jitter, packet loss, and link preference settings) to preferentially forward key service traffic over high-preference links.
Traffic engineering statistics
The solution can collect statistics on traffic engineering events and display the device IP, device name, application group name, and 5-tuple information about the events, enabling users to identify services with high traffic engineering frequency.
Cloud connections
About cloud connections
A cloud connection is a management tunnel established between a local device and the Oasis server. It enables you to manage the local device from the Oasis server without accessing the network where the device resides.
Cloud connection establishment
As shown in Figure 53, the cloud connection between the device and the Oasis server is established as follows:
1. The device sends an authentication request to the Oasis server.
2. The Oasis server sends an authentication success packet to the device.
The device passes the authentication only if the serial number of the device has been added to the Oasis server. If the authentication fails, the Oasis server sends an authentication failure packet to the device.
3. The device sends a registration request to the Oasis server.
4. The Oasis server sends a registration response to the device.
The registration response contains the uniform resource locator (URL) used to establish a cloud connection.
5. The device uses the URL to send a handshake request (changing the protocol from HTTP to WebSocket) to the Oasis server.
6. The Oasis server sends a handshake response to the device to finish establishing the cloud connection.
After the WebSocket-based full-duplex connection is established, the cloud server can manage and maintain branch devices in different regions.
Figure 53 Establishing a cloud connection
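Step 5 above upgrades the HTTP connection to WebSocket with a standard Upgrade handshake. A sketch of building that request (the host and path are hypothetical):

```python
import base64, os

def handshake_request(host, path):
    """Build an HTTP request that upgrades the connection to WebSocket."""
    key = base64.b64encode(os.urandom(16)).decode()
    return (f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Upgrade: websocket\r\n"
            "Connection: Upgrade\r\n"
            f"Sec-WebSocket-Key: {key}\r\n"
            "Sec-WebSocket-Version: 13\r\n\r\n")

req = handshake_request("oasis.example.com", "/cloudnet/device")
print(req.splitlines()[0])  # GET /cloudnet/device HTTP/1.1
```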
Cloud connections in AD-WAN branch solution
The H3C AD-WAN controller operates as the cloud server to process the authentication and registration requests of branch devices in remote private networks. After a branch device establishes a WebSocket-based control channel to the controller, the controller can automatically perform operation, maintenance, and management on the device.
SDWAN EVPN
SDWAN network model
As shown in Figure 54, SDWAN contains the following components:
· Customer premises equipment (CPE)—Edge device at a customer site.
· Route reflector (RR)—Used to reflect TTE information and private routes between CPEs.
· Transport network (TN)—Service provider WAN that connects sites. A transport network can be a service provider VPN or the public Internet. A transport network is identified by its transport network ID or name. Transport networks are the foundation of the SDWAN overlay network.
· Routing domain (RD)—Domain that contains transport networks that are reachable at Layer 3. SDWAN tunnels can be established only between CPEs or between CPEs and RRs in the same routing domain.
· Site ID—A site ID is a string of digits used to uniquely identify a site in the SDWAN network. The network controller allocates site IDs to all sites.
· Device ID—A device ID uniquely identifies an SDWAN-capable device (or SDWAN device) at a site. Typically, a site contains one or two SDWAN devices.
· System IP—Device system IP address allocated by an administrator. Typically, a loopback interface provides the system IP address.
· Interface ID—Device interface ID allocated by an administrator. On a device, tunnel interfaces are assigned unique interface IDs.
· SDWAN tunnel—Point-to-multipoint logical channel established among SDWAN devices. One site transmits traffic to another site over an SDWAN tunnel. The physical outgoing interface of an SDWAN tunnel is the WAN interface on a CPE or RR, and the TNs to which the WAN interface belongs are in the same RD. The WAN interfaces at both ends of the tunnel can have underlay connectivity. Two sites can set up multiple tunnels over the TNs of different service providers.
· Secure Sockets Layer (SSL) connection—In the SDWAN network, a CPE and an RR establish an SSL connection to exchange TTE information for control channel establishment.
· Transport tunnel endpoint (TTE)—Endpoint that connects an SDWAN device to a transport network and endpoint of an SDWAN tunnel. Device TTE information includes the site ID, transport network ID (TN ID), private IP address, public IP address, and tunnel encapsulation mode. A TTE ID consists of a site ID, a device ID, and an interface ID, which makes the TTE ID unique on the SDWAN network. To simplify network management, SDWAN devices exchange TTE information to set up and maintain SDWAN tunnels dynamically.
TTE attributes are as follows:
¡ Site ID.
¡ Device ID.
¡ System IP.
¡ Site role, which can be RR, CPE, or NAT-transfer. A NAT-transfer device provides forwarding paths for the CPE devices that communicate over the public network through NAT traversal.
¡ Interface ID.
¡ TN name or ID.
¡ RD name or ID.
¡ Input bandwidth.
¡ Output bandwidth.
¡ Encapsulation type, which can only be UDP.
¡ NAT type, which can be the following:
- Full cone NAT.
- Restricted cone NAT.
- Port restricted cone NAT.
- Symmetric NAT.
- NO NAT.
- Unknown.
¡ Public IP address and port number after NAT.
¡ Private IP address and port number before NAT.
¡ IPsec SPI.
¡ IPsec authentication algorithm and key.
¡ IPsec encryption algorithm and key.
· TTE connection—Point-to-point logical connection established between TTEs. An SDWAN tunnel can hold multiple TTE connections.
SDWAN EVPN packet formats
SDWAN packets include control packets and data packets.
· SDWAN control packets are used for NAT traversal. The device uses SDWAN control packets to advertise its public IP address translated after NAT to remote devices. For more information about NAT traversal, see "SDWAN tunnel establishment with NAT traversal."
· SDWAN data packets are used to forward user packets.
Control packet format
As shown in Figure 55, an SDWAN control packet contains the following components:
· Data portion.
· IPsec header (optional).
· 12-byte SDWAN header.
· 8-byte outer UDP header. The destination port number in the UDP header is the SDWAN UDP port number. By default, the port number is 4799.
· 20-byte outer IP header.
The SDWAN header contains the following fields:
· Type—Type of the packet. The length for this field is 8 bits. For an SDWAN control packet, the value is 1.
· Subtype—Subtype of the control packet. The length for this field is 8 bits. If the value is 1, the packet is a NAT address probe request packet.
· Version—Version number of SDWAN protocol packets. The value is fixed at 0.
· Reserved—Reserved field. The value is fixed at 0.
· Length—Length of the SDWAN header. The length for this field is 16 bits. The value is fixed at 12 in the current software version.
· TTE ID—Identifies a TTE. The length for this field is 32 bits.
Figure 55 SDWAN control packet format
Data packet format
As shown in Figure 56, an SDWAN data packet contains the following components:
· Original data packet.
· IPsec header (optional).
· 12-byte SDWAN header.
· 8-byte outer UDP header. The destination port number in the UDP header is the SDWAN UDP port number. By default, the port number is 4799.
· 20-byte outer IP header.
The SDWAN header contains the following fields:
· Type—Type of the packet. The length for this field is 8 bits. For an SDWAN data packet, the value is 2.
· Protocol—Type of the inner data packet. The length for this field is 8 bits. If the value is 1, the packet is an IPv4 packet. If the value is 2, the packet is an IPv6 packet.
· Length—Length of the SDWAN header. The length for this field is 16 bits. The value is fixed at 12 in the current software version.
· VN ID—VN ID of the VPN instance to which the SDWAN data packet belongs. The length for this field is 32 bits. If the packet belongs to the public instance, the value for this field is all zeros.
· Flow ID—Flow ID of the SDWAN data packet. The length for this field is 32 bits. If no flow ID is marked, the value for this field is all zeros.
Figure 56 SDWAN data packet format
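The 12-byte SDWAN data packet header described above (8-bit Type, 8-bit Protocol, 16-bit Length, 32-bit VN ID, 32-bit Flow ID) can be built with a struct sketch. Big-endian network byte order is assumed:

```python
import struct

def sdwan_data_header(protocol, vn_id, flow_id):
    """Pack the SDWAN data packet header fields.

    Type=2 (data packet), Protocol (1=IPv4, 2=IPv6), Length=12,
    then the 32-bit VN ID and 32-bit flow ID.
    """
    return struct.pack("!BBHII", 2, protocol, 12, vn_id, flow_id)

hdr = sdwan_data_header(protocol=1, vn_id=100, flow_id=0)
print(len(hdr))  # 12
print(hdr[0])    # 2
```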
BGP extensions in SDWAN EVPN
BGP IPv4 tunnel-encap-ext address family
To support SDWAN, a new address family called BGP IPv4 tunnel-encap-ext address family is defined based on MP-BGP. This address family advertises TTE information, including the site ID, TN ID, public IP address, private IP address, and tunnel encapsulation mode. CPEs use the information to set up data channels between each other. For more information about data channels, see "SDWAN EVPN channels."
BGP EVPN routes
In an SDWAN tenant isolation scenario, a CPE advertises the private routes of a VPN instance to other CPEs by using IP prefix advertisement routes. To support SDWAN, the IP prefix advertisement routes are extended to carry the VN ID in the NLRI field to distinguish the private routes of different VPN instances.
SDWAN EVPN channels
SDWAN uses the following channels:
· Management channel—Established between a network device and the controller to deploy configuration and monitor device status.
· Control channel—Established between an RR and a CPE to advertise TTE information and private routes and maintain the underlay and overlay topologies.
· Data channel—Established between two CPEs to forward user data packets.
Management channel
A management channel is established between each CPE or RR and the controller for the following purposes:
· The controller deploys configuration such as basic network configuration, VPN service parameters, RIR configuration, and IPsec configuration to the CPE or RR over the management channel (for example, a NETCONF channel).
· The CPE or RR reports O&M information such as alarms, logs, and network traffic statistics to the controller over the management channel (for example, an HTTP channel).
Control channel
A control channel is established between an RR and a CPE to advertise TTE information and private routes. The establishment process is as follows:
1. The CPE and RR establish an SSL connection (control channel). The CPE is the SSL client, also referred to as the SDWAN client. The RR is the SSL server, also referred to as the SDWAN server.
2. The CPE and RR exchange TTE information by using SSL packets.
3. After receiving TTE information from each other, the CPE and RR compare the routing domain in the received TTE information with the routing domain in the local TTE information.
¡ If the routing domains are the same, the CPE and RR establish an SDWAN tunnel (control channel) between them.
¡ If the routing domains are different, the CPE and RR do not establish an SDWAN tunnel between them.
4. The CPE and RR each automatically add the user network routes (UNRs) destined for the system IP address of the peer device to the local routing table. Route recursion is performed for the UNRs to provide a route for the SDWAN tunnel. In the route, the outgoing interface is the SDWAN tunnel interface, and the next hop is a TTE ID used for packet encapsulation. ECMP routes are created if a system IP address is associated with multiple TTE IDs.
5. The CPE and RR establish a BGP connection (control channel) by using their system IP addresses.
Data channel
A data channel is established between two CPEs to transmit data packets. The establishment process is as follows:
1. The CPEs advertise TTE routes to an RR over control channels.
2. The RR reflects the TTE routes of each CPE to the other CPE.
3. When a CPE receives TTE routes reflected by the RR, it compares the routing domain in the TTE routes with the routing domain in the local TTE information.
¡ If the routing domains are the same, the CPEs establish an SDWAN tunnel between them.
¡ If the routing domains are different, the CPEs do not establish an SDWAN tunnel between them.
4. The CPEs each automatically add the user network routes (UNRs) destined for the system IP address of the peer device to the local routing table. Route recursion is performed for the UNRs to provide a route for the SDWAN tunnel.
To secure data transmission, use IPsec to encrypt data packets. For more information about IPsec, see "IPsec for SDWAN."
Route advertisement and traffic forwarding in SDWAN EVPN
Route advertisement
As shown in Figure 58, inter-site route advertisement in an SDWAN network includes the following processes:
1. Each site advertises private routes to its local CPE.
2. CPEs advertise routes to each other.
3. Each CPE advertises private routes received from another site to its local site.
Then, the sites have routes to reach one another.
Advertising private routes from a local site to a local CPE
The local site uses static routing, RIP, OSPF, IS-IS, EBGP, or IBGP to advertise the private routes of the local site to the local CPE. The routes are standard IPv4 or IPv6 routes.
Advertising routes from a local CPE to remote CPEs
1. When the local CPE learns private routes from the local site, it stores the routes to the routing table of the corresponding VPN instance.
2. The CPE adds RD and export targets to the standard IPv4 routes, converts the routes to BGP EVPN IP prefix advertisement routes, and advertises the routes to an RR. The next hop address of the routes is the system IP address of the local CPE.
3. The RR reflects the received IP prefix advertisement routes to remote CPEs.
4. When a remote CPE receives the IP prefix advertisement routes reflected by the RR, it matches the export targets in the IP prefix advertisement routes with the import targets of local VPN instances. If a matching VPN instance is found, the remote CPE accepts the IP prefix advertisement routes and adds the routes to the routing table of the VPN instance. In the routes, the outgoing interface is the incoming SDWAN tunnel interface, and the next hop is the system IP address of the advertiser CPE.
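The export/import target matching in step 4 can be sketched as follows (the target values and VPN instance names are hypothetical):

```python
# Local VPN instances and their import targets on the remote CPE.
vpn_instances = {"vpnA": {"import_targets": {"100:1"}},
                 "vpnB": {"import_targets": {"200:1"}}}

def match_vpn(route_export_targets):
    """Accept a route into the first VPN instance whose import
    targets intersect the route's export targets."""
    for name, vpn in vpn_instances.items():
        if vpn["import_targets"] & route_export_targets:
            return name
    return None

print(match_vpn({"100:1", "300:1"}))  # vpnA
print(match_vpn({"400:1"}))           # None
```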
Advertising routes from remote CPEs to remote sites
The supported routing methods are the same as the routing methods for advertising routes from the local site to the local CPE. A remote site can use multiple methods to learn private routes from its CPE. The methods include static routing, RIP, OSPF, IS-IS, EBGP, and IBGP.
Data traffic forwarding
As shown in Figure 59, data traffic is forwarded in an SDWAN network as follows:
1. Device 1 forwards the packet sent by a host in site 1 and destined for 10.1.4.3 to CPE 1.
2. CPE 1 looks up the routing table of the corresponding VPN instance according to the incoming interface and destination address. In the matching route, the outgoing interface is an SDWAN tunnel interface and the next hop is the system IP address of CPE 2.
3. CPE 1 finds a remote TTE ID based on the route, encapsulates the packet based on the TTE connection information, and forwards the packet out of the matching SDWAN tunnel. The SDWAN encapsulation information is as follows:
¡ The source and destination IP addresses in the outer IP header are the source and destination IP addresses in the TTE connection information, respectively. The source IP address is the IP address of the physical outgoing interface for the SDWAN tunnel on CPE 1. The destination IP address is the IP address of the physical incoming interface for SDWAN packets on CPE 2.
¡ The outer source port number is the source UDP port number configured for the SDWAN packets.
¡ The VN ID in the SDWAN header is the VN ID of the VPN instance bound to the interface on which the packet is received by CPE 1.
4. The P device forwards the packet to CPE 2 based on the destination IP address of the packet.
5. CPE 2 decapsulates the packet, determines the VPN instance to which the packet belongs according to the VN ID in SDWAN encapsulation, determines the outgoing interface of the packet by looking up the routing table of this VPN instance, and forwards the decapsulated packet to Device 2.
6. Device 2 forwards the packet to the destination host according to the typical IP forwarding process.
Figure 59 Data traffic forwarding
VPN instance-based tenant isolation
As shown in Figure 60, SDWAN uses VPN instances to isolate tenants.
· Control plane—The EVPN routes exchanged between CPEs carry VN IDs to identify the private network routes of different tenants, and each VPN instance has its own forwarding table and routing table.
· Data plane—When a tenant accesses the network through an interface on a CPE, the CPE identifies the VPN to which the tenant belongs by the VPN instance associated with the interface, looks up the forwarding table of the VPN instance, adds SDWAN encapsulation to the tenant's packet, and forwards the packet to a remote CPE. The remote CPE looks up the forwarding table of the VPN instance and forwards the packet to the destination.
This isolation method enables multiple VPNs to share an SDWAN tunnel and reduces the number of tunnels and network resource consumption.
Figure 60 VPN instance-based tenant isolation
IPsec for SDWAN
This solution supports transmitting private data between user sites over the public links of service providers. To ensure confidentiality and integrity of data transmission over SDWAN tunnels set up over service provider links, this solution supports security protection for SDWAN packets by using IPsec.
IPsec provides two security mechanisms: authentication and encryption. The authentication mechanism enables the SDWAN data receiver to verify the identity of the data sender and whether the data has been tampered with during transmission. The encryption mechanism ensures data confidentiality by performing encryption to prevent data from being snooped during transmission.
IPsec provides the following security services for SDWAN in the IP layer:
· Confidentiality—The sender encrypts packets before transmitting them over the Internet, protecting the packets from being eavesdropped en route.
· Data integrity—The receiver verifies the packets received from the sender to make sure they are not tampered with during transmission.
· Data origin authentication—The receiver verifies the authenticity of the sender.
· Anti-replay—The receiver examines packets and drops outdated and duplicate packets.
Security association
A security association (SA) is an agreement negotiated between two communicating parties called IPsec peers. An SA includes the following parameters for data protection:
· Security protocols (AH, ESP, or both).
· Encapsulation mode (transport mode or tunnel mode).
· Authentication algorithm (HMAC-MD5, SM3, or HMAC-SHA1).
· Encryption algorithm (DES, 3DES, SM, or AES).
· Shared keys and their lifetimes.
An SA is unidirectional. At least two SAs are needed to protect data flows in a bidirectional communication. If two peers want to use both AH and ESP to protect data flows between them, they construct an independent SA for each protocol in each direction.
An SA is uniquely identified by a triplet, which consists of the security parameter index (SPI), destination IP address, and security protocol identifier. An SPI is a 32-bit number. It is transmitted in the AH/ESP header.
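SA lookup by the triplet above can be sketched as a simple keyed store (the SPI, address, and key material are hypothetical):

```python
# SA database keyed by (SPI, destination IP, security protocol).
sa_db = {}

def install_sa(spi, dst_ip, protocol, keys):
    sa_db[(spi, dst_ip, protocol)] = keys

def find_sa(spi, dst_ip, protocol):
    return sa_db.get((spi, dst_ip, protocol))

install_sa(0x1000, "203.0.113.1", "ESP", {"auth": "k1", "enc": "k2"})
print(find_sa(0x1000, "203.0.113.1", "ESP") is not None)  # True
print(find_sa(0x1000, "203.0.113.1", "AH"))               # None
```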
SA setup and exchange
Typically, IPsec uses IKE to automatically negotiate and maintain SAs. This method requires IKE negotiation between every pair of peers, which is inefficient and wastes bandwidth in a large network. In an SDWAN network, a CPE instead generates IPsec SAs locally and exchanges them through the RRs: the CPE advertises its IPsec SAs to the RRs by using BGP IPv4 tnl-encap-ext routes, and the RRs reflect the SAs to the other CPEs. Different TTE connections on the same SDWAN tunnel use the same IPsec SA to reduce the IPsec negotiation load.
An SDWAN device updates IPsec SAs periodically to improve the security of the network.
Figure 61 IPsec SA setup and exchange
IPsec anti-replay
IPsec anti-replay protects networks against replay attacks by using a sliding window mechanism called the anti-replay window. This feature checks the sequence number of each received IPsec packet against the current sequence number range of the sliding window. If the sequence number falls outside the current range, or duplicates a sequence number already received, the packet is considered a replayed packet and is discarded.
IPsec de-encapsulation involves complicated calculation. De-encapsulating replayed packets is unnecessary, consumes large amounts of resources, and degrades performance, potentially resulting in DoS. IPsec anti-replay checks and discards replayed packets before de-encapsulation.
In an SDWAN network, BGP protocol packets and service data packets might arrive in a different order than they were sent. The IPsec anti-replay feature might then drop reordered service data packets as replayed packets, which impacts communications. If this happens, disable IPsec anti-replay checking or increase the size of the anti-replay window as required.
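A minimal sketch of such a sliding anti-replay window follows. The window size and names are illustrative assumptions, not the device implementation:

```python
# Sketch of the anti-replay sliding window check described above.
# Window size and variable names are illustrative assumptions.
class AntiReplayWindow:
    def __init__(self, size=64):
        self.size = size
        self.highest = 0        # highest sequence number accepted so far
        self.bitmap = 0         # bit i set => (highest - i) already seen

    def check_and_update(self, seq):
        """Return True to accept the packet, False to drop it."""
        if seq == 0:
            return False                      # sequence numbers start at 1
        if seq > self.highest:                # new packet advances the window
            shift = seq - self.highest
            self.bitmap = ((self.bitmap << shift) | 1) & ((1 << self.size) - 1)
            self.highest = seq
            return True
        offset = self.highest - seq
        if offset >= self.size:               # too old: outside the window
            return False
        if self.bitmap & (1 << offset):       # duplicate: replayed packet
            return False
        self.bitmap |= 1 << offset            # in window and unseen: accept
        return True
```

Enlarging the window lets the receiver tolerate more reordering, which is why adjusting the window size is one remedy when legitimate reordered packets are being dropped.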
NAT traversal in SDWAN
In an SDWAN network, users at enterprise branch sites often use private IP addresses to conserve public IP addresses. Such users can access remote sites only after NAT translates their private IP addresses to public IP addresses. If the CPEs and RRs cannot obtain one another's public IP addresses, they cannot set up SDWAN channels. To solve this problem, use static NAT or Session Traversal Utilities for NAT (STUN) to obtain the public IP addresses and establish SDWAN tunnels between CPEs based on those addresses.
Static NAT
An administrator can configure a static NAT mapping between a tunnel source IP address and port number (source UDP port number in SDWAN encapsulation) and a public IP address and port number on CPEs and RRs. Static NAT enables the CPEs and RRs to obtain public IP addresses and port numbers after NAT conversion without STUN.
STUN
STUN is a protocol that serves as a tool for other protocols in dealing with NAT traversal. STUN can determine the existence of a NAT device in the network and determine the IP address and port allocated by the NAT device to an endpoint.
STUN runs over UDP and uses port 3478 by default.
As shown in Figure 62, STUN uses a client-server model that consists of a STUN client and a STUN server. The STUN server and STUN client exchange STUN packets (Binding requests and Binding responses) to detect the IP address and port number assigned by a NAT device and the NAT type.
Figure 62 STUN network model
· STUN client—A STUN probe initiator that sends probe requests to a STUN server, determines the existence of a NAT device based on the response from the server, and obtains NAT information. In an SDWAN network, STUN clients are typically deployed on CPEs.
· STUN server—A STUN probe responder that receives probe requests from STUN clients and populates specific address and port in the responses. In an SDWAN network, a STUN server is typically deployed on an RR.
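As an illustration of the client-server exchange above, a STUN Binding request begins with the standard 20-byte STUN header. The sketch below builds only that header (message type, length, magic cookie, and transaction ID per the STUN framing) and omits attribute handling:

```python
# Sketch: building a STUN Binding request header (standard STUN framing).
# Only the 20-byte header is shown; attribute handling is omitted.
import os
import struct

STUN_MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001  # message type for a Binding request

def build_binding_request():
    transaction_id = os.urandom(12)           # 96-bit random transaction ID
    # Header: 16-bit type, 16-bit message length (0: no attributes),
    # 32-bit magic cookie, 96-bit transaction ID.
    header = struct.pack("!HHI", BINDING_REQUEST, 0, STUN_MAGIC_COOKIE)
    return header + transaction_id

msg = build_binding_request()
# A STUN client would send this over UDP to the server (default port 3478)
# and read the mapped address attribute from the Binding response.
```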
STUN mechanisms
In STUN, NAT mapping and NAT filtering behaviors are examined to determine the NAT type, which in turn determines whether NAT traversal can work correctly.
NAT mapping maps an internal IP address and port number to an external IP address and port number for packets from the internal network to the external network. When an internal endpoint opens an outgoing session through a NAT device, the NAT device establishes a mapping between the internal IP address and port number and an external IP address and port number. Then, the NAT device forwards the packet based on the mapping.
The following NAT mapping types are supported:
· Endpoint-independent mapping (EIM)—The NAT device uses the same mapping for all packets sent from the same internal IP address and port number to any external IP address and port number.
· Address-dependent mapping (ADM)—The NAT device uses the same mapping for all packets sent from the same internal IP address and port number to the same external IP address, regardless of the external port number.
· Address and port-dependent mapping (APDM)—The NAT device uses the same mapping for all packets sent from the same internal IP address and port number to the same external IP address and port number.
NAT filtering determines which packets from the external network are allowed to reach the internal network. To protect the internal network from attacks, the NAT device discards packets that do not match a permitted mapping and forwards legitimate packets.
The following NAT filtering types are supported:
· Endpoint-independent filtering (EIF)—The NAT device filters out only packets not destined to the internal IP address and port number (endpoint X:x), regardless of the external source IP address and port. The NAT device forwards any packet destined to X:x.
· Address-dependent filtering (ADF)—The NAT device filters out packets not destined to the internal endpoint X:x. Additionally, the NAT device filters out packets from Y:y destined for X:x if X:x has not previously sent packets to IP address Y (regardless of the port used by Y).
· Address and port-dependent filtering (APDF)—The NAT device filters out packets not destined to the internal endpoint X:x. Additionally, the NAT device filters out packets from Y:y destined for X:x if X:x has not previously sent packets to Y:y.
A NAT type is a combination of a NAT mapping type and a NAT filtering method. The following NAT types are supported:
· Full cone NAT (EIM+EIF)—Maps all requests from the same internal IP address and port number (IP1:Port1) to the same external IP address and port number (IP:Port). Additionally, any external host can send a packet to the internal host by sending a packet to the mapped external address.
· Restricted full cone NAT (EIM+ADF)—Maps all requests from the same internal IP address and port number (IP1:Port1) to the same external IP address and port number (IP:Port). Unlike a full cone NAT, an external host (with IP address X) can send a packet to the internal host only if the internal host had previously sent a packet to IP address X.
· Port restricted full cone NAT (EIM+APDF)—Maps all requests from the same internal IP address and port number (IP1:Port1) to the same external IP address and port number (IP:Port). Unlike a restricted full cone NAT, an external host (with IP address and port number IP2:Port2) can send a packet to the internal host only if the internal host had previously sent a packet to IP address and port number IP2:Port2.
· Symmetric NAT (APDM+APDF)—Maps all requests from the same internal IP address and port number (IP1:Port1), to a specific destination IP address and port number, to the same external IP address and port number. If the same host sends a packet with the same source address and port, but to a different destination, a different mapping is used. Additionally, only the external host that receives a packet can send a UDP packet back to the internal host.
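The combinations above can be expressed as a simple classification from the detected mapping and filtering behaviors. The abbreviations follow the lists above; the function name is illustrative:

```python
# Sketch: deriving the NAT type from the detected NAT mapping and NAT
# filtering behaviors, mirroring the combinations listed above.
NAT_TYPES = {
    ("EIM", "EIF"):   "full cone NAT",
    ("EIM", "ADF"):   "restricted full cone NAT",
    ("EIM", "APDF"):  "port restricted full cone NAT",
    ("APDM", "APDF"): "symmetric NAT",
}

def classify_nat(mapping, filtering):
    # Any combination not listed above is treated as an unknown type.
    return NAT_TYPES.get((mapping, filtering), "unknown type")
```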
The STUN client and STUN server exchange STUN packets to detect the NAT mapping type and NAT filtering method, therefore determining the NAT type.
· NAT mapping type detection—Suppose the IP address and port number of the STUN server is Y1:YP1, and the alternate IP address and port number of the STUN server is Y2:YP2. The IP address and port number of the STUN client is X:XP. Figure 63 shows the process of NAT mapping type detection.
Figure 63 Process of NAT mapping type detection
· NAT filtering method detection—Suppose the IP address and port number of the STUN server is Y1:YP1, and the alternate IP address and port number of the STUN server is Y2:YP2. The IP address and port number of the STUN client is X:XP. Figure 64 shows the process of NAT filtering method detection.
Figure 64 Process of NAT filtering method detection
SDWAN tunnel establishment with NAT traversal
Typically, CPEs act as STUN clients and the RR acts as the STUN server. The clients exchange packets with the server to identify whether NAT devices exist in the SDWAN network. If NAT devices exist, each client obtains its NAT-translated public IP address and port number, and then uses that public address to establish SDWAN tunnels with the other CPEs.
Figure 65 SDWAN tunnel establishment with NAT traversal
If two CPEs cannot establish a direct data channel between them, you must deploy a NAT transfer on the transport network for the CPEs to communicate with each other. A data channel is established between each CPE and the NAT transfer. Data traffic forwarded between CPEs is first sent to the NAT transfer through the data channels and then forwarded by the NAT transfer to other CPEs through the data channels.
Figure 66 SDWAN tunnel establishment with NAT traversal (deployed with a NAT transfer)
Table 2 Data channel compatibility for different NAT types
| CPE 1 NAT type | CPE 2 NAT type | Support for CPE-CPE direct tunnels | NAT transfer required for CPE intercommunication |
| --- | --- | --- | --- |
| Non-NAT | Full cone NAT | √ | × |
| Non-NAT | Port restricted full cone NAT or restricted full cone NAT | √ | × |
| Non-NAT | Symmetric NAT | √ | × |
| Non-NAT | Unknown type | √ | × |
| Non-NAT | Static NAT | √ | × |
| Full cone NAT | Full cone NAT | √ | × |
| Full cone NAT | Port restricted full cone NAT or restricted full cone NAT | √ | × |
| Full cone NAT | Symmetric NAT | √ | × |
| Full cone NAT | Unknown type | √ | × |
| Full cone NAT | Static NAT | √ | × |
| Port restricted full cone NAT or restricted full cone NAT | Port restricted full cone NAT or restricted full cone NAT | √ | × |
| Port restricted full cone NAT or restricted full cone NAT | Symmetric NAT | × | √ |
| Port restricted full cone NAT or restricted full cone NAT | Unknown type | × | √ |
| Port restricted full cone NAT or restricted full cone NAT | Static NAT | √ | × |
| Symmetric NAT | Symmetric NAT | × | √ |
| Symmetric NAT | Unknown type | × | √ |
| Symmetric NAT | Static NAT | √ | × |
| Unknown type | Unknown type | × | √ |
RIR-SDWAN
About RIR-SDWAN
As shown in Figure 67, based on link preference, link quality, and link bandwidth, RIR can select SDWAN tunnels to transmit various service flows among the CPEs.
Figure 67 RIR-SDWAN application scenario
Concepts
Flow template
A flow template defines link selection policies for a type of service flow. A flow ID uniquely identifies a flow template.
The device applies the link selection policies under a flow template to the service flow marked with the flow ID of the flow template.
The device supports using QoS policies to mark flow IDs for service flows. After QoS identifies the service of a packet based on the quintuple of the packet, it assigns a flow ID to the packet. Then, RIR will perform link selection for the packet based on the flow template that uses the flow ID.
The flow ID is marked only in the RIR process, and it will not be added to any outgoing packets.
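The quintuple-based flow ID marking described above can be sketched as follows. The rule format and flow ID values are illustrative assumptions, not the MQC implementation:

```python
# Sketch: marking a packet with a flow ID by matching its quintuple,
# as QoS does before RIR link selection. Rules and IDs are illustrative.
def match_flow_id(packet, rules):
    """packet: dict with src_ip, dst_ip, src_port, dst_port, protocol.
    rules: list of (match_fields, flow_id); first match wins."""
    for match_fields, flow_id in rules:
        if all(packet.get(k) == v for k, v in match_fields.items()):
            return flow_id
    return None  # unmarked traffic is not handled by RIR flow templates

rules = [
    ({"dst_port": 443, "protocol": "TCP"}, 10),   # e.g., HTTPS -> flow 10
    ({"dst_port": 5060, "protocol": "UDP"}, 20),  # e.g., SIP -> flow 20
]
```

RIR then applies the link selection policies of the flow template whose flow ID matches the marked packet.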
Link types
RIR-SDWAN links are SDWAN tunnels. Each SDWAN tunnel is attached to a transport network, and is uniquely identified by the transport network name or ID. RIR-SDWAN uses the transport network names used by SDWAN tunnels to distinguish links.
Two CPEs exchange TTE information for establishing an SDWAN tunnel between them. As shown in Figure 68, the RR establishes two SDWAN tunnels over TN 1 and TN 2 to the CPE.
Figure 68 Links in an RIR-SDWAN network
RIR link selection
Preference-based link selection
RIR can select links based on the link preference. You can assign a preference to a link based on factors such as the service requirements, the link conditions, and the link cost. RIR preferentially selects links with higher preference.
RIR-SDWAN supports assigning a link preference to an SDWAN tunnel by its transport network name in flow template view.
You can assign the same preference value to different links in the same flow template. RIR selects a link for a type of service flows from the links in the flow template in descending order of link preference. If the links with the highest preference cannot meet the service requirements, RIR tries the links with the second highest preference, and so forth to the links with the lowest preference.
If the flow template has two or more links with the same preference, RIR performs link selection based on RIR link load sharing criteria.
Quality-based link selection
RIR-SDWAN defines two link detection mechanisms:
· Link connectivity probe operation—Uses keepalive detection to check the connectivity of each link.
With SDWAN keepalive detection, a device sends keepalive requests to a peer at a specified interval and waits for keepalive replies. If no keepalive reply is received within a time period (keepalive request interval × maximum number of consecutive keepalive replies not received by the device), the TTE connection between the device and its peer is identified as unreachable, and the TTE connection will not be used to forward traffic.
· Link quality probe operation—Uses iNQA to measure the latency, jitter, and packet loss rate on a per-TTE connection basis.
RIR-SDWAN performs differentiated quality evaluation for services based on SLA. An SLA contains a set of link quality evaluation thresholds, including the link delay threshold, jitter threshold, and packet loss threshold.
RIR-SDWAN computes Comprehensive Quality Indicator (CQI) values to evaluate link quality.
· If the probe result of a metric (delay, jitter, or packet loss rate) is lower than or equal to the associated quality threshold in the SLA, the CQI value for the metric is 100.
· If the probe result of a metric is higher than the associated quality threshold in the SLA, the CQI value for the metric is calculated with the formula: (metric threshold × 100) / probe result of the metric.
· The overall CQI value is calculated with the formula: (x × Ds + y × Js + z × Ls) / (x + y + z).
In this formula, x, y, and z represent the weight values of delay, jitter, and packet loss rate, respectively (the weight values are in the range of 0 to 10, and cannot be all 0). Ds, Js, and Ls represent the CQI values for delay, jitter, and packet loss rate.
To avoid frequent link switchovers, the device uses the approximate overall CQI value to evaluate link quality. The approximate overall CQI value is a multiple of 5 that is smaller than and closest to the overall CQI value. For example, if the overall CQI value is 82.5, the approximate overall CQI value is 80.
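As an illustration of the formulas above, the following sketch computes the per-metric CQI values and the approximate overall CQI value. The SLA thresholds, weights, and probe results are illustrative assumptions:

```python
# Sketch of the CQI computation described above. SLA thresholds,
# weights, and probe results are illustrative assumptions.
def metric_cqi(probe, threshold):
    # 100 when the probe result is within the SLA threshold; otherwise
    # the documented ratio (threshold * 100) / probe result.
    if probe <= threshold:
        return 100.0
    return threshold * 100.0 / probe

def overall_cqi(probes, sla, weights):
    x, y, z = weights  # delay, jitter, loss weights (0-10, not all 0)
    ds = metric_cqi(probes["delay"], sla["delay"])
    js = metric_cqi(probes["jitter"], sla["jitter"])
    ls = metric_cqi(probes["loss"], sla["loss"])
    cqi = (x * ds + y * js + z * ls) / (x + y + z)
    # Approximate overall CQI: the closest multiple of 5 not above the
    # overall CQI value (e.g., 82.5 -> 80).
    return int(cqi // 5) * 5
```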
If you configure a quality policy for a specific type of service flow, a CPE obtains the link quality result based on the link probe results and the SLA, and then performs quality-based link selection. The device determines the quality of a candidate link as follows:
· If the approximate overall CQI value is smaller than 100, the link does not meet the service quality requirements.
· If the approximate overall CQI value is equal to 100, the link meets the service quality requirements.
If no quality policy is configured for the service flows, link quality is not considered in link selection, and the link meets the service quality requirements.
Bandwidth-based link selection
Bandwidth-based link selection not only selects links that meet the service bandwidth requirements, but also load shares service traffic among multiple links. This prevents a link from being overwhelmed or congested.
A device can select a suitable link for service traffic based on the following bandwidths:
· The used bandwidth of the link or the attached physical output interface.
· The total bandwidth of the link or the attached physical output interface.
· The per-session expected bandwidth.
In an SDWAN network, the bandwidth of a link refers to the bandwidth of the attached tunnel interface. The bandwidth of the link-attached physical output interface refers to the bandwidth of the physical output interface that sends tunneled packets.
The device uses sessions as the minimum granularity and performs bandwidth-based link selection to achieve refined link bandwidth management. A session is uniquely defined by a quintuple including the source IP address, destination IP address, source port, destination port, and transport layer protocol.
When the device selects links for traffic of a session, it first performs bandwidth detection based on the per-session expected bandwidth that is obtained in real time or manually configured. A link is qualified in the bandwidth detection if it meets the following requirements:
· For the link-attached physical output interface, the used bandwidth plus the per-session expected bandwidth is less than 80% of the total bandwidth.
· For the link, the used bandwidth plus the per-session expected bandwidth is less than 80% of the total bandwidth.
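A minimal sketch of the 80% bandwidth-detection rule above follows. The function name and figures are illustrative; the units only need to be consistent:

```python
# Sketch of the 80% bandwidth-detection rule described above.
# Bandwidth figures are illustrative; units must simply be consistent.
def link_qualified(link_used, link_total, if_used, if_total, expected):
    """Return True if both the link and its attached physical output
    interface can absorb the session's expected bandwidth under 80% load."""
    return (link_used + expected < 0.8 * link_total and
            if_used + expected < 0.8 * if_total)
```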
When different sessions of the same service use the same link selection policy, the results might still differ, because link bandwidth usage changes between the moments at which the sessions are placed.
Load balancing
· Per-session weight-based link selection mode—RIR global link load balancing mode that takes effect on all RIR flows. This mode can distribute the sessions of the same flow to different links according to the weights of the links. RIR selects only one link to transmit a session.
· Per-session periodic link adjustment mode—RIR global link load balancing mode that takes effect on all RIR flows. This mode not only can distribute the sessions of the same flow to different links, but also can periodically adjust links for the sessions. Within one adjustment period, RIR selects only one link to transmit a session.
· Per-packet mode—Flow-specific link load balancing mode that takes effect only on traffic of a flow. This mode can distribute the same session to different links for transmission.
The per-packet mode takes precedence over the per-session modes.
RIR-SDWAN working mechanisms
Link selection mechanisms
When a device receives a packet, it uses QoS to mark the packet with a flow ID based on the quintuple and performs a routing table lookup to identify whether routes are available to forward the packet. If routes are available, the device performs link selection for the packet. If no route is available, the device drops the packet.
RIR uses the following workflow to select a link to forward a packet:
1. Selects the flow template that has the same flow ID as the packet.
2. Selects the most suitable link from the links in the flow template by using the following criteria in order:
a. Preference-based link selection.
b. Quality tolerant link selection.
c. Bandwidth tolerant link selection.
3. If a link is found suitable, RIR returns the link selection result and stops searching other links. If no link is found suitable for a criterion, RIR uses the next criterion to select links. If RIR fails to find a suitable link by using all criteria, it determines that no link is suitable and returns the link selection result.
4. If no suitable link is found, the device performs forwarding based on the routing table. If a suitable link is found, RIR forwards the packet based on the link selection result.
After finishing link selection, the device forwards the subsequent packets of the same session based on the optimal link information.
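The ordered criteria in the workflow above can be sketched as follows. The link records and tie-breaking behavior are simplified assumptions; the real device additionally applies the load balancing modes described earlier:

```python
# Sketch of the ordered RIR link selection criteria above:
# preference-based, then quality tolerant, then bandwidth tolerant.
# Link records and tie-breaking are simplified assumptions.
def select_link(links):
    """links: list of dicts with 'name', 'preference', 'quality_ok',
    'bandwidth_ok' fields; returns the chosen link name or None."""
    # 1. Preference-based: walk preferences from highest to lowest and
    #    take a link meeting both quality and bandwidth requirements.
    for pref in sorted({l["preference"] for l in links}, reverse=True):
        group = [l for l in links if l["preference"] == pref]
        good = [l for l in group if l["quality_ok"] and l["bandwidth_ok"]]
        if good:
            return good[0]["name"]
    # 2. Quality tolerant: accept a link meeting only the bandwidth need.
    tolerant = [l for l in links if l["bandwidth_ok"]]
    if tolerant:
        return tolerant[0]["name"]
    # 3. Bandwidth tolerant: fall back to any candidate link.
    return links[0]["name"] if links else None
```

If every criterion fails (no candidate links at all), the device falls back to ordinary routing-table forwarding, as described in step 4 above.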
Figure 69 RIR link selection workflow
Preference-based link selection
RIR preferentially selects links that meet both the quality and bandwidth requirements for a service flow. A device uses the following process to examine links with the same preference:
1. The device examines all links with the preference and identifies whether a link forms ECMP routes with other links. If a link forms ECMP routes with other links, the device further identifies whether the link is a primary link that meets both the quality and bandwidth requirements of the service.
¡ If yes, the device adds the link to the available suitable link list of that preference.
¡ If no, the device further identifies whether the link is a primary link that meets the bandwidth requirements of the service.
- If yes, the device adds the link to the quality tolerant link list for quality tolerant link selection. Then, the device continues to examine other links with the same preference.
- If no, the device continues to examine other links with the same preference.
If a link does not form ECMP routes with other links, the device continues to examine other links with the same preference.
2. When the device finishes examining all links with the preference, it identifies how many suitable links are available for the service flow.
¡ If only one suitable link is available, the device selects that link as the optimal link.
¡ If multiple suitable links are available, the device selects one or multiple optimal links from them based on the link load balancing mode. In a per-session load balancing mode, the device selects only one link as the optimal link of a session. In the per-packet load balancing mode, the device can select multiple links as the optimal links of a session.
¡ If no suitable link is available, the device examines the links that have a preference value lower than the links with the current preference.
If no primary links in the flow template are suitable, the device determines that no optimal primary link is found for the service flow.
For more information about identifying whether a link meets the quality requirements, see "Quality tolerant link selection" and "Bandwidth-based link selection."
Figure 70 Preference-based link selection workflow
Quality tolerant link selection
If preference-based link selection fails to select a suitable link, a device performs quality tolerant link selection.
The links that meet the quality tolerant link selection criterion are those added to the quality tolerant link list during preference-based primary and backup link selection. These links do not meet the quality requirements of the service, but they meet the bandwidth requirements of the service. Quality tolerant link selection selects a link from the links that meet only the bandwidth requirements of the service.
The device selects the link with the highest approximate overall CQI value as the optimal link. If multiple links have the highest approximate overall CQI value, the device selects one or multiple optimal links from them based on the link load balancing mode.
Bandwidth tolerant link selection
If quality tolerant link selection still cannot find a suitable link for a service flow, a device performs bandwidth tolerant link selection. Bandwidth tolerant link selection selects one link from ECMP routes in the flow template as the optimal link.
If multiple links are available, the device selects one or multiple optimal links from them based on the link load balancing mode.
Link selection delay and suppression
To improve packet forwarding efficiency, a device does not repeatedly perform link selection for traffic of the same session. After the device performs link selection for traffic of a session, it forwards the subsequent traffic of that session according to the previous link selection result. Link reselection is triggered when any link in the session's flow template has one of the following changes:
· The quality of a link becomes qualified from unqualified or the quality of a link becomes unqualified from qualified.
· The bandwidth usage of a link has reached 90% of the maximum bandwidth.
· The bandwidth usage of a link-attached physical output interface has reached 90% of the maximum bandwidth.
To avoid frequent link selection caused by link flapping, RIR defines a link selection delay and link selection suppression period.
After the device performs link selection, it starts the link selection suppression period if the period has been configured. Within the link selection suppression period, the device does not perform link reselection, but it maintains the link state data. When the link selection suppression period ends, the link selection delay timer starts. If the link state still meets the conditions that can trigger link reselection when the delay timer expires, the device performs link reselection. If the link state changes to not meet the conditions that can trigger link reselection within the delay time, the device does not perform link reselection.
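As a simplified sketch of the suppression and delay behavior above, a plain timeline in seconds stands in for the device timers; the names are illustrative:

```python
# Sketch of the reselection suppression/delay behavior described above.
# A monotonic timeline (in seconds) stands in for the device timers.
def may_reselect(now, last_selection, suppression, delay, condition_met_since):
    """Return True when link reselection is allowed:
    - the suppression period after the last selection has elapsed, and
    - the triggering condition has persisted for the whole delay time."""
    if now < last_selection + suppression:
        return False                 # still inside the suppression period
    if condition_met_since is None:
        return False                 # no trigger condition is active
    return now - condition_met_since >= delay
```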
RIR application in the AD-WAN branch solution
In the AD-WAN branch solution, the AD-WAN controller deploys link selection policies to RIR-enabled devices, which then provide distributed intelligent routing capabilities without relying on the controller for real-time link selection.
Figure 71 RIR application in the AD-WAN branch solution
As shown in Figure 71, RIR is applied to the AD-WAN branch solution as follows:
1. The controller defines different flow templates for different service applications. A flow template contains flow policies for an application, including the SLA profile, WAN selection policy, and per-session expected bandwidth.
2. The controller deploys MQC to the inbound direction of interfaces at the LAN side on the devices. By matching application packets to a quintuple or signature, the devices mark the packets with the corresponding flow ID.
3. The devices detect the quality and connectivity of each outgoing link in real time based on the parameters issued by the controller.
4. For incoming application traffic, the devices select links and adjust links in real time based on the iNQA probe results and the application link selection policies deployed by the controller.
DPI-based application recognition
Introduction to DPI
Deep packet inspection (DPI), an advanced method of examining and managing network traffic, goes beyond packet headers. It identifies application types or content by examining the application layer payloads that traditional packet identification approaches cannot detect. When IP packets and TCP or UDP flows travel through a DPI-capable device, the DPI engine inspects the payloads to reconstruct the application layer information and identify the application layer protocols.
Figure 72 DPI analysis on application signatures
Introduction to APR used in DPI
APR uses the following methods to recognize an application protocol:
· Port-based application recognition (PBAR)—Maps a port to an application protocol and recognizes packets of the application protocol according to the port-protocol mapping.
· Network-based application recognition (NBAR)—Uses predefined or user-defined NBAR rules to match packet contents and recognize the application protocols of the matching packets.
NBAR can recognize the system-defined and user-configured application types.
APR in AD-WAN branch solution
The AD-WAN branch solution provides content signature-based identification of application layer protocols. Upon receiving a packet, a device compares the packet payload with signatures in the application signature database to identify the application protocol.
AD-WAN provides a system-defined APR signature database with signatures of over 3000 applications, covering most mainstream applications. It also allows enterprise users to define signatures as needed to identify specific applications. The APR signature database supports continuous online updates to keep pace with future application changes.
Application cases
Solution deployment for a supermarket enterprise
The customer service challenges include:
· Scattered store locations make onsite deployment by professionals time-consuming and costly.
· Heavy workload from manually deploying services such as IPsec VPN and QoS.
· Difficulty in locating faults, because traditional networks lack unified and visualized management.
To address these challenges, this solution uses the following design:
· Use zero-touch deployment to bring store services online quickly.
· Use AD-WAN to manage WAN and LAN network resources, greatly reducing service deployment workload.
· Provide visualized interfaces to monitor device status and link quality for stores to facilitate fault location.
Solution deployment for a large enterprise
A large enterprise has a wide range of businesses including manufacturing and distribution of consumer products, real estate, and infrastructure and public utilities. The enterprise has 17 primary profit centers, six listed companies in Hong Kong, and more than 1000 branches.
The enterprise is facing the following business challenges:
· Lack of a unified platform to provide multitenancy and hierarchical management of multiple branches.
· Complicated configuration for load sharing and assurance of key business traffic over the two VPN links and one Internet link between each branch and the headquarters.
To address these challenges, this solution uses the following design:
· Deploy AD-WAN to provide visualized management of the branches.
· Use RIR and QoS to provide intelligent link selection and bandwidth allocation for services such as manufacturing, video conferencing, and Internet access.
Solution deployment for a joint-stock bank
The Shandong branch of a bank has 8 counter service outlets (including the branch business department) and 13 self-service banks in Jinan city. It also has subbranches in Weifang and Yantai.
The customer service challenges include:
· Limited bandwidth that might result in poor video conferencing experience and unresponsive transaction pages.
· Wasted bandwidth caused by inadequate, complex load sharing among the links for production, office, surveillance, and video conferencing services.
· Complicated network maintenance workload without unified management of branch networks.
To address these challenges, this solution uses the following design:
· Configure differentiated intelligent traffic scheduling policies to provide guaranteed network access services for critical applications.
· Identify applications and monitor VPN link bandwidth usage in real time. Dynamically distribute traffic across VPN links to guarantee critical applications sufficient bandwidth.
· Deploy a unified management platform for the ease of network operations and assurance.