Traffic Scheduling and Routing for SDN-Based Data Center Networks
Project Background
The rapid growth of the Internet is driving the development of data center networks. However, surging network traffic and changing service demands place a heavy load on network operations. Traditional solutions that focus on raising link speed and expanding capacity can no longer meet these challenges effectively. Improving network efficiency is therefore fundamental to the sustainable development of data center networks.
Project goal: Traffic scheduling and routing, as core functions of the network, determine the transmission paths of data flows. They are an active research topic in both academia and industry, and they are the focus of this project.
The routing schemes most widely used in data center networks today are shortest-path-first routing and equal-cost multi-path (ECMP) routing. Both map data flows to paths statically, ignoring link load and flow requirements, which inevitably leads to link congestion. This project studies a set of routing schemes that sense the network state and adjust adaptively to optimize network resources and performance.
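The contrast between static and load-aware path selection can be illustrated with a minimal Python sketch. It is not taken from the project: the function names, the CRC32 hash, and the per-path load list are all assumptions chosen for illustration.

```python
import zlib


def ecmp_path(flow, num_paths):
    """Static ECMP-style choice: hash the flow's 5-tuple onto one path.

    The result depends only on the header fields, never on link load.
    (CRC32 is an illustrative hash, not what any particular switch uses.)
    """
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % num_paths


def least_loaded_path(path_loads):
    """Load-aware choice: pick the index of the currently least-loaded path.

    `path_loads` is a hypothetical list of per-path utilization values.
    """
    return min(range(len(path_loads)), key=lambda i: path_loads[i])


# Hypothetical flow: (src IP, dst IP, src port, dst port, protocol).
flow = ("10.0.0.1", "10.0.1.2", 40000, 80, "tcp")
static_choice = ecmp_path(flow, num_paths=4)
adaptive_choice = least_loaded_path([0.9, 0.2, 0.5, 0.7])
```

The static choice is the same every time for a given flow, even when the chosen path is congested, whereas the load-aware choice follows the measured load.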
Project Contribution 1
In a software-defined networking (SDN) setting, the data center can be equipped with a centralized controller that collects global information and schedules the whole network. Building on this, the project first proposes a zero-queuing-delay scheduling and routing scheme for data center networks with a fat-tree topology. Specifically, flows are finely scheduled at the edge routers/switches using edge colorings of certain bipartite graphs. As a result, flows do not need to be queued at intermediate switches once they enter the network.
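To make the edge-coloring idea concrete, here is a minimal Python sketch of bipartite edge coloring using the classical alternating-path method. The modeling is an assumption, not the project's actual construction: left nodes stand for source edge switches, right nodes for destination edge switches, each edge for one unit of traffic, and each color for one conflict-free time slot. The function name `edge_color_bipartite` is hypothetical.

```python
def edge_color_bipartite(edges, n_left, n_right):
    """Color a bipartite multigraph with max-degree many colors.

    Returns a list of slots; each slot is a list of (src, dst) pairs in
    which no src and no dst appears twice.
    """
    deg_l, deg_r = [0] * n_left, [0] * n_right
    for u, v in edges:
        deg_l[u] += 1
        deg_r[v] += 1
    delta = max(deg_l + deg_r, default=0)
    # left[u][c] = right node joined to u by color c (None if c is free at u)
    left = [[None] * delta for _ in range(n_left)]
    right = [[None] * delta for _ in range(n_right)]

    for u, v in edges:
        a = left[u].index(None)              # a color free at u
        if right[v][a] is not None:          # a is taken at v: free it
            b = right[v].index(None)         # a color free at v
            # Walk the a/b alternating path that starts at v.
            path, cur, on_right, color = [], v, True, a
            while True:
                nxt = right[cur][color] if on_right else left[cur][color]
                if nxt is None:
                    break
                path.append((nxt, cur, color) if on_right else (cur, nxt, color))
                cur, on_right = nxt, not on_right
                color = b if color == a else a
            # Swap colors a and b along the path (clear first, then set).
            for l, r, c in path:
                left[l][c] = right[r][c] = None
            for l, r, c in path:
                other = b if c == a else a
                left[l][other], right[r][other] = r, l
        left[u][a], right[v][a] = v, u

    return [[(u, left[u][c]) for u in range(n_left) if left[u][c] is not None]
            for c in range(delta)]


# Hypothetical demand: 3 source and 3 destination switches, one repeated pair.
demand = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2), (0, 2), (0, 0)]
slots = edge_color_bipartite(demand, n_left=3, n_right=3)
```

Under this assumed model, traffic sent in the same slot never shares a source or destination port, which is the intuition behind avoiding queuing inside the network. In a bipartite graph the number of colors needed equals the maximum node degree, so the schedule length is set by the busiest switch.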
Project Contribution 2
A fully centralized algorithm requires extensive information exchange, and its complexity grows significantly with network scale. This research therefore also considers partially centralized schemes, and designs low-complexity, low-latency, high-throughput scheduling schemes that make full use of the buffering (cache) capability of the network's internal switches.
Project Contribution 3
Centralized and partially centralized solutions all rely on network modeling. However, real networks are very complex and traffic demand changes constantly, which makes accurate modeling difficult. In recent years, the rapid development of machine learning has opened up new possibilities for routing algorithm design. This project uses reinforcement learning to propose a hop-by-hop routing scheme that adapts to network changes. Instead of relying on complex assumptions and models of the network, the method trains its model on real data. It explores the relationship between routing decisions and network feedback such as delay, and learns a globally optimal routing strategy.
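As an illustration of how routing decisions can be learned from delay feedback, the sketch below uses classical tabular Q-routing. This is a stand-in technique, not the project's own algorithm: the class name `QRouter`, the learning rate, and the node labels are all hypothetical, and exploration is omitted for brevity.

```python
from collections import defaultdict


class QRouter:
    """Per-node agent that estimates delivery delay via each neighbor."""

    def __init__(self, neighbors, lr=0.5):
        self.neighbors = list(neighbors)
        self.lr = lr
        # q[(dest, next_hop)] = estimated delay to dest via next_hop
        self.q = defaultdict(float)

    def next_hop(self, dest):
        """Greedy hop-by-hop decision: neighbor with the lowest estimate."""
        return min(self.neighbors, key=lambda n: self.q[(dest, n)])

    def best_estimate(self, dest):
        """This node's own best delay estimate, reported back upstream."""
        return min(self.q[(dest, n)] for n in self.neighbors)

    def update(self, dest, next_hop, hop_delay, neighbor_estimate):
        """Move the estimate toward observed hop delay plus the
        neighbor's own best estimate of the remaining delay."""
        key = (dest, next_hop)
        target = hop_delay + neighbor_estimate
        self.q[key] += self.lr * (target - self.q[key])


# Hypothetical node with neighbors B and C, routing toward destination D.
router = QRouter(["B", "C"], lr=0.5)
router.update("D", "B", hop_delay=4.0, neighbor_estimate=0.0)
choice_after_slow_b = router.next_hop("D")
router.update("D", "C", hop_delay=10.0, neighbor_estimate=0.0)
choice_after_slow_c = router.next_hop("D")
```

Each node keeps only local estimates and learns from the delays it observes, so no explicit network model is needed. The routing choice shifts away from a neighbor once feedback shows it to be slow.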
Next steps
This project will conduct both theoretical and practical research. On the theoretical side, we will continue to explore the use of multi-agent reinforcement learning (MARL) to solve routing and traffic engineering problems. In particular, we aim to design suitable reward mechanisms for MARL to accelerate learning, and cooperation mechanisms among agents to stabilize convergence. On the practical side, we will consider deploying MARL-based routing algorithms in real network scenarios, especially large-scale networks.
Team members
Yi Chen, Siliang Zeng, Xingfei Xu, Xuan Mai, Quanzhi Fu, Xuhong Cai