The Chinese University of Hong Kong, Shenzhen
Abstract: Gradient descent (GD) type optimization methods are the standard instrument for training artificial neural networks (ANNs) with rectified linear unit (ReLU) activation. Despite the great success of GD type optimization methods in numerical simulations for the training of ANNs with ReLU activation, it remains -- even in the simplest situation of the plain vanilla GD optimization method with random initializations -- an open problem to prove (or disprove) the conjecture that the true risk of the GD optimization method converges to zero in the training of ANNs with ReLU activation as the width/depth of the ANNs, the number of independent random initializations, and the number of GD steps increase to infinity. In this talk we prove this conjecture in the situation where the probability distribution of the input data is absolutely continuous with respect to the continuous uniform distribution with a piecewise polynomial density, where the probability distributions for the random initializations of the ANN parameters are standard normal distributions, and where the target function under consideration is continuous and piecewise polynomial.
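To make the setting concrete, the following is a minimal sketch of the training scheme the conjecture refers to: plain vanilla GD on a shallow ReLU network, repeated over independent standard normal initializations, keeping the smallest empirical risk found. All function names, hyperparameters, and the specific target function below are our own illustrative choices, not the talk's.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def forward(params, x):
    W1, b1, w2, b2 = params
    h = relu(x @ W1 + b1)        # hidden features, shape (n, width)
    return h @ w2 + b2, h        # scalar prediction per sample

def empirical_risk(params, x, y):
    pred, _ = forward(params, x)
    return np.mean((pred - y) ** 2)

def gd_step(params, x, y, lr):
    # One plain vanilla GD step; gradients are computed analytically
    # for a single hidden layer with ReLU activation.
    W1, b1, w2, b2 = params
    pred, h = forward(params, x)
    r = 2.0 * (pred - y) / x.shape[0]     # d(risk)/d(pred)
    gw2, gb2 = h.T @ r, r.sum()
    dh = np.outer(r, w2) * (h > 0)        # backprop through the ReLU
    gW1, gb1 = x.T @ dh, dh.sum(axis=0)
    return [W1 - lr * gW1, b1 - lr * gb1, w2 - lr * gw2, b2 - lr * gb2]

def best_risk(x, y, width=32, steps=2000, lr=1e-3, restarts=5, seed=0):
    rng = np.random.default_rng(seed)
    best = np.inf
    for _ in range(restarts):     # independent standard normal initializations
        params = [rng.standard_normal((1, width)), rng.standard_normal(width),
                  rng.standard_normal(width), rng.standard_normal()]
        for _ in range(steps):
            params = gd_step(params, x, y, lr)
        best = min(best, empirical_risk(params, x, y))
    return best   # conjectured -> 0 as width, restarts, and steps grow

# Uniform input data, continuous piecewise polynomial target.
x = np.random.default_rng(1).uniform(-1.0, 1.0, size=(256, 1))
y = np.where(x[:, 0] < 0.0, x[:, 0] ** 2, x[:, 0])
print(best_risk(x, y))
```

The talk's theorem concerns the true risk of exactly this kind of scheme in the limit where width, the number of restarts, and the number of GD steps all grow.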
University of Illinois at Urbana-Champaign
Abstract: GAN (generative adversarial net) training is known to be challenging and suffers from various issues such as mode collapse. One major challenge is that the GAN optimization problem is a non-convex-non-concave min-max problem. As a result, most recent studies have focused on local analysis. In this talk, we discuss how to perform a global analysis of GANs. We prove that a class of separable GANs (SepGANs), including the popular JS-GAN (Jensen-Shannon GAN) and hinge-GAN, have exponentially many sub-optimal strict local Stackelberg equilibria; in addition, these equilibria correspond to mode-collapse patterns. We prove that relativistic pairing GANs (RpGANs) have no sub-optimal strict local Stackelberg equilibria. RpGAN can be viewed as an unconstrained variant of W-GAN: RpGAN keeps the "pairing" idea of W-GAN, but adds an upper-bounded shell function (e.g., the logistic loss of JS-GAN). The empirical benefit of RpGANs has been demonstrated by practitioners in, e.g., ESRGAN and RealnessGAN, and we provide additional experiments to reveal that this benefit is at least partially due to a better landscape. More specifically, our landscape theory predicts that RpGANs have a bigger advantage over SepGANs for high-resolution data (e.g., LSUN 256×256), imbalanced data, and narrower nets (1/2 or 1/4 width), and our experiments on real data verify these predictions. Reference: Towards a better global landscape of GANs (NeurIPS 2020 oral).
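To illustrate the structural difference the theory exploits, here is a short sketch (function names are ours; d_real and d_fake are the discriminator's scores on a real sample and a paired generated sample) contrasting a separable JS-GAN-style discriminator loss with the relativistic pairing loss:

```python
import numpy as np

def softplus(t):
    # log(1 + exp(t)), computed stably; this is the logistic "shell".
    return np.logaddexp(0.0, t)

def sepgan_d_loss(d_real, d_fake):
    # Separable (JS-GAN-style) loss: real and fake scores enter through
    # two independent terms, so each can be pushed to extremes in isolation.
    return np.mean(softplus(-d_real)) + np.mean(softplus(d_fake))

def rpgan_d_loss(d_real, d_fake):
    # Relativistic pairing loss: only the paired score difference matters
    # (the "pairing" idea of W-GAN), wrapped in the logistic shell so the
    # payoff stays bounded, unlike W-GAN's linear critic.
    return np.mean(softplus(-(d_real - d_fake)))

rng = np.random.default_rng(0)
d_real, d_fake = rng.standard_normal(8), rng.standard_normal(8)
print(sepgan_d_loss(d_real, d_fake), rpgan_d_loss(d_real, d_fake))
```

Because the shell bounds the payoff, no Lipschitz constraint on the critic is needed, which is the sense in which RpGAN is an unconstrained variant of W-GAN.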
Fudan University
Abstract: With advances in technology, brain science has accumulated data ranging across molecular, cellular, network, and tissue scales. These data sets are typically dynamic and exhibit rich spatio-temporal patterns. In this talk, we will first introduce statistical methods to tackle such data, revealing both correlations and causalities. Examples are presented for various mental diseases and cognitive deficits. We then go a step further and discuss how to extract dynamical patterns from the complex data. Finally, we will address some future challenges we are currently facing.
Southwest Jiaotong University
Abstract: Vehicular radar sensing and communications are the two primary uses of radio frequency (RF) signals in transportation systems, especially in the mmWave band. To get the most use out of scarce spectrum, it is desirable to design a system that accommodates both radar and communications functions using the same platform and spectral resources. There are three types of waveform designs for such systems: communications waveform-based design, radar waveform-based design, and joint waveform design. This talk will discuss the related waveform design requirements and present several new types of waveforms and their applications in vehicular radar sensing and communications.
Tsinghua University
Abstract: Deep generative models (DGMs) provide a set of powerful tools to learn the distribution of high-dimensional images, which can be used for multiple tasks, including sample generation, semi-supervised learning and continual learning. However, the learning of DGMs can be unstable or inefficient due to the highly nonlinear functions defined by neural networks. In this talk, I will present some recent progress on reliably and efficiently learning deep generative models, including generative adversarial networks, normalizing flows and energy-based models. I will also present some application examples on semi-supervised learning and continual learning.
Peking University
Abstract: It is well-known that standard neural networks, even with a high classification accuracy, are vulnerable to small ℓ∞-norm bounded adversarial perturbations. Although many attempts have been made, most previous works either can only provide empirical verification of the defense against a particular attack method, or can only develop a certified guarantee of model robustness in limited scenarios. In this paper, we seek a new approach to develop a theoretically principled neural network that inherently resists ℓ∞ perturbations. In particular, we design a novel neuron that uses ℓ∞-distance as its basic operation (which we call the ℓ∞-dist neuron), and show that any neural network constructed with ℓ∞-dist neurons (called an ℓ∞-dist net) is naturally a 1-Lipschitz function with respect to the ℓ∞-norm. This directly provides a rigorous guarantee of certified robustness based on the margin of the prediction outputs. We also prove that such networks have enough expressive power to approximate any 1-Lipschitz function with a robust generalization guarantee. Our experimental results show that the proposed network is promising. Using ℓ∞-dist nets as basic building blocks, we consistently achieve state-of-the-art performance on commonly used datasets: 93.09% certified accuracy on MNIST (ϵ=0.3), 79.23% on Fashion-MNIST (ϵ=0.1), and 35.10% on CIFAR-10 (ϵ=8/255).
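As a concrete illustration of the basic operation (a minimal sketch; shapes and names are ours), an ℓ∞-dist unit replaces the usual dot product with an ℓ∞ distance to its weight vector, and the 1-Lipschitz property follows from the reverse triangle inequality:

```python
import numpy as np

def linf_dist_layer(x, W, b):
    # Each unit j outputs ||x - W[j]||_inf + b[j]: the l_inf distance to
    # its weight vector replaces the usual dot product.
    return np.max(np.abs(x[None, :] - W), axis=1) + b

# 1-Lipschitzness w.r.t. the l_inf norm: an input perturbation of l_inf
# size eps changes every unit's output by at most eps (reverse triangle
# inequality), so prediction margins directly certify robustness.
rng = np.random.default_rng(0)
x = rng.standard_normal(8)
W, b = rng.standard_normal((4, 8)), rng.standard_normal(4)
eps = 0.1
x_adv = x + rng.uniform(-eps, eps, size=8)
out_gap = np.abs(linf_dist_layer(x, W, b) - linf_dist_layer(x_adv, W, b))
assert np.all(out_gap <= eps + 1e-12)
```

Stacking such layers keeps the whole network 1-Lipschitz, since a composition of 1-Lipschitz maps is 1-Lipschitz.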
Huawei Technologies Co., Ltd.
Abstract: What kinds of problems are we facing as Wi-Fi technology evolves in homes, enterprises, campuses, industries, and beyond under the trend of F5G+X? What exactly are the challenges in Wi-Fi algorithm design that we need to cope with in order to guarantee high bandwidth, low latency, and wide coverage?
Southeast University
Abstract: Simultaneous localization and mapping (SLAM) during communication is emerging. This technology promises to provide information on propagation environments and transceivers' locations, thus creating several new services and applications for the Internet of Things and environment-aware communication. Using crowdsourced data collected by multiple agents shows great potential for enhancing SLAM performance. However, the measurement uncertainties encountered in practice and biased estimations from multiple agents may result in serious errors. In this talk, we introduce our recent studies on developing a robust SLAM method with measurement plug-and-play and crowdsourcing mechanisms to address the above problems. First, we divide measurements into different categories according to their unknown biases and realize a measurement plug-and-play mechanism by extending the classic belief propagation (BP)-based SLAM method. The proposed mechanism can obtain the time-varying agent location, radio features, and corresponding measurement biases (such as clock bias, orientation bias, and received signal strength model parameters) with high accuracy and robustness in challenging scenarios, without any prior information on anchors and agents. Next, we establish a probabilistic crowdsourcing-based SLAM mechanism, in which multiple agents cooperate to construct and refine the radio map in a decentralized manner. Our study presents the first BP-based crowdsourcing mechanism that resolves the "double counting" and "data reliability" problems through the flexible application of probabilistic data association methods. Numerical results reveal that the crowdsourcing mechanism can further improve the accuracy of the mapping result, which, in turn, ensures decimeter-level localization accuracy for each agent in a challenging propagation environment. Finally, we point out open research questions for further study.
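As a schematic illustration of how such biases enter the problem (the measurement model and category names below are our own simplification, not the talk's exact formulation), a time-of-arrival-type measurement carries a clock-bias term that is appended to the state and estimated jointly:

```python
import numpy as np

def predicted_toa_range(agent_pos, feature_pos, clock_bias, c=3.0e8):
    # Range implied by a time-of-arrival measurement: geometry plus an
    # unknown clock bias, estimated jointly with the positions.
    return np.linalg.norm(agent_pos - feature_pos) + c * clock_bias

# Plug-and-play idea (schematic): each measurement category contributes
# its own unknown bias variables to the estimated state.
bias_variables = {
    "time_of_arrival": ["clock_bias"],
    "angle_of_arrival": ["orientation_bias"],
    "received_signal_strength": ["path_loss_exponent", "reference_power"],
}
```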
The Chinese University of Hong Kong, Shenzhen
Abstract: What is the minimum car fleet size of a ride-hailing platform required to serve a given transport demand? In a sharing economy, do prices yield unique results? Do crowdsourced platforms operate as efficiently as ones where operations are managed centrally?
To answer these and other related questions, we will consider an abstract setting of resource allocation games between a continuum of semi-Markov decision processes. While most theoretical questions remain open for arbitrary discounting, in this talk we present the basic theory for the average case, i.e., when discounting approaches zero. As it turns out, in this case the existence of stationary equilibria and the investigation of their properties rest on an Eisenberg-Gale type of convex program. While equilibria may fail to be Pareto efficient, the price of anarchy equals 2 in the homogeneous case (and ∞ more generally).
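For reference, the classical Eisenberg-Gale program (stated here in its standard form; the talk's result rests on a program of this type, adapted to the game above) maximizes a weighted sum of logarithmic utilities over feasible allocations:

```latex
\max_{x \ge 0}\; \sum_{i} w_i \log u_i(x_i)
\quad \text{subject to} \quad \sum_{i} x_{ij} \le s_j \quad \text{for every resource } j,
```

where w_i > 0 are the agents' weights, u_i are concave utilities, and s_j is the supply of resource j; the optimality conditions of this convex program simultaneously deliver equilibrium prices and allocations.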
In the specific case of ride-hailing platforms we will deal with the issues of price selection, dynamic pricing and fleet size.
Georgia Institute of Technology
Abstract: The talk will discuss how AI works in drug discovery, including the current situation and challenges faced by the biopharma industry, industrial solutions, research progress in biological computing, and frontier exploration of AI + biopharma.
Linköping University
Abstract: Current 5G deployments are almost exclusively based on “Massive MIMO” technology, where each base station is equipped with an array of many antenna-integrated radios. Massive MIMO increases the capacity per cell since each base station can serve many users at the same time and frequency, with the help of narrow beamforming.
The widespread adoption of Massive MIMO in 5G is a great achievement for a technology that, ten years ago, was generally believed to be too bulky, overly power-hungry, and only useful in niche scenarios. Despite the word “massive”, 5G base stations are physically small from the user’s viewpoint. The major vendors have products that weigh less than 20 kg and fit in the same towers as conventional base station antennas.
What if we were to build truly physically large antenna arrays in the future? In this plenary, we take a look at the benefits such large antenna arrays can have for future wireless communication systems. In particular, we will discover how the near-field effects of large arrays can be very different from the near-field effects of individual antennas. To study these phenomena, we need to go back to the fundamental electromagnetic wave equations and remove some of the simplifications that we are used to making. The new insights that can be obtained by doing so are useful when designing both active antenna arrays and passive reconfigurable intelligent surfaces.
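To see why near-field effects become unavoidable for large arrays, recall the classical Fraunhofer distance 2D²/λ beyond which the far-field (planar-wavefront) approximation holds; the numbers below are our own illustrative example, not from the talk:

```python
def fraunhofer_distance(aperture_m, frequency_hz, c=3.0e8):
    # d_F = 2 D^2 / lambda: classical boundary beyond which the
    # far-field (planar wavefront) approximation is accurate for an
    # antenna or array with largest dimension D.
    wavelength = c / frequency_hz
    return 2.0 * aperture_m ** 2 / wavelength

# A conventional panel vs. a physically large array, both at 3 GHz.
print(fraunhofer_distance(0.5, 3e9))    # ~5 m: near field is negligible
print(fraunhofer_distance(10.0, 3e9))   # 2000 m: users sit in the near field
```

A 10 m aperture thus places essentially all users in the radiative near field, which is why the usual plane-wave simplifications must be revisited.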
EECS, UC Berkeley
Abstract: This work proposes a new computational framework for learning an explicit generative model for real-world datasets. More specifically, we propose to learn a closed-loop transcription between a multi-class, multi-dimensional data distribution and a linear discriminative representation (LDR) in a feature space consisting of multiple independent linear subspaces. We argue that the optimal encoding and decoding mappings sought can be formulated as the equilibrium point of a two-player minimax game between the encoder and decoder. A natural utility function for this game is the so-called rate reduction, a simple information-theoretic measure of distances between mixtures of subspace-like Gaussians in the feature space. Our formulation draws inspiration from closed-loop error feedback in control systems and avoids expensive evaluation and minimization of approximated distances between arbitrary distributions in either the data space or the feature space. To a large extent, this new formulation unifies the concepts and benefits of auto-encoding and GANs and naturally extends them to the setting of learning a representation that is both discriminative and generative for multi-class, multi-dimensional real-world data. Our extensive experiments on many benchmark imagery datasets demonstrate the tremendous potential of this new closed-loop formulation: we observe that the features of different classes so learned are explicitly mapped onto approximately independent principal subspaces in the feature space, and diverse visual attributes within each class are modeled by the independent principal components within each subspace. This work opens up many deep mathematical problems regarding learning submanifolds in high-dimensional spaces and suggests potential computational mechanisms for how memory can be formed through a purely internal closed-loop process.
This is joint work with Xili Dai, Shengbang Tong, Mingyang Li, Ziyang Wu, Kwan Ho Ryan Chan, Pengyuan Zhai, Yaodong Yu, Michael Psenka, Xiaojun Yuan, Heung-Yeung Shum.
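For readers unfamiliar with rate reduction, one standard form from the maximal coding rate reduction literature (our notation: Z ∈ R^{d×n} collects the n learned d-dimensional features, Π_j is the diagonal membership matrix of class j, and ε is a prescribed distortion) is:

```latex
R(Z,\epsilon) = \frac{1}{2}\log\det\!\Big(I + \frac{d}{n\epsilon^{2}}\, ZZ^{\top}\Big),
\qquad
\Delta R(Z,\Pi,\epsilon) = R(Z,\epsilon)
 - \sum_{j}\frac{\operatorname{tr}(\Pi_{j})}{2n}\,
   \log\det\!\Big(I + \frac{d}{\operatorname{tr}(\Pi_{j})\,\epsilon^{2}}\, Z\Pi_{j}Z^{\top}\Big).
```

Maximizing ΔR expands the coding rate of the whole feature set while compressing each class toward a low-dimensional subspace; this is the utility over which the encoder-decoder minimax game described above is played.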
Shanghai University of Finance and Economics
Abstract: In recent years, data-driven technologies have proven useful and are widely applied in firms' daily operations management. Various disciplines have contributed to the advances in this area; however, people from different backgrounds often work in separate groups and at separate stages. They often have different or even divided opinions on solution methods, both in research and in practice.
In this talk, we present several case studies and share our observations on the challenges in this field based on our experiences. Overall, integrating knowledge from different disciplines according to the characteristics of the problem is crucial, and scenarios that appear to be similar may require totally different approaches.
Huawei Technologies Co., Ltd.
Abstract: The emergence of machine learning for enhancing and redefining wireless communications has shown great potential in breaking the bottlenecks of throughput, latency, reliability, sensing capability, and power efficiency. Wireless intelligent communication is a new, cross-disciplinary area that brings key challenges to system frameworks, algorithm design, and computing architectures. This talk will discuss research progress in this realm and, more importantly, several challenging mathematical problems faced by industrial solutions for wireless intelligent communication.
Technical University of Berlin
Abstract: Coded caching is a network coding scheme originally proposed by Maddah-Ali and Niesen for a system where a server is connected to users via a common shared bottleneck link and users have local cache memory. The scheme has been proved to be essentially information-theoretically optimal. Since the first foundational works, an enormous body of literature has explored coded caching for different network topologies, in conjunction with physical-layer models such as multiuser MIMO, and including additional constraints such as privacy of the demands. In this talk, we shall review the basic results, illustrate some relevant recent results, and discuss the applicability of coded caching in realistic scenarios, in particular as an overlay application-layer scheme to be implemented on general routing networks (above IP).
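For context, in the original shared-link setting with N files, K users, and a per-user cache of M files, the Maddah-Ali-Niesen scheme achieves the delivery rate (in file transmissions)

```latex
R(M) = K\Big(1 - \frac{M}{N}\Big)\cdot\frac{1}{1 + KM/N},
```

where K(1 − M/N) reflects the local caching gain alone and the factor 1/(1 + KM/N) is the global gain from coded multicasting. The numeric example is ours: with N = K = 100 and M = 20, uncoded delivery needs 80 file transmissions, while coded caching needs only 80/21 ≈ 3.8.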