Peking University
Abstract: The stochastic gradient descent algorithm (SGD) is the most popular training algorithm in machine learning. In this lecture, we discuss some basic theoretical issues of this algorithm, including:
the convergence behavior
the optimal learning rate and batch size
in the over-parametrized regime, the selection of the particular global minimum that SGD converges to
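As a rough illustration of the quantities named in the list above, the following minimal sketch runs mini-batch SGD on a noise-free one-dimensional least-squares problem. It is not taken from the lecture: the function name `sgd_least_squares`, the data, and the values of `lr`, `batch_size`, and `steps` are all assumptions chosen for illustration.

```python
import random


def sgd_least_squares(xs, ys, lr=0.1, batch_size=4, steps=2000, seed=0):
    """Mini-batch SGD for the model y ~ w * x + b with squared loss.

    All hyperparameters are illustrative assumptions, not recommended values.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Sample a mini-batch of indices with replacement.
        batch = [rng.randrange(n) for _ in range(batch_size)]
        grad_w = 0.0
        grad_b = 0.0
        for i in batch:
            err = w * xs[i] + b - ys[i]   # residual of sample i
            grad_w += err * xs[i]         # d/dw of 0.5 * err**2
            grad_b += err                 # d/db of 0.5 * err**2
        # Step along the averaged stochastic gradient.
        w -= lr * grad_w / batch_size
        b -= lr * grad_b / batch_size
    return w, b


# Noise-free data generated from w = 3, b = -1 (hypothetical example).
xs = [i / 10 for i in range(-10, 11)]
ys = [3.0 * x - 1.0 for x in xs]
w, b = sgd_least_squares(xs, ys)
print(w, b)
```

Because the data are fitted exactly by the model, the stochastic gradient vanishes at the minimizer and the iterates settle there; with noisy labels the choice of `lr` and `batch_size` would instead control the size of the residual fluctuations.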
Stanford University
Abstract: In this talk, I will describe a few developments from my research team on solving convex and nonconvex optimization problems. They include a semidefinite programming approach to computing the optimal diagonal preconditioners of linear systems; a potential-reduction algorithm for (conic) linear programming; and a general unconstrained nonlinear optimization algorithm based on a dimension-reduced trust-region method. While preliminary theoretical analyses will be shown, the talk focuses on practical performance, with various numerical results in Machine/Deep Learning and Data/Information Sciences.
Shenzhen Research Institute of Big Data
Abstract: Intelligent Reflecting Surface (IRS) is a digitally controlled metasurface that can be densely deployed in wireless networks to reconfigure the propagation channels by dynamically tuning signal reflections. IRS can not only significantly improve the network spectral and energy efficiency for communications, but also greatly enhance the performance of other emerging applications such as wireless power transfer, sensing, and localization. The existing research on IRS has mainly considered wireless systems with single-IRS reflections at the link level, which does not reveal the full potential of IRS for future wireless networks. In this talk, we will focus on the main design challenges for efficiently integrating IRSs into wireless networks, including IRS reflection optimization, channel acquisition, and optimal deployment, with an emphasis on double-/multi-IRS reflections. In particular, a new graph-based optimization framework is introduced to design the optimal beam routes enabled by multi-IRS reflections for maximizing the capacity of multi-user communications.
Shanghai Jiao Tong University
Abstract: Multimedia signals are becoming our major source of information. The average screen time of teens and adults reaches 9 to 10 hours per day. However, the quality of those multimedia signals is often negatively affected by the environment, devices, and operations during the acquisition, compression, transmission, and display stages. Perceptual quality assessment algorithms aim to gauge viewers' QoE (quality of experience) and thus provide optimization objectives for multimedia signal processing systems. This talk will introduce the challenges of multimedia perceptual quality assessment and some recent efforts to explain several subjective visual effects using the free energy model, the catastrophe model, and others. Some open directions in the area will also be discussed.
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China
Abstract: Recent years have witnessed a rapidly growing interest in applying machine learning (ML) to medical imaging. Magnetic resonance imaging (MRI), a powerful imaging modality for both scientific research and clinical diagnosis, has benefited greatly from its combination with ML in accelerating its imaging speed. Unlike natural image restoration problems, ML-based MR imaging involves special domain knowledge, such as complex-valued, Fourier-encoded, multi-contrast, and multi-coil data, which needs to be exploited and combined with data-driven approaches. This talk will introduce the compelling challenges for such physics-driven ML approaches to fast and smart MR imaging and our proposed possible solutions, ranging from learning theory and imaging electronics to different imaging application scenarios. Some open questions that are critical for future reliable and interpretable medical imaging systems will also be raised.
Beihang University
Abstract: We develop a stochastic alternating structure-adapted proximal (s-ASAP) gradient descent method for solving block optimization problems. By deploying state-of-the-art variance-reduced gradient estimators (rather than the full gradient) from stochastic optimization, the s-ASAP method is applicable to nonconvex consensus optimization problems whose objectives are the sum of a finite number of Lipschitz continuous functions. The sublinear convergence rate of the s-ASAP method is established using proximal point theory. Furthermore, a linear convergence rate is attainable under some mild conditions on the objectives, e.g., the error bound and the Kurdyka-Lojasiewicz (KL) property. Preliminary numerical simulations on some applications in image processing demonstrate the compelling performance of the proposed method.
University of Virginia
Abstract: In this talk, I will review recent work on the use of low-rank tensor models in multivariate probability, density estimation, supervised learning, and combinatorial optimization. We have recently shown that it is possible to learn high-order but low-rank multivariate distributions from low-order marginals, and that every multivariate categorical distribution can be generated by a (so-called) "naive" Bayes model. As it turns out, many real-life datasets can be fitted using distributions of very low rank. We have also proposed viewing sampling and supervised learning / system identification problems through the lens of low-rank tensor completion, which affords parsimonious modeling and sample-efficient learning with identification guarantees. Our most recent work explores the interplay between tensors and combinatorial optimization: it shows that every NP-complete problem can be cast as an instance of computing the minimum element of a tensor from its (two) rank-one factors. This exemplifies the modeling power of very low-rank tensors, and it also opens the door to a continuous multilinear problem relaxation whose empirical performance on the classic partition problem and other combinatorial optimization problems appears to be promising.
Shenzhen Research Institute of Big Data
Abstract: In this talk, I will introduce a new method for topology optimization based on the threshold dynamics method. Applications to linear elasticity, fluid network, and porous media problems will be discussed.
Massachusetts Institute of Technology
Abstract: Reinforcement learning (RL) has had tremendous successes in many artificial intelligence applications. Many of the forefront applications of RL involve multiple agents, e.g., playing chess and Go games, autonomous driving, and robotics. Unfortunately, the classical RL framework is inappropriate for multi-agent learning, as it assumes an agent’s environment is stationary and does not take into account the adaptive nature of opponent behavior. In this talk, I focus on stochastic games for multi-agent reinforcement learning in dynamic environments and develop independent learning dynamics for stochastic games: each agent is myopic and chooses best-response type actions to other agents’ strategies independently, that is, without any coordination with her opponents. There has been limited progress on developing convergent best-response type independent learning dynamics for stochastic games. I will present our recently proposed independent learning dynamics that guarantee convergence in stochastic games, in both zero-sum and identical-interest settings. Along the way, I will also reexamine some classical and recent results from the game theory and RL literature, to situate the conceptual contributions of our independent learning dynamics and the mathematical novelties of our analysis.
University of Electronic Science and Technology of China
Abstract: In recent years, data-driven deep learning methods have made great progress with the support of computing power, and have been widely used in natural language processing, image processing, computer vision, and medical image analysis. However, due to the complexity of deep neural networks and the highly non-convex optimization of weight coefficients, deep learning algorithms often lack robustness and generalization, and in particular lack interpretability. This limits the practical application of deep learning methods in medical image analysis. Compared with data-driven methods, knowledge-based mathematical modeling can effectively establish concise and transparent algorithms. This report focuses on medical image segmentation, and introduces a series of knowledge-based medical image segmentation methods and the associated mathematical models and algorithms for gray-scale inhomogeneity correction. These methods have a solid theoretical foundation, and the mathematical models and algorithms are concise and transparent, so they are readily interpretable. Experimental results also verify the effectiveness of these algorithms and their advantages in segmentation accuracy.
Tsinghua-Berkeley Shenzhen Institute
Abstract: Distributed learning is an important topic in information theory, and has recently become an active research area in machine learning. However, it is challenging to characterize the fundamental limits of distributed learning problems under communication constraints. Most current information-theoretic work has focused on applying random coding to obtain achievability results, whose optimality is hard to verify. Moreover, random coding schemes are computationally difficult to apply in real federated learning scenarios. In this talk, we investigate the distributed hypothesis testing problem in AWGN channels. To address the computational issue, we propose to focus on coding schemes based on the empirical distributions instead of the original data. Under this formulation, we further propose a coding strategy based on a mixture of decode-and-forward and amplify-and-forward, for which the achievable detection error exponent can be characterized and interpreted by information geometry. Moreover, we demonstrate the optimality of this achievable error exponent by a genie-aided approach. Finally, we characterize the amount of power necessary to achieve the optimal error exponent.
Northwestern Polytechnical University
Abstract: Voice communication and human-machine interaction systems are facing more and more challenging environments where there exists not only strong noise, but reverberation, echo, and competing sources as well. How to acquire and deliver high-fidelity acoustic signals in such complicated environments has become a challenging problem, which involves the use of microphone and loudspeaker arrays and many acoustic signal processing technologies. In this talk, I will present a brief overview of the basic principles of sensing and processing of speech signals as well as the state-of-the-art in the field. I will then focus on discussing important problems such as high gain beamforming with small microphone and loudspeaker arrays.
Weizmann Institute of Science
Abstract: Deep neural networks provide unprecedented performance gains in many real-world problems in signal and image processing. Despite these gains, the future development and practical deployment of deep networks are hindered by their black-box nature, i.e., a lack of interpretability and the need for very large training sets.
On the other hand, signal processing and communications have traditionally relied on classical statistical modeling techniques that utilize mathematical formulations representing the underlying physics, prior information, and additional domain knowledge. Simple classical models are useful but sensitive to inaccuracies and may lead to poor performance when real systems display complex or dynamic behavior. Here we introduce various approaches to model-based learning, which merge parametric models with optimization tools and classical algorithms, leading to efficient, interpretable networks trained from reasonably sized training sets. We will consider examples of such model-based deep networks applied to image deblurring, image separation, super resolution in ultrasound and microscopy, and efficient communication systems, and finally we will see how model-based methods can also be used for efficient diagnosis of COVID-19 using X-ray and ultrasound.
Sun Yat-sen University
Abstract: This talk will discuss various notions of hyperbolicity in complex geometry, with a focus on our recent work from the perspective of topology.
Carnegie Mellon University
Abstract: With its vast potential to tackle some of the world’s most pressing problems, reinforcement learning (RL) is applied to transportation, manufacturing, security, and healthcare. As RL has started to shift towards deployment at a large scale, its rapid development is coupled with as much risk as benefit. Before consumers embrace RL-empowered services, researchers are tasked with proving their trustworthiness. In this talk, I will give an overview of trustworthy reinforcement learning in three aspects: robustness, safety, and generalization. I will introduce taxonomies, definitions, methodologies, and popular benchmarks in each category. I will also share the lessons we learned and my outlook on future research directions.
Shenzhen Research Institute of Big Data
Abstract: Federated Learning (FL) is a promising privacy-preserving distributed learning paradigm but suffers from high communication cost when training large-scale machine learning models. Sign-based methods, such as SignSGD, have been proposed as a biased gradient compression technique for reducing the communication cost. However, sign-based algorithms can diverge under heterogeneous data, which has motivated the development of advanced techniques, such as the error-feedback method and stochastic sign-based compression, to fix this issue. Nevertheless, these methods still suffer from slower convergence rates. Besides, none of them allows multiple local SGD updates like FedAvg. In this work, we propose a novel noisy perturbation scheme with a general symmetric noise distribution for sign-based compression, which not only allows one to flexibly control the tradeoff between gradient bias and convergence performance but also provides a unified viewpoint on existing stochastic sign-based methods. More importantly, we propose the very first sign-based FedAvg algorithm (z-SignFedAvg). Theoretically, we show that z-SignFedAvg achieves a faster convergence rate than existing sign-based methods and, under uniformly distributed noise, can enjoy the same convergence rate as its uncompressed counterpart. Extensive experiments are conducted to demonstrate that z-SignFedAvg can achieve competitive empirical performance on real datasets.
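The idea of perturbing a gradient with symmetric noise before taking its sign can be illustrated with a small numerical sketch. This is not the z-SignFedAvg algorithm from the abstract; it only checks the standard fact that, for uniform noise on [-sigma, sigma] and coordinates with |g| <= sigma, the rescaled sign `sigma * sign(g + z)` equals `g` in expectation. The names `noisy_sign` and `averaged_estimate`, the vector `g`, and the values of `sigma` and `n_samples` are assumptions made for illustration.

```python
import random


def noisy_sign(g, sigma, rng):
    """Return the sign (+1 or -1) of each coordinate after adding
    uniform noise on [-sigma, sigma]. Hypothetical helper."""
    return [1.0 if gi + rng.uniform(-sigma, sigma) >= 0 else -1.0 for gi in g]


def averaged_estimate(g, sigma, n_samples=20000, seed=0):
    """Average many rescaled noisy signs to estimate g empirically."""
    rng = random.Random(seed)
    acc = [0.0] * len(g)
    for _ in range(n_samples):
        signs = noisy_sign(g, sigma, rng)
        for j, s in enumerate(signs):
            acc[j] += s
    # Rescale by sigma: for |g_j| <= sigma, E[sigma * sign(g_j + z)] = g_j.
    return [sigma * a / n_samples for a in acc]


g = [0.5, -0.2, 0.0]            # illustrative "gradient" with |g_j| <= sigma
est = averaged_estimate(g, sigma=1.0)
print(est)
```

Each transmitted coordinate is a single sign, yet the average over many draws recovers `g`. A larger `sigma` widens the range of gradients for which this holds at the cost of higher variance per draw, which is one way to read the bias and convergence tradeoff mentioned in the abstract.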