April 29, 2022

Time | Speaker | Title | Chair |
---|---|---|---|
8:40--9:00 | Zhi-Quan Luo | Welcome Speech | Xiao-Ping Wang |
9:00--9:35 | Xue-Cheng Tai | Simplified energy landscape for modularity using total variation for data analysis | |
9:35--10:10 | Dinggang Shen | Applications of Artificial Intelligence in PET Reconstruction, Analysis, and Clinical Practice | |
10:10--10:30 | Break | | |
10:30--11:05 | Michael Kwok-Po Ng | Hyperspectral Image Processing | Xiang Wan |
11:05--11:40 | S. Kevin Zhou | Traits and Trends of AI in Medical Imaging | |
11:40--13:30 | Lunch Break | | |
13:30--14:05 | Bin Dong | Data- and Task-Driven CT Imaging by Deep Learning | Zhonghua Qiao |
14:05--14:40 | Jian Sun | Deep Learning in Non-Euclidean Space | |
14:40--15:15 | Hui Ji | Self-supervised deep learning for Inverse Problems in Imaging | |
15:15--15:45 | Break | ||
15:45--16:20 | Chenglong Bao | Unsupervised Deep Learning Meets Chan-Vese Mode | Ming Yan |
16:20-16:55 | Dong Wang | The iterative convolution-thresholding method for image segmentation | |
16:55-17:30 | Hao Liu | Elastica Models for Color Image Regularization |
Hong Kong Baptist University
Abstract: Networks capture pairwise interactions between entities and are frequently used in applications such as social networks, food networks, and protein interaction networks, to name a few. Communities, cohesive groups of nodes, often form in these applications, and identifying them gives insight into the overall organization of the network. One common quality function used to identify community structure is modularity. In Hu et al. [SIAM J. Appl. Math., 73 (2013), pp. 2224-2246], it was shown that modularity optimization is equivalent to minimizing a particular nonconvex total variation (TV) based functional over a discrete domain. They solve this problem, assuming the number of communities is known, using a Merriman-Bence-Osher (MBO) scheme. We show that modularity optimization is equivalent to minimizing a convex TV-based functional over a discrete domain, again assuming the number of communities is known. Furthermore, we show that modularity has no convex relaxation satisfying certain natural conditions. We therefore find a manageable nonconvex approximation using a Ginzburg-Landau functional, which provably converges to the correct energy in the limit of a certain parameter. We then derive an MBO algorithm that has fewer hand-tuned parameters than that of Hu et al. and is seven times faster at solving the associated diffusion equation, because the underlying discretization is unconditionally stable. Our numerical tests include a hyperspectral video whose associated graph has 2.9 × 10^7 edges, roughly 37 times larger than what was handled in the paper of Hu et al.
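For reference, the modularity being optimized is the standard Newman-Girvan quality function (with resolution parameter $\gamma$), to which the TV reformulations above are equivalent:

```latex
Q = \frac{1}{2m} \sum_{i,j} \left( A_{ij} - \gamma \, \frac{k_i k_j}{2m} \right) \delta(g_i, g_j)
```

where $A$ is the adjacency matrix, $k_i$ the degree of node $i$, $m$ the number of edges, and $g_i$ the community assignment of node $i$.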
ShanghaiTech University
Shanghai United Imaging Intelligence Co., Ltd.
Abstract: This talk presents work on artificial intelligence in PET reconstruction, analysis, and clinical applications, including fast low-dose PET image reconstruction, diagnosis of Alzheimer's disease based on multi-modal/molecular PET, and the application of PET-CT to automatic cancer detection.
The University of Hong Kong
Abstract: In this talk, I share some of my recent results in hyperspectral image processing based on model-based optimization methods, e.g., denoising, demosaicing, and destriping. Learning-based models for hyperspectral image denoising are also studied and reported.
Suzhou Institute for Advanced Research
University of Science and Technology of China, Suzhou China
Abstract: Artificial intelligence and deep learning technologies have gained prevalence in solving medical imaging tasks. In this talk, we first review the traits that characterize medical images, such as multi-modalities, heterogeneous and isolated data, sparse and noisy labels, and imbalanced samples. We then point out the necessity of a paradigm shift from "small task, big data" to "big task, small data". Finally, we illustrate the trends of AI technologies in medical imaging and present a multitude of algorithms that attempt to address various aspects of "big task, small data":
Annotation-efficient methods that tackle medical image analysis without many labelled instances, including one-shot or label-free inference approaches.
Universal models that learn “common + specific” feature representations for multi-domain tasks to unleash the potential of ‘bigger data’, which are formed by integrating multiple datasets associated with tasks of interest into one use.
“Deep learning + knowledge modeling” approaches, which combine machine learning with domain knowledge to enable state-of-the-art performance on many tasks of medical image reconstruction, recognition, segmentation, and parsing.
Peking University
Abstract: In this talk, I will start with a brief review of the dynamics and optimal control perspective on deep learning. Then, I will present some of our recent studies on how this perspective may help us to advance CT imaging and image-based diagnosis further. Specifically, I will focus on how to combine the wisdom of mathematical modeling with ideas from deep learning. Such a combination leads to new data-driven/task-driven image reconstruction models and new data-driven scanning strategies for CT imaging, with the potential to be generalized to other imaging modalities. I will also briefly discuss how machine learning may advance computational imaging and some of the challenges we are facing.
Xi’an Jiaotong University
Abstract: Traditional deep networks are commonly defined in Euclidean space, either the 3D/2D image space or the space of sequential data. In realistic scenarios, however, the data may be irregular or distributed on a manifold or graph. In such cases, traditional deep networks do not fully take advantage of the underlying data structure in non-Euclidean space. Along this research direction, in this talk I will introduce the research background and advances in geometric deep learning in non-Euclidean spaces, with applications to 3D object recognition, image segmentation, and domain adaptation.
National University of Singapore
Abstract: In the last few years, deep learning has become a prominent tool for solving many challenging problems in imaging science. Most existing methods are supervised on a dataset with ground-truth images, learning how to predict the truth image from the collected measurement. Such a prerequisite of access to many ground-truth images limits the wider applicability of deep learning in many domains, e.g., medicine and science. Recently, there has been increasing interest in powerful deep learning methods that solve inverse imaging problems without requiring access to truth images. In this talk, we will introduce a general self-supervised deep learning framework for solving inverse imaging problems, whose main ingredients are the neuralization of Bayesian inference for inverse problems and data augmentation techniques for handling noisy labels. Extensive experiments show that the proposed self-supervised deep learning method competes well against existing supervised-learning-based solutions to many tasks in imaging science, e.g., image denoising, compressed sensing, and phase retrieval.
Tsinghua University
Abstract: The Chan-Vese (CV) model is a classical method in image segmentation that imposes a constant assumption on each region of interest. However, this assumption does not hold in complex scenes, leading to inferior results. In this talk, we propose an unsupervised image segmentation approach that integrates the CV model with deep neural networks to improve the original CV model's segmentation accuracy. Based on the variational inference framework, we will discuss two typical settings, single image segmentation and dataset segmentation. Finally, experiments validate the effectiveness of our model.
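To make the starting point concrete, here is a minimal NumPy sketch of the piecewise-constant fitting step at the heart of the two-phase CV model (the length regularization and the deep-network components of the talk are omitted; the toy image is illustrative):

```python
import numpy as np

def cv_fitting_step(f, mask):
    """One alternating update of the two-phase piecewise-constant
    Chan-Vese fitting term: recompute the region means, then reassign
    each pixel to the closer mean (length regularization omitted)."""
    c1 = f[mask].mean() if mask.any() else 0.0        # mean inside
    c2 = f[~mask].mean() if (~mask).any() else 0.0    # mean outside
    return (f - c1) ** 2 < (f - c2) ** 2              # updated mask

# Toy usage: a bright square on a dark background, random initialization.
f = np.zeros((32, 32)); f[8:24, 8:24] = 1.0
mask = np.random.default_rng(0).random(f.shape) > 0.5
for _ in range(5):
    mask = cv_fitting_step(f, mask)
```

On this toy image the alternation settles in a few steps onto the two constant regions; the talk's contribution is replacing this hand-crafted fitting with learned network components.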
The Chinese University of Hong Kong, Shenzhen
Abstract: In this talk, we will introduce an efficient iterative method for image segmentation that can be applied to general models, including both region-based and edge-based variational models. In the method, the segments are represented by their indicator functions, and the objective functionals are approximated by heat kernel convolutions of the indicator functions in a concave form. The approximate problem can then be solved efficiently by sequential linear programming: in each iteration, one only needs to evaluate some convolutions followed by thresholding. Numerical experiments show that the method achieves speedups of hundreds of times over level-set-based methods for a wide range of image segmentation models. Theoretical guarantees on convergence will also be discussed if time permits.
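As a rough illustration (not the speaker's implementation), one convolution-thresholding iteration for a simple two-phase region-based model can be sketched in NumPy; the step size `tau`, weight `lam`, and the particular linearization below are my own illustrative choices:

```python
import numpy as np

def heat_kernel_convolve(u, tau):
    """Convolve u with the heat kernel exp(tau * Laplacian) via FFT
    (periodic boundary): multiply each Fourier mode by exp(-tau*|xi|^2)."""
    k0 = 2 * np.pi * np.fft.fftfreq(u.shape[0])
    k1 = 2 * np.pi * np.fft.fftfreq(u.shape[1])
    ksq = k0[:, None] ** 2 + k1[None, :] ** 2
    return np.real(np.fft.ifft2(np.fft.fft2(u) * np.exp(-tau * ksq)))

def ictm_step(f, u, tau, lam):
    """One convolution-thresholding iteration for a two-phase model:
    fidelity (f - c_i)^2 per region plus a heat-kernel approximation
    of the interface perimeter, linearized at the current indicator u."""
    c1 = (f * u).sum() / max(u.sum(), 1e-12)              # mean inside
    c2 = (f * (1 - u)).sum() / max((1 - u).sum(), 1e-12)  # mean outside
    Gu = heat_kernel_convolve(u, tau)
    # pointwise decision: fidelity difference + perimeter-term derivative
    phi = (f - c1) ** 2 - (f - c2) ** 2 + lam * np.sqrt(np.pi / tau) * (1 - 2 * Gu)
    return (phi < 0).astype(float)  # threshold: set u = 1 where phi < 0

# Toy usage: a bright square recovered from a rough initial mask.
f = np.zeros((32, 32)); f[8:24, 8:24] = 1.0
u = np.zeros_like(f); u[4:28, 4:20] = 1.0
for _ in range(10):
    u = ictm_step(f, u, tau=0.5, lam=0.1)
```

Each iteration costs only two FFTs and a pointwise threshold, which is the source of the speedups reported in the abstract.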
Hong Kong Baptist University
Abstract: Models related to the Euler’s elastica energy have proven to be useful for many applications, including image processing. Extending elastica models to color images and multichannel data is a challenging task, as stable and consistent numerical solvers for these geometric models often involve high order derivatives. Like the single-channel Euler’s elastica model and the total variation (TV) models, geometric measures that involve high order derivatives can help when considering image formation models that minimize elastic properties. In the past, the Polyakov action from high energy physics has been successfully applied to color image processing. Here, we introduce an addition to the Polyakov action for color images that minimizes the color manifold curvature. The color image curvature is computed by applying the Laplace–Beltrami operator to the color image channels. When reduced to gray-scale images, with appropriate scaling between space and color, the proposed model minimizes the Euler’s elastica operating on the image level sets. We also present an operator-splitting method to minimize the proposed functional. The efficiency and robustness of the proposed method are demonstrated by systematic numerical experiments.
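For background, the gray-scale Euler's elastica energy that the model reduces to is commonly written as

```latex
E(u) = \int_{\Omega} \left( a + b\,\kappa^2 \right) |\nabla u| \, dx,
\qquad
\kappa = \nabla \cdot \frac{\nabla u}{|\nabla u|},
```

with weights $a, b > 0$ balancing the length and the squared curvature of the image level sets; the curvature term is what brings in the high order derivatives mentioned above.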
City University of Hong Kong
Abstract: Hyperspectral images often have hundreds of spectral bands of different wavelengths captured by aircraft or satellites. Identifying detailed classes of pixels becomes feasible due to the enhancement in spectral and spatial resolution of hyperspectral images. In this work, we propose a novel framework that utilizes both spatial and spectral information for classifying pixels in hyperspectral images. The method consists of three stages. In the first, pre-processing stage, the Nested Sliding Window algorithm is used to reconstruct the original data by enhancing the consistency of neighboring pixels, and then Principal Component Analysis is used to reduce the dimension of the data. In the second stage, Support Vector Machines are trained to estimate the pixel-wise probability map of each class using the spectral information from the images. Finally, a smoothed total variation model is applied to smooth the class probability vectors by ensuring spatial connectivity in the images. We demonstrate the superiority of our method against three state-of-the-art algorithms on six benchmark hyperspectral data sets with 10 to 50 training labels for each class. The results show that our method gives the overall best performance in accuracy. In particular, our gain in accuracy increases as the number of labeled pixels decreases. Our method is therefore of great practical significance, since expert annotations are often expensive and difficult to collect.
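The dimension-reduction step of the first stage can be sketched with a plain-NumPy PCA (a generic sketch, not the authors' code; `k` is the number of retained components):

```python
import numpy as np

def pca_reduce(X, k):
    """Project pixel spectra (rows of X) onto the top-k principal
    components via SVD of the centered data matrix."""
    Xc = X - X.mean(axis=0)                      # center each band
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                         # scores in k dimensions

# Toy usage: reduce 20-band "spectra" of 100 pixels to 5 components.
X = np.random.default_rng(0).normal(size=(100, 20))
Y = pca_reduce(X, 5)
```

The reduced scores would then feed the SVM stage; the per-class probability maps it outputs are what the final total variation model smooths.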
The University of Hong Kong
Abstract: Unsupervised domain adaptation (UDA) in semantic segmentation is a fundamental yet promising task that relieves the need for laborious annotation. In the first part, I introduce a novel UDA pipeline that unifies image-level alignment and category-level feature distribution regularization in a coarse-to-fine manner. Experimental results show that our proposed pipeline improves the generalization capability of the final segmentation model and significantly outperforms previous state-of-the-art methods. Given their ability to exploit long-term dependencies, transformers are promising for helping typical convolutional neural networks overcome their inherent shortcomings of spatial inductive bias. However, most recently proposed transformer-based segmentation approaches simply treat transformers as assisting modules that help encode global context into convolutional representations. To address this issue, in the second part, I introduce nnFormer (i.e., not-another transFormer), a 3D transformer for volumetric medical image segmentation. nnFormer not only exploits the combination of interleaved convolution and self-attention operations, but also introduces local and global volume-based self-attention mechanisms to learn volume representations. Moreover, nnFormer proposes to use skip attention to replace the traditional concatenation/summation operations in the skip connections of U-Net-like architectures. Experiments show that nnFormer outperforms previous transformer-based counterparts by large margins on three public datasets. Compared to nnUNet, nnFormer produces significantly lower HD95 and comparable DSC results.
University of Electronic Science and Technology of China
Abstract: Medical image segmentation is widely and importantly used in both medical-imaging-based research and clinical applications; in many imaging applications it is the most critical and also the most difficult step. Image-quality problems, including noise, blurred object boundaries, intensity inhomogeneity, and complex image content, all make segmentation difficult. Different imaging modalities and different anatomical regions usually call for different segmentation algorithms. Even for the same image, different users may want to segment different targets and may therefore use different algorithms, so a unified segmentation algorithm is hard to obtain. Mathematical models and algorithms based on variational principles have been widely applied to medical image segmentation, but these methods generally require minimizing a nonconvex energy functional, and the solution the user wants is often not the global minimum but some local minimum. To overcome this difficulty, relevant domain knowledge (such as imaging knowledge and anatomical knowledge) can be incorporated into the mathematical model for image segmentation, so as to improve segmentation accuracy and robustness. Taking ventricle segmentation in cardiac MRI and brain-tissue segmentation in brain MRI as examples, this talk introduces the speaker's mathematical models for medical image segmentation based on imaging and anatomical knowledge, developed over many years of research, together with recent results.
Nanjing University
Abstract: Deformable image registration is a widely used technique in the fields of computer vision and medical image processing. Basically, the task of deformable image registration is to find the displacement field between the moving image and the fixed image. Many variational models have been proposed for deformable image registration under the assumption that the displacement field is continuous and smooth. However, displacement fields may be discontinuous, especially for medical images with intensity inhomogeneity, pathological tearing of slices, sliding motions, heavy noise, and overlapped tissues. In the mathematical theory of elastoplasticity, when the displacement fields are possibly discontinuous, a suitable framework for describing them is the space of functions of bounded deformation (BD). Inspired by this, we propose some novel deformable registration models, called the BD and BGD (bounded generalized deformation) models, which allow discontinuities of displacement fields in images. The BD and BGD models are formulated in variational frameworks by supposing the displacement fields to be functions of BD or BGD. The existence of solutions of these models is proven, and efficient and rapid algorithms are proposed. Numerical experiments on 2D images show that the BD and BGD models outperform the classical demons model, the log-domain diffeomorphic demons model, and the state-of-the-art vectorial total variation model. Numerical experiments on two public 3D databases show that the target registration error of the BD model is competitive with more than ten other models. This is joint work with Ziwei Nie, Chen Li and Hairong Liu.
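For reference, the space of functions of bounded deformation used here is standardly defined as

```latex
\mathrm{BD}(\Omega) = \left\{ u \in L^1(\Omega; \mathbb{R}^n) \;:\; Eu := \tfrac{1}{2}\left( Du + Du^{\top} \right) \text{ is a bounded Radon measure} \right\},
```

so only the symmetrized gradient $Eu$, not the full gradient, needs to be a measure; this is what admits discontinuous displacement fields while still controlling the deformation.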
Chinese Academy of Sciences
Abstract: Medical big data mainly includes electronic health record data, medical image data, gene information data, etc. Among them, medical image data accounts for the vast majority of medical data at this stage. How to apply medical big data to clinical practice is a problem of great concern to medical and computer researchers, and brain-inspired computing and deep learning provide a good answer. Combining the latest research progress in medical image big data analysis with the work of our research group in this field, this talk introduces the application of brain-inspired computing and deep learning to medical big data analysis and early disease diagnosis.
The Chinese University of Hong Kong, Shenzhen
Abstract: 3D human digitalization is receiving much more attention as the Metaverse gains traction. One challenging yet important problem is to recover 3D geometry from consumer-level sensors such as RGB cameras. Although the reconstruction of the human face and body has been widely studied, the 3D digitization of clothing has received little attention, mainly because 1) 3D garment data are scarce, and 2) clothes have very high geometric complexity. In this talk, I will introduce our recent research progress on 3D garment reconstruction from single-view images. The related works have been published at ECCV 2020 (Oral) and CVPR 2022.
Southern University of Science and Technology
Abstract: Face alignment is a prerequisite in many computer vision tasks, such as face recognition, facial expression recognition, face verification, face reconstruction and face reenactment. Most existing face alignment methods can only deal with the specific annotation scheme adopted by the training dataset of interest, but cannot flexibly accommodate multiple annotation schemes. To address this problem, we propose an innovative, flexible and consistent cross-annotation face alignment framework, LDDMM-Face, the key contribution of which is a deformation layer that naturally embeds facial geometry in a diffeomorphic way.
Michigan State University
Abstract: The low-rank and sparse matrix decomposition from noisy observations is known as robust principal component analysis (RPCA), and it has applications in signal, image, and video processing. Most existing algorithms can be classified into two categories: SVD-type methods and matrix decomposition methods. SVD-type methods are slow for big matrices because of the cost of the SVD, while matrix decomposition methods require an accurate estimate of the true rank. In this talk, I will present a new way to combine both methods using an upper bound on the rank. It is faster than SVD-type methods because the SVD is applied to small matrices, and it is less sensitive to the rank estimate than matrix decomposition methods. I will use two numerical examples to demonstrate the benefits of this approach.
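In the spirit of the combination described above, here is an illustrative NumPy sketch under my own simplifications (not the speaker's algorithm): alternate a rank-truncated SVD for the low-rank part with soft-thresholding for the sparse part, where `r` plays the role of the upper bound on the rank:

```python
import numpy as np

def rpca_sketch(M, r, lam, iters=50):
    """Alternating low-rank / sparse splitting: L is the best rank-<=r
    approximation of M - S; S soft-thresholds the residual M - L."""
    S = np.zeros_like(M)
    for _ in range(iters):
        U, sv, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = (U[:, :r] * sv[:r]) @ Vt[:r]                   # truncate at rank r
        R = M - L
        S = np.sign(R) * np.maximum(np.abs(R) - lam, 0.0)  # soft threshold
    return L, S
```

Because `r` is only an upper bound, the truncation does not need the exact rank, while the SVD of the deflated residual stays cheap relative to full-accuracy SVD-type solvers.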
Sun Yat-sen University
Abstract: Deep-learning-based models rely on large amounts of ideally labeled data for high recognition performance, which requires huge labeling costs. Semi-supervised learning methods aim to make effective use of massive unlabeled data together with a small amount of labeled data to upgrade existing models and improve their performance and generalization. Existing semi-supervised methods usually assume that the labeled and unlabeled training data come from the same large dataset, i.e., they conform to the closed-scene assumption. However, in actual open scenarios, the appearance distributions of unlabeled and labeled data may differ, unlabeled data may contain unknown categories, and the distribution of samples across categories may be imbalanced. In this talk, I will first briefly introduce the inherent unity of methods designed for domain adaptation, open-set recognition, noisy labeling, and semi-supervised learning, and then present our recent efforts on a feature-norm-based domain adaptation method, self-supervised open-set semi-supervised learning, and contrastive-learning-based noisy-label discrimination and correction. Finally, I will discuss the key issues still to be solved in open-scene semi-supervised learning and possible research directions.
Sichuan University
Abstract: In recent years, machine learning has achieved fruitful results in computer vision and image processing, but in medical imaging the main applied research is still concentrated on image analysis, and research on the imaging side has only just begun. Meanwhile, as these techniques are deployed, they face a series of problems such as generalization and interpretability. This talk introduces our group's line of work and research progress on these problems, mainly including low-dose CT reconstruction and fast MRI reconstruction, and concludes with an outlook on the future of machine-learning-based intelligent imaging methods.
Shenzhen Research Institute of Big Data
Abstract: Cued Speech (CS) is a communication system for the deaf or hearing impaired in which the speaker assists lipreading with handshapes (encoding consonants) and hand positions (encoding vowels) to provide complementary representations of lip visemes at the phoneme level. Compared with lipreading and sign language, it ensures a more complete visual expression of information in a simpler and clearer way, which is suitable for human-computer interaction. The extraction of lip, handshape, and hand position features is a key step in automatic CS recognition, and the extraction of handshape features is one of the difficulties in this field. In addition, the lack of well-annotated CS data limits recognition performance. First, we propose a method based on self-supervised contrastive learning to learn feature representations of handshapes and reduce the model's dependence on data annotation. Then, combining Bi-LSTM and a self-attention network, the model further learns continuous handshape features with temporal and contextual information. Lastly, we propose a novel automatic CS recognition method based on cross-modal knowledge distillation, which transfers the audio speech information extracted by the teacher model to the CS student model. A multi-task joint loss function based on multi-task homoscedastic uncertainty theory is investigated to automatically learn the inter-task correlation coefficients. Furthermore, we build the first British English CS dataset, which includes a total of 390 sentences from five speakers. Experimental results show that our model outperforms the state-of-the-art results of previous work.
The Chinese University of Hong Kong
Abstract: Blind image deblurring is a challenging task in imaging science, in which we need to estimate the latent image and the blur kernel simultaneously. To obtain a stable and reasonable deblurred image, proper prior knowledge of the latent image and the blur kernel is required. In this talk, I will present several of our recent attempts related to image deblurring. Different from recent works based on statistical observations of the difference between the blurred image and the clean one, we first report a surface-aware strategy arising from intrinsic geometrical considerations. This approach facilitates blur kernel estimation thanks to the sharp edges preserved in the intermediate latent image. Extensive experiments demonstrate that our method outperforms state-of-the-art methods on deblurring text and natural images. Moreover, we discuss a quaternion-based method for color image restoration, and then extend the quaternion approach to blind image deblurring.
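As background, a generic variational formulation of blind deblurring (illustrative only; not necessarily the speaker's exact model) jointly estimates the latent image $u$ and the kernel $k$ from the blurred image $f$:

```latex
\min_{u,\,k} \ \tfrac{1}{2} \| k * u - f \|_2^2 + \lambda\, R(u) + \gamma \, \|k\|_2^2
\quad \text{s.t.} \quad k \ge 0, \ \textstyle\sum k = 1,
```

where $R$ is an image prior; the surface-aware and quaternion-based constructions discussed in the talk can be read as particular choices of such priors.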
The Hong Kong Polytechnic University
Abstract: In this work, we propose an active contour model with a local variance force (LVF) term that can be applied to multi-phase image segmentation problems. With the LVF, the proposed model is very effective for segmenting noisy images. To solve this model efficiently, we represent the regularization term by characteristic functions and then design a minimization algorithm based on a modification of the iterative convolution-thresholding method (ICTM), namely ICTM-LVF. This minimization algorithm enjoys an energy-decaying property under some conditions and performs highly efficiently in segmentation. To overcome the initialization issue of active contour models, we generalize the inhomogeneous graph Laplacian initialization method (IGLIM) to the multi-phase case and apply it to provide the initial contour for the ICTM-LVF solver. Numerical experiments on synthetic and real images demonstrate the capability of our initialization method and the effectiveness of the local variance force for noise robustness in multi-phase image segmentation.
Shanghai Jiaotong University
Abstract: Image reconstruction from down-sampled and corrupted measurements, such as fast MRI and low-dose CT, is a mathematically ill-posed inverse problem. Deep neural networks (DNNs) have become a prominent tool in the recent development of medical image reconstruction methods. In this talk, I will introduce two works that incorporate classical image reconstruction methods with deep learning. The common idea, addressing the intractable inversion in general inverse problems, is to train networks that refine intermediate images from a classical reconstruction procedure toward the ground truth: intermediate images that satisfy data consistency are fed into chosen denoising or generative networks that remove noise and artifacts at each iterative stage. In the first work, we propose a multi-scale DNN for sparse-view CT reconstruction, which directly learns an interpolation scheme to predict the complete set of 2D Fourier coefficients in Cartesian coordinates from the given measurements in polar coordinates. In the second work, we propose an unsupervised deep learning method for low-dose CT image reconstruction that does not require any external training data; it is built on a re-parameterization technique for Bayesian inference via a deep network with random weights, combined with an additional total variation (TV) regularization. Experiments on both sparse-view CT and low-dose CT problems show that the proposed methods provide state-of-the-art performance.
Tsinghua University
Abstract: Surface reconstruction from 3D point clouds is an important problem in computer graphics with many applications. Usually the surface is reconstructed as an iso-surface of a signed distance function. In this talk, I will present several methods to compute the signed distance function from the famous Gauss formula. The key observation is that the indicator function is given explicitly by the Gauss formula. Based on this explicit integral formula, we develop three methods for surface reconstruction.
(1) For point clouds with normals, the signed distance function is obtained directly from a modified Gauss formula.
(2) For point clouds without normals, the signed distance function is obtained by solving the integral equation given by the Gauss formula.
(3) Guided by the Gauss formula, we construct a neural network to learn the signed distance function.
The performance of these Gauss-formula-based methods is compared with state-of-the-art methods in extensive experiments.
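The "indicator from the Gauss formula" idea can be illustrated in 2D with a toy discretization (my own sketch; the talk develops the 3D case): the indicator of the enclosed region is a boundary integral of the normal derivative of the free-space Green's function, evaluated here as a weighted sum over oriented boundary samples.

```python
import numpy as np

def indicator_gauss_2d(x, points, normals, weights):
    """Discrete 2D Gauss formula: chi(x) ~ (1/2pi) * sum_i
    w_i * (p_i - x).n_i / |p_i - x|^2, where n_i is the outward
    normal and w_i the arc-length weight at boundary sample p_i.
    Returns ~1 for x inside the curve and ~0 for x outside."""
    d = points - x
    r2 = (d ** 2).sum(axis=1)
    return (weights * (d * normals).sum(axis=1) / r2).sum() / (2 * np.pi)

# Toy usage: 400 samples of the unit circle with outward normals.
theta = np.linspace(0, 2 * np.pi, 400, endpoint=False)
pts = np.stack([np.cos(theta), np.sin(theta)], axis=1)
nrm = pts.copy()                          # outward normal of the unit circle
w = np.full(400, 2 * np.pi / 400)         # arc-length weights
```

Evaluating this at the circle's center gives a value near 1 and far outside near 0; thresholding or smoothing such an indicator is what yields the reconstructed surface in the talk's 3D setting.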
Tsinghua University
Abstract: In this talk, we focus on image restoration, segmentation, and super-resolution, and propose some new models for these tasks. We then employ the tailored finite point method (TFPM) to solve the associated equations derived from the variational principle. Numerical experiments are presented to demonstrate the effectiveness of the proposed models and their features.
The Chinese University of Hong Kong
Abstract: Computational Quasiconformal (CQC) Geometry studies the deformation patterns between shapes. It has found important applications in imaging science, such as image registration, image analysis, and image segmentation. With the advance of deep learning techniques, incorporating CQC theories into deep neural networks can further improve both the efficiency and accuracy of these imaging tasks. In this talk, I will give an overview of how CQC and deep learning can play an important role in image processing.
Nankai University
Abstract: In this talk, I will present a very recent work: a sparse feature segmentation network for tasks where the target objects are sparsely distributed and the background is hard to model mathematically. We start from an image decomposition model with sparsity regularization and propose a deep unfolding network, namely IDNet, based on an iterative solver, the scaled alternating direction method of multipliers (scaled-ADMM). The IDNet splits raw inputs into two feature layers. A new task-oriented segmentation network, dubbed IDmUNet, is then constructed from the proposed IDNets and a mini-UNet. IDmUNet combines the advantages of mathematical modeling and data-driven approaches. First, our approach has mathematical interpretability and achieves favorable performance with far fewer learnable parameters. Second, IDmUNet is robust under simple end-to-end training, with explainable behavior. In experiments on retinal vessel segmentation (RVS), IDmUNet produces state-of-the-art results with only 0.07M parameters, whereas SA-UNet, one of the latest variants of UNet, contains 0.54M and the original UNet 31.04M. Moreover, the training procedure of our network converges faster, without overfitting. The paper is now posted on arXiv.org.
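A generic sparsity-regularized image decomposition of the kind such unfolding typically starts from (an illustrative instance; the talk's exact model may differ) is

```latex
\min_{u,\,v} \ \tfrac{1}{2} \| f - u - v \|_2^2 + \lambda_1 \, \mathrm{TV}(u) + \lambda_2 \, \| v \|_1,
```

with $f$ the raw input, $u$ a smooth background layer, and $v$ the sparse feature layer; each scaled-ADMM iteration for such a splitting then becomes one stage of the unfolded network.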
The Hong Kong University of Science and Technology
Abstract: Artificial intelligence, especially deep learning with large-scale annotated datasets, has dramatically advanced recognition performance in many domains, including speech recognition, visual recognition, and natural language processing. Despite these breakthroughs, its application to medical image analysis remains to be further explored, since large-scale, high-quality annotated datasets are not easily accessible there. In this talk, I will share our recent progress on developing label-efficient deep learning methods that leverage an abundance of weakly-labeled and/or unlabeled data for medical image analysis, with versatile applications to disease diagnosis, lesion detection, and segmentation.
The Chinese University of Hong Kong, Shenzhen
Abstract: Recently, AlphaFold2 achieved great success in protein structure analysis by taking advantage of appealing 3D vision technologies. Meanwhile, the rapid growth and development of medical big data necessitate applying newly appealing deep learning methods for accurate and accelerated medical data analysis. Since current medical big data are usually multimodal, noisy, and limited to a small number of available high-quality training samples, conventional machine learning algorithms tend to be less robust and effective. Under these circumstances, my research mainly focuses on how to exploit deep learning methods to assist medical big data analysis. In this talk, I will present my recent works on 3D vision analysis and its applications to medical big data. At the end of my talk, I will also introduce some other ongoing AI interdisciplinary topics in my research, as well as future plans.
Shenzhen Research Institute of Big Data
Abstract: Recently, deep neural networks (DNNs) have achieved great success in computer vision and image processing. However, neural network models are threatened by adversarial attacks, which synthesize imperceptible perturbations to mislead DNN models. In this talk, I will present a series of robust neural network models for image understanding and processing tasks; these DNN models are robust against adversarial attacks. In addition, I will introduce an application of adversarial training to semi-supervised medical image detection.