Regularized Deep Networks

Strengthening AI With Domain Knowledge

Research Highlights of the Information Processing & Algorithms Laboratory

The Information Processing and Algorithms Laboratory (iPAL) is directed by Prof. Vishal Monga. Graduate research in iPAL broadly encompasses signal and image processing theory and applications with a particular focus on capturing practical real-world constraints via convex optimization theory and algorithms.

RECENT PROJECTS

CVPRW'19 WINNER! Dense Scene Information Estimation Network For Dehazing

This project proposes a scene information estimation network for dehazing on the challenging NTIRE'19 and NTIRE'18 benchmark image datasets. The proposed networks, At-DH and AtJ-DH, can outperform state-of-the-art alternatives, especially when recovering images corrupted by dense haze.

CVPRW'19 RUNNER-UP! Dense '123' Color Enhancement Dehazing Network

A DenseNet-based dehazing network that focuses on recovering color information. It comprises a common DenseNet-based feature encoder whose output branches into three distinct DenseNet-based decoders that yield estimates of the R, G and B color channels of the image.
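
The branching layout described above can be illustrated with a toy NumPy sketch. This is not the published network: the 1x1 convolutions, channel counts, random untrained weights, and the helper names (`conv1x1`, `dense_block`, `make_block_weights`, `dehaze`) are all assumptions made for illustration. It only shows the topology, one shared densely connected encoder feeding three separate decoder branches, one per color channel.

```python
import numpy as np

def conv1x1(weight, feats):
    """1x1 convolution: weight is (C_out, C_in), feats is (C_in, H, W)."""
    return np.einsum("oc,chw->ohw", weight, feats)

def dense_block(x, weights):
    """DenseNet-style block: each layer sees all earlier feature maps, concatenated."""
    feats = [x]
    for w in weights:
        stacked = np.concatenate(feats, axis=0)
        feats.append(np.maximum(0.0, conv1x1(w, stacked)))  # 1x1 conv + ReLU
    return np.concatenate(feats, axis=0)

def make_block_weights(rng, in_ch, growth=4, layers=2):
    """Random, untrained weights for one dense block (hypothetical sizes)."""
    return [0.1 * rng.standard_normal((growth, in_ch + i * growth))
            for i in range(layers)]

def dehaze(hazy, enc_w, decoders):
    """Shared encoder, then one decoder branch per color channel (R, G, B)."""
    shared = dense_block(hazy, enc_w)
    channels = []
    for dec_w, head in decoders:
        branch = dense_block(shared, dec_w)
        channels.append(conv1x1(head, branch)[0])  # project branch to one channel
    return np.stack(channels)

rng = np.random.default_rng(0)
hazy = rng.random((3, 16, 16))                  # stand-in for a hazy RGB image
enc_w = make_block_weights(rng, in_ch=3)        # encoder output: 3 + 2*4 = 11 channels
decoders = []
for _ in range(3):
    dec_w = make_block_weights(rng, in_ch=11)   # decoder output: 11 + 2*4 = 19 channels
    head = 0.1 * rng.standard_normal((1, 19))
    decoders.append((dec_w, head))

restored = dehaze(hazy, enc_w, decoders)
print(restored.shape)  # (3, 16, 16)
```

The three branches share every encoder feature but have their own decoder weights, so each color channel can be estimated with parameters specialized to it.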

Deep Wavelet Coefficients Prediction for Super-resolution

Recognizing that a wavelet transform provides a “coarse” as well as “detail” separation of image content, we design a deep CNN that predicts the “missing details” of the wavelet coefficients of low-resolution images to obtain super-resolution (SR) results; we name this approach Deep Wavelet Super-Resolution (DWSR). Our network is trained in the wavelet domain with four input channels and four output channels. The input comprises the 4 sub-bands of the low-resolution wavelet coefficients, and the outputs are the residuals (missing details) of the 4 sub-bands of the high-resolution wavelet coefficients. Using wavelet coefficients and wavelet residuals as the network's inputs and outputs further enhances the sparsity of the activation maps. A key benefit of this design is that it greatly reduces the training burden of learning a network that reconstructs low-frequency details. The output prediction is added to the input to form the final SR wavelet coefficients. The inverse 2D discrete wavelet transform is then applied to these coefficients to generate the SR result. We show that DWSR is computationally simpler and yet produces competitive and often better results than state-of-the-art alternatives.
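
The wavelet-domain data flow described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the DWSR implementation: it uses a single-level Haar wavelet, assumes the low-resolution image has already been interpolated to the target size, and replaces the trained CNN with a hypothetical `predict_residual` callable. The function names are illustrative.

```python
import numpy as np

S = np.sqrt(2.0)

def haar_dwt2(x):
    """Single-level 2D Haar transform; returns the 4 sub-bands stacked as (4, H/2, W/2)."""
    a = (x[0::2, :] + x[1::2, :]) / S   # low-pass along rows
    d = (x[0::2, :] - x[1::2, :]) / S   # high-pass along rows
    ll = (a[:, 0::2] + a[:, 1::2]) / S
    lh = (a[:, 0::2] - a[:, 1::2]) / S
    hl = (d[:, 0::2] + d[:, 1::2]) / S
    hh = (d[:, 0::2] - d[:, 1::2]) / S
    return np.stack([ll, lh, hl, hh])

def haar_idwt2(bands):
    """Inverse of haar_dwt2."""
    ll, lh, hl, hh = bands
    h, w = ll.shape
    a = np.empty((h, 2 * w))
    d = np.empty((h, 2 * w))
    a[:, 0::2], a[:, 1::2] = (ll + lh) / S, (ll - lh) / S
    d[:, 0::2], d[:, 1::2] = (hl + hh) / S, (hl - hh) / S
    x = np.empty((2 * h, 2 * w))
    x[0::2, :], x[1::2, :] = (a + d) / S, (a - d) / S
    return x

def super_resolve(lr_upscaled, predict_residual):
    """Predict sub-band residuals, add them to the input sub-bands, then invert."""
    lr_bands = haar_dwt2(lr_upscaled)            # 4 input channels
    sr_bands = lr_bands + predict_residual(lr_bands)  # 4 output channels (residuals)
    return haar_idwt2(sr_bands)

rng = np.random.default_rng(0)
hr = rng.random((8, 8))                                   # ground-truth image
lr = hr.reshape(4, 2, 4, 2).mean(axis=(1, 3))             # 2x block-average downsample
lr_up = np.kron(lr, np.ones((2, 2)))                      # naive upscaling back to 8x8

# An "oracle" predictor that returns the true missing details, for illustration only.
oracle = lambda bands: haar_dwt2(hr) - bands
sr = super_resolve(lr_up, oracle)
print(np.allclose(sr, hr))  # True
```

With a predictor that returns all zeros, the pipeline simply reproduces the upscaled input, which is why the network only has to learn the residual details.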

Orthogonally Regularized Deep Networks

We propose a novel network structure for learning the SR mapping function in an image transformation domain, specifically the discrete cosine transform (DCT) domain. The DCT is integrated into the network structure as a convolutional DCT (CDCT) layer that remains trainable while orthogonality constraints preserve its orthogonality properties. This orthogonally regularized deep SR network (ORDSR) simplifies the manifold of the SR task by taking advantage of the DCT domain. Moreover, the CDCT layer generates the DCT frequency maps, allowing the ORDSR to focus on reconstructing the fine details from the LR input.
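
One way to picture a trainable layer that starts as a DCT and is kept orthogonal is sketched below. The exact form of the ORDSR constraint is not given here, so the squared Frobenius penalty on the Gram matrix, the 4x4 filter size, and the function names are assumptions. The sketch builds 2D DCT-II basis filters, checks that they reproduce the 2D DCT of a patch, and evaluates the penalty before and after the filters drift.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal 1D DCT-II matrix of size (n, n)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def dct_filters(n):
    """n*n filters of size (n, n); filter (u, v) is the outer product of basis rows u and v."""
    c = dct_matrix(n)
    return np.einsum("ui,vj->uvij", c, c).reshape(n * n, n, n)

def orthogonality_penalty(filters):
    """Squared Frobenius distance between the filters' Gram matrix and the identity."""
    w = filters.reshape(filters.shape[0], -1)
    gram = w @ w.T
    return np.sum((gram - np.eye(w.shape[0])) ** 2)

n = 4
filters = dct_filters(n)
rng = np.random.default_rng(0)
patch = rng.random((n, n))

# Correlating the patch with each filter gives its 2D DCT coefficients.
coeffs = np.einsum("kij,ij->k", filters, patch).reshape(n, n)
c = dct_matrix(n)
print(np.allclose(coeffs, c @ patch @ c.T))  # True

penalty_init = orthogonality_penalty(filters)
penalty_drifted = orthogonality_penalty(filters + 0.05 * rng.standard_normal(filters.shape))
print(penalty_init < 1e-20, penalty_drifted > penalty_init)  # True True
```

Adding such a penalty to the training loss would pull updated filters back toward an orthogonal set, which is the general idea behind keeping the layer trainable without losing orthogonality.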

Deep Image Super-resolution Via Natural Image Priors

We explore the use of image structures and physically meaningful priors in deep structures in order to achieve better performance. We address the problem of super-resolution from a deep learning standpoint when abundant training data is not available. We propose to regularize deep structures with prior knowledge about the images so that they can capture more structural information from the same limited data.

Deep MR Image Super-Resolution Using Structural Priors

Unlike for regular optical imagery, generous training data is often unavailable for MR image super-resolution. We therefore propose the use of image priors, namely a low-rank structure and a sharpness prior, to enhance deep MR image super-resolution. Our contribution is to incorporate these priors in an analytically tractable fashion into the learning of a convolutional neural network (CNN) that accomplishes the super-resolution task. Experiments performed on two publicly available magnetic resonance (MR) brain image databases exhibit promising results, particularly when training imagery is limited.
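
As a rough illustration of how such priors can enter a training objective, the sketch below adds two regularizers to a mean-squared-error loss. The specific forms are assumptions and not the formulation used in this work: the nuclear norm (sum of singular values) serves as a standard convex surrogate for rank, and the sharpness term penalizes a mismatch in mean gradient energy between prediction and target. The function names and the weights `lam_rank` and `lam_sharp` are hypothetical.

```python
import numpy as np

def nuclear_norm(image):
    """Sum of singular values: a convex surrogate for the rank of the image matrix."""
    return np.linalg.svd(image, compute_uv=False).sum()

def gradient_energy(image):
    """Mean squared gradient magnitude, used here as a simple sharpness measure."""
    gy, gx = np.gradient(image)
    return np.mean(gx ** 2 + gy ** 2)

def prior_regularized_loss(pred, target, lam_rank=0.01, lam_sharp=0.1):
    """MSE plus a low-rank term and a sharpness-matching term (illustrative forms)."""
    mse = np.mean((pred - target) ** 2)
    rank_term = nuclear_norm(pred)
    sharp_term = (gradient_energy(pred) - gradient_energy(target)) ** 2
    return mse + lam_rank * rank_term + lam_sharp * sharp_term

rng = np.random.default_rng(0)
u, v = rng.random(16), rng.random(16)
rank_one = np.outer(u, v)

# For a rank-1 matrix the nuclear norm equals the product of the vector norms.
print(np.isclose(nuclear_norm(rank_one), np.linalg.norm(u) * np.linalg.norm(v)))  # True

target = rng.random((16, 16))
blurred = (target + np.roll(target, 1, axis=0) + np.roll(target, 1, axis=1)) / 3.0
print(gradient_energy(blurred) < gradient_energy(target))
```

Both terms are functions of the network output alone or of simple image statistics, so they add supervision that does not require extra training pairs.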

Simultaneous Decomposition and Classification Network

We propose a Simultaneous Decomposition and Classification Network (SDCN) that eliminates noise interference and thereby enhances classification accuracy. The network consists of two sub-networks: the decomposition sub-network handles denoising, while the classification sub-network discriminates targets from confusers. Importantly, both sub-networks are jointly trained as an end-to-end model. Experimental results show that the proposed network significantly outperforms a network without decomposition and SRC-related methods.
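
Joint end-to-end training of two sub-networks typically means minimizing a single objective that combines both tasks. The sketch below shows one plausible form of such an objective; it is an assumption, not the SDCN loss. It assumes a clean reference image is available for the denoising term, uses mean squared error for decomposition and softmax cross-entropy for classification, and introduces a hypothetical `weight` parameter to balance the two.

```python
import numpy as np

def softmax_cross_entropy(logits, label):
    """Cross-entropy of a single example, computed in a numerically stable way."""
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[label]

def joint_loss(denoised, clean, logits, label, weight=1.0):
    """Denoising error plus weighted classification error, minimized together."""
    reconstruction = np.mean((denoised - clean) ** 2)
    classification = softmax_cross_entropy(logits, label)
    return reconstruction + weight * classification

rng = np.random.default_rng(0)
clean = rng.random((8, 8))
noisy_estimate = clean + 0.1 * rng.standard_normal((8, 8))
uniform_logits = np.zeros(2)          # two classes: target vs. confuser

# Perfect denoising with an uninformative classifier leaves only log(2).
print(np.isclose(joint_loss(clean, clean, uniform_logits, label=0), np.log(2.0)))  # True
print(joint_loss(noisy_estimate, clean, uniform_logits, label=0) > np.log(2.0))    # True
```

Because gradients of the classification term flow back through the denoised output, the decomposition stage is pushed toward removing the noise that matters most for discrimination.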

Deep Networks With Shape Priors For Nucleus Detection

Nuclei detection has been a topic of enduring interest, with promising recent success shown by deep learning methods. These methods train, for example, convolutional neural networks (CNNs) with a training set of input images and known, labeled nuclei locations. Many of these methods are supplemented by spatial or morphological processing. We develop a new approach that we call Shape Priors with Convolutional Neural Networks (SP-CNN) to perform significantly enhanced nuclei detection.

Deep Retinal Image Segmentation Under Geometrical Priors

Vessel segmentation of retinal images is a key diagnostic capability in ophthalmology. This problem faces several challenges, including low contrast, variable vessel size and thickness, and the presence of interfering pathology such as micro-aneurysms and hemorrhages. Early approaches addressing this problem employed hand-crafted filters to capture vessel structures, accompanied by morphological post-processing. More recently, deep learning techniques have been employed with significantly enhanced segmentation accuracy. We propose a novel domain-enriched deep network that consists of two components: 1) a representation network that learns geometric features specific to retinal images, and 2) a custom-designed, computationally efficient residual task network that utilizes the features obtained from the representation layer to perform pixel-level segmentation. The representation and task networks are jointly learned for any given training set. To obtain physically meaningful and practically effective representation filters, we propose two new constraints that are inspired by the expected prior structure of these filters: 1) an orientation constraint that promotes geometric diversity of curvilinear features, and 2) a data-adaptive noise regularizer that penalizes false positives. Multi-scale extensions are developed to enable accurate detection of thin vessels.

Group Based Deep Shared Feature Learning for Fine-grained Image Classification

Fine-grained image classification has emerged as a significant challenge because objects in such images have small inter-class visual differences but large variations in pose, lighting, viewpoint, etc. We present a new deep network architecture, GSFL-Net, that explicitly models shared features and removes their effect to achieve enhanced classification results. Experiments on benchmark datasets show that GSFL-Net can enhance classification accuracy over the state of the art with a more interpretable architecture.

Email
ipal.psu@gmail.com