Undecimated 2D Dual Tree Complex Wavelet Transforms

Dr Paul Hill, Dr Alin Achim and Professor Dave Bull

This work introduces two undecimated forms of the 2D Dual Tree Complex Wavelet Transform (DT-CWT) which combine the benefits of the Undecimated Discrete Wavelet Transform (exact translational invariance, a one-to-one relationship between all co-located coefficients at all scales) and the DT-CWT (improved directional selectivity and complex subbands).

The Discrete Wavelet Transform (DWT) is a spatial frequency transform that has been used extensively for analysis, denoising and fusion within image processing applications. Although the DWT gives excellent combined spatial and frequency resolution, it suffers from shift variance. Various adaptations to the DWT have been developed to produce a shift invariant form. Exact shift invariance has been achieved using the Undecimated Discrete Wavelet Transform (UDWT). However, the UDWT suffers from a considerably overcomplete representation together with a lack of directional selectivity. More recently, the Dual Tree Complex Wavelet Transform (DT-CWT) has given a more compact representation whilst offering near shift invariance. The DT-CWT also offers improved directional selectivity (6 directional subbands per scale) and complex valued coefficients that are useful for magnitude / phase analysis within the transform domain. This paper introduces two undecimated forms of the DT-CWT which combine the benefits of the UDWT (exact translational invariance, a one-to-one relationship between all co-located coefficients at all scales) and the DT-CWT (improved directional selectivity and complex subbands).
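The shift variance of a decimated transform, and its absence in an undecimated one, can be illustrated with a minimal 1D sketch. It uses a single-level Haar filter with circular boundary handling, chosen only for brevity; it is not the filter set used by the downloadable code, and the helper names haar_d and haar_u are hypothetical.

```matlab
% One-level Haar detail coefficients, with and without decimation (circular).
x  = [0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0];   % test signal
xs = circshift(x, [0 1]);                  % same signal shifted by one sample

% Undecimated: one detail coefficient per input sample.
haar_u = @(v) (v - circshift(v, [0 -1])) / sqrt(2);
% Decimated: keep only every second detail coefficient.
haar_d = @(v) (v(1:2:end) - v(2:2:end)) / sqrt(2);

d0 = haar_d(x);   d1 = haar_d(xs);
u0 = haar_u(x);   u1 = haar_u(xs);

energy_d = [sum(d0.^2), sum(d1.^2)];   % detail energy changes with the shift
energy_u = [sum(u0.^2), sum(u1.^2)];   % detail energy is unchanged

% Undecimated coefficients of the shifted signal are the shifted coefficients.
shift_err = max(abs(u1 - circshift(u0, [0 1])));
```

In this example the decimated detail band is empty for x but not for xs, whereas the undecimated coefficients simply move with the signal.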

This image illustrates the three different 2D Dual Tree Complex Wavelet Transforms


Matlab code download 

Implementations of three complex wavelet transforms can be downloaded below as MATLAB MEX files. They have been compiled for 32-bit and 64-bit Windows and 64-bit Linux. If you need an alternative format please mail me at paul.hill@bristol.ac.uk. Code updated 16/7/2014.

Please reference the following paper if you use this software

Hill, P. R., N. Anantrasirichai, A. Achim, M. E. Al-Mualla, and D. R. Bull. “Undecimated Dual-Tree Complex Wavelet Transforms.” Signal Processing: Image Communication 35 (2015): 61-70.

The paper is available here: http://www.sciencedirect.com/science/article/pii/S0923596515000715

A previous paper is here:

Hill, P., A. Achim, and D. Bull. “The Undecimated Dual Tree Complex Wavelet Transform and its application to bivariate image denoising using a Cauchy model.” 19th IEEE International Conference on Image Processing (ICIP), pp. 1205-1208, Sept. 30 - Oct. 3, 2012.

 

Matlab Code Usage

Forward Transform: NDxWav2DMEX
Backward Transform: NDixWav2DMEX
Usage:  w = NDxWav2DMEX(x, J, Faf, af, nondecimate);
        y = NDixWav2DMEX(w, J, Fsf, sf, nondecimate);
x,y - 2D arrays
J - number of decomposition levels
Faf{i}: tree i first stage analysis filters 
af{i}:  tree i filters for remaining analysis stages
Fsf{i}: tree i first stage synthesis filters 
sf{i}:  tree i filters for remaining synthesis stages
nondecimate: 0 (default) for the original decimated version, 1 for the completely undecimated version, 2 for decimation of just the first level.
w - wavelet coefficients
w{a}{b}{c}{d} - wavelet coefficients
        a = 1:J (scales)
        b = 1 (real part); b = 2 (imag part)
        c = 1,2; d = 1,2,3 (orientations)
w{J+1}{a}{b} - lowpass coefficients
        a = 1,2; b = 1,2 
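
The coefficient layout above can be turned into complex subband magnitudes with a sketch like the following. It assumes that w{a}{1}{c}{d} and w{a}{2}{c}{d} pair up directly as the real and imaginary parts of one oriented subband, as the listing suggests. To stay self-contained it fills a placeholder w with random values; in practice w would come from NDxWav2DMEX. The names mag, profile and N are illustrative only.

```matlab
% Placeholder coefficients with the documented layout (fully undecimated case,
% where every subband has the same size as the N-by-N input image).
J = 3; N = 64;
w = cell(1, J + 1);
for j = 1:J
    w{j} = cell(1, 2);
    for b = 1:2
        w{j}{b} = cell(1, 2);
        for c = 1:2
            w{j}{b}{c} = cell(1, 3);
            for d = 1:3
                w{j}{b}{c}{d} = randn(N, N);
            end
        end
    end
end

% Combine real and imaginary parts into six complex subbands per scale.
mag = cell(J, 6);
for j = 1:J
    k = 0;
    for c = 1:2
        for d = 1:3
            k = k + 1;
            z = w{j}{1}{c}{d} + 1i * w{j}{2}{c}{d};
            mag{j, k} = abs(z);
        end
    end
end

% Co-located magnitudes across all scales at pixel (10, 20), first orientation.
profile = cellfun(@(m) m(10, 20), mag(:, 1));
```

The last line relies on the one-to-one relationship between co-located coefficients, which holds only when no decimation is applied.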
 
 
Example of Usage: 
 
  % Original Decimated Version
  x = rand(256,256);
  J = 4;
  [Faf, Fsf] = AntonB;
  [af, sf] = dualfilt1;
  w = NDxWav2DMEX(x, J, Faf, af,0);
  y = NDixWav2DMEX(w, J, Fsf, sf,0);
  err = x - y;
  max(max(abs(err)))
 
  % Undecimated Version 1 (no decimation)
  x = rand(256,256);
  J = 4;
  [Faf, Fsf] = NDAntonB2; %(Must use ND filters for both)
  [af, sf] = NDdualfilt1;
  w = NDxWav2DMEX(x, J, Faf, af, 1);
  y = NDixWav2DMEX(w, J, Fsf, sf, 1);
  err = x - y;
  max(max(abs(err)))
 
  % Undecimated Version 2 (decimation on only the first level)
  x = rand(256,256);
  J = 4;
  [Faf, Fsf] = AntonB; 
  [af, sf] = NDdualfilt1; %(Must use ND filters for just these)
  w = NDxWav2DMEX(x, J, Faf, af, 2);
  y = NDixWav2DMEX(w, J, Fsf, sf, 2);
  err = x - y;
  max(max(abs(err)))
 
% SIZE LIMITS
% (s/J^2) must be bigger than 5 (where s is both height and width)
% Height and width must be divisible by 2^J for fully decimated version
% Height and width must be divisible by 2 for nondecimated version 2
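
The size limits above can be checked before calling the transform. The following is a minimal sketch for a square s-by-s image; the helper name size_ok is hypothetical, and nd stands for the nondecimate argument (0, 1 or 2).

```matlab
% Check the documented size limits for an s-by-s image and J levels.
% nd: 0 = decimated, 1 = fully undecimated, 2 = first level decimated only.
size_ok = @(s, J, nd) (s / J^2 > 5) && ...
    (nd ~= 0 || mod(s, 2^J) == 0) && ...
    (nd ~= 2 || mod(s, 2)   == 0);

ok_a = size_ok(256, 4, 0);   % 256/16 = 16 > 5 and 256 is divisible by 2^4
ok_b = size_ok(100, 5, 1);   % 100/25 = 4 fails the first limit
ok_c = size_ok(250, 4, 0);   % 250 is not divisible by 2^4
ok_d = size_ok(250, 4, 2);   % version 2 only needs an even size
```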


Hardware-accelerated Video Fusion

This project aims to produce a low-power demonstrator for real-time video fusion using a hybrid SoC device that combines a low-power multi-core Cortex-A9 processor and an FPGA fabric. The methodology uses a fusion algorithm developed at Bristol based on dual-tree complex wavelet transforms. These transforms work in forward and inverse mode together with configurable fusion rules to offer high quality fusion output.

The dual-tree complex wavelet transforms represent around 70% of the total complexity. The wavelet accelerator designed at Bristol removes this complexity and accelerates the whole application by a factor of 4. It also has a significant positive impact on overall energy. There is a negligible increase in power because the fabric works in parallel with the main processor. Note that if the optimization criterion is not performance or energy but power, then the processor and fabric could reduce their clock frequency and voltage and obtain a significant reduction in power for the same energy and performance levels.
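
The trade-off between speed and power can be made concrete with a rough sketch based on the textbook dynamic-power model P = C V^2 f. This ignores static power and is not a measurement from this project. It assumes the clock is lowered until throughput matches the unaccelerated system, and the voltage scaling factor is a hypothetical value chosen for illustration.

```matlab
% Illustrative dynamic-power model P = C * V^2 * f, in normalised units.
speedup = 4;             % whole-application acceleration quoted above
f_scale = 1 / speedup;   % clock scaled so throughput matches the unaccelerated system
v_scale = 0.8;           % hypothetical voltage reduction allowed by the lower clock

p_freq_only = f_scale;               % power relative to full-speed operation
p_freq_volt = v_scale^2 * f_scale;   % with voltage scaling applied as well
```

Under these assumptions the accelerated system would draw a quarter of its full-speed dynamic power from frequency scaling alone, and less again once the voltage is lowered.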

This project has built a system extended with frame capturing capabilities using thermal and visible light cameras. The system can be seen working in our labs at this link: hardware accelerated video fusion

This project has been funded by the Technology Strategy Board under their energy-efficient computers program with Qioptiq Ltd as industrial collaborator.

This research will be presented and demonstrated at FPL 2015, London in September.

Video super-resolution

Motion compensated video super-resolution is a technique that uses the sub-pixel shifts between multiple low resolution images of the same scene to create higher resolution frames with improved quality. An important concept is that due to the sub-pixel displacements of picture elements in the low resolution frames, it is possible to obtain high frequency content beyond the Nyquist limit of the sampling equipment. Super-resolution algorithms exploit the fact that as objects move in front of the camera sensor, picture elements captured in the camera pixels might not be visible in the next frame if the movement of the element does not extend to the next pixel. Super-resolution algorithms track and position these additional picture elements in the high-resolution frame. The resulting video quality is significantly improved compared with techniques that only exploit the information in one low-resolution frame to create one high resolution frame.
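
The idea that sub-pixel shifts carry detail beyond the low-resolution Nyquist limit can be illustrated with an idealised 1D sketch. It assumes noise-free point sampling and an exactly known half-pixel shift, and it simply interleaves the samples; it is not the Bayesian algorithm used in this project. All variable names are illustrative.

```matlab
% Two low-resolution frames of the same scene, offset by half a
% low-resolution pixel, interleave into one high-resolution signal.
n  = 0:31;
hr = sin(2 * pi * 0.2 * n) + 0.5 * cos(2 * pi * 0.45 * n);   % scene on the fine grid

lr1 = hr(1:2:end);   % frame 1: every second fine sample
lr2 = hr(2:2:end);   % frame 2: same sampling after a one-fine-sample shift

% Single-frame estimate: repeat each low-resolution sample (no new detail).
single = kron(lr1, [1 1]);

% Two-frame estimate: place each frame's samples at its known sub-pixel offset.
sr = zeros(size(hr));
sr(1:2:end) = lr1;
sr(2:2:end) = lr2;

err_single = max(abs(single - hr));
err_sr     = max(abs(sr - hr));   % zero in this noise-free, known-shift case
```

The 0.45 cycles/sample component lies above the Nyquist limit of either low-resolution frame, yet it is recovered once the two frames are combined.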

Super-resolution techniques can be applied to many areas, including intelligent personal identification, medical imaging, security and surveillance, and can be of special interest in applications that demand low-power and low-cost sensors. The key idea is that increasing the pixel size improves the signal to noise ratio and reduces the cost and power of the sensor. Larger pixels enable more light to be collected, and the blur introduced by diffraction is also reduced. Diffraction is a bigger issue with smaller pixels, so sensors with larger pixels perform better, giving sharper images with higher contrast in the fine details, especially in low-light conditions.

Increasing the pixel size means that fewer pixels fit on the sensor, which reduces the sensor resolution. The benefit is that the low-resolution sensor needs to process and transmit less information, which results in lower power and cost. Super-resolution algorithms running on the receiver side can then be used to recover high-quality, high-resolution video while maintaining a constant frame rate.

Overall, super-resolution enables the system that captures and transmits the video data to be based on low-power and low-cost components while the receiver still obtains a high-quality video stream.

This project has been sponsored by the Centre for Defence Enterprise and DSTL under the Generic Enablers for Low-Size, Weight, Power and Cost (SWAPC) Intelligence, Surveillance, Target Acquisition and Reconnaissance (ISTAR) program.

Click to see some examples:

1: before: car number plate in; after super-resolution: car number plate SR

2: before: vehicles in; after super-resolution: vehicles SR

The theory behind the algorithm is described in: Chen, J., Nunez-Yanez, J. L. and Achim, A. (2014), ‘Bayesian video super-resolution with heavy-tailed prior models’, IEEE Transactions on Circuits and Systems for Video Technology, vol. 24, pp. 905-914.

Generic motion based object segmentation for assisted navigation

CASBliP – Computer Aided System for the Blind

In the CASBliP project, a robust approach is proposed for annotating independently moving objects captured by head mounted stereo cameras that are worn by an ambulatory (and visually impaired) user. Initially, sparse optical flow is extracted from a single image stream, in tandem with dense depth maps. Then, using the assumption that apparent movement generated by camera egomotion is dominant, flow corresponding to independently moving objects (IMOs) is robustly segmented using MLESAC. Next, the mode depth of the feature points defining this flow (the foreground) is obtained by aligning them with the depth maps. Finally, a bounding box is scaled proportionally to this mode depth and robustly fitted to the foreground points such that the number of inliers is maximised. The system runs at around 8 fps and has been tested by visually impaired volunteers.
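
The segmentation step can be sketched in a heavily simplified form. The example below assumes that egomotion produces a single dominant translation, uses synthetic flow vectors, tries every flow vector as a candidate model instead of random sampling, and keeps the mixing weight fixed, whereas MLESAC proper estimates it with EM. The noise parameters and variable names are assumptions for illustration, not values from the CASBliP system.

```matlab
% Synthetic sparse flow (pixels/frame): 20 background vectors near (2, 0)
% caused by camera egomotion, plus 5 vectors near (-3, 1) from a moving object.
k  = (1:20)';
bg = [2 + 0.1 * sin(k), 0.1 * cos(k)];
m  = (1:5)';
fg = [-3 + 0.1 * sin(m), 1 + 0.1 * cos(m)];
uv = [bg; fg];

sigma = 0.2;   % assumed inlier noise standard deviation
nu    = 20;    % assumed side length of the uniform outlier range
mix   = 0.5;   % fixed mixing weight (estimated by EM in MLESAC proper)

best_cost = inf;
for i = 1:size(uv, 1)
    t  = uv(i, :);                                        % candidate dominant motion
    r2 = sum((uv - repmat(t, size(uv, 1), 1)).^2, 2);     % squared residuals
    p_in  = mix * exp(-r2 / (2 * sigma^2)) / (2 * pi * sigma^2);
    p_out = (1 - mix) / nu^2;
    cost  = -sum(log(p_in + p_out));                      % negative log-likelihood
    if cost < best_cost
        best_cost   = cost;
        best_t      = t;
        inlier_prob = p_in ./ (p_in + p_out);
    end
end

is_imo = inlier_prob < 0.5;   % flow not explained by the dominant motion
```

Scoring each candidate by likelihood, not by inlier count, is what distinguishes this family of estimators from plain RANSAC.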

For more information, see CASBliP – Computer Aided System for the Blind.

Human pose estimation using motion

Ben Daubney, David Gibson, Neill Campbell

Currently we are researching how to extract human pose from a sparse set of moving features. This work is inspired by psychophysical experiments using the Moving Light Display (MLD), where it has been shown that a small set of moving points attached to the key joints of a person could convey a wealth of information to an observer about the person being viewed, such as their mood or gender. Unlike the typical MLDs used in the psychophysics community, ours are automatically generated by applying a standard feature tracker to a sequence of images.

The result is a set of features that are far more noisy and unreliable than those traditionally used. The purpose of this research is to better understand how the temporal dimension of a sequence of images can be exploited at a much lower level than is currently used to estimate pose.

Analysis of moth camouflage


David Gibson, Neill Campbell

This project is a half-million-pound BBSRC collaboration with Biological Sciences and Experimental Psychology. Its aim is to develop a computational theory of animal camouflage, with models specific to the visual systems of birds and humans. Moths have been chosen for this study as they are particularly good demonstrators of a wide range of cryptic and disruptive camouflage in nature. Using psychophysically plausible low-level image features, learning algorithms are used to determine the effectiveness of camouflage examples. The ability to generate and process large numbers of camouflage examples enables predictive computational models to be created and compared to the performance of human and bird subjects. Such comparisons will give insights into which aspects of moth camouflage are important for avoiding detection and recognition by birds and humans, and thereby into the mechanisms being employed by bird and human visual systems.