Video super-resolution

Motion-compensated video super-resolution is a technique that uses the sub-pixel shifts between multiple low-resolution images of the same scene to create higher-resolution frames with improved quality. An important concept is that, because picture elements are displaced by sub-pixel amounts between the low-resolution frames, it is possible to recover high-frequency content beyond the Nyquist limit of the sampling equipment. As objects move in front of the camera sensor, scene details fall at slightly different sub-pixel positions in successive frames, so a detail captured in one frame may be sampled differently, or missed entirely, in the next. Super-resolution algorithms track these picture elements across frames and position them on the high-resolution grid. The resulting video quality is significantly better than that of techniques that use only the information in a single low-resolution frame to create each high-resolution frame.
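
To make the principle concrete, here is a minimal Python/NumPy sketch of the classic shift-and-add approach, assuming the sub-pixel shifts have already been estimated by motion compensation. It illustrates the idea only; the project's actual algorithm is the Bayesian method with heavy-tailed priors referenced below, and the function and parameter names here are illustrative.

```python
# Minimal shift-and-add multi-frame super-resolution sketch (illustrative,
# not the Bayesian method of Chen, Nunez-Yanez & Achim referenced below).
import numpy as np

def shift_and_add_sr(lr_frames, shifts, scale):
    """Fuse registered low-resolution frames onto a high-resolution grid.

    lr_frames -- sequence of 2-D arrays, all the same shape
    shifts    -- per-frame (dy, dx) sub-pixel shifts, in low-res pixels
    scale     -- integer upsampling factor
    """
    h, w = lr_frames[0].shape
    acc = np.zeros((h * scale, w * scale))   # summed intensities
    cnt = np.zeros_like(acc)                 # samples landing on each pixel
    for frame, (dy, dx) in zip(lr_frames, shifts):
        # Place each low-res sample at its nearest high-res grid position.
        ys = np.clip(np.rint((np.arange(h)[:, None] + dy) * scale).astype(int),
                     0, h * scale - 1)
        xs = np.clip(np.rint((np.arange(w)[None, :] + dx) * scale).astype(int),
                     0, w * scale - 1)
        np.add.at(acc, (ys, xs), frame)
        np.add.at(cnt, (ys, xs), 1)
    cnt[cnt == 0] = 1                        # unobserved pixels stay zero
    return acc / cnt
```

Because the frames carry different sub-pixel shifts, their samples land on different high-resolution grid positions, which is precisely how information beyond the single-frame Nyquist limit is recovered.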

Super-resolution techniques can be applied in many areas, including intelligent personal identification, medical imaging, security and surveillance, and they are of special interest in applications that demand low-power, low-cost sensors. The key idea is that increasing the pixel size improves the signal-to-noise ratio while reducing the cost and power of the sensor: larger pixels collect more light, and they suffer less from diffraction blur, which affects small pixels the most. Sensors with larger pixels therefore give sharper images with higher contrast in the fine details, especially in low-light conditions.
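
The diffraction argument can be checked with a back-of-envelope calculation, sketched below. The wavelength and f-number are illustrative assumptions, not values from the project.

```python
# Back-of-envelope check of the diffraction argument above. The sensor
# parameters are illustrative assumptions, not values from the project.
WAVELENGTH_UM = 0.55              # green light, mid-visible band
F_NUMBER = 2.8                    # a typical small-sensor lens aperture

# Diameter of the Airy disk (diffraction blur spot), first minimum:
airy_diameter_um = 2.44 * WAVELENGTH_UM * F_NUMBER    # ~3.8 um

for pixel_pitch_um in (1.4, 3.8, 6.0):
    ratio = airy_diameter_um / pixel_pitch_um
    print(f"{pixel_pitch_um:.1f} um pixel: blur spot spans {ratio:.2f} pixels")
# Small (1.4 um) pixels sit well inside the blur spot, so diffraction
# limits their effective resolution; large (6 um) pixels do not.
```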

The trade-off is that increasing the pixel size means fewer pixels fit on the sensor, which reduces the sensor resolution. The low-resolution sensor, however, needs to process and transmit less information, which results in lower power consumption and cost. Super-resolution algorithms running at the receiver side can then recover high-quality, high-resolution video while maintaining a constant frame rate.

Overall, super-resolution enables the system that captures and transmits the video data to be based on low-power and low-cost components while the receiver still obtains a high-quality video stream.

This project has been sponsored by the Centre for Defence Enterprise and DSTL under the Generic Enablers for Low-Size, Weight, Power and Cost (SWAPC) Intelligence, Surveillance, Target Acquisition and Reconnaissance (ISTAR) program.

Some examples:

1. Car number plate: before and after super-resolution.

2. Vehicles: before and after super-resolution.

The theory behind the algorithm is described in: Chen, J., Nunez-Yanez, J. L. and Achim, A., ‘Bayesian video super-resolution with heavy-tailed prior models’, IEEE Transactions on Circuits and Systems for Video Technology, vol. 24, pp. 905-914, 2014.

Perceptual Quality Metrics (PVM)

RESEARCHERS

Dr. Fan (Aaron) Zhang

INVESTIGATOR

Prof. David Bull, Dr. Dimitris Agrafiotis and Dr. Roland Baddeley

DATES

2012-2015

FUNDING

ORSAS and EPSRC

SOURCE CODE 

PVM Matlab code: Download.

INTRODUCTION

It is known that the human visual system (HVS) employs independent processes (distortion detection and artefact perception, often referred to as near-threshold and supra-threshold distortion perception) to assess video quality at different distortion levels. Visual masking effects also play an important role in video distortion perception, especially within spatial and temporal textures.

Figure: Algorithmic diagram for PVM.

It is well known that small differences in textured content can be tolerated by the HVS. In this work, we employ the dual-tree complex wavelet transform (DT-CWT) in conjunction with motion analysis to characterise this tolerance within spatial and temporal textures. The DT-CWT is particularly powerful in this context due to its shift-invariance and orientation-selectivity properties. In highly distorted, compressed video content, blurring is one of the most commonly occurring artefacts. Our approach detects it by comparing high-frequency subband coefficients from the reference and distorted frames, again facilitated by the DT-CWT. This measure is motion-weighted in order to simulate the tolerance of the HVS to blurring in content with high temporal activity. Inspired by the previous work of Chandler and Hemami, and of Larson and Chandler, thresholded differences (defined as noticeable distortion) and blurring artefacts are non-linearly combined using a modified geometric mean model, in which the proportion of each component is adaptively tuned. The performance of the proposed video metric is assessed and validated using the VQEG FRTV Phase I and the LIVE video databases, and shows clear improvements in correlation with subjective scores over existing metrics such as PSNR, SSIM, VIF, VSNR, VQM and MOVIE, and in many cases over STMAD.
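
As a rough illustration of this pooling stage, the sketch below combines a noticeable-distortion term and a blur term with a modified geometric mean whose exponent adapts to the distortion level. The adaptation rule and all constants are assumptions made for illustration; the published PVM uses its own fitted coefficients (see the T-CSVT paper referenced below).

```python
# Sketch of the pooling stage described above. D is the noticeable-
# distortion term (thresholded DT-CWT differences) and B the motion-
# weighted blur term; both are assumed non-negative, larger = worse.
# The adaptive exponent below is an assumed form for illustration only.
import numpy as np

def pooled_distortion(D, B, alpha_min=0.2, alpha_max=0.8, k=1.0):
    # Weight the blur term more heavily as overall distortion grows.
    alpha = alpha_min + (alpha_max - alpha_min) * (1.0 - np.exp(-k * D))
    eps = 1e-12                    # keep the geometric mean defined at zero
    return (D + eps) ** (1.0 - alpha) * (B + eps) ** alpha
```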

RESULTS

Figure: Scatter plots of subjective DMOS versus different video metrics on the VQEG database.
Figure: Scatter plots of subjective DMOS versus different video metrics on the LIVE video database.

REFERENCE

  1. A Perception-based Hybrid Model for Video Quality Assessment. F. Zhang and D. Bull, IEEE T-CSVT, June 2016.
  2. Quality Assessment Methods for Perceptual Video Compression. F. Zhang and D. Bull, ICIP, Melbourne, Australia, September 2013.


Parametric Video Coding

RESEARCHERS

Dr. Fan (Aaron) Zhang

INVESTIGATOR

Prof. David Bull, Dr. Dimitris Agrafiotis and Dr. Roland Baddeley

DATES

2008-2015

FUNDING

ORSAS and EPSRC

INTRODUCTION

In most cases, the target of video compression is to provide good subjective quality rather than simply to produce pictures that are as similar as possible to the originals. Based on this assumption, it is possible to conceive of a compression scheme where an analysis/synthesis framework is employed rather than the conventional energy-minimisation approach. If such a scheme were practical, it could offer lower bitrates through reduced residual and motion vector coding, using a parametric approach to describe texture warping and/or synthesis.

Figure: Method diagram for the parametric video coding framework.

Instead of encoding whole images or prediction residuals after translational motion estimation, our algorithm employs a perspective motion model to warp static textures and utilises texture synthesis to create dynamic textures. Texture regions are segmented using features derived from the complex wavelet transform and further classified according to their spatial and temporal characteristics. Moreover, a compatible artefact-based video metric (AVM) is proposed to evaluate the quality of the reconstructed video; it is also employed in-loop to prevent warping and synthesis artefacts. The proposed algorithm has been integrated into an H.264 video coding framework. The results show significant bitrate savings of up to 60% compared with H.264 at the same objective quality (based on AVM) and subjective scores.
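
To illustrate the static-texture path, the sketch below warps a texture region with a perspective (homography) motion model using OpenCV. Segmentation, dynamic-texture synthesis and the in-loop AVM check are omitted, and all coordinates are illustrative.

```python
# Minimal sketch of static-texture warping with a perspective motion model.
import cv2
import numpy as np

def warp_static_texture(ref_frame, src_quad, dst_quad, out_size):
    """src_quad/dst_quad: 4x2 float32 corresponding corner points;
    out_size: (width, height) of the output frame."""
    H, _ = cv2.findHomography(src_quad, dst_quad)    # 8-parameter model
    return cv2.warpPerspective(ref_frame, H, out_size)

# A region that drifts and shears slightly between frames:
src = np.float32([[10, 10], [110, 10], [110, 110], [10, 110]])
dst = np.float32([[12, 11], [113, 9], [115, 112], [11, 113]])
ref = np.random.randint(0, 255, (240, 320), np.uint8)   # stand-in frame
warped = warp_static_texture(ref, src, dst, (320, 240))
```

The homography generalises translational motion to the eight-parameter perspective model named above, so a planar texture region can be re-used across frames instead of coding a residual for it.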

RESULTS


REFERENCE

  1. Perception-oriented Video Coding based on Image Analysis and Completion: a Review. P. Ndjiki-Nya, D. Doshkov, H. Kaprykowsky, F. Zhang, D. Bull, T. Wiegand, Signal Processing: Image Communication, July 2012.
  2. A Parametric Framework For Video Compression Using Region-based Texture Models. F. Zhang and D. Bull, IEEE J-STSP, November 2011.

SPHERE – a Sensor Platform for HEalthcare in a Residential Environment

The SPHERE project's main objective is to develop a multimodal sensing platform based on low-cost devices, ranging from on-body sensors to environmental and video-based sensors. The SPHERE platform aims to tackle the problem of healthcare monitoring at home efficiently. Its vision is not to develop fundamentally new sensors for individual health conditions, but rather to address all of these healthcare needs simultaneously through data fusion and pattern recognition from a common platform of non-medical/environmental sensors in the home. The system will be general-purpose, low-cost and hence scalable. Sensors will be entirely passive, requiring no action by the user and hence suitable for all patients including the most vulnerable. A central hypothesis is that deviations from a user’s established pattern of behaviour in their own home have particular, unexploited, diagnostic value.

Computer Vision in Sphere: WP2 (Vision Team)

The main objective of WP2 is to develop an efficient, real-time multi-camera system for activity monitoring in the home environment. The system will be based on low-cost cameras and depth sensors, used to estimate clients' positions and to analyse their movements, extracting features for action understanding and activity recognition.

The vision team are developing a video-based action recognition and multi-user tracking system for the house environment. This will allow the system to estimate the activity/inactivity level of users during their daily lives. The platform has been tested in the SPHERE house and integrated with the other sensor systems, providing a unique multi-sensory system for data collection. Ongoing video work includes a collaboration with respiratory physicians in Bristol to develop and validate video-based systems for monitoring breathing.


Related Projects

Online quality assessment of human movements from skeleton data
The objective of this project is to evaluate the quality of human movements from visual information. This is useful in a broad range of applications, such as diagnosis and rehabilitation.

Real Time RGB-D tracker: DS-KCF
The objective of this project is to develop a real-time RGB-D tracker based on Kernelised Correlation Filters.

WP2 Publications

2015
  • Massimo Camplani, Sion Hannuna, Majid Mirmehdi, Dima Damen, Adeline Paiement, Lili Tao, Tilo Burghardt. Real-time RGB-D Tracking with Depth Scaling Kernelised Correlation Filters and Occlusion Handling. British Machine Vision Conference, September 2015.
  • N. Zhu, T. Diethe, M. Camplani, L. Tao, A. Burrows, N. Twomey, D. Kaleshi, M. Mirmehdi, P. Flach, I. Craddock, Bridging eHealth and the Internet of Things: The SPHERE Project, IEEE Intelligent Systems, (to appear).
  • A Multi-modal Sensor Infrastructure for Healthcare in a Residential Environment. P. Woznowski, X. Fafoutis, T. Song, S. Hannuna, M. Camplani, L. Tao, A. Paiement, E. Mellios, M. Haghighi, N. Zhu, G. Hilton, D. Damen, T. Burghardt, M. Mirmehdi, R. Piechocki, D. Kaleshi and I. Craddock. IEEE International Conference on Communications (ICC), Workshop on ICT-enabled services and technologies for eHealth and Ambient Assisted Living.
2014
  • A. Paiement, L. Tao, S. Hannuna, M. Camplani, D. Damen and M. Mirmehdi (2014). Online quality assessment of human movement from skeleton data. British Machine Vision Conference (BMVC), Nottingham, UK.

Object Modelling From Sparse And Misaligned 3D and 4D Data

Object modelling from 3D and 4D sparse and misaligned data has important applications in medical imaging, where visualising and characterising the shape of, e.g., an organ or tumour is often needed to establish a diagnosis or to plan surgery. Two common issues in medical imaging are large gaps between the 2D image slices that make up a dataset, and misalignments between these slices due to the patient's movements between their respective acquisitions. These gaps and misalignments make automatic analysis of the data particularly challenging; in particular, they require interpolation and registration in order to recover the complete shape of the object. This work focuses on integrated registration, segmentation and interpolation of such sparse and misaligned data. We developed a framework flexible enough to model objects of various shapes, from data having arbitrary spatial configuration and from a variety of imaging modalities (e.g. CT, MRI).

ISISD: Integrated Segmentation and Interpolation of Sparse Data

We present a new, general-purpose level set framework which can handle sparse data by simultaneously segmenting the data and automatically interpolating its gaps. In this framework, the level set implicit function is interpolated by Radial Basis Functions (RBFs), and its interface can propagate in a sparse volume, using data information where available and RBF-based interpolation of its speeds in the gaps. Any segmentation criterion may be used, thus allowing the framework to process any imaging modality. Different modalities can be handled simultaneously because the method interpolates the level set contour rather than the image intensities. This framework offers improved robustness to noise in the images, and can segment sparse volumes by integrating the shape of the objects in the gaps.
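
As a toy illustration of the interpolation step, the sketch below fits SciPy's thin-plate-spline RBF (an assumed stand-in for the paper's RBF choice) to signed implicit-function samples on two widely spaced slices, then evaluates the function in the gap between them.

```python
# Toy illustration: the implicit function phi is known only on two widely
# spaced z-slices, and an RBF interpolant fills the gap so the zero level
# set (the object surface) can be evaluated anywhere. The "object" here
# is a unit sphere; the kernel choice is an assumption for illustration.
import numpy as np
from scipy.interpolate import RBFInterpolator

pts, phi = [], []
for z in (0.0, 2.0):                       # two slices, large gap between
    for x in np.linspace(-2, 2, 9):
        for y in np.linspace(-2, 2, 9):
            pts.append((x, y, z))
            phi.append(np.sqrt(x*x + y*y + z*z) - 1.0)   # signed distance

rbf = RBFInterpolator(np.array(pts), np.array(phi),
                      kernel='thin_plate_spline')

# Query phi in the unsampled gap; a value near zero means the point lies
# close to the interpolated object surface.
print(rbf(np.array([[0.0, 0.0, 1.0]])))    # expected to be ~0
```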

More details and results may be found here.

The method is described in:

  • Adeline Paiement, Majid Mirmehdi, Xianghua Xie, Mark Hamilton, Integrated Segmentation and Interpolation of Sparse Data. IEEE Transactions on Image Processing, Vol. 23, Issue 1, pp. 110-125, 2014.

IReSISD: Integrated Registration, Segmentation and Interpolation of Sparse Data

A new registration method, also based on level sets, has been developed and integrated into the RBF-interpolated level set framework above. The new framework can thus correct misalignments in the data at the same time as it segments and interpolates it. Integrating all three processes of registration, segmentation and interpolation into the same framework allows them to benefit from each other. Notably, registration exploits the shape information provided by the segmentation stage in order to be robust to local minima and to limited intersections between the images of a dataset.

More details and results may be found here.

The method is described in:

  • Adeline Paiement, Majid Mirmehdi, Xianghua Xie, Mark Hamilton, Registration and Modeling from Spaced and Misaligned Image Volumes. Submitted to IEEE Transactions on Image Processing.

The tables in the article are also reported as graphs, covering the stack datasets, the individual slice datasets, and the Jaccard index results.

Published Work

  1. Adeline Paiement, Majid Mirmehdi, Xianghua Xie, Mark Hamilton, Integrated Segmentation and Interpolation of Sparse Data. IEEE Transactions on Image Processing, Vol. 23, Issue 1, pp. 110-125, 2014.
  2. Adeline Paiement, Majid Mirmehdi, Xianghua Xie, Mark Hamilton, Simultaneous level set interpolation and segmentation of short- and long-axis MRI. Proceedings of Medical Image Understanding and Analysis (MIUA) 2010, pp. 267-272, July 2010.

Download Software

The latest version of the code for ISISD and IReSISD can be downloaded here (Version 1.3).

Earlier versions: