Optimal presentation duration for video quality assessment

Video content distributors, codec developers and researchers in related fields often rely on subjective assessments to ensure that their video processing procedures result in satisfactory quality. The current 10-second recommendation for the length of test sequences in subjective video quality assessment studies, however, has recently been questioned. Not only do sequences of this length depart from modern cinematic shooting styles, but the use of shorter sequences would also enable substantial efficiency improvements to the data collection process. This project therefore aims to explore the impact of test sequence duration upon viewer rating behaviour, and the consequent savings in time, labour and money that shorter sequences could deliver.

Publications:

 

Felix Mercer Moss, Ke Wang, Fan Zhang, Roland Baddeley and David R. Bull, On the optimal presentation duration for subjective video quality assessment, IEEE Transactions on Circuits and Systems for Video Technology, Volume PP, Issue 99, July 2015.

Felix Mercer Moss, Chun-Ting Yeh, Fan Zhang, Roland Baddeley and David R. Bull, Support for reduced presentation durations in subjective video quality assessment, Signal Processing: Image Communication, Volume 48, October 2016, Pages 38-49.

 

Undecimated 2D Dual Tree Complex Wavelet Transforms

Dr Paul Hill, Dr Alin Achim and Professor Dave Bull

This work introduces two undecimated forms of the 2D Dual Tree Complex Wavelet Transform (DT-CWT) which combine the benefits of the Undecimated Discrete Wavelet Transform (exact translational invariance, a one-to-one relationship between all co-located coefficients at all scales) and the DT-CWT (improved directional selectivity and complex subbands).

The Discrete Wavelet Transform (DWT) is a spatial frequency transform that has been used extensively for analysis, denoising and fusion within image processing applications. Although the DWT gives excellent combined spatial and frequency resolution, it suffers from shift variance. Various adaptations of the DWT have been developed to produce shift invariant forms. Exact shift invariance is achieved by the Undecimated Discrete Wavelet Transform (UDWT); however, the UDWT suffers from a considerably overcomplete representation together with a lack of directional selectivity. More recently, the Dual Tree Complex Wavelet Transform (DT-CWT) has offered a more compact representation together with near shift invariance, improved directional selectivity (six directional subbands per scale) and complex valued coefficients that are useful for magnitude/phase analysis within the transform domain. This work introduces two undecimated forms of the DT-CWT which combine the benefits of the UDWT (exact translational invariance and a one-to-one relationship between all co-located coefficients at all scales) with those of the DT-CWT (improved directional selectivity and complex subbands).
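
The shift variance of the critically sampled DWT can be seen directly. The following minimal Matlab sketch (assuming the Wavelet Toolbox and a standard test image) compares detail subband energy before and after a one-pixel shift of the input:

  % Sketch: DWT detail coefficient energy changes under a one-pixel
  % shift of the input, illustrating shift variance.
  x  = double(imread('cameraman.tif'));  % standard test image
  x1 = circshift(x, [0 1]);              % one-pixel horizontal shift
  [~, cH,  ~, ~] = dwt2(x,  'db4');      % horizontal detail subband
  [~, cH1, ~, ~] = dwt2(x1, 'db4');
  fprintf('Relative subband energy change: %.2f%%\n', ...
      100 * abs(norm(cH1(:)) - norm(cH(:))) / norm(cH(:)));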

Figure: The three different forms of the 2D Dual-Tree Complex Wavelet Transform.


 

Matlab code download 

Implementations of the three complex wavelet transforms can be downloaded below as MEX Matlab files. They have been compiled for 32-bit and 64-bit Windows and 64-bit Linux. If you need an alternative format, please email me at paul.hill@bristol.ac.uk. Code updated 16/7/2014.

Please reference the following paper if you use this software:

Hill, P. R., N. Anantrasirichai, A. Achim, M. E. Al-Mualla, and D. R. Bull. “Undecimated Dual-Tree Complex Wavelet Transforms.” Signal Processing: Image Communication 35 (2015): 61-70.

The paper is available here: http://www.sciencedirect.com/science/article/pii/S0923596515000715

A previous paper is here:

Hill, P., Achim, A. and Bull, D., “The Undecimated Dual Tree Complex Wavelet Transform and its application to bivariate image denoising using a Cauchy model,” in Proc. 19th IEEE International Conference on Image Processing (ICIP), 2012, pp. 1205-1208.

 

Matlab Code Usage

Forward Transform: NDxWav2DMEX
Backward Transform: NDixWav2DMEX
Usage:   w = NDxWav2DMEX(x, J, Faf, af, nondecimate);
         y = NDixWav2DMEX(w, J, Fsf, sf, nondecimate);
x,y - 2D arrays
J - number of decomposition levels
Faf{i}: tree i first stage analysis filters 
af{i}:  tree i filters for remaining analysis stages
Fsf{i}: tree i first stage synthesis filters 
sf{i}:  tree i filters for remaining synthesis stages
nondecimate: 0 (default) for the original decimated version, 1 for the completely undecimated version, 2 for decimation of the first level only.
w – wavelet coefficients
w{a}{b}{c}{d} - wavelet coefficients
        a = 1:J (scales)
        b = 1 (real part); b = 2 (imag part)
        c = 1,2; d = 1,2,3 (orientations)
w{J+1}{a}{b} - lowpass coefficients
        a = 1,2; b = 1,2 
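
Given this layout, individual subbands can be inspected directly. For example (a minimal sketch, assuming w has been computed as in the examples below), the complex magnitude and phase of one oriented subband:

  % Sketch: magnitude and phase of oriented subband (c,d) at scale a,
  % using the coefficient layout documented above.
  a = 1; c = 1; d = 2;             % an arbitrary scale and orientation
  re  = w{a}{1}{c}{d};             % real part
  im  = w{a}{2}{c}{d};             % imaginary part
  mag = sqrt(re.^2 + im.^2);       % coefficient magnitude
  ph  = atan2(im, re);             % coefficient phase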
 
 
Example of Usage: 
 
  % Original Decimated Version
  x = rand(256,256);
  J = 4;
  [Faf, Fsf] = AntonB;
  [af, sf] = dualfilt1;
  w = NDxWav2DMEX(x, J, Faf, af, 0);
  y = NDixWav2DMEX(w, J, Fsf, sf, 0);
  err = x - y;
  max(max(abs(err)))
 
  % Undecimated Version 1 (no decimation)
  x = rand(256,256);
  J = 4;
  [Faf, Fsf] = NDAntonB2; %(Must use ND filters for both)
  [af, sf] = NDdualfilt1;
  w = NDxWav2DMEX(x, J, Faf, af, 1);
  y = NDixWav2DMEX(w, J, Fsf, sf, 1);
  err = x - y;
  max(max(abs(err)))
 
  % Undecimated Version 2 (decimation on the first level only)
  x = rand(256,256);
  J = 4;
  [Faf, Fsf] = AntonB; 
  [af, sf] = NDdualfilt1; %(Must use ND filters for just these)
  w = NDxWav2DMEX(x, J, Faf, af, 2);
  y = NDixWav2DMEX(w, J, Fsf, sf, 2);
  err = x - y;
  max(max(abs(err)))
 
% SIZE LIMITS
% (s/2^J) must be bigger than 5 (where s is both height and width)
% Height and width must be divisible by 2^J for fully decimated version
% Height and width must be divisible by 2 for nondecimated version 2
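
As a simple illustration of processing in the transform domain, the sketch below soft-thresholds coefficient magnitudes in the fully undecimated transform. This is illustrative only (the noise level and threshold are arbitrary), not the bivariate Cauchy denoising model of the ICIP paper above:

  % Sketch: soft-threshold complex coefficient magnitudes for denoising.
  x  = double(imread('cameraman.tif'));
  xn = x + 10*randn(size(x));           % additive Gaussian noise
  J  = 4; T = 30;                       % hand-picked threshold
  [Faf, Fsf] = NDAntonB2;
  [af, sf]   = NDdualfilt1;
  w = NDxWav2DMEX(xn, J, Faf, af, 1);
  for a = 1:J
      for c = 1:2
          for d = 1:3
              re  = w{a}{1}{c}{d};  im = w{a}{2}{c}{d};
              mag = sqrt(re.^2 + im.^2);
              g   = max(mag - T, 0) ./ max(mag, eps);  % shrinkage gain
              w{a}{1}{c}{d} = re .* g;
              w{a}{2}{c}{d} = im .* g;
          end
      end
  end
  y = NDixWav2DMEX(w, J, Fsf, sf, 1);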


Perceptual Quality Metrics (PVM)

RESEARCHERS

Dr. Fan (Aaron) Zhang

INVESTIGATOR

Prof. David Bull, Dr. Dimitris Agrafiotis and Dr. Roland Baddeley

DATES

2012-2015

FUNDING

ORSAS and EPSRC

SOURCE CODE 

PVM Matlab code Download.

INTRODUCTION

It is known that the human visual system (HVS) employs independent processes (distortion detection and artefact perception, also often referred to as near-threshold and supra-threshold distortion perception) to assess video quality at different distortion levels. Visual masking effects also play an important role in video distortion perception, especially within spatial and temporal textures.

Figure: Algorithmic diagram for PVM.

It is well known that small differences in textured content can be tolerated by the HVS. In this work, we employ the dual-tree complex wavelet transform (DT-CWT) in conjunction with motion analysis to characterise this tolerance within spatial and temporal textures. The DT-CWT has been found to be particularly powerful in this context due to its shift invariance and orientation selectivity properties. In highly distorted, compressed content, blurring is one of the most commonly occurring artefacts. It is detected in our approach by comparing high frequency subband coefficients from the reference and distorted frames, again facilitated by the DT-CWT, and is motion-weighted in order to simulate the tolerance of the HVS to blurring in content with high temporal activity. Inspired by the previous work of Chandler and Hemami, and of Larson and Chandler, thresholded differences (defined as noticeable distortion) and blurring artefacts are non-linearly combined using a modified geometric mean model, in which the proportion of each component is adaptively tuned. The performance of the proposed video metric is assessed and validated using the VQEG FRTV Phase I and LIVE video databases, and shows clear improvements in correlation with subjective scores over existing metrics such as PSNR, SSIM, VIF, VSNR, VQM and MOVIE, and in many cases over STMAD.
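
The final combination stage can be sketched as follows; this is an illustrative Matlab fragment only, with placeholder values throughout (the released PVM code tunes the proportion adaptively):

  % Sketch: modified geometric mean combining a noticeable-distortion
  % term Dn with a blur term Db. All values here are placeholders; in
  % PVM the proportion alpha is adaptively tuned.
  Dn = 0.30;  Db = 0.12;  alpha = 0.7;
  Q  = (Dn^alpha) * (Db^(1 - alpha));   % combined quality index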

RESULTS

Figure: Scatter plots of subjective DMOS versus different video metrics on the VQEG database.
Figure: Scatter plots of subjective DMOS versus different video metrics on the LIVE video database.

REFERENCE

  1. A Perception-based Hybrid Model for Video Quality Assessment. F. Zhang and D. Bull, IEEE T-CSVT, June 2016.
  2. Quality Assessment Methods for Perceptual Video Compression. F. Zhang and D. Bull, ICIP, Melbourne, Australia, September 2013.

 

Parametric Video Coding

RESEARCHERS

Dr. Fan (Aaron) Zhang

INVESTIGATOR

Prof. David Bull, Dr. Dimitris Agrafiotis and Dr. Roland Baddeley

DATES

2008-2015

FUNDING

ORSAS and EPSRC

INTRODUCTION

In most cases, the target of video compression is to provide good subjective quality rather than simply to produce the pictures most similar to the originals. Based on this observation, it is possible to conceive of a compression scheme in which an analysis/synthesis framework is employed rather than the conventional energy minimisation approach. If such a scheme were practical, it could offer lower bitrates through reduced residual and motion vector coding, using a parametric approach to describe texture warping and/or synthesis.


Instead of encoding whole images or prediction residuals after translational motion estimation, our algorithm employs a perspective motion model to warp static textures and utilises texture synthesis to create dynamic textures. Texture regions are segmented using features derived from the complex wavelet transform and further classified according to their spatial and temporal characteristics. Moreover, a compatible artefact-based video metric (AVM) is proposed with which to evaluate the quality of the reconstructed video; this is also employed in-loop to prevent warping and synthesis artefacts. The proposed algorithm has been integrated into an H.264 video coding framework. The results show significant bitrate savings of up to 60% compared with H.264 at the same objective quality (based on AVM) and subjective scores.
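
For the static-texture path, warping amounts to applying a projective (perspective) transform to each region. A minimal Matlab sketch is given below; it assumes the Image Processing Toolbox, and the model parameters here are invented for illustration (in the codec they are estimated by the motion analysis stage):

  % Sketch: warp a stand-in texture region with a perspective model.
  texRegion = imread('cameraman.tif');      % stand-in texture region
  H = [ 1.02  0.01  1e-5;                   % invented perspective model
       -0.01  0.98  0;                      % parameters; last row holds
        2.0  -1.0   1.0 ];                  % the translation terms
  tform  = projective2d(H);
  warped = imwarp(texRegion, tform, 'OutputView', imref2d(size(texRegion)));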

RESULTS

 

 

REFERENCE

  1. Perception-oriented Video Coding based on Image Analysis and Completion: a Review. P. Ndjiki-Nya, D. Doshkov, H. Kaprykowsky, F. Zhang, D. Bull, T. Wiegand, Signal Processing: Image Communication, July 2012.
  2. A Parametric Framework For Video Compression Using Region-based Texture Models. F. Zhang and D. Bull, IEEE J-STSP, November 2011.

Parametric Video Compression

This project presents a novel means of video compression based on texture warping and synthesis; the coding framework, the artefact-based video metric (AVM) and the bitrate results are described under Parametric Video Coding above.

It is currently a very exciting and challenging time for video compression. The predicted growth in demand for bandwidth, especially for mobile services, will be driven by video applications and is probably greater now than it has ever been. David Bull (VI-Lab), Dimitris Agrafiotis (VI-Lab) and Roland Baddeley (Experimental Psychology) have won a new £600k EPSRC research grant to investigate perceptual redundancy in, and new representations for, digital video content. With EPSRC funding and collaboration with the BBC and Fraunhofer HHI, Berlin, the team will investigate video compression schemes where an analysis/synthesis framework replaces the conventional energy minimisation approach. A preliminary coding framework of this type has been created by Zhang and Bull, in which scene content is modelled using computer graphics techniques to replace target textures at the decoder. This approach is already producing world-leading results and has the potential to create a new content-driven framework for video compression, where region-based parameters are combined with perceptual quality metrics to inform and drive the coding processes.

Published Work

  1. P. Ndjiki-Nya, D. Doshkov, H. Kaprykowsky, F. Zhang, D. Bull and T. Wiegand, Perception-oriented video coding based on image analysis and completion: a review, Signal Processing: Image Communication, Volume 27, Issue 6, July 2012, Pages 579–594. Link
  2. F. Zhang and D. R. Bull, A Parametric Framework for Video Compression Using Region-based Texture Models, IEEE Journal of Selected Topics in Signal Processing (Special Issue), Vol. 5, No. 7, November 2011, pp. 1378-1392. Link
  3. S. Ierodiaconou, J. Byrne, D. R. Bull, D. Redmill and P. Hill, Unsupervised image compression using graphcut texture synthesis, in Proc. 16th IEEE International Conference on Image Processing (ICIP), 2009, pp. 2289-2292.
  4. J. Byrne, S. Ierodiaconou, D. Bull, D. Redmill and P. Hill, Unsupervised image compression-by-synthesis within a JPEG framework, in Proc. 15th IEEE International Conference on Image Processing (ICIP), 2008, pp. 2892-2895.

Marie Skłodowska-Curie Actions : PROVISION

Creating a ‘Visually’ Better Tomorrow

PROVISION is a network of leading academic and industrial organisations in Europe, comprising international researchers working on the problems facing today's video coding technologies. The ultimate goal is to make noteworthy technical advances and further improvements to the existing state-of-the-art techniques for compressing video material.

The project aims not only to enhance broadcast and on-demand video material, but also to produce a new generation of scientists equipped with the research and soft skills needed by industry, academia and society at large. In line with the principles laid down by the Marie Skłodowska-Curie actions of the European Commission, PROVISION is a great example of an ensemble of researchers with varied geographical and academic backgrounds, all channelling their joint effort towards creating a technologically, or more specifically a ‘visually’, better tomorrow.

PROVISION website, PROVISION Facebook page