BVI-DVC: A Training Database for Deep Video Compression

Di Ma, Fan Zhang and David Bull

ABSTRACT

Deep learning methods are increasingly being applied to the optimisation of video compression algorithms, achieving significantly higher coding gains than conventional approaches. Such methods often employ Convolutional Neural Networks (CNNs), which are typically trained on databases with relatively limited content coverage. In this paper, a new extensive and representative video database, BVI-DVC, is presented for training CNN-based coding tools. BVI-DVC contains 800 sequences at spatial resolutions ranging from 270p to 2160p and has been evaluated on ten existing network architectures for four different coding tools. Experimental results show that, for all tested CNN architectures under the same training and evaluation configurations, the database yields significant improvements in coding gain over three existing (commonly used) image/video training databases.
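As a minimal sketch of how the released sequences might be fed to a CNN training pipeline, the Python snippet below reads luma planes from raw YUV files and cuts them into patches. The folder layout, the 10-bit 4:2:0 format assumption, the four resolution classes, the patch size and all file names here are illustrative assumptions, not part of the official release tools.

```python
from pathlib import Path
import numpy as np

# Hypothetical local layout: one folder of raw YUV sequences.
SEQ_ROOT = Path("BVI-DVC")  # assumed path, not part of the release
RESOLUTIONS = {"2160p": (3840, 2160), "1080p": (1920, 1080),
               "540p": (960, 540), "270p": (480, 270)}

def read_y_plane(yuv_path, width, height, frame_idx=0, bit_depth=10):
    """Read one luma plane from a raw YUV 4:2:0 file (10-bit samples assumed
    to be stored as 16-bit little-endian words)."""
    bytes_per_sample = 2 if bit_depth > 8 else 1
    dtype = np.uint16 if bit_depth > 8 else np.uint8
    frame_bytes = width * height * 3 // 2 * bytes_per_sample
    with open(yuv_path, "rb") as f:
        f.seek(frame_idx * frame_bytes)
        y = np.frombuffer(f.read(width * height * bytes_per_sample), dtype=dtype)
    return y.reshape(height, width)

def luma_patches(plane, patch=96, stride=96):
    """Yield non-overlapping luma patches for CNN training (patch size is arbitrary)."""
    h, w = plane.shape
    for r in range(0, h - patch + 1, stride):
        for c in range(0, w - patch + 1, stride):
            yield plane[r:r + patch, c:c + patch]

# Example with a hypothetical file name:
# y = read_y_plane(SEQ_ROOT / "AVineyard_3840x2160.yuv", *RESOLUTIONS["2160p"])
```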

SOURCE EXAMPLES


[DOWNLOAD] all videos from University of Bristol Research Data Storage Facility.

If this content has been mentioned in a research publication, please give credit to the University of Bristol, by referencing:

[1] Di Ma, Fan Zhang and David Bull, “BVI-DVC: A Training Database for Deep Video Compression”, arXiv:2003.13552, 2020.

[2] Di Ma, Fan Zhang and David Bull, “BVI-DVC”, 2020.


ICME2020 Grand Challenge: Encoding in the Dark

source: El Fuente test sequence, Netflix

Sponsors

The awards will be sponsored by Facebook and Netflix.


Low-light scenes often come with acquisition noise, which not only disturbs viewers but also makes video compression challenging. Such videos are frequently encountered in cinema, whether as an artistic choice or due to the nature of a scene. Other examples include wildlife shots (e.g. Mobula rays at night in Blue Planet II), concerts and shows, surveillance camera footage and more. Inspired by all of the above, we are organising a challenge on encoding videos captured in low light. This challenge aims to identify technology that improves the perceptual quality of compressed low-light videos beyond the state-of-the-art performance of recent coding standards and formats such as HEVC, AV1, VP9 and VVC. It also offers a good opportunity for experts in both video coding and image enhancement to address this problem. A series of subjective tests will form part of the evaluation, and their results can be used to study the trade-off between artistic intent and viewer preference, for example in mystery films and investigation scenes within films.

Participants will be asked to deliver bitstreams at pre-defined maximum target rates for a given set of sequences, a short report describing their contribution, and a software executable for running the proposed method so that the decoded videos can be reconstructed, all by the given deadline. Participants are also encouraged to submit a paper for publication in the proceedings, and the best performers should be prepared to present a summary of the underlying technology during the ICME session. The organisers will cross-validate the submissions and perform subjective tests to rank participants' contributions.

Please find here:


Important Dates:

    • Expression of interest to participate in the Challenge: 29/11/2019, extended to 10/12/2019
    • Availability of test sequences for participants: 01/12/2019, upon registration
    • Availability of anchor bitstreams and software package: 15/12/2019
    • Submission of encoded material: 13/03/2020, extended to 27/03/2020
    • Submission of Grand Challenge Papers: 13/03/2020, extended to 27/03/2020

Host: The challenge will be organised by Dr. Nantheera Anantrasirichai, Dr. Paul Hill, Dr. Angeliki Katsenou, Ms Alexandra Malyugina, and Dr. Fan Zhang, Visual Information Lab, University of Bristol, UK.

Contact: alex.malyugina@bristol.ac.uk

FRQM: A Frame Rate Dependent Video Quality Metric

Fan Zhang, Alex Mackin and David Bull

ABSTRACT

This page introduces an objective quality metric (FRQM) that characterises the relationship between variations in frame rate and perceptual video quality. The proposed method estimates the relative quality of a low frame rate video with respect to its higher frame rate counterpart through temporal wavelet decomposition, subband combination and spatiotemporal pooling. FRQM was tested alongside six commonly used quality metrics (two of which explicitly relate frame rate variation to perceptual quality) on the publicly available BVI-HFR video database, which spans a diverse range of scenes and frame rates up to 120 fps. Results show that FRQM offers a significant improvement over all other tested quality assessment methods, with relatively low complexity.
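To make the pipeline above more concrete, here is a minimal, heavily simplified sketch of the idea in Python; the released Matlab code below implements the actual metric. The sketch assumes the low frame rate video has been temporally upsampled by frame repetition to the reference frame rate, uses a plain temporal Haar decomposition and combines subbands with equal weights, none of which matches the exact FRQM formulation.

```python
import numpy as np

def temporal_haar_decompose(frames, levels=3):
    """Haar wavelet decomposition along the time axis of a (T, H, W) array.
    Returns the list of highpass subbands (finest first) and the final lowpass."""
    low = np.asarray(frames, dtype=np.float64)
    highs = []
    for _ in range(levels):
        if low.shape[0] < 2:
            break
        even, odd = low[0::2], low[1::2]
        n = min(len(even), len(odd))
        highs.append((even[:n] - odd[:n]) / np.sqrt(2.0))
        low = (even[:n] + odd[:n]) / np.sqrt(2.0)
    return highs, low

def frame_rate_distortion(ref_frames, test_frames, levels=3):
    """Compare the temporal highpass content of a frame-repeated low frame rate
    video against its high frame rate reference; lower values mean closer."""
    ref_h, _ = temporal_haar_decompose(ref_frames, levels)
    test_h, _ = temporal_haar_decompose(test_frames, levels)
    # Spatiotemporal pooling: mean absolute magnitude difference per subband,
    # then an (equal-weight) average across subbands.
    per_band = [np.mean(np.abs(np.abs(hr) - np.abs(ht)))
                for hr, ht in zip(ref_h, test_h)]
    return float(np.mean(per_band))
```

Frame repetition leaves little energy in the temporal highpass subbands, so the difference in highpass magnitudes captures the temporal detail lost at the lower frame rate.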

PROPOSED ALGORITHM

SOURCE CODE DOWNLOAD

[DOWNLOAD] Matlab code

REFERENCE

[1] Fan Zhang, Alex Mackin and David R. Bull, “A Frame Rate Dependent Video Quality Metric based on Temporal Wavelet Decomposition and Spatiotemporal Pooling”, IEEE ICIP, 2017.


Terrain analysis for biped locomotion

Numerous scenarios exist where it is necessary or advantageous to classify surface material at a distance from a moving, forward-facing camera. Examples include the use of image-based sensors for assessing and predicting terrain type in the control or navigation of autonomous vehicles. In many real scenarios, the upcoming terrain may not be flat but oblique, and vehicles may need to change speed and gear to ensure safe and smooth motion.

Blur-robust texture features

Videos captured with moving cameras, particularly those attached to biped robots, often exhibit blur due to incorrect focus or slow shutter speed. Blurring effects generally alter the spatial and frequency characteristics of the content and this may reduce the performance of a classifier. Robust texture features are therefore developed to deal with this problem. [Matlab Code]
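One way to realise such features, sketched below under loose assumptions, is to compute local binary pattern histograms on the magnitudes of an undecimated wavelet decomposition. A real-valued stationary wavelet ('db2' via PyWavelets) is used here purely for illustration in place of the undecimated dual-tree complex wavelet transform of the published method, and all parameter values are arbitrary.

```python
import numpy as np
import pywt
from skimage.feature import local_binary_pattern

def wavelet_lbp_features(gray, levels=2, wavelet="db2", points=8, radius=1):
    """Illustrative blur-tolerant texture features: LBP histograms computed on
    the magnitudes of an undecimated (stationary) wavelet decomposition."""
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    # swt2 requires image dimensions divisible by 2**levels, so crop.
    h2, w2 = h - h % (2 ** levels), w - w % (2 ** levels)
    coeffs = pywt.swt2(gray[:h2, :w2], wavelet, level=levels)
    features = []
    for _, (ch, cv, cd) in coeffs:
        magnitude = np.sqrt(ch ** 2 + cv ** 2 + cd ** 2)
        # Quantise to 8 bits so the LBP operator sees an integer-valued image.
        scaled = np.uint8(255 * magnitude / (magnitude.max() + 1e-12))
        lbp = local_binary_pattern(scaled, points, radius, method="uniform")
        hist, _ = np.histogram(lbp, bins=points + 2,
                               range=(0, points + 2), density=True)
        features.append(hist)
    return np.concatenate(features)
```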

Terrain classification from body-mounted cameras during human locomotion

A novel algorithm for terrain-type classification based on monocular video captured from the viewpoint of human locomotion is introduced. A texture-based algorithm classifies the path ahead into multiple groups that can be used to support terrain classification. Gait is taken into account in two ways. Firstly, it drives key-frame selection: when regions with homogeneous texture characteristics are updated, the frequency variations of the textured surface are analysed and used to adaptively define filter coefficients. Secondly, it is incorporated into the parameter estimation process, where probabilities of path consistency are employed to improve terrain-type estimation (a sketch of this idea follows the figure below) [Matlab Code]. The figures below show the proposed terrain-classification process for tracked regions and an example result. [PDF]

Label 1 (green), Label 2 (red) and Label 3 (blue) correspond to areas classified as hard surfaces, soft surfaces and unwalkable areas, respectively. The size of each circle indicates probability: a bigger circle implies higher classification confidence.
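As a minimal illustration of the path-consistency idea (not the published estimator), the update below fuses per-frame texture-classifier scores with a simple persistence prior. The `consistency` parameter and the uniform switching model are assumptions made only for this sketch.

```python
import numpy as np

def update_terrain_probabilities(prior, likelihood, consistency=0.8):
    """Fuse per-frame classifier scores with a path-consistency prior.

    prior, likelihood: arrays of shape (n_classes,), each summing to 1.
    consistency: assumed probability that the terrain label persists between
    key frames (a hypothetical value, not taken from the paper).
    """
    n = len(prior)
    # Transition model: stay on the same terrain with probability `consistency`,
    # otherwise switch uniformly to any other class.
    transition = np.full((n, n), (1.0 - consistency) / (n - 1))
    np.fill_diagonal(transition, consistency)
    predicted = transition.T @ prior          # propagate the previous belief
    posterior = predicted * likelihood        # combine with the new observation
    return posterior / posterior.sum()
```

Applying this update frame after frame along the walked path damps spurious single-frame misclassifications while still allowing the label to change when the evidence is persistent.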

Planar orientation estimation by texture

The gradient of a road or terrain influences the appropriate speed and power of a vehicle traversing it, so gradient prediction is necessary if autonomous vehicles are to optimise their locomotion. A novel texture-based method for estimating the orientation of planar surfaces, under the basic assumption of texture homogeneity, has been developed for scenarios in which only a single image source exists, including cases where the region of interest is too distant for depth estimation techniques to be employed.
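To illustrate the underlying texture-gradient cue (not the published complex-wavelet method), the toy function below estimates how the dominant horizontal spatial frequency changes with image height: for a homogeneous texture on a plane slanting away from the camera, the projected frequency increases with distance, so the fitted slope indicates the direction and rough degree of slant. The band count and the frequency estimator are arbitrary choices made for this sketch.

```python
import numpy as np

def texture_gradient(gray, bands=8):
    """Toy single-image texture-gradient measure for a homogeneous planar
    surface: estimate the dominant horizontal frequency in successive
    horizontal bands and fit its trend against image height (0 = bottom)."""
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    heights, freqs = [], []
    for b in range(bands):
        band = gray[b * h // bands:(b + 1) * h // bands]
        spectrum = np.abs(np.fft.rfft(band - band.mean(), axis=1)).mean(axis=0)
        spectrum[0] = 0.0                      # ignore the DC component
        freqs.append(np.argmax(spectrum) / w)  # dominant normalised frequency
        heights.append(1.0 - (b + 0.5) / bands)
    slope, _ = np.polyfit(heights, freqs, 1)
    return slope  # > 0: texture becomes finer towards the top, i.e. the plane recedes
```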

References

  • Terrain classification from body-mounted cameras during human locomotion. N. Anantrasirichai, J. Burn and David Bull. IEEE Transactions on Cybernetics. [PDF] [Matlab Code].
  • Projective image restoration using sparsity regularization. N. Anantrasirichai, J. Burn and David Bull. ICIP 2013. [PDF] [Matlab Code]
  • Robust texture features for blurred images using undecimated dual-tree complex wavelets. N. Anantrasirichai, J. Burn and David Bull. ICIP 2014. [PDF] [Matlab Code]
  • Orientation estimation for planar textured surfaces based on complex wavelets. N. Anantrasirichai, J. Burn and David Bull. ICIP 2014. [PDF]
  • Robust texture features based on undecimated dual-tree complex wavelets and local magnitude binary patterns. N. Anantrasirichai, J. Burn and David Bull. ICIP 2015. [PDF]

Mitigating the effects of atmospheric turbulence on surveillance imagery

Various types of atmospheric distortion can degrade the visual quality of video signals during acquisition. Typical distortions include fog or haze, which reduce contrast, and atmospheric turbulence caused by temperature variations or aerosols. The effect of temperature variation appears as changes in the interference pattern of the refracted light, producing unclear, unsharp and wavering images of objects. This obviously makes the acquired imagery difficult to interpret.

This project introduced a novel method for mitigating the effects of atmospheric distortion on observed images, particularly airborne turbulence, which can severely degrade a region of interest (ROI). In order to recover accurate detail from objects behind the distorting layer, a simple and efficient frame selection method is proposed to pick informative ROIs from only the good-quality frames. We solve the space-variant distortion problem using region-based fusion in the Dual-Tree Complex Wavelet Transform (DT-CWT) domain. We also propose an object alignment method for pre-processing the ROI, since it can exhibit significant offsets and distortions between frames. Simple haze removal is used as the final step. We refer to this algorithm as CLEAR (Complex waveLEt fusion for Atmospheric tuRbulence); for the code, please contact me. [PDF] [VIDEOS]
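A heavily simplified sketch of the frame-selection and fusion stages is given below, assuming the ROI crops have already been co-registered. Sharpness-weighted averaging stands in for the region-based DT-CWT fusion and no haze removal is included, so this illustrates the overall flow rather than CLEAR itself.

```python
import numpy as np
from scipy.ndimage import laplace

def clear_like_fusion(roi_stack, keep_fraction=0.5):
    """Simplified stand-in for CLEAR: select the sharpest ROI frames and fuse
    them by sharpness-weighted averaging.

    roi_stack: array of shape (T, H, W) containing co-registered ROI crops.
    keep_fraction: fraction of frames regarded as "good quality" (arbitrary).
    """
    roi_stack = np.asarray(roi_stack, dtype=np.float64)
    # Frame quality proxy: variance of the Laplacian (higher = sharper).
    sharpness = np.array([laplace(f).var() for f in roi_stack])
    k = max(1, int(len(roi_stack) * keep_fraction))
    best = np.argsort(sharpness)[-k:]                 # indices of the sharpest frames
    weights = sharpness[best] / sharpness[best].sum()
    # Weighted average of the selected frames (in place of DT-CWT fusion).
    fused = np.tensordot(weights, roi_stack[best], axes=1)
    return fused
```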

Atmospheric distorted videos of static scene

Mirage (256×256 pixels, 50 frames). Left: distorted sequence. Right: restored image. Download PNG


Download other distorted sequences and references [here].

Atmospheric distorted videos of moving object

Left: Distorted video. Right: Restored video. Download PNG

References

  • Atmospheric turbulence mitigation using complex wavelet-based fusion. N. Anantrasirichai, Alin Achim, Nick Kingsbury, and David Bull. IEEE Transactions on Image Processing. [PDF] [Sequences] [Code: please contact me]
  • Mitigating the effects of atmospheric distortion using DT-CWT fusion. N. Anantrasirichai, Alin Achim, David Bull, and Nick Kingsbury. In Proceedings of the IEEE International Conference on Image Processing (ICIP 2012). [PDF] [BibTeX]
  • Mitigating the effects of atmospheric distortion on video imagery: A review. University of Bristol, 2011. [PDF]
  • Mitigating the effects of atmospheric distortion. University of Bristol, 2012. [PDF]

Computer Assisted Analysis of Retinal OCT Imaging

Texture-preserving image enhancement for Optical Coherence Tomography

This project developed novel image enhancement algorithms for retinal optical coherence tomography (OCT). OCT images contain a large amount of speckle, which makes them grainy and of very low contrast. To make these images valuable for clinical interpretation, our method removes speckle while preserving the useful information contained in each retinal layer. It starts with multi-scale despeckling based on a dual-tree complex wavelet transform (DT-CWT). The OCT image is further enhanced through a smoothing process that uses a novel adaptive-weighted bilateral filter (AWBF), which has the desirable property of preserving texture within the OCT image layers. The enhanced OCT image is then segmented to extract the inner retinal layers that contain useful information for eye research. Our layer segmentation technique is also performed in the DT-CWT domain. Finally, we developed an OCT/fundus image registration algorithm, which is helpful when the two modalities are used together for diagnosis and information fusion. An illustrative sketch of the despeckling and smoothing steps is given after the figure below.

The figure below shows B-scans of retinal OCT images at the ONH (top) and the macula (bottom). Left: raw OCT images showing grainy texture. Middle: despeckled images using the Cauchy model*. Right: enhanced images using the AWBF. [CODE] [PDF]

[Figure: comparison of raw, despeckled and AWBF-enhanced OCT B-scans]
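Below is a minimal sketch of the despeckle-then-smooth idea, assuming an intensity B-scan as input. A real-valued wavelet shrinkage ('db2' via PyWavelets) stands in for the DT-CWT despeckling, and scikit-image's standard bilateral filter replaces the adaptive-weighted bilateral filter, so the parameters and the result will differ from the published method.

```python
import numpy as np
import pywt
from skimage.restoration import denoise_bilateral

def enhance_oct_bscan(bscan, wavelet="db2", levels=3, shrink=2.0):
    """Illustrative OCT B-scan enhancement: wavelet soft-thresholding of the
    log-intensity image, followed by a standard bilateral filter."""
    img = np.log1p(np.asarray(bscan, dtype=np.float64))   # speckle is multiplicative
    coeffs = pywt.wavedec2(img, wavelet, level=levels)
    # Estimate the noise level from the finest diagonal subband.
    sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    thresholded = [coeffs[0]] + [
        tuple(pywt.threshold(c, shrink * sigma, mode="soft") for c in detail)
        for detail in coeffs[1:]
    ]
    despeckled = pywt.waverec2(thresholded, wavelet)[: img.shape[0], : img.shape[1]]
    despeckled = np.clip(despeckled, 0.0, None)
    # Standard bilateral filter in place of the AWBF (parameters are arbitrary).
    smoothed = denoise_bilateral(despeckled, sigma_color=0.1, sigma_spatial=3)
    return np.expm1(smoothed)
```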

Texture analysis on Ocular imaging for Glaucoma disease regression

The project analysed texture in the OCT image layers for the retinal disease glaucoma. An automated texture-classification method for glaucoma detection has been developed, and a methodology for classification and feature extraction based on robust principal component analysis of texture descriptors was established. In addition, a multi-modal information-fusion technique was developed, which incorporates data from visual field measurements together with OCT and retinal fundus photography. [PDF]
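As an illustration of such a descriptor-reduction-plus-classification pipeline (and not the published system), the hypothetical sketch below uses standard PCA where the paper relies on robust principal component analysis, and an SVM as the classifier; the descriptor contents, component count and kernel are all assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def train_glaucoma_classifier(descriptors, labels, n_components=20):
    """Fit a PCA + SVM pipeline on a texture-descriptor matrix.

    descriptors: (n_samples, n_features) array of per-eye texture descriptors.
    labels: (n_samples,) array, e.g. 0 = healthy, 1 = glaucoma.
    """
    model = make_pipeline(StandardScaler(),
                          PCA(n_components=n_components),
                          SVC(kernel="rbf", probability=True))
    model.fit(descriptors, labels)
    return model
```

Once fitted, `model.predict_proba(new_descriptors)` returns per-eye class probabilities for the two labels.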

References