Learning-optimal Deep Visual Compression

David Bull, Fan Zhang and Paul Hill

INTRODUCTION

Deep learning systems offer state-of-the-art performance in image analysis, outperforming conventional methods. Such systems have huge potential across military and commercial domains, including human/target detection and recognition, and spatial localisation/mapping. However, their heavy computational requirements limit their exploitation in surveillance applications, particularly airborne ones, where low-power embedded processing and limited bandwidth are common constraints.

Our aim is to preserve deep learning performance whilst reducing processing and communication overheads, by developing learning-optimal compression schemes trained in conjunction with detection networks.
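As a rough illustration of this idea, the sketch below combines an estimated rate term, a reconstruction distortion term and a downstream detection loss into a single training objective. It is a minimal sketch under assumed names, tensors and weighting factors; it is not the project's actual implementation.

```python
# Minimal sketch of a joint objective for "learning-optimal" compression:
# the compression network is trained against a weighted sum of an estimated
# bit-rate term, a reconstruction distortion term and the detection loss of
# a network run on the decoded frames. All names and weights here are
# illustrative assumptions.
import torch
import torch.nn.functional as F

def learning_optimal_loss(rate_bpp, recon, original, detection_loss,
                          lambda_rate=0.05, lambda_det=1.0):
    """rate_bpp: estimated bits-per-pixel of the compressed representation;
    recon/original: decoded and source frames; detection_loss: loss of the
    detector evaluated on the decoded frames."""
    distortion = F.mse_loss(recon, original)
    return distortion + lambda_rate * rate_bpp.mean() + lambda_det * detection_loss
```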

ACKNOWLEDGEMENT

This work has been funded by the DASA Advanced Vision 2020 Programme.


BVI-DVC: A Training Database for Deep Video Compression

Di Ma, Fan Zhang and David Bull

INTRODUCTION

Deep learning methods are increasingly being applied in the optimisation of video compression algorithms and can achieve significantly enhanced coding gains compared to conventional approaches. Such approaches often employ Convolutional Neural Networks (CNNs) which are trained on databases with relatively limited content coverage. In this work, a new extensive and representative video database, BVI-DVC, is presented for training CNN-based video compression systems, with specific emphasis on machine learning tools that enhance conventional coding architectures, including spatial resolution and bit depth up-sampling, post-processing and in-loop filtering. BVI-DVC contains 800 sequences at various spatial resolutions from 270p to 2160p and has been evaluated on ten existing network architectures for four different coding tools. Experimental results show that this database produces significant improvements in coding gains over three existing (commonly used) image/video training databases under the same training and evaluation configurations. The overall additional coding gain achieved by using the proposed database, across all tested coding modules and CNN architectures, is up to 10.3% when assessed with PSNR and 8.1% with VMAF.
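As an illustration of how the database might be used for one of the tools named above, the sketch below forms low/high-resolution training pairs from a BVI-DVC frame for a spatial-resolution up-sampling CNN. The patch size and the 2x block-average degradation model are assumptions made for illustration; the actual pipeline pairs frames re-encoded at reduced resolution with their originals.

```python
# Illustrative sketch: build (low-res, high-res) training patch pairs from a
# single high-resolution frame, e.g. one decoded from a BVI-DVC sequence.
import numpy as np

def make_sr_pair(frame_hr: np.ndarray, patch: int = 96, rng=None):
    """frame_hr: HxWxC array in [0, 1]. Returns (low-res, high-res) patches."""
    rng = rng or np.random.default_rng()
    h, w, _ = frame_hr.shape
    y = rng.integers(0, h - patch + 1)
    x = rng.integers(0, w - patch + 1)
    hr = frame_hr[y:y + patch, x:x + patch]
    # 2x block-average downsampling as a simple stand-in degradation model
    lr = hr.reshape(patch // 2, 2, patch // 2, 2, -1).mean(axis=(1, 3))
    return lr, hr

# Example usage with a random stand-in frame (real frames would be read
# from the downloaded database)
lr, hr = make_sr_pair(np.random.rand(540, 960, 3))
```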

SOURCE EXAMPLES

PERFORMANCE

This database has been compared with three other commonly used datasets for training ten popular network architectures, which are employed in four different CNN-based coding modules (in the context of HEVC). The figure below shows the average coding gains, in terms of BD-rate on the JVET test sequences, over the original HEVC.
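For reference, the reported coding gains follow the standard Bjøntegaard delta-rate (BD-rate) methodology: cubic polynomial fits of log-rate against quality (PSNR or VMAF), integrated over the overlapping quality range of the two codecs. The sketch below shows one common way to compute it; the rate/PSNR points in the example are placeholders, not measured results.

```python
# Bjøntegaard delta-rate (BD-rate) sketch: cubic fit of log-rate vs quality,
# integrated over the overlapping quality interval. Negative => bit-rate saving.
import numpy as np

def bd_rate(rate_anchor, q_anchor, rate_test, q_test):
    lr_a, lr_t = np.log(rate_anchor), np.log(rate_test)
    p_a = np.polyfit(q_anchor, lr_a, 3)
    p_t = np.polyfit(q_test, lr_t, 3)
    lo = max(min(q_anchor), min(q_test))
    hi = min(max(q_anchor), max(q_test))
    int_a, int_t = np.polyint(p_a), np.polyint(p_t)
    avg_a = (np.polyval(int_a, hi) - np.polyval(int_a, lo)) / (hi - lo)
    avg_t = (np.polyval(int_t, hi) - np.polyval(int_t, lo)) / (hi - lo)
    return (np.exp(avg_t - avg_a) - 1) * 100

# Placeholder rate (kbps) / PSNR (dB) points for anchor and test codecs
print(bd_rate([1000, 2000, 4000, 8000], [34.0, 36.5, 39.0, 41.5],
              [900, 1800, 3600, 7200], [34.2, 36.8, 39.3, 41.8]))
```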

USEFUL LINKS

[DOWNLOAD] all videos of this database.

[README] before using the database and for copyright permissions.

If there is any issue regarding this database, please contact fan.zhang@bristol.ac.uk

REFERENCE

If this content is used in a research publication, please give credit to the University of Bristol by referencing:

[1] Di Ma, Fan Zhang and David Bull, “BVI-DVC: A Training Database for Deep Video Compression”, arXiv:2003.13552, 2020.

[2] Di Ma, Fan Zhang and David Bull, “BVI-DVC”, 2020.

ICME2020 Grand Challenge: Encoding in the Dark

Source: El Fuente test sequence, Netflix

Sponsors

The awards will be sponsored by Facebook and Netflix.


Low-light scenes often suffer from acquisition noise, which not only disturbs the viewer but also makes video compression challenging. Such videos are often encountered in cinema, whether as an artistic choice or because of the nature of the scene. Other examples include wildlife shots (e.g. Mobula rays at night in Blue Planet II), concerts and shows, surveillance camera footage and more. Inspired by the above, we are organising a challenge on encoding low-light captured videos. This challenge aims to identify technology that improves the perceptual quality of compressed low-light videos beyond the current state-of-the-art performance of the most recent coding standards, such as HEVC, AV1, VP9 and VVC. It also offers a good opportunity for experts in both video coding and image enhancement to address this problem. A series of subjective tests will form part of the evaluation; their results can also be used to study the trade-off between artistic intent and viewer preference, for example in mystery films and investigation scenes.

Participants will be requested to deliver, by the given timeline, bitstreams at pre-defined maximum target rates for a given set of sequences, a short report describing their contribution, and a software executable that runs the proposed methodology and reconstructs the decoded videos. Participants are also encouraged to submit a paper for publication in the proceedings, and the best performers should be prepared to present a summary of the underlying technology during the ICME session. The organisers will cross-validate the submissions and perform subjective tests to rank participant contributions.
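As a simple pre-submission check, participants may wish to verify that each delivered bitstream stays within its maximum target rate. The sketch below does this from file size and sequence duration; the file names, frame counts, frame rates and target rates are placeholders, not the actual challenge configuration.

```python
# Sanity check: compare each bitstream's average bit rate against its
# maximum target rate. All entries below are placeholders.
import os

def bitrate_kbps(path: str, num_frames: int, fps: float) -> float:
    duration_s = num_frames / fps
    return os.path.getsize(path) * 8 / 1000 / duration_s

submissions = [
    # (bitstream file, frames, fps, max target rate in kbps) - placeholders
    ("seq01_rate1.bin", 600, 60.0, 1000.0),
    ("seq01_rate2.bin", 600, 60.0, 3000.0),
]

for path, frames, fps, target in submissions:
    rate = bitrate_kbps(path, frames, fps)
    status = "OK" if rate <= target else "EXCEEDS TARGET"
    print(f"{path}: {rate:.1f} kbps (max {target} kbps) -> {status}")
```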


Important Dates:

    • Expression of interest to participate in the Challenge: 10/12/2019 (extended from 29/11/2019)
    • Availability of test sequences for participants: 01/12/2019, upon registration
    • Availability of anchor bitstreams and software package: 15/12/2019
    • Submission of encoded material: 27/03/2020 (extended from 13/03/2020)
    • Submission of Grand Challenge Papers: 27/03/2020 (extended from 13/03/2020)

Host: The challenge will be organised by Dr. Nantheera Anantrasirichai, Dr. Paul Hill, Dr. Angeliki Katsenou, Ms Alexandra Malyugina, and Dr. Fan Zhang, Visual Information Lab, University of Bristol, UK.

Contact: alex.malyugina@bristol.ac.uk