Learning-optimal Deep Visual Compression

David Bull, Fan Zhang and Paul Hill

INTRODUCTION

Deep learning systems offer state-of-the-art performance in image analysis, outperforming conventional methods. Such systems have huge potential across military and commercial domains, in tasks including human/target detection and recognition, and spatial localisation/mapping. However, their heavy computational requirements limit their exploitation in surveillance applications, particularly airborne ones, where low-power embedded processing and limited bandwidth are common constraints.

Our aim is to explore deep learning performance whilst reducing processing and communication overheads, by developing learning-optimal compression schemes trained in conjunction with detection networks.

ACKNOWLEDGEMENT

This work has been funded by the DASA Advanced Vision 2020 Programme.

A Simulation Environment for Drone Cinematography

Fan Zhang, David Hall, Tao Xu, Stephen Boyle and David Bull

INTRODUCTION

Simulations of drone camera platforms based on actual environments have been identified as being useful for shot planning, training and rehearsal for both single and multiple drone operations. This is particularly relevant for live events, where there is only one opportunity to get it right on the day.

In this context, we present a workflow for the simulation of drone operations exploiting realistic background environments constructed within Unreal Engine 4 (UE4). Methods for environmental image capture, 3D reconstruction (photogrammetry) and the creation of foreground assets are presented, along with a flexible and user-friendly simulation interface. Given the geographical location of the selected area and the camera parameters employed, the scanning strategy and its associated flight parameters are first determined for image capture. Source imagery can be extracted from virtual globe software or obtained through aerial photography of the scene (e.g. using drones). The latter option is clearly more time-consuming but can provide enhanced detail, particularly where the coverage of virtual globe software is limited.
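As a rough illustration of how flight parameters can follow from camera parameters, the sketch below applies the standard pinhole-camera footprint relations to a simple back-and-forth grid scan. It is not the workflow's actual planner: the `Camera` class, the `plan_grid_scan` function, the default overlap values, and the assumptions of a downward-pointing camera over flat terrain are all hypothetical choices made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Camera:
    """Hypothetical camera description (not taken from the source)."""
    focal_length_mm: float
    sensor_width_mm: float
    sensor_height_mm: float
    image_width_px: int


def plan_grid_scan(camera, altitude_m, area_width_m, area_length_m,
                   front_overlap=0.8, side_overlap=0.7):
    """Estimate grid-scan parameters for a nadir camera over flat terrain.

    The overlap defaults are illustrative assumptions only.
    """
    # Ground footprint of one image, from similar triangles (pinhole model).
    footprint_w = camera.sensor_width_mm * altitude_m / camera.focal_length_mm
    footprint_h = camera.sensor_height_mm * altitude_m / camera.focal_length_mm

    # Ground sampling distance: ground distance covered by one pixel.
    gsd_cm = 100.0 * footprint_w / camera.image_width_px

    # Spacing between flight lines and between successive exposures.
    line_spacing = footprint_w * (1.0 - side_overlap)
    photo_spacing = footprint_h * (1.0 - front_overlap)

    # Number of lines and exposures needed to span the area edge to edge.
    num_lines = math.ceil(area_width_m / line_spacing) + 1
    photos_per_line = math.ceil(area_length_m / photo_spacing) + 1

    return {
        "footprint_width_m": footprint_w,
        "footprint_height_m": footprint_h,
        "gsd_cm_per_px": gsd_cm,
        "line_spacing_m": line_spacing,
        "photo_spacing_m": photo_spacing,
        "num_lines": num_lines,
        "photos_per_line": photos_per_line,
    }


if __name__ == "__main__":
    # Example values are made up for illustration.
    cam = Camera(focal_length_mm=8.8, sensor_width_mm=13.2,
                 sensor_height_mm=8.8, image_width_px=5472)
    for name, value in plan_grid_scan(cam, 100.0, 400.0, 250.0).items():
        print(f"{name}: {value}")
```

Flying lower reduces the ground sampling distance (finer detail) but shrinks the footprint, so more flight lines and exposures are needed to cover the same area.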

The captured images are then used to generate 3D background environment models using photogrammetry software. The reconstructed 3D models are imported into the simulation interface as background environment assets, together with appropriate foreground object models, as a basis for shot planning and rehearsal. The tool supports both free-flight and parameterisable standard shot types, along with programmable scenarios associated with foreground assets and event dynamics. It also supports the export of flight plans. Camera shots can be designed to provide suitable coverage of any landmarks which need to appear in-shot. This simulation tool will contribute to enhanced productivity, improved safety (awareness and mitigations for crowds and buildings), improved confidence of operators and directors and, ultimately, enhanced quality of viewer experience.

DEMO VIDEOS

Boat.mp4

Cyclist.mp4

REFERENCES

[1] F. Zhang, D. Hall, T. Xu, S. Boyle and D. Bull, “A Simulation Environment for Drone Cinematography”, IBC, 2020.

[2] S. Boyle, M. Newton, F. Zhang and D. Bull, “Environment Capture and Simulation for UAV Cinematography Planning and Training”, EUSIPCO, 2019.

BVI-SR: A Study of Subjective Video Quality at Various Spatial Resolutions

Alex Mackin, Mariana Afonso, Fan Zhang, and David Bull

ABSTRACT

BVI-SR contains 24 unique video sequences at a range of spatial resolutions up to UHD-1 (3840×2160). These sequences were used as the basis for a large-scale subjective experiment exploring the relationship between visual quality and spatial resolution when using three distinct spatial adaptation filters (including a CNN-based super-resolution method). The results demonstrate that, while spatial resolution has a significant impact on mean opinion scores (MOS), there is no significant reduction in visual quality between UHD-1 and HD resolutions for the super-resolution method. A selection of image quality metrics was benchmarked against the subjective evaluations, and analysis indicates that VIF offers the best performance.
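Benchmarking a quality metric against subjective scores commonly involves a rank correlation between the metric's outputs and the MOS values. The sketch below computes the Spearman rank-order correlation coefficient (SROCC) in plain Python. The abstract does not say which statistics were used in the study, so the choice of SROCC is an assumption, and the function names and sample numbers are made up for illustration.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson linear correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def srocc(metric_scores, mos):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(metric_scores), average_ranks(mos))


if __name__ == "__main__":
    # Made-up metric outputs and MOS values, one pair per test video.
    metric_scores = [0.62, 0.71, 0.55, 0.90, 0.80]
    mos = [48.0, 63.0, 41.0, 85.0, 70.0]
    print(f"SROCC = {srocc(metric_scores, mos):.3f}")
```

A value close to 1 means the metric orders the test videos in nearly the same way as the viewers did, which is how one metric can be said to outperform another on a given database.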

SOURCE SEQUENCES

DATABASE DOWNLOAD

[DOWNLOAD] subjective data, instructions and related file.

[DOWNLOAD] all videos from University of Bristol Research Data Storage Facility.

If you use this content in a research publication, please give credit to the University of Bristol by referencing the following:

[1] A. Mackin, M. Afonso, F. Zhang and D. Bull, “A study of subjective video quality at various spatial resolutions”, IEEE ICIP, 2018.

[2] A. Mackin, M. Afonso, F. Zhang and D. Bull, “BVI-SR Database”, 2020.

BVI-DVC: A Training Database for Deep Video Compression

Di Ma, Fan Zhang and David Bull

ABSTRACT

Deep learning methods are increasingly being applied in the optimisation of video compression algorithms and can achieve significantly enhanced coding gains, compared to conventional approaches. Such approaches often employ Convolutional Neural Networks (CNNs) which are trained on databases with relatively limited content coverage. In this paper, a new extensive and representative video database, BVI-DVC, is presented for training CNN-based coding tools. BVI-DVC contains 800 sequences at various spatial resolutions from 270p to 2160p and has been evaluated on ten existing network architectures for four different coding tools. Experimental results show that the database produces significant improvements in terms of coding gains over three existing (commonly used) image/video training databases, for all tested CNN architectures under the same training and evaluation configurations.

SOURCE EXAMPLES

[DOWNLOAD] all videos from University of Bristol Research Data Storage Facility.

If you use this content in a research publication, please give credit to the University of Bristol by referencing the following:

[1] Di Ma, Fan Zhang and David Bull, “BVI-DVC: A Training Database for Deep Video Compression”, arXiv:2003.13552, 2020.

[2] Di Ma, Fan Zhang and David Bull, “BVI-DVC”, 2020.

Comparing VVC, HEVC and AV1 using Objective and Subjective Assessments

Fan Zhang, Angeliki Katsenou, Mariana Afonso, Goce Dimitrov and David Bull

ABSTRACT

In this paper, the performance of three state-of-the-art video codecs, High Efficiency Video Coding (HEVC) Test Model (HM), AOMedia Video 1 (AV1) and Versatile Video Coding Test Model (VTM), is evaluated using both objective and subjective quality assessments. Nine source sequences were carefully selected to offer both diversity and representativeness, and different resolution versions were encoded by all three codecs at pre-defined target bitrates. The compression efficiency of the three codecs is evaluated using two commonly used objective quality metrics, PSNR and VMAF. The subjective quality of their reconstructed content is also evaluated through psychophysical experiments. Furthermore, HEVC and AV1 are compared within a dynamic optimization framework (convex hull rate-distortion optimization) across resolutions with a wider bitrate range, using both objective and subjective evaluations. Finally, the computational complexities of the three tested codecs are compared. The subjective assessments indicate that, for the tested versions, there is no significant difference between AV1 and HM, while the tested VTM version shows significant enhancements. The selected source sequences, compressed video content and associated subjective data are available online, offering a resource for compression performance evaluation and objective video quality assessment.

Parts of this work have been presented in the IEEE International Conference on Image Processing (ICIP) 2019 in Taipei and in the Alliance for Open Media (AOM) Symposium 2019 in San Francisco.
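The convex hull rate-distortion idea mentioned in the abstract can be sketched as follows: encode the same content at several resolutions, pool all (bitrate, quality) points, and keep only those on the upper convex hull, so that each bitrate range is served by the best-performing resolution. This is a generic illustration and not the paper's implementation; the function names, the tuple layout and the sample numbers are assumptions made for this example.

```python
def _cross(o, a, b):
    """2D cross product of vectors o->a and o->b in the (bitrate, quality) plane."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def rd_convex_hull(points):
    """Return the upper convex hull of rate-quality points.

    points: iterable of (bitrate, quality, label) tuples, where higher
    quality is better and label identifies the encode (e.g. resolution).
    """
    # Sort by bitrate; at equal bitrate, the higher-quality point comes first.
    ordered = sorted(points, key=lambda p: (p[0], -p[1]))

    # Pareto filter: drop points that cost more bits without adding quality.
    pareto = []
    best_quality = float("-inf")
    for p in ordered:
        if p[1] > best_quality:
            pareto.append(p)
            best_quality = p[1]

    # Monotone-chain upper hull: discard any point lying on or below the
    # straight line joining its neighbours.
    hull = []
    for p in pareto:
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


if __name__ == "__main__":
    # Made-up (bitrate in kbps, quality score, resolution) operating points.
    encodes = [
        (500, 30.0, "540p"), (1000, 34.0, "540p"), (2000, 36.0, "540p"),
        (1500, 34.5, "720p"),
        (1000, 32.0, "1080p"), (2000, 37.5, "1080p"), (4000, 40.0, "1080p"),
    ]
    for bitrate, quality, label in rd_convex_hull(encodes):
        print(f"{bitrate:>5} kbps  quality {quality:5.1f}  {label}")
```

In this toy data the lower resolution wins at low bitrates and the higher resolution takes over as the bitrate grows, which is the behaviour a convex hull selection is meant to capture.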

SOURCE SEQUENCES

DATABASE

[DOWNLOAD] subjective data.

[DOWNLOAD] all videos from University of Bristol Research Data Storage Facility.

If you use this content in a research publication, please give credit to the University of Bristol by referencing the following papers:

[1] A. V. Katsenou, F. Zhang, M. Afonso and D. R. Bull, “A Subjective Comparison of AV1 and HEVC for Adaptive Video Streaming,” 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 2019, pp. 4145-4149.

[2] F. Zhang, A. V. Katsenou, M. Afonso, G. Dimitrov and D. R. Bull, “Comparing VVC, HEVC and AV1 using Objective and Subjective Assessments”, arXiv:2003.10282 [eess.IV], 2020.