MyWorld: Visual Computing and Visual Communications Research Internships 2025

About

We are excited to announce that two funded summer internships will be available in the summer of 2025, supervised by academics at the Visual Information Lab, University of Bristol. Each intern will work full-time for 7 weeks on cutting-edge research in image and video processing, with support from senior researchers in the group.

These internship projects are supported by MyWorld, a creative technology programme in the UK’s West of England region, funded by £30 million from UK Research and Innovation’s (UKRI) Strength in Places Fund (SIPF).

Eligibility and Assessment

To be eligible for a summer internship, students must meet the following criteria:

  • Be a full-time student at the University of Bristol.
  • Be in their second or penultimate year of study (not in their first or final year).
  • Be able to work in person at the University of Bristol during the internship period.
  • Have a strong interest in postgraduate research, particularly in image and video technology.

In line with the University’s commitment to promoting equity and diversity, we particularly welcome and encourage applications from students whose ethnicity, gender, and/or background are currently underrepresented in our postgraduate community.

Students will be assessed on:

  • Academic record
  • Interest in postgraduate research

Project 1

Title: Implicit video compression based on generative models

Description:
This project will leverage various generative models to efficiently represent and compress standard and immersive video signals. Unlike traditional compression techniques, which rely on explicit encoding and decoding processes, this type of approach is expected to learn a compact, latent representation of video content, and then reconstruct high-quality video frames from this compressed representation. This approach aims to achieve better compression ratios while maintaining high visual fidelity, making it particularly promising for applications in video streaming, storage, and real-time communication.
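
As a rough illustration of the implicit-representation idea (in the spirit of HiNeRV [1] below, but heavily simplified), the sketch overfits a small coordinate-based network to a single clip so that its weights, once quantised and entropy coded, would act as the compressed bitstream. The architecture, positional encoding and training loop are assumptions made for illustration, not a proposed design.

```python
# Minimal sketch (PyTorch) of an implicit neural representation (INR) for video:
# a small MLP is overfitted to one clip so that its weights act as the compressed
# representation. Network sizes and the encoding below are illustrative assumptions.
import torch
import torch.nn as nn

class VideoINR(nn.Module):
    def __init__(self, num_freqs: int = 10, hidden: int = 256):
        super().__init__()
        self.num_freqs = num_freqs
        in_dim = 3 * 2 * num_freqs                 # (x, y, t) -> sin/cos features
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid()     # RGB in [0, 1]
        )

    def encode(self, coords: torch.Tensor) -> torch.Tensor:
        # Fourier positional encoding of normalised (x, y, t) coordinates.
        freqs = 2.0 ** torch.arange(self.num_freqs, device=coords.device) * torch.pi
        angles = coords.unsqueeze(-1) * freqs      # (N, 3, F)
        return torch.cat([angles.sin(), angles.cos()], dim=-1).flatten(1)

    def forward(self, coords: torch.Tensor) -> torch.Tensor:
        return self.mlp(self.encode(coords))

# "Compression" = overfitting the MLP to the clip, then storing its weights.
model = VideoINR()
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
coords = torch.rand(4096, 3)     # random (x, y, t) samples in [0, 1] (placeholder)
target = torch.rand(4096, 3)     # ground-truth RGB values at those samples (placeholder)
for _ in range(100):
    optimiser.zero_grad()
    loss = nn.functional.mse_loss(model(coords), target)
    loss.backward()
    optimiser.step()
```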

Related works:
[1] Kwan, Ho Man, et al., “HiNeRV: Video compression with hierarchical encoding-based neural representation”, NeurIPS 2023. [Paper]
[2] Gao, Ge, et al., “PNVC: Towards Practical INR-based Video Compression”, arXiv:2409.00953, 2024. [Paper]
[3] Blattmann, Andreas, et al., “Align your latents: High-resolution video synthesis with latent diffusion models”, CVPR 2023. [Paper]

Supervisor:
Please contact Dr. Aaron Zhang (fan.zhang@bristol.ac.uk) for any inquiries.

Project 2

Title: Zero-shot learning for video denoising

Description:
This project aims to develop a video denoising method based on zero-shot learning, eliminating the need for conventional noisy-clean training pairs. By leveraging deep learning models that can generalise from unrelated data, the project seeks to build an innovative denoising framework that effectively improves video quality under a variety of conditions without prior task-specific examples. This approach not only promises significant advances in video processing technology but also extends potential applications in real-time broadcasting, surveillance, and content creation, where optimal video clarity is essential.
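
One zero-shot strategy, loosely following reference [1] below, is sketched here: two sub-sampled views of the same noisy frame form a noisy/noisy training pair, so a small network can be fitted to a single input without any clean data. The pairing rule and network are simplified assumptions rather than the project's intended method.

```python
# Minimal sketch (PyTorch) of a zero-shot denoising set-up: two sub-sampled copies
# of the same noisy frame act as a noisy/noisy training pair, so no clean data is
# needed. The pairing rule and the tiny network are illustrative assumptions.
import torch
import torch.nn as nn

def noisy_pair(frame: torch.Tensor):
    # frame: (C, H, W). Interleaved pixel decimation gives two half-resolution
    # views whose noise realisations are (approximately) independent.
    return frame[:, 0::2, 0::2], frame[:, 1::2, 1::2]

denoiser = nn.Sequential(                    # deliberately tiny network
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)

noisy = torch.rand(3, 256, 256)              # stand-in for one noisy video frame
a, b = noisy_pair(noisy)
optimiser = torch.optim.Adam(denoiser.parameters(), lr=1e-3)
for _ in range(200):                         # fit to this single frame only
    optimiser.zero_grad()
    loss = nn.functional.mse_loss(denoiser(a.unsqueeze(0)), b.unsqueeze(0))
    loss.backward()
    optimiser.step()

clean_estimate = denoiser(noisy.unsqueeze(0)).squeeze(0)
```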

Related works:
[1] Y. Mansour and R. Heckel, “Zero-Shot Noise2Noise: Efficient Image Denoising without any Data”, CVPR 2023. [Paper]
[2] Y. Shi, et al., “ZERO-IG: Zero-Shot Illumination-Guided Joint Denoising and Adaptive Enhancement for Low-Light Images”, CVPR 2024. [Paper]

Supervisor:
Please contact Dr Pui Anantrasirichai (n.anantrasirichai@bristol.ac.uk) for any inquiries.

Application

  1. Submit your [Application Form] by 31 January 2025.
  2. Shortlisted candidates will be interviewed by 14 February 2025.
  3. Successful students will be notified by 28 February 2025.
  4. Students will be provided with an internship acceptance form to confirm the information required by TSS for registration, by 14 March 2025.

Payment

Students will be paid the National Living Wage for the duration of the internship (£12.21 per hour in 2025), which equates to approximately £427 for a 35-hour week before any National Insurance or income tax deductions. Please note that payment is made a month in arrears, meaning students will be paid at the end of each month for the hours worked in that month.

Learning-optimal Deep Visual Compression

David Bull, Fan Zhang and Paul Hill

INTRODUCTION

Deep Learning systems offer state-of-the-art performance in image analysis, outperforming conventional methods. Such systems offer huge potential across military and commercial domains, including human/target detection and recognition, and spatial localization/mapping. However, heavy computational requirements limit their exploitation in surveillance applications, particularly airborne ones, where low-power embedded processing and limited bandwidth are common constraints.

Our aim is to explore deep learning performance whilst reducing processing and communication overheads, by developing learning-optimal compression schemes trained in conjunction with detection networks.
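
As a hedged sketch of what training a codec in conjunction with a detection network could look like, the snippet below combines an estimated rate term, a pixel-fidelity term and a downstream detection loss into a single objective; the names `codec` and `detector`, their call signatures, and the weighting values are hypothetical.

```python
# Sketch of a task-aware training objective: the learned codec is optimised not
# only for rate and pixel fidelity but also for the loss of a downstream detector
# applied to the decoded frames. All component names and weights are assumptions.
import torch

def task_aware_loss(original, targets, codec, detector,
                    lambda_distortion=1.0, lambda_task=0.1):
    reconstructed, estimated_bits = codec(original)              # learned encoder/decoder (assumed)
    distortion = torch.nn.functional.mse_loss(reconstructed, original)
    task_loss = detector(reconstructed, targets)                 # detection loss on decoded frames (assumed)
    rate = estimated_bits.mean()                                 # bits-per-pixel estimate
    return rate + lambda_distortion * distortion + lambda_task * task_loss
```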

ACKNOWLEDGEMENT

This work has been funded by the DASA Advanced Vision 2020 Programme.

MyWorld set to make South West a Digital Media Leader on Global Stage

Image credit: Nick Smith Photography

We are pleased to announce that a University of Bristol initiative, led by VI Lab’s Professor David Bull, has been awarded £30 million by the UK Research and Innovation Strength in Places Fund.

The South West is on track to become an international trailblazer in screen-based media thanks to £30 million funding from UKRI, with a further £16m coming from an alliance of more than 30 industry and academic partners joining forces in the five-year scheme due to start by the end of the year. This will launch a creative media powerhouse called MyWorld and supercharge economic growth, generating more than 700 jobs.

It will forge dynamic collaborations between world-leading academic institutions and creative industries to progress research and innovation, creative excellence, inclusive cultures, and knowledge sharing.

Professor David Bull commented, “The South West is already a creative capital in the UK and MyWorld aims to position the region amongst the best in the world, driving inward investment, increasing productivity and delivering important employment and training opportunities.

“This is the beginning of an exciting journey, which will align research and development endeavours across technology and the creative arts, to help businesses realise their innovation potential, raise their international profile, and maximise the advantages of new technologies.”

The MyWorld Bristol team of investigators has representation from across Engineering, Psychology, Arts and Management and includes: Professor Andrew Calway, Professor Dimitra Simeonidou, Professor Mary Luckhurst, Professor Iain Gilchrist, Professor Martin Parker and Professor Kirsten Cater.

The full press release is available on the University of Bristol news page.

For an insight into MyWorld and the Strength in Places Fund, watch this video.

A Simulation Environment for Drone Cinematography

Fan Zhang, David Hall, Tao Xu, Stephen Boyle and David Bull

INTRODUCTION

Simulations of drone camera platforms based on actual environments have been identified as useful for shot planning, training and rehearsal for both single and multiple drone operations. This is particularly relevant for live events, where there is only one opportunity to get it right on the day.

In this context, we present a workflow for the simulation of drone operations exploiting realistic background environments constructed within Unreal Engine 4 (UE4). Methods for environmental image capture, 3D reconstruction (photogrammetry) and the creation of foreground assets are presented along with a flexible and user-friendly simulation interface. Given the geographical location of the selected area and the camera parameters employed, the scanning strategy and its associated flight parameters are first determined for image capture. Source imagery can be extracted from virtual globe software or obtained through aerial photography of the scene (e.g. using drones). The latter case is clearly more time consuming but can provide enhanced detail, particularly where coverage of virtual globe software is limited.
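
As an indicative example of how flight parameters can be derived from the camera parameters, the sketch below computes the ground footprint, ground sampling distance (GSD) and line/shot spacing from a pinhole camera model and the desired image overlaps; the default values and function name are assumptions for illustration, not those used by the tool.

```python
# Hedged sketch: scanning-strategy flight parameters from camera parameters.
# Ground footprint and GSD follow from the pinhole model; line/shot spacing
# follow from the requested side and forward overlaps. Defaults are illustrative.
def flight_parameters(altitude_m, focal_length_mm, sensor_w_mm, sensor_h_mm,
                      image_w_px, forward_overlap=0.8, side_overlap=0.6):
    footprint_w = altitude_m * sensor_w_mm / focal_length_mm   # across-track coverage (m)
    footprint_h = altitude_m * sensor_h_mm / focal_length_mm   # along-track coverage (m)
    gsd_cm = 100.0 * footprint_w / image_w_px                  # cm per pixel on the ground
    line_spacing = footprint_w * (1.0 - side_overlap)          # distance between flight lines (m)
    shot_spacing = footprint_h * (1.0 - forward_overlap)       # distance between exposures (m)
    return {"gsd_cm": gsd_cm,
            "line_spacing_m": line_spacing,
            "shot_spacing_m": shot_spacing}

# Example: 24 mm lens on a 13.2 x 8.8 mm sensor (5472 px wide) flown at 60 m.
print(flight_parameters(60, 24, 13.2, 8.8, 5472))
```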

The captured images are then used to generate 3D background environment models employing photogrammetry software. The reconstructed 3D models are then imported into the simulation interface as background environment assets together with appropriate foreground object models as a basis for shot planning and rehearsal. The tool supports both free-flight and parameterisable standard shot types along with programmable scenarios associated with foreground assets and event dynamics. It also supports the exporting of flight plans. Camera shots can also be designed to provide suitable coverage of any landmarks which need to appear in-shot. This simulation tool will contribute to enhanced productivity, improved safety (awareness and mitigations for crowds and buildings), improved confidence of operators and directors, and ultimately enhanced quality of viewer experience.

DEMO VIDEOS

Boat.mp4

Cyclist.mp4

REFERENCES

[1] F. Zhang, D. Hall, T. Xu, S. Boyle and D. Bull, “A Simulation environment for drone cinematography”, IBC 2020.

[2] S. Boyle, M. Newton, F. Zhang and D. Bull, “Environment Capture and Simulation for UAV Cinematography Planning and Training”, EUSIPCO, 2019.

BVI-SR: A Study of Subjective Video Quality at Various Spatial Resolutions

Alex Mackin, Mariana Afonso, Fan Zhang, and David Bull

ABSTRACT

BVI-SR contains 24 unique video sequences at a range of spatial resolutions up to UHD-1 (3840×2160). These sequences were used as the basis for a large-scale subjective experiment exploring the relationship between visual quality and spatial resolution when using three distinct spatial adaptation filters (including a CNN-based super-resolution method). The results demonstrate that while spatial resolution has a significant impact on mean opinion scores (MOS), no significant reduction in visual quality between UHD-1 and HD resolutions is observed for the super-resolution method. A selection of image quality metrics was benchmarked against the subjective evaluations, and analysis indicates that VIF offers the best performance.
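
For context, benchmarking a quality metric against subjective scores is typically done with rank (SROCC) and linear (PLCC) correlation, as sketched below; the arrays are placeholders rather than data from the BVI-SR study.

```python
# Hedged sketch: correlating a quality metric's predictions with MOS.
# SROCC measures monotonic agreement, PLCC linear agreement (the latter is
# often computed after a logistic fit). Values below are placeholders.
import numpy as np
from scipy.stats import pearsonr, spearmanr

mos = np.array([4.2, 3.8, 2.9, 4.6, 3.1])          # subjective scores (placeholder)
metric = np.array([0.92, 0.88, 0.71, 0.95, 0.74])  # e.g. VIF predictions (placeholder)

srocc, _ = spearmanr(metric, mos)
plcc, _ = pearsonr(metric, mos)
print(f"SROCC = {srocc:.3f}, PLCC = {plcc:.3f}")
```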

SOURCE SEQUENCES

DATABASE DOWNLOAD

[DOWNLOAD] subjective data, instructions and related file.

[DOWNLOAD] all videos from University of Bristol Research Data Storage Facility.

[DOWNLOAD] all videos from MS OneDrive. Please fill in a simple registration form to get access. The MS OneDrive verification code will be sent within two days of us receiving the form. Please note that the code may arrive in your spam folder.

If this content is used in a research publication, please give credit to the University of Bristol by referencing the following papers:

[1] A. Mackin, M. Afonso, F. Zhang and D. Bull, “A study of subjective video quality at various spatial resolutions”, IEEE ICIP, 2018.

[2] A. Mackin, M. Afonso, F. Zhang and D. Bull, “BVI-SR Database”, 2020.

BVI-DVC: A Training Database for Deep Video Compression

Di Ma, Fan Zhang and David Bull

INTRODUCTION

Deep learning methods are increasingly being applied in the optimisation of video compression algorithms and can achieve significantly enhanced coding gains compared to conventional approaches. Such approaches often employ Convolutional Neural Networks (CNNs) which are trained on databases with relatively limited content coverage. In this work, a new extensive and representative video database, BVI-DVC, is presented for training CNN-based video compression systems, with specific emphasis on machine learning tools that enhance conventional coding architectures, including spatial resolution and bit depth up-sampling, post-processing and in-loop filtering. BVI-DVC contains 800 sequences at various spatial resolutions from 270p to 2160p and has been evaluated on ten existing network architectures for four different coding tools. Experimental results show that this database produces significant improvements in coding gains over three existing (commonly used) image/video training databases under the same training and evaluation configurations. The overall additional coding improvements achieved by using the proposed database, across all tested coding modules and CNN architectures, are up to 10.3% based on PSNR assessment and 8.1% based on VMAF.
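
As a hedged illustration of how such a database might be used to train one of these coding tools, the sketch below forms compressed/original patch pairs for a post-processing CNN; the patch size, architecture and tensors are placeholder assumptions, not the networks evaluated in this work.

```python
# Hedged sketch: training pairs for a CNN post-processing filter. Each decoded
# (e.g. HEVC-coded) frame is paired with its original, and the network learns to
# reduce coding artefacts. Model, patch size and data below are placeholders.
import torch
import torch.nn as nn

def sample_patch(compressed: torch.Tensor, original: torch.Tensor, size: int = 96):
    # compressed/original: (C, H, W) tensors of the same frame after/before coding.
    _, h, w = compressed.shape
    top = torch.randint(0, h - size + 1, (1,)).item()
    left = torch.randint(0, w - size + 1, (1,)).item()
    return (compressed[:, top:top + size, left:left + size],
            original[:, top:top + size, left:left + size])

post_filter = nn.Sequential(               # stand-in for a deeper architecture
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, 3, padding=1),
)
compressed = torch.rand(3, 1080, 1920)     # decoded frame (placeholder)
original = torch.rand(3, 1080, 1920)       # corresponding source frame (placeholder)
x, y = sample_patch(compressed, original)
loss = nn.functional.l1_loss(post_filter(x.unsqueeze(0)), y.unsqueeze(0))
```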

SOURCE EXAMPLES

PERFORMANCE

This database has been compared to three other commonly used datasets for training ten popular network architectures employed in four different CNN-based coding modules (in the context of HEVC). The figure below shows the average coding gains, in terms of BD-rate, on the JVET test sequences relative to the original HEVC.
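
For reference, BD-rate figures such as these are conventionally computed with the Bjøntegaard method: cubic fits of log-rate against PSNR for the anchor and test codecs are integrated over the overlapping quality range, and the averaged difference gives the relative bitrate change. The sketch below uses placeholder rate-distortion points, not results from this study.

```python
# Hedged sketch of a common Bjøntegaard delta rate (BD-rate) implementation.
# A negative result means the test codec needs less bitrate for the same quality.
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    p1 = np.polyfit(psnr_anchor, np.log(rate_anchor), 3)   # log-rate as a cubic in PSNR
    p2 = np.polyfit(psnr_test, np.log(rate_test), 3)
    lo = max(min(psnr_anchor), min(psnr_test))              # overlapping quality interval
    hi = min(max(psnr_anchor), max(psnr_test))
    int1 = np.polyval(np.polyint(p1), hi) - np.polyval(np.polyint(p1), lo)
    int2 = np.polyval(np.polyint(p2), hi) - np.polyval(np.polyint(p2), lo)
    avg_diff = (int2 - int1) / (hi - lo)
    return (np.exp(avg_diff) - 1.0) * 100.0                  # percentage bitrate change

# Placeholder rate (kbps) / PSNR (dB) points for anchor and test codecs.
anchor = ([1000, 2000, 4000, 8000], [32.0, 35.0, 38.0, 41.0])
test = ([900, 1800, 3600, 7200], [32.2, 35.3, 38.4, 41.5])
print(f"BD-rate: {bd_rate(anchor[0], anchor[1], test[0], test[1]):.2f}%")
```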

USEFUL LINKS

[DOWNLOAD] all videos of this database.

[README] before using the database and for copyright permissions.

If there is any issue regarding this database, please contact fan.zhang@bristol.ac.uk

REFERENCE

If this content is used in a research publication, please give credit to the University of Bristol by referencing:

[1] Di Ma, Fan Zhang and David Bull, “BVI-DVC: A Training Database for Deep Video Compression”, arXiv:2003.13552, 2020.

[2] Di Ma, Fan Zhang and David Bull, “BVI-DVC”, 2020.