MyWorld: Visual Computing and Visual Communications Research Internships 2025

About

We are excited to announce two funded summer internships, available in summer 2025 and supervised by academics at the Visual Information Lab, University of Bristol. Each intern will work full-time for 7 weeks on cutting-edge research in image and video processing, with support from senior researchers in the group.

These internship projects are supported by MyWorld, a creative technology programme in the UK’s West of England region, funded by £30 million from UK Research and Innovation’s (UKRI) Strength in Places Fund (SIPF).

Eligibility and Assessment

To be eligible for a summer internship, students must meet the following criteria:

  • Be a full-time student at the University of Bristol.
  • Be in their second or penultimate year of study (not in their first or final year).
  • Be able to work in person at the University of Bristol during the internship period.
  • Have a strong interest in postgraduate research, particularly in image and video technology.

In line with the University’s commitment to promoting equity and diversity, we particularly welcome and encourage applications from students whose ethnicity, gender, and/or background are currently underrepresented in our postgraduate community.

Students will be assessed on:

  • Academic record
  • Interest in postgraduate research

Project 1

Title: Implicit video compression based on generative models

Description:
This project will leverage various generative models to efficiently represent and compress standard and immersive video signals. Unlike traditional compression techniques, which rely on explicit encoding and decoding processes, this type of approach is expected to learn a compact, latent representation of video content, and then reconstruct high-quality video frames from this compressed representation. This approach aims to achieve better compression ratios while maintaining high visual fidelity, making it particularly promising for applications in video streaming, storage, and real-time communication.
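
To make the idea concrete, below is a minimal, illustrative sketch (PyTorch) of an implicit neural representation for video in the spirit of NeRV-style codecs (see related works below): a small network is overfitted to one clip so that it maps a normalised frame index to a full frame, and the compressed representation is then the (quantised and entropy-coded) network weights. The class and function names, and the tiny architecture, are assumptions for illustration only, far simpler than the models in the papers below.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyVideoINR(nn.Module):
    # Maps a normalised frame index t in [0, 1] to a full RGB frame.
    def __init__(self, num_freqs=8, hidden=256, height=64, width=64):
        super().__init__()
        self.num_freqs, self.height, self.width = num_freqs, height, width
        self.net = nn.Sequential(
            nn.Linear(2 * num_freqs, hidden), nn.GELU(),
            nn.Linear(hidden, hidden), nn.GELU(),
            nn.Linear(hidden, 3 * height * width), nn.Sigmoid(),  # RGB in [0, 1]
        )

    def encode_time(self, t):
        # Fourier features of the frame index let the MLP fit fast motion.
        freqs = 2.0 ** torch.arange(self.num_freqs) * math.pi
        angles = t[:, None] * freqs[None, :]
        return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)

    def forward(self, t):
        return self.net(self.encode_time(t)).view(-1, 3, self.height, self.width)

def encode(video, epochs=2000, lr=1e-3):
    # video: [T, 3, H, W] float tensor in [0, 1]. "Encoding" = overfitting the
    # network to this one clip; a real codec would then quantise and
    # entropy-code the weights to form the bitstream.
    model = TinyVideoINR(height=video.shape[2], width=video.shape[3])
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    t = torch.linspace(0.0, 1.0, video.shape[0])
    for _ in range(epochs):
        opt.zero_grad()
        F.mse_loss(model(t), video).backward()
        opt.step()
    return model  # decoding = model(t) at any frame index
```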

Related works:
[1] Kwan, Ho Man, et al. “HiNeRV: Video compression with hierarchical encoding-based neural representation.” NeurIPS 2023. [Paper]
[2] Gao, Ge, et al. “PNVC: Towards Practical INR-based Video Compression.” arXiv:2409.00953, 2024. [Paper]
[3] Blattmann, Andreas, et al. “Align your latents: High-resolution video synthesis with latent diffusion models.” CVPR 2023. [Paper]

Supervisor:
Please contact Dr Aaron Zhang (fan.zhang@bristol.ac.uk) for any inquiries.

Project 2

Title: Zero-shot learning for video denoising

Description:
This project aims to develop a video denoising framework based on zero-shot learning techniques, eliminating the need for conventional noisy-clean training pairs. By leveraging deep learning models that can generalise from unrelated data, the project seeks an innovative denoising framework that can effectively improve video quality under a variety of conditions without task-specific training examples. This approach not only promises significant advances in video processing technology but also extends potential applications in real-time broadcasting, surveillance, and content creation, where optimal video clarity is essential.
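
As a starting point, the sketch below (PyTorch) follows the Zero-Shot Noise2Noise idea from reference [1] below in heavily simplified form: two half-resolution sub-images, obtained by averaging the two diagonals of each 2x2 block of a single noisy frame, act as a noisy training pair, so no clean data and no external dataset are needed. The residual-consistency term and the exact network of the paper are omitted; all names here are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pair_downsample(img):
    # img: [B, C, H, W] -> two half-resolution images that average the two
    # diagonals of each 2x2 block, giving two independently-noisy views.
    c = img.shape[1]
    k1 = torch.tensor([[[[0.5, 0.0], [0.0, 0.5]]]]).repeat(c, 1, 1, 1)
    k2 = torch.tensor([[[[0.0, 0.5], [0.5, 0.0]]]]).repeat(c, 1, 1, 1)
    return (F.conv2d(img, k1, stride=2, groups=c),
            F.conv2d(img, k2, stride=2, groups=c))

class TinyDenoiser(nn.Module):
    def __init__(self, channels=3, width=48):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, channels, 1),
        )

    def forward(self, x):
        return x - self.body(x)  # predict the noise, then subtract it

def denoise_zero_shot(noisy, steps=1000, lr=1e-3):
    # noisy: [1, C, H, W] float tensor; the network trains on this frame alone.
    model = TinyDenoiser(channels=noisy.shape[1])
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    d1, d2 = pair_downsample(noisy)
    for _ in range(steps):
        opt.zero_grad()
        # Symmetric Noise2Noise loss between the two sub-images.
        loss = 0.5 * (F.mse_loss(model(d1), d2) + F.mse_loss(model(d2), d1))
        loss.backward()
        opt.step()
    with torch.no_grad():
        return model(noisy)
```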

Related works:
[1] Y. Mansour and R. Heckel. “Zero-Shot Noise2Noise: Efficient Image Denoising without any Data.” CVPR 2023. [Paper]
[2] Y. Shi, et al. “ZERO-IG: Zero-Shot Illumination-Guided Joint Denoising and Adaptive Enhancement for Low-Light Images.” CVPR 2024. [Paper]

Supervisor:
Please contact Dr Pui Anantrasirichai (n.anantrasirichai@bristol.ac.uk) for any inquiries.

Application

  1. Submit your [Application Form] by 31 January 2025.
  2. Shortlisted candidates will be interviewed by 14 February 2025.
  3. Successful students will be notified by 28 February 2025.
  4. Successful students will be provided with an internship acceptance form, to be returned by 14 March 2025, confirming the information required by TSS for registration.

Payment

Students will be paid the National Living Wage for the duration of the internship (£12.21 per hour in 2025), which equates to approximately £427 for a 35-hour week before any National Insurance or income tax deductions. Please note that payment is made a month in arrears: students will be paid at the end of each month for the hours worked in that month.

Terrain analysis for biped locomotion

Numerous scenarios exist where it is necessary or advantageous to classify surface material at a distance from a moving, forward-facing camera. Examples include the use of image-based sensors for assessing and predicting terrain type in the control or navigation of autonomous vehicles. In many real scenarios the upcoming terrain may not be flat: it may be oblique, and vehicles may need to change speed and gear to ensure safe, smooth motion.

Blur-robust texture features

Videos captured with moving cameras, particularly those attached to biped robots, often exhibit blur due to incorrect focus or slow shutter speed. Blurring generally alters the spatial and frequency characteristics of the content, which may reduce the performance of a classifier. Robust texture features are therefore developed to deal with this problem. [Matlab Code]
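
As a rough illustration of the idea (not the published method, which uses an undecimated dual-tree complex wavelet transform), the sketch below computes texture descriptors from the magnitudes of standard dual-tree complex wavelet subbands using the open-source Python dtcwt package. The global normalisation at the end is one simple, assumed way to reduce sensitivity to the overall energy loss caused by blur.

```python
import numpy as np
import dtcwt

def texture_features(image, nlevels=4):
    # image: 2-D float array (grayscale texture patch).
    transform = dtcwt.Transform2d()
    pyramid = transform.forward(image, nlevels=nlevels)
    feats = []
    for level in pyramid.highpasses:          # each level: [H, W, 6] complex
        mags = np.abs(level)                  # magnitudes are shift-insensitive
        for k in range(mags.shape[-1]):       # 6 orientation subbands
            feats.append(mags[..., k].mean())
            feats.append(mags[..., k].std())
    feats = np.asarray(feats)
    # Crude normalisation so overall energy loss from blur matters less.
    return feats / (np.linalg.norm(feats) + 1e-8)
```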

Terrain classification from body-mounted cameras during human locomotion

A novel algorithm for terrain-type classification, based on monocular video captured from the viewpoint of human locomotion, is introduced. A texture-based algorithm classifies the path ahead into multiple groups that can be used to support terrain classification. Gait is taken into account in two ways. First, for key-frame selection: when regions with homogeneous texture characteristics are updated, the frequency variations of the textured surface are analysed and used to adaptively define filter coefficients. Second, gait is incorporated in the parameter-estimation process, where probabilities of path consistency are employed to improve terrain-type estimation [Matlab Code]. The figures below show the proposed process of terrain classification for tracked regions, together with a result. [PDF]

Label 1 (green), Label 2 (red) and Label 3 (blue) correspond to areas classified as hard surfaces, soft surfaces and unwalkable areas, respectively. The size of each circle indicates the classification probability: a bigger circle implies higher confidence.
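
The path-consistency idea can be illustrated with a simple Bayesian filter: each tracked region keeps a belief over terrain labels, and a transition prior encodes the fact that the upcoming path rarely changes type between key frames. The sketch below is purely illustrative; the label set matches the caption above, but the transition matrix and classifier scores are assumed values, not those used in the paper.

```python
import numpy as np

LABELS = ["hard", "soft", "unwalkable"]
# Prior that the upcoming path rarely switches terrain type between key frames.
TRANSITION = np.array([[0.90, 0.05, 0.05],
                       [0.05, 0.90, 0.05],
                       [0.05, 0.05, 0.90]])

def update_belief(belief, likelihood):
    # belief: current P(label) for a tracked region; likelihood: per-frame
    # texture-classifier scores. A one-step Bayesian filter update.
    predicted = TRANSITION @ belief
    posterior = predicted * likelihood
    return posterior / posterior.sum()

belief = np.full(3, 1.0 / 3.0)                 # uninformative start
for likelihood in [np.array([0.5, 0.3, 0.2]),  # illustrative classifier outputs
                   np.array([0.7, 0.2, 0.1])]:
    belief = update_belief(belief, likelihood)
print(LABELS[int(np.argmax(belief))], belief.round(3))
```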

Planar orientation estimation by texture

The gradient of a road or terrain influences the appropriate speed and power of a vehicle traversing it, so gradient prediction is necessary if autonomous vehicles are to optimise their locomotion. A novel texture-based method for estimating the orientation of planar surfaces, under the basic assumption of homogeneity, has been developed for scenarios where only a single image source exists, including cases where a region of interest is too far away to employ a depth-estimation technique.
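
A toy version of the underlying texture-gradient cue (not the complex-wavelet method of the paper) can be sketched as follows: under the homogeneity assumption, the dominant local spatial frequency of the texture increases with distance, so its trend across image rows carries information about the surface slant.

```python
import numpy as np

def row_frequency_profile(image, band=8):
    # Dominant horizontal spatial frequency for each band of `band` image rows.
    profile = []
    for r in range(0, image.shape[0] - band, band):
        strip = image[r:r + band, :]
        spectrum = np.abs(np.fft.rfft(strip - strip.mean(), axis=1)).mean(axis=0)
        freqs = np.fft.rfftfreq(strip.shape[1])
        profile.append(freqs[np.argmax(spectrum[1:]) + 1])  # skip the DC term
    return np.asarray(profile)

def slant_indicator(image):
    # Slope of dominant frequency vs. row index: roughly zero for a
    # fronto-parallel texture, increasingly positive as the plane recedes.
    profile = row_frequency_profile(image)
    rows = np.arange(len(profile))
    return np.polyfit(rows, profile, 1)[0]
```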

References

  • Terrain classification from body-mounted cameras during human locomotion. N. Anantrasirichai, J. Burn and D. Bull. IEEE Transactions on Cybernetics. [PDF] [Matlab Code]
  • Projective image restoration using sparsity regularization. N. Anantrasirichai, J. Burn and D. Bull. ICIP 2013. [PDF] [Matlab Code]
  • Robust texture features for blurred images using undecimated dual-tree complex wavelets. N. Anantrasirichai, J. Burn and D. Bull. ICIP 2014. [PDF] [Matlab Code]
  • Orientation estimation for planar textured surfaces based on complex wavelets. N. Anantrasirichai, J. Burn and D. Bull. ICIP 2014. [PDF]
  • Robust texture features based on undecimated dual-tree complex wavelets and local magnitude binary patterns. N. Anantrasirichai, J. Burn and D. Bull. ICIP 2015. [PDF]

Mitigating the effects of atmospheric turbulence on surveillance imagery

Various types of atmospheric distortion can influence the visual quality of video signals during acquisition. Typical distortions include fog or haze, which reduce contrast, and atmospheric turbulence due to temperature variations or aerosols. The effect of temperature variation is observed as a change in the interference pattern of the refracted light, causing unclear, unsharp, wavering images of objects. This makes the acquired imagery difficult to interpret.

This project introduced a novel method for mitigating the effects of atmospheric distortion on observed images, particularly airborne turbulence, which can severely degrade a region of interest (ROI). In order to recover accurate detail from objects behind the distorting layer, a simple and efficient frame-selection method is proposed to pick informative ROIs from only good-quality frames. We solve the space-variant distortion problem using region-based fusion built on the Dual-Tree Complex Wavelet Transform (DT-CWT). We also propose an object-alignment method for pre-processing the ROI, since it can exhibit significant offsets and distortions between frames. Simple haze removal is used as the final step. We refer to this algorithm as CLEAR (Complex waveLEt fusion for Atmospheric tuRbulence); for code, please contact me. [PDF] [VIDEOS]
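
The sketch below illustrates two of CLEAR's ingredients in heavily simplified form: sharpness-based frame selection, and coefficient-wise "lucky region" fusion in the DT-CWT domain, here using the open-source Python dtcwt package. The published algorithm additionally performs object alignment, region-based (rather than per-coefficient) fusion, and haze removal, none of which are shown.

```python
import numpy as np
import dtcwt

def select_sharp_frames(frames, keep=0.5):
    # Rank frames by mean gradient magnitude and keep the sharpest fraction.
    scores = [np.abs(np.gradient(f)[0]).mean() + np.abs(np.gradient(f)[1]).mean()
              for f in frames]
    order = np.argsort(scores)[::-1]
    return [frames[i] for i in order[: max(1, int(keep * len(frames)))]]

def fuse_frames(frames, nlevels=4):
    transform = dtcwt.Transform2d()
    pyramids = [transform.forward(f, nlevels=nlevels) for f in frames]
    fused = pyramids[0]
    for level in range(nlevels):
        stack = np.stack([p.highpasses[level] for p in pyramids])  # [N, H, W, 6]
        best = np.argmax(np.abs(stack), axis=0)   # pick the sharpest coefficient
        fused.highpasses[level][...] = np.take_along_axis(
            stack, best[None, ...], axis=0)[0]
    fused.lowpass[...] = np.mean([p.lowpass for p in pyramids], axis=0)
    return transform.inverse(fused)

# restored = fuse_frames(select_sharp_frames(list_of_grayscale_frames))
```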

Atmospherically distorted videos of a static scene

Mirage (256×256 pixels, 50 frames). Left: distorted sequence. Right: restored image. Download PNG


Download other distorted sequences and references [here].

Atmospherically distorted videos of a moving object

Left: Distorted video. Right: Restored video. Download PNG

References

  • Atmospheric turbulence mitigation using complex wavelet-based fusion. N. Anantrasirichai, A. Achim, N. Kingsbury and D. Bull. IEEE Transactions on Image Processing. [PDF] [Sequences] [Code: please contact me]
  • Mitigating the effects of atmospheric distortion using DT-CWT fusion. N. Anantrasirichai, A. Achim, D. Bull and N. Kingsbury. ICIP 2012. [PDF] [BibTeX]
  • Mitigating the effects of atmospheric distortion on video imagery: A review. University of Bristol, 2011. [PDF]
  • Mitigating the effects of atmospheric distortion. University of Bristol, 2012. [PDF]

What’s on TV: A Large-Scale Quantitative Characterisation of Modern Broadcast Video Content

Video databases, used for benchmarking and evaluating the performance of new video technologies, should represent the full breadth of consumer video content. The parameterisation of video databases using low-level features has proven to be an effective way of quantifying the diversity within a database. However, without a comprehensive understanding of the importance and relative frequency of these features in the content people actually consume, the utility of such information is limited. Conducted in collaboration with the BBC, “What’s on TV” is a large-scale analysis of the low-level features present in contemporary broadcast video. The project aims to establish an efficient set of features that can be used to characterise the spatial and temporal variation in modern consumer content. The meaning and relative significance of this feature set, together with the shape of their frequency distributions, represent highly valuable information for researchers wanting to model the diversity of modern consumer content in representative video databases.
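
As an example of the kind of low-level descriptor involved, the sketch below computes the classic ITU-T P.910 spatial information (SI) and temporal information (TI) measures for a clip. The feature set analysed in the project is considerably broader; this is illustrative only.

```python
import numpy as np
from scipy import ndimage

def spatial_information(frames):
    # SI: max over frames of the std-dev of the Sobel-filtered luma.
    # `frames` is a list of 2-D float luma arrays.
    return max(np.hypot(ndimage.sobel(f, axis=0),
                        ndimage.sobel(f, axis=1)).std() for f in frames)

def temporal_information(frames):
    # TI: max over consecutive frame pairs of the std-dev of the luma difference.
    return max((b - a).std() for a, b in zip(frames, frames[1:]))
```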

Publications:

Felix Mercer Moss, Fan Zhang, Roland Baddeley and David Bull, What’s on TV: A large-scale quantitative characterisation of modern broadcast video content, ICIP 2016.


Electron Microscopy Image Segmentation

David Nam, Judith Mantell, David Bull, Paul Verkade, Alin Achim

The following work presents a graphical user interface (GUI) for the automatic segmentation of granule cores and membranes in transmission electron microscopy images of beta cells. The system is freely available for academic research, and two test images are included. The highlights of our approach are listed below, followed by an illustrative sketch of the region-based segmentation step:

  • A fully automated algorithm for granule segmentation.
  • A novel shape regularizer to promote granule segmentation.
  • A dual region-based active contour for accurate core segmentation.
  • A novel convergence filter for granule membrane verification.
  • A precision of 91% and recall of 87% is observed against manual segmentations.
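
As a heavily simplified stand-in for the core idea (the paper uses a dual region-based active contour with a shape regularizer), the sketch below runs the standard morphological Chan-Vese active contour from scikit-image on a grayscale micrograph. It illustrates region-based contour evolution only, not the published framework.

```python
import numpy as np
from skimage import io, img_as_float
from skimage.segmentation import morphological_chan_vese

def segment_granules(path, iterations=200):
    image = img_as_float(io.imread(path, as_gray=True))
    # Evolve a region-based contour: pixels are split into two groups whose
    # mean intensities best explain the image (e.g. dark cores vs background).
    mask = morphological_chan_vese(image, iterations,
                                   init_level_set="checkerboard", smoothing=3)
    return mask.astype(bool)
```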

Further details can be found in:

D. Nam, J. Mantell, D. Bull, P. Verkade, and A. Achim, “A novel framework for segmentation of secretory granules in electron micrographs,” Med. Image Anal., vol.18, no. 2, pp. 411–424, 2014.


Granule Segmenter Download (Matlab)

Optimal presentation duration for video quality assessment

Video content distributors, codec developers and researchers in related fields often rely on subjective assessments to ensure that their video processing procedures result in satisfactory quality. The current 10-second recommendation for the length of test sequences in subjective video quality assessment studies, however, has recently been questioned. Not only do sequences of this length depart from modern cinematic shooting styles, but the use of shorter sequences would also enable substantial efficiency improvements to the data-collection process. This project therefore aims to explore the impact of different video-sequence lengths upon viewer rating behaviour, and the consequent savings that could be made in time, labour and money.

Publications:

Felix Mercer Moss, Ke Wang, Fan Zhang, Roland Baddeley and David R. Bull, On the optimal presentation duration for subjective video quality assessment, IEEE Transactions on Circuits and Systems for Video Technology, Volume PP, Issue 99, July 2015.

Felix Mercer Moss, Chun-Ting Yeh, Fan Zhang, Roland Baddeley and David R. Bull, Support for reduced presentation durations in subjective video quality assessment, Signal Processing: Image Communication, Volume 48, October 2016, Pages 38-49.