Perceptually Motivated Algorithms for Multimedia

dc.contributor.advisor: Basu, Anup (Computing Science)
dc.contributor.author: Zhang, Shupei
dc.date.accessioned: 2025-05-29T03:23:09Z
dc.date.available: 2025-05-29T03:23:09Z
dc.date.issued: 2024-11
dc.description.abstract: Perceptual factors in vision can facilitate the development of more effective multimedia algorithms. In particular, the wide dynamic range of the human visual system motivates the development of image lighting enhancement algorithms. Image lighting enhancement can be achieved by capturing multiple images with different exposure settings and then reconstructing a final image. However, this approach cannot reveal or predict details in already-captured images. Single-image lighting enhancement is desirable for this scenario, but many challenges remain, including over-enhancement, noise, and color artifacts caused by a lack of understanding of the image content. Image and video compression is another area of multimedia algorithms that can benefit from perceptual factors, such as the foveation mechanism and perceptual quality. As the resolution and image quality of modern cameras have increased, the amount of data produced by computational photography has also surged dramatically. This has created a demand for better image/video compression methods that can reduce the data size without compromising the image quality. In this thesis, four perceptually motivated methods are proposed to address the challenges in single-image lighting enhancement and image/video compression. First, we propose an image lighting enhancement method based on a fusion pyramid, which is a traditional contrast-based fusion approach. Second, we propose a self-attention-based learning strategy to reconstruct a properly exposed image from a single input image. We leverage the self-attention mechanism to model the interdependencies between different locations, and design a generative adversarial network (GAN) with a custom HDR loss function to improve the image quality. Third, we propose a novel video compression method that integrates visual saliency information with foveation to reduce perceptual redundancy. The method subsamples and restores the input image using saliency data, allocating more space to salient regions and less to non-salient ones. Finally, based on the assumption that a group of images can be decomposed into several shared feature matrices, we propose a novel principal component approximation network (PCANet) for image compression. This is the first learning-based method that achieves promising performance while including the size of the network in the bitrate calculation.
dc.identifier.doi: https://doi.org/10.7939/r3-989g-4d98
dc.language.iso: en
dc.rights: This thesis is made available by the University of Alberta Library with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
dc.subject: Perceptual Quality
dc.subject: Image Enhancement
dc.subject: Image/Video Compression
dc.subject: Machine Learning
dc.title: Perceptually Motivated Algorithms for Multimedia
dc.type: http://purl.org/coar/resource_type/c_46ec
thesis.degree.grantor: http://id.loc.gov/authorities/names/n79058482
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy
ual.date.graduation: Fall 2024
ual.department: Department of Computing Science
ual.jupiterAccess: http://terms.library.ualberta.ca/public

Files

Original bundle

Name: Zhang_Shupei_202404_PhD.pdf
Size: 9.63 MB
Format: Adobe Portable Document Format