Fine-grained Importance for Perceptual Video Compression

Author: Evgenya Pergament

Published: 2023

The proliferation of videos that are consumed by humans over the internet has accelerated the search for better video compression algorithms. Traditional video compression algorithms reduce video file sizes by removing spatial, temporal, and coding redundancies. Because different spatio-temporal regions of a video differ in their relative importance to the human viewer, there is an opportunity to improve video compression algorithms even further by removing perceptual redundancy. However, it is challenging to infer how important different areas are to the viewer, or even to collect such fine-grained information. Indeed, such information is often not used during compression beyond low-level heuristics. In this dissertation, we present a framework that facilitates research into fine-grained subjective importance in compressed videos, which we have used to improve the rate-distortion performance of existing video codecs. The specific contributions of the work presented in this dissertation are threefold: (1) we designed a novel tool, the Perceptual Importance Map collection Tool (PIMTool), an interactive web tool that allows scalable collection of fine-grained perceptual importance by having users interactively paint spatio-temporal maps over encoded videos. While users work with the tool, the videos presented to them are constantly updated based on their painted spatio-temporal maps, showing them the trade-off between increasing the importance of certain areas and decreasing the importance of other areas. The tool also allows users to control the magnitude of the increase or decrease in importance in different areas of the video, resulting in detailed relative importance maps; (2) using PIMTool, we collected a dataset of 178 videos with a total of 14443 frames of human-annotated spatio-temporal importance maps. We call this dataset the Perceptual Importance Map Dataset (PIMD). Via a subjective study, we demonstrate that encoding the videos in our dataset while taking the importance maps into account leads to higher perceptual quality at the same bitrate, with the videos encoded with importance maps preferred 1.8x over the baseline videos; and (3) we used our curated dataset to train a lightweight machine learning model that can predict these spatio-temporal importance regions. We call this model the Perceptual Importance Map Model (PIMM). Our results show that, for the 18 videos in our test set, the importance maps predicted by PIMM lead to videos of higher perceptual quality, preferred 2x over the baseline at the same bitrate.
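The trade-off described in the abstract (raising importance in some areas lowers it in others) can be illustrated with a small sketch. This is not the dissertation's method: the mapping from importance to quantization-parameter (QP) offsets, the function name `importance_to_qp_offsets`, and the `max_offset` parameter are all assumptions made for illustration. The only idea taken from the abstract is that importance is relative, which the sketch models by keeping the offsets zero-mean.

```python
from typing import List


def importance_to_qp_offsets(
    importance: List[List[float]], max_offset: float = 6.0
) -> List[List[float]]:
    """Map a per-block importance map to zero-mean QP offsets.

    Hypothetical scheme: blocks above the mean importance get a negative
    offset (finer quantization, more bits), blocks below get a positive one.
    """
    flat = [v for row in importance for v in row]
    mean = sum(flat) / len(flat)
    # Largest deviation from the mean, used to scale offsets into
    # the range [-max_offset, +max_offset].
    peak = max(abs(v - mean) for v in flat)
    if peak == 0.0:
        # Uniform importance: leave the encoder's QP untouched.
        return [[0.0 for _ in row] for row in importance]
    return [[-max_offset * (v - mean) / peak for v in row] for row in importance]


if __name__ == "__main__":
    # One block painted as important, three left at the default level.
    print(importance_to_qp_offsets([[0.0, 0.0], [0.0, 1.0]]))
```

Because the offsets sum to zero, any block that is given a finer quantizer is paid for by coarser quantization elsewhere, which is one simple way to keep the comparison roughly bitrate-neutral.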


Image and Video Compression

Author: Madhuri A. Joshi

Publisher: CRC Press

Published: 2014-11-17

Total Pages: 242

ISBN-13: 148222822X

Image and video signals require large transmission bandwidth and storage, leading to high costs. The data must be compressed without a loss or with a small loss of quality. Thus, efficient image and video compression algorithms play a significant role in the storage and transmission of data. Image and Video Compression: Fundamentals, Techniques, and Applications explains the major techniques for image and video compression and demonstrates their practical implementation using MATLAB® programs. Designed for students, researchers, and practicing engineers, the book presents both basic principles and real practical applications. In an accessible way, the book covers basic schemes for image and video compression, including lossless techniques and wavelet- and vector quantization-based image compression and digital video compression. The MATLAB programs enable readers to gain hands-on experience with the techniques. The authors provide quality metrics used to evaluate the performance of the compression algorithms. They also introduce the modern technique of compressed sensing, which retains the most important part of the signal while it is being sensed.


Intelligent Image and Video Compression

Author: David Bull

Publisher: Academic Press

Published: 2021-04-07

Total Pages: 610

ISBN-13: 0128203544

Intelligent Image and Video Compression: Communicating Pictures, Second Edition explains the requirements, analysis, design and application of a modern video coding system. It draws on the authors' extensive academic and professional experience in this field to deliver a text that is algorithmically rigorous yet accessible, relevant to modern standards and practical. It builds on a thorough grounding in mathematical foundations and visual perception to demonstrate how modern image and video compression methods can be designed to meet the rate-quality performance levels demanded by today's applications and users, in the context of prevailing network constraints.

"David Bull and Fan Zhang have written a timely and accessible book on the topic of image and video compression. Compression of visual signals is one of the great technological achievements of modern times, and has made possible the great successes of streaming and social media and digital cinema. Their book, Intelligent Image and Video Compression covers all the salient topics ranging over visual perception, information theory, bandpass transform theory, motion estimation and prediction, lossy and lossless compression, and of course the compression standards from MPEG (ranging from H.261 through the most modern H.266, or VVC) and the open standards VP9 and AV-1. The book is replete with clear explanations and figures, including color where appropriate, making it quite accessible and valuable to the advanced student as well as the expert practitioner. The book offers an excellent glossary and as a bonus, a set of tutorial problems. Highly recommended!" --Al Bovik

- An approach that combines algorithmic rigor with practical implementation using numerous worked examples
- Explains how video compression methods exploit statistical redundancies, natural correlations, and knowledge of human perception to improve performance
- Uses contemporary video coding standards (AVC, HEVC and VVC) as a vehicle for explaining block-based compression
- Provides broad coverage of important topics such as visual quality assessment and video streaming


Perceiving Pixels and Bits

Author: Li-Heng Chen

Published: 2022

The use of l_p norms has largely dominated the measurement of distortion in video encoding and of loss in neural networks, owing to their simplicity and analytical properties. However, when used to assess the loss of visual information, these simple norms are not very consistent with human perception. Given the continuously growing demand for online videos, improving the performance of video compression in perceptual ways has become an important yet challenging problem, as humans are the ultimate receivers of visual signals. The main contribution of this thesis is to provide new directions for optimizing components in video workflows, covering hybrid video codecs, resizers, and learned image compression models. The first part of this thesis studies chroma distortions in conventional video compression standards. It is empirically known that human perception is less sensitive to the chroma components, yet this has not been studied as much in the context of video compression. To this end, we carried out a subjective experiment to understand the interplay between luma and chroma distortions. We also found that there is room for reducing bitrate consumption in modern video codecs by creatively increasing the compression factor on chroma channels. Video downsampling is also a crucial module in adaptive streaming scenarios. This thesis introduces a new data-driven downsampling model realized using deep neural networks. Since the layers of convolutional neural networks can only be used to alter the resolutions of their inputs by integer scale factors, we seek new ways to achieve fractional scaling, which is crucial in many video processing applications. The second part of this thesis explores the perceptual aspect of optimizing learning-based lossy image compression models. Although numerous powerful perceptual models have been proposed to predict the perceived quality of a distorted picture, most image quality indexes have never been adopted as deep network loss functions, because they are generally non-differentiable. To address this problem, we propose a new "proximal" approach, called ProxIQA training, to optimize image analysis networks against quantitative perceptual models. We also describe a search-free resizing framework that can further improve the rate-distortion tradeoff of recent learned image compression models. Our approach is simple: compose a pair of differentiable downsampling/upsampling layers that sandwich a neural compression model. To determine resize factors for different inputs, we utilize another neural network jointly trained with the compression model, with the end goal of minimizing the rate-distortion objective. Quantitative simulations and subjective quality studies show that the proposed methods yield significant improvements in coding efficiency. The thesis concludes with some remarks on future directions and open problems.


Video Compression Handbook

Author: Andy Beach

Publisher: Peachpit Press

Published: 2018-06-27

Total Pages: 418

ISBN-13: 0134846729

Video compression is not a new process; however, it is forever evolving. New standards, codecs, and ways of getting the job done are continually being created. Newcomers to video compression and seasoned veterans alike need to know how to harness the tools and use them for specific workflows for broadcast, the Web, Blu-rays, set-top boxes, digital cinema, and mobile devices. Here to guide you through the multitude of formats and confusing array of specifications, Andy Beach and Aaron Owen use a practical, straightforward approach to explaining video compression. After covering the fundamentals of audio and video compression, they explore the current applications for encoding, discuss the common workflows associated with each, and then look at the most common delivery platforms. The book includes examples from the authors’ projects as well as recipes that offer a way to define some of the best practices of video compression today. This invaluable resource gives you proven techniques for delivering video online, or via disc or other devices; clear, straightforward explanations that cut through the jargon; step-by-step instructions for using a wide variety of encoding tools; workflow tips for performing either stand-alone or batch compressions; and insight and advice from top compression professionals sprinkled throughout.


Image and Video Compression for Multimedia Engineering

Author: Yun-Qing Shi

Publisher: CRC Press

Published: 2019-03-07

Total Pages: 636

ISBN-13: 1351578650

The latest edition provides a comprehensive foundation for image and video compression. It covers HEVC/H.265 and future video coding activities, in addition to Internet Video Coding. The book features updated chapters and content, along with several new chapters and sections. It adheres to the current international standards, including the JPEG standard.


Video Compression Demystified

Author: Peter D. Symes

Publisher: McGraw-Hill Professional Publishing

Published: 2001

Total Pages: 372

ISBN-13: 9780071363242

CD-ROM contains: Encoders and decoders for DCT, Wavelet, and Fractal algorithms -- Video samples.


Intelligence Science and Big Data Engineering

Author: Changyin Sun

Publisher: Springer

Published: 2013-11-18

Total Pages: 924

ISBN-13: 3642420575

This book constitutes the thoroughly refereed post-conference proceedings of the 4th International Conference on Intelligence Science and Big Data Engineering, IScIDE 2013, held in Beijing, China, in July/August 2013. The 111 papers presented were carefully peer-reviewed and selected from 390 submissions. Topics covered include information theoretic and Bayesian approaches; probabilistic graphical models; pattern recognition and computer vision; signal processing and image processing; machine learning and computational intelligence; neural networks and neuro-informatics; statistical inference and uncertainty reasoning; bioinformatics and computational biology; and speech recognition and natural language processing.


Rate-Distortion Based Video Compression

Author: Guido M. Schuster

Publisher: Boom Koninklijke Uitgevers

Published: 1996-12-31

Total Pages: 312

ISBN-13: 9780792398509

This is the first book about the rapidly evolving field of operational rate distortion (ORD) based video compression. ORD is concerned with the allocation of available bits among the different sources of information in an established coding framework. Today's video compression standards leave great freedom in the selection of key parameters, such as quantizers and motion vectors. The main distinction among different vendors is in the selection of these parameters, and this book presents a mathematical foundation for this selection process. The book contains a review chapter on video compression, a background chapter on optimal bit allocation and the necessary mathematical tools, such as the Lagrangian multiplier method and Dynamic Programming. These two introductory chapters make the book self-contained and provide a fast way of entering this exciting field. Rate-Distortion Based Video Compression establishes a general theory for the optimal bit allocation among dependent quantizers. The minimum total (average) distortion and the minimum maximum distortion cases are discussed. This theory is then used to design efficient motion estimation schemes, video compression schemes and object boundary encoding schemes. For the motion estimation schemes, the theory is used to optimally trade the reduction of energy in the displaced frame difference (DFD) for the increase in the rate required to encode the displacement vector field (DVF). These optimal motion estimators are then used to formulate video compression schemes which achieve an optimal distribution of the available bit rate among DVF, DFD and segmentation. This optimal bit allocation results in very efficient video coders. In the last part of the book, the proposed theory is applied to the optimal encoding of object boundaries, where the bit rate needed to encode a given boundary is traded for the resulting geometrical distortion. Again, the resulting boundary encoding schemes are very efficient. 
Rate-Distortion Based Video Compression is ideally suited for anyone interested in this booming field of research and development, especially engineers who are concerned with the implementation and design of efficient video compression schemes. It also represents a foundation for future research, since all the key elements needed are collected and presented uniformly. Therefore, it is ideally suited for graduate students and researchers working in this field.
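
As a rough illustration of the Lagrangian multiplier method mentioned in the description, the sketch below allocates bits among independent coding units under a rate budget. It is a simplified stand-in, not the book's algorithm: the book treats dependent quantizers with dynamic programming, whereas this sketch assumes each unit can be optimized on its own. The function names, the operating points, and the upper bound on the multiplier are hypothetical.

```python
from typing import List, Tuple

# A hypothetical operating point for one unit: (rate in bits, distortion).
OperatingPoint = Tuple[float, float]


def pick_points(units: List[List[OperatingPoint]], lam: float) -> List[OperatingPoint]:
    """For a fixed multiplier, each unit independently minimizes D + lam * R."""
    return [min(points, key=lambda p: p[1] + lam * p[0]) for points in units]


def allocate_bits(
    units: List[List[OperatingPoint]], budget: float, iters: int = 50
) -> List[OperatingPoint]:
    """Bisect on lam to find a low-distortion selection within the rate budget."""
    lo, hi = 0.0, 1e6  # assumed search range for the multiplier
    best = pick_points(units, hi)
    if sum(rate for rate, _ in best) > budget:
        raise ValueError("budget is below the lowest rate found")
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        choice = pick_points(units, lam)
        if sum(rate for rate, _ in choice) <= budget:
            best, hi = choice, lam  # feasible: try a smaller lam (more bits)
        else:
            lo = lam  # over budget: penalize rate more heavily
    return best


if __name__ == "__main__":
    units = [
        [(1, 10.0), (2, 4.0), (3, 1.0)],
        [(1, 8.0), (2, 3.0), (3, 2.5)],
    ]
    print(allocate_bits(units, budget=4))
```

The multiplier acts as a price per bit: a larger value pushes every unit toward cheaper operating points, so bisecting on it sweeps out selections along the convex hull of the rate-distortion curve until the budget is met.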