H.264 Advanced Video Coding or MPEG-4 Part 10 is fundamental to a growing range of markets such as high definition broadcasting, internet video sharing, mobile video and digital surveillance. This book reflects the growing importance and implementation of H.264 video technology. Offering a detailed overview of the system, it explains the syntax, tools and features of H.264 and equips readers with practical advice on how to get the most out of the standard. Packed with clear examples and illustrations to explain H.264 technology in an accessible and practical way. Covers basic video coding concepts, video formats and visual quality. Explains how to measure and optimise the performance of H.264 and how to balance bitrate, computation and video quality. Analyses recent work on scalable and multi-view versions of H.264, case studies of H.264 codecs and new technological developments such as the popular High Profile extensions. An invaluable companion for developers, broadcasters, system integrators, academics and students who want to master this burgeoning state-of-the-art technology. "[This book] unravels the mysteries behind the latest H.264 standard and delves deeper into each of the operations in the codec. The reader can implement (simulate, design, evaluate, optimize) the codec with all profiles and levels. The book ends with extensions and directions (such as SVC and MVC) for further research." Professor K. R. Rao, The University of Texas at Arlington, co-inventor of the Discrete Cosine Transform
Video coding is complex. YouTube and Netflix use it to deliver great video even at extremely low data rates. Have you ever wondered how they optimize video for low bandwidths? Do technical terms like 'rate distortion optimization', 'predictive coding' or 'adaptive quantization' overwhelm you? Decode To Encode is the only book that answers the hows and whys of elements in AVC (H.264), HEVC (H.265) and VP9. It provides video engineers and students with all the compression fundamentals they need to solve problems, conduct research and serve their customers better. Coming from an experienced video codec engineer and product enthusiast, the book is written in clear language with numerous examples. You will learn about:
- digital video fundamentals and the evolution of codecs;
- spatial and temporal aspects leveraged to achieve compression in block-based video architecture;
- intra and inter coding, GOPs, block partitioning, prediction, transforms, quantization, CABAC, in-loop filtering, rate-distortion optimization and rate control;
- bitrate modes, performance metrics and comparisons;
- emerging topics like per-title encoding, AV1, 360 Video and VR, and encoding with ML.
Why be left behind in today's evolving video landscape? Get the tools you need to understand technical specifications and design video algorithms. Learn the concepts in this book and become a compression expert today. Exude confidence as you walk into your next meeting or start a conversation about video compression.
This book provides a unified approach for the study of constrained Markov decision processes with a finite state space and unbounded costs. Unlike the single controller case considered in many other books, the author considers a single controller with several objectives, such as minimizing delays and loss probabilities and maximizing throughput. It is desirable to design a controller that minimizes one cost objective, subject to inequality constraints on other cost objectives. This framework describes dynamic decision problems arising frequently in many engineering fields. A thorough overview of these applications is presented in the introduction. The book is then divided into three sections that build upon each other.
- Treats joint source and channel decoding in an integrated way
- Gives a clear description of the problems in the field together with the mathematical tools for their solution
- Contains many detailed examples useful for practical applications of the theory to video broadcasting over mobile and wireless networks
Traditionally, cross-layer and joint source-channel coding were seen as incompatible with classically structured networks, but recent advances in theory have changed this situation. Joint source-channel decoding is now seen as a viable alternative to separate decoding of source and channel codes, if the protocol layers are taken into account. A joint source/protocol/channel approach is thus addressed in this book: all levels of the protocol stack are considered, showing how the information in each layer influences the others. This book provides the tools to show how cross-layer and joint source-channel coding and decoding are now compatible with present-day mobile and wireless networks, with a particular application to the key area of video transmission to mobiles. Typical applications are broadcasting or point-to-point delivery of multimedia content, which are very timely in the context of the current development of mobile services such as audio (MPEG-4 AAC) or video (H.263, H.264) transmission using recent wireless transmission standards (DVB-H, DVB-SH, WiMAX, LTE). This cross-disciplinary book is ideal for graduate students, researchers, and more generally professionals working either in signal processing for communications or in networking applications who are interested in reliable multimedia transmission. This book is also of interest to people involved in cross-layer optimization of mobile networks. Its content may provide them with other points of view on their optimization problem, enlarging the set of tools they can use.
Pierre Duhamel is director of research at CNRS/ LSS and has previously held research positions at Thomson-CSF, CNET, and ENST, where he was head of the Signal and Image Processing Department. He has served as chairman of the DSP committee and associate Editor of the IEEE Transactions on Signal Processing and Signal Processing Letters, as well as acting as a co-chair at MMSP and ICASSP conferences. He was awarded the Grand Prix France Telecom by the French Science Academy in 2000. He is co-author of more than 80 papers in international journals, 250 conference proceedings, and 28 patents. Michel Kieffer is an assistant professor in signal processing for communications at the Université Paris-Sud and a researcher at the Laboratoire des Signaux et Systèmes, Gif-sur-Yvette, France. His research interests are in joint source-channel coding and decoding techniques for the reliable transmission of multimedia contents. He serves as associate editor of Signal Processing (Elsevier). He is co-author of more than 90 contributions to journals, conference proceedings, and book chapters.
This book provides developers, engineers, researchers and students with detailed knowledge about the High Efficiency Video Coding (HEVC) standard. HEVC is the successor to the widely successful H.264/AVC video compression standard, and it provides around twice as much compression as H.264/AVC for the same level of quality. The applications for HEVC will not only cover the space of the well-known current uses and capabilities of digital video – they will also include the deployment of new services and the delivery of enhanced video quality, such as ultra-high-definition television (UHDTV) and video with higher dynamic range, wider range of representable color, and greater representation precision than what is typically found today. HEVC is the next major generation of video coding design – a flexible, reliable and robust solution that will support the next decade of video applications and ease the burden of video on world-wide network traffic. This book provides a detailed explanation of the various parts of the standard, insight into how it was developed, and in-depth discussion of algorithms and architectures for its implementation.
Introduction to Digital Audio Coding and Standards provides a detailed introduction to the methods, implementations, and official standards of state-of-the-art audio coding technology. In the book, the theory and implementation of each of the basic coder building blocks is addressed. The building blocks are then fit together into a full coder and the reader is shown how to judge the performance of such a coder. Finally, the authors discuss the features, choices, and performance of the main state-of-the-art coders defined in the ISO/IEC MPEG and HDTV standards and in commercial use today. The ultimate goal of this book is to present the reader with a solid enough understanding of the major issues in the theory and implementation of perceptual audio coders that they are able to build their own simple audio codec. There is no other source available where a non-professional has access to the true secrets of audio coding.
This second edition focuses on audio, image and video data, the three main types of input that machines deal with when interacting with the real world. A set of appendices provides the reader with self-contained introductions to the mathematical background necessary to read the book. The book is divided into three main parts. The first part, From Perception to Computation, introduces methodologies aimed at representing the data in forms suitable for computer processing, especially when it comes to audio and images. The second part, Machine Learning, includes an extensive overview of statistical techniques aimed at addressing three main problems, namely classification (automatically assigning a data sample to one of the classes belonging to a predefined set), clustering (automatically grouping data samples according to the similarity of their properties) and sequence analysis (automatically mapping a sequence of observations into a sequence of human-understandable symbols). The third part, Applications, shows how the abstract problems defined in the second part underlie technologies capable of performing complex tasks such as the recognition of hand gestures or the transcription of handwritten data. Machine Learning for Audio, Image and Video Analysis is suitable for students who want to acquire a solid background in machine learning as well as for practitioners who want to deepen their knowledge of the state of the art. All application chapters are based on publicly available data and free software packages, thus allowing readers to replicate the experiments.