Contributions to video coding and to the development of the H.264/AVC standard

Video Coding and the H.264/AVC Standard

The constantly growing number and increasing resolution of video signals create a continuing demand for more efficient video compression. The compression of video signals is a key technology for media transmission via broadcast, the Internet, and mobile networks, as well as for communication and storage applications. During the last decade, this development was primarily influenced by the emergence of the H.264/AVC standard (officially called Recommendation ITU-T H.264 / ISO/IEC 14496-10 MPEG-4 Part 10: Advanced Video Coding). H.264/AVC achieves significantly higher compression performance than all previous MPEG-x or H.26x standards. The extensions of H.264/AVC give efficient support for additional functionality such as scalability at the bit stream level as well as stereo and multi-view coding. In the meantime, certainly more than one billion devices have been built that run H.264/AVC.

The following applications are particularly important:

Mobile telephony: H.264/AVC is shipped with the vast majority of mobile phones that support video.
Mobile TV: H.264/AVC is the only video codec used in this application, e.g. in the USA, Japan, Korea, Italy, Malaysia, Qatar, Finland and many more countries.
Mobile video players: Apple’s iPod and Sony’s PlayStation Portable run H.264/AVC on millions of devices.
TV broadcast: H.264/AVC is used in all recent launches of digital TV broadcast, e.g. in France, Norway, Brazil and Estonia. Practically every HDTV broadcast in the USA and Europe is based on H.264/AVC. As an example, DirecTV in the USA is broadcasting more than 1000 HDTV channels, all coded with H.264/AVC. All German HDTV services use H.264/AVC, as do those in most other European countries. Worldwide, almost every IPTV service is based on H.264/AVC.
Blu-ray Disc: H.264/AVC is a mandatory codec in all Blu-ray players, and a large percentage of movies on Blu-ray Discs are coded in H.264/AVC.
Internet video: Apple’s QuickTime, Adobe’s Flash, and Microsoft’s Silverlight and Media Player support H.264/AVC. In particular, Adobe’s Flash is widely used on the Internet, e.g. on YouTube by several hundred million users. Therefore, it can be said that the vast majority of movies sent over the Internet are coded with H.264/AVC.
Video conferencing and Internet chat: All new videoconferencing applications are based on H.264/AVC (e.g. Polycom, Tandberg, LifeSize, Sony) or on its SVC extension (e.g. Vidyo).

Both award winners have contributed significantly to the H.264/AVC standard through their scientific vision, their development of technical solutions, and their participation in the management team responsible for the standard’s development.

In his early work after 1990, Jens-Rainer Ohm showed that the use of hierarchical motion-compensated frame structures in video coding can improve compression performance compared to the conventional frame-recursive and bi-directional structures mostly used at that time. He further showed that, owing to their limited error propagation, hierarchical structures could be used efficiently for scalable coding of video. Such structures are now a core element of H.264/AVC, as well as of its scalable and multi-view extensions.

In the late 1990s, Dr. Ohm, together with his group at HHI, applied methods of depth-controlled view adaptation in the area of stereoscopic / 3D video signal processing and coding, extending methods of image-based rendering to video signals for the first time. It was shown that coding of depth maps in combination with such methods could give significant bit rate savings compared to separate or combined compression of stereo or multi-camera video. Furthermore, a display-adaptive stereo baseline width, which is necessary for an optimum user experience, can easily be achieved by such methods.

Since 2000, Prof. Ohm and his group at RWTH Aachen have been continuously involved in international video standardization, e.g. by proposing transforms with variable block size in H.264/AVC. In his role as MPEG video chair since 2002 and JVT co-chair since 2005, he has actively helped to initiate new developments such as the scalable and multi-view extensions of H.264/AVC, which include many concepts from the early work described above.

From 1992, Dr. Ohm taught image and video compression at TU Berlin, giving the first specialized lecture in Germany on that subject. Through this lecture and the German textbook “Digitale Bildcodierung”, which was written on its basis, many young engineers in the country have been attracted to scientific work in this field.

Thomas Wiegand has been a contributor to the field of video coding from the very beginning of his scientific career. In 1995, as a visiting scholar at the University of California at Santa Barbara, he successfully combined Lagrangian optimization techniques with hybrid video coding. Continuing his research as a PhD student at the University of Erlangen-Nuremberg, he investigated the idea of multiple reference pictures for motion compensation. He iteratively coupled Lagrangian techniques with video codec design for the first time. Wiegand submitted the two main approaches of his PhD thesis to the standardization of H.263 and H.26L (the predecessor of H.264/AVC), where they are still present in the standard and the reference software. His PhD thesis was later published as a book, with a foreword written by Gary J. Sullivan (at the time chair of video standardization in ISO/IEC MPEG and ITU-T VCEG): “This body of work by Thomas Wiegand and Bernd Girod has already proved to have an exceptional degree of influence in the video technology community, and I have personally been in a position to proudly witness much of that influence.”

In 2000, after finishing his dissertation work, Wiegand joined the Heinrich Hertz Institute as chair of the Image Communication group. Since then, he has continued his research in video coding and has successfully extended his areas of work towards multimedia networking, semantic image representations, and coding of 3D visual data.

Together with his team at the HHI, and with particularly strong contributions from Detlev Marpe and Heiko Schwarz, he was in a position to contribute actively to all phases of H.264/AVC standardization:

H.264/AVC – Phase 1: H.26L, the predecessor of H.264/AVC, was developed in ITU-T VCEG, with its starting point in Berlin in 1999. In 2000, Wiegand was appointed as one of the chairs of ITU-T VCEG, and he remains active in this role. He has been one of the driving forces in establishing the collaboration between ITU-T VCEG and ISO/IEC MPEG. The collaboration began with a test carried out by MPEG. The test bit streams representing H.26L were created by Prof. Wiegand’s group and led to H.26L winning the test. Owing to the clear result of the test, the Joint Video Team was established with the task of finalizing H.264/AVC. In February 2002, Prof. Wiegand was appointed as the editor of the H.264/AVC standard, and he remains active in that role to the present day. During this first phase of H.264/AVC (H.26L), Wiegand and his team contributed multi-frame motion-compensated prediction, multi-hypothesis prediction, context-based adaptive binary arithmetic coding (CABAC), tree-based macroblock partitioning, as well as Lagrangian-based coder control for error-free and error-prone environments. In addition, the reference software for H.264/AVC has been hosted by Karsten Sühring of Wiegand’s group and is maintained there to the present day.
H.264/AVC – Phase 2 – FRExt (Fidelity Range Extensions): Wiegand’s colleagues Marpe and Schwarz largely drove this extension of H.264/AVC. Its main purpose was to improve H.264/AVC for high-resolution video. The technologies, which were partially based on the work of Dr. Matthias Wien, a member of Prof. Ohm’s team, were fully developed and brought to standardization by the HHI team. Notably, the High Profile is used for almost all HDTV transmissions and is heavily used in the Blu-ray Disc format. Moreover, as contributor and editor in DVB, Wiegand drafted the corresponding standardization text for HDTV broadcast.
H.264/AVC – Phase 3 – SVC (Scalable Video Coding): This extension is based largely on the contributions of Prof. Wiegand and his group. After the Call for Proposals was won by the proposal of Schwarz, Marpe, and Wiegand, the first model was created based on it. The final standard was adopted with the contributions from the first model. The innovations that were introduced in the HHI proposal include inter-layer prediction of motion vectors and prediction error signals, as well as single-loop motion compensation for scalable decoding and key pictures for SNR scalability. These techniques have dramatically improved the efficiency of scalable video coding, making it a viable candidate for video transmission today. Moreover, Prof. Wiegand and his team, in particular Thomas Schierl, have contributed significantly to standards for transmitting SVC via the H.222.0 / MPEG-2 transport stream and over the Internet using the RTP payload format in the IETF. Schwarz and Wiegand have also heavily influenced the specification for transmission of SVC in DVB systems, which cover applications such as IPTV, satellite TV and mobile TV.
H.264/AVC – Phase 4 – MVC (Multi-view Video Coding): This extension is intended for the transmission of 3DTV with H.264/AVC. The corresponding Call for Proposals was again won by a proposal stemming from Wiegand and his team, in particular Philipp Merkle and Dr. Karsten Müller. Consequently, the first model was again based on the HHI proposal. Except for a few high-level syntax details, the final standard is identical to the original proposal. Today, MVC is the basis for the specification of 3D video in Blu-ray Discs and broadcast.

Prof. Dr. Hans-Joachim Grallert
Heinrich Hertz Institute
Fraunhofer Society, Berlin