Video codec
A video codec is a device or software module that enables the use of compression for digital video. The compression usually employs lossy data compression. Historically, video was stored as an analog signal on magnetic tape. Around the time when the compact disc entered the market as a digital-format replacement for analog audio, it became feasible to also begin storing and using video in digital form, and a variety of such technologies began to emerge.
Audio and video call for customized methods of compression. Engineers and mathematicians have tried a number of solutions for tackling this problem.
There is a complex balance between the quality of the video, the quantity of the data needed to represent it (also known as the bit rate), the complexity of the encoding and decoding algorithms, robustness to data losses and errors, ease of editing, random access, the state of the art of compression algorithm design, end-to-end delay, and a number of other factors.
Applications
In daily life, digital video codecs are found in DVD (MPEG-2), VCD (MPEG-1), in emerging satellite and terrestrial broadcast systems, and on the Internet. Online video material is encoded in a variety of codecs, and this has led to the availability of codec packs: pre-assembled sets of commonly used codecs combined with an installer, available as software packages for PCs.
Encoding of media by consumers themselves has seen an upsurge with the availability of DVD writers. Because commercially available DVDs are usually dual-layer, and therefore larger in capacity, while dual-layer DVD writers are not as common, the material often has to be compressed again, sacrificing quality so that it will fit on a single-layer disc.
Video codec design
A typical digital video codec design starts with conversion of camera-input video from RGB color format to YCbCr color format, and often also chroma subsampling to produce a 4:2:0 (or sometimes 4:2:2 in the case of interlaced video) sampling grid pattern. The conversion to YCbCr provides two benefits: 1) it improves compressibility by providing decorrelation of the color signals; and 2) it separates the luma signal, which is perceptually much more important, from the chroma signal, which is less perceptually important and which can be represented at lower resolution (hence the 4:2:0 or 4:2:2 sampling grid use).
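As a rough illustration of the two steps described above, the following Python sketch converts pixels to YCbCr and subsamples the chroma planes to a 4:2:0 grid. The conversion weights are the full-range BT.601 values used by JFIF, which is an assumption made here for concreteness; the function names are hypothetical, and averaging each 2x2 group is only one of several possible subsampling filters.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr.

    The weights are the BT.601 / JFIF values (an assumption of this sketch).
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging each 2x2 group."""
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[i][j] + plane[i][j + 1]
             + plane[i + 1][j] + plane[i + 1][j + 1]) / 4.0
            for j in range(0, width, 2)
        ]
        for i in range(0, height, 2)
    ]


def convert_frame(rgb):
    """Convert a frame given as rows of (r, g, b) tuples.

    Width and height must be even. Returns a full-resolution luma plane
    and two chroma planes at half resolution in each direction.
    """
    ycc = [[rgb_to_ycbcr(*pixel) for pixel in row] for row in rgb]
    luma = [[p[0] for p in row] for row in ycc]
    cb = subsample_420([[p[1] for p in row] for row in ycc])
    cr = subsample_420([[p[2] for p in row] for row in ycc])
    return luma, cb, cr
```

For a 4x4 input this produces 16 luma samples and 4 samples in each chroma plane, 24 samples in total instead of the 48 in the RGB original.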
Some amount of spatial and temporal downsampling may also be used to reduce the raw data rate before the basic encoding process.
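To see why such downsampling matters, the uncompressed data rate can be estimated with simple arithmetic. The sketch below assumes 8-bit samples and 4:2:0 sampling, which averages 12 bits per pixel; the function name and the example picture sizes and frame rates are illustrative choices, not values taken from any particular standard.

```python
def raw_bit_rate(width, height, frames_per_second, bits_per_pixel=12):
    """Uncompressed bit rate in bits per second.

    The default of 12 bits per pixel assumes 8-bit samples with 4:2:0
    chroma subsampling: 8 bits of luma per pixel plus two chroma planes
    at quarter resolution (8 + 2 + 2).
    """
    return width * height * bits_per_pixel * frames_per_second


full = raw_bit_rate(352, 288, 30)        # full picture size and frame rate
quarter = raw_bit_rate(176, 144, 30)     # halved in each spatial dimension
half_rate = raw_bit_rate(352, 288, 15)   # every second picture dropped
```

Halving both spatial dimensions divides the raw rate by four, and halving the frame rate divides it by two, before any actual compression has been applied.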
The input video is then typically broken up into macroblocks, which are 16x16 blocks of luma data accompanied by corresponding blocks of chroma data. Block-wise motion compensation is used to predict the value of the input video from data in previous pictures. A block transform or a subband decomposition is applied to reduce spatial statistical correlation. The most popular such transform is the 8x8 discrete cosine transform (DCT). The output of the transform is then quantized and entropy encoding is applied to the quantized values. When a DCT has been used, the coefficients are typically scanned using a zig-zag scan order, and the entropy coding typically combines a number of consecutive zero-valued quantized coefficients with the value of the next non-zero quantized coefficient into a single symbol, and also has special ways of indicating when all of the remaining quantized coefficient values are equal to zero. The entropy coding method typically uses variable-length coding tables.
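The transform, quantization, scan, and run/value steps described above can be sketched in plain Python as follows. This is a simplified illustration rather than the procedure of any particular standard: it uses an orthonormal DCT-II, a single uniform quantizer step size for every coefficient, and treats the DC coefficient like any other. All function names and the "EOB" (end-of-block) marker are hypothetical, and the final mapping of symbols to variable-length codes is omitted.

```python
import math

N = 8  # block size


def dct_2d(block):
    """Orthonormal 8x8 DCT-II of a block given as a list of rows."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step):
    """Uniform scalar quantization with one step size (a simplification)."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


# Zig-zag order: walk the anti-diagonals, alternating direction on each one.
ZIGZAG = sorted(
    ((i, j) for i in range(N) for j in range(N)),
    key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else -p[0]),
)


def zigzag_scan(levels):
    """Flatten an 8x8 array of quantized levels into zig-zag order."""
    return [levels[i][j] for i, j in ZIGZAG]


def run_level(scanned):
    """Pair each non-zero level with the count of zeros preceding it.

    Trailing zeros are not coded individually; a single "EOB" marker
    signals that all remaining coefficients are zero.
    """
    symbols = []
    run = 0
    for value in scanned:
        if value == 0:
            run += 1
        else:
            symbols.append((run, value))
            run = 0
    symbols.append("EOB")
    return symbols
```

For a completely flat block, only the DC coefficient survives quantization, so the whole block reduces to one (run, value) symbol followed by the end-of-block marker.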
The decoding process consists of performing, to the extent possible, an inversion of each stage of the encoding process. The one stage that cannot be exactly inverted is the quantization stage. There, a best-effort approximation of inversion is performed. This part of the process is often called "inverse quantization" or "dequantization", although quantization is an inherently non-invertible process.
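A minimal sketch shows why quantization cannot be inverted exactly. It assumes a uniform quantizer with a single step size and reconstruction at the level multiplied by the step; the function names are hypothetical, and real codecs use more elaborate rules.

```python
def quantize_value(value, step):
    """Map a coefficient to the index of the nearest multiple of step."""
    return int(round(value / step))


def dequantize_value(level, step):
    """Best-effort reconstruction: the multiple of step that was chosen."""
    return level * step


step = 16
for original in (33.0, 39.0):
    level = quantize_value(original, step)
    restored = dequantize_value(level, step)
    print(original, "->", level, "->", restored)
```

Both inputs map to the same level and are therefore reconstructed as the same value, so the decoder has no way of telling them apart. With this quantizer the reconstruction error is at most half the step size.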
Video codec designs are often standardized, i.e., specified precisely in a published document. However, only the decoding process needs to be standardized to enable interoperability. The encoding process is typically not specified at all in a standard, and implementers are free to design their encoder however they want, as long as the video can be decoded in the specified manner. For this reason, the quality of the video produced by decoding the results of different encoders that use the same video codec standard can vary dramatically from one encoder implementation to another.
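Motion search is one example of an encoder-side decision that is left open in this way: the decoder only needs to be told which motion vectors were chosen, not how they were found. The following sketch shows an exhaustive block-matching search that uses the sum of absolute differences (SAD) as its cost. The function names, block size, and search range are illustrative assumptions, and practical encoders generally use faster search strategies.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in the current picture
    and the block displaced by (dx, dy) in the reference picture."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, size=16, search_range=4):
    """Try every displacement within +/- search_range samples.

    cur and ref are luma planes given as lists of rows. Returns the best
    (dx, dy) motion vector and its SAD cost. Candidates that would read
    outside the reference picture are skipped.
    """
    height, width = len(ref), len(ref[0])
    best_vector = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, size)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            inside = (0 <= by + dy and by + dy + size <= height
                      and 0 <= bx + dx and bx + dx + size <= width)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if cost < best_cost:
                best_vector, best_cost = (dx, dy), cost
    return best_vector, best_cost
```

Two encoders that differ only in this search, for example in its range or in how early it gives up, produce streams that the same decoder can play but that may differ noticeably in quality at a given bit rate.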
Commonly used codecs
A variety of codecs can be implemented with relative ease on PCs and in consumer electronics equipment. It is therefore possible for multiple codecs to be available in the same product, avoiding the need to choose a single dominant codec for compatibility reasons. In the end it seems unlikely that one codec will replace them all. Some widely used video codecs are listed below, starting with a chronological list of the ones specified in international standards.
H.261: Used primarily in older videoconferencing and videotelephony products. H.261, developed by the ITU-T, was the first practical digital video compression standard. Essentially all subsequent standard video codec designs are based on it. It included such well-established concepts as YCbCr color representation, the 4:2:0 sampling format, 8-bit sample precision, 16x16 macroblocks, block-wise motion compensation, 8x8 block-wise discrete cosine transformation, zig-zag coefficient scanning, scalar quantization, run+value symbol mapping, and variable-length coding. H.261 supported only progressive scan video.
MPEG-1 Part 2: Used for Video CDs, and also sometimes for online video. The quality is roughly comparable to that of VHS, although because VCD, unlike VHS, is a digital technology, the quality does not deteriorate with use. If the source video quality is good and the bitrate is high enough, VCD can look better than VHS, but this requires a high bitrate. A fully compliant VCD file, however, should not use a bitrate higher than 1150 kbit/s or a resolution higher than 352 x 288. In practice, exceeding these limits is mainly a problem on some standalone players, such as DVD players. The MPEG-1 standard also defines, in its audio part, the format commonly known as MP3. VCD is among the most widely compatible digital video formats: almost every computer can play it, and very few DVD players do not support it. In terms of technical design, the most significant enhancements in MPEG-1 relative to H.261 were half-pel and bi-predictive motion compensation support. MPEG-1 supported only progressive scan video.
MPEG-2 Part 2 (a common-text standard with H.262): Used on DVD, in another form for SVCD, and in most digital video broadcasting and cable distribution systems. When used on a standard DVD, it offers good picture quality and supports widescreen. When used on SVCD, the quality is not as good but is certainly better than VCD. However, an SVCD holds only around 40 minutes of video on a CD, whereas a VCD holds about an hour. MPEG-2 will also be used on HD DVD and Blu-ray. In terms of technical design, the most significant enhancement in MPEG-2 relative to MPEG-1 was the addition of support for interlaced video. MPEG-2 is now considered an aging codec, but it has tremendous market acceptance and a very large installed base.
H.263: Used primarily for videoconferencing, videotelephony, and internet video. H.263 represented a significant step forward in standardized compression capability for progressive scan video. Especially at low bit rates, it could provide a substantial improvement in the bit rate needed to reach a given level of fidelity.
MPEG-4 Part 2: An MPEG standard that can be used for internet, broadcast, and on storage media. It offers improved quality relative to MPEG-2 and the first version of H.263. Its major technical features beyond prior codec standards consisted of object-oriented coding features and a variety of other such features not necessarily intended for improvement of ordinary video coding compression capability. It also included some enhancements of compression capability, both by embracing capabilities developed in H.263 and by adding new ones such as quarter-pel motion compensation. Like MPEG-2, it supports both progressive scan and interlaced video.
MPEG-4 Part 10 (a technically aligned standard with H.264 and often also referred to as AVC): This emerging standard is the current state of the art of ITU-T and MPEG standardized compression technology, and is rapidly gaining adoption into a wide variety of applications. It contains a number of significant advances in compression capability, and it has recently been adopted into a number of products, including for example the PlayStation Portable, the Nero Digital product suite, and the upcoming Mac OS X v10.4, as well as HD DVD and Blu-ray.
DivX, XviD and 3ivx: Video codec packages based on the MPEG-4 Part 2 video codec, typically used with the *.avi, *.mp4, *.ogm or *.mkv file container formats.
WMV (Windows Media Video): Microsoft's family of video codec designs including WMV 7, WMV 8, and WMV 9. It can do anything from low resolution video for dial up internet users to HDTV. Files can be burnt to CD and DVD or output to any number of devices. It is also useful for Media Centre PCs. WMV can be viewed as an enhancement of the MPEG-4 codec design. The latest generation of WMV is now in the process of being standardized in SMPTE as the draft VC-1 standard.
RealVideo: Developed by RealNetworks. A popular codec technology a few years ago, now fading in importance for a variety of reasons.
Sorenson 3: A codec that is widely used with Apple's QuickTime. Many of the QuickTime movie trailers found on the web use this codec.
Cinepak: A very early codec used by Apple's QuickTime.
All of the codecs above have their qualities and drawbacks. Comparisons are frequently published. The tradeoff between bit rate and fidelity (including artifacts) is usually considered the most important figure of technical merit.