Intro to Video Codecs
This page is a survey of current video and audio codecs, container file formats, and streaming protocols used for streaming multimedia.
I need to set up embedded videos on my site, as well as on sites I work for. So, today, I started a comprehensive study of video streaming. Here are some learning notes. (The last time I worked with video was from 1995 to 1997. At the time, video streaming was still pretty much science fiction.)
Playing a video in a browser from a site such as YouTube involves several separate technologies. Here's a summary of the basics:
- video codec. The video file must be encoded into a standard format (i.e. sequence of bits). Usually this means using a particular compression scheme. The encoding and decoding algorithm and format is called a codec. (examples are: H.264, VP6, VP8, MPEG-4, WMV 9, DVD-Video, etc)
- audio codec. The audio part for the video must also be encoded, usually treated and stored separately from the video. (examples: MP3, AAC, WAV, AC-3 (Dolby Digital), FLAC, etc)
- multimedia container format. The encoded video and audio are saved in a container file format. That is, a file format that contains video, audio, and other items such as subtitles. (examples: QuickTime's “.mov”, Microsoft's “.avi”, “wmv”, Adobe Flash, DVD, mpeg, etc)
- streaming protocol. The file must be served over a special network protocol, for example by a streaming server, because it is not used like a normal file. For example, you want users to be able to jump to any location and play from there.
- application support. The web browser must have special code or a plugin for movie files, so that, for example, a movie file displays a snapshot of its content even when the movie is not playing. It also has to show user interface elements such as the play/pause buttons, a full screen widget, volume control, etc. (Popularly done with the Adobe Flash plugin, Java, or HTML5's video tag.)
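The first three layers above (container, video codec, audio codec) can be sketched as a small data model. This is only an illustration: the class and field names are made up, and the example pairings are common combinations rather than rules. The Vorbis audio in the WebM example is an assumption of the sketch, not something stated on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MediaFile:
    """One multimedia file: a container wrapping separately encoded streams."""
    container: str               # file format that holds the streams
    video_codec: Optional[str]   # None for audio-only files
    audio_codec: Optional[str]   # None for silent video

    def describe(self) -> str:
        streams = [c for c in (self.video_codec, self.audio_codec) if c]
        return f"{self.container} container holding: {', '.join(streams)}"


# Hypothetical examples of container + codec combinations.
EXAMPLES = [
    MediaFile("MP4", "H.264", "AAC"),
    MediaFile("WebM", "VP8", "Vorbis"),
    MediaFile("Ogg", "Theora", "Vorbis"),
]

if __name__ == "__main__":
    for media in EXAMPLES:
        print(media.describe())
```

The point of the split is that the same codec can appear in different containers, and a player needs support for both the container and each codec inside it.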
Here are the most popular or important video codecs.
H.264 (aka MPEG-4 Part 10, or AVC (Advanced Video Coding)). It is currently the most popular video codec. It is used in Blu-ray Disc, YouTube, iTunes, many nations' broadcasting, and other video related applications.
The first version of H.264 was completed in 2003-05.
VP8 is a competitor to H.264. Currently owned by Google and released as open source.
The associated container format with VP8 is WebM. WebM format is also open source, and is based on the free Matroska container format. WebM file format is competing to be the default video format for HTML Video.
[see HTML Video Tag]
Windows Media Video (WMV) refers to several video codecs from Microsoft, but mostly to the latest, WMV 9 (aka VC-1). WMV 9 was submitted to a standards body (SMPTE) and standardized as VC-1 in 2006.
It is widely supported, and is used in Blu-ray Discs, Xbox 360, and PlayStation 3. It's a competitor to H.264.
Besides WMV 9, there are 2 other codecs: WMV Screen and WMV Image. The Screen one is optimized for screen captures, for example tutorials on using an application. The Image one is optimized for slideshows.
Another widely used one is from Apple's QuickTime. (see below)
Sorenson codec refers to 2 proprietary codecs.
- Sorenson Video (aka Sorenson Video Codec, Sorenson Video Quantizer, SVQ). Used by Apple's QuickTime, but was phased out in ~2005.
- Sorenson Spark (aka Sorenson H.263). Used by Adobe Flash, but was phased out in ~2008.
Theora is a free lossy video codec. Its technical quality is not as good as that of H.264 or VP8. It is based on the VP3 format, a formerly proprietary format that On2 released as free (~2002). It is not widely supported. Theora is usually stored in the Ogg container format, together with the free lossy audio codec Vorbis.
DivX and Xvid
DivX started as an open source project in ~2000 but became proprietary, and Xvid was forked from it. (DivX itself began as a reverse-engineered version of Microsoft's MPEG-4 version 3 codec.) Most pirated movies from DVDs are saved in DivX or Xvid format. Neither defines a new codec or container format; rather, both are based on a subset of the MPEG-4 standard and on existing container formats.
There are dozens of audio codecs; some are lossy, some lossless. Here are some popular ones.
- MP3, from the standard MPEG-1 Audio Layer 3. Most popular. Started the digital music era in late 1990s.
- AAC, lossy. Used in iTunes, iPod, iPhone, etc. Generally sounds better than MP3 at the same bit rate.
- Windows Media Audio (WMA). Microsoft's answer. WMA is part of Microsoft's Windows Media framework. WMA can refer to 4 codecs: WMA, WMA Pro, WMA Lossless, WMA Voice.
- Vorbis. Open source. Typically used together with the Ogg container format. Superior to mp3, and probably inferior to AAC.
- FLAC. Lossless. Typically compresses a music file by about 50%.
- WAV by Microsoft. Started in early 1990s.
- AIFF from Apple. Started in early 1990s.
WAV and AIFF both support compression, but audio stored in these formats is almost always uncompressed.
Audio Formats Do Not Matter Much Anymore
Note that 300 kilobits per second gets you CD-quality audio (using lossy compression), while DVD-quality video needs about 5 megabits per second. That's about 17 times more.
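The arithmetic behind that comparison, as a quick sketch. The two bit rates are the figures quoted above; the per-hour storage numbers are just derived from them (using 1 megabyte = 1,000,000 bytes) and the function name is made up for this example.

```python
AUDIO_KBPS = 300     # lossy audio at roughly CD quality (figure from the text)
VIDEO_KBPS = 5000    # DVD quality video, 5 megabits/s (figure from the text)


def megabytes_per_hour(kbps: float) -> float:
    """Convert a bit rate in kilobits/s to storage in megabytes per hour."""
    bits = kbps * 1000 * 3600      # bits in one hour
    return bits / 8 / 1_000_000    # bits -> bytes -> megabytes


ratio = VIDEO_KBPS / AUDIO_KBPS

if __name__ == "__main__":
    print(f"video/audio bit rate ratio: {ratio:.1f}")
    print(f"audio: {megabytes_per_hour(AUDIO_KBPS):.0f} MB per hour")
    print(f"video: {megabytes_per_hour(VIDEO_KBPS):.0f} MB per hour")
```

So an hour of audio at that rate is on the order of a hundred megabytes, while an hour of video is a couple of gigabytes.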
The need for audio codec research has passed. Computer storage and processing power today can handle audio with no problem, and the use of lossless audio codecs is increasingly popular. So, for movie streaming, the video part is the primary concern.
Multimedia Container Formats
QuickTime (“.mov” or “.qt”) is Apple's container format. Widely used.
AVI is Microsoft's tech, fairly old, started in early 1990s. Widely used.
Advanced Systems Format (ASF) is Microsoft's container format, part of the Microsoft's Windows Media framework.
MPEG-4 Part 14 (.mp4) is a standardized container format. Variant file name extensions are M4V (.m4v) for video, “.m4a” for files containing audio only. The “.m4v” and “.m4a” suffixes are popularized by Apple's iTunes, iPod, iPhone etc.
3GP and 3G2 (.3gp, .3g2) are video/audio container formats for mobile phones. Like MPEG-4 Part 14 (.mp4), they are based on the ISO base media file format (MPEG-4 Part 12).
Adobe Flash is widely used for online video. Flash files have several extension variations; the most popular is “.swf”. (Adobe Flash is an application platform that supports multimedia; Flash by itself isn't just a container format.) Flash files dedicated to video are Flash Video files, a container format with these extensions: “.flv”, “.f4v”, “.f4p”, “.f4a”, “.f4b”. Flash Video is used by, for example, YouTube, Google Video, and “Yahoo! Video”. The FLV format is old; it started in 2002 and is being phased out due to its limitations. FLV is commonly used with older codecs such as Sorenson and VP6. The recommended format is now F4V, created in 2007.
|File Extension||Internet Media Type||Meaning|
|.f4v||video/mp4||Video for Adobe Flash Player|
|.f4p||video/mp4||Protected Video for Adobe Flash Player|
|.f4a||audio/mp4||Audio for Adobe Flash Player|
|.f4b||audio/mp4||Audio Book for Adobe Flash Player|
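The table above is essentially a lookup from file extension to Internet media type, which is what a web server needs when it sets the Content-Type header. A minimal sketch, with the mapping copied from the table; the function name and the fallback type are choices made for this example:

```python
import os

# Extension -> Internet media type, copied from the table above.
F4_MEDIA_TYPES = {
    ".f4v": "video/mp4",
    ".f4p": "video/mp4",
    ".f4a": "audio/mp4",
    ".f4b": "audio/mp4",
}


def media_type_for(filename: str, default: str = "application/octet-stream") -> str:
    """Return the media type for a Flash Video file name, or a generic default."""
    ext = os.path.splitext(filename)[1].lower()
    return F4_MEDIA_TYPES.get(ext, default)


if __name__ == "__main__":
    for name in ("talk.f4v", "book.F4B", "old.flv"):
        print(name, "->", media_type_for(name))
```

The “.flv” extension is not in the table, so this sketch falls back to the generic type for it.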
Matroska (“.mkv”) is a free container format. Google recently adopted a subset of it, branded as WebM, to be used together with VP8 and proposed as a standard video file format for HTML5.
Ogg is another free multimedia container format. Its technical quality is often disputed. It is used by Wikipedia.
QuickTime (QT) is Apple's multimedia framework. It supports audio and video, as well as interactive panoramic images and such things as MIDI. It supports many audio and video codecs.
The file format of QT is “.mov”. Quote:
The QuickTime (.mov) file format functions as a multimedia container file that contains one or more tracks, each of which stores a particular type of data: audio, video, effects, or text (For example, for subtitles). Each track either contains a digitally-encoded media stream (using a specific codec) or a data reference to the media stream located in another file. Tracks are maintained in a hierarchical data structure consisting of objects called atoms. An atom can be a parent to other atoms or it can contain media or edit data, but it cannot do both.
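To make the atom structure in that quote concrete, here is a minimal sketch of walking atoms in a byte string. It assumes the usual atom header layout (a 4-byte big-endian size that includes the header, then a 4-byte type code, with size 1 meaning a 64-bit size follows and size 0 meaning "to the end"). The function names and the tiny sample data are made up for illustration; this is not a complete QuickTime parser.

```python
import struct
from typing import Iterator, Optional, Tuple


def iter_atoms(data: bytes, start: int = 0,
               end: Optional[int] = None) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, payload_offset, payload_size) for each atom in data[start:end]."""
    if end is None:
        end = len(data)
    pos = start
    while pos + 8 <= end:
        size, kind = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:                      # 64-bit extended size follows the type
            if pos + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:                    # atom runs to the end of the data
            size = end - pos
        if size < header or pos + size > end:
            break                          # malformed or truncated; stop walking
        yield kind.decode("latin-1"), pos + header, size - header
        pos += size


def make_atom(kind: bytes, payload: bytes) -> bytes:
    """Build one atom (helper for the demo data only)."""
    return struct.pack(">I4s", 8 + len(payload), kind) + payload


# Hypothetical miniature file: a parent 'moov' atom holding an empty 'trak',
# plus an 'mdat' atom holding 4 bytes of media data.
sample = (make_atom(b"ftyp", b"qt  ")
          + make_atom(b"moov", make_atom(b"trak", b""))
          + make_atom(b"mdat", b"\x00" * 4))

if __name__ == "__main__":
    for kind, offset, size in iter_atoms(sample):
        print(kind, offset, size)
        if kind == "moov":                 # a parent atom: walk its children
            for child in iter_atoms(sample, offset, offset + size):
                print("  ", *child)
```

Because an atom is either a parent or a data holder, never both, the walker has to know from the type code (here just "moov") which payloads to descend into.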
QT 7.x was current from 2005 (OS X 10.4) through version 7.6 in 2009 (OS X 10.6). After that, the next version is QT X (10), which is supposedly completely rewritten for 64-bit computing and somewhat incompatible with past QT versions. QT X does, though, rely on QT 7 for dealing with older codecs and other files such as MIDI.
Some more Wikipedia quotes:
QuickTime X is a combination of two technologies: QuickTime Kit Framework (QTKit) and QuickTime X Player.
… many Apple products (such as iTunes and Apple TV) still use the older QuickTime 7 engine.
QuickTime Streaming Server (QTSS) is a server or service daemon built into Apple's Mac OS X Server that delivers video and audio on request to users over a computer network, including the Internet. Its primary GUI configuration tool is QTSS Publisher and its web-based administration port is 1220.
QuickTime Broadcaster is an audio and video RTP/RTSP server by Apple Computer for Mac OS X. It is separate from Apple's QuickTime Streaming Server, as it is not a service daemon but a desktop application.
FFmpeg and VLC
FFmpeg is an open source project for video and audio tech. Three notable components of FFmpeg are:
- libavcodec, an audio/video codec library used by several other projects.
- libavformat, an audio/video container mux and demux library
- ffmpeg command line program for transcoding multimedia files.
For everyday use, the interesting part is the “ffmpeg” command line tool, which lets you convert one video format to another.
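As a sketch of what such a conversion looks like, the following builds an ffmpeg command line for transcoding a file to H.264 video and AAC audio. The file names are placeholders, the function name is made up, and the encoder names (libx264, aac) assume a reasonably recent ffmpeg build that includes them; check `ffmpeg -encoders` on your own install.

```python
import shlex
from typing import List


def build_transcode_command(src: str, dst: str,
                            video_codec: str = "libx264",
                            audio_codec: str = "aac") -> List[str]:
    """Return an ffmpeg argument list that re-encodes src into dst.

    The output container is chosen by ffmpeg from dst's file extension.
    """
    return ["ffmpeg", "-i", src, "-c:v", video_codec, "-c:a", audio_codec, dst]


if __name__ == "__main__":
    cmd = build_transcode_command("input.mov", "output.mp4")
    print(shlex.join(cmd))
    # To actually run it (requires ffmpeg on the PATH):
    #   import subprocess
    #   subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems with file names that contain spaces.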
VLC is a media player. It was originally designed as a server/client pair for streaming multimedia, but is now a single application, VLC. It was at one point used by Google for Google Video, until they switched to Flash. VLC can also be used from the command line.
The following are the most commonly used protocols for streaming media, each with a particular purpose:
- RTSP (Real Time Streaming Protocol). The control channel. For example, the client sends play and pause requests.
- RTP (Real-time Transport Protocol). Carries the streaming media payload.
- RTCP (RTP Control Protocol). Monitors transmission statistics and QoS information.
The above combo is usually referred to as “RTSP/RTP”.
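To show how the control and payload roles differ on the wire, here is a small sketch. RTSP requests are text, much like HTTP, while every RTP packet starts with a 12-byte binary header. The function names and the example URL are made up; the field layout follows my reading of the RTSP and RTP specifications, and this is nowhere near a working client.

```python
import struct
from typing import Optional


def rtsp_request(method: str, url: str, cseq: int,
                 session: Optional[str] = None) -> bytes:
    """Format a minimal RTSP/1.0 request such as PLAY or PAUSE."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    if session:
        lines.append(f"Session: {session}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the fixed 12-byte header at the start of an RTP packet."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    b0, b1, seq, timestamp, ssrc = struct.unpack_from(">BBHII", packet)
    return {
        "version": b0 >> 6,            # 2 for current RTP
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": b0 & 0x0F,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,     # identifies the media encoding
        "sequence": seq,               # lets the receiver detect loss/reordering
        "timestamp": timestamp,        # media clock, used for playback timing
        "ssrc": ssrc,                  # identifies the stream source
    }


if __name__ == "__main__":
    print(rtsp_request("PLAY", "rtsp://example.com/movie", 3, "12345678").decode())
    demo_packet = struct.pack(">BBHII", 0x80, 96, 1, 1000, 0xDEADBEEF) + b"payload"
    print(parse_rtp_header(demo_packet))
```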
Adobe Flash uses its own Real Time Messaging Protocol (RTMP).
Microsoft was using Microsoft Media Server (MMS), but it was deprecated in 2003. Microsoft now uses RTSP.
HTTP Live Streaming is Apple's tech, new in 2009 with QuickTime X. It differs from the others in that it is HTTP based. It has been proposed as an internet standard.
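What "HTTP based" means in practice, as far as I understand the HLS proposal: the video is cut into short segment files, and a plain-text playlist (an “.m3u8” file) lists them, so an ordinary web server can deliver everything. The sketch below generates such a playlist. The segment names and durations are invented, the function name is hypothetical, and only a handful of basic playlist tags are used.

```python
import math
from typing import Iterable, Tuple


def media_playlist(segments: Iterable[Tuple[str, float]]) -> str:
    """Build a minimal HLS media playlist from (uri, duration_seconds) pairs."""
    segments = list(segments)
    if not segments:
        raise ValueError("a playlist needs at least one segment")
    # The target duration must be an integer no smaller than any segment.
    target = math.ceil(max(duration for _, duration in segments))
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{target}",
        "#EXT-X-MEDIA-SEQUENCE:0",
    ]
    for uri, duration in segments:
        lines.append(f"#EXTINF:{duration:.3f},")
        lines.append(uri)
    lines.append("#EXT-X-ENDLIST")     # marks a finished (non-live) stream
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(media_playlist([("seg0.ts", 9.6), ("seg1.ts", 10.0)]))
```

The player fetches the playlist over HTTP, then fetches the segments one by one; jumping to a location is just a matter of requesting a later segment.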
Here's a list of video hosting services: Comparison of video services. It contains some detail on what protocols they use.
See also: Comparison of streaming media systems.
Besides Wikipedia, here are some other articles I used for this article.
Comparison of codecs:
- First Look: H.264 and VP8 Compared, by Jan Ozer. At http://www.streamingmedia.com/Articles/Editorial/Featured-Articles/First-Look-H.264-and-VP8-Compared-67266.aspx
- The first in-depth technical analysis of VP8, by Jason Garrett-Glaser. (an x264 and ffmpeg developer; college student) At
- Video on the Web, by Till Halbach. (comparison of Dirac, Dirac Pro, Theora, H.264) At
- [whatwg] H.264-in-<video> vs plugin APIs, by Chris DiBona (Google employee) At
Comparison of container formats:
- Ogg objections, by Mans Rullgard (ffmpeg developer). At http://hardwarebug.org/2010/03/03/ogg-objections/
- In Defense of Ogg's Good Name, by Christopher Montgomery (Ogg designer). At http://people.xiph.org/~xiphmont/lj-pseudocut/o-response-1.html
Audio codecs comparison:
- Comparison of lossless audio codecs, by Josh Coalson (FLAC author). At sourceforge.net
- Lossless comparison at hydrogenaudio.org. At hydrogenaudio.org
- Apple proposes HTTP streaming feature as IETF standard, by Chris Foresman. At arstechnica.com