An Overview of Audio and Video Transcoding

Blog 7 min read | Jun 22, 2010 | JW Player


Update:

JW Player’s proven video hosting and management cloud platform helps customers streamline their workflow with high efficiency! Our video hosting solution encodes tens of terabytes each week and encodes from any source to MP4 and HLS with high HD quality. Check out our comprehensive hosting features here.

For customers searching for more information about transcoding and video hosting, please reference our Platform API page on our developer site.

————————————————————————————————————————————————

This post will try to peel away some of the layers of confusion surrounding media conversion by describing how media are stored, why you might want to convert from one format to another, and tools you can use to do it.

What Is Transcoding?

Transcoding is usually something that happens behind the scenes. For example, when you take a video with a Flip or other handheld video camera and upload it to YouTube, the file is transcoded by YouTube into various formats for distributing and displaying to viewers. You don’t see this happening, but it is why videos are not immediately viewable on the site after uploading them. Once the transcoding has finished, you are able to show the video to your users or friends.

To understand what transcoding is, you first need to understand how digital media are stored. A digital media file generally consists of a container holding metadata, such as the dimensions and duration of the file, along with any number of tracks. Commonly a media file contains an audio track, a video track, and sometimes a subtitle track. Each of these tracks has been encoded (using a codec) into a format that tries to maximize quality while minimizing file size. These encoded tracks are interleaved (or multiplexed) into the container, meaning that they are stored as something like this: a chunk of audio, a chunk of video, the next chunk of audio, the next chunk of video, and so on.
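To make the interleaving concrete, here is a minimal Python sketch. The `Chunk` type, the `interleave` function, and the sample metadata values are hypothetical names invented for illustration; real containers also carry timestamps, indexes, and headers that are left out here.

```python
from dataclasses import dataclass
from itertools import zip_longest


@dataclass(frozen=True)
class Chunk:
    track: str  # e.g. "audio" or "video"
    index: int  # position of the chunk within its own track


def interleave(*tracks):
    """Multiplex tracks by taking one chunk from each track in turn."""
    muxed = []
    for group in zip_longest(*tracks):
        # Shorter tracks run out first; skip their missing slots.
        muxed.extend(chunk for chunk in group if chunk is not None)
    return muxed


audio = [Chunk("audio", i) for i in range(3)]
video = [Chunk("video", i) for i in range(3)]

# A toy "container": metadata plus the interleaved chunks.
container = {
    "metadata": {"width": 640, "height": 360, "duration_s": 3},
    "chunks": interleave(audio, video),
}

print([f"{c.track}{c.index}" for c in container["chunks"]])
```

The printed order alternates audio and video chunks, which is what lets a player start playback without reading the whole file first.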

Transcoding is the process of taking digital media, extracting the tracks from the container, decoding those tracks, filtering them (e.g. removing noise, scaling dimensions, or sharpening), encoding the tracks again, and multiplexing the new tracks into a new container. Transcoding is most commonly done to convert from one format to another, e.g. converting a DivX AVI file to H.264/AAC in MP4 for delivery to mobile devices, set-top devices, and computers. The basic pipeline looks like the following:

              / decode audio     -> filter -> encode \
demultiplex ->  decode video     -> filter -> encode  -> multiplex
              \ decode subtitles -> filter -> encode /
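The same pipeline can be sketched in a few lines of Python. Everything here is a toy stand-in: the function names, the text-based "codec", and the `toy-in` / `toy-out` formats are assumptions made for illustration, not how any real transcoder is implemented.

```python
def demultiplex(container):
    """Split a container into its tracks, keyed by track type."""
    return dict(container["tracks"])


def transcode_track(encoded, decode, filters, encode):
    """Run one track through decode -> filter -> encode."""
    raw = decode(encoded)
    for apply_filter in filters:
        raw = apply_filter(raw)
    return encode(raw)


def multiplex(tracks, container_format):
    """Wrap the re-encoded tracks in a new container."""
    return {"format": container_format, "tracks": dict(tracks)}


# Toy codecs: comma-separated text on the way in, raw bytes on the way out.
def decode_text(encoded):
    return [int(value) for value in encoded.split(",")]


def halve_volume(samples):
    return [sample // 2 for sample in samples]


def encode_bytes(samples):
    return bytes(samples)


source = {"format": "toy-in", "tracks": {"audio": "10,20,30", "video": "200,100"}}

tracks = demultiplex(source)
converted = {
    "audio": transcode_track(tracks["audio"], decode_text, [halve_volume], encode_bytes),
    "video": transcode_track(tracks["video"], decode_text, [], encode_bytes),
}
result = multiplex(converted, "toy-out")
print(result)
```

Each track is handled independently between the demultiplex and multiplex steps, which is why a transcoder can, for example, re-encode the video while leaving the filter list for another track empty.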

Why Transcode?

There are a number of reasons for transcoding your media. You may want to convert a high-quality original edit to a digital distribution format easily sent to customers over the Internet, like H.264/AAC in an MP4 container. Or you may want to convert your high-quality music library, stored in AAC or Vorbis, for your music player that only supports MP3 files. Often, you may want to target a specific platform or device, like Adobe Flash, that supports a limited set of formats and thus need to convert your media library to a suitable format for proper delivery. You may even have old MPEG2 HDV tapes that you want to transcode to H.264 High Profile to save 40% of the storage space while losing no noticeable quality.
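As a rough illustration of what a 40% saving means in practice, the arithmetic can be sketched as below. The 25 Mbit/s source bitrate and the one-hour duration are assumptions chosen for the example, and the helper names are hypothetical; actual savings depend on the content and the encoder settings.

```python
def file_size_gb(bitrate_mbit_s, duration_s):
    """Approximate size in gigabytes (10^9 bytes) of a constant-bitrate stream."""
    bits = bitrate_mbit_s * 1_000_000 * duration_s
    return bits / 8 / 1_000_000_000


def size_after_saving(size_gb, saving_fraction):
    """Size left after shrinking a file by the given fraction."""
    return size_gb * (1 - saving_fraction)


source_hour = file_size_gb(25, 3600)  # assumed 25 Mbit/s source, one hour long
h264_hour = size_after_saving(source_hour, 0.40)
print(f"{source_hour:.2f} GB -> {h264_hour:.2f} GB")
```

Under these assumptions, one hour of footage shrinks from roughly 11 GB to under 7 GB.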

Some things to keep in mind about transcoding:

Transcoding always lowers quality*

Transcoding can take a long time, depending on formats and settings

The newest formats and codecs are not always best

Remember your intended audience and their decoding ability (e.g. phones)

* Quality is not lost with lossless formats, but the vast majority of formats are not lossless!
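The difference between lossless and lossy round trips, and the way loss accumulates across generations, can be sketched with a toy example. The `lossy_encode` quantizer below is a hypothetical stand-in for a real lossy codec, and `zlib` stands in for a lossless one; neither is a media codec.

```python
import zlib


def lossy_encode(samples, step):
    """Toy lossy codec: round every sample to the nearest multiple of `step`."""
    return [(s + step // 2) // step * step for s in samples]


def total_error(original, decoded):
    """Sum of absolute differences between original and decoded samples."""
    return sum(abs(a - b) for a, b in zip(original, decoded))


original = list(range(24))

# Lossless round trip: every byte comes back unchanged.
restored = list(zlib.decompress(zlib.compress(bytes(original))))

# One lossy generation straight from the original...
direct = lossy_encode(original, 6)
# ...versus transcoding a copy that was already lossily encoded.
second_generation = lossy_encode(lossy_encode(original, 8), 6)

print("lossless error:         ", total_error(original, restored))
print("one lossy generation:   ", total_error(original, direct))
print("two lossy generations:  ", total_error(original, second_generation))
```

In this toy setup the second-generation copy ends up further from the original than a single encode at the same final setting, which is why it is worth keeping the highest-quality source around and transcoding from that.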

Common Formats and Codecs

You will undoubtedly run into many new and confusing terms as you explore the digital media landscape. It’s probably best to try and familiarize yourself with some of the more popular containers and codecs.

Containers

MP4

MPEG4 system container, used by QuickTime and Adobe Flash. This container is quite versatile and has excellent support almost everywhere (from phones to computers).

Extensions: mp4, mov, m4v, m4a, m4b, m4p, f4v, 3gp, 3g2

Synonyms: MPEG4 Part 14, ISO/IEC 14496-14

WebM / Matroska

A versatile container similar in concept to the MPEG4 system container. WebM uses a subset of Matroska to create a container optimized for web media and HTML5. Adobe has also recently announced plans to support WebM in the Flash player.

Extensions: webm, mkv

AVI

Microsoft’s generic container format. This can generally store anything and everything. It has excellent support on most computers.

Extensions: avi

FLV

Adobe Flash media container which is useful for storing legacy Flash content and for low-latency live streaming. Adobe’s newer format is F4V, a subset of the MPEG4 system container.

Extensions: flv
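The extension lists above can be turned into a simple lookup. This is a minimal sketch with hypothetical names (`CONTAINER_EXTENSIONS`, `guess_container`); an extension is only a hint, and real tools inspect the file contents rather than trusting the name.

```python
from pathlib import PurePath

# Extension lists copied from the container descriptions above.
CONTAINER_EXTENSIONS = {
    "MP4": ["mp4", "mov", "m4v", "m4a", "m4b", "m4p", "f4v", "3gp", "3g2"],
    "WebM / Matroska": ["webm", "mkv"],
    "AVI": ["avi"],
    "FLV": ["flv"],
}

# Invert the table so each extension points at its container family.
EXTENSION_TO_CONTAINER = {
    ext: container
    for container, extensions in CONTAINER_EXTENSIONS.items()
    for ext in extensions
}


def guess_container(filename):
    """Guess the container family from a file extension, or None if unknown."""
    extension = PurePath(filename).suffix.lower().lstrip(".")
    return EXTENSION_TO_CONTAINER.get(extension)


for name in ["holiday.M4V", "talk.webm", "notes.txt"]:
    print(name, "->", guess_container(name))
```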

Video Codecs

H.264

Widely considered the best video codec in the world. This video codec is what powers YouTube, Bits on the Run, and more.

Synonyms: MPEG4 Part 10, MPEG4-AVC, AVC, ISO/IEC 14496-10

VP8

A new video codec for web media, completely free and comparable to H.264 in quality. It was originally developed by On2 but then bought and released for free by Google. This codec is part of the WebM Project.

MPEG4 Video

Last-generation video codec, still widely used in the piracy and home DVD player scene. It requires more file space than H.264 for the same quality, but encodes and decodes faster.

Synonyms: DivX, Xvid, MPEG4-ASP, MPEG4 Part 2, ISO/IEC 14496-2

MPEG2 Video

Video codec generally used on DVDs. It is a few generations old, but given a sufficiently high bitrate the quality is quite good. MPEG2 is also one of the official video codecs for Blu-ray.

Synonyms: MPEG2 Part 2, ISO/IEC 13818-2

Flash Video

Sorenson H.263 for Flash, used before Flash supported H.264. This is also an older generation codec, but is useful for low-power devices that cannot support H.264 or VP8.

Audio Codecs

AAC

Advanced Audio Coding, widely used by Apple and many portable devices. It can support mono, stereo, and surround sound. Currently the highest quality widely-used lossy audio codec.

Synonyms: MPEG2 Part 7, MPEG4 Part 3, ISO/IEC 13818-7, ISO/IEC 14496-3

Vorbis

Completely free and open audio codec comparable to AAC. Widely used by game developers and Linux/BSD users, this codec is now also part of WebM which will probably mean wider adoption.

MP3

Older but still extremely widely used audio codec. It has the advantage of being supported almost everywhere and by nearly every device in existence.

Synonyms: MPEG1 Layer 3 Audio, MPEG1 Part 3, MPEG2 Part 3, ISO/IEC 11172-3, ISO/IEC 13818-3
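Because the same codec shows up under several names, a small normalizer can help when reading tool output or file listings. The sketch below uses a subset of the synonyms listed above; the table and the `canonical_codec` helper are hypothetical names for illustration.

```python
# Canonical codec name -> some of the synonyms listed in this section.
CODEC_SYNONYMS = {
    "H.264": ["MPEG4 Part 10", "MPEG4-AVC", "AVC", "ISO/IEC 14496-10"],
    "VP8": [],
    "MPEG4 Video": ["DivX", "Xvid", "MPEG4-ASP", "MPEG4 Part 2", "ISO/IEC 14496-2"],
    "MPEG2 Video": ["MPEG2 Part 2", "ISO/IEC 13818-2"],
    "AAC": ["MPEG4 Part 3", "ISO/IEC 14496-3"],
    "Vorbis": [],
    "MP3": ["MPEG1 Layer 3 Audio", "MPEG1 Part 3", "ISO/IEC 11172-3"],
}

# Case-insensitive lookup covering canonical names and synonyms alike.
_LOOKUP = {}
for canonical, synonyms in CODEC_SYNONYMS.items():
    for name in [canonical, *synonyms]:
        _LOOKUP[name.lower()] = canonical


def canonical_codec(name):
    """Return the canonical codec name for a synonym, or None if unknown."""
    return _LOOKUP.get(name.strip().lower())


for name in ["avc", " Xvid ", "ISO/IEC 11172-3", "Theora"]:
    print(repr(name), "->", canonical_codec(name))
```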

How to Transcode

There are dozens of commonly used formats and many software packages that can convert between them for you, though speed, quality, and supported input and output formats vary from one package to the next. If you are willing to get your hands dirty, it is quite easy to get free and open source tools that will handle almost any format you throw at them.

Many third-party online services will also transcode files (many for a small fee), such as Movavi Online Converter, Media Convert, Zamzar, and more. These services require that you upload the media file to them, then later download the transcoded result back to your system.

It is also possible to convert media files on your own computer, using either free and open source or proprietary software. Free tools such as MediaCoder, HandBrake, and VirtualDub will let you convert and even do some basic editing. QuickTime, Any Video Converter, and many other commercial packages offer these features as well. If you are more interested in getting your hands dirty, you can go right to the source of the tools most of these products use to do the actual transcoding: FFmpeg, FAAC, x264, and the WebM project’s libvpx.

If you’re comfortable with the advanced tools, then using Free and Open Source software like FFmpeg and libx264 is highly recommended. You get a community of video experts helping you, and they can be very friendly when you try to help back. You save money by not licensing large proprietary tools and you get some of the best quality output in the industry.
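As a starting point, here is a minimal sketch of driving FFmpeg from Python to produce H.264/AAC in an MP4 container. It assumes a reasonably recent FFmpeg build with libx264 and the built-in `aac` encoder; the file names, the 720-pixel height, and the quality settings are placeholder values chosen for illustration, not recommendations from this post.

```python
import shlex


def build_ffmpeg_command(source, target, height=720, crf=23, audio_bitrate="128k"):
    """Build an FFmpeg argument list for an H.264/AAC MP4 transcode."""
    return [
        "ffmpeg",
        "-i", source,                 # input file (any format FFmpeg can read)
        "-c:v", "libx264",            # encode video with x264
        "-crf", str(crf),             # constant-quality setting (lower is better)
        "-preset", "medium",          # speed versus compression trade-off
        "-vf", f"scale=-2:{height}",  # resize to the given height, keep aspect ratio
        "-c:a", "aac",                # encode audio as AAC
        "-b:a", audio_bitrate,        # audio bitrate
        "-movflags", "+faststart",    # move the index up front for web playback
        target,
    ]


command = build_ffmpeg_command("input.avi", "output.mp4")
print(shlex.join(command))

# To actually run it (requires FFmpeg to be installed and on the PATH):
# import subprocess
# subprocess.run(command, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems with file names that contain spaces.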

Hopefully you found this brief overview helpful. Watch this space for more in-depth discussion on this subject, including transcoding your first video, selecting the optimal formats and codecs, choosing the right tools, and more. Learn more about Transcoding Best Practices here.