Transcoding is something of an art form in which one must balance dozens of requirements, formats, parameters and more. This can seem daunting for those who just want to learn a little more or want to step into the world of digital media. What follows is a culmination of best practices developed while building Bits on the Run over the last few years. This is by no means an exhaustive list, but it should give a good idea of some things to watch out for or remember after reading the basic Overview of Transcoding.
Always encode for a specific quality rather than relying on bitrates. With bandwidth availability increasing across the board, there is no need to use a target bitrate unless you are targeting a specific limited device or the quality you wish to achieve is unrealistic within your bitrate constraints (in which case try lowering your quality expectations). To give a concrete example, using constant rate factor to encode H.264:
ffmpeg -i infile.avi -vcodec libx264 -vpre default -crf 21 -acodec libfaac -ab 128k output.mp4
The following two videos nicely demonstrate the difference in bitrates required to get the same visual quality at the same dimensions from two very different videos (a difference of 750kbps!). These were produced by a command very similar to the one above.
Avatar Trailer 720px wide: 1800kbps
Diggnation excerpt 720px wide: 1050kbps
As can be seen above, these two very different videos can look the same quality at very different bitrates. This is a good example of why a specific bitrate is not a good indicator of quality, given changing dimensions, framerates, picture complexity and complexity of movement over time. When dealing with a large set of videos of varying qualities, sizes and complexities, it is a good idea to always use constant quality settings.
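If you drive FFmpeg from a script, a small helper can keep the constant quality settings in one place. The following is only a sketch: the function name and its defaults are made up for illustration, and it assumes a more recent FFmpeg build, where `-preset` is used instead of `-vpre` and the native `aac` encoder stands in for `libfaac`. Adjust the flags to whatever your own build supports.

```python
def build_crf_command(infile, outfile, crf=21, preset="medium", audio_bitrate="128k"):
    """Return an ffmpeg argument list for a single-pass constant rate factor encode.

    Hypothetical helper: the name, defaults, and flag choices are assumptions.
    """
    # 8-bit x264 accepts CRF values from 0 (lossless) to 51 (worst quality).
    if not 0 <= crf <= 51:
        raise ValueError("crf must be between 0 and 51")
    return [
        "ffmpeg", "-i", infile,
        "-c:v", "libx264", "-preset", preset, "-crf", str(crf),
        "-c:a", "aac", "-b:a", audio_bitrate,
        outfile,
    ]
```

The returned list can be passed straight to `subprocess.run`, which avoids shell quoting problems with odd filenames. Note that no video bitrate appears anywhere: the encoder spends whatever each video needs to reach the requested quality.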
There is no reason that you should ever scale video dimensions to be larger than the original input. You cannot get better quality by transcoding, and scaling up will do nothing but blur the video. The exception to this rule is of course a device with strict dimension limits, such as a phone where your video absolutely must be 800x480 pixels. Also keep in mind that video players will scale video to fit screens for you, so there is no need in general for you to do so during transcoding.

Upscaling 320px wide video to 480 and 720 pixels blurs the video
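One way to enforce the no-upscaling rule in code is to clamp the scale factor at 1.0 when computing output dimensions. This is a minimal sketch with a hypothetical function name; it also rounds down to even numbers, on the assumption that the encoder uses 4:2:0 chroma subsampling, which generally requires even width and height.

```python
def fit_without_upscaling(src_w, src_h, max_w, max_h):
    """Fit (src_w, src_h) inside (max_w, max_h), keeping aspect ratio, never enlarging."""
    # The 1.0 cap is what prevents upscaling small sources.
    scale = min(max_w / src_w, max_h / src_h, 1.0)
    # Round down to even values (assumed encoder requirement, see above).
    out_w = max(2, int(src_w * scale) // 2 * 2)
    out_h = max(2, int(src_h * scale) // 2 * 2)
    return out_w, out_h
```

With this, a 320x240 source asked to fit a 720 pixel wide template stays at 320x240, while a 1920x1080 source is scaled down as usual. A device with strict dimension limits would bypass the cap.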
Try to always provide sane defaults that offer a good compromise between quality, size, transcoding time, etc. Be realistic and remember what devices and available bandwidths you are targeting. If your audience is mostly low-power phones and netbooks, then don't encode 1080p content and expect them to be able to play it. Don't just listen to random bloggers (including me); use your own eyes and ears to choose defaults that work for you and your customers.

By default things should look acceptable
While this is really a general philosophy of Bits on the Run, this is extremely important when it comes to transcoding. Users will hang themselves if given enough rope. We've seen impossibly high and ridiculously low bitrate selections, dimensions, and other options. Along with sane defaults you should try to make the options as simple as possible, but provide enough to appease power users.

Handbrake: too many complex options

Bits on the Run template options: simple
As much as you might want to show up-to-the-second statistics on transcode jobs and exact reasons for failures or issues, try to resist the temptation. Most people don't care, get confused, and really just want stuff to work. Focus on preventing failures and making sure all media transcode properly.
The following best practices pertain more specifically to services like Bits on the Run where many files are processed by a service-type backend.
You are likely to want to use really smart, really clever queue processing and management logic to make a transcoding cloud behave in the most efficient manner possible. My advice: don't. Stick with simple logic that won't get out of hand and treats most users fairly. Simple, fast, and easy to understand is key. You will never hit 100% efficiency and the more complex the queue logic becomes the harder it is to predict how small changes will affect the system. A good starting point for the logic is as follows:
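To make "simple and fair" concrete, here is one hypothetical sketch. It is not the Bits on the Run logic, just an illustration of how little code a fair queue needs: jobs are first-in, first-out per user, and users take turns in round-robin order so one large upload batch cannot starve everyone else. The class and method names are invented for this example.

```python
from collections import OrderedDict, deque


class FairQueue:
    """Round-robin across users; first-in, first-out within each user."""

    def __init__(self):
        # Maps user -> deque of that user's pending jobs, in turn order.
        self._jobs = OrderedDict()

    def push(self, user, job):
        self._jobs.setdefault(user, deque()).append(job)

    def pop(self):
        """Return (user, job) for the next job, or None if the queue is empty."""
        if not self._jobs:
            return None
        user, jobs = next(iter(self._jobs.items()))
        job = jobs.popleft()
        del self._jobs[user]
        if jobs:
            # Re-insert at the back so other users get a turn first.
            self._jobs[user] = jobs
        return user, job
```

The whole policy fits in one method, so it is easy to predict what a change will do. A real system would persist the queue and handle failed jobs, which this sketch ignores.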
When processing a lot of media on a farm of servers you need to find a compromise between quality, speed, and the number of servers in the farm. Using the slowest, highest-quality settings is not worth it in such a situation, because this costs you servers and time. The time spent encoding with a slower preset to save an extra 3% bandwidth has the potential to scare a customer away to faster services. Err on the side of too little quality or too few servers processing and they may leave as well.
Using Free and Open Source software like FFmpeg and libx264 is highly recommended. You get a community of video experts helping you, and they can be very friendly when you try to help back. You save money by not licensing large proprietary tools and you get some of the best quality output in the industry.
At Bits on the Run we regularly make use of the above as well as many other open source projects. Much of the transcoding system is written in Python. It's a great way to glue various components together, interact with databases and APIs, and to provide interactive object-oriented shells for job and queue management.
Hopefully this will give you a good idea of some of the best practices to follow when transcoding media after understanding the basic Overview of Transcoding. Remember that quality is more important than bitrate, unless your application requires a specific bitrate. Upscaling should always be avoided when possible. Provide sane defaults for users and make sure things work well without needing tweaks. If you have a queue system, keep it simple to save yourself future headaches. Have any other tips to leave? Add a comment below!
All copyrighted content is owned by the respective copyright holders and used in this post under fair use to show examples. The Avatar trailer and images are copyright Apple, Inc. The Diggnation excerpt is copyright Diggnation and Revision3. The Big Buck Bunny image is copyright Blender Foundation and used under the Creative Commons Attribution license.
Comments
Nice article. I feel like you should specify though that your recommendation for using constant rate factor is focused on one-pass encoding. If you can spend the time doing two-pass, you still use -b and -bt to specify your target bit rate. And isn't the consensus that if you do two-pass you can get an equivalent quality video with a more constant bit rate? At least, that was my understanding. Do you disagree?
Submitted by Sam on Tue, 2010-09-14 14:45.
Sam,
Thanks for the comment! I had actually left multi-pass encoding out of this post intentionally, and set my example command to only use one pass to lead by example. In retrospect maybe I should have covered it somewhat. You are correct that constant rate factor encoding should use only a single pass, and that multi-pass encoding is generally used with a bitrate and an acceptable variance. Constant rate factor means each frame is being encoded to match a visual quality instead of bitrate, so multiple passes, while possible, aren't going to do you any good.
I'm honestly not sure how much truth there is to the argument that multi-pass bitrate-based encoding is going to provide equivalent quality at a more constant bitrate. It entirely depends on the encoder, on the settings used, the input video, etc. It certainly won't be mathematically equivalent in quality if the bitrate is different for different frames, but visually it can seem equivalent to the constant rate factor encoding. If you use strict, relatively small limits for the acceptable variance in bitrate then it's possible that the overall variance will be less but scenes with very high or very low complexity are likely to suffer by using too few or too many bits to encode them, respectively. By the way, it should be possible to write an encoder that accepts both a target quality and strict bitrate variance limits, where a frame would always strive to reach the requested quality while being limited to a more constant bitrate because of variance limits.
I hope that helps to answer some of your questions. If not, or if you have more, please don't hesitate to post them!
Submitted by Daniel on Tue, 2010-09-14 16:03.
Thanks for good article and promoting CRF vs a target bitrate. Hopefully you can explain why all the dynamic streaming settings are based on indicating a target bitrate in the script? do we just kind of guess? Does it really matter?
thanks!
Submitted by rg on Mon, 2010-09-27 18:43.
Interesting post. I'm curious as to how you define "Constant Quality"? Measuring coded video quality is still a very inexact science. Do you mean "Constant QP"?
- Iain @onecodec
Submitted by Iain Richardson on Thu, 2010-12-16 16:22.
@RG, thanks for the praise. The reason most of the dynamic streaming stuff uses target bitrates is because they need to use these bitrates to compare the capabilities of your connection to decide what you can play. The logic for this is much easier if you know an approximate (average) or constant bitrate. It's a limitation of video still taking so much bandwidth and many people in the world not having a fast connection to the Internet.
@Iain, in this particular case we mean CRF, Constant Rate Factor, which is a setting of the x264 encoder. Rate control is a function of the particular encoder you use, so some may instead use constant QP. Either way the goal is to have a fairly accurate metric for visual quality and stay constant within that by varying the bitrate.
Submitted by Daniel on Thu, 2010-12-16 16:46.