The World of Pain that is HTML5 Video

January 17, 2012 in Tech

With the continuing momentum of HTML5, industry expectations of the online video space continue to grow. Unfortunately, there are many misconceptions about HTML5 video, and I find myself in the position of explaining to my clients (many already heavily invested in Flash-based video solutions) that current HTML5 capabilities are not on par with those of Flash Player 9, released over five years ago, much less the current Flash Player 11. And so, I’m doing more of the same exercises I did when native video playback was first introduced in Flash Player 6, in 2002: managing my clients’ expectations of what can be achieved with current technologies. More importantly, when my clients say they want to support HTML5 compliant video, the subtext is, “I want my video to work on iPhones and iPads.”

Given the current state of affairs, I wanted to assemble some notes on the state of HTML5 video, as a helpful introduction for those of you who need to start conceiving, planning, and building solutions that necessarily involve aspects of HTML5 video. I’ve had to consider all of these factors and more as I’ve built my own video encoding and hosting service over the last two years.

Introduction to the Problems (or Pain)

Many web developers already believe that HTML5’s <video> support can do away with plugin-supported video delivered by Adobe Flash Player or other runtimes. My three biggest concerns with using HTML5 <video> for “real world” video deployment revolve around:

  1. Current cross-browser implementations and adoption rates
  2. Cross-platform mobile optimizations
  3. Codec support

Note: There are more than three reasons to proceed with caution using HTML5 video. For example, video playback capabilities such as true full-screen support with graphic and text overlays, which audiences have come to expect from online experiences delivered with Flash-based video players, aren’t available in customized HTML5 video players. I’ll compare Flash and HTML5 video capabilities in a future post.

Cross-Browser Implementations

I’ll start with current desktop browser usage statistics. Microsoft Windows is still the primary desktop operating system in use today. Microsoft has decided to only release Internet Explorer 9 (IE9) and higher on Windows Vista, Windows 7, and future versions of the operating system, despite the fact that 46.52% of the desktop market is still using Windows XP, which can only run IE8 or lower. The HTML5 <video> tag is only supported in IE9 or higher. And by most counts, Windows XP has several years left as a viable operating system. As Table 1 illustrates, nearly 30% of desktop browser usage in North America from December 2011 to January 2012 could not interpret the HTML5 <video> tag. By default, then, if you want to reach a significant portion of desktop viewers with browser-based online content, you need to have a solution that more than likely will require the Flash Player. If you don’t believe me, just look at YouTube, which still uses a Flash-based player as the default video player wherever Flash is supported—even when an HTML5 <video> ready browser is detected.

Note: Windows XP can run other non-Microsoft browsers such as Mozilla Firefox or Opera that are HTML5 <video> compliant. However, as I later discuss, these popular browser alternatives do not support H.264 video playback.

Table 1: Browser Market Share – North America

Browser               Market Share   <video> support   Codec support
IE8                   24.83%
Chrome 16             19.83%         ✓                 H.264*, WebM, Theora
IE9                   15.75%         ✓                 H.264*
Safari 5.1            5.9%           ✓                 H.264*
IE7                   4.16%
Firefox 3.6           2.95%          ✓                 Theora
IE6                   0.89%
Safari Mobile (iOS)   45.89%         ✓                 H.264*
Android Browser       36.88%         ✓                 H.264*, WebM**
Blackberry            7.33%

* Hardware acceleration available

** Not available in early versions of Android browser.
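
The “nearly 30%” figure quoted earlier can be reproduced from Table 1 by adding up the desktop browsers that lack <video> support. Here is a minimal Python sketch; the dictionary layout and variable names are my own, and only the percentages come from the table.

```python
# Desktop shares from Table 1 (percent of North American desktop usage).
# The boolean marks whether the browser understands the HTML5 <video> tag.
desktop_browsers = {
    "IE8": (24.83, False),
    "Chrome 16": (19.83, True),
    "IE9": (15.75, True),
    "Safari 5.1": (5.90, True),
    "IE7": (4.16, False),
    "Firefox 3.6": (2.95, True),
    "IE6": (0.89, False),
}

# Sum the share of every browser that cannot interpret <video>.
no_video_share = round(
    sum(share for share, has_video in desktop_browsers.values() if not has_video),
    2,
)

print(f"Desktop usage without <video> support: {no_video_share}%")
```

IE8, IE7, and IE6 together account for just under 30% of the desktop usage listed in the table.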

Mobile Optimization

While the desktop browser market may not be HTML5 video ready, I frequently hear that one segment of the market is HTML5-ready and has been for some time now: smartphones and tablets. By and large, it’s true: nearly every Android and iOS device supports HTML5 <video> playback, as shown in Table 1. Ironically, though, the HTML5 <video> options here are not good enough for the mobile platforms that so desperately need enhanced attributes. Mobile devices require special video care: specific video formats (H.264 Baseline for widest support), and video formats that are hardware accelerated to avoid unnecessary battery usage. Cross-browser HTML5 <video> support is limited to progressive downloads, which aren’t exactly mobile friendly. Yet most video delivered on the web (and to mobile!) still uses progressive download over HTTP as the primary delivery method.

Apple is the only browser vendor to introduce an HTTP streaming specification, called Apple HTTP Live Streaming (or HLS), that’s available on all iOS and now some Android devices. But this specification has not been adopted by other major vendors, including Microsoft, Mozilla, and Opera. Google is the only other vendor to implement HLS, in the native Android browser: universally in 3.0 versions, and with limited support (varying by device manufacturer and carrier) in 2.3 and higher versions. Another streaming specification that’s been introduced is MPEG-DASH, which has many promising mobile- and bandwidth-friendly features but has not yet been adopted by browser vendors.

However, a streaming protocol is just one aspect of being mobile (and desktop) friendly. The more important aspect of an HTTP streaming protocol is adaptive streaming, which enables the browser to determine the appropriate video quality (or bitrate) to play. Adaptive streaming requires the content producer to create two or more versions of the video per device target, each version having a specific bitrate that will play well for a given Internet connection speed. Apple requires any iOS application accessing video over cellular networks to use HLS adaptive streaming for any video playback within the application that is over 10 minutes in duration or over 5 MB of data in a five minute period (read more).
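
To make adaptive streaming a little more concrete: with HLS, the player is handed a master playlist that lists each bitrate version of the video, and it switches between them as bandwidth changes. Below is a rough Python sketch that writes such a playlist. The #EXTM3U and #EXT-X-STREAM-INF tags are standard HLS; the function name, bitrates, resolutions, and file paths are assumptions I made up for illustration, not recommendations.

```python
def build_master_playlist(renditions):
    """Return the text of an HLS master playlist.

    `renditions` is a list of (bandwidth_in_bits_per_second, (width, height), uri)
    tuples. Every value passed in below is an illustrative assumption.
    """
    lines = ["#EXTM3U"]
    # List the versions from lowest to highest bitrate for readability.
    for bandwidth, (width, height), uri in sorted(renditions):
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},RESOLUTION={width}x{height}"
        )
        lines.append(uri)
    return "\n".join(lines) + "\n"


# Hypothetical versions of one video, each with its own media playlist.
playlist = build_master_playlist([
    (1200000, (1280, 720), "wifi/index.m3u8"),
    (200000, (416, 234), "cell_low/index.m3u8"),
    (600000, (640, 360), "cell_high/index.m3u8"),
])

print(playlist)
```

Each URI points to a separate media playlist for one encoded version, which is why adaptive streaming multiplies the number of files a content producer has to create.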

Codec Support

Lastly, there are the video (and audio) codec issues with current HTML5 browsers. IE9, Safari Mac 3.1 and nearly all of the native smartphone browsers support the industry standard AVC/H.264 video codec, as well as the AAC audio codec. However, Firefox 4+, Opera 10.6+, and Chrome 10.1+ support the WebM video codec, a newly open-sourced video codec, and the open source Vorbis audio codec. (Firefox 3.5 and Opera 10.5 browsers also support an older open source video codec, Theora, which is not optimized for high quality video at lower bitrates as modern video codecs are.)

If you want to adhere to current HTML5 video standard(s) and not rely on plug-in based playback, then, you’ll need multiple versions of a video, each encoded with the codecs necessary across all HTML5 browsers. You’ll also need multiples of those multiples, if you want optimal mobile playback with adaptive streaming. Developers focused on “standards only” would have you believe it’s best to encode your video in four or five different versions. This approach adds considerable time and expense to any video distribution workflow for a content producer that has more than just a couple of videos to manage on their site.
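
To see how quickly the “multiples of those multiples” add up, here is a back-of-the-envelope Python sketch. The three codecs come from the discussion above; the device targets and the bitrates in each ladder are hypothetical numbers chosen only to show the arithmetic.

```python
codecs = ["H.264", "WebM", "Theora"]

# Hypothetical bitrate ladders (in kbps) for each device target.
bitrate_ladders = {
    "smartphone": [200, 600],
    "tablet": [600, 1200],
    "desktop": [1200, 2500],
}

# One file per codec if you only offer a single progressive download.
progressive_only = len(codecs)

# One file per bitrate per device target, for a single codec.
renditions_per_codec = sum(len(ladder) for ladder in bitrate_ladders.values())

# Every codec encoded at every bitrate for every device target.
all_codecs_adaptive = len(codecs) * renditions_per_codec

print(f"Single progressive file per codec: {progressive_only} encodes")
print(f"Adaptive ladder, one codec: {renditions_per_codec} encodes")
print(f"Adaptive ladder, all codecs: {all_codecs_adaptive} encodes")
```

With these assumed ladders, a single video turns into 18 encodes across three codecs, versus 6 if you standardize on one.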

Standards versus Business Goals

In many respects, a strict standards approach for online video delivery is not financially responsible for business. With the current state of HTML5 video, there are too many codecs to support, and limited options for delivery protocols. WebM is far from an industry standard, and we’re years away from seeing WebM hardware acceleration across devices and desktop that we currently enjoy with AVC/H.264. While HTML5 is “video ready” for smartphones, WebM as a viable codec option is not. Apple iOS doesn’t support WebM, and for reasons similar to Apple’s resistance to the Flash runtime, I don’t see Apple adopting WebM anytime soon. Simply put, if company X is producing the hottest and best-selling devices to the same folks you want to see your video content, then company X’s implementation is going to dictate how you publish that content. Right now and for the foreseeable future, company X, of course, is Apple. Apple and Microsoft have both rallied behind H.264 as the preferred video codec for video playback, from device to desktop to TV.

If you view the current online video landscape from a business perspective, you likely want to achieve one or more of the following:

  • Maximum viewership of your video. You want your video to play across mobile and desktop browsers with a consistent experience as much as possible.
  • SEO (Search Engine Optimization) friendliness. You want your video content recognized by as many search engines as possible.
  • Efficient encoding workflow. Video, unlike other HTML visual assets, requires substantially more time, effort, and storage. You want a manageable workflow that can reuse assets across delivery mechanisms.
  • Wide range of delivery options, from live streaming to protected content. Business stakeholders will increasingly want to monetize their online video content, and HTML5 video doesn’t provide methodologies to limit access to the video source files. Apple HLS and MPEG-DASH are the only options to stream and encrypt content, but only the former option is available now and only on iOS and limited Android devices.

The following factors should not, and likely will not, influence the business of online video:

  • Standards compliance. In the best of all possible worlds, we’d see a simple <video> tag that works consistently with the same format and delivery options everywhere. We won’t have that option anytime soon. First and foremost, sites that have video content want their target audience(s) to be able to watch the video with as little fuss as possible on both sides of the equation: reduced encoding and deployment effort for publishers, and reduced playback headaches for viewers.
  • Open source reliance. While WebM and Vorbis are open source codecs, the industry standard for high quality video production is H.264. It’s consistently used for video from professional video capture to Blu-ray discs to online video. Just as the video disc market couldn’t support HD DVD and Blu-ray simultaneously (or Beta and VHS), there’s likely only going to be one winner in the online video codec battle. While H.264/AAC codecs can incur royalty charges for subscription and pay-per-view use cases, the vast majority of online content is under the royalty-free clause of MPEG LA’s licensing terms. And if you’re lucky enough to be one of the few content creators that successfully monetizes their video online, your royalty obligations for H.264 usage to MPEG LA won’t prohibit you from being successful.

Note: As a quick example, Louis CK’s online-only sale of his standup special for $5.00 would likely only incur a royalty of $0.02 per purchase. For 100,000 copies sold (or $500,000 in gross revenue), he’s looking at roughly $2,000 in licensing fees for H.264 usage. Would he have been so successful with his venture had he only offered a license-free WebM version instead, which had limited playback support? (For more information on licensing terms of H.264, read this PDF on the MPEG LA site.)
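
The arithmetic in the note above is easy to check. The Python sketch below works in whole cents to avoid rounding surprises. One assumption to flag loudly: I’m treating the per-title royalty as the lower of a flat $0.02 or 2% of the sale price, which is my reading of the MPEG LA terms and should be verified against the license itself. The variable names are my own.

```python
PRICE_CENTS = 500       # $5.00 sale price, from the note above
COPIES_SOLD = 100_000   # from the note above

# Assumption: the royalty per title is the lower of a flat 2 cents
# or 2% of the price paid. Verify against the actual license terms.
FLAT_ROYALTY_CENTS = 2
PERCENT_ROYALTY_CENTS = PRICE_CENTS * 2 // 100  # 2% of 500 cents = 10 cents

royalty_per_sale_cents = min(FLAT_ROYALTY_CENTS, PERCENT_ROYALTY_CENTS)

gross_dollars = PRICE_CENTS * COPIES_SOLD / 100
total_royalty_dollars = royalty_per_sale_cents * COPIES_SOLD / 100
royalty_share_percent = total_royalty_dollars / gross_dollars * 100

print(f"Gross revenue: ${gross_dollars:,.0f}")
print(f"Total royalty: ${total_royalty_dollars:,.0f}")
print(f"Royalty as share of gross: {royalty_share_percent:.1f}%")
```

At $0.02 per purchase, 100,000 sales come to $2,000, or 0.4% of gross. A flat 2% of gross would instead be $10,000. Either way, the fee is small next to the revenue.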

A business goal-oriented approach deals with the problem of viewership and what’s widely available for your playback requirements. The usage numbers show that Flash Player remains very viable on the desktop, including on browsers that don’t play nicely with the predominant (read: Apple) mandates of H.264.

H.264, HTML5, and Flash: A Solution to All Online Video Problems

So, here we are. We have HTML5 standards telling us to create three versions of our video (H.264, WebM and Theora) or more if you’re using adaptive streaming and/or H.264 profiles and bitrates specific to smartphone, tablet, and desktop deployment. We have a significant portion of browsers in use today that don’t implement the <video> tag at all. We have a perception that the Flash Player is on its way out, and that HTML5 can do most if not all of the things that Flash can do. The solution to the problem is relatively simple: base your solution around H.264 and forget about supporting WebM and Theora. If you consider yourself a professional web solutions expert and want to offer your clients (and, more importantly, your target audience) the best mobile and desktop experience, you’ll also encode and deploy for adaptive streaming on iOS, Android (where supported), and the Flash Player.
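
As a rough illustration of an H.264-centered approach, the decision for any given viewer might look like the Python sketch below. This is only one possible reading of the strategy, not a finished player: the function name, the capability flags, and the order of preference (HLS first, then Flash, then plain HTML5 progressive download) are all my assumptions.

```python
def choose_delivery(html5_codecs, supports_hls=False, has_flash=False):
    """Pick a delivery method for a single H.264 source library.

    html5_codecs: set of codec names the browser can play in <video>.
    supports_hls: True if the device plays Apple HTTP Live Streaming.
    has_flash:    True if the Flash Player plugin is installed.
    The preference order is an assumption made for illustration.
    """
    if supports_hls:
        return "HLS adaptive stream (H.264)"
    if has_flash:
        return "Flash player (H.264)"
    if "H.264" in html5_codecs:
        return "HTML5 progressive download (H.264)"
    return "no supported playback"


# Hypothetical viewers, loosely based on the browsers in Table 1.
print(choose_delivery({"H.264"}, supports_hls=True))   # iPhone or iPad
print(choose_delivery(set(), has_flash=True))          # IE8 on Windows XP
print(choose_delivery({"Theora"}, has_flash=True))     # Firefox 3.6
print(choose_delivery({"H.264"}))                      # H.264 browser, no HLS, no Flash
print(choose_delivery({"Theora"}))                     # Firefox without Flash
```

Notice that WebM and Theora never appear as outputs: every branch serves the same H.264 encodes, and only the player around them changes.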

What an unfortunate mess of confounding options that we solution providers must present to decision makers. In my next post, I’ll outline the simplest and most effective solution to reduce the pain of distributing your video online.