which is a rarity in my 'Technology' section, but nonetheless, after my experiences of the last five weeks, a necessity for me to blow off some steam. If you're not a programmer you may want to skip this. But here goes:
The company which employs my services has recently come out with a new camcorder - one that does HD video. This encodes the video using a compression technology called H264, which is, of course, patented. And we are, of course, stuck with paying the patent license.
Now, as such things go, this one is not particularly onerous - the first 100,000 units are royalty free (I believe) with a slight charge per unit shipped after that. Our problem in this case, however, wasn't the patent. It was the lack of an open source or otherwise free codec for the PC.
Commercial suppliers all wanted more than $1 per unit for shipping a codec - and that's not inclusive of the patent license. And the only open source solution for Windows was ffdshow (short for ffmpeg and DirectShow, the M$ video technology), which not only includes the H264 codec, but damn near every other codec under the sun as well. Which would mean that we would have to pay patent license fees on those codecs as well, even though we weren't using them. Ridiculous. So my task was to modify ffdshow to use only the H264 codec. Should be easy enough, I surmised, after all, the complete source was available...
Some wag once extended Bismarck's statement that the two things people should not see being made are laws and sausages to include software. I think that fellow had an OSS project such as ffdshow in mind when he said that. The code base was begun in 2002 and originally did only XviD decoding. When that developer dropped it in 2005 a bunch of other coders from an online forum picked up the ball, added "tryout" to the end of the project name and proceeded to willy nilly throw everything but the kitchen sink into the code.
Remarkably, it actually works - mostly. But it is the worst pile of spaghetti code I've ever seen. It is the exact antithesis of what Eric Raymond claimed as the benefit of open source when he wrote The Cathedral and the Bazaar - obfuscated, uncommented, hacked. Monolithic doesn't begin to describe it - there is no way to even figure out program flow by reading the code - a debugger is required just to see what goes where, when! The use of object abstraction has been carried to a ridiculous extreme - even the simple "LoadLibrary" Windows call has been abstracted into its own class. Variable names mean nothing, and the use of macros is so extensive as to defy belief. There are macros that undef other macros just to set up a constant enumerated list, for example. Just loony ...
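For anyone who hasn't run into the define-and-undef trick, here is a minimal sketch of the general idiom (often called an "X-macro"). The codec names and the codec_name() helper are made up for illustration; this is not ffdshow's actual code, just the flavor of the thing.

```c
/* Hypothetical codec list, invented for illustration (not from ffdshow). */
#define CODEC_LIST \
    X(H264)        \
    X(XVID)        \
    X(MPEG2)

/* First expansion: each X(name) becomes an enum constant. */
#define X(name) CODEC_##name,
enum codec_id { CODEC_LIST CODEC_COUNT };
#undef X

/* Second expansion: the same list becomes a table of strings. */
#define X(name) #name,
static const char *const codec_names[] = { CODEC_LIST };
#undef X

/* Look up a printable name; returns "unknown" for out-of-range ids. */
static const char *codec_name(int id)
{
    return (id >= 0 && id < CODEC_COUNT) ? codec_names[id] : "unknown";
}
```

Used sparingly, this keeps an enum and its string table in sync. Stacked several layers deep with no comments, it makes it very hard to tell what the compiler actually sees.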
After about a week of trying to figure it out, I gave up and decided to roll my own DirectShow codec from scratch. As ffdshow was based on ffmpeg, which had the actual codecs, I didn't count on too much trouble. Whew!
Unlike the ffdshow codebase, ffmpeg is well designed, in straight C, with a reasonably straightforward API. The authors, however, have more than a bit of religious devotion to C, and insist on using a C99-compliant compiler to build the code. That cuts out both M$ and Codegear products (although Codegear C++ Builder comes close). But ffdshow has an MSVC solution file for the portion of ffmpeg that they used (called libavcodec), so I figured I could use that. And it built fine, and seemed to work.
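To give a feel for what "C99 compliant" means in practice, here is a small sketch of three C99 constructs that a C89-only compiler rejects. The struct, its fields and the id value are hypothetical, invented for illustration; this is not taken from ffmpeg, and I'm not claiming these are the exact constructs that matter in libavcodec.

```c
#include <stddef.h>

/* Hypothetical decoder descriptor, invented for illustration. */
struct decoder_desc {
    const char *name;
    int id;
    int max_threads;
};

/* C99 designated initializers: fields are named explicitly and any
   field left out (max_threads here) is zero-initialized. */
static const struct decoder_desc h264_desc = {
    .name = "h264",
    .id   = 28,   /* arbitrary illustrative value */
};

/* Sum n integers; the loop counter is declared inside the for
   statement, which is a C99 feature. */
static int sum_ints(const int *v, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* C99 compound literal: an unnamed array built in place. */
static int sum_of_three(void)
{
    return sum_ints((const int[]){1, 2, 3}, 3);
}
```

A C99 compiler such as GCC accepts all of this. A compiler that only speaks C89 in C mode chokes on each of the three.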
Next I downloaded a DirectShow Filter wizard for Visual Studio. And sure enough, it generated a transform filter shell. But it was only a shell, and the example only copied one video stream to another.
Still, a quick look at some older ffdshow code gave me some direction, and I soon had a DirectShow wrapper for ffmpeg. It turned me blue. Literally - a video of myself made me look like Papa Smurf. And it ran incredibly slow - the maximum frame rate I could get out of the thing was 5 fps, and my videos were at 30.
Microsoft has a uniform reputation for horrific developer documentation, and they didn't let me down with DirectShow. Interfaces piled on interfaces, pure virtual methods that weren't very virtual and certainly anything but pure. Macros out the wazoo. No real examples, and nobody else seems to have ever wrapped ffmpeg except the ffdshow folks. The only relevant book on the subject was published by Microsoft Press in 2003, rapidly went out of print, and has one copy available on Amazon for $215!
Google revealed several cryptic references to using ffmpeg in a DirectShow driver, but nothing solid. I was lost. I banged away on things for a week, looking at everything from calling conventions to MMX optimizations. Nada. Zilch. Slower than molasses running uphill in February.
So I finally posted to a couple of groups - one on Usenet, one on Yahoo - and dropped a note to a French DirectShow trainer. The trainer got back to me immediately, offering his services for €1200 per day! No way the company was going to pony up for that, but I think he took pity on me: he asked if I'd bundle and send him the code so he could take a quick look and make some suggestions.
It was in the process of bundling it that I got my first break. In looking over the finished builds, I noted that the libavcodec hadn't been rebuilt the last time I compiled the whole project. I had an error in the build setup for Visual Studio - the release build was being put up in the wrong place! (It was too many dots, so to speak.) I quickly fixed it and suddenly I was getting frame rates approaching 15fps! An improvement by a factor of three! Still not fast enough, but at least I was moving again.
Being used to either makefiles or the old C++ Builder build environment, I could see how easy it was to make this mistake. I can't believe that Codegear has actually gone to the M$ Build technology in their newer compilers. There are so many places for settings, most all of which are hidden down tree views, that it's almost impossible to be sure that you have things set right by looking over it. But whatever.
Another few days of putzing with no results led me to try a more radical approach. I found a makefile in the ffdshow code to build libavcodec with Msys and GCC. I built it that way, outside the M$ Visual Studio environment, and bingo! Full speed ahead!
Nowhere in the limited documentation for ffdshow did it even remotely give a hint that compiling libavcodec under MSVC would fail. Nowhere. So much for open source transparency.
But I still had my colorspace issue. I still looked like Papa Smurf. I also noticed that some other H264 encoded videos that I downloaded had an interlace problem. A few experiments with GraphEdit also showed that if I put the stock colorspace converter between my filter and the video renderer the problems disappeared.
Another day of Googling turned up the cause - some functions in ffmpeg have trouble with some forms of YUV color encoding. As soon as I changed the default output of my filter to a different YUV colorspace, all the problems disappeared, and I was running full tilt in living color.
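For the curious, here is one common way a YUV mix-up turns people blue. This is only an illustrative sketch, not a diagnosis of what ffmpeg was doing in my case. It assumes full-range BT.601 conversion and a made-up skin-tone pixel, and the function names are hypothetical. If the two chroma planes get swapped (for instance YV12 data, which stores V before U, read as I420, which stores U before V), red and blue effectively trade places.

```c
/* One RGB pixel, 0-255 per channel. */
struct rgb { int r, g, b; };

/* Round to nearest and clamp into the 0-255 range. */
static int clamp255(double x)
{
    if (x < 0.0) return 0;
    if (x > 255.0) return 255;
    return (int)(x + 0.5);
}

/* Full-range BT.601 (JPEG-style) YUV -> RGB conversion. */
static struct rgb yuv_to_rgb(int y, int u, int v)
{
    double cb = u - 128.0;
    double cr = v - 128.0;
    struct rgb out;
    out.r = clamp255(y + 1.402 * cr);
    out.g = clamp255(y - 0.344136 * cb - 0.714136 * cr);
    out.b = clamp255(y + 1.772 * cb);
    return out;
}

/* The same pixel with the chroma samples swapped, as happens when
   the U and V planes are read in the wrong order. */
static struct rgb yuv_to_rgb_swapped(int y, int u, int v)
{
    return yuv_to_rgb(y, v, u);
}
```

Feed in a warm pixel such as (Y=160, U=110, V=150) and the normal path comes out with more red than blue, while the swapped path comes out with more blue than red. Hello, Papa Smurf.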
That's how I spent the last five weeks. The resulting H264 decoder will be released under the GPL in a week or so.
All of my headaches could've been spared had other coders taken the time to document or at least comment their code. The judicious application of lessons learned in computer science classes would've helped too - there's no way ffdshow can be maintained once the current crop of developers leaves the scene. And I think I'll go back to make and the command line for building - at least until M$ and Codegear get their shit together and put a GUI on top that's actually usable and makes sense.
Ah, it's a programmer's life for me!
And what I wouldn't give to be a full time farmer.