Not long ago I had a chance to visit the big data center at 365 Main Street in San Francisco. I was invited by friends to help them install the first servers for their startup, which is still in stealth mode. The data center was enormous, though my friends occupied only a small part of one rack not far from the Oakland Raiders and one floor up from Bebo, the social network bought not long ago by AOL. Bebo is a big hit in the UK and I found it odd that all those British profiles are hosted in San Francisco, eight time zones away.
We installed the servers — three little boxes and one big one — then fired them up. In moments the site was live, though with only a few alpha clients. The application, which I am sworn not to mention yet, is clever, but not particularly resource intensive. It’s just like any other web site only a little different in several good ways. The three little web servers just sat there blinking, doing their jobs. But the bigger box (a 3u, versus the 1u web servers, if you are keeping track) was wailing right from the moment of booting. Inside were three 15,000-rpm drives, a bunch of processor cores, and a ton of RAM — all of it thrashing away despite the small load. What was going on?
What was going on was the twilight of enterprise application development as we know it today. That 3u box was a database server that sits behind the three little web servers and makes this new enterprise work. And work hard, apparently, for all the drive thrashing and lights blinking.
We’re at an interesting point in the development of computer technology. Processors, having been failed somewhat by Moore’s Law in their attempt to become more powerful by widening data paths and raising clock speeds alone, have now resumed or even accelerated their performance growth by replacing one processor core with 2, 4, 8, and eventually hundreds of cores, most of them not really needed.
Processing power isn’t what binds enterprise or Internet applications today. I/O and disk access do that. Servers have one or two gigabit Ethernet connections, each of which could be easily saturated by an old Pentium 4. It’s the pipe that limits us, not our ability to pump bits through that pipe. Thanks to the gamers, I suppose, and to a surreal and not particularly useful competition between Intel and AMD, the main server CPUs are barely sweating even though they are running the core business logic of the application. It’s the database server with its disk drives that is working so hard, grabbing data to feed the web servers seemingly just in time. But don’t blame the hardware here or even the disk drives: blame the database.
We’re at the apex of SQL database development. It’s 1890 and we make the best darned database buggy whips on Earth.
There is a better way to handle large volumes of data and that better way has been established, not surprisingly, by Google with its BigTable semi-structured database that essentially caches the entire Internet. HBase from Hadoop is the Open Source version of BigTable and both are rapidly making old SQL databases like Oracle and DB2 obsolete for certain users.
Amazon.com runs on an Oracle database, but one that was extended and optimized at a cost of more than $150 million. Amazon probably represents the most that one can do with SQL in terms of scalability. Anything bigger requires a completely new approach like BigTable.
Or maybe it isn’t so new at all. I recall something very analogous to BigTable during the network operating system wars of the 1980s. Microsoft had a couple dozen OEMs working on network operating systems based on the hierarchical file system of DOS 2.0 (Paul Allen’s last technical contribution to Microsoft). While a hierarchical file system may have made some sense for a workstation it made little to no sense for a server accessed by dozens of workstations in the view of the programmers at Novell, where Netware was being born at the time. Those guys ignored the hierarchy and wrote the entire File Allocation Table for each drive to memory as a single flat file called an Indexed Turbo FAT. Where the DOS-based network operating systems had to search the disk for files, Netware had the entire index loaded in memory and instantly knew where the target data could be found. The system was easily 100 times as fast. BigTable takes this a step further, I suppose, by ignoring the distinction between index and data, dramatically expanding the memory footprint but, at the same time, completely eliminating a retrieval step.
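For the programmers in the audience, here is a minimal sketch of the difference between searching a directory on disk and keeping the whole index in memory. It is only an illustration: the directory entries, the block numbers, and the function names are all made up, and this is not Novell's actual data layout or code.

```python
# Hypothetical directory entries: (filename, starting_block).
# Invented for illustration; not the real NetWare or DOS on-disk format.
DISK_DIRECTORY = [
    ("report.txt", 120),
    ("payroll.dat", 348),
    ("memo.doc", 77),
]


def find_by_scanning(name):
    """Walk the directory entry by entry, counting simulated disk reads."""
    reads = 0
    for entry_name, block in DISK_DIRECTORY:
        reads += 1  # each entry examined stands in for one trip to the disk
        if entry_name == name:
            return block, reads
    return None, reads


def build_memory_index():
    """Load every entry into a dict once, at startup."""
    return {entry_name: block for entry_name, block in DISK_DIRECTORY}


INDEX = build_memory_index()


def find_in_memory(name):
    """A single hash lookup against the in-memory index, no disk reads."""
    return INDEX.get(name)


if __name__ == "__main__":
    print(find_by_scanning("memo.doc"))
    print(find_in_memory("memo.doc"))
```

The scan gets slower as the directory grows, while the in-memory lookup stays roughly constant. The price is the RAM needed to hold the index, which is the same trade described above.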
An irony of BigTable and Indexed Turbo FATs is that both Google and Novell were pretty upfront about what they were doing and why, yet competitors have remained bound to lower-performing technologies because, well, just because.
Which brings us back once again to Oracle buying Sun, a deal that has continued to bug me because it didn’t make sense… UNTIL I thought about it in terms of the scalability of SQL architectures and market positioning.
Right now almost every web application has an Apache server fronting a database box running MySQL or a closed-source equivalent such as Oracle, DB2, or SQL Server. The data bottleneck in all those applications is the SQL box, which is generally doing a very simple job in a very complex manner that made total sense for minicomputers in 1975 but doesn’t make as much sense today. Five years from now the situation will be very different with HBase running everywhere, the dedicated SQL box eliminated completely, and the database shared across redundant web servers like a micro-Google.
Where does this leave Oracle?
It leaves Oracle bleeding its big stupid corporate customers for another decade but eventually losing both the bottom half of the market and the very top where applications scale to tens of thousands of servers.
Part of the distinction here is between running a mobile phone billing system in one case and Facebook in another. In the mobile phone example you’d better get all those minutes or money will be lost. But in the Facebook example reality is more approximate and if an update propagates more slowly than expected, well, big deal, so you missed Little Johnny’s birthday pictures for an extra 20 seconds. There are even business software cases where this philosophy applies. Progressive Insurance, for example, is always ready to give you a comparison price quote for auto insurance not because they can generate that quote (and the price quotes of their major competitors) on the fly, but because THEY GENERATE A SPECULATIVE PRICE QUOTE FOR EVERY CAR IN AMERICA EVERY NIGHT. They don’t generate a quote when you call, they just access it because it is already done.
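The precompute-everything idea is easy to sketch. What follows is a toy version under loud assumptions: the car records, the pricing formula, and the function names are invented for illustration and have nothing to do with how Progressive actually prices a policy.

```python
# Hypothetical car records; the VINs and ages are made up.
CARS = [
    {"vin": "VIN001", "age_years": 3},
    {"vin": "VIN002", "age_years": 10},
]


def compute_quote(car):
    """Stand-in for an expensive pricing model. The formula is invented."""
    return 500 + 10 * car["age_years"]


def nightly_batch(cars):
    """Run once per night: compute and store a quote for every known car."""
    return {car["vin"]: compute_quote(car) for car in cars}


def quote_on_call(quote_table, vin):
    """When the customer calls, just look up the stored answer."""
    return quote_table.get(vin)


if __name__ == "__main__":
    table = nightly_batch(CARS)
    print(quote_on_call(table, "VIN001"))
```

The stored quote may be up to a day stale, which is exactly the kind of approximate reality the Facebook example tolerates and the phone-billing example does not.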
So Oracle keeps the mobile phone company as a customer but doesn’t keep Progressive in this example. And in the long run there’s enough data redundancy built into the loosey-goosey HBase model that it becomes just as reliable as the more rigorous SQL model that it is inexorably replacing. That’s when Oracle loses the mobile phone company, too.
Larry Ellison won’t like that.
So what’s to be done? Buy Sun. Get into the database appliance business. Start selling highly tuned database appliances that achieve the simultaneous goals of vertical integration (making profit on the hardware as well as the software), obfuscation (keeping the customers out of the lower-level code by encasing it in an appliance), and increased overall performance (putting off the inevitable loss of market dominance for another three years through a hardware tour de force).
IBM, as the other big SQL company, doesn’t really share Oracle’s problem, because IBM makes money from the hardware already. If DB2 gives way to something like HBase, IBM will run HBase on its premium iron — a luxury Oracle can’t share without buying Sun.
As hardware gets cheaper we extend performance by distributing software across more and more machines. But that distribution in itself undermines the lucrative software licensing system. So we introduce a new level of abstraction — the database appliance. Prices will go up a little while performance will go up a lot. Customers will think they are getting more for their money and they will be. But the ultimate comparison that has been at least postponed is between paid and free, where free always wins in the end.
And THAT’s why Oracle NEEDS Sun — to extend its current run by another three years, buying Larry time to write an Act II for his company.