April 3, 2007
You can find a presentation by an architect at eBay here. For eBay the database is the bottleneck in the entire operation: 26 billion SQL queries every day! They go to heroic efforts to split the DB into smaller chunks and take the load off the DBs. First, they implement transactions manually. Transactions are very expensive because the DB records everything that happens just in case it needs to roll back after a failure. But 99% of the time the transaction succeeds and all that work is thrown out. So they are very careful to implement minimal transactions manually without involving the DB. Second, they implement joins, sorts and other goodies in code, not in the DB. The DB is now just a dumb container for data without any fancy features. Everything is hidden by a data access layer. Third, they cache objects at every level of their enterprise. It is cheaper and easier to horizontally scale the web farms and application servers than the DBs. Whenever a write operation goes to a database, they use multicast to invalidate the affected cache entries. You don’t ever want the caches to poll the DB for changes. Fourth, everything is asynchronous and loosely coupled. Obvious.
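To make the second point concrete, here is a minimal sketch of a join and a sort done in the data access layer rather than in SQL. This is not eBay's code. The table names, columns and function names are made up for illustration, and an in-memory SQLite database stands in for the real DB.

```python
import sqlite3


def fetch_items_with_sellers(conn):
    """Join items to sellers and sort by price in application code."""
    # Two plain single-table reads: the database does no join and no sort.
    sellers = dict(conn.execute("SELECT id, name FROM sellers"))
    items = conn.execute("SELECT title, price, seller_id FROM items").fetchall()

    # Hash join on seller_id, done in the application tier.
    joined = [
        (title, price, sellers[seller_id])
        for title, price, seller_id in items
        if seller_id in sellers
    ]
    # Sort in the application tier as well.
    return sorted(joined, key=lambda row: row[1])


def demo_connection():
    """Build a tiny in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sellers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE items (title TEXT, price REAL, seller_id INTEGER);
        INSERT INTO sellers VALUES (1, 'alice'), (2, 'bob');
        INSERT INTO items VALUES
            ('lamp', 30.0, 1), ('book', 5.0, 2), ('desk', 120.0, 1);
        """
    )
    return conn


if __name__ == "__main__":
    for row in fetch_items_with_sellers(demo_connection()):
        print(row)
```

The CPU cost of the join and the sort moves to the application servers, which are the cheap tier to scale out. The price is that the application has to pull more rows over the wire than a joined query would return.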
Most of these features can be implemented with existing technologies, except implementing DB features in the data access layer. Only the top 10 websites would need to do that; the rest of us can rely on the DB features. We can learn from the other lessons, though. Caching is vital. Reading from a memory cache is much faster than hitting a slow disk. Some sites will fill a machine with memory just to use as a cache (which works great on 64-bit machines). You can use reliable multicast (queues) to notify caches of changes in the DB, though this is likely slower than eBay’s hand-rolled solution. ASP.NET can cache web pages and you can use the state server process as an in-memory cache. You don’t want to use SQL Server as the cache because then you’re putting more work on the DB. The goal is always to reduce the load on the DB. There are many other lessons in eBay’s design, and you can see them in other system architectures too.
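The invalidate-on-write idea can be sketched in a few lines. This is only an illustration under simplifying assumptions: a dict stands in for the database, a loop over in-process objects stands in for multicast or a queue, and the class names `CacheNode` and `DataAccessLayer` are hypothetical.

```python
class CacheNode:
    """In-memory cache held by one web or application server."""

    def __init__(self):
        self.entries = {}

    def invalidate(self, key):
        # Drop the stale entry; the next read repopulates it from the DB.
        self.entries.pop(key, None)


class DataAccessLayer:
    """Hides the database and keeps every cache node consistent on writes."""

    def __init__(self, database, caches):
        self.database = database  # a dict standing in for the real DB
        self.caches = caches      # every cache node that must hear about writes
        self.db_reads = 0         # counts how often a read reached the DB

    def read(self, cache, key):
        if key in cache.entries:
            return cache.entries[key]  # cache hit: no load on the DB
        self.db_reads += 1
        value = self.database[key]
        cache.entries[key] = value
        return value

    def write(self, key, value):
        self.database[key] = value
        # Stand-in for the multicast message: push the invalidation to
        # every cache instead of letting the caches poll the DB.
        for cache in self.caches:
            cache.invalidate(key)


if __name__ == "__main__":
    db = {"item:1": "lamp"}
    web_a, web_b = CacheNode(), CacheNode()
    dal = DataAccessLayer(db, [web_a, web_b])
    dal.read(web_a, "item:1")          # miss, goes to the DB
    dal.read(web_a, "item:1")          # hit, served from memory
    dal.write("item:1", "desk")        # invalidates both caches
    print(dal.read(web_a, "item:1"), dal.db_reads)
```

The caches never ask the DB whether something changed. The write path tells them, so reads that hit the cache cost the DB nothing.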