April 2, 2007
I’d like to build a large-scale web site that sits on top of Amazon’s EC2 and S3 services. The novelty is that it should scale dynamically to handle peak loads, rather than require lots of manual fussing. By “dynamically” I mean it should start up new servers to handle increased load, and shut those servers down when the load decreases. It should launch different types of servers depending on the type of load. Everything should be automatic. Here are the phases it might transition between:
- A single machine with Apache and MySQL. Apache will cache as much as possible.
- Split the single machine from the first phase into a web server and a separate DB machine.
- Launch a Squid proxy and dynamically add more web servers as demand increases. Need to measure the load on all web servers so that if all of them have a sustained 80% load, another server is added automatically. The new server must be dynamically added to mod_proxy’s list.
- Add more Squid caches between the proxy and web servers.
- Split the database into hot and cold data storage.
I imagine that a master machine would continuously monitor the whole site and make decisions about which servers must be added or removed. All I need is an idea for a large web site to test this thing on.
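To make the master machine's decision rule a bit more concrete, here is a rough Python sketch of how it might decide when to add or remove a web server. Only the 80% figure comes from the list above. The 30% scale-down threshold, the five-sample window that stands in for "sustained", and the names `LoadHistory`, `record` and `decide` are all assumptions made up for illustration. Actually launching EC2 instances and updating the proxy's server list is left out.

```python
from collections import deque

SCALE_UP_THRESHOLD = 0.80    # from the post: sustained 80% load on all web servers
SCALE_DOWN_THRESHOLD = 0.30  # assumption: the post gives no scale-down figure
WINDOW = 5                   # assumption: consecutive samples that count as "sustained"


class LoadHistory:
    """Hypothetical helper: keeps the most recent load samples per web server."""

    def __init__(self, window=WINDOW):
        self.window = window
        self.samples = {}  # server name -> deque of recent load fractions (0.0 to 1.0)

    def record(self, server, load):
        """Store one load sample for a server, dropping the oldest if the window is full."""
        self.samples.setdefault(server, deque(maxlen=self.window)).append(load)

    def _sustained(self, server, predicate):
        """True if the server has a full window and every sample satisfies predicate."""
        history = self.samples[server]
        return len(history) == self.window and all(predicate(x) for x in history)

    def decide(self):
        """Return 'add', 'remove' or 'hold' based on the load of every known server."""
        if not self.samples:
            return "hold"
        servers = list(self.samples)
        if all(self._sustained(s, lambda x: x >= SCALE_UP_THRESHOLD) for s in servers):
            return "add"
        # Never scale below one web server.
        if len(servers) > 1 and all(
            self._sustained(s, lambda x: x <= SCALE_DOWN_THRESHOLD) for s in servers
        ):
            return "remove"
        return "hold"


if __name__ == "__main__":
    history = LoadHistory()
    for _ in range(WINDOW):
        history.record("web1", 0.92)
        history.record("web2", 0.85)
    print(history.decide())
```

Requiring a full window of samples before acting is one simple way to avoid reacting to a short spike. A real monitor would probably also want a cool-down period after each change so that a freshly launched server has time to take load.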