August 17, 2007
Two security experts described some ways to improve voting machines. In designing a secure voting machine, the central assumption should be that the machines will be hacked. Therefore, it’s foolish to rely on one machine from one company. Instead, as these guys describe, one machine should generate a machine-readable paper record of the vote that can be verified by the voter and scanned in by another machine. A voting site might have a dozen voting machines and one optical scanner. It’s still possible to hack the optical scanner to manipulate votes, so I propose that each voting machine map the candidate names to a set of random numbers: (Machine1 Bush 8204). Every machine’s mapping is different. After voting, the receipt will say you voted for Bush, but the optical scanner will read the number 8204 (and 2947 from machine 2, 1695 from machine 3, etc.). The scanner tabulates the number of votes for 8204, but it has no idea what that means; a hacker therefore couldn’t manipulate votes, because he doesn’t know what these numbers represent. The scanner sends these random numbers to the central computer, which also has no idea what they represent. Finally, the voting machines send their mappings and votes to a second central computer. Now the central computers can map (8204 -> Bush) to determine the final tally and verify it against each voting machine’s records. The central computer can send a report back to each voting station for poll workers to manually verify the total against the voting machines’ numbers.
How can a hacker defeat this system? If he hacks the optical scanner or the first central computer, he’ll only see a bunch of meaningless random numbers. If he hacks the second central computer (the one that gets the mappings and votes), he can’t do anything because the results must match the first central computer. If he hacks both central computers, he still can’t do much because the numbers will be double-checked against the voting machines. If he hacks all the voting machines, he could assign Bush’s number to Gore; the voter would then see “Gore” on his paper ballot, but the optical scanner would read Bush’s number. However, the poll workers can spot-check receipts against each machine’s mapping to catch that kind of swap. This seems like a decent solution, though it’s rather complicated. It needs to be simplified because poll workers are fairly dumb. You can’t rely on them to do anything right, especially when there are several independent steps. Nevertheless, I don’t think it’s difficult to build a secure electronic voting system. It would certainly be different from the one we’ve got today.
August 14, 2007
WS-BPEL is an XML specification for a distributed programming language for coordinating web services. With it you can write a “business process” that sends data to A, then B, then collects the response. Unfortunately, it looks like all the intermediate data goes through the WS-BPEL execution engine, which makes the engine the bottleneck in a distributed system. I think a WS-BPEL spec should be converted into continuation-passing style (CPS) and executed by the services themselves, not by the execution engine. That is, you send data to A; A sends its output to B; B sends its output back to you. See? You’ve avoided receiving A’s intermediate data. In a sequence of a dozen services, you can avoid receiving a dozen intermediate values. With existing technologies (AFAIK) like BizTalk, you receive all the intermediate values; therefore, BizTalk is the bottleneck in a distributed system. Nevertheless, WS-BPEL, now that I sorta understand it, is really useful and powerful.
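Here’s the CPS idea as a toy sketch. The services are plain functions for illustration; in a real deployment each hop would be a network call carrying the payload plus the list of remaining services, so no central engine ever touches the intermediate data.

```python
# Hypothetical sketch: each "service" receives a payload plus the
# remaining services (the continuation), does its work, and forwards
# the result itself rather than returning it to a central engine.

def service_a(x):
    return x + 1

def service_b(x):
    return x * 2

def invoke(payload, continuation):
    """Deliver the payload to the next service in the continuation.
    When the continuation is empty, the result goes back to the caller."""
    if not continuation:
        return payload
    head, *rest = continuation
    return invoke(head(payload), rest)

result = invoke(3, [service_a, service_b])  # (3 + 1) * 2
```

The caller only sees the final result; A’s intermediate output went straight to B.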
August 8, 2007
Stream programming is a style where one describes a graph of interconnected actors that process a stream of data. See StreamIt. The data can be split and joined; therefore, it’s not the typical simple linear stream found in I/O libraries. There are three types of parallelism to consider:
- Task parallelism: two tasks running in parallel on separate data streams
- Data parallelism: a task that can be applied independently to every element in a data stream
- Pipeline parallelism: a linear chain of tasks where each stage can work on a different stream element at the same time
I’d like to build a stream programming language that runs on top of Hadoop, an Apache project that implements something like Google’s distributed filesystem, MapReduce, and BigTable. MapReduce is, I think, a subset of stream programming (though I’m not sure about “reduce”). Anyway, I need a benchmark that is large enough to exercise a group of machines on Amazon’s EC2.
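To make the split/join idea concrete, here’s a toy stream graph in the StreamIt spirit. All the names are mine; the two filters would run as separate tasks in a real runtime, but run sequentially here.

```python
# A source stream is split round-robin, two independent filters process
# the halves (task parallelism in a real runtime), and a join merges
# the results back into one stream.

def source(n):
    return list(range(n))

def split(stream):
    """Round-robin split into two substreams."""
    evens, odds = [], []
    for i, x in enumerate(stream):
        (evens if i % 2 == 0 else odds).append(x)
    return evens, odds

def square(xs):
    # A data-parallel filter: applies independently to every element.
    return [x * x for x in xs]

def negate(xs):
    # A second, independent filter on the other substream.
    return [-x for x in xs]

def join(a, b):
    """Round-robin join back into one stream."""
    out = []
    for x, y in zip(a, b):
        out.extend([x, y])
    return out

left, right = split(source(6))          # [0, 2, 4] and [1, 3, 5]
result = join(square(left), negate(right))
```

Each filter here is also data parallel, so in principle every element could be processed on a different machine.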
July 10, 2007
Dare Obasanjo is doing what my lazy ass can’t do: he’s developing a Facebook app just to see what the architectural hurdles are for large-scale web sites. In this post, he claims that Amazon’s AWS won’t work because it doesn’t provide persistent storage. If the virtual instance containing your DB goes down, you lose all your data unless you explicitly back it up in S3. Remember, the hard drive in EC2 is virtual and transitory. This seems like a problem that must have a reasonable solution.
On the Amazon Web Services blog, Jeff suggests having one DB instance store incremental backups into S3. For recovery, a DB instance rolls up all the incremental logs into a new snapshot of the DB. I’d suggest running this recovery instance frequently so you regularly have an up-to-date DB snapshot. That way you can launch more instances using this fresh data, the recovery time will be much quicker, and you don’t end up with terabytes of incremental logs in S3. Most of the complaints in the comments are easy to solve. A few issues are really MySQL problems (transaction logs?) that would trip you up on your own server farm too.
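The roll-up step is simple in outline. In this sketch, plain dicts and lists stand in for S3 objects and there are no real AWS calls: the “database” is a key-to-value map and each incremental log is a list of writes replayed onto the last snapshot.

```python
# Periodically replay the incremental logs, oldest first, onto the
# latest snapshot. The result becomes the new snapshot, and the
# replayed logs can be deleted from S3, keeping recovery fast.

def roll_up(snapshot, incremental_logs):
    """Apply incremental logs to a snapshot; return the fresh snapshot."""
    db = dict(snapshot)
    for log in incremental_logs:
        for key, value in log:
            db[key] = value
    return db

snapshot = {"users": 100}
logs = [[("users", 150)], [("users", 180), ("posts", 12)]]
fresh = roll_up(snapshot, logs)   # {"users": 180, "posts": 12}
```

A new EC2 instance then boots from `fresh` instead of replaying the whole log history.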
It seems like people are trying to get existing software to work on AWS without much fiddling. But AWS is significantly different from a conventional server farm, thus requiring different solutions. Most of the startups (even big ones) presenting at New York’s Tech Meetup are using AWS and swear it’s the greatest thing ever. And I think people should consider combining AWS with their own hardware. For example, run the masters on your own server farm, but run the slaves and caches in AWS. I finally got into the beta program for AWS, so I hope to try some of this out someday.
July 6, 2007
I finally think I understand the performance difference between the various VMware products. Obviously they stratify their feature set so they can charge more. From slowest to fastest:
- VMware Server allows VMs to run headless, and you can connect to them remotely. This means the graphics always run “remotely,” like VNC or Remote Desktop, which is quite slow.
- VMware Player is like Server, but the graphics run directly, so it’s faster. The CPU performance is the same, AFAIK.
- VMware Workstation installs specialized drivers in the host OS that improve access to networking and hard drives. Overall performance is better, but I don’t know if CPU performance is better too.
- VMware ESX Server is a hypervisor, which allows VMs to run very fast. The specialized Linux host OS doesn’t get in the way too much.
Installing VMware Tools into the guest OS turns this into paravirtualization, i.e., the guest OS is modified to run better within the VM bubble. I wish someone would post the relative performance differences between these implementations. I’ll run PassMark’s performance test in Player and Server and post the results sometime.
June 6, 2007
At a series of tech talks, many startups have indicated they are building their apps on top of Amazon’s web services (AWS). Everyone loves AWS. I wonder if anyone has implemented a dynamic version of Linux Virtual Server (LVS) on AWS? By “dynamic” I mean a cluster that grows and shrinks based on demand. The heartbeat process would need to monitor the load on the cluster and dynamically start a new server whenever demand increases. It would then need to modify the load balancer to include the new server. Since the web servers are identical, it should be easy to reuse the same image. The only tricky part is connecting all these machines together at runtime using the IP addresses Amazon generates. It would be nice if there were a fairly automatic open-source solution for this.
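The heartbeat loop might look something like this. Everything here is a hypothetical stand-in: a real version would call the EC2 API to launch and terminate instances and rewrite the LVS real-server list (e.g. via ipvsadm) instead of using these stubs.

```python
# Stubs standing in for the EC2 API and the load-balancer config.
_counter = 0

def launch_instance():
    """Stub: pretend to boot a copy of the web-server image; return its IP."""
    global _counter
    _counter += 1
    return f"10.0.0.{_counter}"

def terminate_instance(ip):
    pass  # stub: would call the EC2 API to shut the instance down

def update_load_balancer(servers):
    pass  # stub: would rewrite the LVS real-server list

def scale(cluster, avg_load, min_size=1, high=0.8, low=0.2):
    """One heartbeat tick: grow when average load is high, shrink when
    low, then push the new server list to the load balancer."""
    cluster = list(cluster)
    if avg_load > high:
        cluster.append(launch_instance())
    elif avg_load < low and len(cluster) > min_size:
        terminate_instance(cluster.pop())
    update_load_balancer(cluster)
    return cluster

cluster = scale(["10.0.0.0"], avg_load=0.9)   # grows to 2 servers
cluster = scale(cluster, avg_load=0.1)        # shrinks back to 1
```

The thresholds and one-server-per-tick policy are arbitrary; a real controller would want hysteresis so it doesn’t thrash.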
April 26, 2007
Here’s a summary of the evolution of the MySpace architecture as described here:
- Initial system: ColdFusion on 2 web servers + 1 SQL Server, added more web servers as load increased
- >500K users: Add more DBs. 1 master for all writes, many slaves for reading (initially 2 slaves).
- >1M: DB I/O maxed out. Vertical partition of DB, i.e. separate DBs for different functions. Switch to SAN.
- >3M: Single “logical” distributed database. Giant tables are split into chunks running on different machines.
- >9M: Replace ColdFusion with ASP.NET/C#.
- >10M: SAN I/O maxed out. Switch to SAN from 3PARdata.
- >17M: Added caching servers in front of DBs.
- >26M: Switch to 64-bit computing w/ SQL Server 2005 and Windows Server 2003. Machines are loaded with 64 GB of memory.
It seems like a new web site should start with 64-bit servers, a compiled language (Java/C#), a distributed DB design, and caches everywhere. This basic design should scale for quite a while by simply throwing more hardware at the problem. Maybe.
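One piece of that design, sketched: a cache-aside read path in front of a partitioned DB. Dicts stand in for memcached and the DB shards, and all names are mine, not MySpace’s.

```python
# Two DB partitions and a cache in front of them. Reads check the
# cache first and fill it on a miss; writes go to the right shard
# and invalidate the cached copy.

SHARDS = [dict(), dict()]   # stand-ins for two DB partitions
CACHE = {}                  # stand-in for a caching server

def shard_for(user_id):
    """Pick the partition that owns this user (hash partitioning)."""
    return SHARDS[user_id % len(SHARDS)]

def get_profile(user_id):
    if user_id in CACHE:
        return CACHE[user_id]           # cache hit: no DB I/O
    profile = shard_for(user_id).get(user_id)
    if profile is not None:
        CACHE[user_id] = profile        # fill cache on a miss
    return profile

def put_profile(user_id, profile):
    shard_for(user_id)[user_id] = profile
    CACHE.pop(user_id, None)            # invalidate on write

put_profile(1, "alice")
get_profile(1)   # miss: reads shard 1, fills the cache
get_profile(1)   # hit: served from the cache
```

Adding capacity then means adding shards and cache boxes, which is the “throw hardware at it” story above.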