Open Source Software in E-voting

(This is a highly abbreviated version of a forthcoming paper)

Approximately one year ago, California Secretary of State Kevin Shelley handed down the first open source software mandate (PDF) issued by any U.S. government official. The mandate covers one specific subsystem of electronic voting machines: the system that verifies the selections on an AVVPAT for disabled voters must run on open source software. Similarly, Rush Holt’s bill (HR 2239, 108th Congress) would require that “No voting system shall at any time contain or use undisclosed software.”

There have been attempts in the past at coding open source election software for Internet voting, for example Lorrie Cranor’s Sensus, MIT’s EVOX and Jason Kitcat’s GNU.FREE. All of these systems are now unmaintained, and voting over IP (a/k/a the other VoIP) has been fundamentally discredited by David Jefferson, Avi Rubin, Barbara Simons and David Wagner in their report examining the Pentagon’s SERVE project.

However, there hasn’t been a serious effort at developing an open platform for polling place voting until recently. In the past few years, the Open Voting Consortium (OVC) has developed software and a design for an open voting platform. The software, EVM2003, is written in Python using XML ballot specifications and can run on a standard PC using a CD-bootable GNU/Linux distribution called Knoppix. The OVC is currently in the middle of a fundraising campaign to get “1111 subscribers by 11/11” to raise money for their efforts from small, grass-roots donations of $10 a pop; if you have a decent salary and fewer than two kids, consider giving them a few bucks. (None of which should be confused with the Open Vote Foundation, which intends to fork eVACS, the Australian e-voting software that used to be open source but no longer is.)

Why open source?

Elections have a long history of being open processes. Openness is one of the central aspects that lends legitimacy to the electoral process. Even with lever-based election machines, which have been around since the last few decades of the 19th century, it was still possible for election officials to hire a reasonably competent engineer to open the machine up and verify that, in fact, the right gears were turning the proper number of times.

Over the past two decades, computerized technology has become a growing element of election administration and many parts of voting technology are now enshrouded in mystery. Computer software is subject to all types of intellectual property protection (copyright, patents, trade secrets and trademark) and electronic voting machine vendors are notoriously protective of their products. They do business in a small and highly competitive market that has just seen a large injection of $3.9 billion from HAVA. While any trade secret that they may hold is in no way rocket science, you can imagine that their implementation of software and hardware would give their competitors an edge if known publicly.

Now that we are seeing serious concerns in the areas of vote tabulation and human factors from Tuesday’s election, there will be a need, as David Wagner suggests below, for comprehensive investigation into the source of these problems. Undoubtedly, this will involve examinations of source code and attempts to reproduce problems on the same machines used in the election. The examination of any vendor’s source code is a particularly sensitive topic filled with NDAs and negotiation; for example, it took the California office of the SoS six months to negotiate the terms of an independent source code examination of California’s four EVM vendors (Diebold, ES&S, Hart InterCivic and Sequoia).

Open source brings another benefit that is highly desirable in election administration: open standards. As you can imagine, having raw vote data (encrypted or not) in several different proprietary formats makes combining results from different vendors at the state level a major pain in the ass. In fact, I’ve got anecdotal evidence that the “official” canvass here in California involves a few employees of the office of the Secretary of State entering county totals by hand into a spreadsheet, precisely because there is no interoperability between the data formats of our four vendors. Talk about an environment ripe for human error… This is why it is heartening to see the IEEE standards work (IEEE 1622) and the OASIS Election and Voter Services TC working on voting data interchange.
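To make the interoperability point concrete, here is a minimal sketch of what combining county results could look like if every vendor exported the same open format. The XML layout, the county names and the vote counts below are all made up for illustration; this is not the OASIS or IEEE schema, and `statewide_totals` is a hypothetical helper of my own.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical common export format (an assumption for this sketch,
# NOT the OASIS or IEEE schema). All names and numbers are invented.
COUNTY_REPORTS = [
    """<results county="County X">
         <contest id="measure-1">
           <choice name="yes" votes="1200"/>
           <choice name="no" votes="950"/>
         </contest>
       </results>""",
    """<results county="County Y">
         <contest id="measure-1">
           <choice name="yes" votes="800"/>
           <choice name="no" votes="1010"/>
         </contest>
       </results>""",
]


def statewide_totals(reports):
    """Sum per-choice votes across county reports, keyed by contest id."""
    totals = {}
    for xml_text in reports:
        root = ET.fromstring(xml_text)
        for contest in root.findall("contest"):
            tally = totals.setdefault(contest.get("id"), Counter())
            for choice in contest.findall("choice"):
                # int() raises ValueError on a malformed count instead of
                # silently accepting it, which is what we want in a canvass.
                tally[choice.get("name")] += int(choice.get("votes"))
    return totals


if __name__ == "__main__":
    for contest_id, tally in statewide_totals(COUNTY_REPORTS).items():
        print(contest_id, dict(tally))
```

With a shared format, the statewide step reduces to parsing and adding, with nobody retyping county totals into a spreadsheet. A real canvass would of course also need schema validation, signatures on each county file and an audit log, none of which is shown here.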

What are the risks of open source in e-voting?

Jason Kitcat, author and maintainer of GNU.FREE, wrote a piece for the October issue of the Communications of the ACM where he described why he had ceased to specifically advocate for open source software and had come to recognize that it only brings modest improvements at best (“Source Availability and Evoting: An Advocate Recants,” Communications of the ACM, October 2004, Vol. 47, No. 10, pp. 65-67). His thesis can broadly be stated as follows: disclosure of software does add some benefits for e-voting in terms of security and transparency, but not enough to outweigh the inherent difficulties in “creating a secure, private, reliable and anonymous system that provably records voters intentions accurately.”

There are a few risks of open-source voting that have been pointed out (some are followed by my rebuttals):

  • Disclosing or withholding vulnerabilities: With the code disclosed, attackers are free to examine it at their leisure and exploit bugs and vulnerabilities that are not found before election day. This will never be solvable with open source code, but extensive third-party policing (say, by coders with Verified Voting or another nonprofit) could go a long way towards ensuring very good code. As well, if a serious flaw is found right before an election, this might have adverse consequences on voter turnout. If the alternative is allowing a compromised election, I personally think this is a good thing.
  • Reduced competitive advantage: If vendors are required to open their source, new competitors could enter the market and “free-ride” off of the years of work that vendors have put into their software. Of course, this depends on the licensing terms mandated by the regulating body. And even if the licensing terms, as in the Holt bill, are merely disclosure-oriented, there’s still something called copyright, folks.
  • Reduced market supply: If vendors are not allowed to keep software and interfaces proprietary, there will be reduced incentive to enter the market or create new products. I have a feeling that most of the money in election systems is in service- and maintenance-oriented contract work. Plus, vendor lock-in is always a good thing for the vendor and always a bad thing for the customer.
  • Lack of participation: (This is Kitcat’s “Transparency goes only so far” argument) E-voting isn’t very sexy and, as such, will not attract talented (or any) coders to contribute. However, community source consortia-based projects like SAKAI are a better model for e-voting software development: in order to be able to contribute and make changes to the application, an institution has to devote a certain number of coders and resources to the project.
  • End users compromising security: The end users of open source e-voting software, the counties, could change the code, recompile it and implement it in very insecure ways, not knowing the details of its design and the assumptions behind the model. This could be alleviated by having technicians in counties certified to make changes, or by requiring that no changes be made after a code-freeze date (scheduled so that the code can be certified in time for use in the election in question). This would ensure that counties run the code through the wringer on the systems they intend to use and reconcile any bugs or vulnerabilities before the code-freeze date.
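On that last point, one way a code freeze could be checked in practice is to publish a cryptographic digest of the certified build and have each county compare its installed copy against it. The following is only a sketch under my own assumptions: nothing in Shelley’s mandate or the Holt bill specifies this procedure, and the function names are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_certified(path, certified_hex):
    """True if the file at `path` is byte-for-byte the certified build.

    `certified_hex` is assumed to be the digest published by the
    certifying body at the code-freeze date (a hypothetical process).
    """
    return sha256_of_file(path) == certified_hex.strip().lower()
```

A county technician would run `matches_certified` against the installed software before the election; any recompiled or modified copy yields a different digest. This only shows that the files are unchanged. It says nothing about whether the certified code itself is any good, or whether the machine is actually running those files.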

Of course, I’m sure I’ve left plenty out… please leave comments or send me an email and I’ll incorporate your arguments/suggestions (and acknowledge you in my paper, of course!).

A Quick Note on Licensing

Unfortunately, CA SoS Shelley never specified what license or what licensing terms would satisfy his “open source” mandate. Open source software licensing terms range from the complex (GPL) to the simple (modified BSD) to the just barely open source (VoteHere’s disclosed source or MS’s shared source). What’s the right license that ensures the public is allowed to vet the source on their own, that experts can