
Election Data and Socially Relevant Computing

There is a fair bit of discussion these days about "socially relevant computing" and how connecting computing to current issues might make it more interesting to our students. I have been involved in a project with the League of Women Voters of SC to examine the election data and reconcile the official counts with the counts supported by the data that the election commission collected and stored.

Under the Freedom of Information Act, we have obtained the actual vote image files from several counties, including my own. (Note: I think there are some states where this is actually illegal!) As we expected, we have found some errors, and I am assigning programs to my second-semester students to have them write the code that would find the same errors. The vote image file is an ASCII printer file, so just converting the text strings into usable data is a good exercise in string manipulation. There is also some amusement value in looking at the write-in votes. For example, I am asking my students which duck (Daffy or Donald) got more votes in Richland County last November. (Note: Some of the write-in votes use not-safe-for-work words. In a college classroom I don't find this a problem, but you would have to be prepared for it in high school situations.)

I also excerpted three precincts, including my own (although I don't know that I can recognize my own vote, because I don't know that I remember who got my vote for Soil and Water Commissioner). It turns out to be a really cool use of the Java TreeMap to count votes in one pass. We don't, for example, have a list of all the candidates and contests; we build that from the data. Rather than put the votes in a spreadsheet and then either sort several times or make several passes, we can use the combination of contest and candidate as the key for a TreeMap. The first time we look up the value associated with a key, it's null, and we store the first vote. Every time after that, we add the vote to the existing value and store the (key, value) pair back. This lets us count all the votes in one pass over the data and is a good lesson on the value of the right data structure. It's also a good problem in handling variable-sized tables inside the data and data that isn't sorted to begin with. I will get maybe three homework programs out of this as we build to a program that will in fact count all the votes from the data file.
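For readers who want to see the one-pass idea in code, here is a minimal sketch of it. It is not the actual assignment solution: the `Vote` class, the `key` helper, the `" | "` separator, and the sample contest and candidate names are all made up for illustration, and the sketch assumes each line of the vote image file has already been parsed into a contest and a candidate.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class VoteTally {
    // Hypothetical stand-in for one parsed line of the vote image file.
    static final class Vote {
        final String contest;
        final String candidate;

        Vote(String contest, String candidate) {
            this.contest = contest;
            this.candidate = candidate;
        }
    }

    // Combine contest and candidate into one key. The separator is an
    // arbitrary choice for this sketch.
    static String key(String contest, String candidate) {
        return contest + " | " + candidate;
    }

    // Count every vote in a single pass. No list of contests or candidates
    // is needed up front; the map's keys become that list.
    static Map<String, Integer> tally(List<Vote> votes) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Vote v : votes) {
            String k = key(v.contest, v.candidate);
            Integer current = totals.get(k);
            if (current == null) {
                totals.put(k, 1);            // first vote seen for this key
            } else {
                totals.put(k, current + 1);  // add the vote and store it back
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up sample data, not taken from any real vote image file.
        List<Vote> votes = Arrays.asList(
            new Vote("Contest B", "Candidate Y"),
            new Vote("Contest A", "Candidate X"),
            new Vote("Contest B", "Candidate Y"),
            new Vote("Contest A", "Candidate Z"));

        // A TreeMap iterates in sorted key order, so the report comes out
        // grouped by contest even though the input was unsorted.
        for (Map.Entry<String, Integer> e : tally(votes).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

The null check could be collapsed into a single call to `Map.merge(k, 1, Integer::sum)`, but spelling it out matches the description above and is probably easier for second-semester students to follow.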

And there is a good message here. I have given the students the data from my own precinct, where the counts are correct. When we get to the assignment that has them count votes, I will have them cross-check against the official totals on the state website. In the other two precincts there were 1127 total votes that didn't get included in the certified count for November 2. They may hear about it from press releases (or this blog!), but I don't intend to tell the students this little item before the assignment is made. I suspect there will be a lesson about "socially relevant computing" when they reach this assignment and find more votes than got counted. And it will be a nice message to the media that second-semester undergraduates are fully capable of writing code that would find problems in the vote counts for the November election.
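The cross-check itself can be quite short. The sketch below is only one way it might look, under some assumptions: both sets of totals are already in maps keyed by a contest-and-candidate string, and the official totals have been typed in or otherwise loaded from the state website by hand. The class name, method name, and sample numbers are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class TallyCheck {
    // Return, for every key where the two sources disagree, the difference
    // (votes found in the data) minus (votes in the official totals).
    // A positive number means the data holds votes that were not counted.
    static Map<String, Integer> discrepancies(Map<String, Integer> counted,
                                              Map<String, Integer> official) {
        // Use the union of keys so an entry missing from either side
        // is reported rather than silently skipped.
        Set<String> keys = new TreeSet<>(counted.keySet());
        keys.addAll(official.keySet());

        Map<String, Integer> diffs = new TreeMap<>();
        for (String k : keys) {
            int c = counted.getOrDefault(k, 0);
            int o = official.getOrDefault(k, 0);
            if (c != o) {
                diffs.put(k, c - o);
            }
        }
        return diffs;
    }

    public static void main(String[] args) {
        // Made-up numbers for illustration only.
        Map<String, Integer> counted = new TreeMap<>();
        counted.put("Contest A | Candidate X", 10);
        counted.put("Contest A | Candidate Y", 5);

        Map<String, Integer> official = new TreeMap<>();
        official.put("Contest A | Candidate X", 10);
        official.put("Contest A | Candidate Y", 3);

        System.out.println(discrepancies(counted, official));
    }
}
```

An empty result means the two sources agree, which is what the students should see for the precinct where the counts are correct.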

Note: Donald Duck received the most votes.

Duncan Buell
CSTA Board of Directors
