OpenNews fellows liberate crucial California elections data

New web scrapers harvest data withheld by the state

A new release of our software for analyzing the millions of dollars spent to influence California elections now features crucial data withheld by the state.

The upgrade comes thanks to members of the 2015 class of Knight-Mozilla OpenNews fellows, who joined us last month in Los Angeles to improve our open-source system for downloading and parsing the California Secretary of State’s CAL-ACCESS database.

The 2015 OpenNews fellows at the Los Angeles Times

The state's bulk download is the foundation of our project, but it is incomplete.

Missing is the roster of candidates who participated in primaries and advanced to general elections, as well as the list of ballot measures that special interests have routinely spent millions to support or oppose.

Those missing connections are crucial. Without them, analysts must fish out the small number of fundraising committees that matter from a crowded pool of thousands of filers.

State officials declined our request to add the records to their dump. But thanks to Juan Elosua, Francis Tseng and the OpenNews fellows, the story didn’t end there. They developed scripts that scrape the data off the state’s website and load it into our database.
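For readers curious what a scrape-and-load step looks like in general, here is a minimal Python sketch. It is not the fellows' code and does not reflect the real structure of the state's website. The sample HTML, the `measure` table and the helper names (`TableRowParser`, `scrape_measures`, `load_measures`) are all hypothetical, chosen only to show the pattern with the standard library.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical markup for illustration only; the state's real pages differ.
SAMPLE_HTML = """
<table id="measures">
  <tr><th>Measure</th><th>Committee ID</th></tr>
  <tr><td>Proposition A</td><td>1001</td></tr>
  <tr><td>Proposition B</td><td>1002</td></tr>
</table>
"""


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr":
            if self._row:  # header rows contain only <th>, so they stay empty
                self.rows.append(tuple(self._row))
            self._row = None

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()


def scrape_measures(html):
    """Return a list of (name, committee_id) tuples found in the HTML."""
    parser = TableRowParser()
    parser.feed(html)
    return parser.rows


def load_measures(conn, rows):
    """Insert the scraped rows into a (hypothetical) measure table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS measure (name TEXT, committee_id TEXT)"
    )
    conn.executemany("INSERT INTO measure VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
rows = scrape_measures(SAMPLE_HTML)
load_measures(conn, rows)
```

Once rows like these sit alongside the bulk data, a committee ID can be matched against the filers in the state's dump, which is what makes it possible to narrow thousands of filers to the committees tied to a given measure or candidate.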

The elections data in our database

Their work is now available in version 0.1.1 of django-calaccess-campaign-browser, our Django app for refining and investigating the raw campaign finance data.

It was originally released last August by the California Civic Data Coalition, a loosely coupled team from the Los Angeles Times Data Desk, The Center for Investigative Reporting and Stanford’s Computational Journalism Lab.

If you’d like to join the effort, dozens more tickets are waiting in our GitHub repository. And stay tuned for more efforts to improve our project as we try to level up the library ahead of the 2016 elections.