Introducing the California Civic Data Coalition
We’re here with two ready-made Django applications that make California campaign finance data easier to access
We are the California Civic Data Coalition, a loosely coupled team of reporters and developers from the Los Angeles Times Data Desk, The Center for Investigative Reporting and Stanford’s new Computational Journalism Program.
Our aim: To make California’s public data easier for power users to access. Even though we represent rival media outlets, we’d rather compete at analyzing the data than downloading and parsing it.
Our inspiration: Raw data from CAL-ACCESS, the state of California’s campaign finance and lobbying activity database, is being published online for the first time.
Our opportunity: A two-day summit sponsored by OpenNews last month where we sprinted on two new open-source tools we’re ready to release today.
django-calaccess-raw-data: A Django app to download, extract and load campaign finance and lobbying activity data from the California Secretary of State’s CAL-ACCESS database
django-calaccess-campaign-browser: A Django app to refine, review and republish campaign finance data drawn from the California Secretary of State’s CAL-ACCESS database
Both are designed and packaged according to our experimental “pluggable data” method, which you can read about at greater length here. But here’s how to get hacking as soon as possible.
Assuming you already have a Django project set up, installation is simple.
Currently we only support MySQL databases that allow bulk loading via LOAD DATA INFILE (that might sound annoying but it’s pretty handy), so make sure you have that configured in settings.py as well.
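For readers who have not configured bulk loading before, here is a minimal sketch of what the database section of settings.py might look like. The database name, user and password are placeholders, and passing `local_infile` through `OPTIONS` is our assumption about how to let the MySQL client driver issue LOAD DATA LOCAL INFILE statements; check the project documentation for the settings it actually expects.

```python
# settings.py (fragment): a sketch, not the project's official configuration.
# NAME, USER and PASSWORD are placeholders; replace them with your own values.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "calaccess",          # hypothetical database name
        "USER": "your_user",          # placeholder
        "PASSWORD": "your_password",  # placeholder
        "HOST": "localhost",
        "PORT": "3306",
        # OPTIONS are handed to the MySQL client driver when it connects.
        # local_infile=1 asks the client to permit LOAD DATA LOCAL INFILE.
        "OPTIONS": {"local_infile": 1},
    }
}
```

The MySQL server has its own `local_infile` setting, so bulk loading may still fail until that is enabled on the server side as well.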
Now, sync your database and download that data:
You’ve just installed 76 database tables and nearly 35 million records, including all of the campaign finance and lobbying activity records collected by California’s government stretching back more than a decade. Visit http://localhost:8000 and you can start exploring them right away.
Taking it to the next level
We built django-calaccess-raw-data for folks who want to build applications on top of CAL-ACCESS. It doesn’t provide much abstraction, and still comes with a bring-your-own-analysis prerequisite, but it makes the database easier to consume.
We also wanted to build a secondary tool to help folks move more quickly. That’s where django-calaccess-campaign-browser comes in. It goes the next step and begins to clean, regroup, filter and transform the massive, hairy state database into something more legible. Installation is just as simple.
Now, sync your database and build the new, associated tables:
The campaign browser now provides a simple interface to look up individual filers and search for individual campaign contributions. You can search for a candidate and see all of the committees they created to run for a specific office. And if you want the data for a specific committee, all you have to do is click the download tab and select your preferred format.
This code base is still a work in progress, however, and its analysis should be considered provisional until it is further tested and debugged. We’re working to better map out the state’s complex system and bulletproof our figures, but we’re not there yet.
Where you come in
This release represents a milestone for our team, but we still have a lot of work to do. This includes but is not limited to:
- Bulletproofing the analysis process of the campaign browser
- Expanding our documentation to more fully explain the contents of the raw CAL-ACCESS database
- Bringing the campaign browser’s approach to the lobbying activity data also provided by CAL-ACCESS (Already underway but far from complete at django-calaccess-lobbying-browser)
- And, most importantly, generating journalism that demonstrates the power of automating away access to this valuable data set.
Keep an eye out on the California Civic Data Coalition website for more updates on our progress.