This blog will be dedicated to examining and promoting civic data in Chicago, Cook County and Illinois.
WBEZ is partnering with the Smart Chicago Collaborative to promote civic data. This blog will be part of that collaboration.
We'll post original data sets @OpenSocrata
This week the city of Chicago added new information on traffic violations to the data portal: the locations of speed and red light cameras, to go with the violation data it posted last month.
For example, here’s a map of all the speed camera locations in Chicago with total and mean violation counts.
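As a rough sketch of how totals and means like these could be computed from the daily violation records, the Python below groups rows by camera. The field names `camera_id` and `violations` are placeholders, not the portal's actual column names, and the sample rows are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_camera(rows):
    """Return {camera_id: (total violations, mean violations per row)}."""
    counts = defaultdict(list)
    for row in rows:
        # Field names are assumptions; adjust to match the downloaded file.
        counts[row["camera_id"]].append(int(row["violations"]))
    return {cam: (sum(vals), mean(vals)) for cam, vals in counts.items()}


# Made-up sample rows, one per camera per day.
sample = [
    {"camera_id": "CAM-A", "violations": "10"},
    {"camera_id": "CAM-A", "violations": "30"},
    {"camera_id": "CAM-B", "violations": "5"},
]
summary = summarize_by_camera(sample)
```

Joining the result to the new camera location file by camera ID would give the points for a map.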
This issue got a huge amount of attention recently with the Chicago Tribune’s investigation into red light camera tickets, which showed that many drivers were getting suspicious tickets during anomalous spikes.
The data on violations only goes back to July 1, so it’s not possible to go through the Tribune’s process with these numbers. Of course it is possible with the Tribune’s own data, which they’ve made available for download.
Cook County also released new data this week, covering foreclosures and mortgages, and three sets around tax codes.
The tax data sets include listings of tax codes, agencies and rates for the more than 1,800 individual taxing districts in Cook County (which is crazy on its own), with rate information going back to 2006. The housing data covers listings of foreclosures, mortgages, and quit claim deeds from 2013 to August 2014.
The new release is part of Cook County’s ongoing push to add more information to their data portal.
Over the past few years there has been more and more pressure on government to treat technology more like companies in the private sector. Cook County took a step in that direction in April when they named Simona Rollinson as Chief Information Officer, after 15 years with Follett Software. Her task: Update an antiquated web of IT systems, something often easier said than done in county government. She joins us to discuss what she’s doing to overhaul their outdated IT systems.
Want to know how many grenade launchers law enforcement agencies in your county have? There’s a database for that now.
This week NPR released a database of military surplus purchases by local law enforcement agencies around the country from the Pentagon’s Law Enforcement Support Office. Included are purchase orders for guns, ordnance robots and MRAPs (that would be mine-resistant, ambush-protected vehicles), among other things. There’s also construction equipment, portable generators and musical instruments.
Purchases are broken down by county, not agency, so it’s not clear whether a purchase came to the city of Chicago, Cook County or another municipality, but it does give an idea of how much equipment is available in the area for law enforcement. (Agency data is available for a few states, including Indiana but not Illinois, based on separate information requests the NPR team submitted.)
For example, there are 19 MRAPs in Illinois and Indiana, with three in Cook County alone. If you’re curious what exactly an MRAP is, here’s a video of a few rolling down Lake Shore Drive.
The database also includes what the military originally paid for the item, though not what the new owner paid to get it. When the Northwest Indiana Regional SWAT team got its MRAP, the agency paid $1, though the original price tag to the military was $412,000.
In Illinois, Lake and Cook counties have received the most value, ranking 20th and 21st nationally. Both sit behind Clark County, Indiana, though. The Southeast Indiana county has acquired nearly $14 million in military surplus equipment since 2006. That’s $127 worth for each of the county’s 110,100 residents.
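The per-resident figure is simple division. Here is a quick check in Python, using the rounded $14 million from the text rather than the exact total in NPR's files, so the result is approximate.

```python
# "Nearly $14 million" is rounded in the post; the exact total may differ.
total_value = 14_000_000
residents = 110_100

per_resident = total_value / residents
print(f"${per_resident:.0f} per resident")  # prints "$127 per resident"
```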
Still, that doesn’t mean Clark County (and specifically the Sheriff’s Department, which received the majority of the equipment) is awash in hi-tech weaponry. Weapons actually only account for 0.4 percent of the total value.
Instead, Clark County is stocking up on vehicles, construction and materials handling equipment, and tractors. Those four categories account for the majority of the military surplus value they’ve received since 2006.
If you’re curious and want to dig into the data yourself, the files are available on Google Docs, and NPR also has a github repo explaining how to set up your own database.
This week Divvy, Chicago’s bikeshare, released its second set of data, covering the first half of 2014. The first release in February had the service’s first six months, so now there’s a whole year’s worth of information to look through.
Since 2001, fewer and fewer Chicago students are attending their neighborhood elementary school. The increase in school choice – including charter, gifted and magnet schools – has meant that all schools are part of the choice system in CPS. Over that time, the percentage of CPS students who attended their neighborhood school dropped from 74 to 62 percent.
Yesterday we published a story on the trend, including a map charting the change around the city over the past decade. In addition to all that, you can also download all the data we used in putting the story together.
The spreadsheet has information since 2001 on nearly 600 schools – including those that have opened and closed in that time – on the number of children attending and counts on how many are from the school’s attendance boundaries and how many aren’t.
Specifically, here’s what’s included:
Name of School: Common name of the school
Address: Most recent address for this school. Some schools have changed location over this time period so this does not reflect its address for all years.
UNIT: CPS unit number
CPS School ID: School ID assigned by Chicago Public Schools
Total Attending: Total number of students enrolled in the school.
Residing Attending: Number of students attending the school who reside in the school’s attendance boundary. Schools without an attendance boundary will display 0.
Attending Not Residing: Number of students attending the school who do not reside in the school’s attendance boundary. For schools without an attendance boundary this is equal to Total Attending.
Residing Not Attending: Number of CPS students who reside in a school’s attendance boundary but do not attend the school. Schools without an attendance boundary will display 0.
Total Residing: Total number of CPS students residing in the school’s attendance boundary.
Residing Attending/Total Attending: Percentage of the school’s students who reside in the attendance area.
Attending Not Residing/Total Attending: Percentage of the school’s students who do not reside in the attendance area.
Residing Attending/Total Residing: Percentage of CPS students in the school’s attendance area who attend that school.
Residing Not Attending/Total Residing: Percentage of CPS students in the school’s attendance area who do not attend that school.
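The four percentage columns follow directly from the counts above them. The sketch below recomputes them for one row. The column names are taken from the list above, the sample numbers are invented, and returning `None` for a zero denominator is an assumption made here (the spreadsheet may show something else for schools without a boundary).

```python
def share(part, whole):
    """Percentage of `whole` made up by `part`, or None if `whole` is 0."""
    return None if not whole else round(100 * part / whole, 1)


def add_percentages(row):
    """Recompute the four ratio columns from the raw counts in one row."""
    attending = row["Total Attending"]
    residing = row["Total Residing"]
    return {
        "Residing Attending/Total Attending": share(row["Residing Attending"], attending),
        "Attending Not Residing/Total Attending": share(row["Attending Not Residing"], attending),
        "Residing Attending/Total Residing": share(row["Residing Attending"], residing),
        "Residing Not Attending/Total Residing": share(row["Residing Not Attending"], residing),
    }


# Invented numbers for a single school in a single year.
example = {
    "Total Attending": 400,
    "Residing Attending": 250,
    "Attending Not Residing": 150,
    "Residing Not Attending": 250,
    "Total Residing": 500,
}
percentages = add_percentages(example)
```

In this made-up case, 62.5 percent of the school's students live in the boundary, while only 50 percent of the students living in the boundary attend the school.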
Our story and map focused on the percentage of CPS students in an area choosing to go to their neighborhood school as a proxy for community buy-in to the school, but there are other questions you could look at with the data. The other side to the question we looked at is the percentage of students at a school who live in the area (Residing Attending/Total Attending).
The data also capture the opening and closing of schools, as well as the increase in charters and other schools without attendance boundaries. You can find those schools by filtering for schools that have a Total Attending number (meaning the school is open that year) but without a Total Residing number (which means no attendance boundary).
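That filter could look something like this in Python. It is a minimal sketch: it assumes each row is a dictionary for one school in one year, that the columns are named as in the list above, and that a missing boundary shows up as a blank or 0. The school names are made up.

```python
def to_int(value):
    """Parse a count cell, treating blank or missing cells as 0."""
    text = str(value or "").strip().replace(",", "")
    return int(text) if text else 0


def has_no_boundary(row):
    """True if the school is open (has students) but has no attendance boundary."""
    return to_int(row.get("Total Attending")) > 0 and to_int(row.get("Total Residing")) == 0


# Invented rows: a neighborhood school, a charter, and a closed school.
rows = [
    {"Name of School": "Example Elementary", "Total Attending": "400", "Total Residing": "500"},
    {"Name of School": "Example Charter", "Total Attending": "1,200", "Total Residing": ""},
    {"Name of School": "Example Closed", "Total Attending": "", "Total Residing": ""},
]
no_boundary = [row for row in rows if has_no_boundary(row)]
```

Only the charter row passes the filter. The closed school has no Total Attending number, so it drops out as well.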
We hope others interested in the topic will find questions and answers we hadn’t even thought of.
As of today the mid-term elections are only three months away. Between now and November 4th, campaigns will be frantically raising and spending as much money as they can. While candidates do need to disclose their spending, this information isn’t that easy to find, and it can be even harder to analyze. That process got a little easier last week with the launch of ElectionMoney.org.
The site collects information on candidates, contributions, spending and more, and makes it free to download. It’s intended for people who are serious about finding out information about campaign finance, especially journalists, researchers and analysts.
Rayid Ghani has been one of the big names in the U.S. civic data movement. As chief scientist for Obama for America, he helped make analytics a popular topic for governments, non-profits and other groups that never realized the potential value their information held.
Now, as the Chief Data Scientist for the University of Chicago’s Urban Center for Computation and Data, Ghani has continued to push to get data scientists interested in societal problems.
Last year he founded Data Science for Social Good, a fellowship through UChicago that matches up students with data skills with organizations looking for help solving problems with data.
In its second year, DSSG now has 48 fellows working with groups not only in Chicago but throughout the country and even an organization in Mexico. A second DSSG group has even started in Atlanta.
We visited DSSG offices recently and spoke with Ghani about what data science really means, what makes Chicago’s data community different and where he sees the field going.
One of the fellows mentioned to me that on the first day you put statements on the board and they had to agree or disagree, and one was “Data science is not a real field, it’s just statistics done by people with weird hair.” For you, how do you define data science and is it a relevant term or a buzzword?
There are going to be trendy buzzwords for every area that’s in demand. In my mind, data makes organizations and people rational, and whatever you call it, that way of rational decision making is not hype, it’s not a buzzword. It’s always been there, it’s just not as widespread because a lot of organizations were not collecting enough data. Now that they’re collecting data, of course they want to allocate their resources more efficiently, of course they want to do things better, and data is a really good way to start down that path.
The phrase data science may be a buzzword right now, but the people behind it aren’t new people just coming up and saying we do data science. It’s people who were doing related things with computational tools and analytical tools to do better decision making, and now they’re calling themselves data scientists. The phrase may be hype, but there’s a lot of solid science behind it that’s not hype.
The Atlanta group is a great start. A single program in Chicago can only grow so big, so the only way to scale this is a lot of people who are doing this in a distributed way, but connected in a community where they can share. We had a Skype session with them. Some of the people who applied to our program, we asked if they wanted to go to their program, and some of the projects in both programs can be shuffled around. I see a larger setup happening where there are lots of these programs, not just summer programs but semester-long classes or informal research groups at universities or meet up groups like Code for America trying to do similar things.
There is a core that is interested in these problems, a larger informal network that starts to come together where people have the same goals, they have different skills, so how do you share the overhead? That wasn’t initially the goal, but as we see interest from everybody else we want to make sure we can help them grow these programs and build this larger network.
The fellows have been out at a lot of different meet ups and groups. What do you see as the fellowship’s role in the larger Chicago data community?
Chicago has an interesting data community. People who are in Chicago and interested in data are really interested in deeper, more tangible problems. They’re not as interested in building a web app to find you a date. That’s sort of the typical data use. They’re more interested in problems that are deeper, tangible, rooted in lots of data.
One of the things we wanted to do was make that more known. The rest of the country doesn’t really realize that. Part of the goal of bringing these fellows here is that they’ll see the community that is starting in Chicago and choose this as a place to do their work. One of the reasons I’m doing the program downtown and not in Hyde Park, where the university is, is that I want these fellows to interact with the local tech community, local nonprofits, local government communities, go to meetup groups. We do happy hours every Friday. The idea is to get them exposed and mingling. Then they become part of this community.
As someone who has been involved in the civic data community for a while, looking at your project and all the things that are popping up there seems to be a lot of attention around it. One, is it moving in the right direction, and two, is it moving fast enough to solve the problems you want to solve?
It’s never going to be fast enough. Problems will always grow faster than the solutions until you reach a certain tipping point. Right now what’s happening is there’s a lot of fascination with data, and not enough fascination with problems. That’s very natural in any new community; you get fascinated by the new shiny things which are data. The problems are old. We still have problems with education, problems in health care, problems with sustainability, problems with community development and crime, safety. The problems are not new, the data is the new thing and I think initially people get attracted to new things, but when that initial hype settles down it comes down to solving the real problems that you have.
We’re still in the data fascination stage. We’re moving to the problem phase, where what’s really important right now is for people who have the problems to make sure that people with the skills to solve these problems know about these problems. If you’re not telling people about your problems, you can’t get mad at them for not solving them. If they’re building a web app that’s not very useful to you, maybe you should tell them what your problems are.
A lot of the time I spend with nonprofits and cities is to try and elicit what their biggest problems are. How do I communicate to, not just the fellows here, but meet up groups and different people who have the skills to solve these problems, how do I translate these problems to them so they find them exciting and motivating and start working on them. Because without that guidance they’ll start working on problems that interest them and may not be the most useful problems for society in general.
The enterprising people at DataMade just released Election Money, where you can download 170MB of bulk campaign finance records from the Illinois Board of Elections. My copy is still downloading, but I’m excited to start digging through this.
Over the past few years governments and non-profits have gotten more and more data about the problems they’re trying to solve. The resources to analyze and act on that information haven’t grown nearly as fast. Recognizing the disconnect, last year Rayid Ghani decided to do something about it. The former chief scientist for Obama for America and current University of Chicago fellow created Data Science for Social Good. The program works to connect organizations with students who can solve problems with data.
The city of Chicago released a bunch of new sets to the city data portal recently. From the city’s Chicago Digital blog:
The City of Chicago has released a handful of new datasets which pertain to several parts of daily life in Chicago. The public will be able to explore the water quality at Chicago beaches, find who and which vehicles are licensed to carry passengers, activities for Chicago’s Micro-Market Recovery Program, and the geographic areas targeted by the City’s Broadband Innovation Challenge.
The most interesting aspect of the new batch is the addition of pedicab licenses to the list of licensed Public Chauffeurs. The city started regulating the pedicab industry in June. In addition to requiring a license, pedicabs were banned from operating in the Loop.
This release has the first 25 approved pedicab licenses, but also the first four denied applications and 10 more inactive ones.
Not surprisingly, the first license issued went to T.C. O’Rourke, a Chicago Pedicab Association board member. O’Rourke told Streetsblog after the ordinance passed he was in favor of the license regulations but not the geographic restrictions. He also gave his thoughts on what it all meant for his business. Go and read it.
Another interesting note: Of all the applicants, only one is female. That would be Joanne Marie Werling, who had her license approved June 11.
Switching gears (please forgive me the pun), the city’s post at Chicago Digital has a graph showing all the models of active cabs in Chicago (spoiler: cabbies love Camrys). That led to a long time sorting and filtering the makes and models of the cabs and livery vehicles.
While sorting through vehicle manufacturers I noticed Tesla listed. Indeed, there are two Teslas registered as livery vehicles, though the data portal has them coded as gasoline vehicles. Need to check in on that, but it may be an incorrect categorization.
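A check like this is easy to script once the vehicle list is downloaded. The sketch below flags any vehicle from an electric-only manufacturer whose fuel code is something other than electric. The field names `make` and `fuel`, the fuel code values, and the sample records are all assumptions for illustration, not the portal's actual schema.

```python
# Manufacturers assumed to sell only electric vehicles.
ELECTRIC_ONLY_MAKES = {"TESLA"}


def suspicious_fuel_codes(vehicles):
    """Return vehicles from electric-only makes that are not coded as electric."""
    return [
        v for v in vehicles
        if v["make"].strip().upper() in ELECTRIC_ONLY_MAKES
        and v["fuel"].strip().upper() != "ELECTRIC"
    ]


# Invented records; field names and codes are placeholders.
vehicles = [
    {"make": "Toyota", "model": "Camry", "fuel": "Gasoline"},
    {"make": "Tesla", "model": "Model S", "fuel": "Gasoline"},
    {"make": "Tesla", "model": "Model S", "fuel": "Electric"},
]
flagged = suspicious_fuel_codes(vehicles)
```

Anything flagged this way is only a candidate for a coding error and still needs to be confirmed with the city.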