This blog will be dedicated to examining and promoting civic data in Chicago, Cook County and Illinois.
WBEZ is partnering with the Smart Chicago Collaborative to promote civic data. This blog will be part of that collaboration.
We'll post original data sets @OpenSocrata
Since 2001, fewer and fewer Chicago students have been attending their neighborhood elementary school. The increase in school choice – including charter, gifted and magnet schools – has meant that all schools are part of the choice system in CPS. Over that time, the percentage of CPS students who attended their neighborhood school dropped from 74 to 62 percent.
Yesterday we published a story on the trend, including a map charting the change around the city over the past decade. In addition to all that, you can also download all the data we used in putting the story together.
The spreadsheet covers nearly 600 schools since 2001 – including those that have opened and closed in that time – with the number of children attending each school and counts of how many are from the school’s attendance boundaries and how many aren’t.
Specifically, here’s what’s included:
Name of School: Common name of the school
Address: Most recent address for this school. Some schools have changed location over this time period so this does not reflect its address for all years.
UNIT: CPS unit number
CPS School ID: School ID assigned by Chicago Public Schools
Total Attending: Total number of students enrolled in the school.
Residing Attending: Number of students attending the school who reside in the school’s attendance boundary. Schools without an attendance boundary will display 0.
Attending Not Residing: Number of students attending the school who do not reside in the school’s attendance boundary. For schools without an attendance boundary this is equal to Total Attending.
Residing Not Attending: Number of CPS students who reside in a school’s attendance boundary but do not attend the school. Schools without an attendance boundary will display 0.
Total Residing: Total number of CPS students residing in the school’s attendance boundary.
Residing Attending/Total Attending: Percentage of the school’s students who reside in the attendance area.
Attending Not Residing/Total Attending: Percentage of the school’s students who do not reside in the attendance area.
Residing Attending/Total Residing: Percentage of CPS students in the school’s attendance area who attend that school.
Residing Not Attending/Total Residing: Percentage of CPS students in the school’s attendance area who do not attend that school.
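The four percentage columns can be rebuilt from the count columns. Here is a minimal Python sketch of that arithmetic, assuming the column names match the list above (the actual spreadsheet headers may differ). The helper names and the sample numbers are made up for illustration and are not taken from the data.

```python
def share(part, whole):
    """Return part/whole as a percentage, or None when the denominator is zero or missing."""
    if not whole:
        return None
    return 100.0 * part / whole


def attendance_shares(row):
    """Recompute the four percentage columns from one row of counts."""
    total_attending = row["Total Attending"]
    total_residing = row["Total Residing"]
    return {
        "Residing Attending/Total Attending": share(row["Residing Attending"], total_attending),
        "Attending Not Residing/Total Attending": share(row["Attending Not Residing"], total_attending),
        "Residing Attending/Total Residing": share(row["Residing Attending"], total_residing),
        "Residing Not Attending/Total Residing": share(row["Residing Not Attending"], total_residing),
    }


# Hypothetical school: made-up counts, for illustration only.
example = {
    "Total Attending": 500,
    "Residing Attending": 300,
    "Attending Not Residing": 200,
    "Residing Not Attending": 100,
    "Total Residing": 400,
}
shares = attendance_shares(example)
print(shares)
```

Returning None for a zero denominator keeps schools without an attendance boundary from showing a misleading 0 percent.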
Our story and map focused on the percentage of CPS students in an area choosing to go to their neighborhood school as a proxy for community buy-in to the school, but there are other questions you could look at with the data. The other side to the question we looked at is the percentage of students at a school who live in the area (Residing Attending/Total Attending).
The data also capture the opening and closing of schools, as well as the increase in charters and other schools without attendance boundaries. You can find those schools by filtering for schools that have a Total Attending number (meaning the school is open that year) but without a Total Residing number (which means no attendance boundary).
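That filter might look something like the Python sketch below. It assumes one row per school per year and treats a blank or zero Total Residing as "no attendance boundary"; both are assumptions about the spreadsheet layout. The school names and counts are invented.

```python
def schools_without_boundary(rows):
    """Names of schools that are open (have students) but have no attendance boundary."""
    result = []
    for row in rows:
        attending = row.get("Total Attending") or 0  # blank means the school was not open
        residing = row.get("Total Residing") or 0    # blank or 0 means no boundary (assumption)
        if attending > 0 and residing == 0:
            result.append(row["Name of School"])
    return result


# Invented rows for illustration only.
rows = [
    {"Name of School": "Example Neighborhood School", "Total Attending": 450, "Total Residing": 600},
    {"Name of School": "Example Charter School", "Total Attending": 300, "Total Residing": 0},
    {"Name of School": "Example Closed School", "Total Attending": None, "Total Residing": None},
]
open_no_boundary = schools_without_boundary(rows)
print(open_no_boundary)
```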
We hope others interested in the topic will find questions and answers we hadn’t even thought of.
As of today the mid-term elections are only three months away. Between now and November 4th, campaigns will be frantically raising and spending as much money as they can. While candidates do need to disclose their spending, this information isn’t that easy to find, and it can be even harder to analyze. That process got a little easier last week with the launch of ElectionMoney.org.
The site collects information on candidates, contributions, spending and more, and makes it free to download. It’s intended for people who are serious about finding out information about campaign finance, especially journalists, researchers and analysts.
Rayid Ghani has been one of the big names in the U.S. civic data movement. As chief scientist for Obama for America, he helped make analytics a popular topic for governments, non-profits and other groups that never realized the potential value their information held.
Now as the Chief Data Scientist for the University of Chicago’s Urban Center for Computation and Data, Ghani has continued to push and get data scientists interested in societal problems.
Last year he founded Data Science for Social Good, a fellowship through UChicago that matches up students with data skills with organizations looking for help solving problems with data.
In its second year, DSSG now has 48 fellows working with groups not only in Chicago but throughout the country and even an organization in Mexico. A second DSSG group has even started in Atlanta.
We visited DSSG offices recently and spoke with Ghani about what data science really means, what makes Chicago’s data community different and where he sees the field going.
One of the fellows mentioned to me that on the first day you put statements on the board and they had to agree or disagree, and one was “Data science is not a real field, it’s just statistics done by people with weird hair.” For you, how do you define data science and is it a relevant term or a buzzword?
There are going to be trendy buzzwords for every area that’s in demand. In my mind, data makes organizations and people rational, and whatever you call it, that way of rational decision making is not hype, it’s not a buzzword. It’s always been there, it’s just not as widespread because a lot of organizations were not collecting enough data. Now that they’re collecting data, of course they want to allocate their resources more efficiently, of course they want to do things better, and data is a really good way to start down that path.
The phrase data science may be a buzzword right now, but the people behind it aren’t new people just coming up and saying we do data science. It’s people who were doing related things with computational tools and analytical tools to do better decision making, now they’re calling themselves data science. The phrase may be hype, but there’s a lot of solid science behind it that’s not hype.
The Atlanta group is a great start. A single program in Chicago can only grow so big, so the only way to scale this is a lot of people who are doing this in a distributed way, but connected in a community where they can share. We had a Skype session with them. Some of the people applied to our program, and we asked if they wanted to go to their program; some of the projects in both programs can be shuffled around. I see a larger setup happening where there are lots of these programs, not just summer programs but semester-long classes or informal research groups at universities or meet up groups like Code for America trying to do similar things.
There is a core that is interested in these problems, a larger informal network that starts to come together where people have the same goals, they have different skills, so how do you share the overhead? That wasn’t initially the goal, but as we see interest from everybody else we want to make sure we can help them grow these programs and build this larger network.
The fellows have been out at a lot of different meet ups and groups. What do you see as the fellowship’s role in the larger Chicago data community?
Chicago has an interesting data community. People who are in Chicago and interested in data are really interested in deeper, more tangible problems. They’re not as interested in building a web app to find you a date. That’s sort of the typical data use. They’re more interested in problems that are deeper, tangible, rooted in lots of data.
One of the things we wanted to do was make that more known. The rest of the country doesn’t really realize that. Part of the goal of bringing these fellows here is that they’ll see the community that is starting in Chicago and choose this as a place to do their work. One of the reasons I’m doing the program downtown and not in Hyde Park, where the university is, is that I want these fellows to interact with the local tech community, local nonprofits, local government communities, go to meetup groups. We do happy hours every Friday. The idea is to get them exposed and mingling. Then they become part of this community.
As someone who has been involved in the civic data community for a while, looking at your project and all the things that are popping up there seems to be a lot of attention around it. One, is it moving in the right direction, and two, is it moving fast enough to solve the problems you want to solve?
It’s never going to be fast enough. Problems will always grow faster than the solutions until you reach a certain tipping point. Right now what’s happening is there’s a lot of fascination with data, and not enough fascination with problems. That’s very natural in any new community; you get fascinated by the new shiny things which are data. The problems are old. We still have problems with education, problems in health care, problems with sustainability, problems with community development and crime, safety. The problems are not new, the data is the new thing and I think initially people get attracted to new things, but when that initial hype settles down it comes down to solving the real problems that you have.
We’re still in the data fascination stage. We’re moving to the problem phase, where what’s really important right now is for people who have the problems to make sure that people with the skills to solve these problems know about these problems. If you’re not telling people about your problems, you can’t get mad at them for not solving them. If they’re building a web app that’s not very useful to you, maybe you should tell them what your problems are.
A lot of the time I spend with nonprofits and cities is to try and elicit what their biggest problems are. How do I communicate to, not just the fellows here, but meet up groups and different people who have the skills to solve these problems, how do I translate these problems to them so they find them exciting and motivating and start working on them? Because without that guidance they’ll start working on problems that interest them and may not be the most useful problems for society in general.
The enterprising people at DataMade just released Election Money, where you can download 170MB of bulk campaign finance records from the Illinois Board of Elections. My copy is still downloading, but I’m excited to start digging through this.
Over the past few years governments and non-profits have gotten more and more data about the problems they’re trying to solve. The resources to analyze and act on that information haven’t grown nearly as fast. Recognizing the disconnect, last year Rayid Ghani decided to do something about it. The former chief scientist for Obama for America and current University of Chicago fellow created Data Science for Social Good. The program works to connect organizations with students who can solve problems with data.
The city of Chicago released a bunch of new sets to the city data portal recently. From the city’s Chicago Digital blog:
The City of Chicago has released a handful of new datasets which pertain to several parts of daily life in Chicago. The public will be able to explore the water quality at Chicago beaches, find who and which vehicles are licensed to carry passengers, activities for Chicago’s Micro-Market Recovery Program, and the geographic areas targeted by the City’s Broadband Innovation Challenge.
The most interesting aspect of the new batch is the addition of pedicab licenses to the list of licensed Public Chauffeurs. The city started regulating the pedicab industry in June. In addition to requiring a license, pedicabs were banned from operating in the Loop.
This release has the first 25 approved pedicab licenses, but also the first four denied applications and 10 more inactive ones.
Not surprisingly, the first license issued went to T.C. O’Rourke, a Chicago Pedicab Association board member. O’Rourke told Streetsblog after the ordinance passed he was in favor of the license regulations but not the geographic restrictions. He also gave his thoughts on what it all meant for his business. Go and read it.
Another interesting note: Of all the applicants, only one is female. That would be Joanne Marie Werling, who had her license approved June 11.
Switching gears (please forgive me the pun), the city’s post at Digital Chicago has a graph showing all the models of active cabs in Chicago (spoiler: cabbies love Camrys). That led to a long time sorting and filtering the makes and models of the cabs and livery vehicles.
While sorting through vehicle manufacturers I noticed Tesla listed. Indeed, there are two Teslas registered as livery vehicles, though the data portal has them coded as gasoline vehicles. I need to check on that, but it may be an incorrect categorization.
While resources like the city of Chicago data portal have a lot of great information, it’s also good to step back and think about where the numbers came from.
Like most 311 sets, the report has the date it was filed, when it was completed (if it was) and the location of the report. Abandoned vehicles also have some novel categories, such as make and model, license plate information and even the color of the car.
Those are relatively straightforward (though there are 50+ ways of noting that a car doesn’t have plate info). A car is a Honda or a Ford. It’s tan or red.
Less clear is the “How many days has the vehicle been reported as parked?” field. On its face, it seems like we could just average the numbers and get the average number of days cars sit in each neighborhood.
The numbers range from zero (typically if a city worker finds a car without a report) to 10,000,000, which would be approximately 27,397 years.
No matter how long it may seem to someone on the block, no car has been abandoned in Chicago for 27,397 years.
While that’s likely a simple typo (or someone expressing their annoyance at said vehicle), there are some other interesting patterns in how long people think the cars have been left.
The most common time reported is 30 days, and it’s not close. Of the nearly 60,000 completed incidents with a days parked reported, a quarter are for 30 days.
Here’s the list of all days reported with at least 1,000 mentions:
Basically that top five can be read as: one month, one week, two weeks, three weeks, two months. After that are round numbers, numbers divisible by 30 and numbers divisible by 5. When asked, people estimate time frames they know. It’s why the 3 and the zero on my microwave always wear out first.
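A tally like the one described above takes only a few lines of Python. This is a rough sketch, not the exact method used for this post. The function names are invented, the sample values are made up rather than pulled from the 311 data, and "familiar" is defined here as any multiple of 5 or 7, which is one possible reading of the pattern.

```python
from collections import Counter


def tally_days(reported_days, min_count=1):
    """Count how often each days-parked value appears, most common first."""
    counts = Counter(reported_days)
    return [(days, n) for days, n in counts.most_common() if n >= min_count]


def share_on_familiar_numbers(reported_days):
    """Fraction of reports that are a multiple of 5 or of 7 (whole weeks)."""
    familiar = [d for d in reported_days if d > 0 and (d % 5 == 0 or d % 7 == 0)]
    return len(familiar) / len(reported_days)


# Made-up sample of reported days, for illustration only.
sample = [30, 30, 30, 7, 7, 14, 45, 3]
top = tally_days(sample, min_count=2)
familiar_share = share_on_familiar_numbers(sample)
print(top, familiar_share)
```

With the real data, setting min_count to 1,000 would reproduce the cutoff used for the list above.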
If you’re using city data it’s important to know how the data were gathered and what the possible biases could be. In this case numbers are more like a survey with a margin of error than an actual measurement. While that shouldn’t stop someone from using it as a guide, it’s important not to draw too much from it without asking more questions first.
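To see why one implausible entry matters, here is a small Python sketch that converts the largest reported value to years and compares the mean and median of a sample. Only the 10,000,000 figure comes from the data. The other sample values are made up, and a 365-day year is assumed.

```python
from statistics import mean, median

DAYS_PER_YEAR = 365  # ignores leap years

largest_report_days = 10_000_000
years = largest_report_days / DAYS_PER_YEAR  # a little over 27,397 years

# Made-up reports plus the one extreme value from the data.
sample = [30, 7, 14, 21, 60, 30, largest_report_days]

sample_mean = mean(sample)      # dragged far upward by the single outlier
sample_median = median(sample)  # barely affected by it

print(f"{years:,.0f} years")
print(f"mean: {sample_mean:,.1f} days, median: {sample_median} days")
```

A median, or an average taken after setting aside obviously impossible values, is a safer summary for a field like this than a plain average.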
Some of my favorite bits of data journalism this year have come out of the MIT “You Are Here” project. Recently they took a look at Chicago transportation, calculating how long it takes to get from each spot in the city to every other. The map then tells you whether it would be faster to walk, ride, drive or take public transit. You can even start to see the L and major bus lines start to appear as you click around the map.
Here’s the statement from Illinois Gov. Pat Quinn on his veto of HB3796, which would have added restrictions for large FOIA requests. The bill also spelled out a fee structure for electronic requests based on the size of the file, averaging about $10/MB. After the bill passed, the Chicago Headline Club and the Citizen Advocacy Center came out and asked Quinn to veto the bill.
This seemed important as Chicago Public Schools just announced around 1,000 layoffs yesterday, citing drops in attendance.
To be clear, these aren’t attendance numbers, but population figures from the Census Bureau’s American Community Survey. While not a direct measure, it allows for comparisons across districts.
Over the past five years Chicago has seen an 8 percent drop for those 17 and younger, compared to a 1.25 percent drop in the total population.
That ranks 10th among Illinois school districts over that time:
We embedded the interactive version of the map below, so you can check the numbers on all the Illinois school districts with at least 10,000 residents.