In the data science world, getting insight means endless hours of fine-tuning models, optimizing code, pursuing faster compute resources, and finding the best way to visualize the data. Call it inertia, but we rarely spend comparable time thinking about the research question that propels all of this work. Regularly stepping back to question the question behind your research project is nevertheless essential, because your data, analytics and insight are only as good as your research question.
A good example of questioning the research question is our recent project with WBEZ Chicago Public Radio. It started with two seemingly straightforward questions: To what extent have gun seizures dropped in the city? And what has contributed to the observed trend? To answer them, we focused on gun seizure data and examined crime and police activity indicators. But after stepping back to assess the results, we revisited the question. New qualitative evidence pointed to another line of inquiry, and the research question for the project was entirely transformed. That led us to analyze different data sets and eventually produced new, unexpected insight about stop-and-frisk strategies in Chicago. Check out our results below and listen to the story. Or skip ahead to learn about questioning the research question.
Stop Cards and Gun Seizures in Chicago
Research background
This project started with Chip Mitchell, WBEZ's investigative reporter, who received anecdotal evidence about lowered morale among CPD cops. This was after the release of the Laquan McDonald shooting video and the subsequent firing of CPD chief Garry McCarthy. How might this atmosphere affect policing in the city? Chip decided to look at gun seizures as a way to examine proactive policing. Because he was also interested in other factors that could affect police work, he asked for murder data, police resources (measured as police staffing), and police activity (measured as murders solved, or clearance rates).
A qualitative-quantitative insight cycle
Making sense of all these data was not a trivial task, so Research Done was brought into the project. Our part was to provide data wrangling services: handling various messy data sets to enable analysis, validating data, identifying and obtaining additional data sources, analyzing the data, and visualizing results. For example, we quickly noticed gaps in the early years of the gun seizure data. CPD confirmed that those years were unreliable, so we dropped them from our analysis. To put results into perspective, we also compared CPD’s data with data from other police agencies in major cities, as reported to the FBI and the ATF. We triangulated data from different sources, including Illinois crime data and annual CPD reports (when available), and we merged textual data identifiers and codes with the quantitative data to focus on relevant data categories. Finally, we offered a way to simplify comparison across these different variables: expressing each one as percent change from the first year in the data. And to visualize the data, we developed interactive charts that handle considerably different number scales and allow exploring many comparative trends at the same time.
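The percent-change normalization described above can be sketched in a few lines of Python. This is only an illustration, not the code used in the project: the function name `percent_change_from_first` and the yearly counts are made up for the example. The idea is that each series is divided by its own first-year value, so variables on very different scales can share one axis.

```python
def percent_change_from_first(values):
    """Return each value as a percent change relative to the first value."""
    if not values:
        return []
    base = values[0]
    if base == 0:
        raise ValueError("first value is zero; percent change is undefined")
    return [(v - base) / base * 100 for v in values]


# Hypothetical yearly counts on very different scales (NOT the project's data).
series = {
    "gun_seizures": [200, 100, 300],
    "contact_cards": [50000, 75000, 25000],
}

# After normalization, every series starts at 0 percent and can be compared directly.
normalized = {
    name: percent_change_from_first(vals) for name, vals in series.items()
}
```

With this approach the first year of every series is always 0 percent, and later years read as "up or down by this much since the start of the data."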
While both police activity and crime trends were negative overall, it was difficult to discern any direct correlation between them and gun seizures. Perhaps the question needed to be a different one. Based on these results, Chip relaunched his qualitative investigation in pursuit of an alternative question. Looking for additional informants, he started hearing about the importance of contact cards, the formal records of stop-and-frisk instances. He asked CPD for data on contact cards since they were digitized. Days later, we obtained all of the contact card data and processed several million records from 2003 to the first months of this year (2016). The pattern was clear. In years when the number of contact cards went up (2003-2007), the number of guns seized went down. In the three years when the number of contact cards went down (2007-2009), gun seizures went up. By how much? See for yourself in the charts above. Our analysis cannot show that the push for contact cards caused a drop in gun seizures. Nor does it indicate that stop-and-frisk is a poor policing strategy. Instead, our analysis revealed that the two are negatively correlated in Chicago.
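For readers who want to check a relationship like this on their own data, here is a minimal sketch of one common measure, Pearson's correlation coefficient. The post does not say which measure was used in the project, so treat the choice as an assumption; the function name `pearson` and the numbers are invented for illustration. A value near -1 means the two series tend to move in opposite directions, and, as noted above, it says nothing about cause.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


# Toy yearly series (NOT the project's data): one rises while the other falls.
toy_contact_cards = [10, 20, 30, 40]
toy_gun_seizures = [8, 6, 4, 2]

r = pearson(toy_contact_cards, toy_gun_seizures)  # close to -1 for this toy data
```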
The takeaway from this project is the iterative process of questioning the research question. We started with one set of research questions and modeled quantitative data, then pursued additional qualitative data that led to new research questions and new analysis. The process yielded a surprising and important finding, an insight that this research project did not initially set out to find.