BOSTON, Mass. — The best crime-fighting tools would, in theory, prevent illegal activity from ever happening. Some police departments have adopted tools they hoped would accomplish this. They analyze data on past crimes with an eye toward predicting where more crime might occur. Then they send extra officers to patrol such places. But there’s a risk to that approach: The data used may not reflect the true rates of local crime. And this could threaten to derail efforts at effective crime prevention, experts now contend.
Such faulty data also risk reinforcing racist attitudes and stereotypes, statisticians said at a forum, here, on February 19. The program was held at the annual meeting of the American Association for the Advancement of Science.
Police departments want to do a good job at fighting crime. But the data they work from may be flawed. For instance, those data may not reflect the full range of people committing crimes. Or they may focus on certain neighborhoods while ignoring others where illegal activities also occur.
Such biased data can inappropriately suggest that some areas are especially crime-ridden. Or if certain types of criminals, such as wealthy or well-educated people, are less likely to be caught, police might focus their attention on poorer or less-educated individuals.
Kristian Lum is a statistician, someone who collects and analyzes numbers-based data. She works with the Human Rights Data Analysis Group in San Francisco, Calif. In one of her recent projects, she used a computer to analyze where drug crimes had been occurring in the nearby town of Oakland, Calif. The computer then predicted — or modeled — where police should deploy patrols to head off future drug activity.
This computer model concluded that most drug crimes took place in areas with high numbers of low-income people and where many people were not white. So that’s where more police should go in the future, the model recommended.
In fact, Lum argues, it’s not clear how well this model worked at depicting the situation in Oakland. Those data on drug crimes were biased, she now reports. The problem was not deliberate, she says. Rather, data collectors just missed some criminals and crime sites. So data on them never made it into her model.
One problem: Many crimes don’t get reported
Past research has suggested that drug abuse and crimes take place at roughly the same rates among different ethnic groups and income levels, Lum says. Poor neighborhoods with many ethnic minorities, though, may get more police attention. So officers may see and collect more crime data from such places.
The police and others may be less likely to notice and report drug crimes in well-to-do areas. Officers also may be less likely to stop someone for suspicious behavior in certain neighborhoods. They might instead focus on where crowds are likely and where people on the streets often interact. If crimes also happen elsewhere, the police may not know about them.
Therefore, if a model is based only on police data, it won’t include all crimes, Lum argues. “To know all crimes, you’d have to live in some sort of surveillance state,” she says. By that, she means the government would watch your every move. And no one wants that.
“You’re going to find things where you’re looking,” Lum says. If you mostly look in inner-city black or Latino neighborhoods, most of the crimes you find will be there. And any computer model working with such skewed data would mistakenly predict that these areas are hot spots for future crime. Such a model would reinforce the idea that certain groups of people are more likely to be criminals — even if they aren’t.
That idea also runs afoul of basic U.S. constitutional law. (A new, February 22 Supreme Court decision holds that people cannot be deemed especially dangerous solely on the basis of race.)
The model used in Lum’s study is called PredPol. It is available to police departments. Race is not supposed to play a role in its results. Yet her study on the Oakland area found that it would target black people at roughly twice the rate of whites.
Lum says it’s important to ask: “How does that affect police-community reactions or relationships?” Might it lead police to behave differently in some neighborhoods or with particular groups of people? That result could reinforce stereotypes against people in poor and minority areas, she notes. And this could lead to more tension in society, she adds. Lum doesn’t know if this is happening now. “But these are things that worry me,” she says.
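Lum's point that "you're going to find things where you're looking" can be shown with a few lines of arithmetic. The sketch below is not PredPol and does not use Oakland data. The two areas, their crime counts and their detection rates are made-up numbers, and the helper names `recorded_crimes` and `patrol_shares` are hypothetical. It assumes both areas have the same true amount of crime, but that police notice a larger share of it in area A than in area B.

```python
def recorded_crimes(true_crimes, detected_per_100):
    """Return the crimes that reach police data: only the ones officers observe.

    detected_per_100 is the (hypothetical) number of crimes recorded
    out of every 100 that actually happen in an area.
    """
    return {
        area: true_crimes[area] * detected_per_100[area] // 100
        for area in true_crimes
    }


def patrol_shares(recorded):
    """Split patrols in proportion to recorded crime (a naive hot-spot rule)."""
    total = sum(recorded.values())
    return {area: count / total for area, count in recorded.items()}


# Made-up inputs: equal true crime, unequal police attention.
true_crimes = {"A": 100, "B": 100}
detected_per_100 = {"A": 60, "B": 20}

recorded = recorded_crimes(true_crimes, detected_per_100)
shares = patrol_shares(recorded)

print(recorded)  # crime counts as they appear in police records
print(shares)    # share of future patrols each area would receive
```

Under these assumptions, area A would receive three-quarters of the patrols even though it has no more crime than area B. The model only ever sees the recorded counts, so it repeats the pattern of where police were already looking.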
Greg Ridgeway is a statistician and criminologist (Krim-ih-NAAL-oh-gist). He works at the University of Pennsylvania in Philadelphia. He says that he, too, would worry if police departments used the type of models that Lum described.
But not all police departments focus on officers’ crime reports to figure out where crimes might occur in the future, he points out. Many look at where they get calls for help. In those cases, they’re “trying to anticipate where the public is going to demand their services tomorrow,” he explains.
Such calls for help “are completely representative of where the public is demanding their services,” he observes. “So there’s no bias problem” for those kinds of models. In fact, he adds, it might be negligent if the police didn’t plan to use their resources in such a way.
Clusters of crime
Raid Amin is a statistician at the University of West Florida in Pensacola. He and a graduate student recently used data to map which parts of the United States had clusters of violent crimes between 2000 and 2012. These reported data allowed them to home in on apparent “hot spots” of violence.
“In a nutshell, counties with a high crime rate are likely to have more people living in poverty, a higher percent of minorities and less people with a high school degree,” Amin found. Mapping those crime clusters is “a first step,” he says. Other experts should then investigate how to address crime there. Amin compares the process to the way health officials follow up on reports about groups of cancer cases or outbreaks of disease.
Spotting crime clusters “certainly is a place to start,” Lum agrees. But, she adds, “you shouldn’t only rely on the police data.” Information from social workers, hospitals and other sources could help as well, she says. That’s because people don’t always tell the police about some violent crimes, such as sexual attacks or physical abuse by a date or family member.
Ridgeway also agrees that Amin’s work is helpful. Criminologists already know that poor and minority areas have high crime rates. Combing through the numbers on these, though, can challenge some popular notions.
For instance, he notes, “People still think Los Angeles has a lot of crime.” But Amin’s work didn’t highlight that city as a hot spot. In fact, Ridgeway notes, Los Angeles has become “really a low-crime big city now.” On the other hand, St. Louis, Mo., did get flagged as a hot spot. “It’s got serious crime problems,” Ridgeway notes. Yet people don’t tend to think of it as a high-crime area.
Comparing crime fighters
One of Ridgeway’s recent studies took a different approach to crime statistics. Instead of looking at criminals, he focused on New York City police officers involved in shootings. His project compared features of officers who fired their guns against those at the same scene who had not. That way, the only variables were traits of the different officers.
The study included data from 106 incidents over a three-year span. Male and female officers were just as likely to shoot, Ridgeway found. Black officers were more likely to fire their guns than were whites. Officers with more disciplinary problems in their files also were more likely to shoot. But those officers who joined the force at an older age were less likely to fire their guns in an incident.
Ridgeway suggests a police department could use his team’s findings to cut down on unnecessary shootings. For example, it could try to recruit slightly older officers. It also could pay sharp attention to cops with discipline problems. Perhaps they should be moved to a desk job.
It’s not clear yet whether the findings about New York City will hold up nationwide. “One of the big challenges here is data,” Ridgeway says. In other cities, “the non-shooters are either not documented well or not documented at all.”
In short, the data have their limits. Sometimes, as Ridgeway notes, the numbers are incomplete or not available. Sometimes more digging is needed to understand what’s happening, as Amin has found. Sometimes data can be misleading or biased, as Lum warns. And sometimes statistics can be misleading if only part of the picture is considered.
Earlier this month, for example, President Donald Trump talked about a “double-digit” rise in the murder rate for the largest U.S. cities. That was true for one year. But that wasn’t the whole story. There had been a massive decline in the murder rate over the past 30 years, Ridgeway notes. There had been an uptick only in the last year. Overall, he notes, “that certainly would indicate that crime is way down.”
“Nevertheless, police should still be trying everything they can to be efficient, to be responsive and to serve communities,” he concludes. Ridgeway, Lum and Amin all hope their work looking at crime “by the numbers” can help police do that job better.
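The murder-rate example shows how the same numbers can tell two stories, depending on the starting point. The sketch below works through that arithmetic. The three rates are invented for illustration and are not actual U.S. murder statistics. The function name `percent_change` is also just a label chosen here.

```python
def percent_change(old, new):
    """Return the change from old to new as a percentage of old."""
    return (new - old) * 100 / old


# Invented rates (say, murders per million people), for illustration only.
rate_30_years_ago = 100
rate_last_year = 50
rate_this_year = 55

one_year = percent_change(rate_last_year, rate_this_year)
long_term = percent_change(rate_30_years_ago, rate_this_year)

print(f"Change over one year: {one_year:+.0f}%")
print(f"Change over 30 years: {long_term:+.0f}%")
```

With these made-up inputs, the one-year change is a double-digit rise, yet the rate is still far below where it started. Both statements are true. Reporting only the first leaves out most of the picture.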
behavior The way a person or other organism acts towards others, or conducts itself.
bias The tendency to hold a particular perspective or preference that favors some thing, some group or some choice. Scientists often “blind” subjects to the details of a test (don’t tell them what it is) so that their biases will not affect the results.
computer model A program that runs on a computer that creates a model, or simulation, of a real-world feature, phenomenon or event.
criminology A research field that focuses on understanding crime and criminals. Experts in this field are known as criminologists.
data Facts and/or statistics collected together for analysis but not necessarily organized in a way that gives them meaning. For digital information (the type stored by computers), those data typically are numbers stored in a binary code, portrayed as strings of zeros and ones.
human rights The basic rights that should be available to all people of just treatment; respect and dignity; a nationality; the ability to marry; basic freedoms (of religion, political opinions, speech); protection from slavery, torture and persecution; and the ability to earn a living that will provide basic food and shelter for families. Many details of this are spelled out in the United Nations’ Universal Declaration of Human Rights.
model A simulation of a real-world event (usually using a computer) that has been developed to predict one or more likely outcomes.
outbreak The sudden emergence of disease in a population of people or animals. The term may also be applied to the sudden emergence of devastating natural phenomena, such as earthquakes or tornadoes.
recruit (in research) New member of a group or human trial, or to enroll a new member into a research trial. Some may receive money or other compensation for their participation, particularly if they enter the trial healthy.
risk The chance or mathematical likelihood that some bad thing might happen. For instance, exposure to radiation poses a risk of cancer. Or the hazard — or peril — itself. Among cancer risks that the people faced were radiation and drinking water tainted with arsenic.
social (adj.) Relating to gatherings of people; a term for animals (or people) that prefer to exist in groups. (noun) A gathering of people, for instance those who belong to a club or other organization, for the purpose of enjoying each other’s company.
society An integrated group of people or animals that generally cooperate and support one another for the greater good of them all.
statistics The practice or science of collecting and analyzing numerical data in large quantities and interpreting their meaning. Much of this work involves reducing errors that might be attributable to random variation. A professional who works in this field is called a statistician.
stereotype A widely held view or explanation for something, which often may be wrong because it has been overly simplified.
surveillance A term for watching or keeping track of the behavior of others, usually in a stealthy manner or from a distance.
variable (in mathematics) A letter used in a mathematical expression that may take on different values. (in experiments) A factor that can be changed, especially one allowed to change in a scientific experiment. For instance, when researchers measure how much insecticide it might take to kill a fly, they might change the dose or the age at which the insect is exposed. Both the dose and age would be variables in this experiment.
Supreme Court Decision: Buck v. Davis, director, Texas Department of Criminal Justice, Correctional Institutions. February 22, 2017.
Meeting: R. Amin. “Geographic clusters of reported violent crimes in the United States: 2000-2012.” American Association for the Advancement of Science 2017. February 19, 2017. Boston, Mass.
Meeting: K. Lum. “Societal consequences of biased data in predictive policing.” American Association for the Advancement of Science 2017. February 19, 2017. Boston, Mass.
Meeting: G. Ridgeway. “Officer risk factors associated with police shootings: A matched case-control study.” American Association for the Advancement of Science 2017. February 19, 2017. Boston, Mass.
Journal: K. Lum and W. Isaac. “To protect and serve?” Significance. Vol. 13, October 2016, p. 14. doi: 10.1111/j.1740-9713.2016.00960.x.
Journal: G. Ridgeway. “Officer risk factors associated with police shootings: A matched case control study.” Statistics and Public Policy. Vol. 3, 2016. (Published online December 18, 2015). doi: 10.1080/2330443X.2015.1129918.