
Big data breaks the incarceration cycle

Speed read
  • Many inmates of local jails struggle with multiple problems
  • Data modeling identifies individuals at greatest risk of re-incarceration
  • Reducing incarceration rates improves communities and saves money

On an average day in the US, more than 700,000 people are confined in county and city jails. Over half of these inmates struggle with mental illness, substance abuse, chronic health issues, or some combination of the three. A number of them will return to jail at least once within three years of their release; some will return multiple times.

These highly vulnerable people cycle repeatedly through not just local jails but also emergency medical services, shelters, and other public systems, receiving fragmented care that delivers substandard outcomes at a great cost to the community.

<strong>Big data, good data. </strong> Officials from Johnson County, Kansas partnered with data scientists from the University of Chicago to apply big data insights to save money for local law enforcement and reduce re-incarceration rates. Courtesy Salomon, Bauman, et al.

In Johnson County, Kansas, as in many jurisdictions, mental health, EMS, and jail systems rarely share data. Lack of coordination between agencies makes it difficult to discern patterns that could predict future system contacts, meaning that individuals with complex needs often return to jail. 

In 2016, Data Science for Social Good researchers from the University of Chicago built a model that identified individuals who had contact with multiple systems. They used this model to predict individual jail bookings.
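To make the idea of "contact with multiple systems" concrete, here is a minimal sketch in Python. It assumes something the article does not state: that records from each agency already share a common person identifier. The names `person_id`, `systems_by_person`, and `multi_system_individuals` are hypothetical, and the records are made up for illustration. In practice, matching people across agency databases is likely to be much harder than this.

```python
from collections import defaultdict

def systems_by_person(records):
    """Map each person ID to the set of systems they had contact with."""
    contacts = defaultdict(set)
    for person_id, system in records:
        contacts[person_id].add(system)
    return contacts

def multi_system_individuals(records, min_systems=2):
    """Return sorted IDs of people seen in at least min_systems systems."""
    contacts = systems_by_person(records)
    return sorted(
        pid for pid, systems in contacts.items() if len(systems) >= min_systems
    )

# Made-up (person_id, system) contact records, for illustration only.
records = [
    ("p1", "jail"), ("p1", "ems"), ("p1", "mental_health"),
    ("p2", "ems"), ("p2", "ems"),
    ("p3", "jail"), ("p3", "mental_health"),
]

cross_system = multi_system_individuals(records)
```

Here `p2` appears twice but only in EMS records, so only `p1` and `p3` count as having cross-system contact.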

The team of Erika Salomon, Matt Bauman, Kate Boxer, Tzu-Yun Lin, and Hareem Naveed connected with Johnson County through the Obama White House’s Data Driven Justice Initiative, taking earlier initiatives focused on frequent visitors to hospital emergency departments as a starting point. 

“By connecting patients with more comprehensive and proactive care, many hospitals have been able to reduce readmissions and improve long-term outcomes,” says Bauman, data science fellow with the Center for Data Science and Public Policy. “We hope to do something similar for jail time.”

Modeling risk

Johnson County is a suburban area near Kansas City with a population of nearly 600,000. In 2010, over 100,000 individuals had contact with emergency medical services. In the same year, 50,000 individuals were booked into the county jail. 

<strong>Matt Bauman </strong> is using his data scientist superpowers for good. Courtesy University of Chicago; Data Science for Social Good.

Hoping to reduce that number, the team identified individuals who were at risk of going to jail but would be better served by mental health or other services.

“At-risk individuals with complex problems are less likely to have police encounters if they are connected to services,” says Robert Sullivan, criminal justice coordinator for Johnson County.

The team began by analyzing more than six years of individual-level historical data from the county’s criminal justice, mental health, and ambulance transport records.

After consulting with county agencies about risk factors, the researchers used the scikit-learn package in Python to develop a model that scores individuals on their likelihood of entering jail within the next year. (Open source code for the prototype model is available on GitHub.)
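For readers curious what scoring individuals with scikit-learn can look like, the sketch below fits a classifier and turns its predicted probabilities into risk scores. It is not the team's model. The article does not say which algorithm or features were used, so the choice of `LogisticRegression`, the three count features, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data only: one row per person, with hypothetical count features
# loosely inspired by the three record systems named in the article.
rng = np.random.default_rng(0)
n_people = 500
X = np.column_stack([
    rng.poisson(1.0, n_people),  # prior jail bookings (hypothetical)
    rng.poisson(0.5, n_people),  # mental health contacts (hypothetical)
    rng.poisson(0.8, n_people),  # ambulance transports (hypothetical)
])

# Made-up label: 1 if booked into jail in the following year.
logit = -2.0 + 0.6 * X[:, 0] + 0.4 * X[:, 1] + 0.2 * X[:, 2]
y = (rng.random(n_people) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

# Assumed model choice; any scikit-learn classifier with predict_proba works.
model = LogisticRegression()
model.fit(X, y)

# predict_proba returns one column per class; column 1 is P(label == 1).
# Scoring the training rows keeps the sketch short; a real evaluation
# would score people using only data from before the prediction year.
risk_scores = model.predict_proba(X)[:, 1]

# Row indices of the 10 highest-scoring people, highest first.
highest_risk = np.argsort(risk_scores)[::-1][:10]
```

Each score is a probability between 0 and 1, so individuals can be ranked and the highest-scoring group passed to caseworkers for review.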

“Having the data is just the very first step,” says Bauman. “We’ve been highly dependent upon discussions with practitioners to determine how we can best analyze the data and how they may be able to shape a resulting intervention.” 

Using the scores to identify the 200 individuals at highest risk for future incarceration, the team applied the next five years of data to refine the model, checking their predictions against the following year’s actual jail bookings.
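Checking a ranked list against what actually happened can be sketched as a "precision at k" calculation: flag the k highest-scoring people, then count what share of them were booked the following year. The function names, scores, and booking data below are hypothetical, and the article does not confirm that this is exactly how the team measured performance.

```python
def top_k(scores, k):
    """Return the IDs of the k highest-scoring individuals, highest first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

def precision_at_k(scores, booked_next_year, k):
    """Share of the k flagged individuals who were booked the next year."""
    flagged = top_k(scores, k)
    if not flagged:
        return 0.0
    hits = sum(1 for pid in flagged if pid in booked_next_year)
    return hits / len(flagged)

# Made-up risk scores and next-year bookings, for illustration only.
scores = {"p1": 0.91, "p2": 0.15, "p3": 0.78, "p4": 0.40, "p5": 0.66}
booked_next_year = {"p1", "p4", "p5"}

flagged = top_k(scores, 3)
precision = precision_at_k(scores, booked_next_year, 3)
```

In this toy data the three flagged people are `p1`, `p3`, and `p5`, and two of them were booked, giving a precision of two thirds. The county's list used k = 200.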


The final version of the model achieves a prediction accuracy of 51 percent, about 25 percent better than knowledgeable human classification.

“Nothing will ever replace the human compassion and insight that is required to meet someone where they are, connect with them, and figure out how best to walk alongside them in their journey,” says Bauman. “But our model enables caseworkers to effectively prioritize their outreach efforts.”

For the public good

Of those individuals identified as being at highest risk for re-incarceration in the final year of the study, slightly more than half spent time in jail later that year, adding up to a combined total of nearly 18 years of jail time.

<strong>Jail time. </strong> Number of inmates in US prisons, jails, and juvenile facilities 1920-2014. Courtesy US Bureau of Justice Statistics; Office of Juvenile Justice and Delinquency Prevention; Arkyan, <a href= 'https://creativecommons.org/licenses/by-sa/3.0/legalcode'>(CC_BY_SA 3.0)</a>.

Detaining low-level, nonviolent offenders in local jails costs taxpayers billions of dollars each year and frays the fabric of communities, impacting employment rates, homelessness, and the welfare of children. Yet a modest investment in data-driven programs can drastically reduce those costs.  

“What excites me most about this kind of work is that I view it as ‘helping people to do good better,’” says Bauman. “I’d love to see an effective intervention that helps folks live happier, more productive lives. And I’d much rather see my tax dollars funding such a program than being used to lock people up.” 

Bauman looks forward to the day when similar models are employed across the country.

“Our system could definitely scale to a much larger city,” says Bauman. “Computers are easy and cheap to scale. Once the data are collected, it’s just as easy to prioritize care.” 

Copyright © 2021 Science Node ™
