Decomposing Bodies

Speed read
  • Nineteenth-century prison cards form a pre-digital database
  • Complex handwritten records test machine learning abilities
  • XSEDE connects humanists with supercomputing infrastructure

Paris, 1885: A man is arrested for theft. At the police station, a clerk records his name, photographs his face and profile, measures his body in fourteen places, and notes any scars or tattoos.

Has he been arrested before? It’s difficult to say.

By this time, the population of Paris is more than 2 million, and police can’t recognize every criminal personally. National ID cards don’t yet exist. Fingerprinting won’t be accepted for another twenty years.

<strong>Wild card.</strong> Bertillon cards contained a record of 14 measurements, notes of body peculiarities, and standardized photographs for easy recognition by law enforcement. Courtesy Head of Service Régional d'Identité Judiciaire de Paris.

So the prisoner’s body is broken apart, or decomposed, into data and recorded on a specialized Bertillon card. Filed according to a complex system, the cards produced by Alphonse Bertillon’s method effectively create a pre-digital database of recidivism.

The Bertillon method soon spreads beyond Paris to the United States, where completed cards still exist in archives and historical societies.

What can these cards of long-dead prisoners tell us today? 

A lot, says Alison Langmead, professor of art history and director of the Visual Media Workshop at the University of Pittsburgh.

With the help of scientists from the Extreme Science and Engineering Discovery Environment (XSEDE), she and art history colleague Josh Ellenbogen are analyzing over 12,000 nineteenth-century cards from the Ohio State Reformatory and Ohio State Penitentiary.

“Building this dataset is like planting a seed in the ground,” says Langmead. “The dataset will flower into many different subject areas and tell us more about the human experience.”

What computers don’t see

The collected cards account for the vast majority of Ohio’s state prisoners from 1897 to 1911. They contain a wealth of sociological information about prisoners, including race, nationality, crime committed, eye color, and even ear shape.

Langmead and her students visited the Ohio History Connection in Columbus and photographed both sides of 12,500 cards.

<strong>Throw the book at 'em.</strong> Frontispiece from Bertillon's Identification anthropométrique (1893), demonstrating the measurements one takes for his anthropometric identification system.

But transforming the resulting 25,000 images from raw historical artifacts into a usable dataset is a challenging task: The cards are old and warped, and the ink is blotchy.

And then there’s the nineteenth-century handwriting.

“All these steps that we usually assume are easy, once you add a little bit of a noise, recognition becomes a very difficult problem,” says Paul Rodriguez, research analyst at the San Diego Supercomputer Center (SDSC).

Rodriguez, who is using the cards to enhance offline cursive handwriting recognition capabilities on the Comet supercomputer, explains that much machine-learning image-recognition research works with data that has been pre-cleaned.

While the Bertillon cards contain text of predictable types in a constrained form, humans are virtually unconstrained in how they fill them out.

“In many ways, the negative results are more interesting than the positive results,” says Rodriguez. “This project let us explore what these so-called ‘deep’ computer machine-learning methods are capable of and how far they can go for us.”

To store and manage the variations among the front-side card images, Langmead looked to Sandeep Puthanveetil Satheesan, research programmer in the Innovative Software and Data Analysis Division at the National Center for Supercomputing Applications.

Puthanveetil Satheesan advised the team to use the web-based, scalable, open source research data management software called Clowder. They then processed the images on the Bridges supercomputer at the Pittsburgh Supercomputing Center, drawing on extractor services from the Brown Dog project.

“The importance of identifying as many variations in data and as early as possible in the lifespan of a project can’t be emphasized enough, especially when the processing algorithms are dependent upon the grouping of data based on such variations,” says Puthanveetil Satheesan.

Who needs supercomputing?

Langmead, Rodriguez, and Puthanveetil Satheesan were brought together by Alan Craig, digital humanities specialist for the XSEDE project.

<strong>Ear today, gone tomorrow.</strong> Bertillon saw ears as a means to identify individuals. His system collared 241 repeat offenders in 1884 and caught the notice of police across Europe. Courtesy The Wellcome Trust. <a href='https://creativecommons.org/licenses/by/4.0/legalcode'>(CC Attribution 4.0.)</a>

“My role is to engage new and different kinds of projects, particularly from the arts and humanities,” says Craig. “I find the technical people to work with them and make sure the computing resources are available.”

Academics in all fields can make the mistake of believing that humanists don’t need technology to complete their research. But that’s not so, says Langmead.

“There are lots of reasons why humanists would need supercomputing infrastructure. In my case, I had really interesting data and colleagues who wanted to try out these systems on it. It’s a match made in the twenty-first-century academy.”

“We’re all scientists, we’re all looking for knowledge,” she says. “The computer is a tool — it can be used for whatever we need.

“All humanists are already digital humanists in some small measure. Some of us also take it as our charge to become much more highly engaged with the ways that computing can be used to facilitate our research.”

Copyright © 2017 Science Node ™
