Archaeologists vs. Computer: New Study Fuels Sorting Competition

When it came to the tedious task of categorizing pottery fragments, a deep-learning model was found to be just as accurate as four human experts, and far more efficient.



A key piece of an archaeologist’s job involves the tedious process of categorizing shards of pottery into subtypes. Ask archaeologists why they have put a fragment into a particular category and it’s often difficult for them to say exactly what led them to that conclusion.

“It’s kind of like looking at a photograph of Elvis Presley and looking at a photo of an impersonator,” said Christian Downum, an anthropology professor at Northern Arizona University. “You know something is off with the impersonator, but it’s hard to specify why it’s not Elvis.”

But archaeologists have now demonstrated that it’s possible to program a computer to do this critical part of their job as well as they can. In a study published in the June issue of The Journal of Archaeological Science, researchers reported that a deep-learning model sorted images of decorated shards as accurately — and occasionally more precisely — as four expert archaeologists did.

“It doesn’t hurt my feelings,” Dr. Downum, one of the study’s authors, said. Rather, he said, it should improve the field by freeing up time and replacing “the subjective and difficult-to-describe process of classification with a system that gives the same result every time.”

The study focused on Tusayan White Ware, a type of painted hand-formed pottery used for serving food and storing water in the canyons and mesas of northeastern Arizona between 825 and 1300. In the 1920s, archaeologists figured out that Tusayan White Ware pieces have consistent patterns depending on the time period in which they were created.

Images of Tusayan White Ware with distinctive design elements that made the various types identifiable. Credit: Leszek Pawlowicz and Christian Downum/Northern Arizona University

The researchers recruited four of the most experienced analysts of this particular type of pottery. Each had spent 30 or more years analyzing ceramics and had previously classified tens of thousands of Tusayan White Ware fragments.

They also spent about four hours training a neural network, a complex mathematical system that can learn specific tasks by analyzing vast amounts of data, to sort photographs of Tusayan White Ware.

Human and machine were each tasked with categorizing thousands of images into one of nine known types and evaluated on the accuracy of their answers.

The neural network tied two of the human analysts for accuracy and beat the other two, the researchers found.

The machine was also far more efficient. Because the task was dull, none of the human analysts wanted to go through all 3,000 photographs without stopping, Dr. Pawlowicz said. So even though they probably could have completed the task in three hours, each conducted the analysis through several sessions over three to four months.

The neural network whipped through thousands of images in a few minutes.

Not only was the computer program more efficient than the archaeologists and just as accurate, it was also better able than its living, breathing competitors to articulate why it had categorized shards a certain way. In one case, the computer offered up a smart sorting observation that was new to the researchers: It pointed out that two similar types of pottery with barbed line design elements could be distinguished by whether the lines connected at right angles or were parallel, said Leszek Pawlowicz, an adjunct faculty member at Northern Arizona University and another author of the study.

The machine also outshone the humans in offering only one answer for each classification; the participating archaeologists often disagreed on how items should be categorized, a known issue that often slows archaeological projects, the authors said.

Phillip Isola, an electrical engineering and computer science professor at M.I.T. who was not involved in the study, said he was not surprised that the neural network performed as well as — or sometimes better than — the archaeologists.

“It’s the same story we’ve heard a few times now,” Dr. Isola said. In the field of medical imaging, for example, researchers have found that neural networks rival radiologists at identifying tumors. Academics are also using similar tools to categorize plant and bird types.

This is also far from the first time archaeologists have turned to artificial intelligence. In 2015, researchers in France applied machine learning to classifying medieval French ceramics. A group of archaeologists and computer scientists from five countries is also developing a digital tool to categorize pottery shards. Neither of these projects explicitly pits human against machine, however.

Since the study began to circulate, some archaeologists have shared concerns with the authors that they will be replaced by machines. Dr. Downum and Dr. Pawlowicz said they were not worried about such a thing happening.

“We’re the ones that decide what’s important to study,” Dr. Downum said.
